Agent Configuration
Akio is configured entirely through CLI flags — no config files, no environment variables, no API keys.
Model Selection
Specify which GGUF model to use with the -m flag:
akio run -m Qwen3-8B-Q4_K_M.gguf
Context Window
Control the context window size (in tokens) with -c:
akio run -m Qwen3-8B-Q4_K_M.gguf -c 65536
GPU Offloading
Use --ngl to control how many model layers are offloaded to the GPU. Set to 0 for CPU-only inference:
akio run -m Qwen3-8B-Q4_K_M.gguf --ngl 99
- On macOS, GPU acceleration uses the Metal framework.
- On Linux, CPU parallelization uses OpenMP.
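As the text notes, setting --ngl to 0 keeps all layers on the CPU. A minimal sketch of both ends of the spectrum, using only the flags documented above:

```shell
# Offload as many layers as possible to the GPU (Metal on macOS)
akio run -m Qwen3-8B-Q4_K_M.gguf --ngl 99

# CPU-only inference: no layers offloaded (OpenMP parallelization on Linux)
akio run -m Qwen3-8B-Q4_K_M.gguf --ngl 0
```

A value of 99 is a common convention for "offload everything"; any number at or above the model's layer count has the same effect.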
Model Management
akio pull <repo> # Download a model from Hugging Face
akio list # List locally cached models
akio list --all # List all available models
akio rm <repo> # Remove a downloaded model
Models are stored in ~/.akio/models/.
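Putting the commands above together, a typical first-run workflow might look like the following sketch (`<repo>` is a Hugging Face repository of your choosing, as in the command summary above):

```shell
# Download a model from Hugging Face into ~/.akio/models/
akio pull <repo>

# Confirm the model is cached locally
akio list

# Run the agent against the downloaded GGUF file
akio run -m Qwen3-8B-Q4_K_M.gguf

# Reclaim disk space when the model is no longer needed
akio rm <repo>
```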