Agent Configuration
Akio is configured entirely through CLI flags — no config files, no environment variables, no API keys.
Model Selection
Specify which GGUF model to use with the -m flag:
akio run -m Qwen3-8B-Q4_K_M.gguf
Context Window
Control the context window size (in tokens) with -c:
akio run -m Qwen3-8B-Q4_K_M.gguf -c 65536
GPU Offloading
Use --ngl to control how many model layers are offloaded to the GPU. Set to 0 for CPU-only inference:
akio run -m Qwen3-8B-Q4_K_M.gguf --ngl 99
- On macOS, GPU acceleration uses the Metal framework.
- On Linux, CPU parallelization uses OpenMP.
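As the text notes, setting --ngl to 0 keeps all layers on the CPU. A minimal sketch of both ends of the spectrum, using only the flags documented above:

```shell
# Offload as many layers as possible to the GPU (Metal on macOS)
akio run -m Qwen3-8B-Q4_K_M.gguf --ngl 99

# CPU-only inference: no layers offloaded (OpenMP parallelization on Linux)
akio run -m Qwen3-8B-Q4_K_M.gguf --ngl 0
```

A value of 99 is a common convention for "offload everything"; any number at or above the model's layer count has the same effect.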
Model Management
akio pull <repo> # Download a model from Hugging Face
akio list # List locally cached models
akio list --all # List all available models
akio rm <repo> # Remove a downloaded model
Models are stored in ~/.akio/models/.
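Putting the commands above together, a typical first-run workflow might look like the following sketch (`<repo>` is a Hugging Face repository of your choosing, as in the command summary above):

```shell
# Download a model from Hugging Face into ~/.akio/models/
akio pull <repo>

# Confirm the model is cached locally
akio list

# Run the agent against the downloaded GGUF file
akio run -m Qwen3-8B-Q4_K_M.gguf

# Reclaim disk space when the model is no longer needed
akio rm <repo>
```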