Agent Configuration

Akio is configured entirely through CLI flags — no config files, no environment variables, no API keys.


Model Selection

Specify which GGUF model to use with the -m flag:

akio run -m Qwen3-8B-Q4_K_M.gguf

Context Window

Control the context window size (in tokens) with -c:

akio run -m Qwen3-8B-Q4_K_M.gguf -c 65536

GPU Offloading

Use --ngl to control how many model layers are offloaded to the GPU. A value larger than the model's layer count (such as 99) offloads every layer; set it to 0 for CPU-only inference:

akio run -m Qwen3-8B-Q4_K_M.gguf --ngl 99
  • On macOS, GPU acceleration uses the Metal framework.
  • On Linux, CPU parallelization uses OpenMP.
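For instance, on a machine without a supported GPU you can force CPU-only inference by setting the layer count to 0 (model filename as in the examples above):

```shell
# Offload no layers to the GPU: inference runs entirely on the CPU
akio run -m Qwen3-8B-Q4_K_M.gguf --ngl 0
```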

Model Management

akio pull <repo> # Download a model from Hugging Face
akio list # List locally cached models
akio list --all # List all available models
akio rm <repo> # Remove a downloaded model

Models are stored in ~/.akio/models/.
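Putting the commands together, a typical first session might look like the following sketch. The Hugging Face repo name here is illustrative; substitute whichever model you want, and pass the resulting GGUF filename to akio run:

```shell
# Download a model from Hugging Face (repo name is an example)
akio pull Qwen/Qwen3-8B-GGUF

# Confirm the model is cached locally (under ~/.akio/models/)
akio list

# Run the agent with a 65536-token context and full GPU offload
akio run -m Qwen3-8B-Q4_K_M.gguf -c 65536 --ngl 99
```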