# Multi-Model Configuration
Commander uses 5 model roles across 3 providers, automatically selecting the right model for each workflow step.
## Model Roles
| Role | Default Model | Provider | Purpose | Cost |
|---|---|---|---|---|
| Thinking | Claude Opus | Anthropic | Analysis, planning, debugging | $$$ |
| Coding | Claude Sonnet | Anthropic | Code generation, fixes | $$ |
| Task | Gemini Flash | OpenRouter | Search, classification | $ |
| Embedding | nomic-embed-text | Ollama | Vector embeddings | Free |
| Auditor | llama3.2:3b | Ollama | Security review | Free |
## How Routing Works

Step type → model role → provider:

```text
step_type: analyze → Thinking → Claude Opus
step_type: code    → Coding   → Claude Sonnet
step_type: search  → Task     → Gemini Flash
step_type: execute → (shell)  → no model needed
```
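The routing above can be sketched as a two-step lookup. The mappings come from the table and rules in this document, but the names (`ROLE_FOR_STEP`, `MODEL_FOR_ROLE`, `route`) are illustrative, not Commander's actual internals.

```python
# Illustrative sketch of step-type routing; names are hypothetical,
# not Commander's real implementation.
ROLE_FOR_STEP = {
    "analyze": "thinking",
    "code": "coding",
    "search": "task",
    # "execute" is absent: shell steps need no model
}

MODEL_FOR_ROLE = {
    "thinking": ("Anthropic", "claude-opus-4"),
    "coding": ("Anthropic", "claude-sonnet-4"),
    "task": ("OpenRouter", "gemini-2.0-flash"),
}

def route(step_type: str):
    """Return (provider, model) for a step, or None for shell steps."""
    role = ROLE_FOR_STEP.get(step_type)
    if role is None:
        return None
    return MODEL_FOR_ROLE[role]

print(route("analyze"))  # ('Anthropic', 'claude-opus-4')
print(route("execute"))  # None
```

A plain dictionary lookup keeps the routing table easy to audit and extend when new step types or providers are added.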
## View Configuration

```bash
murc models
```

```text
Model Configuration

auditor    Ollama     / llama3.2:3b       (max $0.00/call, temp 0.1)
coding     Anthropic  / claude-sonnet-4   (max $2.00/call, temp 0.2)
embedding  Ollama     / nomic-embed-text  (max $0.00/call, temp 0.0)
task       OpenRouter / gemini-2.0-flash  (max $0.50/call, temp 0.3)
thinking   Anthropic  / claude-opus-4     (max $5.00/call, temp 0.3)
```
## Cost Tracking

```bash
murc models cost --period 7d
```
Commander tracks every API call's cost. The constitution enforces limits:
- `max_api_cost_per_run` – per workflow execution
- `max_api_cost_per_day` – daily cap across all workflows
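The two caps could be enforced with a simple running total, as in this sketch. Only the limit names come from the constitution keys above; the `CostTracker` class and its behavior are assumptions for illustration.

```python
# Illustrative budget enforcement; only the two limit names come from
# the constitution, the rest is a sketch.
class BudgetExceeded(Exception):
    pass

class CostTracker:
    def __init__(self, max_per_run: float, max_per_day: float):
        self.max_per_run = max_per_run   # max_api_cost_per_run
        self.max_per_day = max_per_day   # max_api_cost_per_day
        self.run_spend = 0.0
        self.day_spend = 0.0

    def charge(self, cost: float) -> None:
        """Record an API call's cost, refusing it if a cap would be broken."""
        if self.run_spend + cost > self.max_per_run:
            raise BudgetExceeded("max_api_cost_per_run")
        if self.day_spend + cost > self.max_per_day:
            raise BudgetExceeded("max_api_cost_per_day")
        self.run_spend += cost
        self.day_spend += cost

tracker = CostTracker(max_per_run=5.0, max_per_day=20.0)
tracker.charge(2.0)   # accepted
tracker.charge(2.5)   # accepted: run total is now 4.5
```

Checking the caps *before* committing the spend means a call that would breach a limit is rejected outright rather than partially charged.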
## Environment Variables
| Variable | Required For |
|---|---|
| `ANTHROPIC_API_KEY` | Thinking + Coding models |
| `OPENROUTER_API_KEY` | Task model (Gemini Flash) |
| `OLLAMA_BASE_URL` | Custom Ollama endpoint (default: `localhost:11434`) |
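A startup check for these variables might look like the sketch below. The variable names are the real ones from the table; the `check_env` and `ollama_base_url` helpers are hypothetical.

```python
import os

# Sketch of the env-var handling implied by the table; the variable
# names are real, the helper functions are illustrative.
def check_env() -> list[str]:
    """Return the names of required API keys that are missing."""
    missing = []
    for var in ("ANTHROPIC_API_KEY", "OPENROUTER_API_KEY"):
        if not os.environ.get(var):
            missing.append(var)
    return missing

def ollama_base_url() -> str:
    # OLLAMA_BASE_URL is optional; fall back to the documented default.
    return os.environ.get("OLLAMA_BASE_URL", "localhost:11434")
```

Treating the Ollama endpoint as optional with a sane default keeps the local (free) models working out of the box, while the paid providers fail fast with a clear list of missing keys.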
## Three-Layer Cache
To minimize API costs, Commander uses a 3-layer cache:
- L1 (Struct) – pre-loaded workflows, patterns, config
- L2 (Query) – search results and injection results
- L3 (LLM) – prompt→response pairs with TTL
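A lookup that falls through the three layers, with TTL expiry on L3, could be sketched like this. The layer names match the list above; the class and its methods are illustrative, not Commander's implementation.

```python
import time

# Illustrative three-layer lookup; layer names match the doc, the
# code itself is a sketch, not Commander's implementation.
class ThreeLayerCache:
    def __init__(self, l3_ttl: float = 3600.0):
        self.l1 = {}          # L1 (Struct): workflows, patterns, config
        self.l2 = {}          # L2 (Query): search/injection results
        self.l3 = {}          # L3 (LLM): prompt -> (response, stored_at)
        self.l3_ttl = l3_ttl  # seconds an LLM response stays valid

    def get(self, key: str):
        """Check L1, then L2, then L3 (honoring the TTL)."""
        if key in self.l1:
            return self.l1[key]
        if key in self.l2:
            return self.l2[key]
        hit = self.l3.get(key)
        if hit is not None:
            response, stored_at = hit
            if time.time() - stored_at < self.l3_ttl:
                return response
            del self.l3[key]  # expired: evict and treat as a miss
        return None

    def put_llm(self, prompt: str, response: str) -> None:
        self.l3[prompt] = (response, time.time())
```

Only L3 carries a TTL because LLM responses go stale, while L1 structs are pre-loaded per run and can be invalidated wholesale instead.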
```bash
murc cache stats
```