Council v2 is the default chat mode (CLAUDE.md invariant #1). The diagram shows the full pipeline for one chat turn: user message → pre-LLM short-circuits (canonical intent regex, forced-tool allowlist, out-of-scope guard) → role gate (X-Role-Id palette) plus auto-context (entity summaries) → three LLMs in parallel (Claude Sonnet lead, Kimi peer, Llama peer) → anonymized peer review → chairman synthesis → tool dispatch and withSources wrapping → response. Telemetry rows are written at every layer.

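
The stage order above can be sketched in TypeScript. This is a minimal illustration under stated assumptions, not the actual Council v2 code: `ModelCall`, `CouncilDeps`, `shortCircuit`, `anonymize`, and `runCouncil` are hypothetical names, the greeting regex is a stand-in for the canonical intent regex, and the peer-review round is reduced to relabelling drafts before the chairman call. The role gate, auto-context, tool dispatch, and telemetry are omitted.

```typescript
// Hypothetical sketch: only the stage order is taken from the pipeline description.
type ModelCall = (prompt: string) => Promise<string>;

interface CouncilDeps {
  lead: ModelCall;     // e.g. the Claude Sonnet lead
  peers: ModelCall[];  // e.g. the Kimi and Llama peers
  chairman: ModelCall; // second call that synthesizes the final answer
}

// Pre-LLM short-circuit: answer without calling any model when a
// canonical intent matches. The greeting pattern is a placeholder.
function shortCircuit(message: string): string | null {
  if (/^\s*(hi|hello)\s*$/i.test(message)) return "Hello! How can I help?";
  return null;
}

// Replace model identities with neutral labels (Response A, B, C, ...)
// so the reviewing step cannot tell which model wrote which draft.
function anonymize(drafts: string[]): string[] {
  return drafts.map((draft, i) => `Response ${String.fromCharCode(65 + i)}: ${draft}`);
}

async function runCouncil(message: string, deps: CouncilDeps): Promise<string> {
  const canned = shortCircuit(message);
  if (canned !== null) return canned;

  // Fan out to the lead and all peers in parallel.
  const drafts = await Promise.all(
    [deps.lead, ...deps.peers].map((call) => call(message)),
  );

  // The chairman sees only anonymized drafts.
  const labelled = anonymize(drafts);
  return deps.chairman(`Question: ${message}\n${labelled.join("\n")}`);
}
```

`Promise.all` rejects as soon as any one model call fails; a production pipeline might prefer `Promise.allSettled` so that a single peer outage does not fail the whole turn.
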
| Model | Provider | Role | Cost / query |
|---|---|---|---|
| claude-sonnet-4.6 | Anthropic | Lead | ~$0.003 |
| kimi-k2.5 | Moonshot (via Cursor) | Peer reviewer | $0.000 (free tier) |
| @cf/meta/llama-3.3-70b-instruct | Cloudflare Workers AI | Peer reviewer | ~$0.0001 |
| claude-sonnet-4.6 (2nd call) | Anthropic | Chairman synthesis | ~$0.003 |

| Component | Avg cost / query | Notes |
|---|---|---|
| Lead model (Claude Sonnet) | $0.003 | ~5k input + ~500 output tokens |
| Kimi K2.5 peer | $0.000 | Cursor free tier |
| Llama 3.3 70B peer | $0.0001 | Workers AI billing |
| Chairman synthesis (Claude) | $0.003 | Same model, 2nd call |
| Tool calls (D1 reads) | ~$0.0002 | Mostly within free tier |
| Embedding for retrieval | ~$0.00005 | ~3 embedding calls |
| Total avg | ~$0.007 | Per chat turn |
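
As a check on the arithmetic, the per-component averages can be summed directly. The sketch below is illustrative only: the figures are copied from the cost table, while `components`, `totalMicros`, and `projectedCostUsd` are hypothetical names, and the 10,000-turn volume in the usage line is an arbitrary example rather than a measured figure.

```typescript
// Per-query component costs in USD, copied from the cost table.
const components: Array<{ name: string; usd: number }> = [
  { name: "Lead model (Claude Sonnet)", usd: 0.003 },
  { name: "Kimi K2.5 peer", usd: 0.0 },
  { name: "Llama 3.3 70B peer", usd: 0.0001 },
  { name: "Chairman synthesis (Claude)", usd: 0.003 },
  { name: "Tool calls (D1 reads)", usd: 0.0002 },
  { name: "Embedding for retrieval", usd: 0.00005 },
];

// Sum in whole micro-dollars so floating-point drift cannot accumulate.
function totalMicros(items: Array<{ name: string; usd: number }>): number {
  return items.reduce((sum, item) => sum + Math.round(item.usd * 1e6), 0);
}

// Average cost of one chat turn in USD.
const perTurnUsd = totalMicros(components) / 1e6;

// Projected spend for a given number of chat turns (hypothetical volume).
function projectedCostUsd(turns: number): number {
  return (totalMicros(components) * turns) / 1e6;
}

console.log(perTurnUsd.toFixed(5));   // per-turn total of the listed components
console.log(projectedCostUsd(10000)); // spend for 10,000 turns at that average
```

The listed components sum to $0.00635 per turn, slightly below the table's rounded ~$0.007. The two Claude calls account for roughly 94% of that total, so the lead and chairman steps are the ones that matter for cost.
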

| Color / arrow style | Meaning |
|---|---|
| frontend | User-facing surface (chat UI, admin HTML pages) |
| backend | Worker logic / agent code / business rules |
| database | D1 table / R2 object / KV key / Vectorize index |
| cloud | External system (NetSuite, Anthropic, etc.) |
| security | Gate / policy / HITL approval / kill switch |
| messagebus | Event ledger, Queues, async fan-out |
| external | Inbound source (email, webhook, cron tick, user input) |
| → solid | Synchronous call (request → response) |
| → green | Approved / happy-path |
| → red dashed | Policy or security check |
| → grey dashed | Optional / conditional / async |