Chat: Council v2 routing pipeline · replaces R91

POST /api/chat · 3 LLMs in parallel · anonymized peer review · ~$0.007/query

Council v2 is the default chat mode (CLAUDE.md invariant #1). This page shows the full pipeline: user message → pre-LLM short-circuits (canonical intent regex, forced-tool allowlist, out-of-scope guard) → role gate (X-Role-Id palette) + auto-context (entity summaries) → 3 LLMs in parallel (Claude Sonnet lead + Kimi peer + Llama peer) → anonymized peer review → chairman synthesis → tool dispatch + withSources wrapping → response. Telemetry rows are written at every layer.
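The pre-LLM short-circuit stage can be sketched as an ordered chain of checks, where the first hit skips the council entirely. This is a minimal sketch under assumptions: the function `pre_llm`, the `ShortCircuit` type, and every pattern, command, and action name below are hypothetical, since the real rules are not listed on this page.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ShortCircuit:
    stage: str   # which pre-LLM check fired
    action: str  # what runs instead of the council


# Hypothetical rules for illustration; the real patterns and tool names are not in this page.
CANONICAL_INTENTS = [(re.compile(r"^\s*(hi|hello|help)\s*[!.?]*\s*$", re.I), "canned_reply")]
FORCED_TOOLS = {"/status": "get_status"}
OUT_OF_SCOPE = re.compile(r"\b(weather|lottery)\b", re.I)


def pre_llm(message: str) -> Optional[ShortCircuit]:
    """Run the three checks in order; return the first hit, or None to continue to the council."""
    for pattern, action in CANONICAL_INTENTS:
        if pattern.match(message):
            return ShortCircuit("canonical_intent", action)
    words = message.split()
    if words and words[0] in FORCED_TOOLS:
        return ShortCircuit("forced_tool", FORCED_TOOLS[words[0]])
    if OUT_OF_SCOPE.search(message):
        return ShortCircuit("out_of_scope", "refuse")
    return None
```

A message that returns `None` here would proceed to the role gate and the council. Anything else is answered without an LLM call, which is where the cost saving of this stage would come from.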

0 · Visual flow (7 lanes · 17 nodes)

[Diagram: system flow]

1 · What this is

goal: Document the Council v2 chat routing pipeline that replaced the R91 single-model flow.

layout: vertical layered (user → short-circuits → role gate → council → chairman → tools → telemetry)

lanes: 7 lanes · 17 nodes

cost: ~$0.007/query average (Claude Sonnet $0.003 + Kimi $0 + Llama $0.0001 + chairman $0.003 + tools)

replaces: flows-diagrams/chat-pipeline.html (R91, stale)

2 · The 3 council models

Model	Provider	Role	Cost / query
`claude-sonnet-4.6`	Anthropic	Lead	~$0.003
`kimi-k2.5`	Moonshot (via Cursor)	Peer reviewer	$0.000 (free tier)
`@cf/meta/llama-3.3-70b-instruct`	Cloudflare Workers AI	Peer reviewer	~$0.0001
`claude-sonnet-4.6` (2nd call)	Anthropic	Chairman synthesis	~$0.003
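The fan-out and anonymization steps can be sketched as follows. This is an illustration under assumptions, not the pipeline's actual code: it assumes each of the three models drafts an answer in parallel and that drafts are relabeled before review. `call_model`, `anonymize`, and `run_council` are hypothetical names, and the provider call is a stub.

```python
import asyncio
import random

# Model ids and roles as listed in the table above.
COUNCIL = {
    "claude-sonnet-4.6": "lead",
    "kimi-k2.5": "peer",
    "@cf/meta/llama-3.3-70b-instruct": "peer",
}


async def call_model(model: str, prompt: str) -> str:
    """Hypothetical stub; a real version would call the provider's API."""
    await asyncio.sleep(0)  # yield control, as a network call would
    return f"draft answer to: {prompt}"


def anonymize(drafts, rng):
    """Shuffle drafts and relabel them so reviewers cannot see which model wrote which."""
    models = list(drafts)
    rng.shuffle(models)
    labels = [f"Response {chr(ord('A') + i)}" for i in range(len(models))]
    by_label = {label: drafts[m] for label, m in zip(labels, models)}
    key = dict(zip(labels, models))  # label -> model, kept out of the review packet
    return by_label, key


async def run_council(prompt: str, seed: int = 0):
    models = list(COUNCIL)
    # Fan out: all three calls are in flight at once.
    answers = await asyncio.gather(*(call_model(m, prompt) for m in models))
    by_label, key = anonymize(dict(zip(models, answers)), random.Random(seed))
    # Packet a reviewer or the chairman call would receive; it carries labels only.
    packet = "\n\n".join(f"{label}:\n{text}" for label, text in by_label.items())
    return packet, key
```

Calling `asyncio.run(run_council("..."))` returns the labeled packet plus the label-to-model key. Keeping the key separate from the packet is what makes the review anonymous, while still allowing telemetry to attribute each draft afterwards.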

3 · Cost breakdown (~$0.007/query target)

Component	Avg cost / query	Notes
Lead model (Claude Sonnet)	$0.003	~5k input + ~500 output tokens
Kimi K2.5 peer	$0.000	Cursor free tier
Llama 3.3 70B peer	$0.0001	Workers AI billing
Chairman synthesis (Claude)	$0.003	Same model, 2nd call
Tool calls (D1 reads)	~$0.0002	Mostly within free tier
Embedding for retrieval	~$0.00005	~3 embedding calls
Total avg	~$0.007	Per chat turn
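As a quick check of the arithmetic, the per-component figures in the table can be summed directly. The listed components add up to about $0.0064, slightly under the ~$0.007 figure, which reads as a rounded-up target. The dictionary keys below are shorthand chosen for this sketch. `Decimal` is used to avoid floating-point noise at these magnitudes.

```python
from decimal import Decimal

# Per-query averages copied from the cost table (USD).
COSTS = {
    "lead_claude_sonnet": Decimal("0.003"),
    "kimi_peer": Decimal("0.000"),
    "llama_peer": Decimal("0.0001"),
    "chairman_synthesis": Decimal("0.003"),
    "tool_calls_d1": Decimal("0.0002"),
    "embeddings": Decimal("0.00005"),
}
TARGET = Decimal("0.007")  # the stated per-query target

total = sum(COSTS.values(), Decimal("0"))
per_1000 = total * 1000  # cost of 1,000 chat turns at the table's averages

print(f"total per query: ${total}")       # total per query: $0.00635
print(f"within target:   {total <= TARGET}")
print(f"per 1,000 turns: ${per_1000.normalize()}")
```

The two Claude calls account for $0.006 of the $0.00635, so the lead and chairman calls dominate the per-query cost.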

4 · How to read it

Color / arrow	Meaning
frontend	User-facing surface (chat UI, admin HTML pages)
backend	Worker logic / agent code / business rules
database	D1 table / R2 object / KV key / Vectorize index
cloud	External system (NetSuite, Anthropic, etc.)
security	Gate / policy / HITL approval / kill switch
messagebus	Event ledger, Queues, async fan-out
external	Inbound source (email, webhook, cron tick, user input)
→ solid	Synchronous call (request → response)
→ green	Approved / happy-path
→ red dashed	Policy or security check
→ grey dashed	Optional / conditional / async