Chat pipeline — Council v2 refresh

1 · Lane detail · 9 stages

01 User input SSE

The user posts a message from chat.html to POST /api/chat. The X-Role-Id header determines which tool palette the request can use.

Endpoint

POST /api/chat

Headers

X-Role-Id (admin · pricing · ar · bid · nutrition · production · ops · relationship · order_mgmt · all)

Body

{ messages, session_id, attachment_refs }
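
The request contract above can be sketched as TypeScript types. The role list is copied from the header description. The field types inside the body, the ChatMessage shape, and the parseRoleId helper (including its trim-and-lowercase normalization) are assumptions for illustration, not the actual handler code.

```typescript
// Role IDs as listed for the X-Role-Id header.
const ROLE_IDS = [
  "admin", "pricing", "ar", "bid", "nutrition",
  "production", "ops", "relationship", "order_mgmt", "all",
] as const;

type RoleId = (typeof ROLE_IDS)[number];

// Assumed message shape; the source only names the three body keys.
interface ChatMessage {
  role: "user" | "assistant" | "system";
  content: string;
}

// Assumed field types for the POST /api/chat body.
interface ChatRequestBody {
  messages: ChatMessage[];
  session_id: string;
  attachment_refs: string[];
}

// Hypothetical helper: returns a RoleId, or null when the header is
// missing or not one of the ten known roles.
function parseRoleId(header: string | null | undefined): RoleId | null {
  if (!header) return null;
  const value = header.trim().toLowerCase();
  const known = ROLE_IDS as readonly string[];
  return known.indexOf(value) !== -1 ? (value as RoleId) : null;
}
```

Returning null rather than defaulting to a role keeps the decision about unknown roles with the caller; whether the real endpoint rejects or falls back is not stated in the source.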

02 Canonical regex pre-classifier R556

About 25 regex patterns are checked before any model is called. On a match, the forced tool is invoked and the LLM is skipped entirely. This is the cheapest path and saves ~$0.007 per matched query.

Table

canonical_intents

Effect

short-circuit LLM, jump straight to executeChatTool

Cite

R556
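
A minimal sketch of the short-circuit described above, assuming each canonical_intents row pairs one regex with one forced tool. The two patterns, the tool names, and the preClassify function are hypothetical; the real table holds roughly 25 patterns whose contents are not shown in this document.

```typescript
// Hypothetical shape of one canonical_intents row.
interface CanonicalIntent {
  pattern: RegExp;
  tool: string; // forced tool to invoke on a match
}

// Illustrative patterns only. No "g" flag, so RegExp.test stays stateless.
const CANONICAL_INTENTS: CanonicalIntent[] = [
  { pattern: /^\s*(what is|show)\s+(the\s+)?price\s+(of|for)\s+.+/i, tool: "lookup_price" },
  { pattern: /^\s*(open|outstanding)\s+invoices\b/i, tool: "list_open_invoices" },
];

type Route =
  | { kind: "forced_tool"; tool: string } // skip the LLM, go to executeChatTool
  | { kind: "llm" };                      // fall through to the council

// First matching pattern wins; no match means the normal LLM path.
function preClassify(
  message: string,
  intents: CanonicalIntent[] = CANONICAL_INTENTS,
): Route {
  for (const intent of intents) {
    if (intent.pattern.test(message)) {
      return { kind: "forced_tool", tool: intent.tool };
    }
  }
  return { kind: "llm" };
}
```

Anchoring the patterns at the start of the message keeps accidental matches rare, which matters because a false positive here bypasses every model.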

03 SYSTEM_FORCED_TOOLS allowlist

Tools that the LLM is always permitted to call. The allowlist prevents the LLM from suppressing critical lookups.

Source

src/chat_tools/prompt.ts

Effect

bypasses scope guard for listed tools
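
The bypass can be sketched as a check that runs before the scope guard. This is an illustration under assumptions: the tool names are invented, the scope guard is modeled as a plain list of in-scope tool names, and isToolCallable is a hypothetical function rather than anything exported by src/chat_tools/prompt.ts.

```typescript
// Hypothetical entries; the real list lives in src/chat_tools/prompt.ts.
const SYSTEM_FORCED_TOOLS: readonly string[] = [
  "get_customer_summary",
  "get_order_status",
];

// Stand-in for the scope guard: a per-request list of in-scope tools.
function isToolCallable(
  tool: string,
  inScope: readonly string[],
  forced: readonly string[] = SYSTEM_FORCED_TOOLS,
): boolean {
  // Allowlisted tools skip the scope guard entirely.
  if (forced.indexOf(tool) !== -1) return true;
  // Everything else must pass the scope guard.
  return inScope.indexOf(tool) !== -1;
}
```

The sketch says nothing about how the allowlist interacts with the role gate in stage 04; the source only states that the scope guard is bypassed.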

04 Role gate + auto-context 2 parallel checks

The role gate filters the tool palette by X-Role-Id. In parallel, auto-context extracts entity names by regex and injects a cached entity summary (5-minute KV TTL) into the system prompt.

Role table

tool_role_palettes · 10 roles

Auto-context

src/lib/auto_context.ts · KV 5min TTL

Function

getCachedAutoContext()
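
Both checks can be sketched in a few lines. filterToolsForRole is named elsewhere in this document, but its signature here is a guess, and the palette rows are invented. AutoContextCache is a hypothetical in-memory stand-in for the KV store with an injectable clock; it is not the real getCachedAutoContext().

```typescript
type Palette = Record<string, readonly string[]>;

// Hypothetical rows; the real ones live in tool_role_palettes.
const PALETTES: Palette = {
  pricing: ["lookup_price", "get_customer_summary"],
  ar: ["list_open_invoices", "get_customer_summary"],
};

// Assumed semantics: a role sees only its palette; unknown roles see nothing.
function filterToolsForRole(
  tools: readonly string[],
  role: string,
  palettes: Palette = PALETTES,
): string[] {
  const allowed = palettes[role] ?? [];
  return tools.filter((t) => allowed.indexOf(t) !== -1);
}

const TTL_MS = 5 * 60 * 1000; // 5-minute TTL, as stated in the source

interface CacheEntry {
  value: string;
  expiresAt: number; // epoch milliseconds
}

// In-memory stand-in for the KV store.
class AutoContextCache {
  private entries = new Map<string, CacheEntry>();
  private now: () => number;

  constructor(now: () => number = () => Date.now()) {
    this.now = now;
  }

  // Returns the cached summary, or calls load() and caches the result.
  get(entity: string, load: (entity: string) => string): string {
    const hit = this.entries.get(entity);
    if (hit && hit.expiresAt > this.now()) return hit.value;
    const value = load(entity);
    this.entries.set(entity, { value, expiresAt: this.now() + TTL_MS });
    return value;
  }
}
```

With a 5-minute TTL, repeated questions about the same entity inside one conversation reuse the summary instead of rebuilding it on every turn.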

05 3-model dispatch Council v2

Claude Sonnet 4.6 (lead), Kimi K2.5 (peer, free), and Workers AI Llama 3.3 70B (peer) all run in parallel on the same input.

Lead

claude-sonnet-4.6 · Anthropic · ~$0.003/q

Peer · Kimi

kimi-k2.5 · Moonshot via Cursor · $0 (free tier)

Peer · Llama

@cf/meta/llama-3.3-70b-instruct · ~$0.0001/q

Cite

R39 commit
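
The parallel fan-out can be sketched with Promise.all. The ModelCall type and dispatchCouncil function are hypothetical, and the decision to record a failing model instead of aborting the whole council is an assumption; the source does not describe failure handling.

```typescript
// One model client: takes the prompt, resolves to the answer text.
type ModelCall = (prompt: string) => Promise<string>;

interface CouncilAnswer {
  model: string;
  ok: boolean;
  text: string; // answer text, or the error message when ok is false
}

// Starts every model on the same prompt before awaiting any of them.
async function dispatchCouncil(
  prompt: string,
  models: Record<string, ModelCall>,
): Promise<CouncilAnswer[]> {
  const pending = Object.keys(models).map(
    async (model): Promise<CouncilAnswer> => {
      try {
        const text = await models[model](prompt);
        return { model, ok: true, text };
      } catch (err) {
        // Assumed policy: keep going with the remaining models.
        return { model, ok: false, text: String(err) };
      }
    },
  );
  return Promise.all(pending);
}
```

Because each call is wrapped in its own try/catch, Promise.all never rejects here, and total latency is bounded by the slowest model rather than the sum of all three.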

06 Anonymized peer review R39

After the parallel answers return, each model receives the other two answers without attribution, rates which is best, and revises its own answer. Withholding attribution is meant to prevent groupthink.

Anonymization

model names stripped from peer outputs

Goal

independent reasoning + groupthink prevention

Cite

R39 commit (council v2 + peer review)
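
A sketch of the anonymization step, assuming it amounts to replacing model names with neutral labels before the answers are handed to each reviewer. The Answer and ReviewPacket types and the buildReviewPackets function are hypothetical.

```typescript
interface Answer {
  model: string;
  text: string;
}

interface ReviewPacket {
  reviewer: string; // model that receives this packet
  candidates: { label: string; text: string }[]; // peers' answers, unattributed
}

// For each model, bundle the other answers under neutral labels.
function buildReviewPackets(answers: Answer[]): ReviewPacket[] {
  return answers.map((reviewer) => ({
    reviewer: reviewer.model,
    candidates: answers
      .filter((a) => a.model !== reviewer.model)
      .map((a, i) => ({
        label: "Answer " + String.fromCharCode(65 + i), // A, B, ...
        text: a.text,
      })),
  }));
}
```

This only removes the attribution metadata. An answer whose text names its own model would still leak identity, and the sketch does not shuffle candidate order; whether the real pass handles either case is not stated.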

07 Chairman synthesis 2nd Claude call

A 4th call (Claude Sonnet) sees the 3 peer-reviewed answers and their ratings, then picks the best or synthesizes a hybrid. It emits the final answer plus tool_calls[].

Model

claude-sonnet-4.6 (2nd call)

Input

3 peer-reviewed answers + ratings

Output

final answer text + tool_calls[]

Cost

~$0.003/query (the 2nd Claude call · biggest single cost driver)
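
One way to picture the chairman's input is as a text block that lists the reviewed answers with their ratings. Everything here is an assumption: the source does not say how ratings are encoded, so the numeric scores, the ReviewedAnswer type, and buildChairmanInput are purely illustrative.

```typescript
interface ReviewedAnswer {
  label: string;     // anonymized label, e.g. "Answer A"
  text: string;      // answer after the peer review pass
  ratings: number[]; // assumed numeric scores from the reviewers
}

// Mean of the scores; 0 when no reviewer rated the answer.
function meanRating(a: ReviewedAnswer): number {
  if (a.ratings.length === 0) return 0;
  return a.ratings.reduce((sum, r) => sum + r, 0) / a.ratings.length;
}

// Builds the text handed to the chairman model, best-rated first.
function buildChairmanInput(answers: ReviewedAnswer[]): string {
  return answers
    .slice() // do not mutate the caller's array
    .sort((x, y) => meanRating(y) - meanRating(x))
    .map((a) => `${a.label} (mean rating ${meanRating(a).toFixed(2)}):\n${a.text}`)
    .join("\n\n");
}
```

The ordering is only a presentation choice; the chairman model still decides whether to pick one answer or synthesize a hybrid.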

08 executeChatTool + withSources 175+ tools

Each tool_call is dispatched via executeChatTool. Return values are wrapped with withSources to drive citation chips. HITL (human-in-the-loop) writes land in proposed_actions rather than being executed inline.

Source

src/chat_tools/impls.ts

Catalog

175+ tools, role-gated

withSources fields

_meta.sources[] · _meta.as_of · _meta.retrieval_path
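
A sketch of the wrapper and the HITL branch. The three _meta field names come from the list above; the result key, the SourceRef shape, the function signatures, and the executeOrPropose helper are assumptions and may differ from src/chat_tools/impls.ts.

```typescript
// Assumed shape of one citation source.
interface SourceRef {
  table: string;
  id: string;
}

interface WithSources<T> {
  result: T;
  _meta: {
    sources: SourceRef[];
    as_of: string;          // ISO-8601 timestamp
    retrieval_path: string; // which tool or cache produced the data
  };
}

// Wraps a tool result with the metadata that citation chips need.
function withSources<T>(
  result: T,
  sources: SourceRef[],
  retrievalPath: string,
  asOf: Date = new Date(),
): WithSources<T> {
  return {
    result,
    _meta: { sources, as_of: asOf.toISOString(), retrieval_path: retrievalPath },
  };
}

// Hypothetical row queued for human approval.
interface ProposedAction {
  tool: string;
  args: unknown;
  status: "pending";
}

// Writes are queued instead of executed; reads run inline.
function executeOrPropose(
  tool: string,
  args: unknown,
  isWrite: boolean,
  queue: ProposedAction[],
  run: (tool: string, args: unknown) => unknown,
): unknown {
  if (isWrite) {
    queue.push({ tool, args, status: "pending" });
    return { proposed: true };
  }
  return run(tool, args);
}
```

Keeping as_of on every result lets the UI show how fresh a cited value is without a second lookup.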

09 Response + telemetry SSE

The JSON response is streamed back via SSE. Each chat call writes one row to chat_decision_audit, 8-12 rows to routing_layer_telemetry, and 2 rows to chat_messages.

Stream

SSE · { answer, tool_results, sources, _meta }

Telemetry tables

chat_decision_audit · routing_layer_telemetry · chat_messages

Used by

GET /api/ai/metrics · admin dashboard · ai-cost-surface-live
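
The stream payload can be framed for SSE as below. The "data:" and "event:" field syntax and the blank-line terminator follow the Server-Sent Events format; the payload field types and the toSseFrame helper are assumptions, since the source lists only the four top-level keys.

```typescript
// Assumed types for the four keys named in the stream description.
interface ChatStreamPayload {
  answer: string;
  tool_results: unknown[];
  sources: unknown[];
  _meta: Record<string, unknown>;
}

// Serializes one payload as a single SSE event.
function toSseFrame(payload: ChatStreamPayload, event?: string): string {
  const lines: string[] = [];
  if (event) lines.push("event: " + event);
  // JSON.stringify escapes newlines, so the JSON fits on one data line.
  lines.push("data: " + JSON.stringify(payload));
  // A blank line terminates the event.
  return lines.join("\n") + "\n\n";
}
```

Sending the whole payload as one JSON data line keeps client parsing to a single JSON.parse per event.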

R91 (stale)	R39 Council v2 (current)
Single LLM call	3 parallel models + chairman
No peer review	Anonymized peer review pass
No regex pre-classifier	Canonical regex short-circuits LLM
No forced-tool allowlist	SYSTEM_FORCED_TOOLS guarantees critical lookups
Static role check	filterToolsForRole + 10-role palette
No auto-context	getCachedAutoContext() injects entity summary
Plain return	withSources wrap → citation chips
Single audit row	chat_decision_audit + routing_layer_telemetry (8-12 rows)

0 · Council v2 pipeline 9 lanes · 14 nodes · vertical layered

Cost target $0.007/query · 7-day live council_v2 avg $0.0165

2 · 3 council models · cost reference

3 · What changed from R91

4 · Open gaps

Model	Provider	Role	Cost / query
`claude-sonnet-4.6`	Anthropic	Lead	~$0.003
`kimi-k2.5`	Moonshot (via Cursor)	Peer reviewer	$0.000 (free tier)
`@cf/meta/llama-3.3-70b-instruct`	Cloudflare Workers AI	Peer reviewer	~$0.0001
`claude-sonnet-4.6` (chairman)	Anthropic	Synthesis (2nd Claude call)	~$0.003
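
The figures in the table can be checked against the stated cost target with a little arithmetic. The numbers are copied from this document; the variable and function names are illustrative. The table does not itemize the peer review pass, so the sum covers only the four listed calls.

```typescript
// Per-call cost estimates in USD, copied from the cost reference table.
const PER_CALL_COST_USD = {
  lead: 0.003,     // claude-sonnet-4.6
  kimi: 0,         // kimi-k2.5, free tier
  llama: 0.0001,   // @cf/meta/llama-3.3-70b-instruct
  chairman: 0.003, // claude-sonnet-4.6, 2nd call
};

const COST_TARGET_USD = 0.007; // stated target per query
const LIVE_AVG_USD = 0.0165;   // stated 7-day live council_v2 average

function sumCosts(costs: Record<string, number>): number {
  return Object.keys(costs).reduce((sum, key) => sum + costs[key], 0);
}

const listedTotal = sumCosts(PER_CALL_COST_USD);     // about 0.0061
const headroom = COST_TARGET_USD - listedTotal;      // about 0.0009
const liveVsTarget = LIVE_AVG_USD / COST_TARGET_USD; // about 2.36
```

The listed calls sum to roughly $0.0061, just under the $0.007 target, while the live 7-day average of $0.0165 is about 2.4 times the target.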