Chat Pipeline (R91) Diagram

Closed-loop guarantees

• PII never enters LLM context (scrubbed before the model call)
• Tools return data; the LLM narrates it (Pattern #12)
• Citations are validated against tool results (R76)
• Memory rules enforced post-generation (R87)
• Every request logged to chat_messages + ai_audit_log
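
Two of the guarantees above, PII scrubbing and citation validation, can be illustrated with a minimal sketch. The regular expressions, placeholder tokens, bracketed citation format, and function names are all assumptions made for illustration; the diagram does not specify how the real scrubber or the R76 validator work.

```python
import re

# Hypothetical patterns; a production scrubber would cover many more PII types.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")

# Assumed citation format: a record ID in square brackets, e.g. [inv_1].
CITATION_RE = re.compile(r"\[([A-Za-z0-9_-]+)\]")


def scrub_pii(text: str) -> str:
    """Replace emails and US-style phone numbers before text reaches the model."""
    text = EMAIL_RE.sub("<EMAIL>", text)
    return PHONE_RE.sub("<PHONE>", text)


def invalid_citations(answer: str, tool_result_ids: set[str]) -> list[str]:
    """Return cited IDs that do not appear among the tool result IDs."""
    cited = CITATION_RE.findall(answer)
    return [c for c in cited if c not in tool_result_ids]
```

An answer would pass validation only when `invalid_citations` returns an empty list; what the pipeline does with a failing answer is not stated in the diagram.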

Pre-LLM short-circuits

• Canonical pattern match → forced tool, skip classifier
• Off-topic / injection → reject before model spend
• Decision few-shot examples ground output in past precedents
• Vector retrieval narrows context to relevant namespace
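
The first two short-circuits can be sketched as a routing function that runs before any model call. The patterns, tool names, and the `Route` type are hypothetical, and checking for injection before canonical patterns is an assumed ordering that the diagram does not confirm.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical canonical patterns mapped to hypothetical tool names.
CANONICAL_PATTERNS = [
    (re.compile(r"\bopen (purchase orders|pos)\b", re.I), "list_open_purchase_orders"),
    (re.compile(r"\binvoice\s+#?\d+\b", re.I), "get_invoice"),
]

# Hypothetical injection signatures; real detection is likely broader.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all )?(previous|prior) instructions", re.I),
    re.compile(r"reveal (your )?system prompt", re.I),
]


@dataclass
class Route:
    action: str                 # "reject", "forced_tool", or "classify"
    tool: Optional[str] = None  # set only when action == "forced_tool"


def route(message: str) -> Route:
    """Decide how to handle a message without spending any model tokens."""
    # Reject suspected injection first (assumed ordering).
    if any(p.search(message) for p in INJECTION_PATTERNS):
        return Route("reject")
    # A canonical match forces a tool and skips the classifier.
    for pattern, tool in CANONICAL_PATTERNS:
        if pattern.search(message):
            return Route("forced_tool", tool)
    # Otherwise fall through to the normal classifier path.
    return Route("classify")
```

For example, `route("Show invoice #1042")` selects the hypothetical `get_invoice` tool directly, while a message matching neither pattern list falls through to the classifier.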

Cost & latency

• Council v2 default: ~$0.007/query
• Single-model role: ~$0.001/query
• Reflexion retry adds ~1.5 s when a SuiteQL query fails (rare)
• Streaming reduces perceived latency
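
The per-query figures above can be turned into a rough budget estimate. The costs and the 1.5 s retry penalty come from the list; the query volume, the traffic split, and the retry rate used in the usage note are hypothetical inputs, since the diagram only calls retries "rare".

```python
# Per-query costs and retry penalty taken from the figures listed above.
COST_PER_QUERY_USD = {"council_v2": 0.007, "single_model": 0.001}
REFLEXION_RETRY_SECONDS = 1.5


def estimated_cost_usd(queries: int, council_share: float) -> float:
    """Blend the two per-query costs by the share of traffic using Council v2."""
    council_queries = queries * council_share
    single_queries = queries - council_queries
    return (council_queries * COST_PER_QUERY_USD["council_v2"]
            + single_queries * COST_PER_QUERY_USD["single_model"])


def expected_retry_latency_s(retry_rate: float) -> float:
    """Average latency added per query, given an assumed retry rate."""
    return retry_rate * REFLEXION_RETRY_SECONDS
```

With a hypothetical 10,000 queries all on Council v2, `estimated_cost_usd(10_000, 1.0)` comes to about $70, against about $10 if every query used a single-model role. At an assumed 2% retry rate, the retry adds about 0.03 s per query on average.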

Failure modes caught

• Hallucination → caught by citation validator
• Wrong tool selection → canonical pattern forces the correct tool
• Stale data → recency ranking (R71) surfaces newest
• Tool execution failure → single Reflexion retry
• Memory violation → Constitutional review blocks
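
The single-retry behaviour on tool failure can be sketched as below. This is an illustration under assumptions, not the pipeline's actual Reflexion step: `ToolError`, `run_with_single_retry`, and the `execute` and `revise` callables are hypothetical names, and in practice `revise` would presumably ask the model to repair the query using the error message.

```python
from typing import Any, Callable


class ToolError(Exception):
    """Hypothetical error raised when a tool call (e.g. a SuiteQL query) fails."""


def run_with_single_retry(
    query: str,
    execute: Callable[[str], Any],
    revise: Callable[[str, str], str],
) -> Any:
    """Run the tool; on failure, revise the query from the error and retry once."""
    try:
        return execute(query)
    except ToolError as err:
        # Feed the error text back so the revision can address the failure.
        revised = revise(query, str(err))
        # No second try/except: a repeated failure propagates to the caller.
        return execute(revised)
```

Capping the loop at one retry keeps the worst-case added latency bounded, consistent with the single retry named in the list above.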