Chat Pipeline (R91)

End-to-end chat handler · Guardrails → Pattern → Few-shot → Retrieval → Council v2 → Tools → Constitutional → Citations

01 Input
  • User question: chat.html · role selected
  • Pillar resolution: R85 ROLE_FILTERS lookup
  • POST /api/chat: src/index.ts handler

02 Pre-checks (Guardrails)
  • PII scrub: R75-D · redact phone/SSN/etc
  • Prompt injection screen: R75-D · jailbreak detect
  • Off-topic redirect: R75-D · scope check

03 Retrieval (Pattern + Few-shot + Vector)
  • Canonical pattern (R61): regex pre-classifier · forced tool
  • Decision few-shot (R89): top-5 precedents by recency+confidence
  • Vectorize (8 ns) + hybrid: spec · pricing · contacts · bids …

04 Generation (Council + Tools)
  • Council v2 (R39): 3 models // anon peer review
  • Tool execution: role-gated R85 · scoped subset
  • Reflexion retry (R87): single SuiteQL retry on failure

05 Post-checks (Validation)
  • Multi-source validate: R71 recency + R72 master + invoice
  • Constitutional review (R87): checks against memory rules
  • Citation validator (R76): tool-result match + char-precise

06 Output
  • Citations API (R75-B): char-precise refs in stream
  • Streaming SSE / response: user sees answer
  • Audit logs: chat_messages · ai_audit_log · counci…

Legend: User UI · Worker / agent · D1 table / store · CF binding · Tool / action · External system · Policy / gate

Closed-loop guarantees

  • PII never enters LLM context (scrub before model)
  • Tools return data · LLM narrates (Pattern #12)
  • Citations are validated against tool results (R76)
  • Memory rules enforced post-gen (R87)
  • Every request logged to chat_messages + ai_audit_log

Pre-LLM short-circuits

  • Canonical pattern match → forced tool, skip classifier
  • Off-topic / injection → reject before model spend
  • Decision few-shot grounds output in past precedents
  • Vector retrieval narrows context to relevant namespace
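The canonical-pattern short-circuit above could be sketched as a small regex table consulted before any classifier call. The regexes and the tool names `lookup_invoice` and `lookup_pricing` are invented for illustration and are not taken from the R61 rule set.

```typescript
// Hypothetical canonical patterns. Regexes and tool names are assumptions
// for this sketch, not the real R61 rules.
type CanonicalPattern = { pattern: RegExp; tool: string };

const CANONICAL_PATTERNS: CanonicalPattern[] = [
  { pattern: /\binvoice\s+#?\d+\b/i, tool: "lookup_invoice" },
  { pattern: /\b(price|pricing|quote)\b/i, tool: "lookup_pricing" },
];

type Route =
  | { kind: "forced_tool"; tool: string } // pattern matched: skip the classifier
  | { kind: "classifier" };               // no match: fall through to normal routing

// Returns the first matching forced tool, or defers to the classifier.
function routeQuestion(question: string): Route {
  for (const { pattern, tool } of CANONICAL_PATTERNS) {
    if (pattern.test(question)) {
      return { kind: "forced_tool", tool };
    }
  }
  return { kind: "classifier" };
}

console.log(routeQuestion("Status of invoice #4821?"));
console.log(routeQuestion("Who is the site contact?"));
```

Because the first match wins, the order of the table matters. Under this design, more specific patterns would be listed ahead of broader ones.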

Cost & latency

  • Council v2 default: ~$0.007/query
  • Single-model role: ~$0.001/query
  • Reflexion adds ~1.5s on SuiteQL fail (rare)
  • Streaming reduces perceived latency
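As a rough worked example of the per-query figures above, the sketch below estimates spend for a mix of council and single-model queries. The per-query costs are the approximate values listed. The traffic volumes in the usage line are made up, and the function name `estimateCostUsd` is hypothetical.

```typescript
// Approximate per-query costs from the list above (USD).
const COST_PER_QUERY_USD = { council: 0.007, single: 0.001 } as const;

// Estimated spend for a given number of council and single-model queries.
function estimateCostUsd(councilQueries: number, singleQueries: number): number {
  return (
    councilQueries * COST_PER_QUERY_USD.council +
    singleQueries * COST_PER_QUERY_USD.single
  );
}

// Illustrative volumes only: 10,000 council queries and 5,000 single-model queries.
console.log(estimateCostUsd(10000, 5000).toFixed(2)); // "75.00"
```

At these assumed volumes the council queries account for about $70 of the $75, which shows why routing a role to a single model matters once traffic grows.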

Failure modes caught

  • Hallucination → caught by citation validator
  • Wrong tool selection → forced by canonical pattern
  • Stale data → recency ranking (R71) surfaces newest
  • Tool exec failure → Reflexion single retry
  • Memory violation → Constitutional review blocks
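The hallucination check in the first item can be pictured as a character-precise comparison between each citation and the tool result it points at. The sketch below assumes hypothetical `Citation` and `ToolResults` shapes. The field names are not the R76 validator's real types, and the real validator may apply further checks.

```typescript
// Hypothetical shapes for this sketch, not the R76 validator's real types.
type Citation = { sourceId: string; start: number; end: number; quote: string };
type ToolResults = Record<string, string>; // tool result text keyed by source id

// A citation is valid only if the quoted text appears at exactly the cited
// character range of the referenced tool result.
function validateCitation(c: Citation, results: ToolResults): boolean {
  const source: string | undefined = results[c.sourceId];
  if (source === undefined) return false; // cites a tool result that does not exist
  if (c.start < 0 || c.end > source.length || c.start >= c.end) return false;
  return source.slice(c.start, c.end) === c.quote;
}

// Returns the citations that fail validation, so the caller can block or
// regenerate the answer.
function invalidCitations(citations: Citation[], results: ToolResults): Citation[] {
  return citations.filter((c) => !validateCitation(c, results));
}

// Usage with a made-up tool result: the second citation quotes a figure
// that is not in the source, so it is flagged.
const demoResults: ToolResults = { "suiteql-1": "Invoice 4821 total: 1250.00 USD" };
const demoStart = demoResults["suiteql-1"].indexOf("1250.00");
const flagged = invalidCitations(
  [
    { sourceId: "suiteql-1", start: demoStart, end: demoStart + 7, quote: "1250.00" },
    { sourceId: "suiteql-1", start: demoStart, end: demoStart + 7, quote: "1350.00" },
  ],
  demoResults,
);
console.log(flagged.length);
```

Comparing exact character ranges instead of doing a fuzzy text search keeps the check cheap and makes a paraphrased or invented figure fail outright.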