System memory + learning: what AI remembers, how it gets curated

5 memory layers · 4 writers · 5 readers · status state-machine in the middle

Mike asked for a view of the memory + learning system: what the AI remembers and how it gets curated. There are 5 memory layers (reflexion, decision corpus, wiki, training loop, knowledge chunks), with the reflexion status state-machine in the middle. Curation status is what prevents memory poisoning: only reflexion rows with status approved_for_reuse feed retrieval. In the diagram, write paths sit at the top and read paths at the bottom; the read results are stacked into the LLM context block.

0 · Visual flow (4 lanes · 20 nodes)

System flow

01 / Write paths (where memory comes from)

Workflow runner (every run)
Every workflow run writes a reflexion_log row with tags + narrative when contract.reflexion_enabled=1. Default-on; trivial workflows opt out.
  WRITE: reflexion writer
  source: executeWorkflowContract
  table: reflexion_log
  trigger: contract.reflexion_enabled=1
  payload: entity_type, entity_id, observation, tags, run_id, status='needs_review'

HITL approval (Mike approves)
On approve in /proposed-actions.html, the system writes a decision_corpus row with (situation_json, action_json, rationale). The triple becomes vector-searchable and is used as a few-shot example for similar future situations.
  WRITE: decision_corpus writer
  source: HITL approval handler
  table: decision_corpus
  vectorized: yes (embedding of situation + action)
  effect: future chat retrieves these as few-shot examples

Nightly wiki cron (R84, weekly Haiku distill)
Nightly cron summarizes the last 24h of reflexion + decision_corpus into a section of docs/SYSTEM_WIKI.md using Claude Haiku. Output is written to the llm_wiki_log table + md file.
  WRITE: llm_wiki writer
  source: cron @ 03:00 UTC
  model: Claude Haiku
  inputs: reflexion_log + decision_corpus + chat_messages (last 24h)
  outputs: docs/SYSTEM_WIKI.md + llm_wiki_log

Training loop cron (R259, every 4h)
Karpathy-style continuous eval. Every 4h, a cron picks 5 questions from eval/golden-set, runs them through chat, has Haiku grade the answers, and writes a training_loop_runs row.
  WRITE: training_loop writer
  source: cron every 4h
  count: 5 questions / tick
  grader: Claude Haiku
  table: training_loop_runs

02 / Memory layers (5 stores)

reflexion_log (AI run observations · R552 substrate · status-curated)
Per-run observations the AI writes. Each row has entity_type, entity_id, observation text, tags, and a curation status. Status prevents the 'poison memory' problem: only approved_for_reuse rows are pulled into future retrievals.
  LAYER 1: reflexion_log
  schema: entity_type, entity_id, observation, tags, status, run_id, created_at
  status_values: needs_review, approved_for_reuse, rejected, expired, one_time_exception, sensitive, superseded
  writer: workflow runner
  reader: chat retrieval (status='approved_for_reuse' only)

decision_corpus (validated learnings, HITL-approved · few-shot source · vectorized)
(situation_json, action_json, rationale) triples written on HITL approval. Vectorized via Vectorize. Pulled in by similarity for few-shot prompting on similar future situations.
  LAYER 2: decision_corpus
  schema: situation_json, action_json, rationale, embedding_vector, created_at, approved_by
  writer: HITL approval
  reader: chat retrieval by cosine similarity (topK=5)
  index: decision_corpus Vectorize

llm_wiki_log + docs/SYSTEM_WIKI.md (Haiku-distilled knowledge · R84 nightly cron)
Weekly + nightly Haiku-distilled summaries of activity. Written to docs/SYSTEM_WIKI.md (markdown source-of-truth) + llm_wiki_log (D1 row per generation for audit).
  LAYER 3: llm_wiki
  schema (D1): wiki_section, generated_at, source_count, summary_md, model='claude-haiku'
  surface: docs/SYSTEM_WIKI.md (md), llm_wiki_log (D1)
  writer: nightly cron
  reader: chat retrieval grep by topic

training_loop_runs (Karpathy eval cycle · R259 continuous eval)
5 questions per 4h cron. Each run records: question, chat answer, Haiku grade, reasoning, pass/fail. Serves as both audit trail and feedback signal.
  LAYER 4: training_loop_runs
  schema: question_id, answer, grade, reasoning, run_at, model, latency_ms
  cadence: 5 questions / 4h
  source: eval/golden-set/*.yaml
  reader: weekly grade-trend dashboards + tech-debt prioritization

knowledge_chunks (ns_knowledge Vectorize · manifest + ADRs + guide + saved searches · R555 · 3,360 chunks)
All long-form platform knowledge, chunked + embedded into the Vectorize ns_knowledge index. 3,360 chunks today: PLATFORM_INVENTORY.md sections, all 36 ADRs, SYSTEM_WIKI sections, saved_searches catalog, methodology doc.
  LAYER 5: knowledge_chunks
  index: ns_knowledge (Vectorize)
  count: 3,360 chunks (R555)
  sources: manifest, ADRs, system_wiki, saved_searches, methodology
  reader: chat retrieval (topK=8) for off-the-shelf context

03 / Curation cycle (status state-machine for reflexion)

Reflexion curation states shown in the diagram. For every state, next_state depends on Mike's triage or auto-rules.
  CURATION: needs_review · meaning: Mike triages · retrieval_eligible: no
  CURATION: approved_for_reuse · meaning: used in retrieval · retrieval_eligible: yes
  CURATION: rejected · meaning: never retrieved · retrieval_eligible: no
  CURATION: expired · meaning: auto-archived · retrieval_eligible: no
  CURATION: sensitive · meaning: redacted from chat · retrieval_eligible: no

04 / Read paths (how chat retrieves)

Every read path is stacked (appended to the LLM context in fixed order) with a budget of ~2k tokens per layer (truncated).
  READ: Recent reflexion · filter: WHERE status=approved_for_reuse
  READ: decision_corpus · filter: topK=5 by cosine
  READ: wiki sections · filter: grep by topic
  READ: training_loop_runs · filter: last 24h grades
  READ: knowledge_chunks · filter: topK=8 by cosine

LLM context stack (reflexion + decision + wiki + training + knowledge → prompt)
Final stacking: results from all 5 read paths are concatenated in fixed order into the system prompt. Each layer has a token budget (~2k); excess is truncated.
  CONTEXT: LLM stack
  order: 1) recent reflexion, 2) decision_corpus, 3) wiki sections, 4) training stats, 5) knowledge chunks
  budget_per_layer: ~2k tokens
  total_context: ~12k tokens reserved before user message
  note: ordering by recency, then similarity
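The training loop writer picks 5 questions per 4-hour tick, but the flow does not say how they are chosen. The following is a minimal sketch that assumes a round-robin walk over the golden set; `GoldenQuestion`, `pickQuestions`, and the `tick` counter are hypothetical names, not taken from the platform code.

```typescript
// Hypothetical shape of one entry loaded from eval/golden-set/*.yaml.
interface GoldenQuestion {
  id: string;
  question: string;
}

// Pick `perTick` questions for the given cron tick.
// ASSUMPTION: round-robin selection; the real cron may sample differently.
function pickQuestions(
  goldenSet: GoldenQuestion[],
  tick: number,
  perTick: number = 5
): GoldenQuestion[] {
  if (goldenSet.length === 0) return [];
  // Never return the same question twice within one tick.
  const n = Math.min(perTick, goldenSet.length);
  // Each tick starts where the previous one stopped, wrapping around.
  const start = (tick * perTick) % goldenSet.length;
  const picked: GoldenQuestion[] = [];
  for (let i = 0; i < n; i++) {
    const q = goldenSet[(start + i) % goldenSet.length];
    if (q) picked.push(q); // index is always in range; guard satisfies strict checks
  }
  return picked;
}
```

With a 7-question golden set, tick 0 selects questions 0 through 4 and tick 1 wraps around to 5, 6, 0, 1, 2, so every question is exercised before any repeats too often.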

1 · What this is

goal: Explain how AI memory works in the platform: what gets recorded, how it gets curated, how chat retrieves it.
layout: vertical layered (write paths up top, memory layers in the middle, curation + read paths at bottom)
layers: 5 memory stores · status-curated reflexion is the keystone
substrate: R552 (reflexion) + R555 (knowledge_chunks) + R259 (training_loop)

2 · The 5 memory layers

| Layer | Purpose | Writer | Retrieval |
|---|---|---|---|
| reflexion_log | Per-run AI observations, status-curated to prevent poison | Workflow runner (R552) | WHERE status=approved_for_reuse |
| decision_corpus | (situation, action, rationale) triples from approved HITL | HITL approval handler | cosine similarity topK=5 |
| llm_wiki_log + docs/SYSTEM_WIKI.md | Haiku-distilled weekly knowledge | R84 nightly cron | grep by topic |
| training_loop_runs | Karpathy eval cycle: 5 questions / 4h | R259 training cron | recent-N grade trends |
| knowledge_chunks (ns_knowledge) | Vectorized manifest + ADRs + guide + saved searches | Bulk ingest + nightly incremental | cosine similarity topK=8 |
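The reflexion_log write path can be sketched as a small pure function. The column names and the `needs_review` default come from the schema described in this document; the TypeScript types, the `WorkflowContract` and `RunObservation` shapes, and the function name `buildReflexionRow` are assumptions made for illustration.

```typescript
// Status values as listed for reflexion_log in this document.
type ReflexionStatus =
  | "needs_review"
  | "approved_for_reuse"
  | "rejected"
  | "expired"
  | "one_time_exception"
  | "sensitive"
  | "superseded";

// Columns follow the documented schema; the field types are assumed.
interface ReflexionRow {
  entity_type: string;
  entity_id: string;
  observation: string;
  tags: string[];
  status: ReflexionStatus;
  run_id: string;
  created_at: string; // ISO 8601 timestamp (assumed format)
}

// Hypothetical shape of the contract flag that gates the write.
interface WorkflowContract {
  reflexion_enabled: 0 | 1;
}

// Hypothetical input: what a finished run knows about itself.
interface RunObservation {
  entity_type: string;
  entity_id: string;
  observation: string;
  tags: string[];
  run_id: string;
}

// Returns the row to insert, or null when the workflow opted out.
function buildReflexionRow(
  contract: WorkflowContract,
  obs: RunObservation,
  now: Date = new Date()
): ReflexionRow | null {
  if (contract.reflexion_enabled !== 1) return null;
  return {
    ...obs,
    status: "needs_review", // every new row starts uncurated
    created_at: now.toISOString(),
  };
}
```

Keeping the builder free of database calls means the default status is enforced in one place, so no caller can insert a row that is retrieval-eligible from the start.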

3 · Curation status state machine

Reflexion entries flow through these statuses. The status field is what makes reflexion safe: chat retrieval filters on status='approved_for_reuse'. The other states protect future runs from learning the wrong thing.

| Status | Meaning | Retrieval eligible? |
|---|---|---|
| needs_review | Default on insert. Awaiting Mike's triage in proposed-actions / training UI. | No |
| approved_for_reuse | Mike confirmed this is a real learning. Active in retrieval. | Yes |
| rejected | Mike marked it as wrong / noise. Permanent skip. | No |
| expired | Auto-archived after N days without being approved. | No |
| one_time_exception | Real outcome, but not generalizable. Keep for audit; don't replay. | No |
| sensitive | Contains PII / vendor cost / restricted info. Redacted in retrieval. | No |
| superseded | A later, better reflexion replaced this one. Kept for history only. | No |
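The eligibility rule and the auto-expiry rule from the table can be sketched as two small functions. The `CuratedRow` shape and the function names are hypothetical, and `maxAgeDays` stands in for the unspecified "N days".

```typescript
type ReflexionStatus =
  | "needs_review"
  | "approved_for_reuse"
  | "rejected"
  | "expired"
  | "one_time_exception"
  | "sensitive"
  | "superseded";

// Hypothetical minimal row shape for this sketch.
interface CuratedRow {
  observation: string;
  status: ReflexionStatus;
  created_at: string; // ISO 8601 timestamp (assumed format)
}

// Exactly one status is retrieval-eligible, per the table above.
function isRetrievalEligible(status: ReflexionStatus): boolean {
  return status === "approved_for_reuse";
}

// Keep only the rows chat retrieval is allowed to see.
function retrievable(rows: CuratedRow[]): CuratedRow[] {
  return rows.filter((r) => isRetrievalEligible(r.status));
}

// Move needs_review rows older than maxAgeDays to expired.
// ASSUMPTION: maxAgeDays is a placeholder for the document's "N days".
function expireStale(rows: CuratedRow[], now: Date, maxAgeDays: number): CuratedRow[] {
  const cutoffMs = now.getTime() - maxAgeDays * 24 * 60 * 60 * 1000;
  return rows.map((r) =>
    r.status === "needs_review" && Date.parse(r.created_at) < cutoffMs
      ? { ...r, status: "expired" as const }
      : r
  );
}
```

Expiry only touches rows still in `needs_review`, so an old approved row keeps feeding retrieval while an old untriaged row quietly drops out of the queue.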

4 · Read path: how chat assembles context

Every chat call runs the 5 read paths in parallel and stacks them in fixed order into the system prompt:

  1. Recent reflexion — last N approved_for_reuse rows touching the entity
  2. decision_corpus — top-5 nearest neighbors by cosine to the question
  3. wiki sections — grep by detected topic keywords
  4. training_loop_runs — last-24h grade trends (used to bias which tools are emphasized)
  5. knowledge_chunks — top-8 nearest neighbors from ns_knowledge
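Read paths 2 and 5 rank candidates by cosine similarity and keep the top K (5 and 8 respectively). In the platform the vector index presumably does this ranking itself; the in-memory sketch below only illustrates the math, and `Embedded`, `cosine`, and `topK` are hypothetical names.

```typescript
// Hypothetical pairing of a stored item with its embedding vector.
interface Embedded<T> {
  item: T;
  vector: number[];
}

// Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 for a zero vector.
function cosine(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((x, i) => {
    const y = b[i] ?? 0; // lengths match, so the fallback is never used
    dot += x * y;
    normA += x * x;
    normB += y * y;
  });
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Score every candidate against the query and keep the k best.
function topK<T>(
  query: number[],
  corpus: Embedded<T>[],
  k: number
): { item: T; score: number }[] {
  return corpus
    .map((e) => ({ item: e.item, score: cosine(query, e.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

Under these assumptions, `topK(questionVector, decisionRows, 5)` would mirror the decision_corpus read path and `topK(questionVector, chunks, 8)` the knowledge_chunks one.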

Each layer has a ~2k token budget; layers exceeding budget are truncated tail-first. Total memory context: ~12k tokens reserved before the user message.
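The stacking and per-layer budget can be sketched as follows. This is an illustration under stated assumptions: tokens are approximated by whitespace-separated words (a real tokenizer would count differently), "tail-first" is read as dropping content from the end of a layer, and the layer names, bracketed headers, and function names are hypothetical.

```typescript
// Hypothetical result of one read path.
interface LayerResult {
  name: string;
  text: string;
}

// Fixed stacking order from the list above (names are illustrative).
const STACK_ORDER: string[] = [
  "reflexion",
  "decision_corpus",
  "wiki",
  "training_loop_runs",
  "knowledge_chunks",
];

// ASSUMPTION: whitespace-separated words stand in for real tokens.
// Over-budget text loses its tail; the head is kept.
function truncateToBudget(text: string, budget: number): string {
  const tokens = text.split(/\s+/).filter((t) => t.length > 0);
  return tokens.length <= budget ? text : tokens.slice(0, budget).join(" ");
}

// Concatenate layers in fixed order, regardless of the order they arrived in.
function stackContext(results: LayerResult[], budgetPerLayer: number = 2000): string {
  const sections: string[] = [];
  for (const name of STACK_ORDER) {
    const hit = results.find((r) => r.name === name);
    if (hit) {
      sections.push(`[${name}]\n${truncateToBudget(hit.text, budgetPerLayer)}`);
    }
  }
  return sections.join("\n\n");
}
```

Because the order is fixed in `STACK_ORDER` rather than taken from arrival order, the five read paths can finish in any sequence when run in parallel and still produce the same prompt.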

5 · How to read it

| Color / arrow | Meaning |
|---|---|
| frontend | User-facing surface (chat UI, admin HTML pages) |
| backend | Worker logic / agent code / business rules |
| database | D1 table / R2 object / KV key / Vectorize index |
| cloud | External system (NetSuite, Anthropic, etc.) |
| security | Gate / policy / HITL approval / kill switch |
| messagebus | Event ledger, Queues, async fan-out |
| external | Inbound source (email, webhook, cron tick, user input) |
| → solid | Synchronous call (request → response) |
| → green | Approved / happy-path |
| → red dashed | Policy or security check |
| → grey dashed | Optional / conditional / async |