Every AI write goes through a human first
The HITL lifecycle is the proposed_actions state machine. It's the platform's load-bearing safety invariant: no AI-proposed write reaches NetSuite without Mike approving it. ADR-031 codifies this; every tool that mutates business state stages a row here first.
Pre-HITL, AI tools wrote directly. The Cal-Maine incident (silent fraud flag, wrong customer flagged on a phone call) made this non-negotiable. Now the state machine is: pending (AI proposed it) → approved or rejected (Mike decided) → applied (drainer pushed to NS). Every transition writes an audit row.
The diagram lives at substrate-hitl-lifecycle.html. The schema is in migration 113 (R537), which adds the risk tier columns. The R560 fix added the atomic UPDATE...RETURNING that closes the double-approve race.
The three-state lifecycle
- 01. pending: AI proposed, awaiting human. A chat tool or workflow fan-out emits a row with status='pending'. The payload contains the full intended write (entity_type, entity_ref, action_type, payload_json, risk_level). It lands at /proposed-actions.html, which polls every 10s and surfaces it to Mike with full context.
- 02. approved or rejected: the atomic claim. Mike taps Approve or Reject. The decide endpoint runs an atomic UPDATE proposed_actions SET status=?, decided_by=?, decided_at=? WHERE id=? AND status='pending' RETURNING *. The WHERE status='pending' clause is the R560 fix. If two requests race (Mike double-taps the bulk-approve button, or a network retry fires twice), only the first sees RETURNING non-empty. The second becomes a no-op and the UI shows "already decided".
- 03. applied: drainer pushed to NetSuite. Approved rows drop into ns_pending_pushes. The push drainer (cron-triggered or queue-fed) reads pending rows, transforms the payload to the NS RESTlet contract, calls the customscript_gfs_platform_query RESTlet via OAuth1, and on success flips proposed_actions.status='applied'. Rejected rows skip the drainer; status='rejected' is terminal.
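The three steps above can be sketched as a small transition table. This is an illustration only: the type and function names (Status, ProposedAction, canTransition, transition) are hypothetical and not taken from src/index.ts, and the field types are assumptions based on the column names listed in step 01.

```typescript
// Sketch only: names here are illustrative, not the platform's actual code.
type Status = "pending" | "approved" | "rejected" | "applied";

// Columns named in the lifecycle description; the TypeScript types are assumed.
interface ProposedAction {
  id: number;
  status: Status;
  entity_type: string;
  entity_ref: string;
  action_type: string;
  payload_json: string;
  risk_level: number;
}

// Legal moves: pending -> approved | rejected, approved -> applied.
// rejected and applied have no outgoing transitions.
const TRANSITIONS: Record<Status, readonly Status[]> = {
  pending: ["approved", "rejected"],
  approved: ["applied"],
  rejected: [],
  applied: [],
};

function canTransition(from: Status, to: Status): boolean {
  return TRANSITIONS[from].indexOf(to) !== -1;
}

// Returns a copy with the new status, or throws on an illegal move.
function transition(action: ProposedAction, to: Status): ProposedAction {
  if (!canTransition(action.status, to)) {
    throw new Error("illegal transition " + action.status + " -> " + to);
  }
  return { ...action, status: to };
}
```

Keeping the legal moves in one table makes it easy to see that nothing can reach applied without passing through approved first.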
L1–L5 — not all writes are equal
Migration 113 (R537) added the L1–L5 risk tier system. The tier determines the UI band color and whether Mike's usual single-tap approve suffices or a multi-step confirmation is required. The 114_risk_level_trigger.sql trigger sets the tier when the row is inserted; the proposing tool does not.
| Tier | Meaning | Examples | Approval UX |
|---|---|---|---|
| L1 | Trivial | email draft, KV cache bust, note attachment | auto-approve eligible |
| L2 | Low risk | SO line price change < $50, AR note | single tap |
| L3 | Medium | bid line price change, inventory adjustment < $1K, BOM variance | single tap with context preview |
| L4 | High | large inventory adjustment, new customer credit limit, assembly build > $5K | tap + confirm modal |
| L5 | Critical | customer credit revoke, bulk price roll, vendor cost roll | X-Edit-Token + multi-step |
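As a rough illustration, the table's last column can be read as a lookup keyed by tier. The names below (RiskLevel, APPROVAL_UX, approvalUxFor, needsExtraConfirmation) are hypothetical; the real UI may derive this differently.

```typescript
// Sketch only: mirrors the "Approval UX" column of the tier table.
type RiskLevel = 1 | 2 | 3 | 4 | 5;

const APPROVAL_UX: Record<RiskLevel, string> = {
  1: "auto-approve eligible",
  2: "single tap",
  3: "single tap with context preview",
  4: "tap + confirm modal",
  5: "X-Edit-Token + multi-step",
};

function approvalUxFor(level: RiskLevel): string {
  return APPROVAL_UX[level];
}

// L4 and L5 are the tiers where a single tap is not enough.
function needsExtraConfirmation(level: RiskLevel): boolean {
  return level >= 4;
}
```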
Three big red buttons
Mike can disable categories of writes globally via flags in the kill_switches table (or via env vars at startup). Each switch is a hard precondition the decide endpoint and the drainer both check.
- ns_writes: master switch. When off, the drainer halts and no ns_pending_pushes rows are picked up. Approvals still land but pile up in pending-apply.
- proposed_apply: affects the decide endpoint. When off, Mike can still see and triage cards but the Approve button is disabled. Useful for "read-only mode" during incident response.
- high_risk_ops: auto-rejects any incoming proposed_action with risk_level ≥ 4. Lets L1–L3 ops continue while we pause L4–L5.
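A minimal sketch of how the three switches could gate each stage, assuming each flag is a boolean where true means the category is allowed and false means the switch has been thrown. That encoding, and the names KillSwitches, intakeStatus, canApprove and drainerMayRun, are assumptions for illustration; the actual kill_switches table layout is not shown here.

```typescript
// Sketch only: true = allowed, false = switch thrown (assumed encoding).
interface KillSwitches {
  ns_writes: boolean;
  proposed_apply: boolean;
  high_risk_ops: boolean;
}

// Intake: with high_risk_ops thrown, L4 and L5 proposals are auto-rejected.
function intakeStatus(sw: KillSwitches, riskLevel: number): "pending" | "rejected" {
  if (!sw.high_risk_ops && riskLevel >= 4) return "rejected";
  return "pending";
}

// Decide endpoint: with proposed_apply thrown, cards are visible but not approvable.
function canApprove(sw: KillSwitches): boolean {
  return sw.proposed_apply;
}

// Drainer: with ns_writes thrown, no ns_pending_pushes rows are picked up.
function drainerMayRun(sw: KillSwitches): boolean {
  return sw.ns_writes;
}

// Example: pause high-risk ops only; L1 to L3 keep flowing.
const pausedHighRisk: KillSwitches = { ns_writes: true, proposed_apply: true, high_risk_ops: false };
const l3Status = intakeStatus(pausedHighRisk, 3); // "pending"
const l5Status = intakeStatus(pausedHighRisk, 5); // "rejected"
```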
Use the switches during incident response, mass recovery, audit windows, or right after a code deploy that touched the HITL path. The R560 audit found the kill switches were never tested end-to-end; we now have a smoke check that toggles each one and confirms the expected halts.
Mike approves Driscoll's $0.06 bump
Tuesday 14:00. propose_price_change chat tool fires for SKU 10472, $1.42 → $1.48 on B5875. INSERT into proposed_actions with action_type='bid_price_update', entity_type='item', entity_ref='10472', payload_json={..., bid_id:'B5875'}, status='pending'. The 114_risk_level_trigger evaluates: under 5% movement, dollar impact < $1K cumulative cap → risk_level=3.
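The "under 5% movement" condition can be checked by hand for this example. The helper below is a hypothetical illustration, not the trigger's SQL, and it covers only the percentage test; the cumulative dollar cap depends on volumes that are not given here.

```typescript
// Sketch only: percentage price movement for the $1.42 -> $1.48 change.
function percentMove(oldPrice: number, newPrice: number): number {
  return ((newPrice - oldPrice) / oldPrice) * 100;
}

const move = percentMove(1.42, 1.48);         // roughly 4.2 (0.06 / 1.42)
const underFivePercent = Math.abs(move) < 5;  // true for this bump
```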
Mike sees the card with yellow band at 14:04. Taps Approve. The decide endpoint fires UPDATE proposed_actions SET status='approved', decided_by='mike', decided_at=NOW() WHERE id=8421 AND status='pending' RETURNING *. Returns the row — first hit wins. (If Mike's network blipped and the front-end retried, the second UPDATE returns empty — UI shows "already approved", no double-cascade.) The endpoint then enqueues a row in ns_pending_pushes targeting the NS pricing record.
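The "first hit wins" behavior can be illustrated with an in-memory stand-in for the table. This is a sketch under stated assumptions: InMemoryProposedActions and its claim method are hypothetical, and here the check-and-set is atomic only because it runs inside one synchronous function. In the real system that guarantee comes from the single UPDATE ... WHERE status='pending' RETURNING statement.

```typescript
// Sketch only: an in-memory model of the atomic claim, not the decide handler.
type Decision = "approved" | "rejected";

interface Row {
  id: number;
  status: "pending" | Decision | "applied";
  decided_by?: string;
  decided_at?: Date;
}

class InMemoryProposedActions {
  private rows: { [id: number]: Row } = {};

  insert(id: number): void {
    this.rows[id] = { id: id, status: "pending" };
  }

  // Mimics UPDATE ... WHERE id=? AND status='pending' RETURNING *.
  // Returns the updated row, or an empty array if the row was already decided.
  claim(id: number, decision: Decision, decidedBy: string): Row[] {
    const row = this.rows[id];
    if (!row || row.status !== "pending") return [];
    row.status = decision;
    row.decided_by = decidedBy;
    row.decided_at = new Date();
    return [{ ...row }];
  }
}

const store = new InMemoryProposedActions();
store.insert(8421);
const first = store.claim(8421, "approved", "mike");  // one row: this request won
const second = store.claim(8421, "approved", "mike"); // empty: already decided
```

Only the caller that gets a non-empty result should enqueue the ns_pending_pushes row, which is what prevents a double cascade on retry.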
PushMutexDO picks up the row 200ms later. Calls the NS RESTlet via OAuth1. NS returns 200 with the updated record. The drainer flips proposed_actions.status='applied' and writes a row to reflexion_log with the approved_at ↔ applied_at delta. Event hitl.approved fires onto the event ledger. Customer health watcher consumes it (Driscoll is 36.4% of revenue; price moves matter). Hub KV cache busts. Total clock: ~12 seconds approve-to-applied.
What the invariant gives us
- Every NS write is traceable to a Mike approval timestamp.
- Bulk decide (R532) lets Mike clear 20 cards in a tap when the context is uniform.
- The Cal-Maine-class bug (silent wrong-customer write) is structurally impossible.
- Kill switches give us a recovery surface that doesn't require code deploy.
What can go wrong
proposed_actions.status='approved' but the drainer can't reach NS. Retry policy: 3 attempts with exponential backoff. After exhaustion, status stays 'approved' and the recon cron at 0 */15 * * * re-enqueues. Manual recovery: POST /admin/ns-push/retry?action_id=<id>.
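A sketch of the retry decision described above, written as a pure function so it can be reasoned about without timers. The 3-attempt limit comes from the policy; the 1-second base delay, the doubling factor, and the names MAX_ATTEMPTS, BASE_DELAY_MS and nextStep are assumptions, since the actual backoff parameters are not stated.

```typescript
// Sketch only: decides what the drainer does after a failed push attempt.
const MAX_ATTEMPTS = 3;      // from the retry policy
const BASE_DELAY_MS = 1000;  // assumed; the real base delay is not documented here

type NextStep =
  | { kind: "retry"; delayMs: number }
  | { kind: "leave_for_recon" }; // status stays 'approved'; recon cron re-enqueues

function nextStep(failedAttempts: number): NextStep {
  if (failedAttempts >= MAX_ATTEMPTS) {
    return { kind: "leave_for_recon" };
  }
  // Delay doubles after each failure: 1000 ms, then 2000 ms.
  return { kind: "retry", delayMs: BASE_DELAY_MS * Math.pow(2, failedAttempts - 1) };
}
```

With these assumed values, the first failure waits 1 s, the second waits 2 s, and the third failure hands the row back to the recon cron.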
Before R560, two concurrent approves could each see status='pending' and both fire the push. R560's atomic UPDATE…WHERE status='pending' RETURNING guarantees only one wins. Second sees empty RETURNING — no-op.
A proposed_action references an entity that's since been deleted/modified in NS. Drainer detects and marks status='stale'. Mike sees a "stale" filter on /proposed-actions.html.
Mike out for the day; cards accumulate. Detection: pending count on admin-dashboard. Recovery: bulk-decide pattern (R532) when context permits; defer kill switches for batches that can wait.
Code paths + invariants
| Concern | Where |
|---|---|
| Schema | migrations/schema/113_proposed_actions_risk_tier.sql (R537) |
| Risk tier trigger | migrations/schema/114_risk_level_trigger.sql |
| Atomic claim | src/index.ts decide handler — UPDATE…WHERE status='pending' RETURNING |
| Bulk decide | R532 — POST /api/proposed-actions/bulk-decide |
| Drainer | PushMutexDO + ns_pending_pushes → NS_PUSH_QUEUE |
| HITL invariant | ADR-031 (data/decisions.json) |
| Tool template | src/index.ts ~line 3947 — "HITL TOOL TEMPLATE" |
| Kill switches | kill_switches table: ns_writes, proposed_apply, high_risk_ops |
| Event emit | events.event_type='hitl.approved' / 'hitl.rejected' / 'hitl.applied' |
| Audit | reflexion_log entity_type='proposed_action' |