Mike is always the loop step
HITL means human-in-the-loop. Every AI-proposed action that would write to NetSuite, send a customer email, place an order hold, or change business state must first land in proposed_actions and wait for Mike's approval. This is the load-bearing invariant of the platform — encoded in ADR-031 and surfaced via /proposed-actions.html.
The reason is not paranoia about AI. The reason is that GFS runs on Mike's judgment. A 22% gross margin floor is a Mike rule. Which customers get softer collections treatment is a Mike rule. When to push back on a vendor versus eat a 5-cent cost increase is a Mike rule. The platform exists to amplify Mike's throughput — not to replace his judgment with a model's confidence interval.
Practically: every chat tool that mutates state writes a row to proposed_actions with a structured preview of what would change, a calculated risk_level, and a list of cascade targets. The work doesn't happen until Mike approves.
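As a rough sketch of what such a row might carry, the snippet below builds an insert payload for proposed_actions. Only action_type, payload_json, proposer, status, and risk_level are named in this document; entity_ref, the shape of the preview, the ID format, and the function name buildProposedActionRow are illustrative assumptions, not the platform's actual helper. risk_level is deliberately left out on the assumption that the database calculates it.

```typescript
// Hypothetical shapes: field names beyond action_type, payload_json,
// proposer and status are assumptions made for illustration.
interface FieldChange {
  field: string;
  current: string | number | null;
  proposed: string | number | null;
}

interface ProposedActionDraft {
  action_type: string;
  proposer: string; // name of the chat tool proposing the change
  entity_ref: string; // assumed single entity reference
  changes: FieldChange[]; // structured preview of what would change
  cascade_targets: string[]; // tables / external systems that would be touched
}

interface ProposedActionInsert {
  id: string;
  action_type: string;
  proposer: string;
  entity_ref: string;
  payload_json: string;
  status: "pending";
  created_at: string;
}

// Builds the row a write tool would insert instead of mutating state directly.
// risk_level is omitted: it is assumed to be calculated on insert.
function buildProposedActionRow(
  draft: ProposedActionDraft,
  id: string,
  now: Date,
): ProposedActionInsert {
  if (draft.changes.length === 0) {
    throw new Error("a proposed action needs at least one previewed change");
  }
  return {
    id,
    action_type: draft.action_type,
    proposer: draft.proposer,
    entity_ref: draft.entity_ref,
    payload_json: JSON.stringify({
      changes: draft.changes,
      cascade_targets: draft.cascade_targets,
    }),
    status: "pending",
    created_at: now.toISOString(),
  };
}

// Example values are made up for the sketch.
const exampleRow = buildProposedActionRow(
  {
    action_type: "price_update",
    proposer: "update_price_tool",
    entity_ref: "item:12345",
    changes: [{ field: "unit_price", current: 4.1, proposed: 4.35 }],
    cascade_targets: ["netsuite.item", "d1.item_prices"],
  },
  "pa_0001",
  new Date("2025-01-06T14:00:00Z"),
);
```

Nothing in this sketch touches NetSuite or any other system: the row only describes the change, and the actual work stays behind the approval.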
What requires approval
- Any push to NetSuite — price updates, customer record changes, vendor updates, item changes.
- Any outbound customer email — quotes, dunning letters, statements, notifications.
- Order holds and credit changes — anything that affects a customer's ability to transact.
- Vendor cost commits — they cascade too far to auto-apply.
- Spec deviation acceptances — these affect bid eligibility.
What does not require approval: read-only chat queries, internal D1 syncs from NS, sync reconciliation, log writes, event emission, draft generation (the draft itself is the proposed action — sending it is the gated step).
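One way to express this split in code is an explicit exemption list with everything else gated. The action-kind strings and the requiresApproval function below are hypothetical names invented for this sketch; the default-to-gated behavior for unknown kinds is an assumption that seems consistent with the invariant, not a documented rule.

```typescript
// Hypothetical action-kind labels for the exempt categories listed above.
const APPROVAL_EXEMPT_KINDS = new Set<string>([
  "read_query", // read-only chat queries
  "d1_sync_from_ns", // internal D1 syncs from NS
  "sync_reconciliation",
  "log_write",
  "event_emit",
  "draft_generate", // creating the draft; sending it is still gated
]);

// Anything not explicitly exempt is treated as requiring approval,
// so a newly added write path is gated by default (an assumption).
function requiresApproval(actionKind: string): boolean {
  return !APPROVAL_EXEMPT_KINDS.has(actionKind);
}
```

With this shape, forgetting to classify a new action kind fails safe: it waits in the queue rather than executing unreviewed.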
L1 through L5 + cumulative cap
| Tier | What it means | UX treatment | Bulk-decide allowed? |
|---|---|---|---|
| L1 | Trivial reversible (draft email created) | Green band, single-tap | Yes — unlimited |
| L2 | Low-impact reversible (quote line edit) | Blue band, single-tap | Yes — up to 20 |
| L3 | Medium (single price change < $1K impact) | Amber band, confirm dialog | Yes — up to 10 |
| L4 | High (vendor cost cascade, large quote) | Orange band, typed-confirmation | No — single only |
| L5 | Critical (credit hold lift, NS record delete) | Red band, typed-confirmation + X-Edit-Token | No — single only |
Even when bulk-decide is allowed, the sum of approval impact must stay under CUMULATIVE_CAP_USD (currently $50K). If a single bulk-decide tap would exceed it, the UI breaks the batch into chunks and requires a second tap for each chunk. This is the guardrail against catastrophic fat-finger events.
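The tier limits and the cap can be sketched together as follows. The limits and the $50K figure come from the table and paragraph above; the function names, the impactUsd field, the greedy chunking order, and the choice to allow a chunk that lands exactly on the cap are assumptions for illustration.

```typescript
type RiskLevel = "L1" | "L2" | "L3" | "L4" | "L5";

// Per-tier bulk limits from the table; 1 encodes "single only".
const BULK_LIMIT: Record<RiskLevel, number> = {
  L1: Number.POSITIVE_INFINITY,
  L2: 20,
  L3: 10,
  L4: 1,
  L5: 1,
};

const CUMULATIVE_CAP_USD = 50000;

interface BulkItem {
  id: string;
  impactUsd: number; // assumed per-row dollar impact
}

// True when a selection of `count` rows may be decided in one action.
function bulkSizeAllowed(tier: RiskLevel, count: number): boolean {
  return count >= 1 && count <= BULK_LIMIT[tier];
}

// Greedily splits a selection into chunks whose summed impact does not
// exceed the cap. A single row larger than the cap gets a chunk of its own.
function chunkByCap(items: BulkItem[], capUsd: number): BulkItem[][] {
  const chunks: BulkItem[][] = [];
  let current: BulkItem[] = [];
  let running = 0;
  for (const item of items) {
    if (current.length > 0 && running + item.impactUsd > capUsd) {
      chunks.push(current);
      current = [];
      running = 0;
    }
    current.push(item);
    running += item.impactUsd;
  }
  if (current.length > 0) chunks.push(current);
  return chunks;
}
```

Each returned chunk would correspond to one confirmation tap, so a selection that totals more than the cap can never be applied with a single gesture.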
How Mike approves things
The page polls GET /api/proposed-actions?status=pending every 10 seconds. Pending rows render as cards stacked by risk tier — L5 at top, L1 at bottom. Each card shows:
- Header — workflow type, customer/vendor/entity name, time created, risk badge.
- Diff preview — current value vs proposed value, side-by-side for prices; full email body for outbound.
- Cascade targets — list of tables and external systems that would receive the change.
- Edit link — for emails and quotes, Mike can edit the draft inline before approving.
- Buttons — Approve, Reject, Defer (writes a note), Edit (opens inline editor).
A floating "select all in tier" control lets Mike bulk-approve same-tier rows when appropriate. Keyboard shortcuts: A approve, R reject, D defer, E edit, ↑/↓ navigate, Cmd+Enter confirm typed-confirmation modal.
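The card ordering and shortcut handling described above could look roughly like this. The PendingCard shape, the oldest-first ordering within a tier, and the function names are assumptions; only the tier ordering (L5 at top) and the A/R/D/E key bindings come from the text.

```typescript
type RiskLevel = "L1" | "L2" | "L3" | "L4" | "L5";
type CardAction = "approve" | "reject" | "defer" | "edit";

interface PendingCard {
  id: string;
  risk_level: RiskLevel;
  created_at: string; // ISO 8601 UTC timestamp, so string order is time order
}

const TIER_RANK: Record<RiskLevel, number> = { L1: 1, L2: 2, L3: 3, L4: 4, L5: 5 };

// Highest tier first; within a tier, oldest first (an assumed tie-break).
function sortForDisplay(rows: PendingCard[]): PendingCard[] {
  return [...rows].sort(
    (a, b) =>
      TIER_RANK[b.risk_level] - TIER_RANK[a.risk_level] ||
      a.created_at.localeCompare(b.created_at),
  );
}

// Single-key shortcuts from the text; navigation and modal keys are omitted.
const SHORTCUTS: Record<string, CardAction> = {
  A: "approve",
  R: "reject",
  D: "defer",
  E: "edit",
};

// Returns the action bound to a key press, or undefined for unbound keys.
function actionForKey(key: string): CardAction | undefined {
  return SHORTCUTS[key.toUpperCase()];
}
```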
Single-decide vs bulk-decide
Monday 8am. Mike opens /proposed-actions.html. There are 23 pending rows: 14 are L1 (overnight email drafts from the inbound triage), 6 are L2 (quote line adjustments from the weekend draft_quote runs), 2 are L3 (vendor cost changes from Bongards and Driscoll), 1 is L5 (a request to lift Cardinal's credit hold).
Mike taps "select all L1", reviews the 14 email subjects in a single condensed list, taps Approve — they fire as a batch. Then "select all L2", same drill. Then the two L3 rows individually because they're cost cascades — he scans the impact reports, approves Bongards, defers Driscoll pending a phone call. The L5 row he treats deliberately: types CONFIRM, enters X-Edit-Token, approves. Total time: 6 minutes. Pre-platform, this morning was 90 minutes.
From propose to commit
1. Chat tool writes the proposed action. Any chat tool that mutates state calls the shared writeProposedAction helper rather than the mutator directly. The helper inserts a row with action_type, payload_json (full preview), proposer (tool name), entity refs, and a calculated risk_level via the BEFORE INSERT trigger from 114_risk_level_trigger.sql.
2. UI polls + renders. /proposed-actions.html polls every 10s and renders pending rows. New rows fade in with a tufts-bordered ring for the first 30 seconds so Mike spots fresh items.
3. Mike reviews + decides. Single-decide or bulk-decide. For L4/L5, a typed-confirmation modal prevents accidental approval. X-Edit-Token (from Mike's session) must accompany L5 approvals.
4. Atomic claim (R560). The approve handler does UPDATE proposed_actions SET status='approved', claimed_by=?, claimed_at=? WHERE id=? AND status='pending'. The AND status='pending' is the load-bearing clause: it prevents the double-approve race that hit us in R559 when Mike fat-fingered bulk-decide.
5. Cascade executes. The originating workflow's on_approval handler fires: NS push enqueued, D1 writes, KV invalidations, R2 artifact regen, event emission. The handler is idempotent on proposed_action_id; if it runs twice, the second run is a no-op.
6. Audit + event. Approval is logged to approval_audit with full payload. An event hitl.approved (or hitl.rejected, hitl.deferred) is emitted with proposed_action_id. Reflexion picks it up to evaluate the quality of the AI's proposal.
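The atomic claim and the idempotent cascade can be illustrated with an in-memory stand-in for the database. This is a sketch under stated assumptions: InMemoryActions, approve, and onApproval are hypothetical names, and the claim method only mimics the rows-changed result of the conditional UPDATE quoted above rather than calling any real database API.

```typescript
type Status = "pending" | "approved" | "rejected" | "deferred";

interface ActionRow {
  id: string;
  status: Status;
  claimed_by: string | null;
  claimed_at: string | null;
}

// In-memory stand-in for the proposed_actions table (illustration only).
class InMemoryActions {
  private rows = new Map<string, ActionRow>();

  add(id: string): void {
    this.rows.set(id, { id, status: "pending", claimed_by: null, claimed_at: null });
  }

  // Mimics: UPDATE ... SET status='approved' ... WHERE id=? AND status='pending'
  // Returns the number of rows changed: 1 if the claim won, 0 otherwise.
  claim(id: string, approver: string, now: string): number {
    const row = this.rows.get(id);
    if (!row || row.status !== "pending") return 0;
    row.status = "approved";
    row.claimed_by = approver;
    row.claimed_at = now;
    return 1;
  }
}

// Idempotency guard keyed on the proposed action id (assumed mechanism).
const appliedIds = new Set<string>();
const cascadeLog: string[] = [];

function onApproval(proposedActionId: string): void {
  if (appliedIds.has(proposedActionId)) return; // second run is a no-op
  appliedIds.add(proposedActionId);
  cascadeLog.push(proposedActionId); // stands in for NS push, D1 writes, etc.
}

// Runs the cascade only when this call actually won the claim.
function approve(
  store: InMemoryActions,
  id: string,
  approver: string,
  now: string,
): "approved" | "already_decided" {
  if (store.claim(id, approver, now) === 0) return "already_decided";
  onApproval(id);
  return "approved";
}
```

The two guards are independent on purpose: the conditional update stops a second approval from winning, and the handler's own check covers the case where the cascade is re-run for a row that was already approved.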
What the substrate guarantees
- Every state-changing action has a traceable approval row with timestamp, approver, and full pre-state.
- No double execution: the atomic claim lets exactly one approval win, and the idempotent on_approval handler makes any repeated run a no-op.
- Rejections inform reflexion, which steers future AI proposals away from rejected patterns.
- Compliance review can replay the entire decision history from approval_audit.
What can go wrong and how to recover
Approval succeeded but the NS push or D1 cascade errored. The action sits in status='approved' AND cascade_status='failed'. Reconciliation cron retries hourly; manual retry via POST /admin/proposed-actions/.
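A reconciliation pass over such rows might look like the sketch below. The function name retryFailedCascades, the "done" value for cascade_status, and the injected runCascade callback are assumptions for illustration; the document only specifies the status='approved' AND cascade_status='failed' condition and the hourly retry.

```typescript
interface ApprovedRow {
  id: string;
  status: string;
  cascade_status: string;
}

// Retries the cascade for rows that were approved but whose cascade failed.
// Returns the ids that are still failing after this pass.
function retryFailedCascades(
  rows: ApprovedRow[],
  runCascade: (proposedActionId: string) => void,
): string[] {
  const stillFailing: string[] = [];
  for (const row of rows) {
    if (row.status !== "approved" || row.cascade_status !== "failed") continue;
    try {
      runCascade(row.id);
      row.cascade_status = "done"; // assumed success marker
    } catch (_err) {
      stillFailing.push(row.id); // left as 'failed' for the next pass
    }
  }
  return stillFailing;
}
```

Retrying like this is only safe if the cascade handler is idempotent on the proposed action id, since a partially applied cascade may be run again from the start.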
If pending count exceeds 50, the daily digest highlights it. The longer Mike defers, the more workflows stall behind their HITL gate. Cleanup: triage in bulk first, individual decisions last.
The "HITL TOOL TEMPLATE" near line 3947 of src/index.ts enforces the pattern at code-review time. New write tools that don't call writeProposedAction get rejected in PR.
Code paths + invariants
| Concern | Where |
|---|---|
| Doctrine | data/decisions.json ADR-031 |
| Schema | migrations/schema/113_risk_tiers_proposed_actions.sql |
| Risk trigger | migrations/schema/114_risk_level_trigger.sql |
| HITL TOOL TEMPLATE | src/index.ts ~line 3947 |
| Atomic claim | R560 — approve handler in src/index.ts |
| UI | /proposed-actions.html at repo root |
| Edit token | X-Edit-Token header — required for L5 |
| Preview/confirm | CLAUDE.md invariant #2 — ?preview=true\|confirm=true |