Wiki · Substrate piece

HITL approval flow

Every write to NetSuite or business state passes through Mike. The proposed_actions table is the inbox; /proposed-actions.html is the UI; ADR-031 is the doctrine. This is how the platform stays trustworthy.

Real · Load-bearing invariant
What this is

Mike is always the loop step

HITL means human-in-the-loop. Every AI-proposed action that would write to NetSuite, send a customer email, place an order hold, or change business state must first land in proposed_actions and wait for Mike's approval. This is the load-bearing invariant of the platform — encoded in ADR-031 and surfaced via /proposed-actions.html.

The reason is not paranoia about AI. The reason is that GFS runs on Mike's judgment. A 22% gross margin floor is a Mike rule. Which customers get softer collections treatment is a Mike rule. When to push back on a vendor versus eat a 5-cent cost increase is a Mike rule. The platform exists to amplify Mike's throughput — not to replace his judgment with a model's confidence interval.

Practically: every chat tool that mutates state writes a row to proposed_actions with a structured preview of what would change, a calculated risk_level, and a list of cascade targets. The work doesn't happen until Mike approves.

When it engages

What requires approval

What does not require approval: read-only chat queries, internal D1 syncs from NS, sync reconciliation, log writes, event emission, draft generation (the draft itself is the proposed action — sending it is the gated step).

Risk model

L1 through L5 + cumulative cap

Tier | What it means | UX treatment | Bulk-decide allowed?
L1 | Trivial reversible (draft email created) | Green band, single-tap | Yes, unlimited
L2 | Low-impact reversible (quote line edit) | Blue band, single-tap | Yes, up to 20
L3 | Medium (single price change < $1K impact) | Amber band, confirm dialog | Yes, up to 10
L4 | High (vendor cost cascade, large quote) | Orange band, typed-confirmation | No, single only
L5 | Critical (credit hold lift, NS record delete) | Red band, typed-confirmation + X-Edit-Token | No, single only
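
The tier table can be read as a small policy lookup. The sketch below is one hypothetical encoding, not the platform's actual code: the names TIER_POLICY, DecisionRequest, and checkDecision are invented for illustration, and only the bulk limits and confirmation requirements are taken from the table. The L3 confirm dialog is a UI concern and is not modelled.

```typescript
type RiskTier = "L1" | "L2" | "L3" | "L4" | "L5";

interface TierPolicy {
  // Maximum rows in one decision; 1 means single-decide only.
  bulkLimit: number;
  typedConfirmation: boolean;
  editTokenRequired: boolean;
}

// Values transcribed from the tier table (hypothetical encoding).
const TIER_POLICY: Record<RiskTier, TierPolicy> = {
  L1: { bulkLimit: Infinity, typedConfirmation: false, editTokenRequired: false },
  L2: { bulkLimit: 20, typedConfirmation: false, editTokenRequired: false },
  L3: { bulkLimit: 10, typedConfirmation: false, editTokenRequired: false },
  L4: { bulkLimit: 1, typedConfirmation: true, editTokenRequired: false },
  L5: { bulkLimit: 1, typedConfirmation: true, editTokenRequired: true },
};

interface DecisionRequest {
  tier: RiskTier;
  rowCount: number;
  typedConfirmation?: boolean; // true once the confirmation phrase was typed
  editToken?: string; // value of the X-Edit-Token header, if one was sent
}

// Returns the reasons a decision should be refused; an empty list means allowed.
function checkDecision(req: DecisionRequest): string[] {
  const policy = TIER_POLICY[req.tier];
  const problems: string[] = [];
  if (req.rowCount < 1) {
    problems.push("no rows selected");
  }
  if (req.rowCount > policy.bulkLimit) {
    problems.push(`${req.tier} allows at most ${policy.bulkLimit} row(s) per decision`);
  }
  if (policy.typedConfirmation && !req.typedConfirmation) {
    problems.push(`${req.tier} requires typed confirmation`);
  }
  if (policy.editTokenRequired && !req.editToken) {
    problems.push(`${req.tier} requires X-Edit-Token`);
  }
  return problems;
}
```

Under these assumptions, a 14-row L1 batch passes, while an L5 approval with no typed confirmation and no token is refused for two separate reasons.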
Cumulative cap

Even when bulk-decide is allowed, the sum of approval impact must stay under CUMULATIVE_CAP_USD (currently $50K). If a single bulk-decide tap would exceed it, the UI breaks the batch into chunks and requires a second tap for each chunk. This is the guardrail against catastrophic fat-finger events.
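
A minimal sketch of the chunking behaviour, under stated assumptions: each pending row carries a non-negative dollar impact (the impactUsd field and the chunkByCap name are hypothetical), rows keep their original order, and a chunk may total exactly the cap. A single row larger than the cap cannot be split, so this sketch leaves it in a chunk by itself; how the real UI treats that case is not stated here.

```typescript
// "Currently $50K" per the text; the constant name comes from the document.
const CUMULATIVE_CAP_USD = 50000;

interface PendingRow {
  id: string;
  impactUsd: number; // hypothetical field: dollar impact of approving this row
}

// Greedily splits a bulk selection into chunks whose summed impact
// does not exceed the cap. Each chunk would need its own tap.
function chunkByCap(
  rows: readonly PendingRow[],
  capUsd: number = CUMULATIVE_CAP_USD,
): PendingRow[][] {
  const chunks: PendingRow[][] = [];
  let current: PendingRow[] = [];
  let runningTotal = 0;

  for (const row of rows) {
    // Close the current chunk if adding this row would push it past the cap.
    if (current.length > 0 && runningTotal + row.impactUsd > capUsd) {
      chunks.push(current);
      current = [];
      runningTotal = 0;
    }
    current.push(row);
    runningTotal += row.impactUsd;
  }

  if (current.length > 0) {
    chunks.push(current);
  }
  return chunks;
}
```

A selection that fits under the cap comes back as one chunk, so the common case still costs a single tap.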

The /proposed-actions.html UX

How Mike approves things

The page polls GET /api/proposed-actions?status=pending every 10 seconds. Pending rows render as cards stacked by risk tier — L5 at top, L1 at bottom. Each card shows:

A floating "select all in tier" control lets Mike bulk-approve same-tier rows when appropriate. Keyboard shortcuts: A approve, R reject, D defer, E edit, ↑/↓ navigate, Cmd+Enter confirm typed-confirmation modal.
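
The stacking order described above (L5 at top, L1 at bottom) amounts to a descending sort on tier. A short sketch, assuming risk_level arrives from the API as the strings "L1" through "L5" and that rows within a tier keep the order the API returned them in; the PendingCard type and stackByTier name are hypothetical.

```typescript
type RiskTier = "L1" | "L2" | "L3" | "L4" | "L5";

interface PendingCard {
  id: string;
  risk_level: RiskTier; // assumed to be serialized as "L1".."L5"
}

const TIER_RANK: Record<RiskTier, number> = { L1: 1, L2: 2, L3: 3, L4: 4, L5: 5 };

// Returns a new array with the highest-risk cards first. Array.prototype.sort
// is stable in modern engines, so cards within a tier keep their input order.
function stackByTier<T extends PendingCard>(rows: readonly T[]): T[] {
  return [...rows].sort((a, b) => TIER_RANK[b.risk_level] - TIER_RANK[a.risk_level]);
}
```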

Worked example

Single-decide vs bulk-decide

Scenario

Monday 8am. Mike opens /proposed-actions.html. There are 23 pending rows: 14 are L1 (overnight email drafts from the inbound triage), 6 are L2 (quote line adjustments from the weekend draft_quote runs), 2 are L3 (vendor cost changes from Bongards and Driscoll), 1 is L5 (a request to lift Cardinal's credit hold).

Mike taps "select all L1", reviews the 14 email subjects in a single condensed list, taps Approve — they fire as a batch. Then "select all L2", same drill. Then the two L3 rows individually because they're cost cascades — he scans the impact reports, approves Bongards, defers Driscoll pending a phone call. The L5 row he treats deliberately: types CONFIRM, enters X-Edit-Token, approves. Total time: 6 minutes. Pre-platform, this morning was 90 minutes.

Step-by-step what happens

From propose to commit

  1. Chat tool writes the proposed action

    Any chat tool that mutates state calls the shared writeProposedAction helper rather than the mutator directly. The helper inserts a row with action_type, payload_json (full preview), proposer (tool name), and entity refs; risk_level is calculated by the BEFORE INSERT trigger from 114_risk_level_trigger.sql.

    Writes proposed_actions (status='pending')
    Time ~100ms
  2. UI polls + renders

    /proposed-actions.html polls every 10s and renders pending rows. New rows fade in with a tufts-bordered ring for the first 30 seconds so Mike spots fresh items.

    Reads GET /api/proposed-actions?status=pending
  3. Mike reviews + decides

    Single-decide or bulk-decide. For L4/L5, a typed-confirmation modal prevents accidental approval. X-Edit-Token (from Mike's session) must accompany L5 approvals.

    UI /proposed-actions.html
    Time seconds to minutes
  4. Atomic claim (R560)

    The approve handler does UPDATE proposed_actions SET status='approved', claimed_by=?, claimed_at=? WHERE id=? AND status='pending'. The AND status='pending' is the load-bearing clause — it prevents the double-approve race that hit us in R559 when Mike fat-fingered bulk-decide.

    Writes proposed_actions.status='approved'
    Invariant R560 atomic claim
  5. Cascade executes

    The originating workflow's on_approval handler fires: NS push enqueued, D1 writes, KV invalidations, R2 artifact regen, event emission. The handler is idempotent on proposed_action_id — if it runs twice, the second run is a no-op.

    Writes per the workflow's cascade_tables_json
  6. Audit + event

    Approval is logged to approval_audit with full payload. An event hitl.approved (or hitl.rejected, hitl.deferred) is emitted with proposed_action_id. Reflexion picks it up to evaluate the quality of the AI's proposal.

    Writes approval_audit, events
    Emits hitl.approved
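
The atomic claim in step 4 is a conditional update: the handler treats the approval as its own only if the UPDATE actually changed a row. The sketch below illustrates that check under assumptions, and is not the handler in src/index.ts. SqlExecutor, claimForApproval, and makeFakeDb are hypothetical names; with Cloudflare D1 the executor would presumably wrap db.prepare(sql).bind(...params).run() and read meta.changes from the result.

```typescript
// Minimal executor shape for this sketch (hypothetical, not a real library type).
interface SqlExecutor {
  run(sql: string, params: unknown[]): Promise<{ changes: number }>;
}

// The statement quoted in step 4. The AND status='pending' guard is what
// makes a second approve attempt match zero rows.
const CLAIM_SQL =
  "UPDATE proposed_actions SET status='approved', claimed_by=?, claimed_at=? " +
  "WHERE id=? AND status='pending'";

// Resolves to true only for the caller whose UPDATE flipped the row.
// A false result means the row was already decided; skip the cascade.
async function claimForApproval(
  db: SqlExecutor,
  id: string,
  approver: string,
  nowIso: string,
): Promise<boolean> {
  const result = await db.run(CLAIM_SQL, [approver, nowIso, id]);
  return result.changes === 1;
}

// In-memory stand-in that mimics the WHERE id=? AND status='pending' guard.
// For demonstration only: it ignores the SQL text and tracks status per id.
function makeFakeDb(statusById: Map<string, string>): SqlExecutor {
  return {
    async run(_sql, params) {
      const id = String(params[2]);
      if (statusById.get(id) !== "pending") {
        return { changes: 0 };
      }
      statusById.set(id, "approved");
      return { changes: 1 };
    },
  };
}
```

Calling claimForApproval twice for the same id against makeFakeDb yields true and then false, which is the behaviour that keeps a double tap from running the cascade twice.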
Outcomes

What the substrate guarantees

Audit coverage: 100% (every write traces to an approval)
Race safety: Atomic (R560 claim)
Reversibility: Defer ≠ reject (three-way decision)
Reflexion: Self-improving (rejections improve future proposals)
Failure modes

What can go wrong and how to recover

Approved but cascade fails

Approval succeeded but the NS push or D1 cascade errored. The action sits in status='approved' AND cascade_status='failed'. Reconciliation cron retries hourly; manual retry via POST /admin/proposed-actions//retry-cascade.
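
Hourly and manual retries are only safe because the cascade handler is idempotent on proposed_action_id. A simplified sketch of that property, assuming an in-memory record of completed ids; the CascadeLedger class is hypothetical, and a real handler would need a durable record plus protection against partially applied writes inside the cascade itself.

```typescript
type CascadeOutcome = "ran" | "noop";

// Hypothetical ledger of proposed_action_ids whose cascade already completed.
class CascadeLedger {
  private readonly completed = new Set<string>();

  // Runs the cascade at most once per id. If the cascade throws, the id is
  // not recorded, so a later retry will attempt it again.
  runOnce(proposedActionId: string, cascade: () => void): CascadeOutcome {
    if (this.completed.has(proposedActionId)) {
      return "noop";
    }
    cascade();
    this.completed.add(proposedActionId);
    return "ran";
  }
}
```

In this model a failed first attempt leaves the action retryable, the retry that succeeds records the id, and any further retry is a no-op.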

Backlog accumulates

If pending count exceeds 50, the daily digest highlights it. The longer Mike defers, the more workflows stall behind their HITL gate. Cleanup: triage in bulk first, individual decisions last.

Tool tries to bypass HITL

The "HITL TOOL TEMPLATE" near line 3947 of src/index.ts defines the pattern, and code review enforces it: new write tools that don't call writeProposedAction are rejected in PR.

Related

Adjacent substrate + workflows

For developers

Code paths + invariants

Concern | Where
Doctrine | data/decisions.json ADR-031
Schema | migrations/schema/113_risk_tiers_proposed_actions.sql
Risk trigger | migrations/schema/114_risk_level_trigger.sql
HITL TOOL TEMPLATE | src/index.ts ~line 3947
Atomic claim | R560: approve handler in src/index.ts
UI | /proposed-actions.html at repo root
Edit token | X-Edit-Token header, required for L5
Preview/confirm | CLAUDE.md invariant #2: ?preview=true|confirm=true

// The HITL TOOL TEMPLATE (paraphrased from src/index.ts ~3947)
async function myWriteTool(args) {
  // 1. Compute the proposed change (no writes yet)
  const preview = computePreview(args);

  // 2. Write to proposed_actions; risk_level is set by the trigger
  const action_id = await writeProposedAction({
    action_type: 'my_write',
    payload_json: preview,
    cascade_targets: ['pricing_master', 'netsuite'],
  });

  // 3. Return and wait for Mike. The cascade fires from the approve handler.
  return { status: 'pending_approval', action_id };
}