Substrate · HITL Lifecycle · proposed

Schema recap — proposed_actions

| column | type | notes |
| --- | --- | --- |
| `action_id` | INTEGER PK | autoincrement |
| `action_type` | TEXT | e.g. `price_change`, `bulk_cost_basis`, `workflow_<type>`, `propose_email_to_customer` |
| `entity_type` | TEXT | `customer` \| `item` \| `workflow_run` \| etc. |
| `entity_ref` | TEXT | NS id or run_id; identifies the target of the change |
| `current_state_json` | TEXT | pre-image (what's there now) |
| `proposed_change_json` | TEXT | post-image (the diff the agent wants applied) |
| `rationale` | TEXT | the why-now, surfaced in the queue UI |
| `status` | TEXT | `pending` → `approved` → `pushing` → `applied` \| `failed`; or `pending` → `rejected` |
| `risk_level` | INTEGER | 1-5, see tier table below (migration 113) |
| `decided_at` / `decided_by` | TEXT | set atomically in the claim UPDATE |
| `proposed_by` | TEXT | e.g. `workflow_runner`, `r290:executor`, `chat:role=admin` |
| `proposed_at` | TEXT | creation timestamp |
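
For readers working against this table from TypeScript, the columns above could be typed roughly as follows. This is a sketch, not the project's actual type: the names `ProposedActionStatus`, `ProposedActionRow`, and `isTerminal` are hypothetical, and the nullability of `decided_at` / `decided_by` is an assumption the recap does not confirm.

```typescript
// Status values exactly as listed in the `status` row of the recap.
type ProposedActionStatus =
  | "pending"
  | "approved"
  | "pushing"
  | "applied"
  | "failed"
  | "rejected";

// Row shape mirroring the columns above. SQLite TEXT maps to string and
// INTEGER to number. Nullable decided_* fields are an ASSUMPTION.
interface ProposedActionRow {
  action_id: number;
  action_type: string;
  entity_type: string;
  entity_ref: string;
  current_state_json: string;   // pre-image, JSON-encoded
  proposed_change_json: string; // post-image, JSON-encoded
  rationale: string;
  status: ProposedActionStatus;
  risk_level: number;           // 1-5
  decided_at: string | null;
  decided_by: string | null;
  proposed_by: string;
  proposed_at: string;
}

// Hypothetical helper: statuses with no outgoing arrow in the recap.
function isTerminal(status: ProposedActionStatus): boolean {
  return status === "applied" || status === "failed" || status === "rejected";
}
```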

Risk tiers — migration 113 (R537)

| tier | name | example action_types | HITL gate |
| --- | --- | --- | --- |
| L1 | note / tag | `note`, `tag`, `classification` | auto-approve in workflow_runner (no proposal staged) |
| L2 | safe NS field | `ns_field_update`, `spec_update`, `create_customer_program`, `other` | auto-approve in workflow_runner |
| L3 | medium write | `price_change`, `bid_status_update`, `quote_draft`, `vendor_failover`, `bulk_cost_basis`, `collection_action` | HITL required (`risk_level ≥ 3` gates in runner) |
| L4 | creates new entity | `propose_create_customer`, `propose_create_vendor`, `propose_create_item`, `soft_delete` | HITL required |
| L5 | destructive bulk | `bulk_delete`, `destructive_bulk`, `mass_price_change` | HITL required + cumulative-ceiling guardrails (CostCapDO) |
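
The example action types in the tier table can be expressed as a lookup. The sketch below is illustrative only: `TIER_BY_ACTION_TYPE` and `riskTierFor` are hypothetical names, the map holds just the examples listed above, and how the real migration assigns a tier to an unlisted action type is not stated here, so the helper returns `undefined` for anything it does not know.

```typescript
type RiskTier = 1 | 2 | 3 | 4 | 5;

// Only the example action_types from the tier table; not an exhaustive list.
const TIER_BY_ACTION_TYPE: Readonly<Record<string, RiskTier>> = {
  note: 1, tag: 1, classification: 1,
  ns_field_update: 2, spec_update: 2, create_customer_program: 2, other: 2,
  price_change: 3, bid_status_update: 3, quote_draft: 3,
  vendor_failover: 3, bulk_cost_basis: 3, collection_action: 3,
  propose_create_customer: 4, propose_create_vendor: 4,
  propose_create_item: 4, soft_delete: 4,
  bulk_delete: 5, destructive_bulk: 5, mass_price_change: 5,
};

// Returns the tier for a listed action type, or undefined if it is not in
// the table. The own-property check avoids matching inherited keys such as
// "toString".
function riskTierFor(actionType: string): RiskTier | undefined {
  return Object.prototype.hasOwnProperty.call(TIER_BY_ACTION_TYPE, actionType)
    ? TIER_BY_ACTION_TYPE[actionType]
    : undefined;
}
```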

The runner's HITL gate at stage 4 checks risk_level ≥ 3 AND !opts.hitl_approved. L1-L2 contracts skip the gate entirely. The decide endpoint enforces a per-step approval regardless — risk_level is advisory there, not authoritative.
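
The stage 4 condition reads as a single predicate. A minimal sketch, assuming `opts` carries an optional boolean `hitl_approved`; the `RunnerOpts` shape and the function name `requiresHitlGate` are hypothetical.

```typescript
// ASSUMED shape: only the hitl_approved flag is taken from the text above.
interface RunnerOpts {
  hitl_approved?: boolean;
}

// True when the runner should stop and stage a proposal for a human:
// risk_level >= 3 AND the run has not already been HITL-approved.
function requiresHitlGate(riskLevel: number, opts: RunnerOpts): boolean {
  return riskLevel >= 3 && !opts.hitl_approved;
}
```

Under this reading, an L2 action never gates, and an L5 action passes the runner gate only when the run was already approved. The decide endpoint's per-step approval still applies independently.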

The race fix — R560 / codex audit CRITICAL #1

Two simultaneous approvers (two browser tabs, two admins, or one admin plus a programmatic retry) could both claim the same action. The fix replaces the three-statement batch with an atomic UPDATE...RETURNING claim guarded by status='pending': only one approver receives the action_id, and only that approver enqueues.

OLD — broken (pre-R560)

// 3-statement D1 batch:
// 1. INSERT ns_pending_pushes  ← both win
// 2. UPDATE proposed_actions
//    SET status='approved' WHERE action_id=?
// 3. INSERT decision_corpus

// Race: two approvers each INSERT a push row first, then both UPDATE.
// The loser's cleanup DELETE could remove the winner's queue row,
// OR both rows could dispatch.
// D1 has no SELECT...FOR UPDATE to serialize the two approvers.

NEW — R560 atomic claim

// 1. Atomic claim — only ONE row wins:
const claim = await env.DB.prepare(
  `UPDATE proposed_actions
   SET status='approved',
       decided_at=datetime('now'),
       decided_by='admin:api'
   WHERE action_id=?1 AND status='pending'
   RETURNING action_id`
).bind(actionId).first();

if (!claim?.action_id) {
  // Loser path: 409 already_decided
  return json({ ok:false,
    error:'already_decided',
    current_status: ... }, 409);
}

// 2. Only the winner enqueues:
await env.DB.batch([
  INSERT ns_pending_pushes,
  INSERT decision_corpus,
]);

// 3. If enqueue fails AFTER claim:
//    revert claim (status ← 'pending')
//    so operator can retry.
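
To make the claim semantics concrete, the guarded UPDATE can be modeled in memory as a compare-and-set on `status`. This is a simulation under stated assumptions, not the D1 code path: the `Map` stands in for the table, and `claim` is a hypothetical function that mirrors `WHERE action_id=? AND status='pending' RETURNING action_id`.

```typescript
type Status = "pending" | "approved" | "rejected";

interface Row {
  action_id: number;
  status: Status;
  decided_by?: string;
}

// Models the atomic claim: the row changes only if it is still 'pending'.
// Returns the action_id for the winner and null for everyone else, which is
// what an empty RETURNING result looks like to the caller.
function claim(table: Map<number, Row>, actionId: number, who: string): number | null {
  const row = table.get(actionId);
  if (!row || row.status !== "pending") return null;
  row.status = "approved";
  row.decided_by = who;
  return row.action_id;
}

// Two approvers target the same action; only the first claim succeeds.
const table = new Map<number, Row>([[7, { action_id: 7, status: "pending" }]]);
const results = ["admin:tab-a", "admin:tab-b"].map((who) => claim(table, 7, who));
// results is [7, null]: tab-a enqueues, tab-b takes the 409 path.
```

The point of the design is that the check and the write happen in one statement, so there is no window between "read status" and "set status" for a second approver to slip through.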

The decide endpoint — `POST /api/proposed-actions/:id/decide`

Located at src/index.ts:25005-25109. Body: { decision: 'approved' | 'rejected', notes?: string }. Requires X-Edit-Token (R356).

Gate — checkEditToken(request, env). Read-only API key cannot mutate HITL state.
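Lookup — SELECT * FROM proposed_actions WHERE action_id=?1. 404 if not found; 400 if status !== 'pending'. This pre-check is a fast path only: a concurrent approver can still decide the action between the lookup and the claim, which is why the claim carries its own status='pending' guard and the loser gets 409 rather than 400.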
Decision: rejected — single UPDATE...RETURNING, then recordEvent('hitl.rejected'). Done.
Decision: approved — atomic claim (UPDATE...RETURNING). Loser gets 409. Winner proceeds.
Winner: enqueue batch — D1.batch([INSERT ns_pending_pushes, INSERT decision_corpus]).
If batch fails — revert claim (UPDATE status='pending') so operator can retry; return 500 with "claim reverted; safe to retry".
Success — recordEvent('hitl.approved', payload={...risk_level, queued_to:'ns_pending_pushes'}); return 200.
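
The approved path above (claim, enqueue, revert on failure, record event) can be sketched with the storage calls injected as functions. Everything named here is hypothetical: `DecideDeps`, `DecideResult`, and `approve` are illustration-only, and the real handler in src/index.ts talks to D1 directly rather than through such an interface.

```typescript
// Injected stand-ins for the D1 calls described in the steps above.
interface DecideDeps {
  claim(actionId: number): Promise<boolean>;  // atomic UPDATE...RETURNING; true for the winner
  enqueue(actionId: number): Promise<void>;   // batch: ns_pending_pushes + decision_corpus
  revert(actionId: number): Promise<void>;    // UPDATE status='pending'
  recordEvent(name: string): Promise<void>;   // e.g. 'hitl.approved'
}

interface DecideResult {
  status: number;
  body: { ok: boolean; error?: string };
}

async function approve(actionId: number, deps: DecideDeps): Promise<DecideResult> {
  // Loser path: someone else already decided this action.
  if (!(await deps.claim(actionId))) {
    return { status: 409, body: { ok: false, error: "already_decided" } };
  }
  try {
    // Only the winner reaches the enqueue batch.
    await deps.enqueue(actionId);
  } catch {
    // Enqueue failed after the claim: put the row back so a retry can succeed.
    await deps.revert(actionId);
    return { status: 500, body: { ok: false, error: "claim reverted; safe to retry" } };
  }
  await deps.recordEvent("hitl.approved");
  return { status: 200, body: { ok: true } };
}
```

This sketch does not handle a failure of the revert itself; whether the real endpoint does is not covered by the steps above.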

Drainer — ns_pending_pushes → NetSuite

| stage | state transition | side effects |
| --- | --- | --- |
| 1. Approval enqueues | proposed_actions: pending → approved · ns_pending_pushes INSERT (status='queued') | decision_corpus row written |
| 2. Drainer picks up | ns_pending_pushes: queued → picking (picked_at set) | proposed_actions still 'approved'; transition to 'pushing' is implicit via push status |
| 3. NS push (via NS_PUSH_QUEUE / CF Queue consumer) | ns_pending_pushes: picking → sent (sent_at set) | NS RESTlet write; OAuth1 TBA |
| 4a. NS write confirmed | ns_pending_pushes: sent → applied · proposed_actions: approved → applied | + proposed_actions_applied_mirror INSERT (audit) |
| 4b. NS error after retries | ns_pending_pushes: sent → failed · proposed_actions: approved → failed | last_error stored; DLQ row if configured |
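
The ns_pending_pushes transitions in the table can be written as an adjacency map, which makes illegal jumps easy to reject. A sketch only: `PushStatus`, `NEXT`, and `canTransition` are hypothetical names, and the map covers just the transitions listed above (how retries are represented before stage 4b is not described, so they are not modeled).

```typescript
type PushStatus = "queued" | "picking" | "sent" | "applied" | "failed";

// Allowed next states per the drainer table: queued -> picking -> sent,
// then sent -> applied (4a) or sent -> failed (4b).
const NEXT: Readonly<Record<PushStatus, readonly PushStatus[]>> = {
  queued: ["picking"],
  picking: ["sent"],
  sent: ["applied", "failed"],
  applied: [],
  failed: [],
};

function canTransition(from: PushStatus, to: PushStatus): boolean {
  return NEXT[from].indexOf(to) !== -1;
}
```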

Stub note: POST /api/ns-push/drain (src/index.ts:25124) currently has a "dry-run by default" path; production NS RESTlet wiring is gated on a TBA token and a dedicated RESTlet build (per inline comment R294).

Mirror writes — who writes what when

| storage | row(s) | trigger |
| --- | --- | --- |
| `proposed_actions_applied_mirror` | 1 row per applied action | NS write confirmed (transition to `applied`) |
| `decision_corpus` | 1 row per approval (pattern_rule) | decide endpoint, approved path (in the winner's enqueue batch) |
| `reflexion_log` | 1 row per workflow run | workflow_runner only (post_actions stage 7), NOT the decide endpoint |
| `events` | 1 row `hitl.approved` or `hitl.rejected` | decide endpoint after successful state transition |

Bulk decide — R532

POST /api/proposed-actions/bulk-decide (src/index.ts:14054) lets the operator apply one decision (approved/rejected) to many action_ids at once. Rate-limited to 30/min. Uses decided_by='api:bulk-decide' in the UPDATE.

Bulk approve enqueues one ns_pending_pushes row per action in a single batch. The race fix applies per-action via the same WHERE status='pending' guard; race losers in the bulk path return as { action_id, skipped: 'already_decided' } entries in the response.
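
The per-action guard in the bulk path can be illustrated with an in-memory model. This is a sketch with assumptions: `bulkDecide` and `BulkEntry` are hypothetical names, the `decided` entry shape for successful actions is invented for illustration (only the `skipped` shape is given above), and every id is assumed to exist in the map.

```typescript
type Decision = "approved" | "rejected";

type BulkEntry =
  | { action_id: number; decided: Decision }            // HYPOTHETICAL success shape
  | { action_id: number; skipped: "already_decided" };  // shape given in the text

// Applies one decision to many ids. Each id is guarded independently by the
// same "still pending?" check, so one already-decided action does not block
// the rest of the batch.
function bulkDecide(
  statusById: Map<number, string>,
  ids: number[],
  decision: Decision,
): BulkEntry[] {
  return ids.map((id): BulkEntry => {
    if (statusById.get(id) !== "pending") {
      return { action_id: id, skipped: "already_decided" };
    }
    statusById.set(id, decision);
    return { action_id: id, decided: decision };
  });
}
```

For example, calling `bulkDecide` on ids 1, 2, 3 where action 2 was approved by someone else a moment earlier yields two `decided` entries and one `skipped` entry, and only actions 1 and 3 would get ns_pending_pushes rows.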

Source files

Decide endpoint: src/index.ts:25005-25109
Bulk decide: src/index.ts:14054-14110 (R532)
NS push drain stub: src/index.ts:25124
Risk tier schema: migrations/113_risk_tiers_proposed_actions.sql
HITL-side event production: src/index.ts:25051 (rejected), :25103 (approved)
Related diagrams: hitl-writeback-flow (sequence view), substrate-event-ledger (consumer side)
