The fifth platform pillar: a platform-level layer that sits between inbound documents and downstream actions. Mike uploads a sample PDF (e.g. a Driscoll customer PO), parses it to markdown, visually tags each region or span to a target NetSuite field (e.g. SalesOrd.bodyFields.otherrefnum), and saves a per-customer template. Future inbound documents from that customer are then auto-extracted using the saved template. The output is clean, NetSuite-ready records, staged for human-in-the-loop (HITL) approval and then written to NetSuite via NS_PUSH_QUEUE.

Three worked use cases: Path 1, customer PO → SO (the first deployed use case, Driscoll Foods); Path 2, vendor COA → compliance; Path 3, bid RFP → pipeline (bridges to the Bid Center pillar).

Operator modes: the visual tagger at /data-tagger.html (owned by Agent BB-2) and a chat-driven mode (Agent BB-3 adds the chat tools).
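The tag-then-extract flow described above can be sketched roughly as follows. This is a minimal illustration, not the shipped implementation: the `Template`, `FieldRule`, and `applyTemplate` names are hypothetical, the real schema lives in migration 142 and may differ, and the sketch assumes that `regex_after_label` (one of the strategies named in this section) means "find the label text, then run a regex on whatever follows it". The sample PO text and the `driscoll-foods` id are made up for the example.

```typescript
// Hypothetical shapes for illustration only; the real template schema may differ.
interface FieldRule {
  target: string;   // NetSuite field path, e.g. "SalesOrd.bodyFields.otherrefnum"
  strategy: "regex_after_label";
  label: string;    // text that precedes the value in the parsed markdown
  pattern: string;  // regex source applied to the text after the label
}

interface Template {
  customerId: string;
  docType: string;
  nsRecordType: string;
  rules: FieldRule[];
}

interface Extraction {
  fields: Record<string, string>; // target field -> extracted value
  misses: string[];               // targets whose rule found nothing
}

// Applies every rule in the template to the markdown of one inbound document.
function applyTemplate(template: Template, markdown: string): Extraction {
  const fields: Record<string, string> = {};
  const misses: string[] = [];
  for (const rule of template.rules) {
    let found: string | undefined;
    const at = markdown.indexOf(rule.label);
    if (at >= 0) {
      const rest = markdown.slice(at + rule.label.length);
      const m = new RegExp(rule.pattern).exec(rest);
      // Prefer the first capture group; fall back to the whole match.
      if (m) found = m[1] ?? m[0];
    }
    if (found === undefined) misses.push(rule.target);
    else fields[rule.target] = found;
  }
  return { fields, misses };
}

const sampleTemplate: Template = {
  customerId: "driscoll-foods", // illustrative id, not from the source
  docType: "po_inbound",
  nsRecordType: "SalesOrd",
  rules: [
    {
      target: "SalesOrd.bodyFields.otherrefnum",
      strategy: "regex_after_label",
      label: "PO Number:",
      pattern: "\\s*([A-Z0-9-]+)",
    },
  ],
};

// Made-up markdown standing in for a parsed customer PO.
const sampleDoc = "# Purchase Order\n\nPO Number: PO-48213\nShip To: Dock 4\n";
const result = applyTemplate(sampleTemplate, sampleDoc);
```

Here `result.fields` maps `SalesOrd.bodyFields.otherrefnum` to `PO-48213`. Tracking `misses` separately is a design choice of the sketch: it gives the HITL reviewer a list of fields the template failed to fill rather than silently omitting them.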
Entry points: /data-tagger.html, inbound email auto-route (5 mailboxes), or chat upload (Agent BB-3 tools).

- Visual tagger: /data-tagger.html (Agent BB-2)
- Inbound email: src/email.ts + 5 mailboxes
- Chat tools: data_tagger_train, data_tagger_apply, data_tagger_save_template (Agent BB-3)
- Template key: customer_id + doc_type + ns_record_type (e.g. Driscoll Foods / po_inbound / SalesOrd)
- Parser: src/document_converter.ts
- Extraction strategies: regex_after_label · regex_before_label · fixed_region · table_with_headers · multi_line_span · whole_section · formula · llm_with_schema · literal_constant
- HITL staging: proposed_actions row
- Downstream: NS_PUSH_QUEUE writes, reflexion updates template metrics, events fire.
- Template metrics: hit_count / success_count / miss_count per template

| kind | name | purpose |
|---|---|---|
| Live tool | /data-tagger.html | visual tagger UI (Agent BB-2) |
| D1 table | data_tagger_templates | per-(customer,doc,record) template; versioned (migration 142) |
| D1 table | data_tagger_extractions | one row per inbound document processed |
| D1 table | data_tagger_template_corrections | operator edits for reflexion |
| D1 table | data_tagger_doc_types | doc_type -> ns_record_type mapping |
| D1 table | data_tagger_uploads | raw upload audit |
| D1 table | inbound_email_log | existing email audit |
| D1 table | proposed_actions | HITL queue (kind=data_tagger_extraction) |
| D1 table | events | event ledger (R549) |
| R2 bucket | gfs-data-tagger-samples | uploaded sample PDFs |
| R2 bucket | gfs-inbound-attachments | email attachments |
| Endpoint | POST /api/data-tagger/upload | UI upload |
| Endpoint | POST /api/data-tagger/train | save tagged template |
| Endpoint | POST /api/data-tagger/apply | run template against inbound doc |
| Endpoint | POST /api/proposed-actions/decide | approve / reject extraction |
| Endpoint | POST /api/ns/push/sales-order | NS SO write-back (Path 1) |
| Code path | src/document_converter.ts | PDF/DOCX/XLSX -> markdown |
| Code path | src/email.ts | 5-mailbox inbound pipeline |
| Code path | src/chat_tools/impls.ts | data_tagger_* tools (Agent BB-3) |
| Migration | 142_data_tagger.sql | Agent BB-1 - 9 strategies + templates + extractions schema |
| Durable Object | PushMutexDO | per-customer NS write mutex |

Chat tools (data_tagger_train / apply / save_template) are not yet registered in src/chat_tools/impls.ts.
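
The table describes data_tagger_templates as per-(customer, doc, record) and versioned, and the section mentions hit_count / success_count / miss_count per template. A rough in-memory sketch of that lookup and bookkeeping is below. Everything here is hypothetical: `TemplateStore`, `templateKey`, and `recordOutcome` are illustrative names, the real rows live in D1, and the counter semantics (every application is a hit, an approval is a success, a rejection is a miss) are an assumption, since the source does not define them.

```typescript
// Hypothetical in-memory model of a template row; the real D1 columns may differ.
interface TemplateRow {
  customerId: string;
  docType: string;
  nsRecordType: string;
  version: number;
  hitCount: number;
  successCount: number;
  missCount: number;
}

type Outcome = "approved" | "rejected";

// Composite key mirroring customer_id + doc_type + ns_record_type.
// Assumes none of the three parts contains the "|" separator.
function templateKey(customerId: string, docType: string, nsRecordType: string): string {
  return [customerId, docType, nsRecordType].join("|");
}

class TemplateStore {
  private byKey = new Map<string, TemplateRow[]>();

  // Stores a new version alongside any earlier versions for the same key.
  save(row: TemplateRow): void {
    const key = templateKey(row.customerId, row.docType, row.nsRecordType);
    const versions = this.byKey.get(key) ?? [];
    versions.push(row);
    this.byKey.set(key, versions);
  }

  // Returns the highest-version template for the key, or undefined if none is saved.
  latest(customerId: string, docType: string, nsRecordType: string): TemplateRow | undefined {
    const versions = this.byKey.get(templateKey(customerId, docType, nsRecordType)) ?? [];
    return versions.reduce<TemplateRow | undefined>(
      (best, row) => (best === undefined || row.version > best.version ? row : best),
      undefined,
    );
  }
}

// Assumed semantics, not confirmed by the source:
// every application counts as a hit; approval is a success; rejection is a miss.
function recordOutcome(row: TemplateRow, outcome: Outcome): void {
  row.hitCount += 1;
  if (outcome === "approved") row.successCount += 1;
  else row.missCount += 1;
}

const store = new TemplateStore();
const zeroCounts = { hitCount: 0, successCount: 0, missCount: 0 };
store.save({ customerId: "driscoll-foods", docType: "po_inbound", nsRecordType: "SalesOrd", version: 1, ...zeroCounts });
store.save({ customerId: "driscoll-foods", docType: "po_inbound", nsRecordType: "SalesOrd", version: 2, ...zeroCounts });

const active = store.latest("driscoll-foods", "po_inbound", "SalesOrd");
if (active !== undefined) {
  recordOutcome(active, "approved");
  recordOutcome(active, "rejected");
}
```

After the two recorded outcomes, the version 2 row carries two hits, one success, and one miss, while version 1 stays untouched. Keeping counters per version rather than per key is a choice of this sketch; it would let a regression in a newly saved template show up without being averaged into the history of older versions.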