Spec sheet generation pipeline

From raw product input to PDF spec sheet on customer-facing surfaces · 9 stages · 8 brand templates

Spec sheets are the contract document for every item we sell. This pipeline takes raw product input (manual paste, inbound email, vendor attachment, or NetSuite item sync) and produces an approved, dated PDF on R2 plus a structured row in spec_items. Workers AI does the heavy parse; Mike approves every diff before publish; Browser Rendering produces the PDF.

8 brand templates · HITL-gated · Weekly Sunday refresh

Pipeline — 9 stages, left to right

[Diagram: Spec sheet generation pipeline. Flow: input sources → AI extract → template merge → HITL → PDF render → R2 → D1 → surfacing → cron rebuild. Column labels: 01 / Sources · 02 / Extract · 03 / Merge · 04 / HITL · 05 / Render · 06 / Persist · 07 / Surface, with the weekly cron rebuild (0 8 * * SUN) drawn beneath. Legend: Frontend · Cloud / AI · Backend · Database / R2 · HITL · Cron / queue · External.]

Stage detail — 9 stages

01 Input sources · 4 surfaces

Four ways product data enters the pipeline. Each carries different fidelity and trust signal.
Manual paste
cost-ingestion.html — admin paste form · highest fidelity
Inbound email
pricerequest@ai-globalfoodsolutions.co — CF Email Routing → src/email.ts
Vendor attachment
PDF/DOCX/XLSX via src/document_converter.ts → Markdown
NS item sync
cold-tier sync from items table; trigger on new item_id

02 Workers AI extract · REAL

Workers AI parses the input and returns structured spec fields with a confidence score.
Model
Workers AI · Llama 3.3 · structured-output JSON mode
Fields
allergens · nutritionals · claims · pack_size · brand · GTIN · case_dims
Endpoint
POST /api/specs/ingest
Output
JSON payload + confidence_score ∈ [0,1]
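
A minimal sketch of how a consumer of POST /api/specs/ingest might guard the extract payload before staging it. The field names follow the list above; the value types, the exact key casing, and the helper name parseSpecExtract are assumptions made for illustration, not the pipeline's actual schema.

```typescript
// Hypothetical shape of the extract payload. Keys mirror the Fields list;
// the value types are assumptions.
interface SpecExtract {
  allergens: string[];
  nutritionals: Record<string, number>;
  claims: string[];
  pack_size: string;
  brand: string;
  GTIN: string;
  case_dims: string;
  confidence_score: number;
}

const REQUIRED_KEYS = [
  "allergens", "nutritionals", "claims", "pack_size",
  "brand", "GTIN", "case_dims", "confidence_score",
] as const;

// Parses the model's JSON output and rejects payloads that are missing a
// field or carry a confidence_score outside [0, 1]. It checks presence only;
// it does not deep-validate the type of every field.
function parseSpecExtract(raw: string): SpecExtract {
  const data: unknown = JSON.parse(raw);
  if (typeof data !== "object" || data === null) {
    throw new Error("extract payload is not a JSON object");
  }
  const obj = data as Record<string, unknown>;
  for (const key of REQUIRED_KEYS) {
    if (!(key in obj)) throw new Error(`missing field: ${key}`);
  }
  const score = obj["confidence_score"];
  if (typeof score !== "number" || Number.isNaN(score) || score < 0 || score > 1) {
    throw new Error("confidence_score must be a number in [0, 1]");
  }
  return obj as unknown as SpecExtract;
}
```

Rejecting an out-of-range score at this boundary keeps the later confidence threshold check meaningful, since structured-output mode constrains the JSON shape but not numeric ranges.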

03 Template merge · 8 brands

Brand-prefix routing maps the extracted brand to one of 8 brand-specific spec templates.
Templates
PUB (Melt Mates / Power Up) · RS (Right Start Foods) · PU · AIS · BR · TT · GFS · Harvest Promise
Source
~/Desktop/GFS-NetSuite-Cloudflare/source-documents/Specification-Sheet/
Coverage
124+ products mirrored in spec_items (136 rows current)
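
The routing step could be sketched as a longest-prefix match. This assumes the extracted brand string begins with one of the template codes; the HARVEST key for Harvest Promise and the function name routeBrandToTemplate are hypothetical. Longest-first ordering matters because PU is itself a prefix of PUB.

```typescript
// Template codes from the Templates list. "HARVEST" is an assumed key for
// the Harvest Promise template.
const TEMPLATE_PREFIXES = ["PUB", "RS", "PU", "AIS", "BR", "TT", "GFS", "HARVEST"] as const;
type TemplateId = (typeof TEMPLATE_PREFIXES)[number];

// Returns the template whose code is the longest prefix of the brand string,
// or null when nothing matches so the caller can fall back to manual review.
function routeBrandToTemplate(brand: string): TemplateId | null {
  const normalized = brand.trim().toUpperCase();
  // Try longer codes first so "PUB..." is not captured by "PU".
  const byLength = [...TEMPLATE_PREFIXES].sort((a, b) => b.length - a.length);
  for (const prefix of byLength) {
    if (normalized.startsWith(prefix)) return prefix;
  }
  return null;
}
```

A bare prefix test is deliberately loose: any brand starting with "BR" would land on the BR template. A stricter router might require a delimiter after the code or use an explicit lookup table.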

04 HITL review · risk 3

The proposed spec is staged as a row in proposed_actions. Mike sees the diff against the prior spec_version and approves or rejects it.
Table
proposed_actions · status='pending'
Diff
field-level diff vs prior approved spec; flagged where confidence_score < 0.7
Surface
/proposed-actions.html · X-Edit-Token required to approve
Race fix
R560 atomic UPDATE...RETURNING claim → one winner per row
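
The review step can be illustrated with a small field-level diff plus the threshold check. This is a sketch under assumptions: the type and function names are hypothetical, the JSON-string comparison is a simplification, and the SQL column names (id, status) and the 'approved' status value on proposed_actions are guesses rather than the real schema.

```typescript
type SpecFields = Record<string, unknown>;

interface FieldDiff {
  field: string;
  before: unknown;
  after: unknown;
}

// Threshold from the Diff row above.
const LOW_CONFIDENCE_THRESHOLD = 0.7;

// Lists every field whose value differs between the prior approved spec and
// the proposed one. Values are compared by their JSON encoding, so key order
// inside nested objects is significant; a real diff may need a deep compare.
function diffSpec(prior: SpecFields, proposed: SpecFields): FieldDiff[] {
  const fields = Array.from(
    new Set(Object.keys(prior).concat(Object.keys(proposed))),
  ).sort();
  const diffs: FieldDiff[] = [];
  for (const field of fields) {
    if (JSON.stringify(prior[field]) !== JSON.stringify(proposed[field])) {
      diffs.push({ field, before: prior[field], after: proposed[field] });
    }
  }
  return diffs;
}

// True when the extract's confidence is low enough to flag for the reviewer.
function needsFlag(confidenceScore: number): boolean {
  return confidenceScore < LOW_CONFIDENCE_THRESHOLD;
}

// Assumed shape of the atomic claim: the WHERE clause only matches while the
// row is still pending, so of two concurrent approvals exactly one UPDATE
// returns a row and the other returns nothing.
const CLAIM_SQL =
  "UPDATE proposed_actions SET status = 'approved' " +
  "WHERE id = ?1 AND status = 'pending' RETURNING id";
```

The caller would treat an empty RETURNING result as "someone else already claimed this row" and stop, rather than reading the status first and updating second.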

05 PDF render · REAL

Browser Rendering API generates the PDF from the merged HTML template.
Binding
CF BROWSER · Puppeteer-compatible API
Endpoint
POST /api/specs/render
Output
Letter-size PDF, embedded fonts, GFS v9 design
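
A rough sketch of the render call, written against minimal structural types so it stands alone. In a Worker the browser object would presumably come from puppeteer.launch(env.BROWSER) in the @cloudflare/puppeteer package; the function name renderSpecPdf, the waitUntil choice, and the printBackground setting are assumptions.

```typescript
// Minimal structural types covering only the Puppeteer-style calls used here.
interface PageLike {
  setContent(html: string, options?: { waitUntil?: "load" | "networkidle0" }): Promise<void>;
  pdf(options: { format: "letter"; printBackground: boolean }): Promise<Uint8Array>;
}

interface BrowserLike {
  newPage(): Promise<PageLike>;
  close(): Promise<void>;
}

// Loads the merged HTML template into a page and prints it as a Letter-size
// PDF. The browser session is always closed, even if rendering fails.
async function renderSpecPdf(browser: BrowserLike, mergedHtml: string): Promise<Uint8Array> {
  try {
    const page = await browser.newPage();
    // Waiting for network idle gives web fonts a chance to load before print.
    await page.setContent(mergedHtml, { waitUntil: "networkidle0" });
    return await page.pdf({ format: "letter", printBackground: true });
  } finally {
    await browser.close();
  }
}
```

Closing the browser in a finally block matters because rendering sessions are a limited resource, and a leaked session from a failed render could stall later ones.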

06 R2 storage · versioned

Spec sheet PDFs are stored at versioned paths in the R2 bucket bound as STORAGE.
Path
R2://specs/<item_code>/<version>.pdf
Binding
env.STORAGE
Retention
all versions kept; never deleted — supports auditability
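
The versioned path could be built by a small helper like the one below. It assumes spec_version is a positive integer and that item codes use only letters, digits, dots, underscores, and hyphens; both constraints and the name specPdfKey are illustrative.

```typescript
// Builds the R2 object key specs/<item_code>/<version>.pdf.
// Rejects item codes containing path separators or other unexpected
// characters so one item can never write into another item's prefix.
function specPdfKey(itemCode: string, version: number): string {
  if (!/^[A-Za-z0-9._-]+$/.test(itemCode)) {
    throw new Error(`invalid item_code: ${itemCode}`);
  }
  if (!Number.isInteger(version) || version < 1) {
    throw new Error(`invalid version: ${version}`);
  }
  return `specs/${itemCode}/${version}.pdf`;
}

// Example usage inside a Worker (not executed here):
//   await env.STORAGE.put(specPdfKey(itemCode, version), pdfBytes, {
//     httpMetadata: { contentType: "application/pdf" },
//   });
```

Because every version gets its own key, a new render never overwrites an earlier PDF, which is what makes the keep-everything retention policy work without extra bookkeeping.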

07 D1 record · spec_items

After R2 write succeeds, a row lands in spec_items with the new version number.
Tables
spec_items (current) · spec_versions (history) · item_annotations (flags/notes)
Fields
spec_status='approved' · spec_version++ · approved_at · approved_by
Endpoint
GET /api/specs/<item_code> → latest + signed R2 URL

08 Surfacing · 3 consumers

Approved specs surface across customer-facing tools and chat.
bid-command-center
Loads spec rows when matching bid_lines.item_code to spec_items
Chat tool
get_spec_sheet({ item_code }) → spec data + signed R2 PDF URL
Chat tool
list_items_by_claim({ claim }) → items matching nutritional/allergen/marketing claim

09 Cron rebuild · weekly

Specs older than 90 days are re-rendered weekly so template improvements propagate.
Cron
0 8 * * SUN — Sunday 08:00 UTC
Selector
WHERE julianday('now') - julianday(rendered_at) > 90
Behavior
re-render PDF only; no new spec_version unless field diff detected
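
The selector and behavior above can be mirrored in application code as follows. This is a sketch: the action names and the planRebuild helper are hypothetical, and how a detected field diff is routed (for example back through HITL review) is not specified here.

```typescript
const STALE_AFTER_DAYS = 90;
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Same test as the SQL selector: fractional days since rendered_at > 90.
function isStale(renderedAt: Date, now: Date): boolean {
  return (now.getTime() - renderedAt.getTime()) / MS_PER_DAY > STALE_AFTER_DAYS;
}

type RebuildAction = "skip" | "rerender_only" | "rerender_with_new_version";

// Decides what the weekly job does for one spec. A stale spec is always
// re-rendered; a new spec_version is only warranted when fields changed.
function planRebuild(renderedAt: Date, now: Date, fieldDiffCount: number): RebuildAction {
  if (!isStale(renderedAt, now)) return "skip";
  return fieldDiffCount > 0 ? "rerender_with_new_version" : "rerender_only";
}
```

Note the strict comparison: a spec rendered exactly 90 days ago is not yet stale and waits for the following Sunday's run.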

Tables, endpoints, chat tools

kind | name | purpose
table | spec_items | current spec per item_code · 136 rows · PK item_code
table | spec_versions | append-only history of every approved spec version
table | item_annotations | per-item flags, custom notes, manual overrides
endpoint | POST /api/specs/ingest | raw input → AI extract → stage proposed_action
endpoint | POST /api/specs/render | approved spec → PDF via Browser Rendering → R2
endpoint | GET /api/specs/<item_code> | latest spec JSON + signed R2 PDF URL
chat tool | get_spec_sheet | retrieve current spec for one item by code or name
chat tool | list_items_by_claim | filter spec catalog by claim (kosher, gluten-free, halal, etc.)

Open gaps — honest punch list