Data lineage: from NetSuite to a chat answer

7 stages · per-stage watermark · _meta.as_of on every tool return

Every fact in a chat answer can be traced back to NetSuite through 7 lineage stages: system of record (SOR) → sync → D1 mirror → derived data → Vectorize → tool execution + withSources wrap → chat answer. Each stage records its own watermark. The _meta.as_of field on every tool return surfaces the freshest contributing watermark, so the user can see how current the answer is. If that watermark is too old, the chat UI shows a staleness banner.

0 · Visual flow

7 lanes · 15 nodes

System flow

Lanes

01 / NetSuite: source of record (as-of: live)
02 / Sync engine: hot 5m · warm 15m · cold 60m · CDC (as-of: sync_log.completed_at)
03 / D1 mirror: 162 tables (as-of: sync watermark)
04 / Derived data: D1-only computation (as-of: derived.computed_at)
05 / Vectorize indexes: embedded knowledge (as-of: chunk_indexed_at)
06 / Chat context: withSources wrap (as-of: _meta.as_of)
07 / Chat answer: what the user sees (as-of: rendered_at)

Nodes

NetSuite SuiteQL
    system of record · 28 SuiteQL specs · customscript_gfs_platform_query
    NetSuite is the canonical source of record. All sync is read-driven by 28 SuiteQL spec definitions (sync_specs table) covering customers, transactions, items, vendors, etc.
    LINEAGE: NetSuite
        surface: customscript_gfs_platform_query RESTlet
        auth: TBA OAuth1
        spec_count: 28
        watermark_source: NS-side modification_date / lastmodifieddate

NS SystemNote CDC
    record-level change events → /api/ns-webhook (HMAC)
    NetSuite SystemNote feeds Change Data Capture. Every record modification logs a row that the webhook receives. Drives reactive re-sync of just-modified records (faster than tiered cron).
    LINEAGE: SystemNote CDC
        source: NS SystemNote
        endpoint: /api/ns-webhook
        auth: HMAC verify
        trigger: any NS record modification

Hot tier · 5min
    transactions + lines · syncTable + syncLineTable
    Hot tier syncs every 5 minutes. Covers transactions, transaction_lines, customer_payment, item_fulfillment (move with biz cadence).
    LINEAGE: sync hot tier
        cadence: 5min cron
        specs: 7 hot-tier specs
        watermark: per-spec last_synced_at

Warm tier · 15min
    customers + vendors + items · medium-cadence records
    Warm tier syncs every 15 minutes. Covers customers, vendors, items, contacts.
    LINEAGE: sync warm tier
        cadence: 15min cron
        specs: 8 warm-tier specs

Cold tier · 60min
    departments + locations + accounts · rarely-changed reference data
    Cold tier syncs hourly. Covers departments, locations, accounts, subsidiaries (rarely changed but needed for joins).
    LINEAGE: sync cold tier
        cadence: 60min cron
        specs: 13 cold-tier specs

D1 · 162 tables · ~311k rows
    INSERT OR REPLACE on NS id (synced) · explicit writes (derived)
    sync_log row per sync round · sync_log.completed_at = watermark
    162 D1 tables. Synced tables use INSERT OR REPLACE keyed by NS id; derived tables use explicit writes. The sync_log table tracks every sync round with completed_at; this is the high-water mark surfaced as 'as of'.
    LINEAGE: D1 mirror
        tables: 162 (R563)
        rows: ~311,000
        watermark_table: sync_log
        watermark_field: completed_at

assembly_cost_rollup
    BOM × vendor_costs · D1-only computation
    Per-assembly cost rollup computed from BOM components × current vendor_costs. Recomputed when a vendor cost changes (workflow_vendor_cost_update fan-out #2).
    LINEAGE: assembly_cost_rollup
        table: assembly_cost_rollup
        source: bom_components + vendor_costs
        trigger: vendor_cost_update workflow

customer_health_scores
    6 signals → 0-100 + tier · D1-only computation
    Per-customer health 0-100 + tier band. 6 signals: AR aging, disputes, payment lateness trend, order freq drop, credit memos, contact churn. Nightly cron.
    LINEAGE: customer_health_scores
        table: customer_health_scores
        source: 6 D1 signals
        cadence: nightly cron

pricing_master + ar_buckets
    price catalog + aging view · D1-only computation
    pricing_master: per-customer per-item active pricing. ar_buckets: view aggregating customer_invoice by aging bucket.
    LINEAGE: derived
        pricing_master: per-customer per-item
        ar_buckets: VIEW over customer_invoice

ns_knowledge
    3,360 chunks (R555)
    Knowledge corpus: manifest + ADRs + system guide + saved searches embedded.
    LINEAGE: ns_knowledge
        chunks: 3,360 (R555)
        sources: PLATFORM_INVENTORY.md, ADRs, SYSTEM_WIKI, saved_searches

decision_corpus
    (situation, action, rationale)
    Approved HITL decisions → embedded → searched as few-shot for similar situations.
    LINEAGE: decision_corpus
        source: HITL approvals
        query_pattern: cosine topK=5

suiteql_corpus
    saved SQL queries
    Named saved queries embedded so chat can retrieve a relevant SuiteQL snippet by description.
    LINEAGE: suiteql_corpus
        source: saved_searches catalog
        query_pattern: cosine topK=3

Tool execution
    175+ tools · D1 queries · Vectorize lookups · returns { data, _meta }
    Tool runs. Each tool queries D1, derived tables, and Vectorize. Returns { data, _meta } where _meta.sources lists contributing tables and _meta.as_of carries the freshest watermark.
    LINEAGE: tool execution
        tools: 175+
        inputs: D1, derived, Vectorize
        output: { data, _meta }

withSources wrap
    _meta.sources + as_of + retrieval_path · audit trail attached to every result
    Every tool result is wrapped with withSources(). _meta.sources = [{table, ref, ...}]; _meta.as_of = MAX(sync_log.completed_at over contributing tables); _meta.retrieval_path = lineage of queries.
    LINEAGE: withSources
        fields: _meta.sources[], _meta.as_of, _meta.retrieval_path
        drives: citation chips + audit

Chat answer + citations
    rendered with source chips + "as of" stamp · user sees lineage at a glance
    Final rendered answer in chat.html. Each fact gets a citation chip linking back to its source table. The 'as of' stamp shows the watermark for the freshest contributing data.
    LINEAGE: chat answer
        surface: chat.html
        rendering: citation chips, as-of stamp
        fallback_if_stale: shows warning banner
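
The withSources wrap described above can be sketched as a small pure function. This is a minimal illustration, not the platform's actual implementation: the SourceRef.completedAt field, the null as_of for an empty source list, and all sample values are assumptions made for the example. Only the _meta.sources, _meta.as_of, and _meta.retrieval_path field names come from the text.

```typescript
// Hypothetical shape of one contributing source. Only `table` and `ref` are
// named in the text; `completedAt` is an assumed carrier for the table's
// sync_log.completed_at value (ISO-8601, UTC).
interface SourceRef {
  table: string;
  ref: string;
  completedAt: string;
}

interface ToolMeta {
  sources: SourceRef[];
  as_of: string | null; // null when no source has a usable watermark (assumption)
  retrieval_path: string[];
}

interface ToolResult<T> {
  data: T;
  _meta: ToolMeta;
}

// Wraps a tool payload and sets _meta.as_of to the MAX watermark over sources.
function withSources<T>(
  data: T,
  sources: SourceRef[],
  retrievalPath: string[],
): ToolResult<T> {
  let newestMs: number | null = null;
  for (const source of sources) {
    const ms = Date.parse(source.completedAt);
    if (isNaN(ms)) continue; // ignore watermarks that do not parse
    if (newestMs === null || ms > newestMs) newestMs = ms;
  }
  return {
    data,
    _meta: {
      sources,
      as_of: newestMs === null ? null : new Date(newestMs).toISOString(),
      retrieval_path: retrievalPath,
    },
  };
}

// Made-up sample: two contributing tables synced five minutes apart.
const wrapped = withSources(
  { customer_id: "1001", open_balance: 2500 },
  [
    { table: "customer_invoice", ref: "customer:1001", completedAt: "2025-01-01T12:05:00Z" },
    { table: "customers", ref: "id:1001", completedAt: "2025-01-01T12:00:00Z" },
  ],
  ["d1:customers", "d1:customer_invoice"],
);
console.log(wrapped._meta.as_of); // "2025-01-01T12:05:00.000Z"
```

Keeping the wrapper free of I/O means each tool gathers its own watermarks and the MAX rule lives in one place.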

1 · The 7 stages

| Stage | Watermark source | "As of" field |
|---|---|---|
| 1 · NetSuite SOR | NS lastmodifieddate | live |
| 2 · Sync engine | sync_log.completed_at | ~5min lag (hot) / ~60min (cold) |
| 3 · D1 mirror | sync_log per table | last sync round |
| 4 · Derived data | derived.computed_at field | last compute |
| 5 · Vectorize | chunk.indexed_at | last index |
| 6 · Chat context | _meta.as_of (MAX over inputs) | freshest contributing watermark |
| 7 · Chat answer | rendered_at | when user saw it |
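
The stage 2 lag figures follow from the sync tier cadences (hot 5 min, warm 15 min, cold 60 min) and spec counts (7, 8, 13). The sketch below shows one way a cron tick could decide which tiers are due. It assumes a single tick dispatches tiers by minute of the day; the source does not say how the crons are actually wired. TIER_CADENCES, tiersDue, and maxLagMinutes are hypothetical names.

```typescript
type TierName = "hot" | "warm" | "cold";

interface TierCadence {
  tier: TierName;
  everyMinutes: number; // cron cadence from the tier descriptions
  specCount: number;    // number of SuiteQL specs in the tier
}

const TIER_CADENCES: TierCadence[] = [
  { tier: "hot", everyMinutes: 5, specCount: 7 },
  { tier: "warm", everyMinutes: 15, specCount: 8 },
  { tier: "cold", everyMinutes: 60, specCount: 13 },
];

// Returns the tiers whose cadence divides the tick's minute of the day (UTC).
// Assumption: one shared tick drives all tiers.
function tiersDue(tick: Date): TierName[] {
  const minuteOfDay = tick.getUTCHours() * 60 + tick.getUTCMinutes();
  return TIER_CADENCES
    .filter((c) => minuteOfDay % c.everyMinutes === 0)
    .map((c) => c.tier);
}

// Upper bound on data age just before the next scheduled sync,
// ignoring sync duration and CDC-triggered re-syncs.
function maxLagMinutes(tier: TierName): number {
  for (const c of TIER_CADENCES) {
    if (c.tier === tier) return c.everyMinutes;
  }
  throw new Error("unknown tier: " + tier);
}

console.log(tiersDue(new Date("2025-01-01T12:15:00Z"))); // [ 'hot', 'warm' ]
console.log(maxLagMinutes("cold")); // 60
```

The three spec counts sum to 28, matching the total number of SuiteQL specs.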

2 · Watermark logic

Every contributing read records a (table, last_synced_at) pair. When withSources() wraps a tool result, _meta.as_of = MAX(last_synced_at over contributing tables). If that timestamp is older than the user's tolerance window (or older than the freshness SLO for that table tier), the chat UI renders a yellow staleness banner before the answer.
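
The banner decision above can be sketched as a pure check. The SLO values below are placeholders (twice each tier's cadence), since the source names a per-tier freshness SLO but gives no numbers. Treating an unparseable as_of as stale is also an assumption. FRESHNESS_SLO_MINUTES, StalenessInput, and shouldShowStalenessBanner are hypothetical names.

```typescript
type SyncTier = "hot" | "warm" | "cold";

// Placeholder SLOs in minutes: assumed to be 2x the tier cadence.
const FRESHNESS_SLO_MINUTES: Record<SyncTier, number> = {
  hot: 10,
  warm: 30,
  cold: 120,
};

interface StalenessInput {
  asOf: string;                  // _meta.as_of, ISO-8601 UTC
  tier: SyncTier;                // tier of the table that supplied as_of
  userToleranceMinutes?: number; // optional per-user tolerance window
}

// True when as_of is older than the tier SLO or the user's tolerance window.
function shouldShowStalenessBanner(input: StalenessInput, now: Date): boolean {
  const asOfMs = Date.parse(input.asOf);
  if (isNaN(asOfMs)) return true; // unknown freshness is treated as stale
  const ageMinutes = (now.getTime() - asOfMs) / 60000;
  const overSlo = ageMinutes > FRESHNESS_SLO_MINUTES[input.tier];
  const overTolerance =
    input.userToleranceMinutes !== undefined &&
    ageMinutes > input.userToleranceMinutes;
  return overSlo || overTolerance;
}

// Made-up sample: the answer's watermark is 25 minutes old.
const checkTime = new Date("2025-01-01T12:30:00Z");
console.log(
  shouldShowStalenessBanner({ asOf: "2025-01-01T12:05:00Z", tier: "hot" }, checkTime),
); // true: 25 min exceeds the 10 min placeholder SLO
console.log(
  shouldShowStalenessBanner({ asOf: "2025-01-01T12:05:00Z", tier: "cold" }, checkTime),
); // false: 25 min is within the 120 min placeholder SLO
```

Passing the current time in as a parameter keeps the check deterministic and easy to test.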

This is the lineage guarantee: every fact in a chat answer can be traced to a source table + a watermark.

3 · How to read it

| Color | Meaning |
|---|---|
| frontend | User-facing surface (chat UI, admin HTML pages) |
| backend | Worker logic / agent code / business rules |
| database | D1 table / R2 object / KV key / Vectorize index |
| cloud | External system (NetSuite, Anthropic, etc.) |
| security | Gate / policy / HITL approval / kill switch |
| messagebus | Event ledger, Queues, async fan-out |
| external | Inbound source (email, webhook, cron tick, user input) |
| → solid | Synchronous call (request → response) |
| → green | Approved / happy-path |
| → red dashed | Policy or security check |
| → grey dashed | Optional / conditional / async |