The architecture
Risk Engine (Beta)
We already have the graph.
Most risk-intelligence platforms build a graph of suppliers, products, and geographies from a corpus of news and documents. Autonomy starts from the other direction. The canonical supply-chain DAG — sites, lanes, BOMs, suppliers, commitments, capacities — already exists as the system of record. Risk extraction is the inverse of the typical problem: we already have the graph; we need to derive which relationships in it are dangerous.
The single load-bearing claim
Risks are derived edges over the canonical supply-chain DAG, produced by deterministic algorithms first and learned models second, with LLMs used only to narrate, never to assert, the resulting facts.
Three corollaries follow. The technique stack inverts relative to third-party risk-intelligence platforms. Classical graph algorithms come first because they are deterministic, explainable, and training-free. And graph-algorithm outputs are humanly opaque, which is why narration exists — and why the structured risk edge, not the prose, is always the source of truth.
Four extractor layers
Cheap-and-explainable first. Learned-and-opaque last. The order is the routing logic: rule outputs can act automatically; learned outputs route to inspect.
Rules
Threshold and pattern rules over canonical state. Deterministic, low compute, fully explainable. Most business-logic risks fit here. Suitable for action-by-rule.
Graph algorithms
Twelve named algorithms across five canonical graph views, each producing a typed risk predicate. Conformal-bounded and time-respecting variants are first-class.
NN (learned)
Neural net on the integrated DAG, trained on historical disruption outcomes. Captures correlations classical algorithms miss — combinations of stresses that no individual algorithm flags.
LLM canonical-linker
Resolves Context Engine signals from coarse tags to canonical entity IDs. The LLM does entity resolution, not text mining — every fact already exists in the upstream signal.
Layers compound. A risk edge can be produced by one extractor and corroborated by another, and the value-scoring layer rewards corroboration.
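One way corroboration could be tracked, assuming risk edges are keyed by their (subject, predicate, object) triple — a minimal sketch; the triples, IDs, and layer names here are invented:

```python
from collections import defaultdict

# Two extractor layers independently emit typed risk edges
# (all triples are illustrative, not real engine output).
rule_edges  = [("LAN-447", "criticalLaneFor", "CUST-12")]
graph_edges = [("LAN-447", "criticalLaneFor", "CUST-12"),
               ("SUP-88",  "singleSourcedFor", "PRD-3")]

# Index each triple by the set of layers that produced it.
corroboration = defaultdict(set)
for layer, edges in [("rules", rule_edges), ("graph", graph_edges)]:
    for triple in edges:
        corroboration[triple].add(layer)

# A value-scoring layer can then reward triples seen by more than one layer.
for triple, layers in sorted(corroboration.items()):
    print(triple, "corroborated" if len(layers) > 1 else "single-layer")
```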
The (graph, weight, algorithm) unit
Same algorithm, different weight, different risk. Shortest-path on cost-weighted lanes surfaces routing risk; shortest-path on conformal-P90-lead-time-weighted lanes surfaces commitment-reachability risk. Same graph, same algorithm, completely different business meaning.
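The cost-versus-lead-time example can be sketched with a plain Dijkstra whose weight attribute is a parameter (the real wrappers are NetworkX-backed; the lanes, sites, and numbers here are invented for illustration):

```python
import heapq

# Toy lane graph: lanes keyed by (origin, dest), each carrying two
# registered weightings -- unit cost and conformal-P90 lead time in days.
LANES = {
    ("DAL", "ATL"): {"cost": 120, "lead_time_p90": 2.0},
    ("ATL", "MIA"): {"cost": 80,  "lead_time_p90": 1.5},
    ("DAL", "MIA"): {"cost": 150, "lead_time_p90": 5.0},
}

def shortest_path(src, dst, weight):
    """Plain Dijkstra; `weight` names the edge attribute to minimise."""
    heap, seen = [(0.0, src, [src])], set()
    while heap:
        dist, node, path = heapq.heappop(heap)
        if node == dst:
            return dist, path
        if node in seen:
            continue
        seen.add(node)
        for (u, v), attrs in LANES.items():
            if u == node:
                heapq.heappush(heap, (dist + attrs[weight], v, path + [v]))
    return float("inf"), []

# Same graph, same algorithm -- the weight function decides the risk meaning.
_, routing = shortest_path("DAL", "MIA", "cost")           # routing risk
_, reach   = shortest_path("DAL", "MIA", "lead_time_p90")  # commitment reachability
print(routing)  # ['DAL', 'MIA']
print(reach)    # ['DAL', 'ATL', 'MIA']
```

The cheapest route is the direct lane; the P90-fastest route detours through ATL — one algorithm, two business meanings.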
| Algorithm | Risk predicate | Surfaces |
|---|---|---|
| Shortest path (cost / time / risk-weighted) | primaryPathFor | Current best route under chosen objective |
| k-shortest paths | alternateRouteCount | Resilience: how many alternates exist |
| Edge betweenness | criticalLaneFor | Lanes whose failure displaces the most flow |
| Vertex betweenness | criticalSiteFor / criticalSupplierFor | Bridge nodes |
| Min-cut / max-flow | capacityBottleneckBetween | Throughput-constraining edge sets |
| k-edge / k-vertex connectivity | fragileDependencyOf | Single-failure-disconnects pairs |
| Topological sort on bom_dag | shortageCascadeOrder | Sequence of finished-good shortages |
| PageRank (revenue-weighted) | revenueImportance | Which risks to surface first |
| Time-respecting paths | commitmentReachable | Can a commitment be met given cumulative lead times |
| Iterative removal + reroute | cascadeImpact | Failure-propagation footprint |
Twelve algorithms, five canonical graph views, an open registry of weight functions. Products register the weights their plane cares about; Core owns the algorithm wrappers.
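A minimal sketch of what an open weight registry could look like, assuming weight functions are plain callables over edge attributes — the `register_weight` decorator and the attribute names are hypothetical:

```python
from typing import Callable, Dict

# Hypothetical open registry: products register the weight functions their
# plane cares about; Core's algorithm wrappers consume them by name.
WEIGHT_REGISTRY: Dict[str, Callable[[dict], float]] = {}

def register_weight(name: str):
    def deco(fn):
        WEIGHT_REGISTRY[name] = fn
        return fn
    return deco

@register_weight("cost")
def cost(edge: dict) -> float:
    return edge["unit_cost"]

@register_weight("lead_time_p90")
def lead_time_p90(edge: dict) -> float:
    # Conformal upper bound, not the point estimate.
    return edge["lead_time_mean"] + edge["conformal_margin"]

edge = {"unit_cost": 120.0, "lead_time_mean": 2.0, "conformal_margin": 1.4}
print(sorted(WEIGHT_REGISTRY))                           # ['cost', 'lead_time_p90']
print(round(WEIGHT_REGISTRY["lead_time_p90"](edge), 2))  # 3.4
```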
Two LLM roles, separated by design
Conflating these is the failure mode of every LLM-backed knowledge graph. The Risk Engine separates them sharply.
Entity resolution, not text mining
Input: a Context Engine signal that's already been ingested, scored, and tagged.
Output: candidate risk edges grounded in that signal, with subject and object resolved from coarse tags ("auto-parts", "southeast-US") to canonical entity IDs.
The trust boundary: the LLM is doing entity resolution. Every fact has already been asserted by the upstream signal. The linker just maps a string to an ID.
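A hedged sketch of the linker's output shape, assuming a tag-to-canonical-ID index and a candidate-edge record. All names, IDs, and the `link` helper are invented; the real linker is an LLM resolving against the canonical entity store, not a dict lookup:

```python
from dataclasses import dataclass

# Hypothetical tag -> canonical-ID index standing in for the entity store.
CANONICAL_INDEX = {
    "auto-parts":   ["PRD-0031", "PRD-0047"],  # product-category tag -> product IDs
    "southeast-US": ["SITE-ATL", "SITE-MIA"],  # region tag -> site IDs
}

@dataclass
class CandidateRiskEdge:
    subject_id: str
    predicate: str
    object_id: str
    evidence_quote: str  # verbatim from the upstream signal -- never generated
    confidence: float

def link(signal_tags, predicate, evidence_quote, confidence):
    """Fan a coarse-tagged signal out to candidate edges over canonical IDs."""
    subjects = CANONICAL_INDEX.get(signal_tags["what"], [])
    objects = CANONICAL_INDEX.get(signal_tags["where"], [])
    return [CandidateRiskEdge(s, predicate, o, evidence_quote, confidence)
            for s in subjects for o in objects]

edges = link({"what": "auto-parts", "where": "southeast-US"},
             "supplyDisruptionAt", "port closure announced", 0.7)
print(len(edges))  # 4 candidate edges (2 products x 2 sites)
```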
Bounded paraphrase, not fact assertion
Input: a structured risk-edge record produced by any of the four extractor layers.
Output: human-readable explanation tiered by audience — executive, planner, analyst, auditor.
The trust boundary: every entity, number, and date in the narration must appear in the structured record. Post-hoc validation enforces this. The structured record is always the source of truth.
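A minimal sketch of what post-hoc numeric grounding could look like — field names and the number-rendering rules are assumptions, and a production validator would also check entities and dates:

```python
import re

# Illustrative structured record (field names invented).
record = {
    "lane_id": "LAN-447",
    "revenue_at_risk_usd": 4_200_000,
    "commitments_affected": 14,
    "window_days": 14,
}

def narration_is_grounded(narration: str, record: dict) -> bool:
    """Every number in the narration must trace back to the structured record."""
    allowed = set()
    for v in record.values():
        if isinstance(v, (int, float)):
            allowed.add(f"{v:g}")
            if v >= 1_000_000:
                allowed.add(f"{v / 1_000_000:g}M")  # "$4.2M"-style rendering
    for num in re.findall(r"\$?([\d.]+M?)\b", narration):
        if num not in allowed:
            return False
    return True

ok  = narration_is_grounded("$4.2M exposed across 14 commitments.", record)
bad = narration_is_grounded("$5.1M exposed across 14 commitments.", record)
print(ok, bad)  # True False
```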
One risk, four narrations
Same structured record. Different framings for different readers. The narration chooses which facts to surface and how to phrase them — never which facts exist.
"$4.2M of committed revenue is exposed via a single-carrier dependency on Lane LAN-447 over the next 14 days."
"Lane LAN-447 is your most fragile lane this week. If it goes down, 14 customer commitments worth $4.2M would reroute through paths averaging 2.3 days longer than today's lead time — long enough to breach 6 of those commitments. You have one alternate carrier on file, contracted for 40% of the volume."
"LAN-447 has edge-betweenness 0.42 (top 1% of all lanes; next-highest 0.31). Removing it from the lane graph and re-running shortest-path with conformal P90 lead-time weights gives a path-time distribution whose P90 exceeds 6 customer commitment deadlines. k-edge-connectivity to those 6 customers drops to 1 without an additional carrier on file."
Risk edge criticalLaneFor produced by algo:edge_betweenness:lane_graph:cost_weighted:v1.3 against snapshot 2026-04-30T05:00Z, plus algo:k_shortest_paths:lane_graph:lead_time_p90:v2.1 for impact estimation. Inputs: 482 lanes, 6,391 commitments, conformal calibration set 2026-Q1. Reviewed by: planner@tenant on 2026-04-30T11:14Z. Override status: none.
The structured record contains every fact. The narration chooses which facts to surface and how to phrase them. Stale narration past valid_until is regenerated; never cached past expiry.
Components and shipping status
Honest about what's in production today and what's sequenced behind the Context Engine event-awareness upgrade.
Rule extractors (Beta)
Threshold and pattern rules over canonical state. singleSourcedFor when supplier count = 1; concentrationExceeds when a single supplier's share > 80%; commitmentExceedsCapacity when committed volume > capacity.
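The three rules read almost directly as code; a sketch with invented IDs and quantities:

```python
# Illustrative canonical state for one product (IDs and numbers invented).
product = {
    "id": "PRD-3",
    "suppliers": {"SUP-88": 0.85, "SUP-91": 0.15},  # supplier -> share of volume
    "committed_units": 1200,
    "capacity_units": 1000,
}

def rule_risks(p):
    """Deterministic threshold rules; each emits a typed risk predicate."""
    risks = []
    if len(p["suppliers"]) == 1:
        risks.append(("singleSourcedFor", p["id"]))
    for sup, share in p["suppliers"].items():
        if share > 0.80:  # concentration threshold: share > 80%
            risks.append(("concentrationExceeds", sup))
    if p["committed_units"] > p["capacity_units"]:
        risks.append(("commitmentExceedsCapacity", p["id"]))
    return risks

print(rule_risks(product))
# [('concentrationExceeds', 'SUP-88'), ('commitmentExceedsCapacity', 'PRD-3')]
```

Each output is fully explainable from the state it was computed over, which is what qualifies this layer for action-by-rule.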
Graph-algorithm extractors (Beta)
NetworkX-backed algorithm wrappers over five canonical graph views: bom_dag, lane_graph, supplier_product, integrated, and time-expanded. Twelve algorithms producing controlled-vocabulary risk predicates.
NN extractors (Roadmap)
Neural net on the integrated DAG, trained on historical disruption outcomes. Captures correlated failure patterns no individual algorithm flags.
LLM canonical-linker (Roadmap)
Resolves Context Engine signals from coarse tags ("auto-parts", "southeast-US") to canonical entity IDs. Emits candidate risk edges with verbatim evidence quotes; never asserts a fact not present in the source signal.
Audience-tiered narration (Beta)
Same risk edge, four narrations: executive (one sentence, dollar-anchored), planner (action-anchored), analyst (structural), auditor (full provenance). Bounded paraphrase only — every fact must appear in the structured record.
Value scoring (Beta)
Orthogonal to confidence. Combines confidence × source authority × business impact × reuse × freshness × specificity. The Decision Stream sorts by value, not by confidence.
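The multiplicative combination can be sketched directly — the factor values below are invented, and the real scorer's factor definitions are not shown here:

```python
def value_score(edge):
    """Multiplicative combination: any weak factor drags the score down."""
    return (edge["confidence"] * edge["source_authority"] * edge["business_impact"]
            * edge["reuse"] * edge["freshness"] * edge["specificity"])

# High-confidence but trivial risk vs. moderate-confidence, revenue-anchored risk.
trivial  = {"confidence": 0.95, "source_authority": 0.9, "business_impact": 0.1,
            "reuse": 0.5, "freshness": 1.0, "specificity": 0.6}
anchored = {"confidence": 0.60, "source_authority": 0.9, "business_impact": 0.9,
            "reuse": 0.8, "freshness": 1.0, "specificity": 0.8}

# Sorting by value, not confidence, puts the anchored edge first.
ranked = sorted([trivial, anchored], key=value_score, reverse=True)
print(ranked[0] is anchored)  # True
```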
PROV-O provenance (Beta)
Every risk edge carries enough lineage to answer six questions: which source, when, by what method, verified by whom, what evidence, has the source changed.
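A sketch of a lineage record answering the six questions, with invented field names and values; the real schema is PROV-O-aligned, and the method string follows the algo:… convention shown in the auditor narration:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Provenance:
    source: str           # which source
    extracted_at: str     # when
    method: str           # by what method
    verified_by: str      # verified by whom ("" = unreviewed)
    evidence: str         # what evidence supports it
    source_changed: bool  # has the source changed since extraction

prov = Provenance(
    source="snapshot:2026-04-30T05:00Z",
    extracted_at="2026-04-30T05:07Z",
    method="algo:edge_betweenness:lane_graph:cost_weighted:v1.3",
    verified_by="planner@tenant",
    evidence="betweenness=0.42 (top 1% of lanes)",
    source_changed=False,
)
print(prov.method.split(":")[1])  # edge_betweenness
```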
The framework is designed; the rules / graph / scoring / narration / provenance layers are in beta. The NN and LLM linker layers wait on the Context Engine's event-awareness P0 — events as first-class entities, structured FK arrays, geometry, source tier — without which the linker would over-fetch and the NN would train on noisy entity tags.
What's different about this
Inverted technique stack
Most risk-intelligence platforms build a graph from text. Autonomy starts from canonical state — the platform is already the system of record for sites, lanes, BOMs, suppliers, commitments, capacities — and overlays external evidence on top. That inversion changes the technique stack from "extract entities from documents" to "rank relationships in a known graph."
LLMs at the edges, not the centre
LLMs play two narrow, structured roles: linker (string → canonical ID) and narrator (record → prose). They never assert facts. Every entity, number, and date in narration is validated against the structured record. This is the architectural choice that distinguishes the engine from products that let an LLM assert facts and justify them post hoc.
Value, not confidence, drives action
The Decision Stream sorts risk edges by value_score, which combines confidence with source authority, business impact, reuse, freshness, and predicate specificity. A high-confidence trivial risk doesn't beat a moderate-confidence revenue-anchored one.
Provenance is first-class
Every risk edge carries enough lineage to answer six questions: which source, when extracted, by what method, verified by whom, what evidence supports it, has the source changed since. PROV-O-aligned, exportable as RDF when consumers ask.
The Risk Engine is the downstream half of the Context Engine. Together they are how external evidence becomes ranked operator action.
See how risk edges become decisions
The Risk Engine produces structured records. The Decision Stream sorts them by value and routes them to inform, inspect, or act. One coherent path from supply-chain physics to operator screen.