Azirella

Hierarchical Agent Supervision

Autonomy is built as hierarchical management of context, guardrails, and targets across agents operating at the strategic, tactical, operational, and execution levels. Context flows down; escalation and feedback flow up. Each tier reasons only at its own horizon and scope.

The result: a 10-to-100-agent workforce that decides without committee, with a chain of authority that matches how supply chain teams already think about their planning calendar.

The four tiers

Same agent workforce; different cadence, different scope, different decision shape.

L4 Strategic
weekly · whole network

Policy Optimisation

Network-wide policy, KPI targets, and the guardrail envelope every other tier reasons inside. Re-runs weekly from rolled-up consensus; sets the shape of "good" before any tactical or execution decision is made.

L3 Tactical
daily · cross-entity

Domain-Model Reconciliation

The four planning domains (Demand, Supply, Inventory, Capacity) negotiate against each other to converge on a feasible plan-of-record for the next horizon. Daily, against the L4 envelope, with explicit reconciliation when domains disagree.

L2 Operational
hourly · one node

Node Coordination

At each site, hourly: intra-site urgency modulation across the execution agents, disaggregation of manufacturing orders (MOs), and always-on coordination of what the site's execution layer should attend to next.

L1 Execution
<10 ms · one decision

Atomic Decisions

11+ task-specific execution agents per site, each conformal-bounded with sub-10 ms inference. This is the per-decision-class layer: PO (purchase order) creation, MO release / execution, transfer dispatch / execution, ATP (available-to-promise) allocation, inventory buffering, rebalancing, forecast adjustment, quality, maintenance, subcontract, order tracking.
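
As a rough illustration, the four tiers can be written down as a small lookup table. This is a minimal sketch, not the product's data model: the names Level, Tier, TIERS, and supervisor_of are hypothetical, and only the roles, cadences, and scopes come from the descriptions above.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class Level(IntEnum):
    """Tier numbers as used in the text (L1 lowest, L4 highest)."""
    EXECUTION = 1
    OPERATIONAL = 2
    TACTICAL = 3
    STRATEGIC = 4


@dataclass(frozen=True)
class Tier:
    level: Level
    role: str      # what the tier does
    cadence: str   # how often it reasons
    scope: str     # how much of the network it sees


# Values copied from the tier descriptions; the structure itself is an assumption.
TIERS = {
    Level.STRATEGIC: Tier(Level.STRATEGIC, "Policy Optimisation", "weekly", "whole network"),
    Level.TACTICAL: Tier(Level.TACTICAL, "Domain-Model Reconciliation", "daily", "cross-entity"),
    Level.OPERATIONAL: Tier(Level.OPERATIONAL, "Node Coordination", "hourly", "one node"),
    Level.EXECUTION: Tier(Level.EXECUTION, "Atomic Decisions", "<10 ms", "one decision"),
}


def supervisor_of(level: Level) -> Optional[Level]:
    """Return the tier directly above, or None for L4, which has no tier above it."""
    if level == Level.STRATEGIC:
        return None
    return Level(level + 1)
```

Keeping cadence and scope as data on the tier, rather than on individual agents, mirrors the point that the workforce is the same and only the horizon changes.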

Two flows, one substrate

What each tier emits, and what each tier consumes.

Down: context, guardrails, targets
  • L4 to L3: network-wide policy, KPI targets, guardrail envelope.
  • L3 to L2: domain-reconciled plan-of-record, exception thresholds, priority weights.
  • L2 to L1: site-level urgency modulation, per-decision-class governance gates.
Up: escalation and feedback
  • L1 to L2: decisions actioned, exceptions surfaced, override capture.
  • L2 to L3: site-health signals, structural drift, leading indicators of policy stress.
  • L3 to L4: domain-reconciliation tensions that the current envelope cannot resolve; trigger for policy rework.

Each tier sees only the surfaces immediately above and below it. No tier is asked to reason at another tier's horizon.
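
The adjacency rule can be sketched as a routing check that rejects any message skipping a tier. This is an illustrative sketch under stated assumptions: Message, route, and the payload identifiers are hypothetical names paraphrased from the two lists above, not an actual message schema.

```python
from dataclasses import dataclass

# Allowed payload kinds per (sender tier, receiver tier), paraphrased from the text.
# The identifier spellings are assumptions made for this sketch.
DOWN = {
    (4, 3): {"policy", "kpi_targets", "guardrail_envelope"},
    (3, 2): {"plan_of_record", "exception_thresholds", "priority_weights"},
    (2, 1): {"urgency_modulation", "governance_gates"},
}
UP = {
    (1, 2): {"decisions_actioned", "exceptions", "override_capture"},
    (2, 3): {"site_health", "structural_drift", "policy_stress_indicators"},
    (3, 4): {"reconciliation_tension"},
}


@dataclass(frozen=True)
class Message:
    sender: int    # tier number, 1 to 4
    receiver: int  # tier number, 1 to 4
    kind: str      # one of the payload kinds above


def route(msg: Message) -> str:
    """Return 'down' or 'up' for a valid message; raise ValueError otherwise."""
    pair = (msg.sender, msg.receiver)
    if pair in DOWN:
        direction, allowed = "down", DOWN[pair]
    elif pair in UP:
        direction, allowed = "up", UP[pair]
    else:
        # Covers tier-skipping (for example L4 to L1) and unknown tiers.
        raise ValueError(f"L{msg.sender} and L{msg.receiver} are not adjacent tiers")
    if msg.kind not in allowed:
        raise ValueError(f"{msg.kind!r} is not exchanged between L{msg.sender} and L{msg.receiver}")
    return direction
```

For example, route(Message(4, 3, "kpi_targets")) is accepted as a downward message, while Message(4, 1, "policy") is rejected because L4 never addresses L1 directly.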

Why a hierarchy, not a flat agent swarm

A flat workforce of equally-empowered agents cannot make a network-wide policy choice and a sub-second execution call with the same surface. The policy choice needs weeks of context; the execution call has milliseconds. Mixing them invites either over-cautious execution (every L1 decision waits on a network re-plan) or under-constrained execution (every L1 agent makes a strategically incoherent local choice).

The hierarchy resolves this by making the L4 envelope the only surface that holds network-wide policy authority, and making the L1 agents the only surface that holds sub-second decision authority. Each is decoupled from the other except through the envelope-and-feedback contract. The L1 agent does not need to know how the envelope was set; it needs to know what the envelope is, right now. The L4 agent does not need to know which L1 atomic decision the network just made; it needs to know whether the policy is holding up against observed reality.
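
A minimal sketch of the envelope-and-feedback contract follows. It collapses the intermediate tiers for brevity and assumes a single hypothetical guardrail (max_order_qty) and a hypothetical escalation-rate tolerance; none of these names or thresholds come from the framework itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Envelope:
    """Snapshot of the guardrails in force right now (fields are assumptions)."""
    version: int
    max_order_qty: float  # hypothetical guardrail on a single order


@dataclass
class Feedback:
    """Aggregate signal flowing upward; no individual decision is retained."""
    decisions: int = 0
    escalations: int = 0

    @property
    def escalation_rate(self) -> float:
        return self.escalations / self.decisions if self.decisions else 0.0


def execute(proposed_qty: float, envelope: Envelope, feedback: Feedback) -> str:
    """L1 side: needs only the current envelope, not how it was set."""
    feedback.decisions += 1
    if proposed_qty <= envelope.max_order_qty:
        return "actioned"
    feedback.escalations += 1
    return "escalated"


def policy_holding(feedback: Feedback, tolerance: float = 0.1) -> bool:
    """L4 side: sees only whether the policy holds up, not which decisions were made."""
    return feedback.escalation_rate <= tolerance


envelope = Envelope(version=1, max_order_qty=100.0)
feedback = Feedback()
outcomes = [execute(qty, envelope, feedback) for qty in (40.0, 80.0, 120.0)]
```

Here two proposals are actioned and one is escalated, so one decision in three breaches the envelope and policy_holding(feedback) returns False at the default tolerance. In this sketch that is the cue for the strategic tier to rework the envelope, without it ever inspecting the individual orders.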

Crucially, every tier is governed by the same AI·IO·ML operating model. Agents always act; humans inspect, inform, or override. The hierarchical structure is what makes that operating model coherent across cadences.