Intent Routing Architecture
human.ask("I want to reduce agent noise") looks like a chat-completion call. It is not. Behind that single tool invocation is a multi-stage pipeline that classifies the message, shapes a brief, compiles a capability resolution plan, runs through autonomy and risk gates, and emits an intent_action block your agent can show, approve, and execute. This post walks the whole pipeline.
The three classification modes
The Companion's first job on every turn is to classify the user's message into one of three modes:
| Mode | What it means | Output shape |
|---|---|---|
| QUESTION | The user wants information. | text answer with citations. |
| CONTEXT | The user is sharing context for later use (no action expected). | Acknowledgment, persisted to session memory. |
| INTENT | The user wants something done. | text summary + intent_action block. |
Within INTENT, there are two sub-modes: fuzzy (the brief isn't yet specific enough to compile) and clear (compile-ready).
Classification happens in the LLM call itself — the system prompt instructs the model to emit a structured response with classification, text, and (if applicable) intent_action. The companion-agent layer parses the structured output and routes from there.
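As a rough sketch of what that routing layer might look like (the turn shape mirrors the structured output described here; the handler interface and function names are illustrative, not the actual SDK API):

```typescript
// Hypothetical sketch of a companion-agent layer routing the structured output.
type Classification = "question" | "context" | "intent";

interface CompanionTurn {
  classification: Classification;
  text: string;
  intent_action?: Record<string, unknown>; // present only for clear intents
}

interface TurnHandlers {
  answer: (text: string) => void;                     // QUESTION: show text + citations
  remember: (text: string) => void;                   // CONTEXT: persist to session memory
  propose: (action: Record<string, unknown>) => void; // INTENT: surface the intent_action
}

function routeTurn(turn: CompanionTurn, h: TurnHandlers): void {
  switch (turn.classification) {
    case "question":
      h.answer(turn.text);
      break;
    case "context":
      h.remember(turn.text);
      break;
    case "intent":
      if (turn.intent_action) {
        h.propose(turn.intent_action);
      } else {
        // Fuzzy intent: the text is a clarifying question or an inference awaiting confirmation.
        h.answer(turn.text);
      }
      break;
  }
}
```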
Step 1 — Fuzzy intent shaping
When the classification is "intent" and the brief is fuzzy, the Companion has two options:
- Ask a clarifying question. Multi-turn shaping. "When you say 'reduce noise,' do you mean fewer notifications, fewer agent runs, or different routing?"
- Make a confident inference. If session context strongly suggests the answer (e.g. recent signals show notification fatigue), the Companion can shape the brief itself and ask for confirmation.
This is the part most "agent" demos skip. They go straight from prompt to action. HUMΛN bakes in shaping because real intents are rarely clear on the first try.
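To make the shaping decision concrete, here is an illustrative sketch of choosing between a clarifying question and a confident inference. The signal names and the threshold are hypothetical, not part of the HUMΛN API:

```typescript
// Illustrative only: deciding how to shape a fuzzy "reduce noise" brief.
interface SessionSignals {
  notificationFatigue: number; // 0..1, derived from recent session context (hypothetical)
}

interface ShapingResult {
  kind: "clarify" | "confirm_inference";
  message: string;
}

function shapeFuzzyBrief(signals: SessionSignals): ShapingResult {
  if (signals.notificationFatigue > 0.8) {
    // Session context strongly suggests the answer: shape the brief and ask for confirmation.
    return {
      kind: "confirm_inference",
      message:
        "It looks like notification volume is the issue. Mute non-critical agent signals on weekends?",
    };
  }
  // Otherwise, go multi-turn and ask the user to narrow the brief.
  return {
    kind: "clarify",
    message:
      "When you say 'reduce noise', do you mean fewer notifications, fewer agent runs, or different routing?",
  };
}
```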
Step 2 — Clear intent → intent_action
Once the brief is clear, the Companion emits:
{
"classification": "intent",
"text": "I'll set up a workforce schedule that mutes non-critical agent signals on weekends.",
"intent_action": {
"tool_id": "workforce.schedule.create",
"params": { "rule": "mute non-critical", "window": "weekends" },
"autonomy": "propose",
"requires_approval": true,
"reversible": true,
"human_readable": "Mute non-critical agent signals on weekends",
"consequence": "Notifications suppressed Sat/Sun until next change.",
"provenance_scope": "workforce:schedule:write"
}
}
Every field matters:
- tool_id — identifies the capability to invoke. Resolved against the capability graph.
- params — the input shape the capability expects. Validated against the capability's schema.
- autonomy — observe, propose, or auto. Drives whether the action runs immediately or surfaces for approval.
- requires_approval — set by the risk gate, not the LLM. The LLM cannot bypass it.
- reversible — affects what auto is allowed to do (only reversible actions can run autonomously).
- human_readable — the line your UI shows in the approval prompt.
- consequence — the line that explains what happens after.
- provenance_scope — the scope the executor needs to actually run this.
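Typed out, the block from the example above looks roughly like this. This is a sketch derived from the JSON, not the authoritative schema (that lives in the AI corpus article linked at the end):

```typescript
// Rough TypeScript rendering of the intent_action block shown above.
type Autonomy = "observe" | "propose" | "auto";

interface IntentAction {
  tool_id: string;                 // capability to invoke, resolved against the capability graph
  params: Record<string, unknown>; // validated against the capability's schema
  autonomy: Autonomy;              // observe | propose | auto
  requires_approval: boolean;      // set by the risk gate, not the LLM
  reversible: boolean;             // only reversible actions may run autonomously
  human_readable: string;          // line shown in the approval prompt
  consequence: string;             // what happens after the action runs
  provenance_scope: string;        // scope the executor needs, e.g. "workforce:schedule:write"
}
```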
Step 3 — /v1/intent compiles the brief
When your agent calls human.intent (or when human.call runs the proposed action), the API endpoint POST /v1/intent does the heavy lifting:
- Brief validation. The brief schema is checked.
- Capability discovery. The capability graph returns candidates that match the brief's verbs, nouns, and context.
- Constraint solving. The compiler picks the candidate set that satisfies the brief's constraints (timing, scope, reversibility).
- Plan emission. A capability_resolution_plan is built — an ordered list of capability calls with their bound parameters.
- Autonomy resolution. The org's autonomy profile + the user's delegation determines the final autonomy for each step.
- Risk gating. Each step is run through the risk gate (kb/22 §HumanOS). Irreversible or high-blast-radius steps get requires_approval: true regardless of what the LLM said.
The plan is durable: it's stored, versioned, and can be replayed.
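A sketch of calling the endpoint from an agent follows. The POST /v1/intent endpoint and the capability_resolution_plan concept come from this post; the request/response field names, base URL, and auth header are assumptions for illustration:

```typescript
// Assumed shapes: the real plan schema may differ.
interface PlanStep {
  tool_id: string;
  params: Record<string, unknown>;
  autonomy: "observe" | "propose" | "auto"; // resolved from org profile + delegation
  requires_approval: boolean;               // stamped by the risk gate
}

interface CapabilityResolutionPlan {
  plan_id: string;   // durable: stored, versioned, replayable
  steps: PlanStep[]; // ordered capability calls with bound parameters
}

async function compileBrief(
  brief: Record<string, unknown>,
  token: string
): Promise<CapabilityResolutionPlan> {
  // Base URL is a placeholder.
  const res = await fetch("https://api.example.com/v1/intent", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${token}`,
    },
    body: JSON.stringify(brief),
  });
  if (!res.ok) {
    // Errors come back as RFC 7807 application/problem+json (see Step 4).
    throw new Error(`intent compile failed: ${res.status}`);
  }
  return (await res.json()) as CapabilityResolutionPlan;
}
```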
Step 4 — human.call executes (with the human in the loop)
human.call doesn't blindly run anything. The flow:
- Scope check. Does the calling delegation include provenance_scope? If not, 403.
- Approval check. Is requires_approval: true? If yes, the call returns pending_approval with an approval ID. The human approves out-of-band (Console, push notification, Companion turn).
- Pre-flight risk gate. Re-runs the risk evaluation against the current state, not the plan-time state. (If the org's policy changed in the last hour, the risk gate sees that.)
- Execution. The capability runs.
- Provenance. The receipt is written: who called, what scope, what params, what outcome, what duration.
- Result. Returned to the caller along with a provenance ID.
If anything fails, the response is RFC 7807 application/problem+json with a remediation field telling the caller what to do next.
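On the caller's side, handling the outcome might look like the sketch below. The pending_approval state, the 403 scope failure, and the problem+json remediation field are specified above; the exact response shapes and helper names are assumptions:

```typescript
// Illustrative handling of a human.call response; field names are assumed.
interface ProblemDetails {
  type: string;
  title: string;
  status: number;
  detail?: string;
  remediation?: string; // what the caller should do next
}

type CallOutcome =
  | { status: "ok"; result: unknown; provenance_id: string }
  | { status: "pending_approval"; approval_id: string };

async function handleCall(res: Response): Promise<CallOutcome> {
  if (!res.ok) {
    // e.g. 403 when the delegation lacks the required provenance_scope.
    const problem = (await res.json()) as ProblemDetails;
    throw new Error(
      `${problem.title}: ${problem.remediation ?? "no remediation given"}`
    );
  }
  const body = (await res.json()) as CallOutcome;
  if (body.status === "pending_approval") {
    // Execution pauses here; the human approves out-of-band
    // (Console, push notification, or a Companion turn).
    return body;
  }
  return body; // success: result plus provenance ID
}
```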
Step 5 — Autonomy: observe, propose, auto
Autonomy is set per-capability per-org via the autonomy profile, but the Companion can also propose a stricter setting at runtime:
- observe — log only, never execute. Used during onboarding.
- propose — surface as intent_action, require approval. The default.
- auto — execute without approval, but only if reversible: true. The risk gate enforces this; an LLM cannot upgrade itself to auto for an irreversible action.
Combined with the org's policy and the user's delegation, the system computes the most restrictive autonomy that everyone agrees on. The user can always add friction (require approval on something the org allows auto for); they can never remove it.
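A minimal sketch of that "most restrictive wins" rule, assuming the ordering observe < propose < auto. The function and its inputs are illustrative; the real resolution also involves the risk gate:

```typescript
type Autonomy = "observe" | "propose" | "auto";

const RANK: Record<Autonomy, number> = { observe: 0, propose: 1, auto: 2 };

function resolveAutonomy(
  orgProfile: Autonomy,       // per-capability, per-org setting
  userDelegation: Autonomy,   // the user can add friction, never remove it
  companionProposal: Autonomy // the Companion may propose a stricter setting
): Autonomy {
  // The lowest rank (most restrictive) wins.
  return [orgProfile, userDelegation, companionProposal].reduce((a, b) =>
    RANK[a] <= RANK[b] ? a : b
  );
}

// Example: org allows auto, user requires approval, Companion proposes propose
// => resolveAutonomy("auto", "propose", "propose") === "propose"
```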
Step 6 — Provenance at every step
Every intent flow leaves a chain:
- The original user message (hashed, optionally redacted).
- The Companion's classification + brief.
- The compiled plan ID.
- Every human.call invocation with delegation, scope, params hash, outcome.
- Every approval (who approved, when).
- The final result.
The chain is signed and append-only. An auditor can ask "show me everything that happened from this user's message" and get a single, ordered, cryptographically verifiable trace.
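One way to picture a link in that chain, purely as a sketch: the post says the chain is signed and append-only, but the field names and hash-linking scheme below are assumptions used to illustrate how an ordered, verifiable trace could be walked:

```typescript
// Hypothetical shape of one provenance-chain entry.
interface ProvenanceEntry {
  id: string;
  prev_hash: string;   // links to the previous entry (append-only ordering)
  kind:
    | "user_message"   // hashed, optionally redacted
    | "classification" // Companion classification + brief
    | "plan"           // compiled plan ID
    | "call"           // human.call: delegation, scope, params hash, outcome
    | "approval"       // who approved, when
    | "result";        // final result
  payload_hash: string;
  signature: string;   // signed by the HUMΛN side
  timestamp: string;   // ISO 8601
}

// An auditor walks the chain in order, checking that each prev_hash matches
// the hash of the preceding entry and that each signature verifies.
```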
A real-world walkthrough
A user types into Cursor: "scaffold a Stripe webhook connector for me"
- Classification: intent, fuzzy. Companion asks: "Do you want a connector that consumes Stripe webhooks (events flow in) or one that emits them (events flow out)?"
- Refinement: User: "Consume." Now the brief is clear.
- Compile: /v1/intent/compile returns a plan with three steps: scaffold the package, register the route, generate the manifest.
- Risk gate: Scaffolding is reversible; auto-allowed if the org's autonomy profile permits. The user's delegation is read-heavy (e.g. companion:chat + kb:read:public) without human_api:agents:invoke, so human.call never auto-runs — the action surfaces for approval and/or a wider mint.
- The intent_action block appears in Cursor with a "Run scaffolding" button.
- User approves. Cursor calls human.call. The HUMΛN side runs the three steps in order, each with provenance.
- Result: The new connector skeleton lands in the user's repo. Companion follows up: "Done. Want me to also add the webhook secret to your connector config?"
The Companion can keep proposing follow-ups because the session carries the intent state forward. The user is in control at every step. Provenance is automatic.
What this enables
This architecture is what makes "AI does things" governable:
- The LLM never bypasses the scope check.
- The risk gate is enforced server-side, not by the LLM's promises.
- Approval is real, not theatrical.
- Provenance is automatic, not opt-in.
- Reversibility is a hard gate on autonomy, not a vibe.
What's next
- AI corpus: /ai/articles/companion-intent-modes.md — INTENT/QUESTION/CONTEXT, intent_action schema.
- AI corpus: /ai/articles/guardrails-and-boundary-contracts.md — autonomy + approval semantics.
- KB: kb/22_humanos_orchestration_core.md — the risk gate in detail.