105. HUMAN AGENT SDK ARCHITECTURE
The Blueprint for Building HAIO-Compliant Agents
Version: 1.0
Status: Canonical
Priority: Must-Ship Year 1
Classification: Internal (SDK will be Open Source)
Related Must-Ships: Developer Experience (43_haio_developer_architecture.md), Meeting Muscles v0.2 (104_companion_meeting_muscles_spec.md)
CORE PHILOSOPHY: INFRASTRUCTURE IS INVISIBLE
Developers write business logic. HUMAN owns everything else.
This is not a convenience — it's a requirement for global scale (P13), security by default, and exquisite developer experience (P10).
The Principle
┌─────────────────────────────────────────────────────────────────┐
│ WHAT DEVELOPERS OWN │
│ │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ BUSINESS LOGIC │ │
│ │ │ │
│ │ • What the agent does │ │
│ │ • Domain rules │ │
│ │ • Input/output contracts │ │
│ │ • Prompts and reasoning │ │
│ │ │ │
│ └─────────────────────────────────────────────────────────┘ │
│ │
├─────────────────────────────────────────────────────────────────┤
│ WHAT HUMAN OWNS │
│ │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ EVERYTHING ELSE │ │
│ │ │ │
│ │ • Scaling (serverless, auto, predictive) │ │
│ │ • Security (zero-trust, encrypted, audited) │ │
│ │ • Storage (vaults, databases, caching) │ │
│ │ • Networking (routing, load balancing, edge) │ │
│ │ • Observability (metrics, logs, traces) │ │
│ │ • Cost optimization (model selection, right-sizing) │ │
│ │ • Compliance (retention, PII, audit trails) │ │
│ │ • Deployment (CI/CD, preview, rollback) │ │
│ │ • Identity (Passport, delegation, verification) │ │
│ │ • Provenance (logging, signing, attestation) │ │
│ │ • Reasoning (AI model routing, provider abstraction) │ │
│ │ │ │
│ └─────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────┘
Why This Matters
| Without This | With This |
|---|---|
| Developers configure scaling thresholds | HUMAN scales automatically |
| Developers manage secrets rotation | HUMAN rotates secrets |
| Developers set up monitoring | HUMAN monitors everything |
| Developers estimate database sizes | HUMAN right-sizes infrastructure |
| Developers debug distributed systems | HUMAN provides time-travel debugging |
| Developers think about cold starts | HUMAN pre-warms intelligently |
| Developers worry about security | HUMAN is secure by default |
Design Principles
- Declare outcomes, not mechanisms
  - ❌ scale_threshold: 10
  - ✅ slo: { latency: { p99: 500ms } }
- Secure by default, not opt-in
  - ❌ encryption: true
  - ✅ Everything encrypted. Always.
- Scale-to-zero by default
  - ❌ min_instances: 2
  - ✅ Serverless. Pay for what you use.
- Smart defaults that learn
  - ❌ Static thresholds
  - ✅ Adaptive based on observed behavior
- Infrastructure appears when needed
  - ❌ database: { type: postgres, size: small }
  - ✅ Use ctx.db → infrastructure auto-provisions
- Progressive permission acquisition
  - ❌ Declare all scopes upfront
  - ✅ Request scopes when needed
The Minimal Manifest
Most agents need only this:
name: invoice-processor
version: 1.0.0
capabilities: [finance/invoice/process]
Everything else has smart defaults. Optional overrides only when needed:
# Only if you have specific requirements
slo:
latency:
p99: 200ms # Stricter than default
budget:
daily: $50 # Cost cap
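To make "smart defaults with optional overrides" concrete, here is a minimal sketch of how a manifest could be resolved. The type names, the `resolveManifest` helper, and the 500ms default are assumptions for illustration only; the SDK's actual default values are not specified here.

```typescript
// Hypothetical shapes: field names mirror the manifest examples above;
// the default values are assumptions made for this sketch.
interface AgentManifest {
  name: string;
  version: string;
  capabilities: string[];
  slo?: { latency?: { p99?: string } };
  budget?: { daily?: string };
}

interface ResolvedManifest {
  name: string;
  version: string;
  capabilities: string[];
  slo: { latency: { p99: string } };
  budget: { daily: string | null }; // null = no cost cap declared
}

const ASSUMED_DEFAULT_P99 = "500ms"; // assumption, not a documented default

// Fill every omitted field with a default; declared overrides always win.
function resolveManifest(manifest: AgentManifest): ResolvedManifest {
  return {
    name: manifest.name,
    version: manifest.version,
    capabilities: [...manifest.capabilities],
    slo: { latency: { p99: manifest.slo?.latency?.p99 ?? ASSUMED_DEFAULT_P99 } },
    budget: { daily: manifest.budget?.daily ?? null },
  };
}

const minimal: AgentManifest = {
  name: "invoice-processor",
  version: "1.0.0",
  capabilities: ["finance/invoice/process"],
};

const resolvedMinimal = resolveManifest(minimal);
const resolvedStrict = resolveManifest({
  ...minimal,
  slo: { latency: { p99: "200ms" } },
  budget: { daily: "$50" },
});
```

The point of the sketch is that the three-line manifest is already complete: overrides narrow the defaults rather than enabling features.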
OVERVIEW
The HUMAN Agent SDK is the developer toolkit for building agents that operate within the HAIO protocol. It extracts the patterns from the HUMAN Companion into a reusable framework that any developer can use to create identity-aware, capability-verified, safety-bounded agents.
Strategic Importance: The Agent SDK is not a "nice to have" — it's the primary mechanism for growing the HAIO ecosystem. Every agent built on the SDK strengthens the protocol's network effects.
Core Philosophy: Developers write business logic. HUMAN owns everything else. Infrastructure is invisible.
STRATEGIC RATIONALE
Why an Agent SDK?
- Ecosystem Growth: Developers build agents → agents need identity/capability → HAIO adoption grows
- Protocol Validation: Third-party agents prove HAIO works beyond HUMAN's own products
- Network Effects: More agents → more Workforce Cloud demand → more humans trained → stronger protocol
- Revenue Path: SDK is free; infrastructure/services are paid
The Companion as Blueprint
The HUMAN Companion is the reference implementation:
- Demonstrates all SDK patterns in production
- Proves the architecture works at scale
- Provides code that can be extracted into SDK
HUMAN Companion (Production)
↓
Pattern Extraction
↓
@human/agent-sdk (Open Source)
↓
Third-Party Agents (Ecosystem)
The Human App Store
The Agent SDK enables HUMAN to become the App Store for human-AI collaboration - a vibrant marketplace where developers build and monetize across all five pillars:
Companion Pillar (Agent SDK focus):
- Capabilities (what AI can do: contract review, sentiment analysis, medical triage)
- Connectors (data sources: Salesforce, Epic EHR, Stripe, SAP)
- Extensions (UI enhancements: Gmail sidebar, Calendar overlay, Slack bot)
- Complete Agents (packaged capability + connectors + UX)
Other Pillars (additional SDKs):
- Passport Apps - Credential issuers, identity verifiers, vault providers, auth methods
- Capability Graph Apps - Evidence providers, assessment tools, capability readers
- Academy Apps - Course publishers, training modules, simulations, certifications
- Workforce Cloud Apps - Task publishers, workflow integrators, review interfaces, QA tools
The Vision:
Year 1: HUMAN builds 24 institutional agents (seed catalog - Companion pillar only) → $7.6M ARR
Year 2: Human App Store expands (Companion + Workforce Cloud + Academy open) → $50M GMV
Year 3: Full five-pillar ecosystem (all pillars open, 90% third-party) → $850M+ GMV
For Consumers:
- Browse Human App Store across all five pillars
- One-click install: Companion capabilities, Academy courses, Workforce Cloud tasks, Passport tools, Capability Graph assessments
- Access through appropriate interfaces (Companion for AI, Academy for learning, etc.)
- Free tier + paid tier (70% to developer, 30% to HUMAN)
For Developers:
- Build using Agent SDK (free & open source)
- Publish to marketplace (free or paid)
- 70/30 revenue share (developer/HUMAN)
- HUMAN handles: payments, certification, distribution, support
- Potential to earn $100k+/year from popular capabilities
For Enterprises:
- Install Human App Store apps across all pillars for teams
- Build internal-only apps with same SDKs (Companion agents, Academy training, Workforce workflows)
- Or monetize proprietary apps externally in Human App Store
- Volume licensing and enterprise deployment options
The Network Effects Flywheel:
More Capabilities Built (SDK)
↓
More Organizations Install HUMAN
↓
More Developers See Opportunity
↓
More Capabilities Built
↓
Better Coverage of Enterprise Needs
↓
More Organizations Install HUMAN
(Flywheel accelerates)
Strategic Benefits:
- Network effects: Each app across five pillars increases platform value
- Revenue: 30% of Human App Store GMV (projected $255M+ by Year 3 across all pillars)
- Ecosystem lock-in: Organizations invest in app collections across Passport, Graph, Academy, Workforce, Companion
- Category leadership: Become the standard for human-AI collaboration across the full stack
- Validation: Third-party success proves HAIO architecture at scale
Certification Process: Every marketplace submission undergoes:
- Automated security scan
- Permission audit
- HAIO compliance check
- Manual review for paid items
- Ongoing monitoring
For complete Agent Store strategy, see: 111_consumer_companion_and_agent_store.md
MIGRATION & INTEROPERABILITY
Easy migration from existing agent frameworks is a core go-to-market (GTM) lever. The SDK provides patterns for wrapping external executors while adding HUMAN's trust layer.
The Migration Philosophy
- Import in 5 minutes — their workflow, your governance
- Run hybrid for 30 days — their executor, your wrapper
- Migrate to native when ready — optional, but better UX
Developers shouldn't have to rewrite their workflows. They should get governance for free.
The Muscle Adapter Pattern
Key insight: Wrap their executor, govern with HUMAN.
A "muscle adapter" wraps external execution engines (n8n, LangChain, CrewAI, etc.) while routing all calls through MARA's policy engine.
import { muscle } from '@human/agent-sdk';
import { N8nClient } from '@human/muscles-n8n'; // Framework adapter
export const legacyInvoiceProcessor = muscle({
id: 'legacy_invoice',
// Their workflow still executes
executor: N8nClient.fromWorkflow('invoice-processor-v3'),
// HUMAN adds governance
governance: {
approval: {
threshold: '$1000',
requires: 'ctx.oversight.approve'
},
audit: 'full', // Every execution gets provenance
delegation: true, // Passport-scoped
},
});
// Usage: governed external execution
await ctx.call.muscle('legacy_invoice', { invoice });
How Muscle Adapters Work
┌─────────────────────────────────────────────────────────────────┐
│ ctx.call.muscle() │
├─────────────────────────────────────────────────────────────────┤
│ │
│ ┌───────────────────────────────────────────────────────┐ │
│ │ MARA Policy Engine │ │
│ │ • Validates delegation (P1) │ │
│ │ • Classifies risk (P5) │ │
│ │ • Routes to approval if needed (P5) │ │
│ │ • Pre-persists execution record (P7) │ │
│ └───────────────────────┬───────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌───────────────────────────────────────────────────────┐ │
│ │ Muscle Adapter │ │
│ │ • Translates ctx context to external format │ │
│ │ • Injects X-Human-* headers where possible │ │
│ │ • Captures input hash for attestation │ │
│ └───────────────────────┬───────────────────────────────┘ │
│ │ │
│ ════════════╪════════════ │
│ TRUST BOUNDARY │
│ ════════════╪════════════ │
│ ▼ │
│ ┌───────────────────────────────────────────────────────┐ │
│ │ External Executor │ │
│ │ (n8n, LangChain, CrewAI, Zapier, etc.) │ │
│ │ ⚠️ Internal execution not cryptographically verified │ │
│ └───────────────────────┬───────────────────────────────┘ │
│ │ │
│ ════════════╪════════════ │
│ TRUST BOUNDARY │
│ ════════════╪════════════ │
│ ▼ │
│ ┌───────────────────────────────────────────────────────┐ │
│ │ Egress Processing │ │
│ │ • Captures output hash for attestation │ │
│ │ • Generates HUMAN attestation (gateway-level) │ │
│ │ • Logs provenance │ │
│ └───────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────┘
What Muscle Adapters CAN Do
| Capability | How |
|---|---|
| Validate delegation at boundary | MARA ingress |
| Classify risk at boundary | MARA policy engine |
| Request approval before forwarding | ctx.oversight integration |
| Log provenance at boundaries | Input/output hash attestation |
| Revoke before external call starts | Delegation check |
What Muscle Adapters CANNOT Do
| Capability | Why Not |
|---|---|
| Enforce delegation scope within external system | External system ignores HUMAN delegation |
| Guarantee device-first operation | External system may require its control plane |
| Pause mid-execution for approval | Most external systems don't support it |
| Verify what happened internally | External execution is opaque |
| Revoke in-flight external executions | Most external systems lack revocation |
Attestation Model for Muscles
Because we can't verify internal execution, muscle attestations are explicitly marked:
{
type: 'muscle_execution',
attestationLevel: 'gateway', // Not 'full'
// What we CAN attest
inputHash: 'sha256:...',
outputHash: 'sha256:...',
delegation: { chain: ['user → agent → muscle'] },
timestamp: '2025-12-17T10:00:00Z',
policyApplied: 'risk-high-approval-required',
// Explicit limitation
limitation: 'Internal execution in external system not cryptographically verified',
externalSystem: 'n8n',
externalWorkflowId: 'invoice-processor-v3',
}
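As a rough illustration of gateway-level attestation, the sketch below hashes the input before the trust boundary, runs an opaque executor, and hashes the output on egress. The `attestAtGateway` helper and the `GatewayAttestation` type are hypothetical names, the executor is synchronous for brevity, and `JSON.stringify` stands in for a real canonical serialization.

```typescript
import { createHash } from "node:crypto";

// Hypothetical record shape: a subset of the attestation fields shown above.
interface GatewayAttestation {
  type: "muscle_execution";
  attestationLevel: "gateway";
  inputHash: string;
  outputHash: string;
  timestamp: string;
  limitation: string;
  externalSystem: string;
  externalWorkflowId: string;
}

// Hash a JSON-serializable value. JSON.stringify is key-order sensitive,
// so a production implementation would canonicalize first (assumption).
function sha256(value: unknown): string {
  return "sha256:" + createHash("sha256").update(JSON.stringify(value)).digest("hex");
}

function attestAtGateway<I, O>(
  externalSystem: string,
  externalWorkflowId: string,
  input: I,
  executor: (input: I) => O,
  now: Date = new Date(),
): { output: O; attestation: GatewayAttestation } {
  const inputHash = sha256(input); // captured before crossing the trust boundary
  const output = executor(input); // opaque: nothing inside this call is verified
  const outputHash = sha256(output); // captured on egress
  return {
    output,
    attestation: {
      type: "muscle_execution",
      attestationLevel: "gateway",
      inputHash,
      outputHash,
      timestamp: now.toISOString(),
      limitation: "Internal execution in external system not cryptographically verified",
      externalSystem,
      externalWorkflowId,
    },
  };
}

// Usage with a stand-in executor (not a real n8n client).
const { output, attestation } = attestAtGateway(
  "n8n",
  "invoice-processor-v3",
  { invoiceTotal: 1200 },
  (input) => ({ approved: input.invoiceTotal < 5000 }),
);
```

The attestation only vouches for what crossed the boundary and when, which is why the level is recorded as gateway rather than full.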
Planned Framework Adapters
| Framework | Package | Status | Notes |
|---|---|---|---|
| n8n | @human/muscles-n8n | Planned | Workflow JSON import |
| LangChain | @human/muscles-langchain | Planned | Chain wrapper |
| CrewAI | @human/muscles-crewai | Planned | Crew wrapper |
| AutoGen | @human/muscles-autogen | Planned | Agent wrapper |
| Zapier | @human/muscles-zapier | Planned | Webhook bridge |
CLI Import Commands (Planned)
# Import n8n workflow → generates HUMAN agent with muscle adapter
human-agent import n8n workflow.json --name invoice-processor
# Import LangChain agent → generates wrapped agent
human-agent import langchain agent.py --preserve-prompts
# Import CrewAI crew → generates agent suite with muscles
human-agent import crewai crew.py --name research-team
What import does:
- Parses workflow/agent definition
- Generates HUMAN agent scaffold
- Wires external steps as muscle calls
- Auto-detects human intervention points → ctx.oversight
- Preserves prompts as ctx.prompts.load()
AgentField Interoperability Note
AgentField's trust model (control-plane-first) is fundamentally incompatible with HUMAN's (device-first). A full adapter cannot enforce P1 (Sovereignty) or P4 (Distributed).
Recommendation: Gateway pattern only, with explicit limitation disclosure.
See setup/agentfield_adapter_spec_v0.1.md for detailed analysis.
Migration Path to Native
The goal is Tier 0 (HUMAN-native). Muscle adapters provide a bridge:
1. Import existing workflow → muscle adapter wraps it
2. Run with HUMAN governance for 30 days
3. Identify highest-value steps
4. Rewrite those steps native (ctx.llm, ctx.call, etc.)
5. Eventually: fully native, no muscle dependencies
Incentive: Native agents get full attestation, better UX, marketplace eligibility, and "HUMAN Certified" badge.
For complete migration strategy, see: 107_developer_adoption_playbook.md
OPEN SOURCE STRATEGY
Licensing Model
| Component | License | Rationale |
|---|---|---|
| @human/agent-sdk | Apache 2.0 | Core framework, maximum adoption |
| @human/agent-muscles | MIT | Muscle implementations, permissive |
| @human/companion-framework | Apache 2.0 | Reference implementation |
| Example agents | MIT | Educational, forkable |
| HUMAN Companion config | Proprietary | Our tuning, personality, enterprise features |
What's Open
Framework (Apache 2.0):
- Agent base classes and interfaces
- Muscle interface specifications
- Memory/Vault binding patterns
- Safety boundary enforcement
- Audit logging infrastructure
- Multi-agent coordination patterns
Muscles (MIT):
- Calendar muscle reference implementation
- Video conference abstraction layer
- Notes/summarization patterns
- Generic task routing
- Notification patterns
Examples (MIT):
- Meeting facilitator agent (simplified)
- Document reviewer agent
- Task coordinator agent
- Research assistant agent
What's Proprietary
- HUMAN Companion system prompts
- HUMAN-specific personality tuning
- Enterprise-grade muscle implementations
- Production Recall.ai integration
- Advanced facilitation logic
- Internal HUMAN workflows
DEVELOPER-FIRST EXPERIENCE
Building multi-agent systems should feel as easy as building microservices—but with trust built in.
The 10-Minute Agent
From idea to deployed agent in under 10 minutes:
# 1. Initialize (30 seconds)
human-agent init invoice-processor
cd invoice-processor
# 2. Customize handler (5 minutes)
code src/handlers/process.ts
# 3. Test locally (2 minutes)
human-agent dev
# 4. Deploy (1 minute)
human-agent deploy
What you get immediately:
- Ed25519 keypair for agent identity
- Delegation support with mock Passport
- Dev tools (DAG visualizer, approval tester)
- Production-ready manifest
CLI Commands
| Command | What It Does |
|---|---|
| human-agent init <name> | Scaffold agent with identity, delegation, manifest |
| human-agent init <name> --suite | Scaffold multi-agent suite |
| human-agent dev | Hot reload + mock Passport + DAG visualizer + cost tracking |
| human-agent dev --share | Instant shareable tunnel URL |
| human-agent test | Run tests with semantic assertions + golden output comparison |
| human-agent test --refresh-fixtures | Re-record LLM fixtures for deterministic CI |
| human-agent deploy | One command to Workforce Cloud |
| human-agent deploy --all | Deploy entire suite |
| human-agent clone <catalog-agent> | Clone from sample catalog |
| human-agent catalog list | Browse available sample agents |
| human-agent vault set KEY=value | Store secret in agent vault |
| human-agent vault list | List secrets (masked) |
| human-agent prompts list <id> | List prompt versions |
| human-agent prompts deploy <id>@v2 | Deploy specific prompt version |
| human-agent prompts rollback <id> | Rollback to previous version |
| human-agent prompts test <id>@v2 --against v1 | A/B test prompt versions |
| human-agent replay <exec-id> | Replay execution for debugging |
| human-agent golden approve <id> | Approve golden output for tests |
| human-agent docs <topic> | Show docs for SDK topic (e.g., ctx.llm) |
| human-agent ctx | Show all available ctx.* resources |
Agent Suites (Monorepo for Agents)
For multi-agent systems, use suites:
human-agent init research-suite --suite
Generated structure:
research-suite/
├── human-suite.yaml # Suite manifest
├── agents/
│ ├── planner/
│ │ ├── human-agent.yaml
│ │ └── src/handlers/
│ ├── searcher/
│ │ ├── human-agent.yaml
│ │ └── src/handlers/
│ └── synthesizer/
│ ├── human-agent.yaml
│ └── src/handlers/
├── shared/ # Shared types, utilities
│ └── types.ts
└── tests/
└── integration.test.ts
Deploy entire suite:
human-agent deploy --all # All agents, one command
Deployment Profiles: Roll Your Own or Just Works
HUMAN supports three deployment profiles to fit every operational model—from rapid prototyping to air-gapped data centers:
| Profile | Best For | Setup Time | Monitoring | Data Sovereignty |
|---|---|---|---|---|
| Hosted | Startups, rapid deployment | 5 minutes | Fully managed | HUMAN Cloud |
| Hybrid | Enterprises, data residency | 1-2 days | Managed or self-hosted | Customer VPC |
| Self-Hosted | Regulated industries, full control | 1-2 weeks | Self-hosted | Customer infrastructure |
Hosted Profile: Zero Config
For teams who want "it just works":
$ human-agent deploy # That's it
🚀 Deploying: invoice-processor
Auto-configured:
✅ Organization: Acme Corp (from your Passport)
✅ Agent DID: did:human:agent:acme:invoice-processor
✅ Monitoring: https://dashboard.human.cloud
✅ Audit logs: Enabled
✅ Deployed! (38 seconds)
HUMAN manages:
- Infrastructure (Kubernetes, databases, object storage)
- Monitoring (Prometheus, Grafana, alerts)
- Security (TLS, backups, disaster recovery)
- Compliance (SOC 2, GDPR, audit trails)
You control:
- Agent code
- Risk policies
- Approval workflows
- Cost visibility
Best for: Teams <200, focus on product not infrastructure
Hybrid Profile: Data Stays in Your VPC
For regulated industries needing data locality:
# Infrastructure team sets up once
$ human-agent hybrid setup
🔐 Installing secure tunnel agent in your VPC...
✅ Connected to HUMAN Cloud control plane
✅ Monitoring configured (choose: push or self-hosted)
# Developers deploy normally
$ human-agent deploy --profile hybrid
🚀 Deploying to: Customer VPC (us-west-2)
📊 Dashboard: https://dashboard.human.cloud
🔐 Data location: Your PostgreSQL
Architecture:
- Control plane: HUMAN Cloud (managed)
- Agent execution: Customer VPC
- Data storage: Customer infrastructure
- Monitoring: Push to HUMAN Cloud OR self-hosted Prometheus
Data residency:
- ✅ Stays in your VPC: Execution data, agent memory, audit logs, LLM prompts
- ✅ HUMAN Cloud only stores: Agent metadata, execution status (no payload data)
Secure tunnel:
- Customer-initiated connection (no inbound firewall rules)
- mTLS with certificate pinning
- Instant revocation capability
Best for: HIPAA/GDPR requirements, on-prem system integration
Self-Hosted Profile: Full Control
For air-gapped environments and maximum control:
# Install HUMAN control plane in your infrastructure (one-time)
$ helm install human-control-plane human/control-plane \
--namespace human \
--values values.yaml
# Configure CLI
$ human-agent config set control-plane https://api.human.acme.internal
# Deploy agents
$ human-agent deploy
🚀 Deploying to: Self-hosted control plane
📊 Dashboard: https://dashboard.human.acme.internal
🔐 All data: Your infrastructure
You manage:
- Control plane (Helm chart)
- Agent runtime (Kubernetes)
- Databases (PostgreSQL)
- Monitoring (Prometheus/Grafana)
- Ledger nodes (attestation storage)
HUMAN provides:
- Helm charts and Terraform modules
- Reference Prometheus/Grafana configs
- Migration tools
- Optional support contracts
Air-gapped support:
- Internal image registry
- No external connectivity required
- Manual updates via tarball
Best for: FedRAMP, DoD, financial institutions, on-premises requirements
Choosing a Deployment Profile
Start with Hosted, migrate later:
# Start with Hosted (zero config)
$ human-agent deploy
# Migrate to Hybrid when you need data sovereignty
$ human-agent migrate --to hybrid
📤 Exporting data from HUMAN Cloud...
📥 Importing to your VPC...
✅ Migration complete (zero downtime)
# Migrate to Self-Hosted for full control
$ human-agent migrate --to selfhosted \
--control-plane https://api.acme.internal
✅ Control plane switched
Decision matrix:
| Requirement | Profile |
|---|---|
| Fastest deployment | Hosted |
| Data must stay in EU/US/region | Hybrid or Self-Hosted |
| HIPAA/GDPR compliance | Hybrid or Self-Hosted |
| Air-gapped network | Self-Hosted |
| <200 team members | Hosted |
| Custom infrastructure | Self-Hosted |
| Want managed monitoring | Hosted or Hybrid (push mode) |
| Full Prometheus control | Hybrid (scrape) or Self-Hosted |
For detailed configuration:
- Hosted setup: See KB 108, Setup: setup/agent_deployment_hosted_spec.md
- Hybrid setup: See KB 108, Setup: setup/agent_deployment_hybrid_spec.md
- Self-Hosted setup: See KB 108, Setup: setup/agent_deployment_selfhosted_spec.md
- Monitoring configs: Setup: setup/monitoring_configurations.md
The human.call() Primitive
human.call() is the universal invocation primitive in HumanOS. It invokes a capability (not a specific model, agent, or human) under explicit delegation, risk, and policy constraints, producing a verifiable execution record (provenance + attestation) by default.
Invariants:
- Delegation validated (scope, expiry, revocation)
- Risk evaluated against policy
- Execution recorded (pre-persist + completion attestation)
- Routed capability-first (humans/agents/models chosen by capability, then cost/constraints)
- Human override is always available (escalate/defer/refuse as allowed)
Every agent-to-agent call uses the unified human.call() primitive:
import { human } from '@human/agent-sdk';
// Simple call
const result = await human.call({
target: 'agent://invoice.parser.parse',
input: { documentId: 'doc-123' },
});
// With delegation control
const transferResult = await human.call({
target: 'agent://payments.transfer',
input: { amount: 5000, to: 'vendor-456' },
delegation: passport.delegate({
scope: ['write:payments'],
expires: '1h',
}),
risk: 'high', // Triggers approval
});
// Parallel calls
const results = await Promise.all([
human.call({ target: 'agent://suite.validator.check', input }),
human.call({ target: 'agent://suite.scanner.scan', input }),
human.call({ target: 'agent://suite.classifier.classify', input }),
]);
// Async with callback
const executionId = await human.call({
target: 'agent://research.deep_analysis',
input: { topic: 'quantum computing' },
async: true,
callback: 'https://my-app.com/webhooks/research',
});
Parameter Semantics
target (direct) vs capability (indirect)
- target: Direct agent/resource identifier (e.g., agent://invoice.parser.parse)
- capability: HumanOS discovers capable resources automatically (preferred for flexibility)
input and schema expectations
- Input must match the target's expected schema (validated before execution)
- Schema defined in agent manifest or capability registration
delegation (required for sensitive actions)
- Explicit delegation object with scope, expiry, and revocation status
- Required for actions that modify state or access sensitive data
- Validated before execution; call fails if delegation invalid or expired
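The delegation checks described above (presence, revocation, expiry, scope) could be sketched as follows. The `Delegation` shape and `validateDelegation` function are hypothetical; only the `delegation_invalid` error code comes from this document.

```typescript
// Hypothetical delegation shape for illustration.
interface Delegation {
  scope: string[];
  expiresAt: Date;
  revoked: boolean;
}

type DelegationCheck =
  | { ok: true }
  | { ok: false; error: "delegation_invalid"; reason: string };

// Run the checks in order and stop at the first failure.
function validateDelegation(
  delegation: Delegation | undefined,
  requiredScope: string[],
  now: Date = new Date(),
): DelegationCheck {
  const fail = (reason: string): DelegationCheck => ({
    ok: false,
    error: "delegation_invalid",
    reason,
  });

  if (!delegation) return fail("missing");
  if (delegation.revoked) return fail("revoked");
  if (delegation.expiresAt.getTime() <= now.getTime()) return fail("expired");

  // Every required scope must be explicitly granted.
  const missing = requiredScope.filter((s) => !delegation.scope.includes(s));
  if (missing.length > 0) return fail(`insufficient scope: ${missing.join(", ")}`);

  return { ok: true };
}

const checkedAt = new Date("2025-12-17T10:00:00Z");
const validDelegation: Delegation = {
  scope: ["write:payments"],
  expiresAt: new Date("2025-12-17T11:00:00Z"),
  revoked: false,
};

const okCheck = validateDelegation(validDelegation, ["write:payments"], checkedAt);
const scopeCheck = validateDelegation(validDelegation, ["read:documents"], checkedAt);
const expiredCheck = validateDelegation(
  { ...validDelegation, expiresAt: new Date("2025-12-17T09:00:00Z") },
  ["write:payments"],
  checkedAt,
);
```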
risk levels and default policy behavior
- low: No approval required, standard logging
- medium: May require approval based on policy
- high: Requires human approval before execution
- critical: Multi-human approval required
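One way to read the four risk levels is as a minimum approval count. The sketch below is illustrative only: `requiredApprovals` is a hypothetical helper, and modeling "multi-human" as two approvers is an assumption, since the exact number is not stated here.

```typescript
type RiskLevel = "low" | "medium" | "high" | "critical";

// Minimum number of human approvals needed before execution.
// Assumption: "multi-human" is modeled as at least two approvers.
function requiredApprovals(risk: RiskLevel, policyRequiresApproval: boolean): number {
  switch (risk) {
    case "low":
      return 0; // standard logging only
    case "medium":
      return policyRequiresApproval ? 1 : 0; // policy decides
    case "high":
      return 1;
    case "critical":
      return 2;
  }
}
```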
async + callback
- async: true: Returns execution ID immediately, result delivered via callback
- callback: Webhook URL or event handler to receive completion notification
- Use for long-running operations or fire-and-forget patterns
idempotencyKey (recommended)
- Prevents duplicate execution of the same operation
- If call with same key exists, returns existing result instead of re-executing
- Critical for retry-safe operations
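The "return the existing result instead of re-executing" behavior can be sketched with an in-memory map. `IdempotentExecutor` is a hypothetical class for illustration; a real runtime would persist keys durably and handle concurrent calls.

```typescript
// In-memory sketch of idempotency-key handling (hypothetical helper).
class IdempotentExecutor<T> {
  private readonly results = new Map<string, T>();

  // Run the operation once per key; later calls with the same key
  // return the stored result without re-executing.
  run(idempotencyKey: string, operation: () => T): T {
    if (this.results.has(idempotencyKey)) {
      return this.results.get(idempotencyKey) as T;
    }
    const result = operation();
    this.results.set(idempotencyKey, result);
    return result;
  }
}

let transfersExecuted = 0;
const executor = new IdempotentExecutor<string>();
const transfer = (): string => {
  transfersExecuted += 1;
  return `transfer-${transfersExecuted}`;
};

const firstAttempt = executor.run("pay-vendor-456", transfer);
const retryAttempt = executor.run("pay-vendor-456", transfer); // retry: not re-executed
```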
traceparent / correlation (optional but recommended)
- W3C Trace Context header for distributed tracing
- Links calls across service boundaries for observability
- Format: 00-<trace-id>-<parent-id>-<trace-flags>
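For reference, a small sketch of building and parsing a `traceparent` value in the version-00 layout of W3C Trace Context (32 hex characters of trace ID, 16 of parent ID, 2 of flags). The helper names are hypothetical and not part of the SDK.

```typescript
import { randomBytes } from "node:crypto";

interface TraceContext {
  traceId: string; // 32 lowercase hex characters
  parentId: string; // 16 lowercase hex characters
  sampled: boolean; // lowest bit of trace-flags
}

const TRACEPARENT_V00 = /^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;

function newTraceContext(): TraceContext {
  return {
    traceId: randomBytes(16).toString("hex"),
    parentId: randomBytes(8).toString("hex"),
    sampled: true,
  };
}

function formatTraceparent(ctx: TraceContext): string {
  return `00-${ctx.traceId}-${ctx.parentId}-${ctx.sampled ? "01" : "00"}`;
}

// Returns null for malformed headers or all-zero IDs, which the spec treats as invalid.
function parseTraceparent(header: string): TraceContext | null {
  const match = TRACEPARENT_V00.exec(header);
  if (!match) return null;
  const [, traceId, parentId, flags] = match;
  if (/^0+$/.test(traceId) || /^0+$/.test(parentId)) return null;
  return { traceId, parentId, sampled: (parseInt(flags, 16) & 0x01) === 1 };
}

const parsedTrace = parseTraceparent("00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01");
const freshTrace = newTraceContext();
const freshHeader = formatTraceparent(freshTrace);
```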
policyContext (jurisdiction/domain/user prefs)
- Additional context for policy evaluation (jurisdiction, domain, user preferences)
- Enables fine-grained policy decisions beyond basic risk levels
Error Model
delegation_invalid
- Delegation missing, expired, revoked, or insufficient scope
- Resolution: Request new delegation with appropriate scope
policy_denied
- Policy engine determined action is not allowed
- Resolution: Review policy rules, escalate if needed
no_qualified_executor
- No resource found with required capability
- Resolution: Register capable resource or adjust capability requirements
executor_timeout
- Executor did not respond within timeout window
- Resolution: Retry with longer timeout or use async mode
requires_human_approval
- Risk level or policy requires human approval before execution
- Resolution: Human reviews and approves/rejects via Passport interface
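Callers may find it convenient to treat the error model as a closed set of codes. The sketch below shows one possible typing; the `HumanCallErrorCode` type and helper functions are hypothetical, while the codes and resolution text are taken from the list above.

```typescript
// The five error codes listed above, as a closed union.
type HumanCallErrorCode =
  | "delegation_invalid"
  | "policy_denied"
  | "no_qualified_executor"
  | "executor_timeout"
  | "requires_human_approval";

// Resolution guidance copied from the error model.
const RESOLUTIONS: Record<HumanCallErrorCode, string> = {
  delegation_invalid: "Request new delegation with appropriate scope",
  policy_denied: "Review policy rules, escalate if needed",
  no_qualified_executor: "Register capable resource or adjust capability requirements",
  executor_timeout: "Retry with longer timeout or use async mode",
  requires_human_approval: "Human reviews and approves/rejects via Passport interface",
};

// Type guard for codes arriving as plain strings (e.g., from a webhook payload).
function isHumanCallErrorCode(value: string): value is HumanCallErrorCode {
  return Object.prototype.hasOwnProperty.call(RESOLUTIONS, value);
}

function describeError(code: string): string {
  return isHumanCallErrorCode(code) ? RESOLUTIONS[code] : `Unknown error code: ${code}`;
}
```

Because `RESOLUTIONS` is typed as a `Record` over the union, adding a sixth code without a resolution becomes a compile-time error.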
Agent Discovery
Find agents by capability, not hardcoded IDs:
// Discover by capability
const agent = await human.discover({
capability: 'document/invoice/parse',
minConfidence: 0.9,
});
await human.call({
target: agent.id,
input: { document },
});
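Conceptually, capability-based discovery is a filter and rank over registered agents. The sketch below uses an in-memory registry; the `RegisteredAgent` shape, the `discover` function, the sample agent IDs, and the rule of picking the highest-confidence match are all assumptions for illustration, not the Capability Graph's actual behavior.

```typescript
// Hypothetical registry entry.
interface RegisteredAgent {
  id: string;
  capability: string;
  confidence: number; // 0..1
}

// Return the highest-confidence agent that offers the capability
// at or above the requested confidence, or null if none qualifies.
function discover(
  registry: readonly RegisteredAgent[],
  query: { capability: string; minConfidence: number },
): RegisteredAgent | null {
  const candidates = registry
    .filter((a) => a.capability === query.capability && a.confidence >= query.minConfidence)
    .sort((a, b) => b.confidence - a.confidence);
  return candidates[0] ?? null;
}

const sampleRegistry: RegisteredAgent[] = [
  { id: "agent://invoice.parser.basic", capability: "document/invoice/parse", confidence: 0.85 },
  { id: "agent://invoice.parser.parse", capability: "document/invoice/parse", confidence: 0.95 },
  { id: "agent://contracts.reviewer", capability: "legal/contract/risk_analysis", confidence: 0.99 },
];

const bestParser = discover(sampleRegistry, {
  capability: "document/invoice/parse",
  minConfidence: 0.9,
});
```

A null result here corresponds to the no_qualified_executor case in the error model.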
Manifest Format
Human-readable YAML configuration with all SDK features:
# human-agent.yaml
name: invoice-processor
version: 1.0.0
description: Parse, validate, and route invoices
# Capabilities (registered to Capability Graph)
capabilities:
- capability: finance/invoice-processing
evidence:
- type: test_coverage
value: 95%
# What the agent needs permission to do
delegation:
required_scope:
- read:documents
- write:accounting
max_risk: high
# Handlers with explicit secret scoping
handlers:
process:
entrypoint: src/handlers/process.ts
risk: medium
secrets: [STRIPE_KEY] # Only this handler can access
passport_scopes: [salesforce.read] # User credentials needed
parse:
entrypoint: src/handlers/parse.ts
risk: low
# Credential management
secrets:
agent: # Shared across all handlers
- OPENAI_API_KEY
- DATABASE_URL
handlers: # Handler-specific (see above)
process: [STRIPE_KEY]
# LLM configuration
llm:
default_tier: balanced # fast | balanced | powerful
providers: [openai, anthropic] # Preference order
fallback_strategy: cascade # Try next on rate limit
# Cost controls
cost_controls:
daily_limit: 50.00
dev:
at_limit: warn_and_continue
prod:
thresholds:
- percent: 80
action: notify_developer
- percent: 95
action: escalate_to_passport
- percent: 100
action: hard_stop
circuit_breaker:
trigger: 5x_normal_rate
action: pause_and_escalate
# Debugging configuration
debugging:
default_retention: metadata_only
full_capture:
handlers: [parse]
environments: [development, staging]
retention:
metadata: 90d
full_data: 7d
pii:
mode: redact
fields: [email, phone, ssn]
# Infrastructure provisioning
infrastructure:
database:
type: postgres
size: small
cache:
type: redis
storage:
type: s3
bucket: invoices
# Preview deployments
preview:
database: seeded
seed_file: ./fixtures/seed.sql
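The prod cost thresholds in the manifest above can be read as "apply the action of the highest threshold reached". A minimal sketch under that assumption follows; `costActionFor` and its types are hypothetical, and the circuit breaker is not modeled.

```typescript
type CostAction = "notify_developer" | "escalate_to_passport" | "hard_stop";

interface CostThreshold {
  percent: number; // percentage of the daily limit
  action: CostAction;
}

// Mirrors the prod thresholds from the manifest example.
const PROD_THRESHOLDS: CostThreshold[] = [
  { percent: 80, action: "notify_developer" },
  { percent: 95, action: "escalate_to_passport" },
  { percent: 100, action: "hard_stop" },
];

// Return the action for the highest threshold reached, or null below all of them.
// Compares spend * 100 against percent * limit to avoid a floating-point division.
function costActionFor(
  spend: number,
  dailyLimit: number,
  thresholds: readonly CostThreshold[] = PROD_THRESHOLDS,
): CostAction | null {
  let triggered: CostAction | null = null;
  const ascending = [...thresholds].sort((a, b) => a.percent - b.percent);
  for (const threshold of ascending) {
    if (spend * 100 >= threshold.percent * dailyLimit) {
      triggered = threshold.action;
    }
  }
  return triggered;
}
```

With the 50.00 daily limit from the manifest, a spend of 40.00 reaches the 80% threshold and 50.00 reaches the hard stop.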
Dev Mode Features
$ human-agent dev
🚀 Starting HUMAN Agent: invoice-processor
📍 Local: http://localhost:3001
🔑 Agent ID: agent://invoice-processor.local
🎭 Passport: Using mock Passport (dev mode)
Handlers:
• process → POST /handlers/process
• parse → POST /handlers/parse
Dev Tools:
• Delegation Tester: http://localhost:3001/__dev__/delegation
• Approval Queue: http://localhost:3001/__dev__/approvals
• Execution DAG: http://localhost:3001/__dev__/dag
⌨️ Press 'r' to reload | 'q' to quit
Dev tools included:
- Mock Passport - Test delegation without real Passport
- Delegation Tester - Create test tokens, simulate scope
- Approval Queue - Test approval flows locally
- DAG Visualizer - See execution tree in real-time
The Killer Feature Matrix
| What Developers Get | Without HUMAN | With HUMAN SDK |
|---|---|---|
| Create agent | Write boilerplate | human-agent init |
| Agent-to-agent calls | Build routing | human.call() |
| Deploy | Docker + K8s config | human-agent deploy |
| Trust/delegation | Build from scratch | Built-in |
| Human approval | Build UI + queue | Passport notification |
| Audit trail | Build logging | Automatic attestations |
| Revocation | Build kill-switch | One Passport tap |
| Clone templates | Copy-paste | human-agent clone |
Learning Path
- Quick Start: Clone a sample agent → human-agent clone deep-research my-agent
- Patterns: Study design patterns → KB 130
- Examples: Browse sample catalog → KB 131
- Build: Create your agent suite
- Deploy: human-agent deploy
CORE SDK PRIMITIVES
The HUMAN Agent SDK is built around a simple, unified programming model: everything is accessible through ctx (the execution context). This design makes the SDK discoverable, consistent, and easy to use across all languages.
Critical design principle: ctx is also the audit boundary. Every ctx method is automatically instrumented for provenance. Developers cannot bypass ctx — the runtime is sandboxed.
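To illustrate what "every ctx method is automatically instrumented" could mean in practice, here is a sketch that wraps a resource in a JavaScript `Proxy` and records each method call. The `instrument` helper and `ProvenanceEvent` shape are hypothetical; this shows call recording only and does not provide the sandboxing that prevents bypassing ctx.

```typescript
// Hypothetical provenance record for one ctx method call.
interface ProvenanceEvent {
  resource: string;
  method: string;
  at: string; // ISO timestamp
}

// Wrap a resource so that every method call appends a provenance event
// before delegating to the real implementation.
function instrument<T extends object>(resource: string, target: T, log: ProvenanceEvent[]): T {
  return new Proxy(target, {
    get(obj, prop, receiver) {
      const value = Reflect.get(obj, prop, receiver);
      if (typeof value !== "function" || typeof prop !== "string") {
        return value; // plain properties pass through unlogged
      }
      return (...args: unknown[]) => {
        log.push({ resource, method: prop, at: new Date().toISOString() });
        return value.apply(obj, args);
      };
    },
  });
}

// Usage with a stand-in database resource (not the real ctx.db).
const provenanceLog: ProvenanceEvent[] = [];
const fakeDb = instrument(
  "db",
  {
    region: "local",
    query: (sql: string): string[] => [`result of: ${sql}`],
  },
  provenanceLog,
);

const rows = fakeDb.query("SELECT 1");
```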
The Execution Context Pattern
Every handler receives a ctx object that provides access to all HUMAN capabilities:
import { handler } from '@human/agent-sdk';
export const processInvoice = handler({
id: 'process_invoice',
capabilities: ['finance/invoice/process'],
requires: {
scope: ['read:documents', 'write:accounting'],
vaults: ['vault://*/finance'],
},
async execute(ctx, input: { documentId: string }) {
// All resources accessible via ctx (all auto-logged for provenance)
const doc = await ctx.vaults.get('vault://acme/finance').read(`/invoices/${input.documentId}`);
const analysis = await ctx.llm.complete({ prompt: `Analyze: ${doc.content}` });
await ctx.call.agent('agent://accounting.record', { invoice: doc, analysis });
return analysis;
}
});
The ctx API Reference
| Resource | Purpose | Key Methods | HUMAN System |
|---|---|---|---|
| ctx.passport | Identity & delegation | self, principal, hasScope(), delegate() | Passport |
| ctx.oversight | Human-in-the-loop (P5) | approve(), decide(), escalate(), notify() | HumanOS |
| ctx.call | Universal capability-first routing | agent(), route(), withDelegation() | HumanOS |
| ctx.capabilities | Capability Graph queries | find(), mine(), register(), evidence() | Capability Graph |
| ctx.workforce | Human worker pool | submit(), status(), await(), cancel() | Workforce Cloud |
| ctx.vaults | Multi-vault storage | list(), get(), self | Passport (Vaults) |
| ctx.memory | Convenience over vaults | execution, session, persistent, suite | — |
| ctx.llm | LLM access with auto-routing | complete(), stream(), embed(), cost | — |
| ctx.db | Database access | query(), insert(), update() | — |
| ctx.secrets | Credential cascade | get(), list() | — |
| ctx.events | Provenance logging | log(), startSpan(), query(), export() | — |
| ctx.files | File storage | read(), write(), list() | — |
| ctx.queue | Background jobs | enqueue(), schedule() | — |
| ctx.http | HTTP client with retries | get(), post(), request() | — |
| ctx.prompts | Prompt loading | load(), render() | — |
Key Distinctions
| Concept | Resource | Description |
|---|---|---|
| Identity (static) | ctx.passport | Who am I? Who delegated? What's my scope? |
| Oversight (dynamic) | ctx.oversight | Request approval, decisions, escalation from oversight chain |
| Routing | ctx.call | Route to agents, humans, or models based on capability |
| Worker Pool | ctx.workforce | Submit tasks to HUMAN's human workforce (not the principal) |
| Capabilities | ctx.capabilities | Query/register what agents and humans can do |
| Storage | ctx.vaults | Access purpose-scoped vaults (finance, legal, HR, etc.) |
Handler Definition
The handler() wrapper defines an entry point for agent functionality:
export const analyzeContract = handler({
// Unique identifier
id: 'analyze_contract',
// Version for tracking
version: '1.0.0',
// Capabilities this handler provides (for routing)
capabilities: ['legal/contract/risk_analysis'],
// What delegation this handler requires
requires: {
scope: ['read:contracts'],
riskLevel: 'medium',
},
// Secrets this handler can access (explicit declaration)
secrets: ['LEGAL_API_KEY'],
// User credentials needed (via Passport)
passport_scopes: ['salesforce.read'],
// The implementation
async execute(ctx, input: { contract: string }) {
const analysis = await ctx.llm.complete({
prompt: buildPrompt(input.contract),
tier: 'powerful',
});
return analysis;
}
});
HUMAN-IN-THE-LOOP (ctx.oversight)
The ctx.oversight resource is the SDK surface for P5 (Human-in-the-Loop). It provides interaction with whoever is providing oversight for the current execution — typically the human who delegated, but could also be an organization's policy enforcement or another agent in the chain.
Why "oversight" not "human"?
| Original Term | Problem | Revised Term |
|---|---|---|
| ctx.human | Ambiguous: could mean user, worker, or principal | ctx.oversight |
"Oversight" clearly conveys:
- Approval, decisions, escalations
- The entity providing accountability
- Works regardless of whether delegator is human, org, or agent
ctx.oversight API
interface OversightContext {
// Request approval (blocks until response or timeout)
approve(options: {
action: string; // What we want to do
reason: string; // Why approval needed
risk: RiskClass; // How risky
timeout?: number; // Milliseconds before auto-reject
alternatives?: string[]; // Alternative options to show
}): Promise<ApprovalResult>;
// Present decision (multiple choice)
decide(options: {
question: string;
options: { id: string; label: string; description?: string }[];
default?: string;
timeout?: number;
}): Promise<DecisionResult>;
// Escalate to oversight (full handoff with rich context)
escalate(options: EscalateOptions): Promise<EscalationResult>;
// Notify without blocking
notify(message: string, options?: {
urgency?: 'info' | 'warning' | 'alert';
channel?: 'passport' | 'email' | 'slack';
}): Promise<void>;
// Check oversight availability
available(): Promise<{
online: boolean;
responseTimeEstimate?: number;
preferredChannel?: string;
}>;
}
// Rich escalation context
interface EscalateOptions {
// Developer provides: structured handoff context
why: {
reason: string; // Human-readable explanation
category: EscalationCategory; // capability_exceeded | uncertainty | policy | error | review
urgency: 'low' | 'normal' | 'urgent' | 'critical';
};
findings?: {
summary: string;
details: Record<string, unknown>;
confidence: number;
};
recommendation?: {
action: string;
reasoning: string;
alternatives?: string[];
};
question?: string; // What the agent needs answered
handoff?: {
resumeFrom: string; // Where to resume if task returns
state: Record<string, unknown>; // Serialized state
vault?: string; // Vault ref for large state
};
attachments?: AttachmentRef[];
}
// SDK auto-captures (developers don't need to provide)
interface EscalationContext extends EscalateOptions {
// Automatically populated by SDK:
execution: {
id: string;
agentDid: string;
handlerId: string;
startedAt: Date;
duration: number;
};
provenance: {
steps: ProvenanceEvent[];
delegationChain: string[];
decisionPoints: HumanDecision[];
};
cost: {
spent: number;
budget: number;
breakdown: CostBreakdown[];
};
}
Example: High-Risk Approval
export const processPayment = handler({
id: 'process_payment',
requires: { riskLevel: 'high' },
async execute(ctx, input: { amount: number; recipient: string }) {
// Request approval from oversight
const approval = await ctx.oversight.approve({
action: `Transfer $${input.amount} to ${input.recipient}`,
reason: 'Amount exceeds $1,000 threshold',
risk: 'high',
timeout: 300000, // 5 minutes
alternatives: ['Reject', 'Request more info'],
});
if (!approval.approved) {
return { status: 'rejected', reason: approval.reason };
}
// Approved, proceed
const result = await ctx.call.agent('agent://payments.transfer', {
...input,
approvalRef: approval.reference,
});
// Notify completion
await ctx.oversight.notify(`Payment completed: $${input.amount}`, {
urgency: 'info',
});
return result;
}
});
Example: Rich Escalation
export const analyzeContract = handler({
id: 'analyze_contract',
async execute(ctx, input: { contractId: string }) {
const contract = await ctx.vaults.get('vault://acme/legal')
.read(`/contracts/${input.contractId}`);
const analysis = await ctx.llm.complete({
prompt: `Analyze risks in: ${contract}`,
tier: 'powerful',
});
// Agent is uncertain about jurisdiction
if (analysis.confidence < 0.7) {
return ctx.oversight.escalate({
why: {
reason: 'Contract has cross-border implications I cannot assess',
category: 'uncertainty',
urgency: 'normal',
},
findings: {
summary: analysis.summary,
details: analysis,
confidence: analysis.confidence,
},
recommendation: {
action: 'Engage external legal counsel',
reasoning: 'Multiple jurisdictions identified',
alternatives: ['Proceed with standard review', 'Flag for compliance'],
},
question: 'Which jurisdiction should we prioritize?',
handoff: {
resumeFrom: 'jurisdiction_selected',
state: { contractId: input.contractId, analysis },
},
});
}
return analysis;
}
});
Provenance for Oversight Interactions
Every ctx.oversight.* call generates provenance:
{
type: 'oversight.approve',
executionId: 'exec-123',
request: { action: 'Transfer $5000', risk: 'high' },
response: { approved: true, approver: 'did:human:rick' },
timestamp: '2025-12-16T10:00:00Z',
responseTime: 45000, // 45 seconds
channel: 'passport_app',
}
ctx.workforce — Human Worker Pool
ctx.workforce connects to Workforce Cloud, HUMAN's marketplace of verified human workers. This is distinct from ctx.oversight:
| Resource | Who | Purpose |
|---|---|---|
| ctx.oversight | The principal (who delegated) | Approvals, decisions, escalations |
| ctx.workforce | Pool of verified workers | Task execution by humans |
ctx.workforce API
interface WorkforceContext {
// Submit a task to be completed by a human
submit(task: {
type: string; // 'review' | 'label' | 'translate' | 'verify' | custom
capability: string; // Required capability from Capability Graph
input: Record<string, unknown>; // Task data
instructions: string; // What the human should do
priority?: 'low' | 'normal' | 'high';
deadline?: Date;
constraints?: {
minQualification?: string; // Capability level required
region?: string[]; // Geographic restrictions
certifications?: string[]; // Required certs
};
}): Promise<TaskSubmission>;
// Check task status
status(taskId: string): Promise<TaskStatus>;
// Wait for completion
await(taskId: string, options?: { timeout?: number }): Promise<TaskResult>;
// Cancel pending task
cancel(taskId: string, reason: string): Promise<void>;
}
Example: Human Review in Workflow
export const processApplication = handler({
id: 'process_application',
async execute(ctx, input: { applicationId: string }) {
const app = await ctx.db.query('applications', { id: input.applicationId });
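// Serialize the record explicitly; interpolating an object directly would yield "[object Object]"
const aiAssessment = await ctx.llm.complete({ prompt: `Assess: ${JSON.stringify(app)}` });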
// Submit to human workforce for verification
const task = await ctx.workforce.submit({
type: 'review',
capability: 'hr/application/senior_review',
input: { application: app, assessment: aiAssessment },
instructions: 'Verify AI assessment and confirm hire decision',
priority: 'normal',
deadline: new Date(Date.now() + 24 * 60 * 60 * 1000), // 24h
});
// Wait for human to complete
const result = await ctx.workforce.await(task.id, { timeout: 86400000 });
return result.decision;
}
});
ctx.capabilities — Capability Graph
ctx.capabilities provides access to the Capability Graph Engine for querying what agents and humans can do.
ctx.capabilities API
interface CapabilitiesContext {
// Find entities with a capability
find(options: {
capability: string; // e.g., 'legal/contract/review'
minLevel?: number; // Minimum proficiency (0-1)
entityType?: 'agent' | 'human' | 'both';
available?: boolean; // Currently available?
}): Promise<CapabilityMatch[]>;
// Get capabilities of current agent
mine(): Promise<Capability[]>;
// Register/update a capability (with evidence)
register(capability: {
domain: string; // e.g., 'finance/tax/compliance'
evidenceRefs: string[]; // Proofs of capability
confidence: number; // Self-assessed (0-1)
}): Promise<CapabilityRegistration>;
// Add evidence to existing capability
evidence(capabilityId: string, evidence: EvidenceRef): Promise<void>;
// Check if current agent has capability
has(capability: string, minLevel?: number): Promise<boolean>;
}
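To make the `find()` semantics concrete, here is a minimal sketch over an in-memory list standing in for the Capability Graph. The `RegisteredCapability` shape, the `findCapable` name, and the exact-string capability match are assumptions for illustration; the real engine may match hierarchically or weigh evidence differently.

```typescript
// Hypothetical in-memory stand-in for the Capability Graph.
interface RegisteredCapability {
  entityDid: string;
  entityType: "agent" | "human";
  capability: string; // e.g. 'legal/contract/review'
  level: number;      // proficiency, 0-1
  available: boolean;
}

interface FindOptions {
  capability: string;
  minLevel?: number;
  entityType?: "agent" | "human" | "both";
  available?: boolean;
}

// Filters by capability, level, entity type and availability,
// then returns a new array sorted by level (best match first).
function findCapable(
  registry: RegisteredCapability[],
  options: FindOptions,
): RegisteredCapability[] {
  const { capability, minLevel = 0, entityType = "both", available } = options;
  return registry
    .filter((entry) => entry.capability === capability)
    .filter((entry) => entry.level >= minLevel)
    .filter((entry) => entityType === "both" || entry.entityType === entityType)
    .filter((entry) => available === undefined || entry.available === available)
    .sort((a, b) => b.level - a.level);
}

const demoRegistry: RegisteredCapability[] = [
  { entityDid: "did:human:agent:reviewer", entityType: "agent", capability: "legal/contract/review", level: 0.82, available: true },
  { entityDid: "did:human:alice", entityType: "human", capability: "legal/contract/review", level: 0.95, available: true },
  { entityDid: "did:human:bob", entityType: "human", capability: "legal/contract/review", level: 0.9, available: false },
];

const experts = findCapable(demoRegistry, {
  capability: "legal/contract/review",
  minLevel: 0.8,
  available: true,
});
```

With this sample data, `experts` holds Alice first and the reviewer agent second; Bob is excluded because he is unavailable.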
Example: Capability-Based Routing
export const routeToExpert = handler({
id: 'route_to_expert',
async execute(ctx, input: { taskType: string; document: string }) {
// Find who can handle this
const experts = await ctx.capabilities.find({
capability: input.taskType,
minLevel: 0.8,
available: true,
});
if (experts.length === 0) {
// No one available — escalate
return ctx.oversight.escalate({
why: { reason: 'No experts available', category: 'capability_exceeded', urgency: 'normal' },
});
}
// Route by capability and let HumanOS pick the best match
// (find() results are sorted by capability score)
return ctx.call.route({
capability: input.taskType,
input: { document: input.document },
});
}
});
UNIVERSAL ROUTING (ctx.call)
ctx.call is the universal routing primitive. By default it routes to agents, humans, or models based on capability rather than explicit targeting; agent() remains available when the target is already known.
ctx.call API
interface CallContext {
// Direct agent call (when you know the target)
agent(target: string, input: Record<string, unknown>): Promise<CallResult>;
// Capability-based routing (let HumanOS decide)
route(options: {
capability: string; // What capability is needed
input: Record<string, unknown>;
preferences?: {
preferAgent?: boolean; // Prefer AI over human?
maxCost?: number; // Cost constraint
maxLatency?: number; // Latency constraint
};
}): Promise<CallResult>;
// Wrap call with specific delegation
withDelegation(options: {
scopes: string[];
budget?: number;
expires?: Date;
}): CallContext;
}
Example: Universal Routing
export const handleTask = handler({
id: 'handle_task',
async execute(ctx, input: { capability: string; data: unknown }) {
// Route to whoever can handle it — agent or human
const result = await ctx.call.route({
capability: input.capability,
input: { data: input.data },
preferences: {
preferAgent: true, // Try AI first
maxCost: 1.00, // $1 max
},
});
return result;
}
});
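How HumanOS weighs the `preferences` passed to `route()` is not specified here. As a rough sketch, one plausible policy treats `maxCost` and `maxLatency` as hard filters and `preferAgent` as an ordering preference. The `Candidate` shape, the `selectRoute` name, and the millisecond unit for latency are assumptions.

```typescript
// Hypothetical candidate record; the real routing metadata is not specified.
interface Candidate {
  id: string;
  kind: "agent" | "human";
  cost: number;      // estimated cost in USD
  latencyMs: number; // estimated latency (unit assumed to be milliseconds)
  score: number;     // capability score, 0-1
}

interface RoutePreferences {
  preferAgent?: boolean;
  maxCost?: number;
  maxLatency?: number;
}

// Assumed policy: constraints filter candidates out, preferAgent puts agents
// ahead of humans, and capability score breaks the remaining ties.
function selectRoute(
  candidates: Candidate[],
  prefs: RoutePreferences = {},
): Candidate | undefined {
  const eligible = candidates.filter(
    (c) =>
      (prefs.maxCost === undefined || c.cost <= prefs.maxCost) &&
      (prefs.maxLatency === undefined || c.latencyMs <= prefs.maxLatency),
  );
  const ranked = [...eligible].sort((a, b) => {
    if (prefs.preferAgent && a.kind !== b.kind) {
      return a.kind === "agent" ? -1 : 1;
    }
    return b.score - a.score;
  });
  return ranked[0];
}

const routeCandidates: Candidate[] = [
  { id: "agent-basic", kind: "agent", cost: 0.5, latencyMs: 1000, score: 0.7 },
  { id: "human-expert", kind: "human", cost: 0.8, latencyMs: 60000, score: 0.9 },
  { id: "agent-premium", kind: "agent", cost: 2.0, latencyMs: 2000, score: 0.95 },
];

const chosen = selectRoute(routeCandidates, { preferAgent: true, maxCost: 1.0 });
```

Here `chosen` is `agent-basic`: the premium agent exceeds the cost cap, and the agent preference outranks the human's higher score.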
AGENT-TO-AGENT DELEGATION MODEL
When agents call other agents via ctx.call.agent(), delegation follows a chained model with automatic scoping.
The Delegation Chain
User (Rick) grants delegation to Agent A
→ Agent A calls Agent B via ctx.call.agent()
→ SDK auto-scopes: A grants B only what B needs
→ Provenance records: Rick → A → B
How It Works
// Agent A calls Agent B
await ctx.call.agent('agent://accounting.record', invoice);
// SDK automatically:
// 1. Reads B's manifest to see required scopes
// 2. Checks A's delegation allows sub-delegation (canSubDelegate)
// 3. Verifies A has the scopes B needs
// 4. Creates NEW delegation: A → B (not Rick → B directly)
// 5. Scopes to ONLY what B declared it needs (minimal privilege)
// 6. Logs provenance: "A sub-delegated to B under authority from Rick"
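Steps 2 through 5 can be sketched as a pure function. This is an illustration under stated assumptions, not the SDK's implementation: the `Delegation` shape, the `deriveSubDelegation` name, the derived ID format, and the choice to set `canSubDelegate` to false on the child are all hypothetical.

```typescript
// Hypothetical delegation record, loosely modeled on the provenance chain fields.
interface Delegation {
  id: string;
  grantor: string;
  grantee: string;
  scopes: string[];
  canSubDelegate: boolean;
  parentDelegation?: string;
}

// Creates an A → B delegation limited to what B declared it needs.
function deriveSubDelegation(
  parent: Delegation,
  targetDid: string,
  requiredScopes: string[],
): Delegation {
  // Step 2: the parent delegation must allow sub-delegation.
  if (!parent.canSubDelegate) {
    throw new Error(`${parent.grantee} is not allowed to sub-delegate`);
  }
  // Step 3: the caller must hold every scope the target needs.
  const missing = requiredScopes.filter((s) => !parent.scopes.includes(s));
  if (missing.length > 0) {
    throw new Error(`Missing scopes: ${missing.join(", ")}`);
  }
  // Steps 4 and 5: new delegation from the caller, scoped to the minimum.
  return {
    id: `${parent.id}/sub`, // placeholder ID format
    grantor: parent.grantee,
    grantee: targetDid,
    scopes: [...requiredScopes],
    canSubDelegate: false, // assumption: no further delegation by default
    parentDelegation: parent.id,
  };
}

const rickToA: Delegation = {
  id: "del-abc123",
  grantor: "did:human:rick",
  grantee: "did:human:agent:invoice-processor",
  scopes: ["read:invoices", "write:accounting"],
  canSubDelegate: true,
};

const aToB = deriveSubDelegation(
  rickToA,
  "did:human:agent:accounting-recorder",
  ["write:accounting"],
);
```

In this example `aToB` carries only `write:accounting`, names the invoice processor as grantor, and points back to `del-abc123` as its parent.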
Provenance Chain Recorded
{
chain: [
{ grantor: 'did:human:rick', grantee: 'did:human:agent:invoice-processor',
scope: ['read:invoices', 'write:accounting'], canSubDelegate: true },
{ grantor: 'did:human:agent:invoice-processor', grantee: 'did:human:agent:accounting-recorder',
scope: ['write:accounting'], parentDelegation: 'del-abc123' }
],
action: 'accounting.record',
timestamp: '2025-12-16T10:00:00Z',
signature: 'ed25519:...'
}
Explicit Pass-Through (When Required)
For cases where full delegation must pass through:
await ctx.call.withDelegation({
scopes: ctx.passport.delegation.scopes, // Full scope passthrough
}).agent('agent://accounting.record', invoice);
// ⚠️ Still creates provenance showing A delegated to B
Delegation Validation Rules
- Cannot escalate scope: B cannot receive more than A has
- Cannot exceed parent: If Rick didn't grant canSubDelegate, A cannot pass to B
- Time-bounded: Sub-delegation expires when parent expires (or sooner)
- Revocable: Rick revoking A automatically invalidates the A→B chain
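The time-bound and revocation rules can be illustrated with a small validity check over a delegation chain. This is a sketch only: `ChainLink`, `effectiveExpiry`, and `isChainValid` are hypothetical names, and the revocation list is modeled as a plain set of delegation IDs.

```typescript
// Hypothetical chain entry: one delegation in the path from principal to agent.
interface ChainLink {
  id: string;
  expiresAt?: Date;
}

// A sub-delegation can never outlive its ancestors, so the effective
// expiry is the earliest expiry anywhere in the chain.
function effectiveExpiry(chain: ChainLink[]): Date | undefined {
  let earliest: Date | undefined;
  for (const link of chain) {
    if (link.expiresAt && (!earliest || link.expiresAt.getTime() < earliest.getTime())) {
      earliest = link.expiresAt;
    }
  }
  return earliest;
}

// Revoking any link invalidates everything derived from it.
function isChainValid(chain: ChainLink[], revoked: Set<string>, now: Date): boolean {
  if (chain.some((link) => revoked.has(link.id))) return false;
  const expiry = effectiveExpiry(chain);
  return expiry === undefined || now.getTime() < expiry.getTime();
}

const demoChain: ChainLink[] = [
  { id: "del-abc123", expiresAt: new Date("2025-12-17T00:00:00Z") }, // Rick → A
  { id: "del-abc123/sub", expiresAt: new Date("2025-12-20T00:00:00Z") }, // A → B
];

const validBeforeParentExpiry = isChainValid(demoChain, new Set(), new Date("2025-12-16T10:00:00Z"));
const validAfterParentExpiry = isChainValid(demoChain, new Set(), new Date("2025-12-18T00:00:00Z"));
const validAfterRevocation = isChainValid(demoChain, new Set(["del-abc123"]), new Date("2025-12-16T10:00:00Z"));
```

Only `validBeforeParentExpiry` is true. The A→B link nominally lasts until 20 December, but it stops being valid once the parent expires or is revoked.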
ctx.passport — Identity Layer
ctx.passport provides access to the Passport identity layer. Every entity in HUMAN — humans, organizations, and agents — has a Passport.
Passport Model
interface PassportContext {
// This agent's identity
self: {
did: string; // e.g., 'did:human:agent:invoice-processor'
kind: 'human' | 'org' | 'agent';
metadata: PassportMetadata;
};
// The principal who delegated (human, org, or agent)
principal: {
did: string; // e.g., 'did:human:rick'
kind: 'human' | 'org' | 'agent';
metadata: PassportMetadata;
};
// The current delegation in effect
delegation: {
id: string; // Delegation token ID
scopes: string[]; // Granted scopes
canSubDelegate: boolean; // Can this agent delegate further?
constraints: DelegationConstraints;
expiresAt?: Date;
revokedAt?: Date;
chain: DelegationChainEntry[]; // Full delegation chain
};
// Check if current delegation includes scope
hasScope(scope: string): boolean;
// Request additional scope from principal
requestScope(options: {
scope: string;
reason: string;
}): Promise<ScopeRequestResult>;
// Create sub-delegation (if canSubDelegate = true)
delegate(options: {
target: string; // Target agent DID
scopes: string[]; // Scopes to grant (subset of own)
constraints?: DelegationConstraints;
expires?: Date;
}): Promise<DelegationToken>;
// Get delegated access to user credential (OAuth, etc.)
getAccess(scope: string): Promise<AccessGrant>;
}
Example: Checking Delegation
export const processFinance = handler({
id: 'process_finance',
async execute(ctx, input) {
// Who am I?
console.log(ctx.passport.self.did);
// → 'did:human:agent:finance-processor'
// Who delegated to me?
console.log(ctx.passport.principal.did);
// → 'did:human:rick' (a human)
// or → 'did:human:org:acme' (an org)
// or → 'did:human:agent:orchestrator' (another agent)
// What can I do?
if (!ctx.passport.hasScope('write:transactions')) {
return ctx.oversight.escalate({
why: { reason: 'Insufficient scope', category: 'capability_exceeded', urgency: 'normal' },
});
}
// Process with granted scope
// ...
}
});
ctx.vaults — Multi-Vault Storage
ctx.vaults provides access to purpose-scoped Vaults. Passport holders (humans, orgs, agents) can own and access multiple Vaults, with access controlled by delegation.
Vault Model
Key points:
- Vaults are many-to-one: A Passport holder can own/access many Vaults
- Vaults are purpose-scoped: vault://acme/finance, vault://acme/legal, vault://acme/hr
- Access is delegation-controlled: Agent only accesses Vaults included in its delegation
- Agent always has its own Vault at ctx.vaults.self
ctx.vaults API
interface VaultsContext {
// This agent's own vault (always accessible)
self: VaultHandle;
// List vaults accessible via current delegation
list(): Promise<VaultInfo[]>;
// Get handle to a specific vault
get(uri: string): VaultHandle;
// Check if vault is accessible
canAccess(uri: string): boolean;
}
interface VaultHandle {
uri: string;
// Read data
read(path: string): Promise<unknown>;
exists(path: string): Promise<boolean>;
list(prefix?: string): Promise<string[]>;
// Write data (if permitted)
write(path: string, data: unknown, options?: WriteOptions): Promise<void>;
append(path: string, data: unknown): Promise<void>;
delete(path: string): Promise<void>;
// Metadata
metadata(path: string): Promise<VaultEntryMetadata>;
// Versioning (if enabled)
history(path: string): Promise<Version[]>;
restore(path: string, version: string): Promise<void>;
}
interface WriteOptions {
overwrite?: boolean; // Default: false (version if exists)
schema?: string; // Validate against schema
ttl?: number; // Auto-delete after N seconds
encrypt?: boolean; // Encrypt at rest (default: true)
}
Manifest Configuration
# human-agent.yaml
vaults:
# Agent's own vault (auto-created)
self:
paths:
'/cache/*':
ttl: 3600 # Auto-clean after 1h
'/state/*':
schema: 'agent-state'
versioned: true
# Vaults this agent needs access to (via delegation)
requires:
- uri: 'vault://*/finance'
scopes: ['read', 'write']
- uri: 'vault://*/legal'
scopes: ['read']
# Path schemas (enforce structure)
schemas:
agent-state:
type: object
properties:
lastRun: { type: string, format: date-time }
checkpoint: { type: object }
Example: Multi-Vault Access
export const crossDepartmentAnalysis = handler({
id: 'cross_department_analysis',
requires: {
vaults: ['vault://acme/finance', 'vault://acme/legal'],
},
async execute(ctx, input: { reportType: string }) {
// Access finance vault
const financeData = await ctx.vaults.get('vault://acme/finance')
.read('/reports/quarterly');
// Access legal vault
const contracts = await ctx.vaults.get('vault://acme/legal')
.read('/contracts/active');
// Write to agent's own vault
await ctx.vaults.self.write('/analysis/latest', {
finance: financeData,
contracts,
generatedAt: new Date(),
});
return { status: 'complete' };
}
});
Vault Safety Mechanisms
The SDK enforces multiple safety layers on Vault writes:
| Mechanism | Description |
|---|---|
| Path Schemas | Only declared paths can be written |
| Namespace Isolation | Agent writes auto-prefixed with agent ID |
| Type Validation | Data validated against declared schemas |
| Path Sanitization | Prevents ../ injection attacks |
| Quotas | Size limits, key counts, write rate limits |
| Versioning | No silent overwrites (default) |
| PII Detection | Auto-detect/encrypt/reject PII |
| Audit Logging | Every write logged to provenance |
# human-agent.yaml - Safety configuration
vaults:
self:
paths:
# Only these paths allowed
'/cache/*':
writers: [self] # Only this agent
max_size: 1mb
ttl: 3600
'/state/checkpoint':
writers: [self]
schema: checkpoint-schema
versioned: true # Keep history
pii: reject # No PII allowed
quotas:
max_total_size: 100mb
max_keys: 10000
writes_per_minute: 100
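Two of the mechanisms above, path sanitization and declared-path enforcement, lend themselves to a short sketch. The function names are hypothetical, and the glob rule (a trailing `/*` matches anything beneath that prefix) is an assumption about how manifest path patterns are interpreted.

```typescript
// Normalizes a vault path and rejects traversal attempts such as '../'.
function sanitizeVaultPath(path: string): string {
  if (!path.startsWith("/")) {
    throw new Error("Vault paths must be absolute");
  }
  const segments = path.split("/").filter((s) => s.length > 0);
  if (segments.some((s) => s === "." || s === "..")) {
    throw new Error("Path traversal is not allowed");
  }
  return "/" + segments.join("/");
}

// Assumption: a pattern ending in '/*' matches any non-empty suffix under
// that prefix; any other pattern must match the path exactly.
function isDeclaredPath(patterns: string[], path: string): boolean {
  return patterns.some((pattern) => {
    if (pattern.endsWith("/*")) {
      const prefix = pattern.slice(0, -1); // '/cache/*' becomes '/cache/'
      return path.startsWith(prefix) && path.length > prefix.length;
    }
    return path === pattern;
  });
}

// Returns the sanitized path if the write is allowed, otherwise throws.
function checkVaultWrite(patterns: string[], rawPath: string): string {
  const path = sanitizeVaultPath(rawPath);
  if (!isDeclaredPath(patterns, path)) {
    throw new Error(`Path not declared in manifest: ${path}`);
  }
  return path;
}

const declaredPaths = ["/cache/*", "/state/checkpoint"];
const cachePath = checkVaultWrite(declaredPaths, "/cache//results.json");
```

With the manifest paths shown earlier, `cachePath` normalizes to `/cache/results.json`, while a write to `/cache/../state/secret` or `/logs/debug` would throw.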
ctx.memory — Memory Fabric
ctx.memory is the built-in memory fabric for agents. Storage appears when you use it — no external database setup, no vector DB configuration, no credentials to manage.
Key Design Principle:
Memory is human-owned, not control-plane-owned.
Unlike other agent platforms where the control plane owns agent memory, HUMAN's memory fabric is backed by Passport-linked Vaults. The human (or org) owns their data — portable, encrypted, and sovereign.
Why This Matters
| Without Built-in Memory | With ctx.memory |
|---|---|
| Configure Pinecone/Qdrant separately | Just use ctx.memory |
| Manage vector DB credentials | Credentials handled by platform |
| Different setup for dev/staging/prod | Same code everywhere |
| Agent code imports storage SDKs | Agent uses ctx, platform resolves |
| Data locked in vendor silos | Data portable with Passport |
Scope Mapping
| Scope | Maps To | Lifetime | Shared With |
|---|---|---|---|
| execution | In-memory | Single execution | No one |
| session | ctx.vaults.self /sessions/{id} | Session TTL | Same session |
| persistent | ctx.vaults.self /persistent/ | Permanent | Same agent |
| suite | Suite vault | Permanent | All suite agents |
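The mapping in this table can be written down as a small resolver. This is a sketch of the mapping only: the `StorageLocation` shape, the backend labels, and the key layout inside the suite vault are assumptions, since the table does not specify them.

```typescript
type MemoryScope = "execution" | "session" | "persistent" | "suite";

// Hypothetical description of where a memory key ends up.
interface StorageLocation {
  backend: "in-memory" | "self-vault" | "suite-vault";
  path: string;
}

// Resolves a scoped memory key to its backing storage, following the table:
// session keys live under /sessions/{id}, persistent keys under /persistent/.
function resolveMemoryLocation(
  scope: MemoryScope,
  key: string,
  sessionId?: string,
): StorageLocation {
  switch (scope) {
    case "execution":
      return { backend: "in-memory", path: key };
    case "session":
      if (!sessionId) throw new Error("session scope requires a sessionId");
      return { backend: "self-vault", path: `/sessions/${sessionId}/${key}` };
    case "persistent":
      return { backend: "self-vault", path: `/persistent/${key}` };
    case "suite":
      // Assumption: suite keys sit at the root of the suite vault.
      return { backend: "suite-vault", path: `/${key}` };
  }
}

const historyLocation = resolveMemoryLocation("session", "history", "sess-42");
const prefsLocation = resolveMemoryLocation("persistent", "user_prefs");
```

Under these assumptions `historyLocation.path` is `/sessions/sess-42/history` and `prefsLocation.path` is `/persistent/user_prefs`, both in the agent's own vault.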
ctx.memory API
interface MemoryContext {
// Per-execution (in-memory, fastest)
execution: ScopedMemory;
// Per-session (persisted, session TTL)
session: ScopedMemory;
// Persistent for this agent
persistent: ScopedMemory;
// Shared across agent suite
suite: ScopedMemory;
}
interface ScopedMemory {
// ═══════════════════════════════════════════════════════════════
// KEY-VALUE STORAGE
// ═══════════════════════════════════════════════════════════════
get<T>(key: string): Promise<T | undefined>;
set<T>(key: string, value: T, options?: { ttl?: number }): Promise<void>;
delete(key: string): Promise<void>;
has(key: string): Promise<boolean>;
keys(prefix?: string): Promise<string[]>;
// ═══════════════════════════════════════════════════════════════
// VECTOR STORAGE (Embeddings & Semantic Search)
// ═══════════════════════════════════════════════════════════════
/**
* Store a vector embedding with associated metadata.
* Platform handles the underlying vector database (no Pinecone/Qdrant setup).
*/
setVector(key: string, embedding: number[], metadata?: Record<string, unknown>): Promise<void>;
/**
* Retrieve a stored vector by key.
*/
getVector(key: string): Promise<VectorEntry | undefined>;
/**
* Semantic similarity search across stored vectors.
* Returns matches sorted by similarity (highest first).
*/
search(embedding: number[], options?: SearchOptions): Promise<VectorMatch[]>;
/**
* Delete a vector by key.
*/
deleteVector(key: string): Promise<void>;
/**
* List all vector keys (optionally filtered by prefix).
*/
vectorKeys(prefix?: string): Promise<string[]>;
}
interface SearchOptions {
topK?: number; // Max results to return (default: 10)
threshold?: number; // Min similarity score 0-1 (default: 0.0)
filter?: Record<string, unknown>; // Metadata filter
includeMetadata?: boolean; // Include metadata in results (default: true)
includeVectors?: boolean; // Include vectors in results (default: false)
}
interface VectorEntry {
key: string;
embedding: number[];
metadata?: Record<string, unknown>;
createdAt: Date;
updatedAt: Date;
}
interface VectorMatch {
key: string;
score: number; // Similarity score 0-1
metadata?: Record<string, unknown>;
embedding?: number[]; // Only if includeVectors: true
}
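To show how `topK`, `threshold`, and `filter` interact, here is a minimal in-memory sketch of `search()` using cosine similarity. The similarity metric is an assumption (the spec only says scores fall in 0-1 and results are sorted highest first), and raw cosine can be negative for opposing vectors, so a real implementation may normalize differently.

```typescript
interface StoredVector {
  key: string;
  embedding: number[];
  metadata?: Record<string, unknown>;
}

interface Match {
  key: string;
  score: number;
  metadata?: Record<string, unknown>;
}

// Cosine similarity; returns 0 when either vector has zero length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Applies the metadata filter, scores every entry, drops matches below the
// threshold, sorts by score (highest first) and keeps at most topK results.
function searchVectors(
  store: StoredVector[],
  query: number[],
  options: { topK?: number; threshold?: number; filter?: Record<string, unknown> } = {},
): Match[] {
  const { topK = 10, threshold = 0, filter = {} } = options;
  return store
    .filter((e) => Object.entries(filter).every(([k, v]) => e.metadata?.[k] === v))
    .map((e) => ({ key: e.key, score: cosineSimilarity(query, e.embedding), metadata: e.metadata }))
    .filter((m) => m.score >= threshold)
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}

const vectorStore: StoredVector[] = [
  { key: "a", embedding: [1, 0], metadata: { namespace: "x" } },
  { key: "b", embedding: [0, 1], metadata: { namespace: "x" } },
  { key: "c", embedding: [1, 1], metadata: { namespace: "y" } },
];

const matches = searchVectors(vectorStore, [1, 0], { topK: 5, threshold: 0.5 });
```

For the query `[1, 0]`, `matches` contains `a` (identical direction) followed by `c` (45 degrees away); `b` is orthogonal and falls below the threshold.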
Example: Memory Scopes (Key-Value)
export const conversationalAgent = handler({
id: 'conversational_agent',
async execute(ctx, input: { message: string }) {
// Execution scope: temp working data (gone after this execution)
await ctx.memory.execution.set('working', { partial: true });
// Session scope: conversation history (lasts session TTL)
const history = await ctx.memory.session.get<Message[]>('history') ?? [];
history.push({ role: 'user', content: input.message });
await ctx.memory.session.set('history', history);
// Persistent: user preferences (permanent)
const prefs = await ctx.memory.persistent.get('user_prefs');
// Suite: shared knowledge (other agents can read)
await ctx.memory.suite.set('last_interaction', {
agentId: ctx.passport.self.did,
timestamp: new Date(),
});
const response = await ctx.llm.complete({
prompt: formatPrompt(history, prefs),
});
history.push({ role: 'assistant', content: response.content });
await ctx.memory.session.set('history', history);
return response;
}
});
Example: Vector Storage (Semantic Search)
export const documentSearchAgent = handler({
id: 'document_search',
capabilities: ['docs/search', 'docs/ingest'],
async execute(ctx, input: { action: 'ingest' | 'search'; content?: string; query?: string }) {
if (input.action === 'ingest' && input.content) {
// ═══════════════════════════════════════════════════════════════
// INGEST: Store document with embedding
// ═══════════════════════════════════════════════════════════════
// Get embedding from LLM (platform handles provider)
const embedding = await ctx.llm.embed(input.content);
// Store in persistent memory — no Pinecone setup needed!
const docId = `doc:${Date.now()}`;
await ctx.memory.persistent.setVector(docId, embedding, {
content: input.content,
ingestedAt: new Date().toISOString(),
source: 'user_upload',
});
return { docId, status: 'ingested' };
}
if (input.action === 'search' && input.query) {
// ═══════════════════════════════════════════════════════════════
// SEARCH: Find similar documents
// ═══════════════════════════════════════════════════════════════
// Embed the query
const queryEmbedding = await ctx.llm.embed(input.query);
// Semantic search — platform handles vector DB
const results = await ctx.memory.persistent.search(queryEmbedding, {
topK: 5,
threshold: 0.7, // Only return good matches
includeMetadata: true,
});
return {
query: input.query,
results: results.map(r => ({
docId: r.key,
score: r.score,
content: r.metadata?.content,
})),
};
}
throw new Error('Invalid action');
}
});
Example: Cross-Agent Knowledge Sharing
// Agent A: Knowledge ingestion agent
export const knowledgeIngester = handler({
id: 'knowledge_ingester',
async execute(ctx, input: { documents: string[] }) {
for (const doc of input.documents) {
const embedding = await ctx.llm.embed(doc);
// Store in SUITE scope — other agents in this suite can search it
await ctx.memory.suite.setVector(`kb:${hash(doc)}`, embedding, {
content: doc,
ingestedBy: ctx.passport.self.did,
ingestedAt: new Date().toISOString(),
});
}
return { ingested: input.documents.length };
}
});
// Agent B: Question answering agent (different agent, same suite)
export const qaAgent = handler({
id: 'qa_agent',
async execute(ctx, input: { question: string }) {
const queryEmbedding = await ctx.llm.embed(input.question);
// Search the SUITE memory — finds docs ingested by any suite agent
const relevantDocs = await ctx.memory.suite.search(queryEmbedding, {
topK: 3,
threshold: 0.75,
});
const context = relevantDocs.map(d => d.metadata?.content).join('\n\n');
const answer = await ctx.llm.complete({
prompt: `Answer based on context:\n\nContext:\n${context}\n\nQuestion: ${input.question}`,
});
return { answer: answer.content, sources: relevantDocs.map(d => d.key) };
}
});
Example: Namespace Pattern (Multi-Tenant)
export const multiTenantSearch = handler({
id: 'multi_tenant_search',
async execute(ctx, input: { namespace: string; query: string }) {
const queryEmbedding = await ctx.llm.embed(input.query);
// Filter by namespace metadata: isolation without separate DBs
const results = await ctx.memory.persistent.search(queryEmbedding, {
topK: 10,
filter: { namespace: input.namespace }, // Metadata filter
});
// Or use key prefix pattern
const allKeys = await ctx.memory.persistent.vectorKeys(`${input.namespace}:`);
return { results, keyCount: allKeys.length };
}
});
PROVENANCE & AUDIT MODEL
ctx serves as the audit boundary. Every ctx method is automatically instrumented for provenance. Developers cannot bypass ctx.
Design Principles
- ctx IS the control point — All access goes through ctx
- Auto-instrumented — Every ctx method logs automatically
- Sandboxed runtime — Handlers cannot bypass ctx
- Hashed sensitive data — Inputs/outputs hashed, not stored raw
- Cryptographically signed — Events signed by agent
What Gets Logged
| Method | Auto-Logged Data |
|---|---|
| ctx.llm.complete() | model, tier, tokens, cost, latency, prompt hash |
| ctx.llm.embed() | model, dimensions, tokens, latency |
| ctx.call.agent() | target, delegation chain, input/output hashes |
| ctx.call.route() | capability, selected resource, routing reason |
| ctx.vaults.*.write() | vault, path, size, schema, version |
| ctx.vaults.*.read() | vault, path, found/not found |
| ctx.memory.*.set() | scope, key, size, ttl |
| ctx.memory.*.get() | scope, key, found/not found |
| ctx.memory.*.setVector() | scope, key, dimensions, metadata keys |
| ctx.memory.*.search() | scope, dimensions, topK, threshold, result count |
| ctx.oversight.approve() | action, risk, approver, decision, response time |
| ctx.oversight.escalate() | reason, category, handoff state hash |
| ctx.workforce.submit() | task type, capability, priority, worker assigned |
| ctx.db.query() | query hash, rows affected, latency |
| ctx.http.request() | url (domain only), method, status, latency |
| ctx.secrets.get() | key name (not value!), source |
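Auto-instrumentation of this kind is commonly built by wrapping each method. The sketch below shows the idea for a synchronous function: it records a hash of the input rather than the raw value, plus status and duration. Everything here is illustrative. `instrument` and `LoggedEvent` are hypothetical names, real ctx methods are asynchronous, and the FNV-1a hash (computed over UTF-16 code units) is a non-cryptographic stand-in for whatever hash the SDK actually uses.

```typescript
// Hypothetical, simplified event record (a subset of a provenance event).
interface LoggedEvent {
  type: string;
  status: "success" | "error";
  inputHash: string;
  durationMs: number;
}

// FNV-1a (32-bit) over UTF-16 code units. NOT cryptographic: a stand-in
// so the sketch needs no imports. A real system would use e.g. SHA-256.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, "0");
}

// Wraps fn so that every call appends one event to sink, whether it
// succeeds or throws. The raw arguments never reach the log.
function instrument<A extends unknown[], R>(
  type: string,
  fn: (...args: A) => R,
  sink: LoggedEvent[],
): (...args: A) => R {
  return (...args: A): R => {
    const startedAt = Date.now();
    const inputHash = fnv1a(JSON.stringify(args));
    try {
      const result = fn(...args);
      sink.push({ type, status: "success", inputHash, durationMs: Date.now() - startedAt });
      return result;
    } catch (error) {
      sink.push({ type, status: "error", inputHash, durationMs: Date.now() - startedAt });
      throw error;
    }
  };
}

const eventSink: LoggedEvent[] = [];
const loggedAdd = instrument("demo.add", (a: number, b: number) => a + b, eventSink);
const sum = loggedAdd(2, 3);
```

After the call, `sum` is 5 and `eventSink` holds a single `demo.add` event whose `inputHash` is derived from the serialized arguments.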
ctx.events API
interface EventsContext {
// SDK logs automatically — developers rarely call directly
log(event: ProvenanceEvent): Promise<void>;
// Span tracking (for nested operations)
startSpan(name: string, metadata?: Record<string, unknown>): Span;
// Developer custom events
custom(type: string, data: Record<string, unknown>): Promise<void>;
// Query provenance (for debugging)
query(options: {
executionId?: string;
timeRange?: [Date, Date];
types?: string[];
}): Promise<ProvenanceEvent[]>;
// Export for audit
export(options: ExportOptions): Promise<AuditBundle>;
}
Event Structure
interface ProvenanceEvent {
// Identity
id: string;
executionId: string;
parentSpanId?: string;
// What happened
type: string; // 'llm.complete', 'vault.write', 'oversight.approve'
status: 'started' | 'success' | 'error';
// Context
agentDid: string;
handlerId: string;
delegationChain: string[];
// Data (hashed where sensitive)
input?: string; // Hash of input
output?: string; // Hash of output
metadata: Record<string, unknown>;
// Timing
timestamp: Date;
duration?: number;
// Cost (if applicable)
cost?: { amount: number; currency: 'USD'; type: string };
// Cryptographic proof
signature: string; // Signed by agent
}
Manifest Configuration
# human-agent.yaml
provenance:
# Primary storage
primary:
type: ledger # Append-only, immutable
location: managed # or: self-hosted
retention: 7y # Regulatory compliance
# Real-time streaming
stream:
enabled: true
destinations:
- type: webhook
url: https://audit.acme.com/events
- type: kafka
topic: human-provenance
# What to capture
capture:
# Always (cannot disable)
required:
- call.*
- oversight.*
- workforce.*
- vault.write
# Optional (for debugging)
optional:
- llm.*
- db.*
- http.*
# Data handling
data:
hash_inputs: true
hash_outputs: true
full_capture:
environments: [development]
retention: 24h
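The `capture` section above uses patterns such as `call.*` and `vault.write`. A minimal sketch of how an event type could be checked against them follows, assuming a trailing `.*` matches any event under that prefix and that optional categories are switched on by a single flag (both assumptions; `shouldCapture` is a hypothetical name).

```typescript
interface CaptureConfig {
  required: string[]; // always captured, cannot be disabled
  optional: string[]; // captured only when enabled
}

// 'call.*' matches 'call.agent' and 'call.route'; 'vault.write' matches only itself.
function matchesPattern(pattern: string, eventType: string): boolean {
  if (pattern.endsWith(".*")) {
    return eventType.startsWith(pattern.slice(0, -1)); // keep the trailing dot
  }
  return eventType === pattern;
}

function shouldCapture(
  eventType: string,
  config: CaptureConfig,
  optionalEnabled: boolean,
): boolean {
  if (config.required.some((p) => matchesPattern(p, eventType))) return true;
  return optionalEnabled && config.optional.some((p) => matchesPattern(p, eventType));
}

const captureConfig: CaptureConfig = {
  required: ["call.*", "oversight.*", "workforce.*", "vault.write"],
  optional: ["llm.*", "db.*", "http.*"],
};

const captureCall = shouldCapture("call.agent", captureConfig, false);
const captureLlmWhenOff = shouldCapture("llm.complete", captureConfig, false);
const captureLlmWhenOn = shouldCapture("llm.complete", captureConfig, true);
```

With the manifest values above, `call.agent` is always captured, while `llm.complete` is captured only when optional capture is enabled.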
How Bypass is Prevented
// ❌ Can't import raw HTTP: SDK doesn't expose it
import fetch from 'node-fetch';
// ❌ Can't access process.env: blocked
const rawKey = process.env.API_KEY;
// ❌ Can't write to filesystem: blocked
import fs from 'fs';
// ✅ Must use ctx
const result = await ctx.http.get('https://api.example.com');
const key = await ctx.secrets.get('API_KEY');
await ctx.vaults.self.write('/data.json', data);
Enforcement:
- Sandboxed runtime: Handler runs in isolated environment
- Import restrictions: Only @human/agent-sdk available
- Network policies: Outbound only via ctx.http
- Filesystem isolation: Only ctx.files/ctx.vaults
CREDENTIAL MANAGEMENT
Philosophy: Progressive permission acquisition. No upfront secret lists.
Zero-Config Secrets
// Developer just uses secrets — no manifest declarations
export const processPayment = handler({
id: 'process_payment',
async execute(ctx, input) {
// Just get the secret you need
const stripeKey = await ctx.secrets.get('STRIPE_KEY');
// Runtime handles:
// - First access: "Allow STRIPE_KEY for process_payment?" (dev mode)
// - Subsequent: Auto-allow (learned pattern)
// - Production: Only allows learned patterns
}
});
Developers don't configure:
- ❌ secrets: [STRIPE_KEY, SENDGRID_KEY]
- ❌ Handler-specific secret lists
- ❌ Agent-level secret lists
Runtime learns and enforces automatically.
How It Works
┌─────────────────────────────────────────────────────────────────┐
│ CREDENTIAL CONTROLLER │
├─────────────────────────────────────────────────────────────────┤
│ │
│ DEV MODE (learning): │
│ └── Handler tries: ctx.secrets.get('STRIPE_KEY') │
│ └── Runtime: "Allow? [y/n]" (or auto-allow in dev) │
│ └── Records: "process_payment uses STRIPE_KEY" │
│ │
│ PROD MODE (enforcing): │
│ └── Handler tries: ctx.secrets.get('STRIPE_KEY') │
│ └── Runtime checks: "Is this a learned pattern?" │
│ └── If yes: Allow │
│ └── If no: Deny + alert │
│ │
└─────────────────────────────────────────────────────────────────┘
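The learn-then-enforce flow in the diagram can be sketched as a small state holder. This is an illustration, not the runtime's implementation: the class and method names are hypothetical, and dev mode is assumed to auto-allow rather than prompt.

```typescript
type ControllerMode = "dev" | "prod";

// Hypothetical controller: learns (handler, secret) pairs in dev mode
// and allows only learned pairs in prod mode.
class CredentialController {
  private mode: ControllerMode;
  private readonly learned = new Set<string>();
  readonly alerts: string[] = [];

  constructor(mode: ControllerMode) {
    this.mode = mode;
  }

  setMode(mode: ControllerMode): void {
    this.mode = mode;
  }

  authorize(handlerId: string, secretName: string): boolean {
    const pattern = `${handlerId}:${secretName}`;
    if (this.mode === "dev") {
      // Assumption: dev mode auto-allows and records the pattern.
      this.learned.add(pattern);
      return true;
    }
    if (this.learned.has(pattern)) return true;
    // Prod mode: unknown pattern, so deny and raise an alert.
    this.alerts.push(`Denied ${secretName} for ${handlerId}`);
    return false;
  }
}

const controller = new CredentialController("dev");
const learnedInDev = controller.authorize("process_payment", "STRIPE_KEY");
controller.setMode("prod");
const allowedInProd = controller.authorize("process_payment", "STRIPE_KEY");
const deniedInProd = controller.authorize("process_payment", "SENDGRID_KEY");
```

In this run the Stripe key is allowed in production because it was seen during development, while the SendGrid key is denied and produces one alert.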
The Credential Cascade
When ctx.secrets.get('KEY') is called, the runtime checks these sources in order and returns the first match:
1. Passport Keychain: User-owned (OAuth tokens, delegated)
2. Agent Vault: Agent's secrets (auto-discovered)
3. Org Vault: Org-level secrets (if granted)
4. Environment: Dev fallback (.env files)
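A minimal sketch of the cascade, assuming a first-match-wins lookup across the four sources listed above. `SecretSource`, `resolveSecret`, and `mapSource` are hypothetical names used for illustration; the real resolution logic lives in the runtime.

```typescript
// Hypothetical sketch: each source is tried in order and the first one
// holding the key wins. Source names mirror the cascade list.
type SecretSourceName = "passport_keychain" | "agent_vault" | "org_vault" | "environment";

interface SecretSource {
  name: SecretSourceName;
  lookup: (key: string) => string | undefined;
}

interface ResolvedSecret {
  value: string;
  source: SecretSourceName;
}

function resolveSecret(key: string, cascade: SecretSource[]): ResolvedSecret | undefined {
  for (const source of cascade) {
    const value = source.lookup(key);
    if (value !== undefined) {
      return { value, source: source.name };
    }
  }
  return undefined; // not found in any source
}

// Builds a source backed by an in-memory map (illustration only)
function mapSource(name: SecretSourceName, entries: Map<string, string>): SecretSource {
  return { name, lookup: (key) => entries.get(key) };
}
```

Returning the source name alongside the value makes it easy to log which level of the cascade satisfied a request.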
User Credentials (Progressive)
export const sendEmail = handler({
id: 'send_email',
// No upfront passport_scopes declaration needed
async execute(ctx, input) {
// Request access when you need it (progressive)
const gmailAccess = await ctx.passport.getAccess('gmail.send');
if (!gmailAccess.granted) {
// Runtime prompts user: "Allow email sending?"
return ctx.oversight.escalate({
why: {
reason: 'Need Gmail access to send invoice',
category: 'capability_exceeded',
urgency: 'normal',
},
});
}
// Use delegated access (short-lived, revocable)
await gmail.send(gmailAccess.token, { ... });
}
});
Key principles:
- Agent receives delegation tokens, not raw credentials
- User can revoke anytime via Passport
- No upfront scope declarations — request when needed
Automatic Secret Rotation
Runtime handles rotation automatically:
- Database credentials: Rotated every 30 days
- API keys: Rotated based on provider recommendations
- OAuth tokens: Refreshed before expiry
Developers don't think about rotation.
LLM & COST MANAGEMENT
Philosophy: Declare budget, not thresholds. Runtime optimizes automatically.
Zero-Config LLM
// Developer just calls LLM — no tier, no model selection
const result = await ctx.llm.complete({
prompt: 'Analyze this invoice...',
});
// Runtime automatically:
// - Selects optimal model based on prompt complexity
// - Considers agent's remaining budget
// - Routes to cheapest model that meets quality threshold
// - Falls back if primary provider is down
No tier selection. No model selection. No provider selection.
Optional: Declare Intent (Not Implementation)
// If you have quality requirements, declare INTENT:
const result = await ctx.llm.complete({
prompt: '...',
quality: 'high', // Intent: "I need high quality"
// NOT: tier: 'powerful' (implementation detail)
});
// Runtime decides: Claude Opus? GPT-4? Based on:
// - What's available
// - What's cheapest for this quality level
// - What's within budget
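The routing rule described here (cheapest available model that meets the quality level and fits the budget) could look something like the sketch below. Everything in it is an assumption for illustration: the `ModelOption` shape, the `MIN_QUALITY` scores, and the "standard" intent are not defined by the SDK.

```typescript
// Hypothetical sketch: pick the cheapest available model that meets the
// requested quality and fits the remaining budget.
interface ModelOption {
  id: string;
  qualityScore: number; // 0..1, higher is better (assumed scale)
  costPerCall: number;  // estimated USD per call
  available: boolean;   // false if the provider is down
}

type QualityIntent = "standard" | "high";

// Assumed mapping from declared intent to a minimum quality score
const MIN_QUALITY: Record<QualityIntent, number> = { standard: 0.6, high: 0.85 };

function selectModel(
  models: ModelOption[],
  quality: QualityIntent,
  remainingBudget: number
): ModelOption | undefined {
  const candidates = models.filter(
    (m) =>
      m.available &&
      m.qualityScore >= MIN_QUALITY[quality] &&
      m.costPerCall <= remainingBudget
  );
  // Cheapest qualifying model wins; undefined means nothing qualifies
  candidates.sort((a, b) => a.costPerCall - b.costPerCall);
  return candidates[0];
}
```

Because unavailable models are filtered out before sorting, provider fallback falls out of the same rule: the next cheapest qualifying model is chosen.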
Budget-Based Cost Control
Minimal manifest:
# human-agent.yaml
budget:
daily: $50
That's it. Runtime handles:
- When to warn (learns from your response patterns)
- When to escalate (based on spend velocity)
- Anomaly detection (automatic)
What developers DON'T configure:
- ❌ thresholds: [80%, 95%, 100%] (runtime learns)
- ❌ circuit_breaker: 5x_normal (automatic)
- ❌ action: notify_developer (smart defaults)
How Budget Works
┌─────────────────────────────────────────────────────────────────┐
│ RUNTIME COST CONTROLLER │
├─────────────────────────────────────────────────────────────────┤
│ │
│ Budget: $50/day │
│ │
│ Observed patterns: │
│ - Normal daily spend: $30 │
│ - Developer usually responds to alerts within 2h │
│ - Developer approved 90% of budget increase requests │
│ │
│ Adaptive behavior: │
│ - At $40 (80%): Start preferring cheaper models │
│ - At $45 (90%): Alert developer (learned threshold) │
│ - At $48 (96%): Escalate for approval │
│ - Spike detection: If spending 3x normal rate → alert │
│ │
└─────────────────────────────────────────────────────────────────┘
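As a rough sketch of the decision ladder in the diagram: the runtime is described as learning its thresholds, so the fixed values below (80%, 90%, 96%, 3x spike) are simply the ones shown for the $50/day example. `budgetAction`, `isSpendSpike`, and `Thresholds` are hypothetical names.

```typescript
// Hypothetical sketch of the cost controller's decision ladder.
type BudgetAction =
  | "normal"
  | "prefer_cheaper_models"
  | "alert_developer"
  | "escalate_for_approval";

interface Thresholds {
  preferCheaper: number;   // fraction of daily budget
  alert: number;
  escalate: number;
  spikeMultiplier: number; // multiple of the normal spend rate
}

// Values taken from the diagram; a real runtime would adapt them
const DIAGRAM_THRESHOLDS: Thresholds = {
  preferCheaper: 0.8,
  alert: 0.9,
  escalate: 0.96,
  spikeMultiplier: 3,
};

function budgetAction(
  spent: number,
  dailyBudget: number,
  t: Thresholds = DIAGRAM_THRESHOLDS
): BudgetAction {
  const used = spent / dailyBudget;
  // Check the most severe threshold first
  if (used >= t.escalate) return "escalate_for_approval";
  if (used >= t.alert) return "alert_developer";
  if (used >= t.preferCheaper) return "prefer_cheaper_models";
  return "normal";
}

function isSpendSpike(
  currentRate: number,
  normalRate: number,
  t: Thresholds = DIAGRAM_THRESHOLDS
): boolean {
  return currentRate >= normalRate * t.spikeMultiplier;
}
```

With a $50 budget this yields "normal" at $30, "prefer_cheaper_models" at $40, "alert_developer" at $45, and "escalate_for_approval" at $48, matching the diagram.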
Cost Observability (Automatic)
Every LLM call returns cost metadata:
const result = await ctx.llm.complete({ prompt: '...' });
// Cost info automatically available
console.log(ctx.llm.cost);
// {
// thisCall: 0.02,
// sessionTotal: 12.50,
// dailyRemaining: 5.00,
// status: 'healthy' // or: 'approaching_limit', 'needs_approval'
// }
Model Selection is HUMAN's Problem
| What Developers Think About | What HUMAN Handles |
|---|---|
| "Analyze this invoice" | Which model? Which provider? |
| "I need high quality" | GPT-4 vs Claude Opus vs Gemini Ultra? |
| "This is simple" | GPT-3.5 vs Claude Haiku vs Gemini Flash? |
| "I have $50/day" | Routing to stay within budget |
| "Something's wrong" | Fallback to backup provider |
REASONING SERVICE & MARKETPLACE CERTIFICATION
Updated: 2025-12-19
Reasoning as First-Class OS Primitive
The HUMAN SDK provides AI reasoning as an OS-level service through ctx.reason(). This replaces direct LLM integration and provides:
- Automatic model selection (capability-based routing)
- Zero configuration (inherits org's keys/policies)
- Governance integration (data tier constraints enforced)
- Provider abstraction (same API across OpenAI, Anthropic, local models)
See comprehensive specification: 141_reasoning_service_architecture.md
Basic Usage
// Simple reasoning call
const result = await ctx.reason({
task: "summarize",
input: document,
preferences: { latency: "interactive" }
});
// Sugar functions
const summary = await ctx.summarize(text);
const classification = await ctx.classify({ items, labels });
What you DON'T do:
- ❌ Import OpenAI/Anthropic SDKs
- ❌ Manage API keys
- ❌ Handle provider differences
- ❌ Configure governance rules
Everything inherits from org context automatically.
Agent Manifest: Declaring Reasoning Requirements
# human-agent.yaml
reasoning_requirements:
capabilities:
- natural_language
- classification
data_tiers:
- Working
supported_profiles:
- standard
- standard_safety
At runtime: HumanOS matches your requirements to org's available models automatically.
Marketplace Certification Tiers
Agents published to marketplace get certification badges based on portability:
🟢 Verified Portable (Preferred)
- Pure capability-based routing
- Works on any org's model setup
- Featured placement in marketplace
- Standard rev share (70% developer / 30% HUMAN)
🟡 Profile Required
- Requires specific reasoning profile (e.g., "high_safety")
- Works if org has compatible profile
- Clear compatibility indicator
- Standard rev share
🔴 Model-Specific (Use Sparingly)
- Pinned to HUMAN alias (e.g., haio.sonnet_4_5_strict)
- Only works if org has that specific model
- Lower install rate
- Higher HUMAN rev share (60/40)
- Requires justification + manual review
Best Practice: Build Portable
# ✅ GOOD: Portable agent
reasoning_requirements:
capabilities: ["natural_language", "tools"]
min_context: 32000
# ⚠️ OK: Profile-specific (if needed)
reasoning_requirements:
profiles: ["high_safety"]
reason: "Handles PHI, requires HIPAA-compliant models"
# 🚫 AVOID: Model-specific (only if critical)
reasoning_requirements:
model_alias: "haio.sonnet_4_5_strict"
reason: "FDA-certified workflow"
Portable agents install everywhere. Model-specific agents have limited reach.
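To make the tiering concrete, here is a hedged sketch of how a manifest's reasoning_requirements might map to a certification tier. The field names follow the Best Practice YAML examples (`profiles`, `model_alias`); the function and type names are hypothetical, and actual certification also involves manual review that this sketch ignores.

```typescript
// Hypothetical sketch: derive the marketplace tier from reasoning_requirements.
interface ReasoningRequirements {
  capabilities?: string[];
  min_context?: number;
  profiles?: string[];
  model_alias?: string;
  reason?: string;
}

type CertificationTier = "verified_portable" | "profile_required" | "model_specific";

function certificationTier(req: ReasoningRequirements): CertificationTier {
  // A pinned model alias is the most restrictive requirement, so check it first
  if (req.model_alias !== undefined) return "model_specific";
  if (req.profiles !== undefined && req.profiles.length > 0) return "profile_required";
  // Only capability-based requirements remain: portable across orgs
  return "verified_portable";
}
```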
TESTING PATTERNS
The SDK provides testing utilities designed for non-deterministic LLM outputs.
Semantic Assertions
Test that outputs contain expected concepts, not exact strings:
import { test, expectSemantic } from '@human/agent-sdk/testing';
test('analyzeContract returns risk analysis', async () => {
const result = await analyzeContract({ contract: sampleContract });
// 85% semantic similarity threshold (configurable)
await expectSemantic(result).toContain([
'liability clauses',
'termination rights',
'risk assessment',
]);
// Override threshold for stricter tests
await expectSemantic(result, { threshold: 0.95 }).toContain([...]);
});
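The threshold idea behind `expectSemantic` can be illustrated with a toy scorer. This is not how the SDK measures similarity: a real implementation would presumably compare embeddings, whereas the sketch below uses simple token containment as a stand-in. `conceptScore` and `missingConcepts` are hypothetical helpers.

```typescript
// Hypothetical sketch: threshold-based concept matching.
// Token containment stands in for real semantic similarity.
function tokenizeWords(text: string): string[] {
  return text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((word) => word.length > 0);
}

// Fraction of the concept's words that appear anywhere in the output (0..1)
function conceptScore(output: string, concept: string): number {
  const outputWords = new Set(tokenizeWords(output));
  const conceptWords = tokenizeWords(concept);
  if (conceptWords.length === 0) return 0;
  const hits = conceptWords.filter((word) => outputWords.has(word)).length;
  return hits / conceptWords.length;
}

// Returns the concepts whose score falls below the threshold
function missingConcepts(output: string, concepts: string[], threshold = 0.85): string[] {
  return concepts.filter((concept) => conceptScore(output, concept) < threshold);
}
```

An assertion helper built on this would pass when `missingConcepts` returns an empty list and otherwise report which concepts were not found.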
Golden Output Testing
Record and compare against approved outputs:
import { test, recordGolden } from '@human/agent-sdk/testing';
test('analyzeContract matches golden output', async () => {
const result = await analyzeContract({ contract: sampleContract });
// First run: Records output, marks as "pending review"
// Subsequent runs: Compares semantically to approved golden
await recordGolden('analyze-contract', result, {
semanticSimilarityThreshold: 0.85,
});
});
Developer workflow:
# First run records output
$ human-agent test
📸 New golden output recorded: analyze-contract.golden.json
Status: PENDING REVIEW
# Developer reviews and approves
$ human-agent golden approve analyze-contract
✅ Golden output approved by rick@human.com
# CI enforces
$ human-agent test
✅ analyze-contract: 91% similar to golden (threshold: 85%)
Deterministic Mode (CI/CD)
Record real LLM calls and replay in CI:
// In dev: Record mode
beforeAll(() => {
ctx.llm.setMode('record'); // Calls real LLM, saves responses
});
// In CI: Replay mode (deterministic)
beforeAll(() => {
ctx.llm.setMode('replay'); // Uses saved responses
});
Fixtures stored in: fixtures/llm-responses/{input-hash}.json
# Refresh fixtures with current LLM
$ human-agent test --refresh-fixtures
🔄 Refreshing 23 fixtures...
✅ Done. Review changes in fixtures/
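A minimal sketch of record/replay keyed by an input hash, assuming fixtures are looked up by a hash of the prompt. `FixtureLlm` and `fnv1a` are hypothetical; FNV-1a is used only to keep the example dependency-free, and the call is synchronous here for brevity although real LLM calls are async.

```typescript
// Hypothetical sketch of record/replay with hash-keyed fixtures.
// 32-bit FNV-1a over UTF-16 code units, returned as 8 hex characters.
function fnv1a(input: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, "0");
}

type LlmMode = "record" | "replay";
type LlmCall = (prompt: string) => string;

class FixtureLlm {
  private fixtures = new Map<string, string>();
  private mode: LlmMode = "record";
  private realLlm: LlmCall;

  constructor(realLlm: LlmCall) {
    this.realLlm = realLlm;
  }

  setMode(mode: LlmMode): void {
    this.mode = mode;
  }

  complete(prompt: string): string {
    const key = fnv1a(prompt);
    if (this.mode === "record") {
      // Record: call the real LLM and save the response under the input hash
      const response = this.realLlm(prompt);
      this.fixtures.set(key, response);
      return response;
    }
    // Replay: never call the real LLM; fail loudly if no fixture exists
    const saved = this.fixtures.get(key);
    if (saved === undefined) {
      throw new Error(`No fixture recorded for input hash ${key}`);
    }
    return saved;
  }
}
```

Failing on a missing fixture in replay mode is a deliberate choice in this sketch: it keeps CI deterministic instead of silently falling back to a live call.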
TIME-TRAVEL DEBUGGING
The SDK records execution history for debugging and replay.
Storage Model
| Mode | What's Stored | Retention | Replay? |
|---|---|---|---|
| Metadata (default) | Hashes, timing, status | 90 days | ❌ No |
| Full Capture (opt-in) | All inputs, outputs, LLM calls | 7 days | ✅ Yes |
Manifest Configuration
# human-agent.yaml
debugging:
# Default: metadata only (privacy-safe)
default_retention: metadata_only
# Opt-in: full data for specific handlers
full_capture:
handlers:
- parse_invoice # Need to debug this one
environments:
- development
- staging
# Never production unless explicit
# Retention periods
retention:
metadata: 90d
full_data: 7d
# PII handling
pii:
mode: redact # Auto-redact detected PII
fields: [email, phone, ssn]
Replay Executions
$ human-agent replay exec-abc123
Replaying execution: exec-abc123 (invoice-processor)
Step 1/5: parser.parse ✅ Completed (234ms)
Step 2/5: validator.check ✅ Completed (45ms)
Step 3/5: router.route ❌ Failed: "No accounting dept found"
Paused at step 3. Options:
[r] Resume [s] Step [e] Edit input [q] Quit
LLM Response Recording
When full capture is enabled, LLM responses are recorded for exact replay:
{
step: 'analyze_contract',
llm: {
provider: 'openai',
model: 'gpt-4-turbo',
prompt: 'Analyze this contract...',
response: 'The contract contains...',
tokens: { input: 1500, output: 300 },
cost: 0.02,
}
}
Provenance vs Debug Data
Provenance (immutable, never deleted):
- Cryptographically signed
- Hashes only (no raw content)
- Proves WHAT happened
Debug data (opt-in, time-limited):
- Full content for replay
- Auto-expires after retention period
- Shows HOW it happened
PROMPT VERSIONING
Prompts are version-controlled in the code repo and published to a runtime registry.
Prompt File Format
<!-- prompts/analyze-contract.md -->
---
id: analyze-contract
version: 2.1.0
description: Analyze legal contracts for risk
author: rick@human.com
---
# Contract Risk Analysis
Analyze the following contract and identify:
1. Liability clauses
2. Termination rights
3. Financial obligations
{{contract}}
Return analysis as JSON with riskLevel, findings, confidence.
CI Publishing
# .github/workflows/prompts.yml
on:
  push:
    paths: ['prompts/**']
    branches: [main]
jobs:
  publish-prompts:
    runs-on: ubuntu-latest  # assumed runner; every job needs one
    steps:
      - uses: actions/checkout@v4
      - run: human-agent prompts publish
Handler Usage
export const analyzeContract = handler({
id: 'analyze_contract',
// Pin to specific version (recommended for prod)
prompt: 'prompts/analyze-contract@2.1.0',
async execute(ctx, input) {
const prompt = await ctx.prompts.load('analyze-contract');
return ctx.llm.complete({
prompt: prompt.render({ contract: input.contract })
});
}
});
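For illustration, the two mechanics used in this handler (a pinned `id@version` reference and `{{placeholder}}` rendering) might be implemented along these lines. `parsePromptRef` and `renderPrompt` are hypothetical helpers, and the reference format is inferred from the single example above.

```typescript
// Hypothetical sketch: parse a pinned prompt reference and render placeholders.
interface PromptRef {
  id: string;
  version?: string; // undefined means "not pinned"
}

function parsePromptRef(ref: string): PromptRef {
  // Accepts "prompts/analyze-contract@2.1.0" or a bare "analyze-contract"
  const name = ref.startsWith("prompts/") ? ref.slice("prompts/".length) : ref;
  const at = name.lastIndexOf("@");
  if (at === -1) return { id: name };
  return { id: name.slice(0, at), version: name.slice(at + 1) };
}

function renderPrompt(template: string, variables: Record<string, string>): string {
  // Replace each {{name}} with its value; unknown names are an error
  return template.replace(/\{\{\s*(\w+)\s*\}\}/g, (_match: string, name: string) => {
    const value: unknown = variables[name];
    if (typeof value !== "string") {
      throw new Error(`Missing prompt variable: ${name}`);
    }
    return value;
  });
}
```

Throwing on a missing variable, rather than leaving the placeholder in place, avoids sending a half-rendered prompt to the model.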
Version Management
# List versions
$ human-agent prompts versions analyze-contract
v2.1.0 (current) - 2025-12-16 - "Added confidence scoring"
v2.0.0 - 2025-12-01 - "Restructured output format"
v1.0.0 - 2025-11-15 - "Initial version"
# Rollback
$ human-agent prompts rollback analyze-contract --to v2.0.0
⚠️ This will update production. Continue? (y/n): y
✅ Rolled back to v2.0.0
# A/B test
$ human-agent prompts test analyze-contract@v2.1.0 --against v2.0.0
Running 50 test cases...
v2.0.0: 85% quality score, $0.015 avg cost
v2.1.0: 91% quality score, $0.018 avg cost
📊 v2.1.0 is 7% better quality, 20% more expensive
INFRASTRUCTURE PROVISIONING
Philosophy: Infrastructure appears when you use it. No configuration required.
Zero-Config Infrastructure
// Developer writes this:
export const processInvoice = handler({
id: 'process_invoice',
async execute(ctx, input) {
// Use database → it appears
await ctx.db.query('invoices', { id: input.id });
// Use cache → it appears
await ctx.cache.get('recent');
// Use file storage → it appears
await ctx.files.write('/reports/latest.pdf', pdf);
// Use queue → it appears
await ctx.queue.enqueue('process', task);
}
});
HUMAN auto-provisions:
- Database when ctx.db is used
- Cache when ctx.cache is used
- Storage when ctx.files is used
- Queue when ctx.queue is used
No manifest configuration needed. No sizing. No provisioning.
How It Works
┌─────────────────────────────────────────────────────────────────┐
│ FIRST DEPLOYMENT │
├─────────────────────────────────────────────────────────────────┤
│ │
│ 1. SDK analyzes handler code │
│ - "This handler uses ctx.db and ctx.cache" │
│ │
│ 2. Runtime provisions required infrastructure │
│ - PostgreSQL (right-sized based on usage) │
│ - Redis (right-sized based on usage) │
│ │
│ 3. Auto-scaling based on observed patterns │
│ - DB connections grow/shrink with load │
│ - Cache size adjusts to hit rate │
│ │
└─────────────────────────────────────────────────────────────────┘
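Step 1 of the diagram (detecting which primitives a handler uses) could be approximated as below. This is a deliberately naive sketch that scans source text with a regular expression; a real analyzer would more likely walk the AST, and this version would also count mentions inside comments or strings. All names are hypothetical.

```typescript
// Hypothetical sketch: detect ctx.* infrastructure usage by scanning source text.
type InfraResource = "database" | "cache" | "storage" | "queue";

const PRIMITIVE_TO_RESOURCE: Record<string, InfraResource> = {
  db: "database",
  cache: "cache",
  files: "storage",
  queue: "queue",
};

function detectInfrastructure(handlerSource: string): InfraResource[] {
  const found = new Set<InfraResource>();
  const pattern = /\bctx\.(db|cache|files|queue)\b/g;
  let match: RegExpExecArray | null = pattern.exec(handlerSource);
  while (match !== null) {
    const primitive = match[1];
    if (primitive !== undefined) {
      found.add(PRIMITIVE_TO_RESOURCE[primitive]);
    }
    match = pattern.exec(handlerSource);
  }
  // Sorted so the result is stable regardless of order of use
  return Array.from(found).sort();
}
```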
Dev Mode
$ human-agent dev
🚀 Starting HUMAN Agent: invoice-processor
📦 Auto-detected infrastructure needs...
✅ PostgreSQL (local container)
✅ Redis (local container)
🔗 Ready at http://localhost:3001
Production Deployment
$ git push origin main
🚀 Deploying invoice-processor...
📦 Provisioning infrastructure...
✅ PostgreSQL (managed, auto-sized)
✅ Redis (managed, auto-sized)
✅ Live at https://invoice-processor.agents.human.dev
Optional: Data Residency Override
# human-agent.yaml (only if required)
compliance:
data_residency: eu # Keep data in EU
# That's it. HUMAN figures out:
# - Which regions to use
# - Which services comply
# - Replication strategy
Preview Deployments
Every branch gets isolated infrastructure automatically:
$ git push origin fix-invoice-bug
🚀 Preview deployment created!
URL: https://invoice-processor-pr-42.agents.human.dev
Infrastructure: Isolated (auto-provisioned)
Data: Synthetic test data (default)
Auto-delete: On branch merge/close
No seed files. No database snapshots. HUMAN generates realistic test data.
Optional override:
# human-agent.yaml (only if needed)
preview:
data: staging_snapshot # Copy from staging (rare)
SCALING & AGENT POOLS
Philosophy: Serverless by default. Scale-to-zero. SLO-driven. No configuration.
For conceptual architecture (logical instances vs physical replicas, workflow-level scaling, queue-based burst handling), see: 55_multi_agent_runtime_architecture.md - Runtime Scaling Architecture section.
This section covers the SDK developer experience for scaling configuration.
Default: Serverless (Scale-to-Zero)
# human-agent.yaml
name: invoice-processor
capabilities: [finance/invoice/process]
# No scaling config needed. Defaults:
# - Serverless (scale-to-zero when idle)
# - Scale up automatically under load
# - Pay only for invocations
Developers don't configure:
- ❌ min_instances: 2
- ❌ max_instances: 20
- ❌ scale_threshold: 10
- ❌ scale_down_delay: 300
Runtime handles all scaling automatically.
SLO-Driven Scaling
# human-agent.yaml (only if specific latency requirements)
slo:
latency:
p99: 200ms # "Keep p99 under 200ms"
Runtime automatically:
- Monitors p50, p95, p99 latency
- Scales up when approaching SLO breach
- Scales down when over-provisioned
- Pre-warms based on traffic prediction
How It Works
┌─────────────────────────────────────────────────────────────────┐
│ RUNTIME SCALING CONTROLLER │
├─────────────────────────────────────────────────────────────────┤
│ │
│ SLO: p99 < 500ms (default) or p99 < 200ms (if specified) │
│ │
│ IDLE STATE: │
│ └── 0 instances running (scale-to-zero) │
│ └── Cold start on first request (~200ms) │
│ │
│ UNDER LOAD: │
│ └── Observed: p99 = 180ms, SLO = 200ms │
│ └── Status: ✅ Healthy (10% headroom) │
│ └── Action: Maintain current instances │
│ │
│ APPROACHING SLO: │
│ └── Observed: p99 = 195ms, SLO = 200ms │
│ └── Status: ⚠️ Degrading │
│ └── Action: Scale up proactively │
│ │
│ OVER-PROVISIONED: │
│ └── Observed: p99 = 50ms, SLO = 200ms │
│ └── Status: 💰 Over-provisioned │
│ └── Action: Scale down to save cost │
│ │
│ TRAFFIC PREDICTION: │
│ └── Learned: "Busy Mon-Fri 9am-5pm, quiet weekends" │
│ └── Action: Pre-warm before predicted peaks │
│ │
└─────────────────────────────────────────────────────────────────┘
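The three load states in the diagram can be expressed as a small decision function. The 95% and 50% cut-offs below are assumptions chosen so the sketch reproduces the diagram's examples (180ms, 195ms, and 50ms against a 200ms SLO); the actual controller's thresholds and prediction logic are not specified here.

```typescript
// Hypothetical sketch of the SLO-driven scaling decision.
type ScalingAction = "maintain" | "scale_up" | "scale_down";

interface ScalingPolicy {
  sloP99Ms: number;
  scaleUpRatio: number;   // scale up when p99 >= ratio * SLO (assumed 0.95)
  scaleDownRatio: number; // scale down when p99 <= ratio * SLO (assumed 0.5)
}

const DEFAULT_SLO_P99_MS = 500; // default SLO from the diagram

function makePolicy(sloP99Ms: number = DEFAULT_SLO_P99_MS): ScalingPolicy {
  return { sloP99Ms, scaleUpRatio: 0.95, scaleDownRatio: 0.5 };
}

function scalingDecision(observedP99Ms: number, policy: ScalingPolicy): ScalingAction {
  const ratio = observedP99Ms / policy.sloP99Ms;
  if (ratio >= policy.scaleUpRatio) return "scale_up";     // approaching SLO breach
  if (ratio <= policy.scaleDownRatio) return "scale_down"; // over-provisioned
  return "maintain";
}

// Remaining headroom before the SLO, as a whole percentage
function headroomPercent(observedP99Ms: number, policy: ScalingPolicy): number {
  return Math.round((1 - observedP99Ms / policy.sloP99Ms) * 100);
}
```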
Agent Pools (Implicit)
When multiple requests arrive, the runtime creates a pool automatically:
Request 1 ──┐
Request 2 ──┼──►  ┌───────────────────────────────────┐
Request 3 ──┤     │            AGENT POOL             │
Request 4 ──┤     │     (auto-managed, ephemeral)     │
Request 5 ──┘     │                                   │
                  │  Instance A ← Processing Req 1    │
                  │  Instance B ← Processing Req 2    │
                  │  Instance C ← Processing Req 3    │
                  │  (more spawn as needed)           │
                  │                                   │
                  └───────────────────────────────────┘
Developers don't:
- Configure pool size
- Manage instance lifecycle
- Think about load balancing
Runtime handles:
- Instance creation/destruction
- Request routing
- Health checks
- Automatic recovery
Optional: Keep Warm (Rare)
For latency-critical agents where cold starts aren't acceptable:
# human-agent.yaml
slo:
latency:
p99: 50ms # Very strict SLO
warmth:
min_warm: 1 # Always keep 1 instance ready
# Note: This costs more (always-on instance)
Optional: Burst Capacity (Rare)
For known high-traffic events:
# human-agent.yaml
scaling:
burst_max: 10000 # Handle up to 10k concurrent
# Runtime pre-provisions capacity for bursts
State in Vaults, Not Instances
Instances are ephemeral. State lives in Vaults:
export const conversationalAgent = handler({
async execute(ctx, input) {
// ✅ State in vault (survives instance death)
const history = await ctx.vaults.self.read('/sessions/' + ctx.session.id);
// ❌ State in memory (lost on scale-down)
// globalHistory[sessionId] = messages; // DON'T DO THIS
// Process...
// Save state
await ctx.vaults.self.write('/sessions/' + ctx.session.id, updatedHistory);
}
});
Cross-Deployment Routing
Agents can call other agents regardless of where they run:
// Agent A (in HUMAN Cloud) calls Agent B (in Acme's VPC)
await ctx.call.agent('agent://acme.invoice-validator', invoice);
// Runtime handles:
// - Registry lookup
// - Cross-network routing
// - Delegation verification
// - Provenance logging
MULTI-LANGUAGE SDK GENERATION
The SDK is auto-generated from protocol definitions for multiple languages.
Protocol Source of Truth
human/
├── protocol/
│ ├── schemas/
│ │ ├── context.proto # Protocol Buffers
│ │ ├── handler.proto
│ │ └── agents.proto
│ ├── openapi/
│ │ └── human-api.yaml # OpenAPI 3.1
│ └── json-schemas/
│ └── *.json
│
├── sdks/ # Auto-generated
│ ├── typescript/ # @human/agent-sdk
│ ├── python/ # human-agent-sdk
│ ├── go/ # github.com/human-protocol/agent-sdk-go
│ └── rust/ # human-agent-sdk (crates.io)
Generated SDK Examples
TypeScript:
import { handler, ExecutionContext } from '@human/agent-sdk';
export const processInvoice = handler({
id: 'process_invoice',
async execute(ctx: ExecutionContext, input: { documentId: string }) {
const analysis = await ctx.llm.complete({ prompt: '...' });
await ctx.call.agent('agent://...', { data: analysis });
return analysis;
}
});
Python:
from human_agent_sdk import handler, ExecutionContext
@handler(id='process_invoice')
async def process_invoice(ctx: ExecutionContext, document_id: str):
analysis = await ctx.llm.complete(prompt='...')
await ctx.call.agent(target='agent://...', input={...})
return analysis
Go:
package main
import human "github.com/human-protocol/agent-sdk-go"
func ProcessInvoice(ctx human.ExecutionContext, input ProcessInvoiceInput) (*Analysis, error) {
analysis, err := ctx.LLM.Complete(human.CompleteRequest{Prompt: "..."})
if err != nil { return nil, err }
_, err = ctx.Call.Agent("agent://...", map[string]interface{}{})
return analysis, err
}
func init() {
human.RegisterHandler("process_invoice", ProcessInvoice)
}
CI/CD Auto-Generation
# .github/workflows/generate-sdks.yml
on:
  push:
    paths: ['protocol/**']
    branches: [main]
jobs:
  generate-sdks:
    runs-on: ubuntu-latest  # assumed runner; every job needs one
    steps:
      - uses: actions/checkout@v4
      - name: Generate SDKs
        run: |
          human-sdk-gen typescript --output sdks/typescript
          human-sdk-gen python --output sdks/python
          human-sdk-gen go --output sdks/go
          human-sdk-gen rust --output sdks/rust
      - name: Test all SDKs
        # Subshells keep each cd from leaking into the next command
        run: |
          (cd sdks/typescript && npm test)
          (cd sdks/python && pytest)
          (cd sdks/go && go test ./...)
          (cd sdks/rust && cargo test)
      - name: Publish
        run: |
          (cd sdks/typescript && npm publish)
          (cd sdks/python && twine upload dist/*)
          # Go auto-proxied by proxy.golang.org
          (cd sdks/rust && cargo publish)
AGENT-READABLE DOCUMENTATION
API documentation is published in formats optimized for both humans and AI agents.
Documentation Formats
docs.human.dev/api/v1.0.0/
├── index.html # Human-readable (TypeDoc)
├── openapi.yaml # Machine-readable (OpenAPI 3.1)
├── llms.txt # LLM-optimized summary
└── context.json # Structured for AI parsing
LLM-Optimized Summary (llms.txt)
# HUMAN Agent SDK - API Reference for LLMs
## ctx.llm
- `ctx.llm.complete({ prompt, quality? })` - Complete a prompt (model is auto-routed; `quality` is an optional intent hint)
- `ctx.llm.stream({ prompt })` - Stream completion
- `ctx.llm.embed({ text })` - Generate embeddings
## ctx.call
- `ctx.call.agent(target, input)` - Call another agent directly
- `ctx.call.route({ capability, input })` - Capability-based routing (HumanOS decides)
- `ctx.call.withDelegation({ scopes, budget?, expires? })` - Wrap call with specific delegation
## ctx.oversight
- `ctx.oversight.approve({ action, reason, risk })` - Request approval
- `ctx.oversight.decide({ question, options })` - Present decision
- `ctx.oversight.escalate({ why, findings?, recommendation? })` - Full handoff
- `ctx.oversight.notify(message, { urgency?, channel? })` - Non-blocking notification
## ctx.vaults
- `ctx.vaults.self` - Agent's own vault (always accessible)
- `ctx.vaults.list()` - List accessible vaults
- `ctx.vaults.get(uri)` - Get vault handle
- `vault.read(path)`, `vault.write(path, data)`, `vault.list(prefix?)`
## ctx.workforce
- `ctx.workforce.submit({ type, capability, input, instructions })` - Submit to human pool
- `ctx.workforce.await(taskId)` - Wait for completion
## ctx.capabilities
- `ctx.capabilities.find({ capability, minLevel? })` - Find entities with capability
- `ctx.capabilities.mine()` - Get current agent's capabilities
## ctx.secrets
- `ctx.secrets.get(key)` - Get secret (Passport > Vault > Env cascade)
Structured Context (context.json)
{
  "sdk_version": "1.0.0",
  "primitives": {
    "ctx.llm": {
      "methods": ["complete", "stream", "embed"],
      "complete": {
        "signature": "complete(options: CompleteOptions): Promise<CompleteResult>",
        "params": {
          "prompt": "string (required)",
          "quality": "optional intent hint, e.g. 'high' (runtime selects the model)"
        },
        "returns": "{ content: string, cost: CostInfo }",
        "example": "await ctx.llm.complete({ prompt: 'Summarize...' })"
      }
    }
  }
}
CLI Docs Lookup
$ human-agent docs ctx.llm.complete
ctx.llm.complete(options)
Complete a prompt using auto-routed LLM.
Options:
  prompt: string (required) - The prompt to complete
  quality: 'high' (optional) - Quality intent; the runtime selects the model
Returns:
  { content: string, cost: CostInfo }
Example:
  const result = await ctx.llm.complete({
    prompt: 'Summarize this document...',
    quality: 'high'
  });
Agent Access to Docs
Agents can query documentation via MCP or API:
// Companion helping a developer
const docs = await mcp.call('human-sdk-docs', {
query: 'ctx.call.agent',
format: 'structured'
});
SDK ARCHITECTURE
Core Package Structure
@human/agent-sdk/
├── core/
│ ├── agent.ts # Base agent class
│ ├── identity.ts # Passport binding
│ ├── delegation.ts # Authority management
│ ├── memory.ts # Vault-bound memory
│ └── lifecycle.ts # Agent lifecycle management
│
├── muscles/
│ ├── interface.ts # Muscle base interface
│ ├── registry.ts # Muscle registration
│ ├── authorization.ts # Permission checking
│ └── audit.ts # Action logging
│
├── safety/
│ ├── boundaries.ts # Safety boundary definitions
│ ├── escalation.ts # Human escalation triggers
│ ├── guardrails.ts # Action guardrails
│ └── monitoring.ts # Safety monitoring
│
├── orchestration/
│ ├── router.ts # Multi-agent routing
│ ├── handoff.ts # Agent-to-human handoffs
│ ├── coordination.ts # Multi-agent coordination
│ └── provenance.ts # Decision provenance
│
└── integrations/
├── humanos.ts # HumanOS integration
├── passport.ts # Passport API client
├── capability-graph.ts # Capability verification
└── workforce.ts # Workforce Cloud integration
@human/connector-sdk/ # Separate package for connectors
├── core/
│ ├── connector.ts # Base connector interface
│ ├── registry.ts # Connector registration
│ ├── credentials.ts # Credential management
│ └── testing.ts # Test harness for connectors
│
├── interfaces/
│ ├── calendar.ts # CalendarConnector interface
│ ├── videoconf.ts # VideoConfConnector interface
│ ├── transcription.ts # TranscriptionConnector interface
│ ├── scheduling.ts # SchedulingConnector interface
│ ├── notes.ts # NotesConnector interface
│ ├── tasks.ts # TasksConnector interface
│ └── communication.ts # CommunicationConnector interface
│
├── helpers/
│ ├── oauth.ts # OAuth 2.0 helpers
│ ├── webhook.ts # Webhook helpers
│ └── retry.ts # Retry logic
│
└── templates/ # Starter templates for new connectors
├── calendar/
├── videoconf/
└── generic/
Relationship: Agents → Muscles → Connectors
┌─────────────────────────────────────────────────────────────────┐
│ AGENT │
│ (e.g., MeetingFacilitator) │
├─────────────────────────────────────────────────────────────────┤
│ │
│ Uses abstract capabilities via MUSCLES │
│ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ Calendar │ │ VideoConf │ │ Notes │ │
│ │ Muscle │ │ Muscle │ │ Muscle │ │
│ └──────┬───────┘ └──────┬───────┘ └──────┬───────┘ │
│ │ │ │ │
├─────────┼─────────────────┼─────────────────┼────────────────────┤
│ │ │ │ │
│ Muscles delegate to platform-agnostic CONNECTOR INTERFACES │
│ │ │ │ │
│ ┌──────▼───────┐ ┌──────▼───────┐ ┌──────▼───────┐ │
│ │ Calendar │ │ VideoConf │ │ Notes │ │
│ │ Connector │ │ Connector │ │ Connector │ │
│ │ Interface │ │ Interface │ │ Interface │ │
│ └──────┬───────┘ └──────┬───────┘ └──────┬───────┘ │
│ │ │ │ │
├─────────┼─────────────────┼─────────────────┼────────────────────┤
│ │ │ │ │
│ User configures which VENDOR CONNECTORS to use │
│ │ │ │ │
│ ┌────┴────┐ ┌────┴────┐ ┌────┴────┐ │
│ │ Google │ │ Zoom │ │ Notion │ │
│ │Calendar │ │Connector│ │Connector│ │
│ └─────────┘ └─────────┘ └─────────┘ │
│ ┌─────────┐ ┌─────────┐ ┌─────────┐ │
│ │Outlook │ │ Google │ │Obsidian │ │
│ │ 365 │ │ Meet │ │Connector│ │
│ └─────────┘ └─────────┘ └─────────┘ │
│ │
└─────────────────────────────────────────────────────────────────┘
KEY PRINCIPLE: Agents and muscles are vendor-agnostic.
Connectors are vendor-specific.
Users choose their connectors.
Base Agent Interface
// @human/agent-sdk/core/agent.ts
import type { PassportId, DelegationScope } from "@human/passport-domain";
import type { VaultRef } from "@human/vault";
import type { AuditLogger } from "../muscles/audit";
import type { MuscleRegistry } from "../muscles/registry";
import type { BoundaryPolicy } from "../safety/boundaries";
/**
* Base interface for all HAIO-compliant agents
*/
export interface HumanAgent {
/** Unique agent identifier */
readonly agentId: string;
/** Human-readable agent name */
readonly name: string;
/** Agent version */
readonly version: string;
// ═══════════════════════════════════════════════════════════════
// IDENTITY & AUTHORITY
// ═══════════════════════════════════════════════════════════════
/**
* The Passport this agent operates on behalf of.
* All actions are attributed to this identity.
*/
readonly passport: PassportBinding;
/**
* Delegations granted to this agent.
* Defines what the agent is authorized to do.
*/
readonly delegations: DelegationScope[];
/**
* Check if agent has a specific delegation.
*/
hasAuthority(scope: DelegationScope): boolean;
/**
* Request additional delegation from the user.
*/
requestDelegation(scope: DelegationScope, reason: string): Promise<DelegationResult>;
// ═══════════════════════════════════════════════════════════════
// MEMORY & STATE
// ═══════════════════════════════════════════════════════════════
/**
* Vault reference for persistent memory.
* Agent memory is owned by the Passport, not the agent.
*/
readonly vault: VaultRef;
/**
* Working memory for current conversation/session.
*/
readonly workingMemory: WorkingMemory;
/**
* Save state to vault.
*/
persistMemory(): Promise<void>;
/**
* Load state from vault.
*/
restoreMemory(): Promise<void>;
// ═══════════════════════════════════════════════════════════════
// CAPABILITIES (MUSCLES)
// ═══════════════════════════════════════════════════════════════
/**
* Registry of muscles available to this agent.
*/
readonly muscles: MuscleRegistry;
/**
* Execute a muscle action.
* Automatically checks authorization and logs action.
*/
executeAction<T>(
muscleId: string,
action: string,
params: Record<string, unknown>
): Promise<ActionResult<T>>;
// ═══════════════════════════════════════════════════════════════
// SAFETY & BOUNDARIES
// ═══════════════════════════════════════════════════════════════
/**
* Boundary policies that constrain agent behavior.
*/
readonly boundaries: BoundaryPolicy;
/**
* Audit logger for all agent actions.
*/
readonly auditLog: AuditLogger;
/**
* Escalate to human when agent cannot proceed safely.
*/
escalate(reason: EscalationReason, context: EscalationContext): Promise<void>;
/**
* Hand off to another agent or human.
*/
handoff(to: PassportId | string, context: HandoffContext): Promise<void>;
// ═══════════════════════════════════════════════════════════════
// LIFECYCLE
// ═══════════════════════════════════════════════════════════════
/**
* Initialize agent (called on startup).
*/
initialize(): Promise<void>;
/**
* Process a message/request.
*/
process(input: AgentInput): Promise<AgentOutput>;
/**
* Shutdown agent gracefully.
*/
shutdown(): Promise<void>;
}
/**
* Passport binding for agent identity
*/
export interface PassportBinding {
/** The Passport ID the agent operates under */
passportId: PassportId;
/** The type of entity (Human, LegalEntity, AgentFuture) */
personType: "Human" | "LegalEntity" | "AgentFuture";
/** Whether this is the owner or a delegate */
bindingType: "owner" | "delegate";
/** If delegate, who granted the delegation */
delegatedBy?: PassportId;
/** Expiration of the binding */
expiresAt?: Date;
}
/**
* Working memory for session state
*/
export interface WorkingMemory {
/** Conversation history */
conversation: ConversationTurn[];
/** Current context/state */
context: Record<string, unknown>;
/** Pending actions */
pendingActions: PendingAction[];
/** Clear working memory */
clear(): void;
/** Add to conversation */
addTurn(turn: ConversationTurn): void;
}
Muscle Interface
// @human/agent-sdk/muscles/interface.ts
import type { PassportId, DelegationScope } from "@human/passport-domain";
import type { AuditLogger } from "./audit";
/**
* Base interface for all muscles (agent capabilities)
*/
export interface Muscle {
/** Unique muscle identifier */
readonly muscleId: string;
/** Human-readable name */
readonly name: string;
/** Description of what this muscle does */
readonly description: string;
/** Delegations required to use this muscle */
readonly requiredDelegations: DelegationScope[];
/** Actions available on this muscle */
readonly actions: MuscleAction[];
/** Audit logger for muscle actions */
readonly auditLog: AuditLogger;
/**
* Check if the given passport has authority to use this muscle.
*/
checkAuthorization(
actor: PassportId,
action: string,
delegations: DelegationScope[]
): Promise<AuthorizationResult>;
/**
* Execute an action on this muscle.
*/
execute<T>(
action: string,
params: Record<string, unknown>,
context: ExecutionContext
): Promise<ActionResult<T>>;
}
/**
* Defines a single action on a muscle
*/
export interface MuscleAction {
/** Action identifier */
id: string;
/** Human-readable name */
name: string;
/** Description */
description: string;
/** Required delegation scope */
requiredScope: DelegationScope;
/** Parameter schema (JSON Schema) */
parameters: JSONSchema;
/** Return type schema */
returns: JSONSchema;
/** Whether this action requires human confirmation */
requiresConfirmation: boolean;
/** Risk level for safety boundaries */
riskLevel: "low" | "medium" | "high" | "critical";
}
/**
* Result of a muscle action
*/
export interface ActionResult<T> {
success: boolean;
data?: T;
error?: ActionError;
/** Audit trail for this action */
audit: {
actionId: string;
muscleId: string;
action: string;
actor: PassportId;
timestamp: Date;
duration: number;
authorized: boolean;
};
}
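As a rough illustration of what `checkAuthorization` could do internally, the sketch below compares granted delegations against what a muscle and one of its actions require. It assumes `DelegationScope` is a plain string and that `AuthorizationResult` exposes a `missing` list; neither shape is defined in this section, and the helper name `checkScopes` is hypothetical.

```typescript
// Stand-in types: the real DelegationScope and AuthorizationResult may be richer.
type DelegationScope = string;

interface AuthorizationResult {
  authorized: boolean;
  /** Scopes that were required but not granted (hypothetical field). */
  missing: DelegationScope[];
}

interface ActionSpec {
  id: string;
  requiredScope: DelegationScope;
}

/**
 * Checks that `granted` covers the muscle-level delegations plus the
 * scope of the requested action. Unknown actions are never authorized.
 */
function checkScopes(
  muscleDelegations: DelegationScope[],
  actions: ActionSpec[],
  actionId: string,
  granted: DelegationScope[],
): AuthorizationResult {
  const action = actions.find((a) => a.id === actionId);
  if (action === undefined) {
    return { authorized: false, missing: [] };
  }
  const needed = muscleDelegations.concat(action.requiredScope);
  // Keep each missing scope once, in the order it was required.
  const missing = needed.filter(
    (scope, i) => granted.indexOf(scope) === -1 && needed.indexOf(scope) === i,
  );
  return { authorized: missing.length === 0, missing };
}

const researchActions: ActionSpec[] = [
  { id: "search", requiredScope: "research.read" },
  { id: "summarize", requiredScope: "research.summarize" },
];

// Only research.read is granted: search passes, summarize reports the gap.
const searchCheck = checkScopes(["research.read"], researchActions, "search", ["research.read"]);
const summarizeCheck = checkScopes(["research.read"], researchActions, "summarize", ["research.read"]);
console.log(searchCheck.authorized, summarizeCheck.missing);
```

Returning the missing scopes rather than a bare boolean would let a caller turn a denial directly into a delegation request.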
Safety Boundaries
// @human/agent-sdk/safety/boundaries.ts
import type { PassportId } from "@human/passport-domain";
/**
* Defines safety boundaries for agent behavior
*/
export interface BoundaryPolicy {
/** Rate limits across per-minute, per-hour, and per-day windows */
rateLimits: {
actionsPerMinute: number;
actionsPerHour: number;
actionsPerDay: number;
};
/** Spending limits (if applicable) */
spendingLimits?: {
perAction: number;
perDay: number;
perMonth: number;
currency: string;
};
/** Actions that always require human confirmation */
requireConfirmation: string[];
/** Actions that are completely forbidden */
forbidden: string[];
/** Time windows when agent can operate */
operatingHours?: {
timezone: string;
windows: TimeWindow[];
};
/** Escalation triggers */
escalationTriggers: EscalationTrigger[];
}
/**
* Defines when to escalate to human
*/
export interface EscalationTrigger {
/** Trigger identifier */
id: string;
/** Condition that triggers escalation */
condition: EscalationCondition;
/** Who to escalate to */
escalateTo: PassportId | "owner" | "any_human";
/** Priority of escalation */
priority: "low" | "medium" | "high" | "critical";
/** Maximum time to wait for human response */
timeout?: Duration;
/** What to do if timeout reached */
timeoutAction: "retry" | "abort" | "proceed_with_caution";
}
/**
* Conditions that can trigger escalation
*/
export type EscalationCondition =
| { type: "uncertainty"; threshold: number } // Agent confidence below threshold
| { type: "risk_level"; level: "high" | "critical" }
| { type: "spending"; amount: number }
| { type: "pattern"; pattern: string } // Regex match on action
| { type: "consecutive_errors"; count: number }
| { type: "explicit_request" } // Agent decides to escalate
| { type: "policy_violation"; policyId: string }
| { type: "custom"; evaluator: (context: EscalationContext) => boolean };
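To make the condition union concrete, here is a minimal sketch of an evaluator for it. `EscalationContext` is referenced above but not defined in this section, so the shape used here is an assumption, as are the function name `shouldEscalate`, the sample action string, and the choice that a "high" risk trigger also fires for "critical" actions.

```typescript
type RiskLevel = "low" | "medium" | "high" | "critical";

// Hypothetical shape: the source does not define EscalationContext.
interface EscalationContext {
  confidence: number; // 0..1
  riskLevel: RiskLevel;
  spendAmount: number;
  action: string;
  consecutiveErrors: number;
  explicitRequest: boolean;
  violatedPolicyIds: string[];
}

type EscalationCondition =
  | { type: "uncertainty"; threshold: number }
  | { type: "risk_level"; level: "high" | "critical" }
  | { type: "spending"; amount: number }
  | { type: "pattern"; pattern: string }
  | { type: "consecutive_errors"; count: number }
  | { type: "explicit_request" }
  | { type: "policy_violation"; policyId: string }
  | { type: "custom"; evaluator: (context: EscalationContext) => boolean };

const RISK_ORDER: RiskLevel[] = ["low", "medium", "high", "critical"];

/** Returns true when the condition fires for the given context. */
function shouldEscalate(condition: EscalationCondition, ctx: EscalationContext): boolean {
  switch (condition.type) {
    case "uncertainty":
      return ctx.confidence < condition.threshold;
    case "risk_level":
      // Assumption: a "high" trigger also fires for "critical" actions.
      return RISK_ORDER.indexOf(ctx.riskLevel) >= RISK_ORDER.indexOf(condition.level);
    case "spending":
      // Assumption: escalate only when spend exceeds the configured amount.
      return ctx.spendAmount > condition.amount;
    case "pattern":
      return new RegExp(condition.pattern).test(ctx.action);
    case "consecutive_errors":
      return ctx.consecutiveErrors >= condition.count;
    case "explicit_request":
      return ctx.explicitRequest;
    case "policy_violation":
      return ctx.violatedPolicyIds.indexOf(condition.policyId) !== -1;
    case "custom":
      return condition.evaluator(ctx);
  }
}

const sampleContext: EscalationContext = {
  confidence: 0.4,
  riskLevel: "medium",
  spendAmount: 0,
  action: "calendar.create:conflict_detected", // made-up action string
  consecutiveErrors: 0,
  explicitRequest: false,
  violatedPolicyIds: [],
};

const sampleConditions: EscalationCondition[] = [
  { type: "uncertainty", threshold: 0.5 },
  { type: "risk_level", level: "high" },
  { type: "pattern", pattern: "conflict_detected" },
];
const fired = sampleConditions.map((c) => shouldEscalate(c, sampleContext));
console.log(fired); // [ true, false, true ]
```

Because the switch is exhaustive over the discriminated union, adding a new condition type without handling it becomes a compile-time error under strict settings.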
Multi-Agent Coordination
// @human/agent-sdk/orchestration/coordination.ts
import type { HumanAgent } from "../core/agent";
import type { PassportId } from "@human/passport-domain";
/**
* Coordinates multiple agents working together
*/
export interface AgentCoordinator {
/** Register an agent with the coordinator */
register(agent: HumanAgent): Promise<void>;
/** Unregister an agent */
unregister(agentId: string): Promise<void>;
/** Route a request to the appropriate agent */
route(input: AgentInput, context: RoutingContext): Promise<RoutingDecision>;
/** Hand off from one agent to another */
handoff(
from: HumanAgent,
to: string | PassportId,
context: HandoffContext
): Promise<HandoffResult>;
/** Broadcast a message to multiple agents */
broadcast(message: CoordinationMessage, targets: string[]): Promise<void>;
/** Get status of all registered agents */
getStatus(): Promise<AgentStatus[]>;
}
/**
* Routing decision for incoming requests
*/
export interface RoutingDecision {
/** Selected agent ID */
agentId: string;
/** Confidence in this routing */
confidence: number;
/** Reasoning for selection */
reasoning: string;
/** Alternative agents that could handle this */
alternatives: Array<{
agentId: string;
confidence: number;
}>;
}
/**
* Context for handing off between agents
*/
export interface HandoffContext {
/** Reason for handoff */
reason: string;
/** Conversation history to transfer */
conversation: ConversationTurn[];
/** Relevant context/state */
context: Record<string, unknown>;
/** Pending actions to transfer */
pendingActions: PendingAction[];
/** Whether receiving agent can hand back */
allowHandback: boolean;
}
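One way a coordinator might fill in a `RoutingDecision` is sketched below. It is not the SDK's routing algorithm: it assumes each agent advertises a flat list of capability tags (the `AgentProfile` type and `routeByCapability` function are hypothetical) and scores agents by the fraction of required tags they cover.

```typescript
interface RoutingDecision {
  agentId: string;
  confidence: number;
  reasoning: string;
  alternatives: Array<{ agentId: string; confidence: number }>;
}

// Hypothetical: a flat list of capability tags per registered agent.
interface AgentProfile {
  agentId: string;
  capabilities: string[];
}

/** Picks the agent covering the largest share of the required capabilities. */
function routeByCapability(
  required: string[],
  agents: AgentProfile[],
  defaultAgentId: string,
): RoutingDecision {
  const scored = agents
    .map((agent) => {
      const matched = required.filter((cap) => agent.capabilities.indexOf(cap) !== -1);
      const confidence = required.length === 0 ? 0 : matched.length / required.length;
      return { agentId: agent.agentId, confidence, matched };
    })
    .sort((a, b) => b.confidence - a.confidence);

  const best = scored[0];
  if (best === undefined || best.confidence === 0) {
    return {
      agentId: defaultAgentId,
      confidence: 0,
      reasoning: "No capability match; falling back to the default agent",
      alternatives: [],
    };
  }
  return {
    agentId: best.agentId,
    confidence: best.confidence,
    reasoning: `Matched ${best.matched.length} of ${required.length} required capabilities: ${best.matched.join(", ")}`,
    alternatives: scored
      .slice(1)
      .filter((s) => s.confidence > 0)
      .map((s) => ({ agentId: s.agentId, confidence: s.confidence })),
  };
}

const registry: AgentProfile[] = [
  { agentId: "scheduler-agent", capabilities: ["scheduling"] },
  { agentId: "research-agent", capabilities: ["research", "summarize"] },
  { agentId: "document-agent", capabilities: ["documents"] },
];

const decision = routeByCapability(["research", "summarize", "scheduling"], registry, "scheduler-agent");
console.log(decision.agentId, decision.alternatives);
```

Here the research agent covers two of three requirements and is selected, while the scheduler is listed as an alternative, which matches the research-then-schedule handoff flow described in this document.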
USAGE EXAMPLES
Creating a Simple Agent
import { createAgent } from "@human/agent-sdk";
// Value import: CalendarMuscle is instantiated below, so `import type` would not compile
import { CalendarMuscle } from "@human/agent-muscles/calendar";
// Define a meeting scheduler agent
const schedulerAgent = createAgent({
name: "Meeting Scheduler",
version: "1.0.0",
// Bind to a Passport
passport: {
passportId: "passport:human:corp",
personType: "LegalEntity",
bindingType: "delegate",
},
// Define required delegations
delegations: [
"calendar.read",
"calendar.write",
"notification.send",
],
// Configure muscles
muscles: {
calendar: new CalendarMuscle({
providers: ["google", "microsoft"],
}),
},
// Define safety boundaries
boundaries: {
rateLimits: {
actionsPerMinute: 10,
actionsPerHour: 100,
actionsPerDay: 500,
},
requireConfirmation: ["calendar.delete"],
forbidden: ["calendar.delete_all"],
escalationTriggers: [
{
id: "double-booking",
condition: { type: "pattern", pattern: "conflict_detected" },
escalateTo: "owner",
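priority: "medium",
// Required by EscalationTrigger; aborting is the conservative choice if no human responds
timeoutAction: "abort",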
},
],
},
});
// Process a request
const result = await schedulerAgent.process({
type: "message",
content: "Schedule a meeting with Mike tomorrow at 2pm",
from: "passport:user:rick",
});
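The boundaries configured above could be enforced before each action roughly as follows. This is a sketch under stated assumptions, not the SDK's enforcement path: the function `evaluateAction`, the `BoundaryDecision` values, and the precedence order (forbidden, then rate limits, then confirmation) are all illustrative, and rate limits are checked with a simple count over recent timestamps.

```typescript
// Subset of BoundaryPolicy used by this sketch.
interface BoundarySubset {
  rateLimits: { actionsPerMinute: number; actionsPerHour: number; actionsPerDay: number };
  requireConfirmation: string[];
  forbidden: string[];
}

type BoundaryDecision = "forbidden" | "rate_limited" | "needs_confirmation" | "allowed";

const MINUTE_MS = 60 * 1000;
const HOUR_MS = 60 * MINUTE_MS;
const DAY_MS = 24 * HOUR_MS;

/**
 * Decides whether `action` may run, given the timestamps (epoch ms) of
 * previously executed actions and the current time.
 */
function evaluateAction(
  policy: BoundarySubset,
  action: string,
  history: number[],
  now: number,
): BoundaryDecision {
  if (policy.forbidden.indexOf(action) !== -1) return "forbidden";

  const countWithin = (windowMs: number): number =>
    history.filter((t) => now - t < windowMs).length;
  const { actionsPerMinute, actionsPerHour, actionsPerDay } = policy.rateLimits;
  if (
    countWithin(MINUTE_MS) >= actionsPerMinute ||
    countWithin(HOUR_MS) >= actionsPerHour ||
    countWithin(DAY_MS) >= actionsPerDay
  ) {
    return "rate_limited";
  }

  if (policy.requireConfirmation.indexOf(action) !== -1) return "needs_confirmation";
  return "allowed";
}

// Same values as the scheduler agent's boundaries.
const schedulerPolicy: BoundarySubset = {
  rateLimits: { actionsPerMinute: 10, actionsPerHour: 100, actionsPerDay: 500 },
  requireConfirmation: ["calendar.delete"],
  forbidden: ["calendar.delete_all"],
};

const currentTime = 1000000;
const quietHistory: number[] = [];
// Ten actions spread over the last ten seconds.
const busyHistory = Array.from({ length: 10 }, (_, i) => currentTime - i * 1000);

console.log(evaluateAction(schedulerPolicy, "calendar.delete_all", quietHistory, currentTime));
console.log(evaluateAction(schedulerPolicy, "calendar.delete", quietHistory, currentTime));
console.log(evaluateAction(schedulerPolicy, "calendar.read", busyHistory, currentTime));
```

Checking the forbidden list first means a forbidden action is reported as such even when the agent is also over its rate limit, which keeps audit entries unambiguous.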
Creating a Custom Muscle
import { defineMuscle, type Muscle, type MuscleAction } from "@human/agent-sdk";
// Define a custom research muscle
export const researchMuscle = defineMuscle({
muscleId: "research",
name: "Research Assistant",
description: "Performs research tasks using various sources",
requiredDelegations: ["research.read", "research.summarize"],
actions: [
{
id: "search",
name: "Search",
description: "Search for information on a topic",
requiredScope: "research.read",
parameters: {
type: "object",
properties: {
query: { type: "string" },
sources: { type: "array", items: { type: "string" } },
maxResults: { type: "number", default: 10 },
},
required: ["query"],
},
returns: {
type: "array",
items: { $ref: "#/definitions/SearchResult" },
},
requiresConfirmation: false,
riskLevel: "low",
},
{
id: "summarize",
name: "Summarize",
description: "Summarize research findings",
requiredScope: "research.summarize",
parameters: {
type: "object",
properties: {
content: { type: "string" },
style: { type: "string", enum: ["brief", "detailed", "executive"] },
},
required: ["content"],
},
returns: { type: "string" },
requiresConfirmation: false,
riskLevel: "low",
},
],
// Implementation
async execute(action, params, context) {
switch (action) {
case "search":
return this.performSearch(params.query, params.sources, params.maxResults);
case "summarize":
return this.performSummarize(params.content, params.style);
default:
throw new Error(`Unknown action: ${action}`);
}
},
});
Multi-Agent Coordination
import { createCoordinator, createAgent } from "@human/agent-sdk";
// Create specialized agents
const schedulerAgent = createAgent({ /* ... */ });
const researchAgent = createAgent({ /* ... */ });
const documentAgent = createAgent({ /* ... */ });
// Create coordinator
const coordinator = createCoordinator({
routingStrategy: "capability-based",
defaultAgent: schedulerAgent.agentId,
});
// Register agents
await coordinator.register(schedulerAgent);
await coordinator.register(researchAgent);
await coordinator.register(documentAgent);
// Route incoming requests
const input = {
type: "message",
content: "Research the latest trends in AI safety and schedule a meeting to discuss",
from: "passport:user:rick",
};
// Coordinator decides: research first, then schedule
const routing = await coordinator.route(input, {
preferredAgents: [],
context: {},
});
// Execute with handoff
const researchResult = await researchAgent.process({
...input,
content: "Research the latest trends in AI safety",
});
await coordinator.handoff(researchAgent, schedulerAgent.agentId, {
reason: "Research complete, scheduling meeting",
conversation: researchResult.conversation,
context: { researchFindings: researchResult.data },
pendingActions: [],
allowHandback: false,
});
INTEROP SDK: BUILDING HUMAN ADAPTERS FOR EXTERNAL PLATFORMS
Critical Strategic Capability:
The Agent SDK includes patterns for wrapping and governing agents built on other platforms (n8n, LangChain, OpenAI Assistants, etc.) without requiring rewrites.
This is transformative for market adoption: enterprises can gain HAIO's benefits while keeping their existing agent investments.
The Adapter Interface
// @human/agent-sdk/interop/adapter.ts
/**
* Base interface for platform adapters
*/
export interface PlatformAdapter {
/** Platform identifier */
readonly platformId: string;
/** Platform name */
readonly platformName: string;
/** Identity adapter */
identity: IdentityAdapter;
/** Delegation adapter */
delegation: DelegationAdapter;
/** Event/logging adapter */
events: EventAdapter;
/** Policy hooks */
policy: PolicyHooks;
}
/**
* Maps external identities to Passport DIDs
*/
export interface IdentityAdapter {
/**
* Map external user ID to Passport DID
*/
mapUserToPassport(externalUserId: string): Promise<PassportId>;
/**
* Map external agent to HUMAN agent identity
*/
mapAgent(externalAgentId: string): Promise<{
agentDid: PassportId;
principalDid: PassportId; // Who does it act for?
}>;
/**
* Create bidirectional mapping
*/
establishMapping(external: string, passport: PassportId): Promise<void>;
}
/**
* Checks delegations before external agent actions
*/
export interface DelegationAdapter {
/**
* Check if action is allowed under current delegations
*/
checkDelegation(params: {
agentDid: PassportId;
action: string;
context: Record<string, any>;
}): Promise<DelegationDecision>;
/**
* Get required capabilities for an action
*/
getRequiredCapabilities(action: string): string[];
}
/**
* Streams events to HUMAN ledger
*/
export interface EventAdapter {
/**
* Log external agent action to HUMAN ledger
*/
logEvent(event: ExternalAgentEvent): Promise<void>;
/**
* Stream events in real-time
*/
streamEvents(handler: (event: ExternalAgentEvent) => Promise<void>): void;
}
/**
* HAIO policy enforcement hooks
*/
export interface PolicyHooks {
/**
* Called when policy is violated
*/
onPolicyViolation(params: {
agentDid: PassportId;
action: string;
violation: string;
}): Promise<void>;
/**
* Request human approval mid-execution
*/
requestApproval(params: {
agentDid: PassportId;
action: string;
context: Record<string, any>;
requiredCapability: string;
}): Promise<ApprovalResult>;
/**
* Cancel ongoing execution
*/
cancelExecution(agentDid: PassportId, reason: string): Promise<void>;
}
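For local testing, an adapter author might start from an in-memory `IdentityAdapter` before wiring up a real platform. The sketch below is one such stand-in; the class name `InMemoryIdentityAdapter`, the DID string formats, and the sample identifiers are all hypothetical, and `PassportId` is simplified to a string.

```typescript
type PassportId = string; // stand-in for the @human/passport-domain type

interface IdentityAdapter {
  mapUserToPassport(externalUserId: string): Promise<PassportId>;
  mapAgent(externalAgentId: string): Promise<{ agentDid: PassportId; principalDid: PassportId }>;
  establishMapping(external: string, passport: PassportId): Promise<void>;
}

class InMemoryIdentityAdapter implements IdentityAdapter {
  private readonly mappings = new Map<string, PassportId>();
  private readonly platformId: string;
  private readonly orgDid: PassportId;

  constructor(platformId: string, orgDid: PassportId) {
    this.platformId = platformId;
    this.orgDid = orgDid;
  }

  /** Returns the existing mapping, or creates and stores a new one. */
  async mapUserToPassport(externalUserId: string): Promise<PassportId> {
    const existing = this.mappings.get(externalUserId);
    if (existing !== undefined) return existing;
    // Hypothetical DID format, for illustration only.
    const created = `did:human:${this.platformId}:user:${externalUserId}`;
    await this.establishMapping(externalUserId, created);
    return created;
  }

  /** Every external agent acts for the organization in this sketch. */
  async mapAgent(externalAgentId: string): Promise<{ agentDid: PassportId; principalDid: PassportId }> {
    return {
      agentDid: `did:human:${this.platformId}:agent:${externalAgentId}`,
      principalDid: this.orgDid,
    };
  }

  async establishMapping(external: string, passport: PassportId): Promise<void> {
    this.mappings.set(external, passport);
  }
}

const testIdentity = new InMemoryIdentityAdapter("n8n", "did:human:org:acme");
const identityDemo = (async () => {
  await testIdentity.establishMapping("user-42", "did:human:person:rick");
  return {
    mapped: await testIdentity.mapUserToPassport("user-42"),
    created: await testIdentity.mapUserToPassport("user-7"),
    agent: await testIdentity.mapAgent("wf-1"),
  };
})();
identityDemo.then((result) => console.log(result));
```

Keeping the mapping store behind the interface means the same tests can later run against an adapter backed by the real mapping service.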
Example: n8n Platform Adapter
// @human/agent-sdk/interop/adapters/n8n.ts
import type { PlatformAdapter, IdentityAdapter, DelegationAdapter, EventAdapter, PolicyHooks } from "../adapter";
import type { PassportId } from "@human/passport-domain";
import { HumanClient } from "../../client";
export class N8nAdapter implements PlatformAdapter {
readonly platformId = "n8n";
readonly platformName = "n8n Workflow Automation";
constructor(
private config: {
n8nApiUrl: string;
n8nApiKey: string;
orgPassportId: PassportId;
},
private humanClient: HumanClient
) {}
// Identity adapter
identity: IdentityAdapter = {
// Arrow function so `this` refers to the N8nAdapter instance, not this object literal
mapUserToPassport: async (n8nUserId: string): Promise<PassportId> => {
// Check if mapping exists
const existing = await this.humanClient.mappings.get({
externalSystem: "n8n",
externalId: n8nUserId
});
if (existing) {
return existing.passportId;
}
// Create new mapping
const passport = await this.humanClient.passport.resolveOrCreate({
externalId: n8nUserId,
externalSystem: "n8n",
orgDid: this.config.orgPassportId
});
return passport.id;
},
mapAgent: async (n8nWorkflowId: string): Promise<{agentDid: PassportId; principalDid: PassportId}> => {
// Each n8n workflow becomes a HUMAN agent
const workflow = await this.fetchWorkflow(n8nWorkflowId);
const agentDid = await this.humanClient.agents.registerOrUpdate({
externalId: n8nWorkflowId,
name: workflow.name,
platform: "n8n",
capabilities: this.mapWorkflowToCapabilities(workflow)
});
return {
agentDid,
principalDid: this.config.orgPassportId
};
},
establishMapping: async (n8nId: string, passportId: PassportId): Promise<void> => {
await this.humanClient.mappings.create({
externalSystem: "n8n",
externalId: n8nId,
passportId
});
}
};
// Delegation adapter
delegation: DelegationAdapter = {
checkDelegation: async (params) => {
// `this` is the adapter instance here, so reach the capability mapping via this.delegation
return await this.humanClient.humanos.checkDelegation({
agent: params.agentDid,
action: params.action,
context: params.context,
requiredCapabilities: this.delegation.getRequiredCapabilities(params.action)
});
},
getRequiredCapabilities(action: string): string[] {
// Map n8n actions to HAIO capabilities
const mapping: Record<string, string[]> = {
'send_email': ['email_sender'],
'update_database': ['database_writer'],
'call_api': ['api_caller'],
'process_payment': ['payment_processor']
};
return mapping[action] || ['generic_workflow_executor'];
}
};
// Event adapter
events: EventAdapter = {
logEvent: async (event: ExternalAgentEvent): Promise<void> => {
await this.humanClient.ledger.appendEvent({
eventType: 'external_agent_action',
actorDid: event.agentDid,
action: event.action,
result: event.result,
context: event.context,
timestamp: event.timestamp,
externalSystem: 'n8n',
externalEventId: event.externalEventId,
signature: await this.humanClient.signature.sign(event)
});
},
streamEvents: (handler) => {
// Subscribe to n8n webhook events
this.subscribeToN8nWebhooks(async (n8nEvent) => {
const mappedEvent = await this.mapN8nEventToHuman(n8nEvent);
await handler(mappedEvent);
});
}
};
// Policy hooks
policy: PolicyHooks = {
onPolicyViolation: async (params) => {
// Halt n8n workflow execution
await this.pauseWorkflow(params.agentDid, params.violation);
// Log violation
await this.events.logEvent({
agentDid: params.agentDid,
action: params.action,
result: 'policy_violation',
context: { violation: params.violation },
timestamp: new Date(),
externalEventId: `violation-${Date.now()}`
});
},
requestApproval: async (params) => {
return await this.humanClient.humanos.requestApproval({
agent: params.agentDid,
action: params.action,
context: params.context,
requiredCapability: params.requiredCapability
});
},
cancelExecution: async (agentDid, reason) => {
await this.pauseWorkflow(agentDid, reason);
}
};
// Helper methods
private async fetchWorkflow(workflowId: string) {
// Call n8n API to get workflow details
// ...implementation...
}
private mapWorkflowToCapabilities(workflow: any): string[] {
// Analyze workflow nodes to determine capabilities
// ...implementation...
}
private async pauseWorkflow(agentDid: PassportId, reason: string) {
// Call n8n API to pause workflow
// ...implementation...
}
private subscribeToN8nWebhooks(handler: (event: any) => Promise<void>) {
// Set up webhook listener for n8n events
// ...implementation...
}
private async mapN8nEventToHuman(n8nEvent: any): Promise<ExternalAgentEvent> {
// Convert n8n event format to HUMAN event format
// ...implementation...
}
}
Example: LangChain Platform Adapter
// @human/agent-sdk/interop/adapters/langchain.ts
import type { PlatformAdapter } from "../adapter";
import type { PassportId } from "@human/passport-domain";
import { HumanClient } from "../../client";
export class LangChainAdapter implements PlatformAdapter {
readonly platformId = "langchain";
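readonly platformName = "LangChain Framework";
// The identity, delegation, events, and policy members, and the constructor that
// receives humanClient, are omitted here for brevity; they mirror N8nAdapter.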
/**
* Wrap a LangChain agent with HUMAN governance
*/
async wrapAgent(langchainAgent: any, config: {
agentName: string;
principalDid: PassportId;
capabilities: string[];
}) {
// Register agent in HUMAN
const agentDid = await this.humanClient.agents.register({
name: config.agentName,
platform: "langchain",
principalDid: config.principalDid,
capabilities: config.capabilities
});
// Wrap LangChain tool calls with HUMAN checks
const originalTools = langchainAgent.tools;
langchainAgent.tools = originalTools.map(tool => this.wrapTool(tool, agentDid));
// Intercept execution
const originalRun = langchainAgent.run.bind(langchainAgent);
langchainAgent.run = async (input: string) => {
// Log intent
await this.events.logEvent({
agentDid,
action: 'start_execution',
result: 'success',
context: { input },
timestamp: new Date(),
externalEventId: `exec-${Date.now()}`
});
// Execute
const result = await originalRun(input);
// Log completion
await this.events.logEvent({
agentDid,
action: 'complete_execution',
result: 'success',
context: { input, result },
timestamp: new Date(),
externalEventId: `exec-${Date.now()}-complete`
});
return result;
};
return langchainAgent;
}
private wrapTool(tool: any, agentDid: PassportId) {
const originalFunc = tool.func;
tool.func = async (...args: any[]) => {
// Check delegation before tool execution
const decision = await this.delegation.checkDelegation({
agentDid,
action: tool.name,
context: { args }
});
if (!decision.allowed) {
if (decision.requiresHumanApproval) {
// Request approval
const approval = await this.policy.requestApproval({
agentDid,
action: tool.name,
context: { args },
requiredCapability: decision.escalateTo || 'general_approval'
});
if (!approval.approved) {
throw new Error(`Action denied: ${approval.reason}`);
}
} else {
throw new Error(`Action forbidden: ${decision.reason}`);
}
}
// Execute tool
const result = await originalFunc(...args);
// Log execution
await this.events.logEvent({
agentDid,
action: tool.name,
result: 'success',
context: { args, result },
timestamp: new Date(),
externalEventId: `tool-${Date.now()}`
});
return result;
};
return tool;
}
}
Using Adapters in Practice
import { N8nAdapter, LangChainAdapter } from "@human/agent-sdk/interop";
import { HumanClient } from "@human/agent-sdk";
// Initialize HUMAN client
const human = new HumanClient({ apiKey: process.env.HUMAN_API_KEY });
// === Example 1: Wrap existing n8n workflows ===
const n8n = new N8nAdapter(
{
n8nApiUrl: "https://n8n.acme.com",
n8nApiKey: process.env.N8N_API_KEY,
orgPassportId: "did:human:org:acme"
},
human
);
// Register all workflows as HUMAN agents
const workflows = await fetchAllN8nWorkflows();
for (const workflow of workflows) {
const { agentDid } = await n8n.identity.mapAgent(workflow.id);
console.log(`Registered n8n workflow ${workflow.name} as ${agentDid}`);
}
// Now all n8n workflows are governed by HUMAN:
// - Identity: Each workflow has a Passport DID
// - Delegation: Actions checked against policies
// - Logging: All executions logged to ledger
// === Example 2: Wrap LangChain agent ===
import { initializeAgentExecutorWithOptions } from "langchain/agents";
import { OpenAI } from "langchain/llms/openai";
// Create LangChain agent
const langchainAgent = await initializeAgentExecutorWithOptions(
tools,
new OpenAI({ temperature: 0 }),
{ agentType: "zero-shot-react-description" }
);
// Wrap with HUMAN governance
const langchain = new LangChainAdapter({...}, human);
const governedAgent = await langchain.wrapAgent(langchainAgent, {
agentName: "Insurance Claims Processor",
principalDid: "did:human:org:insurance-co",
capabilities: ["claims_review", "payout_approval_under_10k"]
});
// Now LangChain agent is governed by HUMAN
const result = await governedAgent.run("Process claim #12345");
// Every tool call is checked for delegation, logged to ledger
Migration Patterns
Pattern 1: Gradual Migration
// Start: All agents on n8n, zero HUMAN
// Step 1: Wrap with HUMAN-around (identity + logging)
const adapter = new N8nAdapter(config, human);
await adapter.wrapAllWorkflows();
// Step 2: Add policy enforcement
await adapter.policy.enablePolicyChecks(['high-risk-actions']);
// Step 3: Migrate critical workflows to HUMAN-native
const criticalWorkflows = ['payment-processing', 'data-deletion'];
for (const id of criticalWorkflows) {
await migrateToHumanNative(id);
}
// End: Critical on HUMAN-native, rest wrapped
Pattern 2: Hybrid Deployment
// Some agents native, some wrapped
const agents = [
// HUMAN-native agents
await createAgent({ name: "Compliance Reviewer", ... }),
// Wrapped n8n workflows
await n8n.wrapWorkflow("stripe-fulfillment"),
// Wrapped LangChain agents
await langchain.wrapAgent(researchAgent, {...})
];
// Coordinator routes across all of them
const coordinator = createCoordinator({ agents });
Benefits of Interop SDK
For Enterprises:
- No forced migration off existing platforms
- Immediate value: identity, delegation, logging, governance
- Clear path to HUMAN-native when ready
For HUMAN:
- Shorter sales cycles ("we wrap your stack")
- Broader market (works with any platform)
- Long-term stickiness (customers see limits of wrapped platforms, migrate to native)
For Developers:
- Build adapters for any platform
- Consistent interface across all platforms
- Monetization via adapter marketplace
See also:
- 43_haio_developer_architecture.md - Complete interoperability architecture
- 22_humanos_orchestration_core.md - Orchestration patterns
AGENT AUTHENTICATION
Agents authenticate to HUMAN APIs using the same methods as other clients, but with agent-specific patterns.
How Agents Get Credentials
| Scenario | Authentication Method | How It Works |
|---|---|---|
| Agent on behalf of user | Delegated Token | User grants delegation → agent receives scoped token |
| Agent on behalf of org | Service Account | Org creates service account → agent uses API key |
| Standalone agent | Agent Passport + PAT | Agent has own Passport → generates own tokens |
Delegated Authentication (Most Common)
When an agent acts on behalf of a human or organization:
import { HumanClient, DelegationClient } from "@human/agent-sdk";
// Agent requests delegation from user
const delegation = await DelegationClient.request({
fromPassport: userPassportId, // Who's granting
toAgent: agent.agentId, // Who's receiving
scopes: ["calendar:read", "calendar:write"],
reason: "Schedule meetings on your behalf",
duration: "30d",
});
// User approves in Companion or app
// → Agent receives delegated credentials
// Agent uses delegated credentials
const client = new HumanClient({
delegation: delegation.credentials,
onBehalfOf: userPassportId,
});
// API calls are attributed to user, executed by agent
const events = await client.calendar.list();
// Audit log shows: "Agent X acting on behalf of User Y"
Service Account Authentication (Server-to-Server)
For agents running as backend services:
import { HumanClient } from "@human/agent-sdk";
// Service account key (stored securely, e.g., env var)
const client = new HumanClient({
apiKey: process.env.HUMAN_SERVICE_KEY, // hsk_live_...
type: "service_account",
});
// API calls attributed to service account
const result = await client.capabilities.query({...});
Agent Passport (Autonomous Agents)
Agents can have their own Passports (Person:AgentFuture):
import { AgentPassport, HumanClient } from "@human/agent-sdk";
// Agent with its own identity
const agentPassport = await AgentPassport.create({
name: "Meeting Facilitator Agent",
owner: humanPassportId, // Human who owns/controls this agent
capabilities: ["scheduling", "transcription"],
boundaries: agentBoundaries,
});
// Agent generates its own PAT
const agentToken = await agentPassport.createToken({
scopes: ["calendar:read"],
expiresIn: "24h",
});
// Agent authenticates as itself
const client = new HumanClient({
token: agentToken,
agentPassport: agentPassport.id,
});
Credential Storage in Agents
Agents should never store credentials in code. Use:
// Option 1: Environment variables (recommended for service accounts)
const envClient = new HumanClient(); // Auto-reads HUMAN_API_TOKEN
// Option 2: Vault storage (for delegated credentials)
const credentials = await agent.vault.getCredentials("human_api");
const delegatedClient = new HumanClient({ delegation: credentials });
// Option 3: Secure credential manager (for production)
const secret = await secretManager.get("human-api-key");
const managedClient = new HumanClient({ apiKey: secret });
Token Refresh (Automatic)
The SDK automatically refreshes tokens before expiry:
const client = new HumanClient({
delegation: delegatedCredentials,
// SDK automatically:
// - Monitors token expiry
// - Refreshes before expiration
// - Updates stored credentials
// - Retries failed requests after refresh
});
// You never need to handle refresh manually
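For intuition about what "refreshes before expiry" can mean, here is a small sketch of a refresh-ahead token holder. It is not the SDK's implementation: the `RefreshingToken` class, the `IssuedToken` shape, the 60-second default skew, and the injected `fetchToken` and `now` functions are all assumptions made for illustration.

```typescript
interface IssuedToken {
  value: string;
  expiresAt: number; // epoch milliseconds
}

class RefreshingToken {
  private current: IssuedToken | null = null;
  private inflight: Promise<IssuedToken> | null = null;
  private readonly fetchToken: () => Promise<IssuedToken>;
  private readonly skewMs: number;
  private readonly now: () => number;

  constructor(
    fetchToken: () => Promise<IssuedToken>,
    skewMs: number = 60000,
    now: () => number = () => Date.now(),
  ) {
    this.fetchToken = fetchToken;
    this.skewMs = skewMs;
    this.now = now;
  }

  /** True when there is no token yet or it expires within the skew window. */
  needsRefresh(): boolean {
    return this.current === null || this.current.expiresAt - this.now() <= this.skewMs;
  }

  /** Returns a usable token value; concurrent callers share a single refresh. */
  async get(): Promise<string> {
    if (this.needsRefresh()) {
      if (this.inflight === null) {
        this.inflight = this.fetchToken();
      }
      try {
        this.current = await this.inflight;
      } finally {
        this.inflight = null;
      }
    }
    const token = this.current;
    if (token === null) throw new Error("token refresh produced no token");
    return token.value;
  }
}

// Fake clock and issuer so the behavior is deterministic.
let fakeClock = 0;
let issuedCount = 0;
const tokens = new RefreshingToken(
  async () => ({ value: `token-${++issuedCount}`, expiresAt: fakeClock + 1000 }),
  100,
  () => fakeClock,
);

const refreshDemo = (async () => {
  const first = await tokens.get(); // no token yet, so one is fetched
  fakeClock = 500;
  const second = await tokens.get(); // 500 ms left, outside the 100 ms skew: reused
  fakeClock = 950;
  const third = await tokens.get(); // 50 ms left, inside the skew: refreshed
  return [first, second, third];
})();
refreshDemo.then((values) => console.log(values));
```

Refreshing slightly ahead of expiry, rather than on the first failed request, avoids a window in which in-flight calls carry a token that lapses mid-request.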
HAIO INTEGRATION
Passport Integration
All agents must bind to a Passport:
import { PassportClient } from "@human/agent-sdk/integrations/passport";
const passportClient = new PassportClient({
endpoint: "https://api.human.protocol/passport",
credentials: agentCredentials,
});
// Verify agent's Passport binding
const binding = await passportClient.verifyBinding(agent.passport);
// Check delegations
const delegations = await passportClient.getDelegations(
agent.passport.passportId,
agent.agentId
);
// Request new delegation
const result = await passportClient.requestDelegation({
from: agent.passport.passportId,
to: agent.agentId,
scope: "calendar.write",
reason: "Need to create meetings on your behalf",
duration: "30d",
});
Capability Graph Integration
Agents can query and update capabilities:
import { CapabilityClient } from "@human/agent-sdk/integrations/capability-graph";
const capabilityClient = new CapabilityClient({
endpoint: "https://api.human.protocol/capability-graph",
});
// Query capabilities for task matching
const capabilities = await capabilityClient.query({
passport: userPassportId,
domain: "software-engineering",
minLevel: 0.7,
});
// Submit evidence of capability demonstration
await capabilityClient.submitEvidence({
passport: userPassportId,
capability: "meeting-facilitation",
evidence: {
type: "task-completion",
taskId: meetingId,
outcome: "successful",
observedBy: agent.agentId,
},
});
HumanOS Integration
Agents register with HumanOS for orchestration:
import { HumanOSClient } from "@human/agent-sdk/integrations/humanos";
const humanosClient = new HumanOSClient({
endpoint: "https://api.human.protocol/humanos",
});
// Register agent with HumanOS
await humanosClient.registerAgent({
agentId: agent.agentId,
name: agent.name,
capabilities: agent.muscles.listCapabilities(),
boundaries: agent.boundaries,
passport: agent.passport,
});
// Report action for provenance
await humanosClient.reportAction({
agentId: agent.agentId,
action: "calendar.create",
input: { title: "Meeting", participants: [...] },
output: { eventId: "..." },
timestamp: new Date(),
});
// Request human escalation
const escalation = await humanosClient.escalate({
agentId: agent.agentId,
reason: "Uncertainty about meeting conflict resolution",
context: { conflictingEvents: [...] },
priority: "medium",
escalateTo: agent.passport.passportId,
});
DOCUMENTATION REQUIREMENTS
For SDK Release
- Getting Started Guide - 15 minutes to first agent
- Core Concepts - Identity, muscles, safety, coordination
- API Reference - Complete TypeScript docs
- Example Agents - 6+ reference implementations
- Best Practices - Security, performance, testing
- Migration Guide - From other agent frameworks
Example Agent Library
| Agent | Complexity | Demonstrates |
|---|---|---|
| Echo Agent | Trivial | Basic structure |
| Calendar Agent | Simple | Single muscle |
| Meeting Facilitator | Medium | Multiple muscles, coordination |
| Research Assistant | Medium | External APIs, summarization |
| Document Reviewer | Complex | Multi-step workflows |
| Workflow Coordinator | Complex | Multi-agent orchestration |
RELEASE ROADMAP
Phase 1: Core Primitives (Month 1-2)
- Protocol definitions (protobuf + OpenAPI for ctx API)
- Unified ctx pattern implementation (llm, agents, db, secrets, memory, etc.)
- handler() wrapper with capabilities and delegation
- SDK generator for TypeScript (source language)
- CLI tool (init, dev, deploy, test, vault, prompts, replay)
- Credential cascade (Passport > Vault > Env)
- Handler-level secret scoping
Phase 2: Trust & Cost (Month 2-3)
- Agent-to-agent delegation chain (auto-scoping, chained provenance)
- ctx.call.agent() with delegation validation
- LLM tier-based routing (fast/balanced/powerful)
- Cost controls with tiered thresholds
- Human escalation at budget limits
- Passport integration for user credentials
Phase 3: Testing & Debugging (Month 3-4)
- Semantic test assertions (expectSemantic)
- Golden output recording and approval
- LLM fixture recording (record/replay modes)
- Time-travel debugging (metadata default, full capture opt-in)
- Execution replay CLI
- Prompt versioning (repo → registry)
- Prompt A/B testing
Phase 4: Infrastructure & Multi-Language (Month 4-5)
- Declarative infrastructure (postgres, redis, s3, queue, vector)
- Preview deployments (seeded DB, branch URLs)
- SDK generator for Python, Go, Rust
- CI/CD pipeline for SDK auto-generation
- Multi-language SDK testing
Phase 5: Documentation & Launch (Month 5-6)
- Agent-readable documentation (OpenAPI, llms.txt, context.json)
- Complete API reference (auto-generated, versioned)
- Interactive docs with runnable examples
- Example agent library (6+ reference agents)
- Developer portal with guides
- CLI discoverability (human-agent ctx, human-agent docs)
- Community launch
SUCCESS METRICS
Adoption Metrics
| Metric | Year 1 Target | Measurement |
|---|---|---|
| SDK downloads | 10,000+ | npm/GitHub |
| Active developers | 500+ | Unique contributors |
| Agents built | 100+ | Registered with HumanOS |
| GitHub stars | 1,000+ | Community interest |
Quality Metrics
| Metric | Target | Measurement |
|---|---|---|
| Documentation coverage | 100% | All public APIs documented |
| Test coverage | 90%+ | Unit + integration tests |
| Time to first agent | < 15 min | Developer experience testing |
| Support response time | < 24 hours | Community support |
PUBLISHING TO MARKETPLACE
Overview
Agents built with the HUMAN SDK can be published to the Marketplace for discovery and installation by other users.
Publishing flow:
# Develop agent
$ human-agent init my-agent
$ cd my-agent
$ human-agent dev
# Test
$ human-agent test --golden
# Prepare for publishing
$ human-agent manifest
# Publish to Marketplace
$ human-agent publish
Manifest Requirements
Every published agent needs a complete human-agent.yaml:
# Marketplace metadata
marketplace:
name: "Invoice Processor"
description: "Extracts and validates invoice data"
category: "finance"
keywords: ["invoice", "ocr", "validation"]
# Trust tier target
trustTier: "verified" # community, verified, or human_certified
# Pricing
pricing:
free:
executions: 100
period: "month"
pro:
price: 49
currency: "USD"
period: "month"
executions: "unlimited"
# Screenshots and assets
assets:
icon: "./assets/icon.png"
screenshots:
- "./assets/screenshot1.png"
- "./assets/screenshot2.png"
demo_video: "https://youtube.com/..."
# Support
support:
documentation: "https://docs.example.com"
contact: "support@example.com"
repository: "https://github.com/..."
# Agent capabilities (for discovery)
capabilities:
- finance/invoice/process
- document/extraction
- data/validation
# Required permissions (shown to users during install)
requires:
scope:
- read:documents
- write:structured_data
escalate:
- finance/approver # When amount > $5000
App Review Process
Upon publishing, the agent enters the App Review pipeline:
1. Automated Review (<5 minutes for most agents)
   - Security Scanner Agent checks for vulnerabilities
   - Policy Compliance Agent verifies capability claims
   - Quality Assessment Agent scores code quality
   - Trust Scoring Agent classifies risk
2. Approval Decision
   - Low risk + good quality → Auto-approved (Community tier)
   - Medium risk → Fast-track human review (<4 hours)
   - High risk → Full human review (1-3 days)
3. Publisher Dashboard
# Check review status
$ human-agent status
Status: APPROVED (Community tier)
Marketplace URL: https://marketplace.human.cloud/agents/invoice-processor
Stats:
├─ Installs: 234
├─ Active users: 189
├─ Invocations (30d): 45,293
├─ Revenue (30d): $2,341
└─ Rating: 4.7 / 5.0 (47 reviews)
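The approval decision rules above can be read as a small routing function. The sketch below is illustrative only: the names `routeReview` and `ReviewRoute`, the 0 to 1 quality scale, and the 0.7 threshold are assumptions, and the source does not say what happens to a low-risk submission with poor quality, so this sketch sends it to fast-track human review.

```typescript
type ReviewRisk = "low" | "medium" | "high";

type ReviewRoute =
  | { route: "auto_approved"; tier: "community" }
  | { route: "fast_track_human_review"; targetHours: number }
  | { route: "full_human_review"; targetDays: number };

// Hypothetical quality bar on a 0..1 scale; the source only says "good quality".
const MIN_QUALITY = 0.7;

/** Maps the automated review outputs to the next step in the pipeline. */
function routeReview(risk: ReviewRisk, qualityScore: number): ReviewRoute {
  if (risk === "high") return { route: "full_human_review", targetDays: 3 };
  if (risk === "medium") return { route: "fast_track_human_review", targetHours: 4 };
  if (qualityScore >= MIN_QUALITY) return { route: "auto_approved", tier: "community" };
  // Assumption: low-risk submissions below the bar get a human look rather than a rejection.
  return { route: "fast_track_human_review", targetHours: 4 };
}

console.log(routeReview("low", 0.9).route); // auto_approved
console.log(routeReview("medium", 0.9).route); // fast_track_human_review
console.log(routeReview("high", 0.9).route); // full_human_review
```

Risk is checked before quality so that a well-written but high-risk agent can never be auto-approved.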
Revenue Share
| Trust Tier | Listing Fee | Revenue Share | Benefits |
|---|---|---|---|
| Community | Free | Developer: 85%, HUMAN: 15% | Basic listing |
| Verified | $99/year | Developer: 90%, HUMAN: 10% | Featured, verified badge |
| HUMAN Certified | $999/year | Developer: 95%, HUMAN: 5% | Top placement, certified badge, SLA |
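To compare tiers, a publisher could work through the table with a short calculation like the one below. It treats the dashboard's $2,341 as an illustrative gross monthly figure, uses integer cents to avoid floating-point drift, and assumes the listing fee is simply subtracted from the developer's annual share; the function names are hypothetical.

```typescript
type TrustTier = "community" | "verified" | "human_certified";

// Values taken from the revenue share table.
const TIER_TERMS: Record<TrustTier, { listingFeeUsd: number; developerPercent: number }> = {
  community: { listingFeeUsd: 0, developerPercent: 85 },
  verified: { listingFeeUsd: 99, developerPercent: 90 },
  human_certified: { listingFeeUsd: 999, developerPercent: 95 },
};

/** Developer share of a gross amount, in cents. */
function developerPayoutCents(tier: TrustTier, grossCents: number): number {
  return Math.round((grossCents * TIER_TERMS[tier].developerPercent) / 100);
}

/** Developer share over twelve months minus the annual listing fee, in cents. */
function netAnnualCents(tier: TrustTier, monthlyGrossCents: number): number {
  const annualShare = developerPayoutCents(tier, monthlyGrossCents * 12);
  return annualShare - TIER_TERMS[tier].listingFeeUsd * 100;
}

const monthlyGrossCents = 234100; // $2,341.00

console.log(developerPayoutCents("community", monthlyGrossCents)); // 198985, i.e. $1,989.85
console.log(netAnnualCents("community", monthlyGrossCents)); // 2387820
console.log(netAnnualCents("verified", monthlyGrossCents)); // 2518380
console.log(netAnnualCents("human_certified", monthlyGrossCents)); // 2568840
```

At this revenue level the higher tiers more than pay for their listing fees under these assumptions; at much lower revenue the free Community tier comes out ahead.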
Best Practices
For approval success:
- Comprehensive README and documentation
- Usage examples with sample data
- Error handling for all failure modes
- No hardcoded secrets or API keys
- Clear capability claims matching actual functionality
For marketplace success:
- Solve a common problem (browse existing agents for gaps)
- Competitive pricing (check similar agents)
- Responsive support (answer user questions fast)
- Regular updates (fix bugs, add features)
- Engage with reviews (thank users, address concerns)
For revenue success:
- Free tier to get users (100-1000 executions)
- Pro tier with real value (unlimited, premium features)
- Enterprise tier with custom pricing
- Bundle multiple agents into suites
- Cross-promote your other agents
SDK Commands for Publishing
# Initialize with marketplace support
$ human-agent init my-agent --marketplace
# Validate manifest before publishing
$ human-agent validate
# Test locally with marketplace configuration
$ human-agent dev --marketplace-mode
# Generate marketplace assets
$ human-agent assets generate
# Publish to marketplace (staging first)
$ human-agent publish --staging
# Promote to production after testing
$ human-agent promote --production
# Update published agent
$ human-agent publish --version 1.1.0
# View marketplace analytics
$ human-agent analytics --period 30d
# Respond to reviews
$ human-agent reviews --respond
See also:
- 135_agent_marketplace_architecture.md - Complete marketplace architecture
- 139_app_review_agent_spec.md - App Review technical specification
- 137_companion_powered_builder.md - Builder Companion integration
AGENT DISCOVERY & CAPABILITY MANAGEMENT
The Challenge: Managing Hundreds of Agents
As organizations mature their agent ecosystem, they face a critical discovery problem:
Symptoms:
- "Do we already have an agent that does X?"
- "Which agents can process invoices?"
- "Am I creating duplicates?"
- "How do I know what capabilities we have?"
- "Which marketplace agents would help us?"
Without proper discovery: Redundant agents, missed reuse opportunities, shadow AI.
The solution: Agent-aware Capability Graph + intelligent discovery interfaces.
Agent Storage: Multi-Tier Visibility
Agents are stored based on visibility:
┌─────────────────────────────────────────────────────────────┐
│ 1. MARKETPLACE AGENTS (Public, Saleable) │
│ Storage: Global registry (HUMAN-hosted) │
│ Discovery: Anyone can browse │
│ Examples: Invoice processors, contract reviewers │
└─────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│ 2. ORG-PRIVATE AGENTS (Internal, Never Public) │
│ Storage: Org Vault (file-based) │
│ Discovery: Only within org │
│ Examples: Internal workflow agents, proprietary tools │
└─────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│ 3. SELF-HOSTED ENTERPRISE AGENTS (Behind Firewall) │
│ Storage: On-premise vault + optional registry sync │
│ Discovery: Only within enterprise network │
│ Examples: Regulated industry agents, air-gapped systems │
└─────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│ 4. PERSONAL AGENTS (Individual's Vault) │
│ Storage: Personal Vault │
│ Discovery: Only by owner │
│ Examples: Personal assistants, custom automations │
└─────────────────────────────────────────────────────────────┘
Agent Manifest: Visibility Control
Every agent declares its visibility:
# agent.yaml
id: acme-proprietary-workflow
name: ACME Proprietary Workflow Agent
version: 1.0.0
capabilities: [workflow/acme_internal]
# Visibility control
visibility:
  scope: org # 'public' | 'org' | 'enterprise' | 'personal'
  ownerOrgId: org_acme_corp
  shareWith: [] # Optional: specific org IDs for B2B sharing
# Storage location
storage:
  type: vault # 'marketplace' | 'vault' | 'self_hosted'
  vaultRef: vault://org_acme_corp/agents/acme-workflow
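The visibility block maps naturally onto an access check. The sketch below is one way it might look, under stated assumptions: `Viewer`, `canDiscover`, and the `ownerPassportId` field are hypothetical, and the `enterprise` scope is treated like `org` because the network boundary is not modeled here.

```typescript
type VisibilityScope = 'public' | 'org' | 'enterprise' | 'personal';

interface Visibility {
  scope: VisibilityScope;
  ownerOrgId?: string;
  ownerPassportId?: string; // hypothetical: owner of a personal agent
  shareWith?: string[];
}

interface Viewer {
  passportId: string;
  orgId?: string;
}

// Returns true when the viewer is allowed to see the agent in discovery.
function canDiscover(v: Visibility, viewer: Viewer): boolean {
  switch (v.scope) {
    case 'public':
      return true;
    case 'org':
    case 'enterprise': {
      if (viewer.orgId === undefined) return false;
      // Owner org, or an org explicitly listed for B2B sharing.
      return viewer.orgId === v.ownerOrgId || (v.shareWith ?? []).includes(viewer.orgId);
    }
    case 'personal':
      return v.ownerPassportId === viewer.passportId;
  }
}
```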
Vault-Based Agent Storage
Agents are files, not database records:
vault://org_acme_corp/agents/
  acme-invoice-processor/
    agent.yaml # Manifest (metadata + config)
    handler.js # Code bundle
    package.json # Dependencies
    schemas/
      input.schema.json
      output.schema.json
    README.md # Human-readable docs
  acme-ap-workflow/
    agent.yaml
    handler.js
    ...
Benefits:
- ✅ Zero-config: Just drop files in vault
- ✅ No database setup required
- ✅ Version control friendly
- ✅ Works offline
- ✅ Vault is source of truth
Automatic Agent Indexing
How discovery works without databases:
┌─────────────────────────────────────────────────────────────┐
│ 1. Agent files live in vaults (source of truth) │
│ - Org vaults: vault://org_id/agents/* │
│ - Personal vaults: vault://did/agents/* │
│ - Marketplace: vault://marketplace/agents/* │
└────────────────────────┬────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ 2. Agent Indexer watches vaults (background service) │
│ - Scans vault://*/agents/ folders │
│ - Parses agent.yaml files │
│ - Builds in-memory search index │
│ - Refreshes on file changes │
└────────────────────────┬────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ 3. Discovery API queries index (ephemeral, rebuildable) │
│ - Fast search across all accessible agents │
│ - Scoped by user's vault access │
│ - Index loss = rebuild from vaults (no data loss) │
└─────────────────────────────────────────────────────────────┘
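The key property in the flow above is that the index is derived data: it can be thrown away and rebuilt from the manifests at any time. A minimal sketch of that idea, assuming manifests have already been parsed from `agent.yaml` into plain objects (`ManifestSummary`, `buildIndex`, and `findByCapability` are hypothetical names):

```typescript
interface ManifestSummary {
  id: string;
  capabilities: string[];
}

type CapabilityLookup = Map<string, Set<string>>;

// Rebuilds the whole index from parsed manifests. Nothing here is persisted.
function buildIndex(manifests: ManifestSummary[]): CapabilityLookup {
  const index: CapabilityLookup = new Map();
  for (const manifest of manifests) {
    for (const capability of manifest.capabilities) {
      const ids = index.get(capability) ?? new Set<string>();
      ids.add(manifest.id);
      index.set(capability, ids);
    }
  }
  return index;
}

// Returns agent IDs that declare the capability, sorted for stable output.
function findByCapability(index: CapabilityLookup, capability: string): string[] {
  const ids = index.get(capability);
  return ids ? Array.from(ids).sort() : [];
}
```

Because `buildIndex` is a pure function of the manifests, a file-change event can simply trigger a rebuild rather than an in-place update.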
Agent Discovery Service
interface AgentDiscoveryService {
// Context-aware search
findAgents(query: AgentQuery, context: DiscoveryContext): Promise<Agent[]>;
// Scoped to what the user can see
listAvailable(context: DiscoveryContext): Promise<Agent[]>;
// Suggest agents for a capability need
suggestForCapability(
capability: string,
context: DiscoveryContext
): Promise<AgentSuggestion[]>;
// Check for duplicate/overlapping agents
findSimilarAgents(agentId: string): Promise<AgentSimilarity[]>;
// Get capability coverage for org
getCapabilityCoverage(orgId: string): Promise<CapabilityCoverage>;
}
interface DiscoveryContext {
passportId: string;
orgId?: string;
vaultAccess: VaultAccessToken[];
deploymentProfile: 'hosted' | 'hybrid' | 'self_hosted' | 'desktop';
}
interface AgentQuery {
capabilities?: string[];
keywords?: string;
category?: string;
// Scope filters
includeMarketplace?: boolean; // Default: true
includeOrgPrivate?: boolean; // Default: true if orgId present
includePersonal?: boolean; // Default: true
includeSelfHosted?: boolean; // Default: true if on-premise
}
interface AgentSuggestion {
installed: Agent[]; // You already have these
marketplace: Agent[]; // Available to install
orgAvailable: Agent[]; // In your org but not installed
}
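The default comments on the `AgentQuery` scope filters imply a resolution step before any search runs. One possible sketch follows; `resolveScopes` is a hypothetical helper, and reading "on-premise" as `deploymentProfile === 'self_hosted'` is an assumption.

```typescript
interface ScopeFilters {
  includeMarketplace?: boolean;
  includeOrgPrivate?: boolean;
  includePersonal?: boolean;
  includeSelfHosted?: boolean;
}

interface ScopeContext {
  orgId?: string;
  deploymentProfile: 'hosted' | 'hybrid' | 'self_hosted' | 'desktop';
}

// Fills in unset filters with the documented defaults; explicit values win.
function resolveScopes(query: ScopeFilters, ctx: ScopeContext): Required<ScopeFilters> {
  const onPremise = ctx.deploymentProfile === 'self_hosted'; // assumption
  return {
    includeMarketplace: query.includeMarketplace ?? true,
    includeOrgPrivate: query.includeOrgPrivate ?? (ctx.orgId !== undefined),
    includePersonal: query.includePersonal ?? true,
    includeSelfHosted: query.includeSelfHosted ?? onPremise,
  };
}
```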
Companion Agent Discovery
Natural language queries to discover agents:
User: "What agents do I have for processing invoices?"
Companion (querying discovery service):
→ Found 5 agents with invoice processing capabilities:
📦 Installed (3):
1. Invoice Fraud Detector [Marketplace, Gold certified]
- Analyzes invoices for fraud patterns
- Last used: 2 hours ago
- 15.2K calls this month
2. ACME AP Workflow [Org-private]
- ACME-specific accounts payable automation
- Last used: today
- 8.9K calls this month
3. My Invoice Parser [Personal]
- Custom PDF extraction
- Draft status, not deployed
🛒 Available in marketplace (2):
4. QuickBooks Sync Agent [Silver certified]
- Bi-directional QuickBooks integration
- $49/mo or 100 calls/mo free
5. Invoice OCR Pro [Bronze certified]
- Advanced OCR with 50+ templates
- Free tier available
Would you like to:
• See usage details for any agent
• Install a marketplace agent
• Create a new agent
Query patterns:
"What agents can do X?"
"Do I already have an agent for Y?"
"Show me all finance agents"
"Which agents use this capability?"
"What's available for contract review?"
"Compare agent X and agent Y"
"What capabilities am I missing?"
Agent Capability Graph
Building a semantic capability index:
interface CapabilityIndex {
capabilities: {
[capabilityId: string]: {
name: string;
description: string;
installed_agents: AgentSummary[];
marketplace_agents: AgentSummary[];
related_capabilities: string[];
common_use_cases: string[];
};
};
}
// Example:
{
"finance/invoice/process": {
"name": "Invoice Processing",
"description": "Extract, validate, and route invoices for approval",
"installed_agents": [
{
"id": "invoice-fraud-detector",
"name": "Invoice Fraud Detector",
"certification": "gold",
"usage_stats": {
"calls_last_30d": 15234,
"avg_latency_ms": 1200,
"success_rate": 0.997
}
},
{
"id": "acme-ap-workflow",
"name": "ACME AP Workflow",
"visibility": "org",
"usage_stats": {
"calls_last_30d": 8941
}
}
],
"marketplace_agents": [
{
"id": "quickbooks-sync",
"name": "QuickBooks Sync Agent",
"certification": "silver",
"pricing": "$49/mo or 100 calls/mo free"
}
],
"related_capabilities": [
"finance/accounts_payable",
"document_processing/invoice",
"finance/fraud_detection"
],
"common_use_cases": [
"Invoice validation",
"PO matching",
"Approval routing"
]
}
}
Duplicate Detection
Before creating a new agent:
// Check for duplicates
const similar = await agentDiscovery.findSimilarAgents({
capabilities: ['finance/invoice/process'],
description: 'Process invoices for ACME',
}, context);
// Returns:
{
exact_matches: [], // Same capabilities + description
high_overlap: [
{
agent: acme_ap_workflow,
overlap_score: 0.85,
shared_capabilities: ['finance/invoice/process'],
recommendation: 'Consider extending existing agent'
}
],
moderate_overlap: [
{
agent: invoice_fraud_detector,
overlap_score: 0.45,
shared_capabilities: ['document_processing/invoice'],
recommendation: 'Could use as dependency'
}
]
}
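How `overlap_score` is computed is not specified here. As an illustration only, the sketch below uses Jaccard similarity over capability sets and buckets the result. The thresholds (0.8 and 0.4) are assumptions chosen to agree with the sample output above, and a real scorer would presumably also compare descriptions.

```typescript
// Jaccard similarity: |A ∩ B| / |A ∪ B| over capability identifiers.
function jaccard(a: string[], b: string[]): number {
  const setA = new Set(a);
  const setB = new Set(b);
  let shared = 0;
  setA.forEach((capability) => {
    if (setB.has(capability)) shared += 1;
  });
  const unionSize = setA.size + setB.size - shared;
  return unionSize === 0 ? 0 : shared / unionSize;
}

type OverlapBucket = 'exact' | 'high' | 'moderate' | 'none';

// Hypothetical thresholds; tune against real agent catalogs.
function bucketOverlap(score: number): OverlapBucket {
  if (score === 1) return 'exact';
  if (score >= 0.8) return 'high';
  if (score >= 0.4) return 'moderate';
  return 'none';
}
```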
AUTO-GENERATED AGENT DOCUMENTATION
The Vision: API Reference for Agents
Every agent should have API-style documentation like Stripe or Twilio docs.
What developers need:
- Capabilities: What does this agent do?
- Entrypoints: How do I call it?
- Input/Output schemas: What data does it expect/return?
- Examples: Show me working code
- Reasoning requirements: What models does it need?
- Compatibility: Will it work in my org?
- Usage stats: How reliable is it?
Auto-Generated from Manifest
Agent documentation is generated from agent.yaml:
# Invoice Fraud Detector
**Publisher:** acme-security-corp
**Certification:** Gold
**Visibility:** Marketplace (public)
**Version:** 1.0.3
**Last Updated:** 2025-12-15
## Description
Analyzes invoices for fraud patterns using AI + human review.
## Capabilities
- `finance/fraud_detection` - Detect fraudulent transactions
- `document_processing/invoice` - Extract and analyze invoice data
## Installation
```bash
human agent install invoice-fraud-detector
```
## Usage
const result = await human.call({
target: "agent://invoice-fraud-detector.analyze",
input: {
document_id: "inv_123",
vendor_id: "vendor_456",
amount: 50000,
},
delegation: passport.delegate({
scope: ["read:documents", "write:audit_log"],
}),
});
// Returns:
// {
// risk_score: 0.87, // 0-1
// fraud_indicators: ["unusual_amount", "new_vendor"],
// recommended_action: "review",
// confidence: 0.92
// }
Entrypoints
analyze
Analyzes a single invoice for fraud indicators.
Input Schema:
{
"type": "object",
"required": ["document_id", "amount"],
"properties": {
"document_id": {
"type": "string",
"description": "ID of invoice document"
},
"vendor_id": {
"type": "string",
"description": "Vendor identifier"
},
"amount": {
"type": "number",
"description": "Invoice amount"
},
"context": {
"type": "object",
"properties": {
"previous_invoices": {
"type": "number",
"description": "Number of prior invoices from this vendor"
}
}
}
}
}
Output Schema:
{
"type": "object",
"properties": {
"risk_score": {
"type": "number",
"description": "Risk score from 0 (safe) to 1 (high risk)"
},
"fraud_indicators": {
"type": "array",
"items": { "type": "string" },
"description": "List of detected fraud indicators"
},
"recommended_action": {
"type": "string",
"enum": ["approve", "review", "reject"],
"description": "Recommended next action"
},
"confidence": {
"type": "number",
"description": "Confidence in assessment (0-1)"
}
}
}
Reasoning Requirements
- Capabilities: classification, anomaly_detection, natural_language
- Min context window: 16,000 tokens
- Allows PHI: true
- Regulatory domains: finance
- Supported profiles: high_safety, standard_safety
Compatibility: Will work with your org if you have:
- ✅ GPT-4, Claude 3.5 Sonnet, or equivalent
- ✅ standard_safety or higher reasoning profile
- ✅ 16K+ context window models
Usage Statistics
- Total installs: 1,247
- Active orgs: 892
- Calls (last 30 days): 2.4M
- Avg latency: 1.2s
- Success rate: 99.7%
- Rating: 4.8/5 (230 reviews)
Pricing
- Free tier: 100 calls/month
- Pro tier: $49/month (1,000 calls)
- Business tier: $199/month (unlimited)
- Enterprise: Custom pricing
Support
- Documentation: https://docs.acme-security.com/fraud-detector
- Contact: support@acme-security.com
- Repository: https://github.com/acme/invoice-fraud-detector
Generated from agent manifest on 2025-12-19
### Documentation Export Formats
**CLI commands:**
```bash
# Generate Markdown docs
human agent docs generate --format markdown --output ./docs/
# Generate OpenAPI spec
human agent docs generate --format openapi --output ./openapi.yaml
# Generate interactive HTML site
human agent docs generate --format html --output ./agent-docs/
# Generate for specific agents
human agent docs generate invoice-fraud-detector --format markdown
```
**OpenAPI export:**
# Auto-generated OpenAPI spec
openapi: 3.0.0
info:
  title: Invoice Fraud Detector Agent
  version: 1.0.3
  description: Analyzes invoices for fraud patterns
paths:
  /agents/invoice-fraud-detector/analyze:
    post:
      summary: Analyze invoice for fraud
      operationId: analyzeInvoice
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/AnalyzeInput'
      responses:
        '200':
          description: Analysis complete
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/AnalyzeOutput'
components:
  schemas:
    AnalyzeInput:
      type: object
      required: [document_id, amount]
      properties:
        document_id:
          type: string
        vendor_id:
          type: string
        amount:
          type: number
    AnalyzeOutput:
      type: object
      properties:
        risk_score:
          type: number
        fraud_indicators:
          type: array
          items:
            type: string
        recommended_action:
          type: string
          enum: [approve, review, reject]
Use with Postman, Swagger UI, etc.
Organization-Wide Agent Reference
Generate reference for all installed agents:
# Export all installed agents
human agent docs generate --scope org --output ./org-agent-reference/
# Creates:
org-agent-reference/
  index.html
  agents/
    invoice-fraud-detector.html
    acme-ap-workflow.html
    contract-risk-analyzer.html
    ...
  capabilities/
    finance.html
    legal.html
    ...
  search.js
Browsable site:
┌─────────────────────────────────────────────────────────────┐
│ ACME Corp Agent Reference [Search...] │
├─────────────────────────────────────────────────────────────┤
│ │
│ Quick Links │
│ • All Agents (15) │
│ • By Capability │
│ • By Certification │
│ • Usage Statistics │
│ │
│ ────────────────────────────────────────────────────────── │
│ │
│ Finance Agents (5) │
│ │
│ 🟢 Invoice Fraud Detector [Gold, Marketplace] │
│ Analyzes invoices for fraud patterns │
│ 15.2K calls/month • 99.7% success │
│ [View Docs] [Usage Stats] [Call Examples] │
│ │
│ 🟢 ACME AP Workflow [Org-private] │
│ ACME-specific accounts payable automation │
│ 8.9K calls/month • 98.2% success │
│ [View Docs] [Usage Stats] [Call Examples] │
│ │
│ ... more agents ... │
│ │
└─────────────────────────────────────────────────────────────┘
IDE INTEGRATION: AGENT-AWARE DEVELOPMENT
Cursor as Agent-Aware IDE
Modern AI-driven IDEs (Cursor, Windsurf, etc.) should be fully aware of the agent ecosystem.
MCP Server for Agent Discovery
HUMAN provides an MCP server that exposes agent information to IDEs:
// packages/mcp-server-agents/src/index.ts
import { Server } from '@modelcontextprotocol/sdk/server/index.js';
import {
CallToolRequestSchema,
ListResourcesRequestSchema,
ReadResourceRequestSchema,
} from '@modelcontextprotocol/sdk/types.js';
const server = new Server({
name: 'human-agents-server',
version: '0.1.0',
}, {
capabilities: { tools: {}, resources: {} },
});
// Tool: Search for agents by capability
// Handlers are registered by request schema, not by method-name string
server.setRequestHandler(CallToolRequestSchema, async (request) => {
// Arguments are optional in the protocol, so default to an empty object
const args = request.params.arguments ?? {};
if (request.params.name === 'search_agents') {
const { capability, keywords, include_marketplace } = args;
const agents = await agentIndexer.findAgents({
capabilities: capability ? [capability] : undefined,
keywords,
includeMarketplace: include_marketplace,
});
return { content: [{ type: 'text', text: JSON.stringify(agents, null, 2) }] };
}
if (request.params.name === 'get_agent_details') {
const { agent_id } = args;
const agent = await agentIndexer.getAgent(agent_id);
const docs = await generateAgentDocs(agent);
return { content: [{ type: 'text', text: docs }] };
}
if (request.params.name === 'suggest_agents_for_code') {
const { code_snippet, intent } = args;
const suggestions = await suggestAgentsForIntent(intent, code_snippet);
return { content: [{ type: 'text', text: JSON.stringify(suggestions, null, 2) }] };
}
// Fail loudly instead of returning undefined for an unrecognized tool
throw new Error(`Unknown tool: ${request.params.name}`);
});
// Resource: List all installed agents
server.setRequestHandler(ListResourcesRequestSchema, async () => {
return {
resources: [
{
uri: 'human://agents/installed',
name: 'Installed Agents',
mimeType: 'application/json',
},
{
uri: 'human://agents/marketplace',
name: 'Marketplace Agents',
mimeType: 'application/json',
},
{
uri: 'human://agents/capabilities',
name: 'Agent Capabilities Index',
mimeType: 'application/json',
},
],
};
});
// Resource: Read agent data
server.setRequestHandler(ReadResourceRequestSchema, async (request) => {
if (request.params.uri === 'human://agents/installed') {
const agents = await agentIndexer.listInstalled();
return {
contents: [{
uri: request.params.uri,
mimeType: 'application/json',
text: JSON.stringify(agents, null, 2),
}],
};
}
if (request.params.uri === 'human://agents/capabilities') {
const capabilityIndex = await buildCapabilityIndex();
return {
contents: [{
uri: request.params.uri,
mimeType: 'application/json',
text: JSON.stringify(capabilityIndex, null, 2),
}],
};
}
});
Cursor Configuration
// .cursor/mcp.json
{
"mcpServers": {
"human-agents": {
"command": "node",
"args": [".human/mcp-server-agents/index.js"],
"env": {
"HUMAN_VAULT_PATH": "${workspaceFolder}/.human/vault",
"HUMAN_ORG_ID": "org_acme_corp"
}
}
}
}
Auto-Updated .cursorrules
HUMAN generates .cursorrules from agent ecosystem:
# .cursorrules (Auto-generated by HUMAN)
Last updated: 2025-12-19T16:30:00Z
Org: ACME Corp (org_acme_corp)
Installed agents: 15 org + 4 personal
Marketplace agents: 1,247 available
## Installed Agents
### Finance (5 agents)
- **invoice-fraud-detector**: Analyzes invoices for fraud patterns
- Capabilities: finance/fraud_detection, document_processing/invoice
- Usage: 15.2K calls (last 30d), 99.7% success rate
- Example: `agent://invoice-fraud-detector.analyze`
- **acme-ap-workflow**: ACME-specific accounts payable automation
- Capabilities: finance/accounts_payable, workflow/approval
- Usage: 8.9K calls (last 30d), 98.2% success rate
- Example: `agent://acme-ap-workflow.process`
... [full list of installed agents]
## Marketplace Highlights (Relevant to Your Work)
Based on your codebase analysis, these marketplace agents might help:
- **payment-risk-analyzer** (Silver, $99/mo): Similar to your fraud detector but for payments
- **quickbooks-sync** (Silver, free tier): You're calling QuickBooks API manually in 3 places
- **document-parser-pro** (Bronze, $29/mo): Better than your PDF parsing code
## Coding Guidelines
1. **Agent-first development**: Check for existing agents before implementing
2. **Use MCP**: Query `@human` in chat to search agents
3. **Compose agents**: Build workflows from existing agents when possible
4. **Create agents**: If logic is reusable, make it an agent
## Common Patterns in This Workspace
```typescript
// Invoice processing (standard pattern in this org)
const fraud = await human.call({ target: "agent://invoice-fraud-detector.analyze", ... });
const po = await human.call({ target: "agent://acme-po-validator.validate", ... });
if (fraud.risk_score > 0.7) {
await human.call({ target: "agent://acme-approval-workflow.route", ... });
}
```
## Capability Map
[Semantic index of all capabilities with installed/marketplace agents]
finance/fraud_detection:
- invoice-fraud-detector (installed, gold)
- payment-risk-analyzer (marketplace, silver)
legal/contract/risk:
- contract-risk-analyzer (installed, silver)
- msla-review-agent (marketplace, gold)
... [full capability graph]
### IDE Behaviors with Agent Context
**Scenario 1: Writing code that could use an agent**
```typescript
// Developer writes:
async function detectFraud(invoice: Invoice) {
// User starts writing fraud detection logic
```
Cursor AI (with agent context) responds:
💡 I noticed you're implementing fraud detection.
You have an installed agent that does this:
• invoice-fraud-detector (finance/fraud_detection)
Instead of implementing this manually, you can use:
[Shows agent call code with proper types]
This agent is maintained by your security team and already
handles about 15,000 calls a month.
Would you like to use it? [Yes] [Show alternatives] [Continue manual]
Scenario 2: Need capability not installed
async function extractNDATerms(contract: string) {
// Need to parse NDA contract
Cursor AI:
💡 You don't have an agent for NDA extraction, but there's one
in the marketplace:
📦 nda-automation-suite (Gold certified)
• Extracts terms, parties, dates, obligations
• Handles 50+ NDA templates
• $49/month or 100 free calls/month
• 4.8/5 stars (230 reviews)
Install? [Yes, add to org] [Yes, personal trial] [Show alternatives]
Or I can help you create a custom agent if you need different
functionality.
Scenario 3: Agent composition
async function processContract(contractPdf: Buffer) {
// Extract text from PDF
// Analyze for risks
// Check compliance
// Route for approval
Cursor AI:
💡 This workflow combines multiple agent capabilities:
1. PDF extraction: document-parser-pro (installed)
2. Risk analysis: contract-risk-analyzer (installed)
3. Compliance: compliance-checker (installed)
4. Approval routing: acme-approval-workflow (installed)
Here's a composed implementation:
[Shows full workflow code using 4 agents]
All 4 agents are already installed and working together in your org.
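The placeholder above stands for a composed workflow. A sketch of what such a composition might look like is below. It is illustrative only: `AgentClient` is a stand-in for the SDK client so the example runs on its own, the entrypoint names `.extract`, `.analyze`, and `.check` are hypothetical, and `Uint8Array` is used in place of `Buffer` to avoid Node-specific types.

```typescript
// Minimal stand-in for the SDK client (assumption, not the real interface).
interface AgentClient {
  call(req: { target: string; input: unknown }): Promise<any>;
}

async function processContract(human: AgentClient, contractPdf: Uint8Array) {
  // 1. Extract text from the PDF
  const { text } = await human.call({
    target: 'agent://document-parser-pro.extract',
    input: { pdf: contractPdf },
  });
  // 2 + 3. Risk and compliance checks are independent, so run them in parallel
  const [risk, compliance] = await Promise.all([
    human.call({ target: 'agent://contract-risk-analyzer.analyze', input: { text } }),
    human.call({ target: 'agent://compliance-checker.check', input: { text } }),
  ]);
  // 4. Route for approval with both results attached
  const approval = await human.call({
    target: 'agent://acme-approval-workflow.route',
    input: { risk, compliance },
  });
  return { risk, compliance, approval };
}
```

Passing the client in as a parameter keeps the workflow easy to test with a stub.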
Natural Language Agent Queries in IDE
Chat with Cursor about agents:
Developer: "@human what agents do I have for processing contracts?"
Cursor (via MCP):
Found 3 agents for contract processing (2 installed, 1 in the marketplace):
1. **contract-risk-analyzer** (installed, org-wide)
- Capabilities: legal/contract/risk, document_processing
- Analyzes contracts for legal risks, unfavorable terms
- Used 234 times this month
- Avg latency: 2.3s
2. **compliance-checker** (installed, org-wide)
- Capabilities: legal/compliance, regulatory/audit
- Checks contracts against regulatory requirements
- Used 156 times this month
- Avg latency: 1.8s
3. **nda-automation-suite** (marketplace, not installed)
- Capabilities: legal/nda, contract/automation
- Gold certified, $49/mo or 100 calls/mo free
- Extracts terms, generates summaries, compares versions
- 4.8/5 stars
Would you like to:
- See example usage code for any of these
- Install nda-automation-suite
- Create a new contract agent
TypeScript Types from Agent Schemas
Auto-generated types for installed agents:
// Auto-generated: .human/types/agents.d.ts
declare module '@human/agents' {
export namespace invoiceFraudDetector {
interface AnalyzeInput {
document_id: string;
vendor_id?: string; // Optional: not listed in the input schema's "required"
amount: number;
context?: {
previous_invoices?: number;
};
}
interface AnalyzeOutput {
risk_score: number;
fraud_indicators: string[];
recommended_action: 'approve' | 'review' | 'reject';
confidence: number;
}
function analyze(input: AnalyzeInput): Promise<AnalyzeOutput>;
}
export namespace contractRiskAnalyzer {
// ... types for this agent
}
}
Developer uses types:
import { invoiceFraudDetector } from '@human/agents';
async function processInvoice(invoiceData: InvoiceData) {
// TypeScript knows the shape!
const result = await invoiceFraudDetector.analyze({
document_id: invoiceData.id,
vendor_id: invoiceData.vendorId,
amount: invoiceData.amount,
});
// Autocomplete works:
if (result.recommended_action === 'approve') {
console.log(result.fraud_indicators); // string[]
}
}
Zero-Config Workspace Setup
# In your HUMAN workspace:
human workspace init
✓ Scanned vaults (15 org + 4 personal agents)
✓ Built capability index
✓ Started MCP server
✓ Generated .cursorrules
✓ Generated TypeScript types
✓ Configured Cursor integration
Cursor AI is now agent-aware!
Try asking:
@human what agents do I have?
@human show me fraud detection agents
@human help me build an invoice workflow
EXAMPLE USE CASES: PROVING THE SDK
Purpose: Concrete, working examples that demonstrate the SDK solves real problems.
These examples serve as:
- Validation - If we can't write the example cleanly, the SDK needs work
- Benchmark - Every new SDK feature must have a use case
- Onboarding - Developers copy-paste and ship
See also: 106_use_case_library.md for complete examples with full code
Use Case 1: Invoice Processing (Medium Complexity)
Scenario: Email arrives with invoice PDF → Extract data → Request human approval if amount > $5,000 → Update QuickBooks
Time to build: 1 hour
Complexity: Medium
Components: Agent + LLM + Database + Human Approval + External API
Key SDK Features Demonstrated
import { handler } from '@human/agent-sdk';
export const invoiceProcessor = handler({
id: 'invoice-processor',
capabilities: ['finance/invoice/process'],
async execute(ctx) {
const { pdfUrl } = ctx.input;
// 1. Infrastructure-invisible: PDF extraction
const pdfText = await ctx.call.muscle('pdf-extractor', { url: pdfUrl });
// 2. Infrastructure-invisible: LLM with provider abstraction
const extraction = await ctx.llm.complete({
model: 'gpt-4',
messages: [
{ role: 'system', content: 'Extract invoice data as JSON' },
{ role: 'user', content: pdfText }
],
responseFormat: 'json'
});
const invoice = JSON.parse(extraction.text);
// 3. Infrastructure-invisible: Database (auto-provisioned)
const invoiceId = await ctx.db.query(
'INSERT INTO invoices (vendor, amount, status) VALUES ($1, $2, $3) RETURNING id',
[invoice.vendor, invoice.amount, 'pending']
);
// 4. Human-in-the-loop: Conditional approval
if (invoice.amount > 5000) {
const approved = await ctx.oversight.approve({
question: `Approve invoice from ${invoice.vendor} for $${invoice.amount}?`,
context: invoice,
requiredCapability: 'finance/invoice-approver',
timeout: 86400 // 24 hours
});
if (!approved) {
await ctx.db.query('UPDATE invoices SET status = $1 WHERE id = $2', ['rejected', invoiceId]);
return { status: 'rejected' };
}
}
// 5. External integration: QuickBooks muscle
await ctx.call.muscle('quickbooks-create-bill', {
vendor: invoice.vendor,
amount: invoice.amount,
lineItems: invoice.lineItems
});
// 6. Provenance: All logged automatically
return {
status: 'processed',
invoiceId,
amount: invoice.amount
};
}
});
What This Proves
| SDK Principle | Demonstrated How |
|---|---|
| Infrastructure-invisible | No config for PDF, LLM, DB, or QB |
| Secure by default | Provenance logged automatically |
| Scale-to-zero | No min instances configured |
| Progressive permissions | Approval requested when needed |
| Provider-agnostic | Could swap LLM model with one line |
Result: 60 lines of business logic. Zero infrastructure code.
Use Case 2: Contract Review (High Complexity)
Scenario: Analyze contract → Flag risks → Route to lawyer if high-risk
Time to build: 1 hour
Complexity: High
Components: Agent + LLM + Capability Routing + Risk-Based Escalation
Key SDK Features Demonstrated
import { handler } from '@human/agent-sdk';
export const contractReviewer = handler({
id: 'contract-reviewer',
capabilities: ['legal/contract-review'],
async execute(ctx) {
const { contractText } = ctx.input;
// 1. Model selection: Claude for legal reasoning
const analysis = await ctx.llm.complete({
model: 'claude-3-5-sonnet', // Best for legal
messages: [
{
role: 'system',
content: `You are a legal contract analyzer. Output as JSON: {riskLevel, concerns, recommendations}`
},
{ role: 'user', content: contractText }
],
responseFormat: 'json'
});
const review = JSON.parse(analysis.text);
// 2. Capability-based routing: High risk → lawyer
if (review.riskLevel === 'high') {
const lawyerReview = await ctx.oversight.escalate({
question: 'High-risk contract detected. Please review.',
context: { contractText, riskLevel: review.riskLevel, concerns: review.concerns },
requiredCapability: 'legal/contract-attorney', // Only certified attorneys
priority: 'high'
});
return {
status: 'reviewed',
riskLevel: 'high',
humanReview: lawyerReview,
concerns: review.concerns
};
}
// 3. Medium risk: optional review (create review link)
if (review.riskLevel === 'medium') {
return {
status: 'flagged',
riskLevel: 'medium',
concerns: review.concerns,
reviewUrl: ctx.oversight.createReviewLink({ context: review })
};
}
// 4. Low risk: auto-approve
return {
status: 'approved',
riskLevel: 'low',
concerns: review.concerns
};
}
});
What This Proves
| SDK Principle | Demonstrated How |
|---|---|
| Capability-first routing | legal/contract-attorney ensures qualified reviewer |
| Risk-based escalation | High → lawyer, medium → optional, low → auto |
| Model abstraction | Selected Claude for legal reasoning |
| Escalation vs approval | escalate() for urgent, createReviewLink() for optional |
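The example calls `JSON.parse` on model output directly, which throws on malformed JSON and accepts any `riskLevel` string. A hedged sketch of a stricter parse step follows; `parseReview` and `ContractReview` are hypothetical names, and the field shapes are inferred from the system prompt above rather than from a published schema.

```typescript
const RISK_LEVELS = ['low', 'medium', 'high'] as const;
type RiskLevel = (typeof RISK_LEVELS)[number];

interface ContractReview {
  riskLevel: RiskLevel;
  concerns: string[];
  recommendations: string[];
}

function isStringArray(value: unknown): value is string[] {
  return Array.isArray(value) && value.every((item) => typeof item === 'string');
}

// Parses and validates the model's JSON before any routing decision is made.
function parseReview(raw: string): ContractReview {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    throw new Error('Model output is not valid JSON');
  }
  if (typeof data !== 'object' || data === null) {
    throw new Error('Model output is not a JSON object');
  }
  const { riskLevel, concerns, recommendations } = data as Record<string, unknown>;
  const level = RISK_LEVELS.find((candidate) => candidate === riskLevel);
  if (level === undefined) {
    throw new Error(`Unexpected riskLevel: ${String(riskLevel)}`);
  }
  if (!isStringArray(concerns) || !isStringArray(recommendations)) {
    throw new Error('concerns and recommendations must be string arrays');
  }
  return { riskLevel: level, concerns, recommendations };
}
```

A caller could treat a thrown error as a reason to escalate, so that an unreadable analysis is never mistaken for a low-risk one.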
Use Case 3: Multi-Agent Medical Triage (Very High Complexity)
Scenario: Patient symptoms → Triage agent analyzes → Route to specialist → Book appointment
Time to build: 2 hours
Complexity: Very High
Components: Multi-agent orchestration + Capability routing + External integrations
Key SDK Features Demonstrated
import { handler } from '@human/agent-sdk';
export const medicalTriage = handler({
id: 'medical-triage',
capabilities: ['healthcare/triage'],
async execute(ctx) {
const { patientId, symptoms } = ctx.input;
// 1. Call marketplace agent (triage analysis)
const triageResult = await ctx.call.agent('marketplace:medical-triage-analyzer', {
symptoms,
patientHistory: await getPatientHistory(ctx, patientId)
});
// 2. Severity-based routing
switch (triageResult.severity) {
case 'critical':
// Immediate escalation to on-call physician
const onCallResponse = await ctx.oversight.escalate({
question: 'CRITICAL: Patient requires immediate attention',
context: { patientId, symptoms, triageAnalysis: triageResult },
requiredCapability: 'healthcare/physician-on-call',
priority: 'critical',
timeout: 300 // 5 minutes
});
// If no response, call emergency services
if (!onCallResponse) {
await ctx.call.muscle('emergency-dispatch', { patientId });
}
return { status: 'escalated', severity: 'critical' };
case 'urgent':
// Route to specialist via capability
const specialist = await ctx.call.agent('cap:healthcare-specialist-router', {
symptoms,
specialty: triageResult.recommendedSpecialty
});
// Book urgent appointment (EMR integration)
const appointment = await ctx.call.muscle('ehr-schedule-appointment', {
patientId,
providerId: specialist.providerId,
urgency: 'same-day'
});
return { status: 'scheduled', appointmentTime: appointment.time };
case 'routine':
// Self-care advice
const advice = await ctx.call.agent('marketplace:medical-self-care-advisor', {
symptoms,
severity: 'routine'
});
return {
status: 'advised',
advice: advice.recommendations,
scheduleUrl: await ctx.call.muscle('ehr-generate-booking-link', { patientId })
};
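default:
// Unknown severity: fail loudly rather than silently returning undefined
throw new Error(`Unexpected triage severity: ${triageResult.severity}`);
}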
}
});
What This Proves
| SDK Principle | Demonstrated How |
|---|---|
| Multi-agent orchestration | 3 agents + 3 muscles coordinated |
| Capability routing | cap:healthcare-specialist-router finds right specialist |
| Marketplace integration | Used 2 marketplace agents |
| Emergency handling | Critical path with fallback to emergency dispatch |
| External integrations | EHR scheduling via muscle |
Use Case 4: Real-Time Debugging (Developer Experience)
Scenario: Production agent fails → Developer time-travels to failure → Fixes → Redeploys
Time to debug: 5 minutes (vs 2 hours without time-travel)
Complexity: SDK feature validation
Components: Time-travel debugging + Provenance
The Problem (Without HUMAN SDK)
// Production failure:
// Error: "Invoice approval failed"
//
// Developer must:
// 1. Search logs (10 min)
// 2. Reconstruct state (20 min)
// 3. Find approval context (15 min)
// 4. Reproduce locally (30 min)
// 5. Debug (45 min)
//
// Total: 2 hours
The Solution (With HUMAN SDK)
# 1. Open console, see failure (30 seconds)
human console
# 2. Click "Time Travel to Failure" (30 seconds)
# → Automatically reconstructs exact state at failure
# 3. See provenance chain (1 minute)
# → Who: Agent invoice-processor
# → What: Approval request
# → When: 2026-01-03 10:30:15Z
# → Why: Amount exceeded threshold ($6,234 > $5,000)
# → Result: Human reviewer rejected (reason: "Duplicate invoice")
# 4. Identify issue: No duplicate detection (1 minute)
# 5. Fix locally (2 minutes)
# Add duplicate check before approval request
# 6. Deploy (1 minute)
human deploy
Total: 5 minutes
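Step 5 above ("add duplicate check before approval request") could look roughly like the sketch below. The Invoice shape, the invoiceKey and isDuplicateInvoice helpers, and the in-memory set are hypothetical names chosen for illustration, not SDK APIs; a real agent would presumably keep seen keys in SDK-managed storage.

```typescript
// Hypothetical invoice shape; field names are assumptions for illustration only.
interface Invoice {
  vendorId: string;
  invoiceNumber: string;
  amountCents: number;
}

// Build a normalized key so the same invoice submitted twice maps to the same string.
function invoiceKey(invoice: Invoice): string {
  return [
    invoice.vendorId.trim().toLowerCase(),
    invoice.invoiceNumber.trim().toLowerCase(),
    invoice.amountCents,
  ].join("|");
}

// Returns true when an invoice with the same normalized key has already been seen.
function isDuplicateInvoice(invoice: Invoice, seenKeys: ReadonlySet<string>): boolean {
  return seenKeys.has(invoiceKey(invoice));
}

// Usage: record the first submission, then check a resubmission before requesting approval.
const seenKeys = new Set<string>();
const firstInvoice: Invoice = { vendorId: "ACME", invoiceNumber: "INV-1001", amountCents: 623400 };
seenKeys.add(invoiceKey(firstInvoice));

const resubmitted: Invoice = { vendorId: "acme ", invoiceNumber: "inv-1001", amountCents: 623400 };
const differentInvoice: Invoice = { vendorId: "ACME", invoiceNumber: "INV-1002", amountCents: 623400 };

console.log(isDuplicateInvoice(resubmitted, seenKeys)); // true
console.log(isDuplicateInvoice(differentInvoice, seenKeys)); // false
```

Running the check before the approval request means a duplicate never reaches the human reviewer, which is the failure the provenance chain surfaced.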
What This Proves
| SDK Principle | Demonstrated How |
|---|---|
| Provenance by default | Every step logged automatically |
| Time-travel debugging | Reconstruct exact state at failure |
| Exquisite DX | 5 min vs 2 hours to debug |
| Infrastructure-invisible | No logging config required |
Use Case 5: Cost Optimization (Automatic)
Scenario: Agent uses expensive model → SDK automatically optimizes → up to 90% cost reduction
Time to optimize: 0 minutes (automatic)
Complexity: SDK intelligence validation
Components: Model routing + Cost tracking
What Developer Writes
export const questionAnswerer = handler({
id: 'question-answerer',
capabilities: ['ai/question-answering'],
async execute(ctx) {
// Developer just says "use LLM"
const answer = await ctx.llm.complete({
messages: [
{ role: 'system', content: 'You are a helpful assistant.' },
{ role: 'user', content: ctx.input.question }
]
// No model specified - SDK chooses
});
return { answer: answer.text };
}
});
What SDK Does Automatically
// Behind the scenes (developer never sees this):
// Week 1: Use default model (GPT-4)
// → Tracks: 1000 questions, $50 cost, 2.3s avg latency
// Week 2: SDK learns pattern
// → Analysis: Simple questions, high latency tolerance, no complex reasoning
// → Suggestion: Try GPT-3.5-turbo (10x cheaper)
// Week 3: SDK A/B tests
// → GPT-4: $50/1000 questions, 2.3s latency, 95% quality
// → GPT-3.5-turbo: $5/1000 questions, 1.8s latency, 94% quality
// → Decision: Switch to GPT-3.5-turbo (saves $45/1000 = 90% cost reduction)
// Week 4: Auto-optimized
// → All questions route to GPT-3.5-turbo
// → Complex questions (detected via routing logic) still use GPT-4
// → Developer never configured anything
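The Week 3 decision can be sketched as a simple comparison of A/B statistics. This is a minimal illustration and not the SDK's actual routing logic: the ModelStats shape, the chooseModel function, and the 2-point quality tolerance are all assumptions; only the cost, latency, and quality figures come from the example above.

```typescript
// Per-model statistics collected during an A/B test (hypothetical shape).
interface ModelStats {
  model: string;
  costPer1000Usd: number; // cost per 1000 questions, in USD
  avgLatencySec: number;
  qualityPercent: number; // quality score, 0-100
}

interface RoutingDecision {
  model: string;
  savingsPercent: number;
}

// Switch to the candidate only if it is cheaper and its quality drop stays
// within the allowed tolerance (in percentage points); otherwise keep the baseline.
function chooseModel(
  baseline: ModelStats,
  candidate: ModelStats,
  maxQualityDropPoints: number
): RoutingDecision {
  const qualityDrop = baseline.qualityPercent - candidate.qualityPercent;
  const isCheaper = candidate.costPer1000Usd < baseline.costPer1000Usd;
  if (isCheaper && qualityDrop <= maxQualityDropPoints) {
    const savingsPercent = Math.round(
      ((baseline.costPer1000Usd - candidate.costPer1000Usd) * 100) / baseline.costPer1000Usd
    );
    return { model: candidate.model, savingsPercent };
  }
  return { model: baseline.model, savingsPercent: 0 };
}

// Figures from the Week 3 A/B test above.
const gpt4: ModelStats = { model: "GPT-4", costPer1000Usd: 50, avgLatencySec: 2.3, qualityPercent: 95 };
const gpt35: ModelStats = { model: "GPT-3.5-turbo", costPer1000Usd: 5, avgLatencySec: 1.8, qualityPercent: 94 };

// Assumed tolerance: accept up to a 2-point quality drop.
const decision = chooseModel(gpt4, gpt35, 2);
console.log(decision); // { model: 'GPT-3.5-turbo', savingsPercent: 90 }
```

The 90% figure applies to questions routed to the cheaper model; because complex questions still use GPT-4 in Week 4, the blended saving would be somewhat lower.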
What This Proves
| SDK Principle | Demonstrated How |
|---|---|
| Smart defaults | SDK learns usage patterns |
| Cost optimization | 90% reduction without developer action |
| Adaptive infrastructure | Model selection improves over time |
| Developer focus | Write business logic, SDK handles efficiency |
Use Case Matrix: SDK Features Demonstrated
| Use Case | Infrastructure-Invisible | Human-in-Loop | Multi-Agent | Cost Opt | Time-Travel | Capability Routing |
|---|---|---|---|---|---|---|
| Invoice Processing | ✅ | ✅ | ⚠️ (muscles) | ✅ | ✅ | ⚠️ |
| Contract Review | ✅ | ✅ | ❌ | ✅ | ✅ | ✅ |
| Medical Triage | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Time-Travel Debugging | ✅ | ❌ | ❌ | ❌ | ✅ | ❌ |
| Cost Optimization | ✅ | ❌ | ❌ | ✅ | ❌ | ❌ |
Coverage: All core SDK principles demonstrated across 5 use cases.
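The coverage claim can be checked mechanically. The sketch below transcribes the matrix into data and lists any feature column that no use case fully demonstrates. The names FEATURES, MATRIX, and uncoveredFeatures are hypothetical, and treating ⚠️ as "partial" (not counting toward coverage) is an assumption.

```typescript
// Feature columns, in the same order as the matrix above.
const FEATURES = [
  "infrastructureInvisible",
  "humanInLoop",
  "multiAgent",
  "costOpt",
  "timeTravel",
  "capabilityRouting",
] as const;
type Feature = (typeof FEATURES)[number];

// "yes" = ✅, "partial" = ⚠️, "no" = ❌
type Mark = "yes" | "partial" | "no";

// Rows transcribed from the matrix; marks are listed in column order.
const MATRIX: Record<string, readonly Mark[]> = {
  "Invoice Processing": ["yes", "yes", "partial", "yes", "yes", "partial"],
  "Contract Review": ["yes", "yes", "no", "yes", "yes", "yes"],
  "Medical Triage": ["yes", "yes", "yes", "yes", "yes", "yes"],
  "Time-Travel Debugging": ["yes", "no", "no", "no", "yes", "no"],
  "Cost Optimization": ["yes", "no", "no", "yes", "no", "no"],
};

// Returns the features that no use case marks as fully demonstrated.
function uncoveredFeatures(matrix: Record<string, readonly Mark[]>): Feature[] {
  return FEATURES.filter(
    (_, col) => !Object.values(matrix).some((marks) => marks[col] === "yes")
  );
}

console.log(uncoveredFeatures(MATRIX)); // []
```

An empty result matches the coverage statement. Note that Medical Triage is the only use case that fully demonstrates multi-agent orchestration, so removing it would leave that column uncovered.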
Validation Checklist
Before shipping any SDK feature, validate against these use cases:
- Can invoice processing use it? (general purpose validation)
- Can contract review use it? (legal domain validation)
- Can medical triage use it? (healthcare + multi-agent validation)
- Does it help debugging? (DX validation)
- Does it reduce cost? (efficiency validation)
If the answer is "no" to all → the feature may not be necessary.
Developer Journey Validation
| Time | Milestone | Use Case That Proves It |
|---|---|---|
| 5 min | First agent deployed | Echo agent (not shown here; see KB 106) |
| 15 min | Add database | Invoice processing (simplified) |
| 30 min | Add human approval | Invoice processing (complete) |
| 1 hour | Multi-step workflow | Contract review |
| 2 hours | Multi-agent orchestration | Medical triage |
| 1 week | Production-ready | All 5 use cases + monitoring |
Next: Full Use Case Library
For complete, copy-paste-ready examples:
- See: kb/106_use_case_library.md - 20+ use cases with full code
- See: human-labs/quickstart repo (planned) - Runnable examples with tests
Metadata
File: 105_agent_sdk_architecture.md
Created: November 25, 2025
Version: 1.5 (December 19, 2025 - Added: Agent Discovery & Capability Management, Auto-Generated Agent Documentation, IDE Integration patterns with MCP server, vault-based agent storage, Cursor agent-aware development)
Status: Canonical
Classification: Internal (SDK will be Open Source)
Cross-References:
- See: 10_ai_internal_use_and_companion_spec.md - Companion as reference implementation
- See: 20_passport_identity_layer.md - Delegation model, credential scoping, identity hierarchy
- See: 22_humanos_orchestration_core.md - Agent-to-agent delegation chains, provenance
- See: 104_companion_meeting_muscles_spec.md - Meeting muscles extracted to SDK
- See: 111_consumer_companion_and_agent_store.md - Agent Store marketplace and consumer distribution
- See: 112_extension_connector_gtm_roadmap.md - Complete SDK specifications, 25+ connector examples, developer onboarding flows, CLI tool documentation, and GTM strategy
- See: 95_open_source_strategy_and_licensing.md - SDK licensing (Apache 2.0)
- See: 43_haio_developer_architecture.md - HAIO protocol for developers
- See: 50_human_agent_design.md - Agent design principles
- See: 130_agent_design_patterns.md - Orchestration and trust patterns
Key Sections Added (December 2025):
- CORE SDK PRIMITIVES - Unified ctx.* API pattern, handler() wrapper
- AGENT-TO-AGENT DELEGATION MODEL - Chained delegation with auto-scoping
- CREDENTIAL MANAGEMENT - Passport > Vault > Env cascade with handler scoping
- LLM COST CONTROLS - Tiered thresholds with human escalation
- TESTING PATTERNS - Semantic assertions, golden outputs, deterministic fixtures
- TIME-TRAVEL DEBUGGING - Metadata vs full capture, replay system
- PROMPT VERSIONING - Repo as source, registry as runtime, A/B testing
- INFRASTRUCTURE PROVISIONING - Declarative YAML for DB, cache, storage, queue, vector
- MULTI-LANGUAGE SDK GENERATION - Protocol Buffers + OpenAPI → TypeScript, Python, Go, Rust
- AGENT-READABLE DOCUMENTATION - OpenAPI, llms.txt, context.json formats
For detailed implementation:
- Connector SDK Architecture: Full BaseConnector class, OAuthHelper, governance integration patterns - see 112_extension_connector_gtm_roadmap.md lines 1772-2070
- 25+ Production Connector Examples: Google Calendar, Gmail, Salesforce, AWS Bedrock, Epic EHR, Bloomberg, and more with complete TypeScript interfaces and use cases
- CLI Tool Specification: human init, human dev, human test, human publish with AI-assisted development features
- Developer Onboarding Flows: 3-5 day path from discovery to first connector published
Next Review: Monthly during development