The Blueprint for Building HAIO-Compliant Agents

Version: 1.0
Status: Canonical
Priority: Must-Ship Year 1
Classification: Internal (SDK will be Open Source)
Related Must-Ships: Developer Experience (43_haio_developer_architecture.md), Meeting Muscles v0.2 (104_companion_meeting_muscles_spec.md)

CORE PHILOSOPHY: INFRASTRUCTURE IS INVISIBLE

Developers write business logic. HUMAN owns everything else.

This is not a convenience — it's a requirement for global scale (P13), security by default, and exquisite developer experience (P10).

The Principle

┌─────────────────────────────────────────────────────────────────┐
│                  WHAT DEVELOPERS OWN                            │
│                                                                 │
│   ┌─────────────────────────────────────────────────────────┐  │
│   │                   BUSINESS LOGIC                         │  │
│   │                                                          │  │
│   │   • What the agent does                                  │  │
│   │   • Domain rules                                         │  │
│   │   • Input/output contracts                               │  │
│   │   • Prompts and reasoning                                │  │
│   │                                                          │  │
│   └─────────────────────────────────────────────────────────┘  │
│                                                                 │
├─────────────────────────────────────────────────────────────────┤
│                  WHAT HUMAN OWNS                                │
│                                                                 │
│   ┌─────────────────────────────────────────────────────────┐  │
│   │                  EVERYTHING ELSE                         │  │
│   │                                                          │  │
│   │   • Scaling (serverless, auto, predictive)              │  │
│   │   • Security (zero-trust, encrypted, audited)           │  │
│   │   • Storage (vaults, databases, caching)                │  │
│   │   • Networking (routing, load balancing, edge)          │  │
│   │   • Observability (metrics, logs, traces)               │  │
│   │   • Cost optimization (model selection, right-sizing)   │  │
│   │   • Compliance (retention, PII, audit trails)           │  │
│   │   • Deployment (CI/CD, preview, rollback)               │  │
│   │   • Identity (Passport, delegation, verification)       │  │
│   │   • Provenance (logging, signing, attestation)          │  │
│   │   • Reasoning (AI model routing, provider abstraction)  │  │
│   │                                                          │  │
│   └─────────────────────────────────────────────────────────┘  │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Why This Matters

Without This	With This
Developers configure scaling thresholds	HUMAN scales automatically
Developers manage secrets rotation	HUMAN rotates secrets
Developers set up monitoring	HUMAN monitors everything
Developers estimate database sizes	HUMAN right-sizes infrastructure
Developers debug distributed systems	HUMAN provides time-travel debugging
Developers think about cold starts	HUMAN pre-warms intelligently
Developers worry about security	HUMAN is secure by default

Design Principles

Declare outcomes, not mechanisms
- ❌ scale_threshold: 10
- ✅ slo: { latency: { p99: 500ms } }
Secure by default, not opt-in
- ❌ encryption: true
- ✅ Everything encrypted. Always.
Scale-to-zero by default
- ❌ min_instances: 2
- ✅ Serverless. Pay for what you use.
Smart defaults that learn
- ❌ Static thresholds
- ✅ Adaptive based on observed behavior
Infrastructure appears when needed
- ❌ database: { type: postgres, size: small }
- ✅ Use ctx.db → infrastructure auto-provisions
Progressive permission acquisition
- ❌ Declare all scopes upfront
- ✅ Request scopes when needed

The Minimal Manifest

Most agents need only this:

name: invoice-processor
version: 1.0.0
capabilities: [finance/invoice/process]

Everything else has smart defaults. Optional overrides only when needed:

# Only if you have specific requirements
slo:
  latency:
    p99: 200ms      # Stricter than default

budget:
  daily: $50        # Cost cap

OVERVIEW

The HUMAN Agent SDK is the developer toolkit for building agents that operate within the HAIO protocol. It extracts the patterns from the HUMAN Companion into a reusable framework that any developer can use to create identity-aware, capability-verified, safety-bounded agents.

Strategic Importance: The Agent SDK is not a "nice to have" — it's the primary mechanism for growing the HAIO ecosystem. Every agent built on the SDK strengthens the protocol's network effects.

Core Philosophy: Developers write business logic. HUMAN owns everything else. Infrastructure is invisible.

STRATEGIC RATIONALE

Why an Agent SDK?

Ecosystem Growth: Developers build agents → agents need identity/capability → HAIO adoption grows
Protocol Validation: Third-party agents prove HAIO works beyond HUMAN's own products
Network Effects: More agents → more Workforce Cloud demand → more humans trained → stronger protocol
Revenue Path: SDK is free; infrastructure/services are paid

The Companion as Blueprint

The HUMAN Companion is the reference implementation:

Demonstrates all SDK patterns in production
Proves the architecture works at scale
Provides code that can be extracted into SDK

HUMAN Companion (Production)
         ↓
    Pattern Extraction
         ↓
    @human/agent-sdk (Open Source)
         ↓
    Third-Party Agents (Ecosystem)

The Human App Store

The Agent SDK enables HUMAN to become the App Store for human-AI collaboration - a vibrant marketplace where developers build and monetize across all five pillars.

Companion Pillar (Agent SDK focus):

Capabilities (what AI can do: contract review, sentiment analysis, medical triage)
Connectors (data sources: Salesforce, Epic EHR, Stripe, SAP)
Extensions (UI enhancements: Gmail sidebar, Calendar overlay, Slack bot)
Complete Agents (packaged capability + connectors + UX)

Other Pillars (additional SDKs):

Passport Apps - Credential issuers, identity verifiers, vault providers, auth methods
Capability Graph Apps - Evidence providers, assessment tools, capability readers
Academy Apps - Course publishers, training modules, simulations, certifications
Workforce Cloud Apps - Task publishers, workflow integrators, review interfaces, QA tools

The Vision:

Year 1: HUMAN builds 24 institutional agents (seed catalog - Companion pillar only) → $7.6M ARR
Year 2: Human App Store expands (Companion + Workforce Cloud + Academy open) → $50M GMV
Year 3: Full five-pillar ecosystem (all pillars open, 90% third-party) → $850M+ GMV

For Consumers:

Browse Human App Store across all five pillars
One-click install: Companion capabilities, Academy courses, Workforce Cloud tasks, Passport tools, Capability Graph assessments
Access through appropriate interfaces (Companion for AI, Academy for learning, etc.)
Free tier + paid tier (70% to developer, 30% to HUMAN)

For Developers:

Build using Agent SDK (free & open source)
Publish to marketplace (free or paid)
70/30 revenue share (developer/HUMAN)
HUMAN handles: payments, certification, distribution, support
Potential to earn $100k+/year from popular capabilities

For Enterprises:

Install Human App Store apps across all pillars for teams
Build internal-only apps with same SDKs (Companion agents, Academy training, Workforce workflows)
Or monetize proprietary apps externally in Human App Store
Volume licensing and enterprise deployment options

The Network Effects Flywheel:

More Capabilities Built (SDK) 
    ↓
More Organizations Install HUMAN
    ↓
More Developers See Opportunity
    ↓
More Capabilities Built
    ↓
Better Coverage of Enterprise Needs
    ↓
More Organizations Install HUMAN
    (Flywheel accelerates)

Strategic Benefits:

Network effects: Each app across five pillars increases platform value
Revenue: 30% of Human App Store GMV (projected $255M+ by Year 3 across all pillars)
Ecosystem lock-in: Organizations invest in app collections across Passport, Graph, Academy, Workforce, Companion
Category leadership: Become the standard for human-AI collaboration across the full stack
Validation: Third-party success proves HAIO architecture at scale

Certification Process: Every marketplace submission undergoes:

Automated security scan
Permission audit
HAIO compliance check
Manual review for paid items
Ongoing monitoring

For complete Agent Store strategy, see: 111_consumer_companion_and_agent_store.md

MIGRATION & INTEROPERABILITY

Easy migration from existing agent frameworks is a core GTM lever. The SDK provides patterns for wrapping external executors while adding HUMAN's trust layer.

The Migration Philosophy

Import in 5 minutes — their workflow, your governance
Run hybrid for 30 days — their executor, your wrapper
Migrate to native when ready — optional, but better UX

Developers shouldn't have to rewrite their workflows. They should get governance for free.

The Muscle Adapter Pattern

Key insight: Wrap their executor, govern with HUMAN.

A "muscle adapter" wraps external execution engines (n8n, LangChain, CrewAI, etc.) while routing all calls through MARA's policy engine.

import { muscle } from '@human/agent-sdk';
import { N8nClient } from '@human/muscles-n8n';  // Framework adapter

export const legacyInvoiceProcessor = muscle({
  id: 'legacy_invoice',
  
  // Their workflow still executes
  executor: N8nClient.fromWorkflow('invoice-processor-v3'),
  
  // HUMAN adds governance
  governance: {
    approval: { 
      threshold: '$1000', 
      requires: 'ctx.oversight.approve' 
    },
    audit: 'full',           // Every execution gets provenance
    delegation: true,        // Passport-scoped
  },
});

// Usage: governed external execution
await ctx.call.muscle('legacy_invoice', { invoice });

How Muscle Adapters Work

┌─────────────────────────────────────────────────────────────────┐
│                     ctx.call.muscle()                            │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│   ┌───────────────────────────────────────────────────────┐    │
│   │                 MARA Policy Engine                     │    │
│   │  • Validates delegation (P1)                          │    │
│   │  • Classifies risk (P5)                               │    │
│   │  • Routes to approval if needed (P5)                  │    │
│   │  • Pre-persists execution record (P7)                 │    │
│   └───────────────────────┬───────────────────────────────┘    │
│                           │                                     │
│                           ▼                                     │
│   ┌───────────────────────────────────────────────────────┐    │
│   │                Muscle Adapter                          │    │
│   │  • Translates ctx context to external format          │    │
│   │  • Injects X-Human-* headers where possible           │    │
│   │  • Captures input hash for attestation                │    │
│   └───────────────────────┬───────────────────────────────┘    │
│                           │                                     │
│               ════════════╪════════════                         │
│                    TRUST BOUNDARY                               │
│               ════════════╪════════════                         │
│                           ▼                                     │
│   ┌───────────────────────────────────────────────────────┐    │
│   │              External Executor                         │    │
│   │  (n8n, LangChain, CrewAI, Zapier, etc.)              │    │
│   │  ⚠️ Internal execution not cryptographically verified │    │
│   └───────────────────────┬───────────────────────────────┘    │
│                           │                                     │
│               ════════════╪════════════                         │
│                    TRUST BOUNDARY                               │
│               ════════════╪════════════                         │
│                           ▼                                     │
│   ┌───────────────────────────────────────────────────────┐    │
│   │               Egress Processing                        │    │
│   │  • Captures output hash for attestation               │    │
│   │  • Generates HUMAN attestation (gateway-level)        │    │
│   │  • Logs provenance                                    │    │
│   └───────────────────────────────────────────────────────┘    │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
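
The ordering in the diagram can be sketched in TypeScript. This is a minimal illustration under stated assumptions, not the SDK's implementation: governedMuscleCall, GovernanceDeps, ExecutionRecord, and the "approval_rejected" error string are hypothetical names, and the policy checks are injected as plain functions so the control flow stays visible.

```typescript
// Hypothetical sketch: none of these names come from @human/agent-sdk.
type Risk = "low" | "medium" | "high";

interface GovernanceDeps {
  validateDelegation: (token: string) => boolean;        // ingress delegation check
  classifyRisk: (input: unknown) => Risk;                // policy engine
  requestApproval: (input: unknown) => Promise<boolean>; // human oversight
  executor: (input: unknown) => Promise<unknown>;        // opaque external system
}

interface ExecutionRecord {
  status: "pending" | "completed" | "failed";
  risk: Risk;
  steps: string[];
}

async function governedMuscleCall(
  deps: GovernanceDeps,
  delegationToken: string,
  input: unknown,
): Promise<{ output: unknown; record: ExecutionRecord }> {
  // 1. Validate delegation before anything crosses the trust boundary.
  if (!deps.validateDelegation(delegationToken)) {
    throw new Error("delegation_invalid");
  }

  // 2. Classify risk; in this sketch only high-risk calls wait for approval.
  const risk = deps.classifyRisk(input);
  if (risk === "high" && !(await deps.requestApproval(input))) {
    throw new Error("approval_rejected");
  }

  // 3. Pre-persist the record so a crash mid-call still leaves a trace.
  const record: ExecutionRecord = { status: "pending", risk, steps: ["pre-persisted"] };

  try {
    // 4. Cross the trust boundary: the external executor runs unverified.
    const output = await deps.executor(input);

    // 5. Egress: mark completion (hashing and attestation omitted here).
    record.steps.push("executed", "egress-logged");
    record.status = "completed";
    return { output, record };
  } catch (error) {
    record.status = "failed";
    throw error;
  }
}
```

Every check that can fail runs before the executor is invoked, which matches the CAN/CANNOT split: revocation and approval are enforceable only up to the moment the call is forwarded.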

What Muscle Adapters CAN Do

Capability	How
Validate delegation at boundary	MARA ingress
Classify risk at boundary	MARA policy engine
Request approval before forwarding	ctx.oversight integration
Log provenance at boundaries	Input/output hash attestation
Revoke before external call starts	Delegation check

What Muscle Adapters CANNOT Do

Capability	Why Not
Enforce delegation scope within external system	External system ignores HUMAN delegation
Guarantee device-first operation	External system may require its control plane
Pause mid-execution for approval	Most external systems don't support it
Verify what happened internally	External execution is opaque
Revoke in-flight external executions	Most external systems lack revocation

Attestation Model for Muscles

Because we can't verify internal execution, muscle attestations are explicitly marked:

{
  type: 'muscle_execution',
  attestationLevel: 'gateway',  // Not 'full'
  
  // What we CAN attest
  inputHash: 'sha256:...',
  outputHash: 'sha256:...',
  delegation: { chain: ['user → agent → muscle'] },
  timestamp: '2025-12-17T10:00:00Z',
  policyApplied: 'risk-high-approval-required',
  
  // Explicit limitation
  limitation: 'Internal execution in external system not cryptographically verified',
  externalSystem: 'n8n',
  externalWorkflowId: 'invoice-processor-v3',
}
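
As a rough illustration of how the attestable fields could be produced at the gateway, the sketch below hashes the input and output with SHA-256 using Node's built-in crypto module. The helper names (sha256Tagged, buildGatewayAttestation) are hypothetical, the payloads are assumed to be JSON-serializable, and only a subset of the fields from the example above is populated.

```typescript
import { createHash } from "node:crypto";

// Hypothetical helper names; field names mirror the attestation example above.
interface GatewayAttestation {
  type: "muscle_execution";
  attestationLevel: "gateway";
  inputHash: string;
  outputHash: string;
  timestamp: string;
  limitation: string;
  externalSystem: string;
  externalWorkflowId: string;
}

// Assumption: payloads are JSON-serializable. JSON.stringify is not a canonical
// encoding (key order affects the hash), so a production system would need a
// canonical serialization before hashing.
function sha256Tagged(payload: unknown): string {
  const serialized = JSON.stringify(payload) ?? "null";
  return "sha256:" + createHash("sha256").update(serialized, "utf8").digest("hex");
}

function buildGatewayAttestation(
  input: unknown,
  output: unknown,
  externalSystem: string,
  externalWorkflowId: string,
  now: Date = new Date(),
): GatewayAttestation {
  return {
    type: "muscle_execution",
    attestationLevel: "gateway", // never 'full' for external executors
    inputHash: sha256Tagged(input),
    outputHash: sha256Tagged(output),
    timestamp: now.toISOString(),
    limitation: "Internal execution in external system not cryptographically verified",
    externalSystem,
    externalWorkflowId,
  };
}

const example = buildGatewayAttestation(
  { invoiceId: "inv-1" },
  { status: "processed" },
  "n8n",
  "invoice-processor-v3",
);
console.log(example.attestationLevel, example.inputHash.length);
```

The hashes bind the record to what entered and left the gateway; they say nothing about what the external system did in between, which is why the limitation field is carried on every record.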

Planned Framework Adapters

Framework	Package	Status	Notes
n8n	`@human/muscles-n8n`	Planned	Workflow JSON import
LangChain	`@human/muscles-langchain`	Planned	Chain wrapper
CrewAI	`@human/muscles-crewai`	Planned	Crew wrapper
AutoGen	`@human/muscles-autogen`	Planned	Agent wrapper
Zapier	`@human/muscles-zapier`	Planned	Webhook bridge

CLI Import Commands (Planned)

# Import n8n workflow → generates HUMAN agent with muscle adapter
human-agent import n8n workflow.json --name invoice-processor

# Import LangChain agent → generates wrapped agent
human-agent import langchain agent.py --preserve-prompts

# Import CrewAI crew → generates agent suite with muscles
human-agent import crewai crew.py --name research-team

What import does:

Parses workflow/agent definition
Generates HUMAN agent scaffold
Wires external steps as muscle calls
Auto-detects human intervention points → ctx.oversight
Preserves prompts as ctx.prompts.load()

AgentField Interoperability Note

AgentField's trust model (control-plane-first) is fundamentally incompatible with HUMAN's (device-first). A full adapter cannot enforce P1 (Sovereignty) or P4 (Distributed).

Recommendation: Gateway pattern only, with explicit limitation disclosure.

See setup/agentfield_adapter_spec_v0.1.md for detailed analysis.

Migration Path to Native

The goal is Tier 0 (HUMAN-native). Muscle adapters provide a bridge:

1. Import existing workflow → muscle adapter wraps it
2. Run with HUMAN governance for 30 days
3. Identify highest-value steps
4. Rewrite those steps native (ctx.llm, ctx.call, etc.)
5. Eventually: fully native, no muscle dependencies

Incentive: Native agents get full attestation, better UX, marketplace eligibility, and a "HUMAN Certified" badge.

For complete migration strategy, see: 107_developer_adoption_playbook.md

OPEN SOURCE STRATEGY

Licensing Model

Component	License	Rationale
`@human/agent-sdk`	Apache 2.0	Core framework, maximum adoption
`@human/agent-muscles`	MIT	Muscle implementations, permissive
`@human/companion-framework`	Apache 2.0	Reference implementation
Example agents	MIT	Educational, forkable
HUMAN Companion config	Proprietary	Our tuning, personality, enterprise features

What's Open

Framework (Apache 2.0):

Agent base classes and interfaces
Muscle interface specifications
Memory/Vault binding patterns
Safety boundary enforcement
Audit logging infrastructure
Multi-agent coordination patterns

Muscles (MIT):

Calendar muscle reference implementation
Video conference abstraction layer
Notes/summarization patterns
Generic task routing
Notification patterns

Examples (MIT):

Meeting facilitator agent (simplified)
Document reviewer agent
Task coordinator agent
Research assistant agent

What's Proprietary

HUMAN Companion system prompts
HUMAN-specific personality tuning
Enterprise-grade muscle implementations
Production Recall.ai integration
Advanced facilitation logic
Internal HUMAN workflows

DEVELOPER-FIRST EXPERIENCE

Building multi-agent systems should feel as easy as building microservices—but with trust built in.

The 10-Minute Agent

From idea to deployed agent in under 10 minutes:

# 1. Initialize (30 seconds)
human-agent init invoice-processor
cd invoice-processor

# 2. Customize handler (5 minutes)
code src/handlers/process.ts

# 3. Test locally (2 minutes)
human-agent dev

# 4. Deploy (1 minute)
human-agent deploy

What you get immediately:

Ed25519 keypair for agent identity
Delegation support with mock Passport
Dev tools (DAG visualizer, approval tester)
Production-ready manifest

CLI Commands

Command	What It Does
`human-agent init <name>`	Scaffold agent with identity, delegation, manifest
`human-agent init <name> --suite`	Scaffold multi-agent suite
`human-agent dev`	Hot reload + mock Passport + DAG visualizer + cost tracking
`human-agent dev --share`	Instant shareable tunnel URL
`human-agent test`	Run tests with semantic assertions + golden output comparison
`human-agent test --refresh-fixtures`	Re-record LLM fixtures for deterministic CI
`human-agent deploy`	One command to Workforce Cloud
`human-agent deploy --all`	Deploy entire suite
`human-agent clone <catalog-agent>`	Clone from sample catalog
`human-agent catalog list`	Browse available sample agents
`human-agent vault set KEY=value`	Store secret in agent vault
`human-agent vault list`	List secrets (masked)
`human-agent prompts list <id>`	List prompt versions
`human-agent prompts deploy <id>@v2`	Deploy specific prompt version
`human-agent prompts rollback <id>`	Rollback to previous version
`human-agent prompts test <id>@v2 --against v1`	A/B test prompt versions
`human-agent replay <exec-id>`	Replay execution for debugging
`human-agent golden approve <id>`	Approve golden output for tests
`human-agent docs <topic>`	Show docs for SDK topic (e.g., `ctx.llm`)
`human-agent ctx`	Show all available ctx.* resources

Agent Suites (Monorepo for Agents)

For multi-agent systems, use suites:

human-agent init research-suite --suite

Generated structure:

research-suite/
├── human-suite.yaml         # Suite manifest
├── agents/
│   ├── planner/
│   │   ├── human-agent.yaml
│   │   └── src/handlers/
│   ├── searcher/
│   │   ├── human-agent.yaml
│   │   └── src/handlers/
│   └── synthesizer/
│       ├── human-agent.yaml
│       └── src/handlers/
├── shared/                   # Shared types, utilities
│   └── types.ts
└── tests/
    └── integration.test.ts

Deploy entire suite:

human-agent deploy --all  # All agents, one command

Deployment Profiles: Roll Your Own or Just Works

HUMAN supports three deployment profiles to fit every operational model—from rapid prototyping to air-gapped data centers:

Profile	Best For	Setup Time	Monitoring	Data Sovereignty
Hosted	Startups, rapid deployment	5 minutes	Fully managed	HUMAN Cloud
Hybrid	Enterprises, data residency	1-2 days	Managed or self-hosted	Customer VPC
Self-Hosted	Regulated industries, full control	1-2 weeks	Self-hosted	Customer infrastructure

Hosted Profile: Zero Config

For teams who want "it just works":

$ human-agent deploy  # That's it

  🚀 Deploying: invoice-processor
  
  Auto-configured:
    ✅ Organization: Acme Corp (from your Passport)
    ✅ Agent DID: did:human:agent:acme:invoice-processor
    ✅ Monitoring: https://dashboard.human.cloud
    ✅ Audit logs: Enabled
  
  ✅ Deployed! (38 seconds)

HUMAN manages:

Infrastructure (Kubernetes, databases, object storage)
Monitoring (Prometheus, Grafana, alerts)
Security (TLS, backups, disaster recovery)
Compliance (SOC 2, GDPR, audit trails)

You control:

Agent code
Risk policies
Approval workflows
Cost visibility

Best for: Teams <200, focus on product not infrastructure

Hybrid Profile: Data Stays in Your VPC

For regulated industries needing data locality:

# Infrastructure team sets up once
$ human-agent hybrid setup
  🔐 Installing secure tunnel agent in your VPC...
  ✅ Connected to HUMAN Cloud control plane
  ✅ Monitoring configured (choose: push or self-hosted)

# Developers deploy normally
$ human-agent deploy --profile hybrid
  🚀 Deploying to: Customer VPC (us-west-2)
  📊 Dashboard: https://dashboard.human.cloud
  🔐 Data location: Your PostgreSQL

Architecture:

Control plane: HUMAN Cloud (managed)
Agent execution: Customer VPC
Data storage: Customer infrastructure
Monitoring: Push to HUMAN Cloud OR self-hosted Prometheus

Data residency:

✅ Stays in your VPC: Execution data, agent memory, audit logs, LLM prompts
✅ HUMAN Cloud only stores: Agent metadata, execution status (no payload data)

Secure tunnel:

Customer-initiated connection (no inbound firewall rules)
mTLS with certificate pinning
Instant revocation capability

Best for: HIPAA/GDPR requirements, on-prem system integration

Self-Hosted Profile: Full Control

For air-gapped environments and maximum control:

# Install HUMAN control plane in your infrastructure (one-time)
$ helm install human-control-plane human/control-plane \
  --namespace human \
  --values values.yaml

# Configure CLI
$ human-agent config set control-plane https://api.human.acme.internal

# Deploy agents
$ human-agent deploy
  🚀 Deploying to: Self-hosted control plane
  📊 Dashboard: https://dashboard.human.acme.internal
  🔐 All data: Your infrastructure

You manage:

Control plane (Helm chart)
Agent runtime (Kubernetes)
Databases (PostgreSQL)
Monitoring (Prometheus/Grafana)
Ledger nodes (attestation storage)

HUMAN provides:

Helm charts and Terraform modules
Reference Prometheus/Grafana configs
Migration tools
Optional support contracts

Air-gapped support:

Internal image registry
No external connectivity required
Manual updates via tarball

Best for: FedRAMP, DoD, financial institutions, on-premises requirements

Choosing a Deployment Profile

Start with Hosted, migrate later:

# Start with Hosted (zero config)
$ human-agent deploy

# Migrate to Hybrid when you need data sovereignty
$ human-agent migrate --to hybrid
  📤 Exporting data from HUMAN Cloud...
  📥 Importing to your VPC...
  ✅ Migration complete (zero downtime)

# Migrate to Self-Hosted for full control
$ human-agent migrate --to selfhosted \
  --control-plane https://api.acme.internal
  ✅ Control plane switched

Decision matrix:

Requirement	Profile
Fastest deployment	Hosted
Data must stay in EU/US/region	Hybrid or Self-Hosted
HIPAA/GDPR compliance	Hybrid or Self-Hosted
Air-gapped network	Self-Hosted
<200 team members	Hosted
Custom infrastructure	Self-Hosted
Want managed monitoring	Hosted or Hybrid (push mode)
Full Prometheus control	Hybrid (scrape) or Self-Hosted

For detailed configuration:

Hosted setup: See KB 108, Setup: setup/agent_deployment_hosted_spec.md
Hybrid setup: See KB 108, Setup: setup/agent_deployment_hybrid_spec.md
Self-Hosted setup: See KB 108, Setup: setup/agent_deployment_selfhosted_spec.md
Monitoring configs: Setup: setup/monitoring_configurations.md

The `human.call()` Primitive

human.call() is the universal invocation primitive in HumanOS.

It invokes a capability (not a specific model, agent, or human) under explicit delegation, risk, and policy constraints, producing a verifiable execution record (provenance + attestation) by default.

Invariants:

Delegation validated (scope, expiry, revocation)

Risk evaluated against policy

Execution recorded (pre-persist + completion attestation)

Routed capability-first (humans/agents/models chosen by capability, then cost/constraints)

Human override is always available (escalate/defer/refuse as allowed)

Every agent-to-agent call uses the unified human.call() primitive:

import { human } from '@human/agent-sdk';

// Simple call
const result = await human.call({
  target: 'agent://invoice.parser.parse',
  input: { documentId: 'doc-123' },
});

// With delegation control
const transferResult = await human.call({
  target: 'agent://payments.transfer',
  input: { amount: 5000, to: 'vendor-456' },
  delegation: passport.delegate({ 
    scope: ['write:payments'],
    expires: '1h',
  }),
  risk: 'high',  // Triggers approval
});

// Parallel calls
const results = await Promise.all([
  human.call({ target: 'agent://suite.validator.check', input }),
  human.call({ target: 'agent://suite.scanner.scan', input }),
  human.call({ target: 'agent://suite.classifier.classify', input }),
]);

// Async with callback
const executionId = await human.call({
  target: 'agent://research.deep_analysis',
  input: { topic: 'quantum computing' },
  async: true,
  callback: 'https://my-app.com/webhooks/research',
});

Parameter Semantics

target (direct) vs capability (indirect)

target: Direct agent/resource identifier (e.g., agent://invoice.parser.parse)
capability: HumanOS discovers capable resources automatically (preferred for flexibility)

input and schema expectations

Input must match the target's expected schema (validated before execution)
Schema defined in agent manifest or capability registration

delegation (required for sensitive actions)

Explicit delegation object with scope, expiry, and revocation status
Required for actions that modify state or access sensitive data
Validated before execution; the call fails if the delegation is invalid or expired

risk levels and default policy behavior

low: No approval required, standard logging
medium: May require approval based on policy
high: Requires human approval before execution
critical: Multi-human approval required

async + callback

async: true: Returns execution ID immediately, result delivered via callback
callback: Webhook URL or event handler to receive completion notification
Use for long-running operations or fire-and-forget patterns

idempotencyKey (recommended)

Prevents duplicate execution of the same operation
If a call with the same key already exists, returns the existing result instead of re-executing
Critical for retry-safe operations

traceparent / correlation (optional but recommended)

W3C Trace Context header for distributed tracing
Links calls across service boundaries for observability
Format: 00-<trace-id>-<parent-id>-<trace-flags>

policyContext (jurisdiction/domain/user prefs)

Additional context for policy evaluation (jurisdiction, domain, user preferences)
Enables fine-grained policy decisions beyond basic risk levels
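
Two of the parameters above lend themselves to a short sketch: building a traceparent value in the stated format, and the caller-visible effect of idempotencyKey. Both helpers are hypothetical and run entirely in memory. The field widths (32 hex characters for the trace ID, 16 for the parent ID, flags 01 for sampled) follow the W3C Trace Context specification; how HumanOS actually stores idempotent results is not specified here, so the Map-based cache is only an assumption for illustration.

```typescript
import { randomBytes } from "node:crypto";

// Builds a value in the form 00-<trace-id>-<parent-id>-<trace-flags>.
function newTraceparent(sampled: boolean = true): string {
  const traceId = randomBytes(16).toString("hex"); // 32 hex characters
  const parentId = randomBytes(8).toString("hex"); // 16 hex characters
  return `00-${traceId}-${parentId}-${sampled ? "01" : "00"}`;
}

// Hypothetical in-memory model of idempotencyKey semantics:
// the same key returns the existing result instead of re-executing.
class IdempotentCaller<T> {
  private readonly results = new Map<string, Promise<T>>();

  run(key: string, operation: () => Promise<T>): Promise<T> {
    const existing = this.results.get(key);
    if (existing !== undefined) {
      return existing;
    }
    const pending = operation();
    this.results.set(key, pending);
    // Design choice in this sketch: failed attempts are forgotten so a retry
    // with the same key can execute again.
    void pending.catch(() => {
      this.results.delete(key);
    });
    return pending;
  }
}

console.log(newTraceparent());
```

Storing the promise rather than the resolved value means two concurrent calls with the same key share one execution, which is the retry-safety property the parameter is meant to provide.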

Error Model

delegation_invalid

Delegation missing, expired, revoked, or insufficient scope
Resolution: Request new delegation with appropriate scope

policy_denied

Policy engine determined action is not allowed
Resolution: Review policy rules, escalate if needed

no_qualified_executor

No resource found with required capability
Resolution: Register capable resource or adjust capability requirements

executor_timeout

Executor did not respond within timeout window
Resolution: Retry with longer timeout or use async mode

requires_human_approval

Risk level or policy requires human approval before execution
Resolution: Human reviews and approves/rejects via Passport interface
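
One way a caller might act on these codes is an exhaustive mapping from error code to next step. The codes below are taken from the error model above; the Recovery shape, the recoveryFor function, and the retryable flags are assumptions made for this sketch rather than documented SDK behavior.

```typescript
// Error codes come from the error model above; everything else is hypothetical.
type HumanCallErrorCode =
  | "delegation_invalid"
  | "policy_denied"
  | "no_qualified_executor"
  | "executor_timeout"
  | "requires_human_approval";

interface Recovery {
  retryable: boolean; // assumption: whether an automatic retry makes sense
  action: string;     // the resolution text from the error model
}

function recoveryFor(code: HumanCallErrorCode): Recovery {
  switch (code) {
    case "delegation_invalid":
      return { retryable: false, action: "Request new delegation with appropriate scope" };
    case "policy_denied":
      return { retryable: false, action: "Review policy rules, escalate if needed" };
    case "no_qualified_executor":
      return { retryable: false, action: "Register capable resource or adjust capability requirements" };
    case "executor_timeout":
      return { retryable: true, action: "Retry with longer timeout or use async mode" };
    case "requires_human_approval":
      return { retryable: false, action: "Wait for human approval via Passport interface" };
    default: {
      // Compile-time exhaustiveness check: adding a code without a case fails to type-check.
      const unreachable: never = code;
      throw new Error(`Unknown error code: ${String(unreachable)}`);
    }
  }
}

console.log(recoveryFor("executor_timeout").action);
```

In this sketch only executor_timeout is treated as safe to retry automatically; the other four need a change in delegation, policy, registration, or a human decision first.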

Agent Discovery

Find agents by capability, not hardcoded IDs:

// Discover by capability
const agent = await human.discover({
  capability: 'document/invoice/parse',
  minConfidence: 0.9,
});

await human.call({
  target: agent.id,
  input: { document },
});
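
The selection rule implied by minConfidence can be modelled locally. The sketch below is not how human.discover() is implemented: the RegisteredAgent shape, the confidence field, and the "highest confidence wins" tie-break are all assumptions, chosen only to show capability-first lookup against an in-memory registry.

```typescript
// Hypothetical in-memory stand-in for capability-based discovery.
interface RegisteredAgent {
  id: string;
  capability: string;
  confidence: number; // assumed to be a score between 0 and 1
}

interface DiscoveryQuery {
  capability: string;
  minConfidence?: number;
}

function discoverLocal(
  registry: readonly RegisteredAgent[],
  query: DiscoveryQuery,
): RegisteredAgent | undefined {
  const minimum = query.minConfidence ?? 0;
  return registry
    .filter((agent) => agent.capability === query.capability && agent.confidence >= minimum)
    .sort((a, b) => b.confidence - a.confidence)[0]; // filter() copies, so the registry is not mutated
}

const registry: RegisteredAgent[] = [
  { id: "agent://invoice.parser.parse", capability: "document/invoice/parse", confidence: 0.95 },
  { id: "agent://legacy.parser.parse", capability: "document/invoice/parse", confidence: 0.7 },
];

console.log(discoverLocal(registry, { capability: "document/invoice/parse", minConfidence: 0.9 })?.id);
```

An undefined result here corresponds to the no_qualified_executor case in the error model: nothing registered meets the capability and confidence requirements.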

Manifest Format

Human-readable YAML configuration with all SDK features:

# human-agent.yaml
name: invoice-processor
version: 1.0.0
description: Parse, validate, and route invoices

# Capabilities (registered to Capability Graph)
capabilities:
  - capability: finance/invoice/process
    evidence:
      - type: test_coverage
        value: 95%

# What the agent needs permission to do
delegation:
  required_scope:
    - read:documents
    - write:accounting
  max_risk: high

# Handlers with explicit secret scoping
handlers:
  process:
    entrypoint: src/handlers/process.ts
    risk: medium
    secrets: [STRIPE_KEY]           # Only this handler can access
    passport_scopes: [salesforce.read]  # User credentials needed
  parse:
    entrypoint: src/handlers/parse.ts
    risk: low

# Credential management
secrets:
  agent:                            # Shared across all handlers
    - OPENAI_API_KEY
    - DATABASE_URL
  handlers:                         # Handler-specific (see above)
    process: [STRIPE_KEY]

# LLM configuration
llm:
  default_tier: balanced            # fast | balanced | powerful
  providers: [openai, anthropic]    # Preference order
  fallback_strategy: cascade        # Try next on rate limit

# Cost controls
cost_controls:
  daily_limit: 50.00
  dev:
    at_limit: warn_and_continue
  prod:
    thresholds:
      - percent: 80
        action: notify_developer
      - percent: 95
        action: escalate_to_passport
      - percent: 100
        action: hard_stop
    circuit_breaker:
      trigger: 5x_normal_rate
      action: pause_and_escalate

# Debugging configuration
debugging:
  default_retention: metadata_only
  full_capture:
    handlers: [parse]
    environments: [development, staging]
  retention:
    metadata: 90d
    full_data: 7d
  pii:
    mode: redact
    fields: [email, phone, ssn]

# Infrastructure provisioning (optional override; by default, using ctx.db auto-provisions)
infrastructure:
  database:
    type: postgres
    size: small
  cache:
    type: redis
  storage:
    type: s3
    bucket: invoices

# Preview deployments
preview:
  database: seeded
  seed_file: ./fixtures/seed.sql

Dev Mode Features

$ human-agent dev

🚀 Starting HUMAN Agent: invoice-processor
   
   📍 Local:    http://localhost:3001
   🔑 Agent ID: agent://invoice-processor.local
   🎭 Passport: Using mock Passport (dev mode)
   
   Handlers:
   • process → POST /handlers/process
   • parse   → POST /handlers/parse
   
   Dev Tools:
   • Delegation Tester: http://localhost:3001/__dev__/delegation
   • Approval Queue:    http://localhost:3001/__dev__/approvals
   • Execution DAG:     http://localhost:3001/__dev__/dag
   
   ⌨️  Press 'r' to reload | 'q' to quit

Dev tools included:

Mock Passport - Test delegation without real Passport
Delegation Tester - Create test tokens, simulate scope
Approval Queue - Test approval flows locally
DAG Visualizer - See execution tree in real-time

The Killer Feature Matrix

What Developers Get	Without HUMAN	With HUMAN SDK
Create agent	Write boilerplate	`human-agent init`
Agent-to-agent calls	Build routing	`human.call()`
Deploy	Docker + K8s config	`human-agent deploy`
Trust/delegation	Build from scratch	Built-in
Human approval	Build UI + queue	Passport notification
Audit trail	Build logging	Automatic attestations
Revocation	Build kill-switch	One Passport tap
Clone templates	Copy-paste	`human-agent clone`

Learning Path

Quick Start: Clone a sample agent → human-agent clone deep-research my-agent
Patterns: Study design patterns → KB 130
Examples: Browse sample catalog → KB 131
Build: Create your agent suite
Deploy: human-agent deploy

CORE SDK PRIMITIVES

The HUMAN Agent SDK is built around a simple, unified programming model: everything is accessible through ctx (the execution context). This design makes the SDK discoverable, consistent, and easy to use across all languages.

Critical design principle: ctx is also the audit boundary. Every ctx method is automatically instrumented for provenance. Developers cannot bypass ctx — the runtime is sandboxed.

The Execution Context Pattern

Every handler receives a ctx object that provides access to all HUMAN capabilities:

import { handler } from '@human/agent-sdk';

export const processInvoice = handler({
  id: 'process_invoice',
  capabilities: ['finance/invoice/process'],
  requires: { 
    scope: ['read:documents', 'write:accounting'],
    vaults: ['vault://*/finance'],
  },
  
  async execute(ctx, input: { documentId: string }) {
    // All resources accessible via ctx (all auto-logged for provenance)
    const doc = await ctx.vaults.get('vault://acme/finance').read(`/invoices/${input.documentId}`);
    const analysis = await ctx.llm.complete({ prompt: `Analyze: ${doc.content}` });
    await ctx.call.agent('agent://accounting.record', { invoice: doc, analysis });
    return analysis;
  }
});

The `ctx` API Reference

Resource	Purpose	Key Methods	HUMAN System
`ctx.passport`	Identity & delegation	`self`, `principal`, `hasScope()`, `delegate()`	Passport
`ctx.oversight`	Human-in-the-loop (P5)	`approve()`, `decide()`, `escalate()`, `notify()`	HumanOS
`ctx.call`	Universal capability-first routing	`agent()`, `route()`, `withDelegation()`	HumanOS
`ctx.capabilities`	Capability Graph queries	`find()`, `mine()`, `register()`, `evidence()`	Capability Graph
`ctx.workforce`	Human worker pool	`submit()`, `status()`, `await()`, `cancel()`	Workforce Cloud
`ctx.vaults`	Multi-vault storage	`list()`, `get()`, `self`	Passport (Vaults)
`ctx.memory`	Convenience over vaults	`execution`, `session`, `persistent`, `suite`	—
`ctx.llm`	LLM access with auto-routing	`complete()`, `stream()`, `embed()`, `cost`	—
`ctx.db`	Database access	`query()`, `insert()`, `update()`	—
`ctx.secrets`	Credential cascade	`get()`, `list()`	—
`ctx.events`	Provenance logging	`log()`, `startSpan()`, `query()`, `export()`	—
`ctx.files`	File storage	`read()`, `write()`, `list()`	—
`ctx.queue`	Background jobs	`enqueue()`, `schedule()`	—
`ctx.http`	HTTP client with retries	`get()`, `post()`, `request()`	—
`ctx.prompts`	Prompt loading	`load()`, `render()`	—

Key Distinctions

Concept	Resource	Description
Identity (static)	`ctx.passport`	Who am I? Who delegated? What's my scope?
Oversight (dynamic)	`ctx.oversight`	Request approval, decisions, escalation from oversight chain
Routing	`ctx.call`	Route to agents, humans, or models based on capability
Worker Pool	`ctx.workforce`	Submit tasks to HUMAN's human workforce (not the principal)
Capabilities	`ctx.capabilities`	Query/register what agents and humans can do
Storage	`ctx.vaults`	Access purpose-scoped vaults (finance, legal, HR, etc.)

Handler Definition

The handler() wrapper defines an entry point for agent functionality:

export const analyzeContract = handler({
  // Unique identifier
  id: 'analyze_contract',
  
  // Version for tracking
  version: '1.0.0',
  
  // Capabilities this handler provides (for routing)
  capabilities: ['legal/contract/risk_analysis'],
  
  // What delegation this handler requires
  requires: {
    scope: ['read:contracts'],
    riskLevel: 'medium',
  },
  
  // Secrets this handler can access (explicit declaration)
  secrets: ['LEGAL_API_KEY'],
  
  // User credentials needed (via Passport)
  passport_scopes: ['salesforce.read'],
  
  // The implementation
  async execute(ctx, input: { contract: string }) {
    const analysis = await ctx.llm.complete({
      prompt: buildPrompt(input.contract),
      tier: 'powerful',
    });
    return analysis;
  }
});

HUMAN-IN-THE-LOOP (ctx.oversight)

The ctx.oversight resource is the SDK surface for P5 (Human-in-the-Loop). It lets a handler interact with whoever is providing oversight for the current execution: typically the human who delegated, but possibly an organization's policy enforcement or another agent in the chain.

Why "oversight" not "human"?

Original Term	Problem	Revised Term
`ctx.human`	Ambiguous — could mean user, worker, principal	`ctx.oversight`

"Oversight" clearly conveys:

Approval, decisions, escalations
The entity providing accountability
Applicability regardless of whether the delegator is a human, org, or agent

ctx.oversight API

interface OversightContext {
  // Request approval (blocks until response or timeout)
  approve(options: {
    action: string;           // What we want to do
    reason: string;           // Why approval needed
    risk: RiskClass;          // How risky
    timeout?: number;         // Milliseconds before auto-reject
    alternatives?: string[];  // Alternative options to show
  }): Promise<ApprovalResult>;
  
  // Present decision (multiple choice)
  decide(options: {
    question: string;
    options: { id: string; label: string; description?: string }[];
    default?: string;
    timeout?: number;
  }): Promise<DecisionResult>;
  
  // Escalate to oversight (full handoff with rich context)
  escalate(options: EscalateOptions): Promise<EscalationResult>;
  
  // Notify without blocking
  notify(message: string, options?: {
    urgency?: 'info' | 'warning' | 'alert';
    channel?: 'passport' | 'email' | 'slack';
  }): Promise<void>;
  
  // Check oversight availability
  available(): Promise<{
    online: boolean;
    responseTimeEstimate?: number;
    preferredChannel?: string;
  }>;
}

// Rich escalation context
interface EscalateOptions {
  // Developer provides: structured handoff context
  why: {
    reason: string;           // Human-readable explanation
    category: EscalationCategory;  // capability_exceeded | uncertainty | policy | error | review
    urgency: 'low' | 'normal' | 'urgent' | 'critical';
  };
  
  findings?: {
    summary: string;
    details: Record<string, unknown>;
    confidence: number;
  };
  
  recommendation?: {
    action: string;
    reasoning: string;
    alternatives?: string[];
  };
  
  question?: string;         // What the agent needs answered
  
  handoff?: {
    resumeFrom: string;      // Where to resume if task returns
    state: Record<string, unknown>;  // Serialized state
    vault?: string;          // Vault ref for large state
  };
  
  attachments?: AttachmentRef[];
}

// SDK auto-captures (developers don't need to provide)
interface EscalationContext extends EscalateOptions {
  // Automatically populated by SDK:
  execution: {
    id: string;
    agentDid: string;
    handlerId: string;
    startedAt: Date;
    duration: number;
  };
  
  provenance: {
    steps: ProvenanceEvent[];
    delegationChain: string[];
    decisionPoints: HumanDecision[];
  };
  
  cost: {
    spent: number;
    budget: number;
    breakdown: CostBreakdown[];
  };
}
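
The split between developer-provided and SDK-populated fields can be sketched as a simple merge. This is a minimal illustration, not the SDK's implementation: the `...Lite` interfaces are trimmed-down stand-ins for the ones above, and `RuntimeSnapshot` and `buildEscalationContext` are hypothetical names introduced only for this example.

```typescript
// Hypothetical, trimmed-down versions of the interfaces above.
interface EscalateOptionsLite {
  why: { reason: string; category: string; urgency: 'low' | 'normal' | 'urgent' | 'critical' };
  question?: string;
}

// Assumed shape of what the runtime already knows about the execution.
interface RuntimeSnapshot {
  executionId: string;
  agentDid: string;
  handlerId: string;
  startedAt: Date;
  delegationChain: string[];
  spent: number;
  budget: number;
}

interface EscalationContextLite extends EscalateOptionsLite {
  execution: { id: string; agentDid: string; handlerId: string; startedAt: Date; duration: number };
  provenance: { delegationChain: string[] };
  cost: { spent: number; budget: number };
}

// Merge what the developer passed with what the runtime captured.
function buildEscalationContext(
  options: EscalateOptionsLite,
  runtime: RuntimeSnapshot,
  now: Date,
): EscalationContextLite {
  return {
    ...options,
    execution: {
      id: runtime.executionId,
      agentDid: runtime.agentDid,
      handlerId: runtime.handlerId,
      startedAt: runtime.startedAt,
      duration: now.getTime() - runtime.startedAt.getTime(), // milliseconds
    },
    provenance: { delegationChain: [...runtime.delegationChain] },
    cost: { spent: runtime.spent, budget: runtime.budget },
  };
}
```

The handler only ever constructs the `why`, `findings`, `recommendation`, and `handoff` parts; everything under `execution`, `provenance`, and `cost` comes from state the runtime already holds.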

Example: High-Risk Approval

export const processPayment = handler({
  id: 'process_payment',
  requires: { riskLevel: 'high' },
  
  async execute(ctx, input: { amount: number; recipient: string }) {
    // Request approval from oversight
    const approval = await ctx.oversight.approve({
      action: `Transfer $${input.amount} to ${input.recipient}`,
      reason: 'Amount exceeds $1,000 threshold',
      risk: 'high',
      timeout: 300000,  // 5 minutes
      alternatives: ['Reject', 'Request more info'],
    });
    
    if (!approval.approved) {
      return { status: 'rejected', reason: approval.reason };
    }
    
    // Approved, proceed
    const result = await ctx.call.agent('agent://payments.transfer', {
      ...input, 
      approvalRef: approval.reference,
    });
    
    // Notify completion
    await ctx.oversight.notify(`Payment completed: $${input.amount}`, { 
      urgency: 'info',
    });
    
    return result;
  }
});

Example: Rich Escalation

export const analyzeContract = handler({
  id: 'analyze_contract',
  
  async execute(ctx, input: { contractId: string }) {
    const contract = await ctx.vaults.get('vault://acme/legal')
      .read(`/contracts/${input.contractId}`);
    
    const analysis = await ctx.llm.complete({
      prompt: `Analyze risks in: ${contract}`,
      tier: 'powerful',
    });
    
    // Agent is uncertain about jurisdiction
    if (analysis.confidence < 0.7) {
      return ctx.oversight.escalate({
        why: {
          reason: 'Contract has cross-border implications I cannot assess',
          category: 'uncertainty',
          urgency: 'normal',
        },
        findings: {
          summary: analysis.summary,
          details: analysis,
          confidence: analysis.confidence,
        },
        recommendation: {
          action: 'Engage external legal counsel',
          reasoning: 'Multiple jurisdictions identified',
          alternatives: ['Proceed with standard review', 'Flag for compliance'],
        },
        question: 'Which jurisdiction should we prioritize?',
        handoff: {
          resumeFrom: 'jurisdiction_selected',
          state: { contractId: input.contractId, analysis },
        },
      });
    }
    
    return analysis;
  }
});

Provenance for Oversight Interactions

Every ctx.oversight.* call generates provenance:

{
  type: 'oversight.approve',
  executionId: 'exec-123',
  request: { action: 'Transfer $5000', risk: 'high' },
  response: { approved: true, approver: 'did:human:rick' },
  timestamp: '2025-12-16T10:00:00Z',
  responseTime: 45000,  // 45 seconds
  channel: 'passport_app',
}

ctx.workforce — Human Worker Pool

ctx.workforce connects to Workforce Cloud, HUMAN's marketplace of verified human workers. This is distinct from ctx.oversight:

Resource	Who	Purpose
`ctx.oversight`	The principal (who delegated)	Approvals, decisions, escalations
`ctx.workforce`	Pool of verified workers	Task execution by humans

ctx.workforce API

interface WorkforceContext {
  // Submit a task to be completed by a human
  submit(task: {
    type: string;                    // 'review' | 'label' | 'translate' | 'verify' | custom
    capability: string;              // Required capability from Capability Graph
    input: Record<string, unknown>;  // Task data
    instructions: string;            // What the human should do
    priority?: 'low' | 'normal' | 'high';
    deadline?: Date;
    constraints?: {
      minQualification?: string;     // Capability level required
      region?: string[];             // Geographic restrictions
      certifications?: string[];     // Required certs
    };
  }): Promise<TaskSubmission>;
  
  // Check task status
  status(taskId: string): Promise<TaskStatus>;
  
  // Wait for completion
  await(taskId: string, options?: { timeout?: number }): Promise<TaskResult>;
  
  // Cancel pending task
  cancel(taskId: string, reason: string): Promise<void>;
}

Example: Human Review in Workflow

export const processApplication = handler({
  id: 'process_application',
  
  async execute(ctx, input: { applicationId: string }) {
    const app = await ctx.db.query('applications', { id: input.applicationId });
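    // Serialize the record explicitly; interpolating an object directly yields "[object Object]"
    const aiAssessment = await ctx.llm.complete({ prompt: `Assess: ${JSON.stringify(app)}` });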
    
    // Submit to human workforce for verification
    const task = await ctx.workforce.submit({
      type: 'review',
      capability: 'hr/application/senior_review',
      input: { application: app, assessment: aiAssessment },
      instructions: 'Verify AI assessment and confirm hire decision',
      priority: 'normal',
      deadline: new Date(Date.now() + 24 * 60 * 60 * 1000),  // 24h
    });
    
    // Wait for human to complete
    const result = await ctx.workforce.await(task.id, { timeout: 86400000 });  // 24h in ms, matching the deadline
    
    return result.decision;
  }
});

ctx.capabilities — Capability Graph

ctx.capabilities provides access to the Capability Graph Engine for querying what agents and humans can do.

ctx.capabilities API

interface CapabilitiesContext {
  // Find entities with a capability
  find(options: {
    capability: string;              // e.g., 'legal/contract/review'
    minLevel?: number;               // Minimum proficiency (0-1)
    entityType?: 'agent' | 'human' | 'both';
    available?: boolean;             // Currently available?
  }): Promise<CapabilityMatch[]>;
  
  // Get capabilities of current agent
  mine(): Promise<Capability[]>;
  
  // Register/update a capability (with evidence)
  register(capability: {
    domain: string;                  // e.g., 'finance/tax/compliance'
    evidenceRefs: string[];          // Proofs of capability
    confidence: number;              // Self-assessed (0-1)
  }): Promise<CapabilityRegistration>;
  
  // Add evidence to existing capability
  evidence(capabilityId: string, evidence: EvidenceRef): Promise<void>;
  
  // Check if current agent has capability
  has(capability: string, minLevel?: number): Promise<boolean>;
}
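
The filtering semantics of find() can be sketched over an in-memory list. This is an illustration under stated assumptions: `CapabilityRecord` and `findCapabilityMatches` are hypothetical names, exact string matching on the capability path is assumed, and results are ordered by proficiency to mirror the "sorted by capability score" behavior used in the routing example.

```typescript
type CapEntityKind = 'agent' | 'human';

// Hypothetical flat record; the real Capability Graph is richer than this.
interface CapabilityRecord {
  entityDid: string;
  entityType: CapEntityKind;
  capability: string;
  level: number;       // proficiency, 0-1
  available: boolean;
}

interface FindQuery {
  capability: string;
  minLevel?: number;
  entityType?: CapEntityKind | 'both';
  available?: boolean;
}

function findCapabilityMatches(records: CapabilityRecord[], query: FindQuery): CapabilityRecord[] {
  const minLevel = query.minLevel ?? 0;
  const wanted = query.entityType ?? 'both';
  return records
    .filter(r => r.capability === query.capability)
    .filter(r => r.level >= minLevel)
    .filter(r => wanted === 'both' || r.entityType === wanted)
    // Only filter on availability when the caller asked for it.
    .filter(r => query.available === undefined || r.available === query.available)
    .sort((a, b) => b.level - a.level); // best match first
}
```

Every option narrows the result set, so omitting an option never excludes a match.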

Example: Capability-Based Routing

export const routeToExpert = handler({
  id: 'route_to_expert',
  
  async execute(ctx, input: { taskType: string; document: string }) {
    // Find who can handle this
    const experts = await ctx.capabilities.find({
      capability: input.taskType,
      minLevel: 0.8,
      available: true,
    });
    
    if (experts.length === 0) {
      // No one available — escalate
      return ctx.oversight.escalate({
        why: { reason: 'No experts available', category: 'capability_exceeded', urgency: 'normal' },
      });
    }
    
    // At least one qualified expert is available (experts is sorted by
    // capability score), so let HumanOS route to the best match.
    
    return ctx.call.route({
      capability: input.taskType,
      input: { document: input.document },
    });
  }
});

UNIVERSAL ROUTING (ctx.call)

ctx.call is the universal routing primitive. By default it routes to agents, humans, or models based on capability rather than explicit targeting; agent() remains available for direct calls when the target is already known.

ctx.call API

interface CallContext {
  // Direct agent call (when you know the target)
  agent(target: string, input: Record<string, unknown>): Promise<CallResult>;
  
  // Capability-based routing (let HumanOS decide)
  route(options: {
    capability: string;              // What capability is needed
    input: Record<string, unknown>;
    preferences?: {
      preferAgent?: boolean;         // Prefer AI over human?
      maxCost?: number;              // Cost constraint
      maxLatency?: number;           // Latency constraint
    };
  }): Promise<CallResult>;
  
  // Wrap call with specific delegation
  withDelegation(options: {
    scopes: string[];
    budget?: number;
    expires?: Date;
  }): CallContext;
}
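
How route() might apply its preferences can be sketched as a filter followed by a ranking. HumanOS's actual routing policy is not specified here, so everything in this example is an assumption: `RouteCandidate`, `selectRoute`, the millisecond unit for maxLatency, and the rule that preferAgent ranks non-human candidates ahead of humans before comparing capability scores.

```typescript
// Hypothetical candidate record produced by a capability lookup.
interface RouteCandidate {
  id: string;
  kind: 'agent' | 'human' | 'model';
  estimatedCost: number;       // same unit as maxCost
  estimatedLatencyMs: number;  // assumed unit for maxLatency
  score: number;               // capability score, 0-1
}

interface RoutePreferences {
  preferAgent?: boolean;
  maxCost?: number;
  maxLatency?: number;
}

function selectRoute(candidates: RouteCandidate[], prefs: RoutePreferences = {}): RouteCandidate | undefined {
  // Hard constraints first: drop anything over the cost or latency limit.
  const eligible = candidates.filter(c =>
    (prefs.maxCost === undefined || c.estimatedCost <= prefs.maxCost) &&
    (prefs.maxLatency === undefined || c.estimatedLatencyMs <= prefs.maxLatency));

  // Soft preference second: non-human candidates first when preferAgent is set,
  // then highest capability score.
  const ranked = [...eligible].sort((a, b) => {
    if (prefs.preferAgent) {
      const aHuman = a.kind === 'human' ? 1 : 0;
      const bHuman = b.kind === 'human' ? 1 : 0;
      if (aHuman !== bHuman) return aHuman - bHuman;
    }
    return b.score - a.score;
  });
  return ranked[0]; // undefined when nothing satisfies the constraints
}
```

Treating cost and latency as hard limits and preferAgent as a tie-breaking order keeps the two kinds of preference from interfering with each other; a caller that gets undefined back can fall back to escalation.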

Example: Universal Routing

export const handleTask = handler({
  id: 'handle_task',
  
  async execute(ctx, input: { capability: string; data: unknown }) {
    // Route to whoever can handle it — agent or human
    const result = await ctx.call.route({
      capability: input.capability,
      input: { data: input.data },
      preferences: {
        preferAgent: true,   // Try AI first
        maxCost: 1.00,       // $1 max
      },
    });
    
    return result;
  }
});

AGENT-TO-AGENT DELEGATION MODEL

When agents call other agents via ctx.call.agent(), delegation follows a chained model with automatic scoping.

The Delegation Chain

User (Rick) grants delegation to Agent A
  → Agent A calls Agent B via ctx.call.agent()
    → SDK auto-scopes: A grants B only what B needs
      → Provenance records: Rick → A → B

How It Works

// Agent A calls Agent B
await ctx.call.agent('agent://accounting.record', invoice);

// SDK automatically:
// 1. Reads B's manifest to see required scopes
// 2. Checks A's delegation allows sub-delegation (canSubDelegate)
// 3. Verifies A has the scopes B needs
// 4. Creates NEW delegation: A → B (not Rick → B directly)
// 5. Scopes to ONLY what B declared it needs (minimal privilege)
// 6. Logs provenance: "A sub-delegated to B under authority from Rick"
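
Steps 2 through 5 above can be sketched as a pure function. This is a minimal illustration, not the SDK's code: `DelegationGrant` is a simplified stand-in for the real delegation token, `deriveSubDelegation` and the ID `del-def456` are hypothetical, and the choice to set canSubDelegate to false on the child is an assumption.

```typescript
// Hypothetical, simplified delegation record.
interface DelegationGrant {
  id: string;
  grantor: string;
  grantee: string;
  scopes: string[];
  canSubDelegate: boolean;
  parentDelegation?: string;
}

function deriveSubDelegation(
  parent: DelegationGrant,
  targetDid: string,
  targetRequiredScopes: string[], // step 1: read from the target's manifest
  newId: string,
): DelegationGrant {
  // Step 2: the caller's delegation must allow sub-delegation.
  if (!parent.canSubDelegate) {
    throw new Error(`${parent.grantee} is not allowed to sub-delegate`);
  }
  // Step 3: the caller must hold every scope the target needs.
  const missing = targetRequiredScopes.filter(s => parent.scopes.indexOf(s) === -1);
  if (missing.length > 0) {
    throw new Error(`Caller lacks scopes: ${missing.join(', ')}`);
  }
  // Steps 4 and 5: new grant from caller to target, limited to the declared scopes.
  return {
    id: newId,
    grantor: parent.grantee,
    grantee: targetDid,
    scopes: [...targetRequiredScopes],
    canSubDelegate: false, // assumption: children cannot delegate further by default
    parentDelegation: parent.id,
  };
}

const rickToInvoiceProcessor: DelegationGrant = {
  id: 'del-abc123',
  grantor: 'did:human:rick',
  grantee: 'did:human:agent:invoice-processor',
  scopes: ['read:invoices', 'write:accounting'],
  canSubDelegate: true,
};

const invoiceToRecorder = deriveSubDelegation(
  rickToInvoiceProcessor,
  'did:human:agent:accounting-recorder',
  ['write:accounting'],
  'del-def456',
);
```

Note that the child grant carries only write:accounting even though the caller also holds read:invoices, which is the minimal-privilege behavior described in step 5.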

Provenance Chain Recorded

{
  chain: [
    { grantor: 'did:human:rick', grantee: 'did:human:agent:invoice-processor', 
      scope: ['read:invoices', 'write:accounting'], canSubDelegate: true },
    { grantor: 'did:human:agent:invoice-processor', grantee: 'did:human:agent:accounting-recorder',
      scope: ['write:accounting'], parentDelegation: 'del-abc123' }
  ],
  action: 'accounting.record',
  timestamp: '2025-12-16T10:00:00Z',
  signature: 'ed25519:...'
}

Explicit Pass-Through (When Required)

For cases where full delegation must pass through:

await ctx.call.withDelegation({
  scopes: ctx.passport.delegation.scopes,  // Full scope passthrough
}).agent('agent://accounting.record', invoice);
// ⚠️ Still creates provenance showing A delegated to B

Delegation Validation Rules

Cannot escalate scope — B cannot receive more than A has
Cannot exceed parent — If Rick didn't grant canSubDelegate, A cannot pass to B
Time-bounded — Sub-delegation expires when parent expires (or sooner)
Revocable — Rick revoking A automatically invalidates A→B chain
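
The four rules can be checked together by walking the chain from the root grant downward. The sketch below assumes a simplified `ChainLink` record with an epoch-millisecond expiry and a revoked flag; both the record shape and the `firstChainViolation` helper are hypothetical and exist only to illustrate the rules.

```typescript
// Hypothetical, simplified chain entry.
interface ChainLink {
  grantor: string;
  grantee: string;
  scopes: string[];
  canSubDelegate: boolean;
  expiresAt: number; // epoch milliseconds
  revoked: boolean;
}

// Returns the first violated rule, or null when the whole chain is valid at `now`.
function firstChainViolation(chain: ChainLink[], now: number): string | null {
  for (let i = 0; i < chain.length; i++) {
    const link = chain[i];
    // Revocable: a revoked link invalidates itself and everything after it.
    if (link.revoked) return `link ${i} revoked`;
    if (link.expiresAt <= now) return `link ${i} expired`;
    if (i === 0) continue; // the root grant has no parent to compare against

    const parent = chain[i - 1];
    if (parent.grantee !== link.grantor) return `link ${i} is not granted by the previous grantee`;
    // Cannot exceed parent: the parent must have been allowed to sub-delegate.
    if (!parent.canSubDelegate) return `link ${i} sub-delegates without permission`;
    // Cannot escalate scope: every child scope must exist on the parent.
    if (link.scopes.some(s => parent.scopes.indexOf(s) === -1)) return `link ${i} escalates scope`;
    // Time-bounded: the child may expire sooner, never later.
    if (link.expiresAt > parent.expiresAt) return `link ${i} outlives its parent`;
  }
  return null;
}
```

Because the walk starts at the root, revoking Rick's grant to A is detected before A's grant to B is even inspected, which is what makes revocation cascade.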

ctx.passport — Identity Layer

ctx.passport provides access to the Passport identity layer. Every entity in HUMAN — humans, organizations, and agents — has a Passport.

Passport Model

interface PassportContext {
  // This agent's identity
  self: {
    did: string;              // e.g., 'did:human:agent:invoice-processor'
    kind: 'human' | 'org' | 'agent';
    metadata: PassportMetadata;
  };
  
  // The principal who delegated (human, org, or agent)
  principal: {
    did: string;              // e.g., 'did:human:rick'
    kind: 'human' | 'org' | 'agent';
    metadata: PassportMetadata;
  };
  
  // The current delegation in effect
  delegation: {
    id: string;               // Delegation token ID
    scopes: string[];         // Granted scopes
    canSubDelegate: boolean;  // Can this agent delegate further?
    constraints: DelegationConstraints;
    expiresAt?: Date;
    revokedAt?: Date;
    chain: DelegationChainEntry[];  // Full delegation chain
  };
  
  // Check if current delegation includes scope
  hasScope(scope: string): boolean;
  
  // Request additional scope from principal
  requestScope(options: {
    scope: string;
    reason: string;
  }): Promise<ScopeRequestResult>;
  
  // Create sub-delegation (if canSubDelegate = true)
  delegate(options: {
    target: string;           // Target agent DID
    scopes: string[];         // Scopes to grant (subset of own)
    constraints?: DelegationConstraints;
    expires?: Date;
  }): Promise<DelegationToken>;
  
  // Get delegated access to user credential (OAuth, etc.)
  getAccess(scope: string): Promise<AccessGrant>;
}

Example: Checking Delegation

export const processFinance = handler({
  id: 'process_finance',
  
  async execute(ctx, input) {
    // Who am I?
    console.log(ctx.passport.self.did);
    // → 'did:human:agent:finance-processor'
    
    // Who delegated to me?
    console.log(ctx.passport.principal.did);
    // → 'did:human:rick' (a human)
    // or → 'did:human:org:acme' (an org)
    // or → 'did:human:agent:orchestrator' (another agent)
    
    // What can I do?
    if (!ctx.passport.hasScope('write:transactions')) {
      return ctx.oversight.escalate({
        why: { reason: 'Insufficient scope', category: 'capability_exceeded', urgency: 'normal' },
      });
    }
    
    // Process with granted scope
    // ...
  }
});

ctx.vaults — Multi-Vault Storage

ctx.vaults provides access to purpose-scoped Vaults. Passport holders (humans, orgs, agents) can own and access multiple Vaults, with access controlled by delegation.

Vault Model

Key points:

Vaults are many-to-one with Passports: a single Passport holder can own or access many Vaults
Vaults are purpose-scoped: vault://acme/finance, vault://acme/legal, vault://acme/hr
Access is delegation-controlled: Agent only accesses Vaults included in its delegation
Agent always has its own Vault at ctx.vaults.self
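
Delegation-controlled access implies matching a concrete URI such as vault://acme/finance against a granted pattern such as vault://*/finance. The sketch below shows one way that match could work. The two-segment owner/purpose reading of the URI, the rule that `*` is only valid in the owner position, and the helper names are all assumptions made for illustration.

```typescript
// Split 'vault://<owner>/<purpose>' into its two segments (assumed URI layout).
function parseVaultUri(uri: string): { owner: string; purpose: string } | null {
  const match = /^vault:\/\/([^\/]+)\/([^\/]+)$/.exec(uri);
  return match ? { owner: match[1], purpose: match[2] } : null;
}

// True when a granted pattern covers a concrete vault URI.
function vaultPatternMatches(pattern: string, uri: string): boolean {
  const p = parseVaultUri(pattern);
  const u = parseVaultUri(uri);
  if (!p || !u) return false;
  if (u.owner === '*') return false; // a concrete URI must name its owner
  return (p.owner === '*' || p.owner === u.owner) && p.purpose === u.purpose;
}

// Sketch of the check behind canAccess(): any granted pattern may match.
function canAccessVault(grantedPatterns: string[], uri: string): boolean {
  return grantedPatterns.some(p => vaultPatternMatches(p, uri));
}
```

With a grant of vault://*/finance, an agent could open the finance vault of whichever organization delegated to it, but not that organization's legal or HR vaults.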

ctx.vaults API

interface VaultsContext {
  // This agent's own vault (always accessible)
  self: VaultHandle;
  
  // List vaults accessible via current delegation
  list(): Promise<VaultInfo[]>;
  
  // Get handle to a specific vault
  get(uri: string): VaultHandle;
  
  // Check if vault is accessible
  canAccess(uri: string): boolean;
}

interface VaultHandle {
  uri: string;
  
  // Read data
  read(path: string): Promise<unknown>;
  exists(path: string): Promise<boolean>;
  list(prefix?: string): Promise<string[]>;
  
  // Write data (if permitted)
  write(path: string, data: unknown, options?: WriteOptions): Promise<void>;
  append(path: string, data: unknown): Promise<void>;
  delete(path: string): Promise<void>;
  
  // Metadata
  metadata(path: string): Promise<VaultEntryMetadata>;
  
  // Versioning (if enabled)
  history(path: string): Promise<Version[]>;
  restore(path: string, version: string): Promise<void>;
}

interface WriteOptions {
  overwrite?: boolean;        // Default: false (version if exists)
  schema?: string;            // Validate against schema
  ttl?: number;               // Auto-delete after N seconds
  encrypt?: boolean;          // Encrypt at rest (default: true)
}

Manifest Configuration

# human-agent.yaml
vaults:
  # Agent's own vault (auto-created)
  self:
    paths:
      '/cache/*':
        ttl: 3600              # Auto-clean after 1h
      '/state/*':
        schema: 'agent-state'
        versioned: true
  
  # Vaults this agent needs access to (via delegation)
  requires:
    - uri: 'vault://*/finance'
      scopes: ['read', 'write']
    - uri: 'vault://*/legal'
      scopes: ['read']
  
  # Path schemas (enforce structure)
  schemas:
    agent-state:
      type: object
      properties:
        lastRun: { type: string, format: date-time }
        checkpoint: { type: object }

Example: Multi-Vault Access

export const crossDepartmentAnalysis = handler({
  id: 'cross_department_analysis',
  requires: {
    vaults: ['vault://acme/finance', 'vault://acme/legal'],
  },
  
  async execute(ctx, input: { reportType: string }) {
    // Access finance vault
    const financeData = await ctx.vaults.get('vault://acme/finance')
      .read('/reports/quarterly');
    
    // Access legal vault
    const contracts = await ctx.vaults.get('vault://acme/legal')
      .read('/contracts/active');
    
    // Write to agent's own vault
    await ctx.vaults.self.write('/analysis/latest', {
      finance: financeData,
      contracts,
      generatedAt: new Date(),
    });
    
    return { status: 'complete' };
  }
});

Vault Safety Mechanisms

The SDK enforces multiple safety layers on Vault writes:

Mechanism	Description
Path Schemas	Only declared paths can be written
Namespace Isolation	Agent writes auto-prefixed with agent ID
Type Validation	Data validated against declared schemas
Path Sanitization	Prevents `../` injection attacks
Quotas	Size limits, key counts, write rate limits
Versioning	No silent overwrites (default)
PII Detection	Auto-detect/encrypt/reject PII
Audit Logging	Every write logged to provenance

# human-agent.yaml - Safety configuration
vaults:
  self:
    paths:
      # Only these paths allowed
      '/cache/*':
        writers: [self]              # Only this agent
        max_size: 1mb
        ttl: 3600
      '/state/checkpoint':
        writers: [self]
        schema: checkpoint-schema
        versioned: true              # Keep history
        pii: reject                  # No PII allowed
    
    quotas:
      max_total_size: 100mb
      max_keys: 10000
      writes_per_minute: 100
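
Three of the mechanisms above (path sanitization, declared paths, and namespace isolation) plus the per-path size limit can be sketched as a single pre-write check. This is an illustration only: `PathRule`, `normalizeVaultPath`, and `checkVaultWrite` are hypothetical names, a trailing `/*` is assumed to match anything under that prefix, and `1mb` is assumed to mean 1024 * 1024 bytes.

```typescript
// Hypothetical in-memory form of one entry under `paths:` in the manifest.
interface PathRule {
  pattern: string;        // e.g. '/cache/*' or '/state/checkpoint'
  maxSizeBytes?: number;
}

// Path sanitization: reject relative paths and any '.', '..', or empty segment.
function normalizeVaultPath(path: string): string {
  if (path.charAt(0) !== '/') throw new Error('Vault paths must be absolute');
  const segments = path.split('/').slice(1);
  for (const segment of segments) {
    if (segment === '' || segment === '.' || segment === '..') {
      throw new Error(`Illegal path segment in ${path}`);
    }
  }
  return '/' + segments.join('/');
}

function ruleMatches(rule: PathRule, path: string): boolean {
  if (rule.pattern.slice(-2) === '/*') {
    const prefix = rule.pattern.slice(0, -1); // '/cache/*' becomes '/cache/'
    return path.length > prefix.length && path.slice(0, prefix.length) === prefix;
  }
  return rule.pattern === path;
}

// Returns the namespaced path to write to, or throws if the write is not allowed.
function checkVaultWrite(rules: PathRule[], agentId: string, path: string, sizeBytes: number): string {
  const clean = normalizeVaultPath(path);
  const rule = rules.filter(r => ruleMatches(r, clean))[0];
  if (!rule) throw new Error(`Path not declared in manifest: ${clean}`);
  if (rule.maxSizeBytes !== undefined && sizeBytes > rule.maxSizeBytes) {
    throw new Error(`Write of ${sizeBytes} bytes exceeds limit for ${rule.pattern}`);
  }
  return `/${agentId}${clean}`; // namespace isolation: prefix with the agent ID
}
```

Sanitizing before matching matters: a path like /cache/../state/checkpoint is rejected outright instead of being matched against the /cache/* rule.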

ctx.memory — Memory Fabric

ctx.memory is the built-in memory fabric for agents. Storage appears when you use it — no external database setup, no vector DB configuration, no credentials to manage.

Key Design Principle:

Memory is human-owned, not control-plane-owned.

Unlike other agent platforms where the control plane owns agent memory, HUMAN's memory fabric is backed by Passport-linked Vaults. The human (or org) owns their data — portable, encrypted, and sovereign.

Why This Matters

Without Built-in Memory	With ctx.memory
Configure Pinecone/Qdrant separately	Just use `ctx.memory`
Manage vector DB credentials	Credentials handled by platform
Different setup for dev/staging/prod	Same code everywhere
Agent code imports storage SDKs	Agent uses ctx, platform resolves
Data locked in vendor silos	Data portable with Passport

Scope Mapping

Scope	Maps To	Lifetime	Shared With
`execution`	In-memory	Single execution	No one
`session`	`ctx.vaults.self` `/sessions/{id}`	Session TTL	Same session
`persistent`	`ctx.vaults.self` `/persistent/`	Permanent	Same agent
`suite`	Suite vault	Permanent	All suite agents
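
The table can be read as a function from scope and key to a storage location. The sketch below is one possible reading, not the SDK's resolver: `MemoryLocation` and `resolveMemoryLocation` are hypothetical, and placing the key directly under each prefix (and at the root of the suite vault) is an assumption the table does not confirm.

```typescript
type MemoryScope = 'execution' | 'session' | 'persistent' | 'suite';

// Hypothetical description of where a memory entry ends up.
interface MemoryLocation {
  backend: 'in-memory' | 'agent-vault' | 'suite-vault';
  path?: string; // absent for in-memory entries
}

function resolveMemoryLocation(scope: MemoryScope, key: string, sessionId: string): MemoryLocation {
  switch (scope) {
    case 'execution':
      return { backend: 'in-memory' };
    case 'session':
      return { backend: 'agent-vault', path: `/sessions/${sessionId}/${key}` };
    case 'persistent':
      return { backend: 'agent-vault', path: `/persistent/${key}` };
    case 'suite':
      return { backend: 'suite-vault', path: `/${key}` }; // assumed layout
  }
}
```

Only the execution scope stays out of a vault, which is why it is the fastest and also the only scope whose data is gone when the execution ends.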

ctx.memory API

interface MemoryContext {
  // Per-execution (in-memory, fastest)
  execution: ScopedMemory;
  
  // Per-session (persisted, session TTL)
  session: ScopedMemory;
  
  // Persistent for this agent
  persistent: ScopedMemory;
  
  // Shared across agent suite
  suite: ScopedMemory;
}

interface ScopedMemory {
  // ═══════════════════════════════════════════════════════════════
  // KEY-VALUE STORAGE
  // ═══════════════════════════════════════════════════════════════
  
  get<T>(key: string): Promise<T | undefined>;
  set<T>(key: string, value: T, options?: { ttl?: number }): Promise<void>;
  delete(key: string): Promise<void>;
  has(key: string): Promise<boolean>;
  keys(prefix?: string): Promise<string[]>;
  
  // ═══════════════════════════════════════════════════════════════
  // VECTOR STORAGE (Embeddings & Semantic Search)
  // ═══════════════════════════════════════════════════════════════
  
  /**
   * Store a vector embedding with associated metadata.
   * Platform handles the underlying vector database (no Pinecone/Qdrant setup).
   */
  setVector(key: string, embedding: number[], metadata?: Record<string, unknown>): Promise<void>;
  
  /**
   * Retrieve a stored vector by key.
   */
  getVector(key: string): Promise<VectorEntry | undefined>;
  
  /**
   * Semantic similarity search across stored vectors.
   * Returns matches sorted by similarity (highest first).
   */
  search(embedding: number[], options?: SearchOptions): Promise<VectorMatch[]>;
  
  /**
   * Delete a vector by key.
   */
  deleteVector(key: string): Promise<void>;
  
  /**
   * List all vector keys (optionally filtered by prefix).
   */
  vectorKeys(prefix?: string): Promise<string[]>;
}

interface SearchOptions {
  topK?: number;           // Max results to return (default: 10)
  threshold?: number;      // Min similarity score 0-1 (default: 0.0)
  filter?: Record<string, unknown>;  // Metadata filter
  includeMetadata?: boolean;  // Include metadata in results (default: true)
  includeVectors?: boolean;   // Include vectors in results (default: false)
}

interface VectorEntry {
  key: string;
  embedding: number[];
  metadata?: Record<string, unknown>;
  createdAt: Date;
  updatedAt: Date;
}

interface VectorMatch {
  key: string;
  score: number;           // Similarity score 0-1
  metadata?: Record<string, unknown>;
  embedding?: number[];    // Only if includeVectors: true
}
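
The interplay of topK, threshold, and filter can be sketched with a brute-force search over an in-memory array. The platform's actual index and similarity metric are not specified here, so this example makes assumptions: cosine similarity as the metric, negative similarities clamped to 0 to fit the 0-1 score range, and exact-equality matching for metadata filters. `StoredVector`, `cosineSimilarity`, and `searchVectors` are hypothetical names.

```typescript
interface StoredVector {
  key: string;
  embedding: number[];
  metadata?: Record<string, unknown>;
}

interface SketchSearchOptions {
  topK?: number;
  threshold?: number;
  filter?: Record<string, unknown>;
}

interface SketchMatch {
  key: string;
  score: number;
  metadata?: Record<string, unknown>;
}

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('Embedding dimensions differ');
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // zero vectors match nothing
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

function searchVectors(store: StoredVector[], query: number[], options: SketchSearchOptions = {}): SketchMatch[] {
  const topK = options.topK ?? 10;          // documented default
  const threshold = options.threshold ?? 0; // documented default
  const filter = options.filter ?? {};
  return store
    // Metadata filter: every filter key must be present and equal.
    .filter(v => Object.keys(filter).every(k => v.metadata !== undefined && v.metadata[k] === filter[k]))
    .map(v => ({ key: v.key, score: Math.max(0, cosineSimilarity(query, v.embedding)), metadata: v.metadata }))
    .filter(m => m.score >= threshold)
    .sort((x, y) => y.score - x.score)      // highest similarity first
    .slice(0, topK);
}
```

Applying the metadata filter before scoring keeps filtered-out entries from consuming topK slots, so a namespace filter with topK: 5 returns up to five matches from that namespace.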

Example: Memory Scopes (Key-Value)

export const conversationalAgent = handler({
  id: 'conversational_agent',
  
  async execute(ctx, input: { message: string }) {
    // Execution scope: temp working data (gone after this execution)
    await ctx.memory.execution.set('working', { partial: true });
    
    // Session scope: conversation history (lasts session TTL)
    const history = await ctx.memory.session.get<Message[]>('history') ?? [];
    history.push({ role: 'user', content: input.message });
    await ctx.memory.session.set('history', history);
    
    // Persistent: user preferences (permanent)
    const prefs = await ctx.memory.persistent.get('user_prefs');
    
    // Suite: shared knowledge (other agents can read)
    await ctx.memory.suite.set('last_interaction', {
      agentId: ctx.passport.self.did,
      timestamp: new Date(),
    });
    
    const response = await ctx.llm.complete({ 
      prompt: formatPrompt(history, prefs),
    });
    
    history.push({ role: 'assistant', content: response.content });
    await ctx.memory.session.set('history', history);
    
    return response;
  }
});

Example: Vector Storage (Semantic Search)

export const documentSearchAgent = handler({
  id: 'document_search',
  capabilities: ['docs/search', 'docs/ingest'],
  
  async execute(ctx, input: { action: 'ingest' | 'search'; content?: string; query?: string }) {
    
    if (input.action === 'ingest' && input.content) {
      // ═══════════════════════════════════════════════════════════════
      // INGEST: Store document with embedding
      // ═══════════════════════════════════════════════════════════════
      
      // Get embedding from LLM (platform handles provider)
      const embedding = await ctx.llm.embed(input.content);
      
      // Store in persistent memory — no Pinecone setup needed!
      const docId = `doc:${Date.now()}`;
      await ctx.memory.persistent.setVector(docId, embedding, {
        content: input.content,
        ingestedAt: new Date().toISOString(),
        source: 'user_upload',
      });
      
      return { docId, status: 'ingested' };
    }
    
    if (input.action === 'search' && input.query) {
      // ═══════════════════════════════════════════════════════════════
      // SEARCH: Find similar documents
      // ═══════════════════════════════════════════════════════════════
      
      // Embed the query
      const queryEmbedding = await ctx.llm.embed(input.query);
      
      // Semantic search — platform handles vector DB
      const results = await ctx.memory.persistent.search(queryEmbedding, {
        topK: 5,
        threshold: 0.7,  // Only return good matches
        includeMetadata: true,
      });
      
      return {
        query: input.query,
        results: results.map(r => ({
          docId: r.key,
          score: r.score,
          content: r.metadata?.content,
        })),
      };
    }
    
    throw new Error('Invalid action');
  }
});

Example: Cross-Agent Knowledge Sharing

// Agent A: Knowledge ingestion agent
export const knowledgeIngester = handler({
  id: 'knowledge_ingester',
  
  async execute(ctx, input: { documents: string[] }) {
    for (const doc of input.documents) {
      const embedding = await ctx.llm.embed(doc);
      
      // Store in SUITE scope so other agents in this suite can search it.
      // hash() is assumed to be a content-hashing helper defined elsewhere.
      await ctx.memory.suite.setVector(`kb:${hash(doc)}`, embedding, {
        content: doc,
        ingestedBy: ctx.passport.self.did,
        ingestedAt: new Date().toISOString(),
      });
    }
    
    return { ingested: input.documents.length };
  }
});

// Agent B: Question answering agent (different agent, same suite)
export const qaAgent = handler({
  id: 'qa_agent',
  
  async execute(ctx, input: { question: string }) {
    const queryEmbedding = await ctx.llm.embed(input.question);
    
    // Search the SUITE memory — finds docs ingested by any suite agent
    const relevantDocs = await ctx.memory.suite.search(queryEmbedding, {
      topK: 3,
      threshold: 0.75,
    });
    
    const context = relevantDocs.map(d => d.metadata?.content).join('\n\n');
    
    const answer = await ctx.llm.complete({
      prompt: `Answer based on context:\n\nContext:\n${context}\n\nQuestion: ${input.question}`,
    });
    
    return { answer: answer.content, sources: relevantDocs.map(d => d.key) };
  }
});

Example: Namespace Pattern (Multi-Tenant)

export const multiTenantSearch = handler({
  id: 'multi_tenant_search',
  
  async execute(ctx, input: { namespace: string; query: string }) {
    const queryEmbedding = await ctx.llm.embed(input.query);
    
    // Filter by namespace metadata: isolation without separate DBs
    const results = await ctx.memory.persistent.search(queryEmbedding, {
      topK: 10,
      filter: { namespace: input.namespace },  // Metadata filter
    });
    
    // Or use key prefix pattern
    const allKeys = await ctx.memory.persistent.vectorKeys(`${input.namespace}:`);
    
    return { results, keyCount: allKeys.length };
  }
});

PROVENANCE & AUDIT MODEL

ctx serves as the audit boundary. Every ctx method is automatically instrumented for provenance. Developers cannot bypass ctx.

Design Principles

ctx IS the control point — All access goes through ctx
Auto-instrumented — Every ctx method logs automatically
Sandboxed runtime — Handlers cannot bypass ctx
Hashed sensitive data — Inputs/outputs hashed, not stored raw
Cryptographically signed — Events signed by agent

What Gets Logged

Method	Auto-Logged Data
`ctx.llm.complete()`	model, tier, tokens, cost, latency, prompt hash
`ctx.llm.embed()`	model, dimensions, tokens, latency
`ctx.call.agent()`	target, delegation chain, input/output hashes
`ctx.call.route()`	capability, selected resource, routing reason
`ctx.vaults.*.write()`	vault, path, size, schema, version
`ctx.vaults.*.read()`	vault, path, found/not found
`ctx.memory.*.set()`	scope, key, size, ttl
`ctx.memory.*.get()`	scope, key, found/not found
`ctx.memory.*.setVector()`	scope, key, dimensions, metadata keys
`ctx.memory.*.search()`	scope, dimensions, topK, threshold, result count
`ctx.oversight.approve()`	action, risk, approver, decision, response time
`ctx.oversight.escalate()`	reason, category, handoff state hash
`ctx.workforce.submit()`	task type, capability, priority, worker assigned
`ctx.db.query()`	query hash, rows affected, latency
`ctx.http.request()`	url (domain only), method, status, latency
`ctx.secrets.get()`	key name (not value!), source

ctx.events API

interface EventsContext {
  // SDK logs automatically — developers rarely call directly
  log(event: ProvenanceEvent): Promise<void>;
  
  // Span tracking (for nested operations)
  startSpan(name: string, metadata?: Record<string, unknown>): Span;
  
  // Developer custom events
  custom(type: string, data: Record<string, unknown>): Promise<void>;
  
  // Query provenance (for debugging)
  query(options: { 
    executionId?: string;
    timeRange?: [Date, Date];
    types?: string[];
  }): Promise<ProvenanceEvent[]>;
  
  // Export for audit
  export(options: ExportOptions): Promise<AuditBundle>;
}

Event Structure

interface ProvenanceEvent {
  // Identity
  id: string;
  executionId: string;
  parentSpanId?: string;
  
  // What happened
  type: string;  // 'llm.complete', 'vault.write', 'oversight.approve'
  status: 'started' | 'success' | 'error';
  
  // Context
  agentDid: string;
  handlerId: string;
  delegationChain: string[];
  
  // Data (hashed where sensitive)
  input?: string;   // Hash of input
  output?: string;  // Hash of output
  metadata: Record<string, unknown>;
  
  // Timing
  timestamp: Date;
  duration?: number;
  
  // Cost (if applicable)
  cost?: { amount: number; currency: 'USD'; type: string };
  
  // Cryptographic proof
  signature: string;  // Signed by agent
}
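
To make the auto-instrumentation idea concrete, the sketch below wraps a function so that every call appends an event holding hashes rather than raw data. It is an illustration under assumptions: instrument, SketchEvent and toyHash are hypothetical names, the wrapper is synchronous for brevity (ctx methods are async), signing is omitted, and the hash is a non-cryptographic placeholder.

```typescript
// Hypothetical, simplified event shape (subset of ProvenanceEvent, unsigned).
interface SketchEvent {
  type: string;
  status: 'success' | 'error';
  input: string;      // hash of the input, never the raw value
  output?: string;    // hash of the output
  duration: number;   // milliseconds
}

// djb2-style toy hash: NOT cryptographic, used only so the sketch needs no imports.
function toyHash(text: string): string {
  let h = 5381;
  for (let i = 0; i < text.length; i++) {
    h = ((h * 33) ^ text.charCodeAt(i)) >>> 0;
  }
  return h.toString(16);
}

const log: SketchEvent[] = [];

// Wrap fn so each call is logged whether it returns or throws.
function instrument<I, O>(type: string, fn: (input: I) => O): (input: I) => O {
  return (input: I): O => {
    const started = Date.now();
    const inputHash = toyHash(JSON.stringify(input));
    try {
      const output = fn(input);
      log.push({
        type, status: 'success', input: inputHash,
        output: toyHash(JSON.stringify(output)), duration: Date.now() - started,
      });
      return output;
    } catch (err) {
      log.push({ type, status: 'error', input: inputHash, duration: Date.now() - started });
      throw err;
    }
  };
}

const write = instrument('vault.write', (req: { path: string; data: string }) => req.data.length);
const size = write({ path: '/data.json', data: 'secret payload' });
```

Because the wrapper owns the call site, the handler author has no step to forget: logging happens as a side effect of using the wrapped method.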

Manifest Configuration

# human-agent.yaml
provenance:
  # Primary storage
  primary:
    type: ledger              # Append-only, immutable
    location: managed         # or: self-hosted
    retention: 7y             # Regulatory compliance
  
  # Real-time streaming
  stream:
    enabled: true
    destinations:
      - type: webhook
        url: https://audit.acme.com/events
      - type: kafka
        topic: human-provenance
  
  # What to capture
  capture:
    # Always (cannot disable)
    required:
      - call.*
      - oversight.*
      - workforce.*
      - vault.write
    
    # Optional (for debugging)
    optional:
      - llm.*
      - db.*
      - http.*
  
  # Data handling
  data:
    hash_inputs: true
    hash_outputs: true
    full_capture:
      environments: [development]
      retention: 24h
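
The capture rules can be read as pattern matching over event types. A minimal sketch, assuming that a trailing `.*` matches every type under that prefix and that other patterns match exactly; matches and shouldCapture are hypothetical helpers, not SDK functions.

```typescript
interface CaptureConfig {
  required: string[];   // always captured
  optional: string[];   // captured only when enabled
}

// 'call.*' matches any type starting with 'call.'; other patterns must match exactly.
function matches(pattern: string, type: string): boolean {
  if (pattern.slice(-2) === '.*') {
    const prefix = pattern.slice(0, -1);   // 'call.*' -> 'call.'
    return type.slice(0, prefix.length) === prefix;
  }
  return pattern === type;
}

function shouldCapture(type: string, cfg: CaptureConfig, optionalEnabled: boolean): boolean {
  if (cfg.required.some(p => matches(p, type))) return true;
  return optionalEnabled && cfg.optional.some(p => matches(p, type));
}

// Mirrors the manifest above.
const capture: CaptureConfig = {
  required: ['call.*', 'oversight.*', 'workforce.*', 'vault.write'],
  optional: ['llm.*', 'db.*', 'http.*'],
};

const alwaysLogged = shouldCapture('call.agent', capture, false);
const debugOnly = shouldCapture('llm.complete', capture, false);
```

Under these assumed semantics, vault.read is not captured by this manifest, because only vault.write is listed.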

How Bypass is Prevented

// ❌ Can't import raw HTTP — SDK doesn't expose it
import fetch from 'node-fetch';

// ❌ Can't access process.env — blocked
const key = process.env.API_KEY;

// ❌ Can't write to filesystem — blocked
import fs from 'fs';

// ✅ Must use ctx
const result = await ctx.http.get('https://api.example.com');
const apiKey = await ctx.secrets.get('API_KEY');
await ctx.vaults.self.write('/data.json', data);

Enforcement:

Sandboxed runtime — Handler runs in isolated environment
Import restrictions — Only @human/agent-sdk available
Network policies — Outbound only via ctx.http
Filesystem isolation — Only ctx.files/ctx.vaults

CREDENTIAL MANAGEMENT

Philosophy: Progressive permission acquisition. No upfront secret lists.

Zero-Config Secrets

// Developer just uses secrets — no manifest declarations
export const processPayment = handler({
  id: 'process_payment',
  
  async execute(ctx, input) {
    // Just get the secret you need
    const stripeKey = await ctx.secrets.get('STRIPE_KEY');
    
    // Runtime handles:
    // - First access: "Allow STRIPE_KEY for process_payment?" (dev mode)
    // - Subsequent: Auto-allow (learned pattern)
    // - Production: Only allows learned patterns
  }
});

Developers don't configure:

❌ secrets: [STRIPE_KEY, SENDGRID_KEY]
❌ Handler-specific secret lists
❌ Agent-level secret lists

Runtime learns and enforces automatically.

How It Works

┌─────────────────────────────────────────────────────────────────┐
│                    CREDENTIAL CONTROLLER                        │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  DEV MODE (learning):                                          │
│  └── Handler tries: ctx.secrets.get('STRIPE_KEY')              │
│  └── Runtime: "Allow? [y/n]" (or auto-allow in dev)            │
│  └── Records: "process_payment uses STRIPE_KEY"                │
│                                                                 │
│  PROD MODE (enforcing):                                        │
│  └── Handler tries: ctx.secrets.get('STRIPE_KEY')              │
│  └── Runtime checks: "Is this a learned pattern?"              │
│  └── If yes: Allow                                             │
│  └── If no: Deny + alert                                       │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
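
The learn-then-enforce flow above can be sketched as a small state machine. This is an illustration only: CredentialControllerSketch is a hypothetical class, dev mode auto-allows instead of prompting, and learned patterns live in memory rather than in runtime storage.

```typescript
type Mode = 'dev' | 'prod';

class CredentialControllerSketch {
  // Learned (handlerId, secretKey) pairs, stored as 'handler::key' strings.
  private learned: Record<string, boolean> = {};
  public alerts: string[] = [];

  constructor(private mode: Mode) {}

  setMode(mode: Mode): void {
    this.mode = mode;
  }

  // Returns true when access is allowed.
  request(handlerId: string, secretKey: string): boolean {
    const pattern = handlerId + '::' + secretKey;
    if (this.mode === 'dev') {
      this.learned[pattern] = true;          // dev: allow and record the pattern
      return true;
    }
    if (this.learned[pattern] === true) return true;   // prod: learned pattern
    this.alerts.push('denied ' + pattern);             // prod: deny + alert
    return false;
  }
}

const controller = new CredentialControllerSketch('dev');
controller.request('process_payment', 'STRIPE_KEY');   // learned during development
controller.setMode('prod');
const known = controller.request('process_payment', 'STRIPE_KEY');
const unknown = controller.request('process_payment', 'SENDGRID_KEY');
```

The pattern is keyed by handler and secret together, so a secret learned for one handler does not become available to another.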

The Credential Cascade

When ctx.secrets.get('KEY') is called, runtime resolves:

Passport Keychain — User-owned (OAuth tokens, delegated)
Agent Vault — Agent's secrets (auto-discovered)
Org Vault — Org-level secrets (if granted)
Environment — Dev fallback (.env files)
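
Read as an algorithm, the cascade is a first-match lookup over ordered sources. A minimal sketch, assuming each source can be modelled as a key/value lookup; SecretSource, fromRecord and resolveSecret are hypothetical names, the secret values are made up, and grant checks for the Org Vault are left out.

```typescript
interface SecretSource {
  name: string;
  lookup(key: string): string | undefined;
}

function fromRecord(name: string, values: Record<string, string>): SecretSource {
  return {
    name,
    lookup: (key: string) =>
      Object.prototype.hasOwnProperty.call(values, key) ? values[key] : undefined,
  };
}

// Sources are tried in order; the first one that holds the key wins.
function resolveSecret(
  key: string,
  cascade: SecretSource[],
): { value: string; source: string } | undefined {
  for (let i = 0; i < cascade.length; i++) {
    const value = cascade[i].lookup(key);
    if (value !== undefined) return { value, source: cascade[i].name };
  }
  return undefined;
}

// Same order as the list above: Passport, Agent Vault, Org Vault, Environment.
const cascade: SecretSource[] = [
  fromRecord('passport', { GMAIL_TOKEN: 'delegated-token' }),
  fromRecord('agent_vault', { STRIPE_KEY: 'sk_agent' }),
  fromRecord('org_vault', { STRIPE_KEY: 'sk_org', SENDGRID_KEY: 'sg_org' }),
  fromRecord('environment', { SENDGRID_KEY: 'sg_env' }),
];

const stripe = resolveSecret('STRIPE_KEY', cascade);
const sendgrid = resolveSecret('SENDGRID_KEY', cascade);
const missing = resolveSecret('UNKNOWN', cascade);
```

Returning the source name alongside the value matches the audit table, which logs the key name and source but never the value.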

User Credentials (Progressive)

export const sendEmail = handler({
  id: 'send_email',
  // No upfront passport_scopes declaration needed
  
  async execute(ctx, input) {
    // Request access when you need it (progressive)
    const gmailAccess = await ctx.passport.getAccess('gmail.send');
    
    if (!gmailAccess.granted) {
      // Runtime prompts user: "Allow email sending?"
      return ctx.oversight.escalate({
        why: {
          reason: 'Need Gmail access to send invoice',
          category: 'capability_exceeded',
          urgency: 'normal',
        },
      });
    }
    
    // Use delegated access (short-lived, revocable)
    await gmail.send(gmailAccess.token, { ... });
  }
});

Key principles:

Agent receives delegation tokens, not raw credentials
User can revoke anytime via Passport
No upfront scope declarations — request when needed

Automatic Secret Rotation

Runtime handles rotation automatically:

Database credentials: Rotated every 30 days
API keys: Rotated based on provider recommendations
OAuth tokens: Refreshed before expiry

Developers don't think about rotation.

LLM & COST MANAGEMENT

Philosophy: Declare budget, not thresholds. Runtime optimizes automatically.

Zero-Config LLM

// Developer just calls LLM — no tier, no model selection
const result = await ctx.llm.complete({
  prompt: 'Analyze this invoice...',
});

// Runtime automatically:
// - Selects optimal model based on prompt complexity
// - Considers agent's remaining budget
// - Routes to cheapest model that meets quality threshold
// - Falls back if primary provider is down

No tier selection. No model selection. No provider selection.

Optional: Declare Intent (Not Implementation)

// If you have quality requirements, declare INTENT:
const result = await ctx.llm.complete({
  prompt: '...',
  quality: 'high',        // Intent: "I need high quality"
  // NOT: tier: 'powerful' (implementation detail)
});

// Runtime decides: Claude Opus? GPT-4? Based on:
// - What's available
// - What's cheapest for this quality level
// - What's within budget
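
One way to picture intent-based routing is a filter-then-sort over a model catalog. The sketch below is hypothetical throughout: the catalog entries, prices and the selectModel helper are invented for illustration and do not describe the runtime's actual router.

```typescript
type Quality = 'standard' | 'high';

interface ModelOption {
  name: string;          // made-up catalog entry
  quality: Quality;
  costPerCall: number;   // USD, illustrative only
  available: boolean;
}

const qualityRank: Record<Quality, number> = { standard: 1, high: 2 };

// Pick the cheapest available model that meets the requested quality and fits the budget.
function selectModel(
  catalog: ModelOption[],
  quality: Quality,
  remainingBudget: number,
): ModelOption | undefined {
  const candidates = catalog
    .filter(m => m.available)
    .filter(m => qualityRank[m.quality] >= qualityRank[quality])
    .filter(m => m.costPerCall <= remainingBudget)
    .sort((a, b) => a.costPerCall - b.costPerCall);
  return candidates.length > 0 ? candidates[0] : undefined;
}

const catalog: ModelOption[] = [
  { name: 'small-model', quality: 'standard', costPerCall: 0.002, available: true },
  { name: 'large-model-a', quality: 'high', costPerCall: 0.03, available: false },
  { name: 'large-model-b', quality: 'high', costPerCall: 0.04, available: true },
];

const simple = selectModel(catalog, 'standard', 5);
const demanding = selectModel(catalog, 'high', 5);     // large-model-a is down, so b is chosen
const overBudget = selectModel(catalog, 'high', 0.01); // nothing fits
```

The handler only states `quality`; availability, price and budget stay on the runtime side of the call.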

Budget-Based Cost Control

Minimal manifest:

# human-agent.yaml
budget:
  daily: $50

That's it. Runtime handles:

When to warn (learns from your response patterns)
When to escalate (based on spend velocity)
Anomaly detection (automatic)

What developers DON'T configure:

❌ thresholds: [80%, 95%, 100%] — runtime learns
❌ circuit_breaker: 5x_normal — automatic
❌ action: notify_developer — smart defaults

How Budget Works

┌─────────────────────────────────────────────────────────────────┐
│                    RUNTIME COST CONTROLLER                      │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  Budget: $50/day                                               │
│                                                                 │
│  Observed patterns:                                            │
│  - Normal daily spend: $30                                     │
│  - Developer usually responds to alerts within 2h              │
│  - Developer approved 90% of budget increase requests          │
│                                                                 │
│  Adaptive behavior:                                            │
│  - At $40 (80%): Start preferring cheaper models              │
│  - At $45 (90%): Alert developer (learned threshold)          │
│  - At $48 (96%): Escalate for approval                        │
│  - Spike detection: If spending 3x normal rate → alert        │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
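
The adaptive behavior in the diagram can be expressed as a small decision function. This sketch hard-codes the example fractions (80%, 90%, 96%, 3x spike) as a fixed policy, whereas the runtime is described as learning them; BudgetPolicy, budgetAction and isSpike are hypothetical names.

```typescript
type BudgetAction =
  | 'normal'
  | 'prefer_cheaper_models'
  | 'alert_developer'
  | 'escalate_for_approval';

interface BudgetPolicy {
  dailyBudget: number;      // USD
  preferCheaperAt: number;  // fraction of budget used
  alertAt: number;
  escalateAt: number;
  spikeMultiplier: number;  // alert when spend rate reaches this multiple of normal
}

// Fixed values copied from the example diagram (assumption: not learned here).
const examplePolicy: BudgetPolicy = {
  dailyBudget: 50,
  preferCheaperAt: 0.8,
  alertAt: 0.9,
  escalateAt: 0.96,
  spikeMultiplier: 3,
};

function budgetAction(spent: number, policy: BudgetPolicy): BudgetAction {
  const used = spent / policy.dailyBudget;
  if (used >= policy.escalateAt) return 'escalate_for_approval';
  if (used >= policy.alertAt) return 'alert_developer';
  if (used >= policy.preferCheaperAt) return 'prefer_cheaper_models';
  return 'normal';
}

function isSpike(currentRatePerHour: number, normalRatePerHour: number, policy: BudgetPolicy): boolean {
  return currentRatePerHour >= policy.spikeMultiplier * normalRatePerHour;
}

const atThirty = budgetAction(30, examplePolicy);     // normal daily spend
const atFortyEight = budgetAction(48, examplePolicy); // 96% of the $50 budget
```

Checks run from the strictest threshold down, so a single spend value always maps to exactly one action.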

Cost Observability (Automatic)

Cost metadata is updated after every LLM call and exposed on ctx.llm.cost:

const result = await ctx.llm.complete({ prompt: '...' });

// Cost info automatically available
console.log(ctx.llm.cost);
// {
//   thisCall: 0.02,
//   sessionTotal: 12.50,
//   dailyRemaining: 5.00,
//   status: 'healthy'  // or: 'approaching_limit', 'needs_approval'
// }

Model Selection is HUMAN's Problem

What Developers Think About	What HUMAN Handles
"Analyze this invoice"	Which model? Which provider?
"I need high quality"	GPT-4 vs Claude Opus vs Gemini Ultra?
"This is simple"	GPT-3.5 vs Claude Haiku vs Gemini Flash?
"I have $50/day"	Routing to stay within budget
"Something's wrong"	Fallback to backup provider

REASONING SERVICE & MARKETPLACE CERTIFICATION

Updated: 2025-12-19

Reasoning as First-Class OS Primitive

The HUMAN SDK provides AI reasoning as an OS-level service through ctx.reason(). This replaces direct LLM integration and provides:

Automatic model selection (capability-based routing)
Zero configuration (inherits org's keys/policies)
Governance integration (data tier constraints enforced)
Provider abstraction (same API across OpenAI, Anthropic, local models)

See comprehensive specification: 141_reasoning_service_architecture.md

Basic Usage

// Simple reasoning call
const result = await ctx.reason({
  task: "summarize",
  input: document,
  preferences: { latency: "interactive" }
});

// Sugar functions
const summary = await ctx.summarize(text);
const classification = await ctx.classify({ items, labels });

What you DON'T do:

❌ Import OpenAI/Anthropic SDKs
❌ Manage API keys
❌ Handle provider differences
❌ Configure governance rules

Everything inherits from org context automatically.

Agent Manifest: Declaring Reasoning Requirements

# human-agent.yaml
reasoning_requirements:
  capabilities:
    - natural_language
    - classification
  data_tiers:
    - Working
  supported_profiles:
    - standard
    - standard_safety

At runtime: HumanOS matches your requirements to org's available models automatically.

Marketplace Certification Tiers

Agents published to marketplace get certification badges based on portability:

🟢 Verified Portable (Preferred)

Pure capability-based routing
Works on any org's model setup
Featured placement in marketplace
Standard rev share (70% developer / 30% HUMAN)

🟡 Profile Required

Requires specific reasoning profile (e.g., "high_safety")
Works if org has compatible profile
Clear compatibility indicator
Standard rev share

🔴 Model-Specific (Use Sparingly)

Pinned to HUMAN alias (e.g., haio.sonnet_4_5_strict)
Only works if org has that specific model
Lower install rate
Higher HUMAN rev share (60% developer / 40% HUMAN)
Requires justification + manual review

Best Practice: Build Portable

# ✅ GOOD: Portable agent
reasoning_requirements:
  capabilities: ["natural_language", "tools"]
  min_context: 32000

# ⚠️ OK: Profile-specific (if needed)
reasoning_requirements:
  profiles: ["high_safety"]
  reason: "Handles PHI, requires HIPAA-compliant models"

# 🚫 AVOID: Model-specific (only if critical)
reasoning_requirements:
  model_alias: "haio.sonnet_4_5_strict"
  reason: "FDA-certified workflow"

Portable agents install everywhere. Model-specific agents have limited reach.
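
Capability-based matching can be sketched as a subset check plus a context-window check. This is a hedged illustration: OrgModel, the aliases and the satisfies helper are hypothetical, and profiles and data tiers are ignored for brevity.

```typescript
interface ReasoningRequirements {
  capabilities: string[];
  min_context?: number;   // tokens
}

interface OrgModel {
  alias: string;          // made-up alias for illustration
  capabilities: string[];
  context: number;        // context window in tokens
}

// A model qualifies when it offers every required capability and enough context.
function satisfies(model: OrgModel, req: ReasoningRequirements): boolean {
  const hasAll = req.capabilities.every(c => model.capabilities.indexOf(c) !== -1);
  const minContext = req.min_context === undefined ? 0 : req.min_context;
  return hasAll && model.context >= minContext;
}

function compatibleModels(models: OrgModel[], req: ReasoningRequirements): string[] {
  return models.filter(m => satisfies(m, req)).map(m => m.alias);
}

const orgModels: OrgModel[] = [
  { alias: 'org.chat_small', capabilities: ['natural_language'], context: 16000 },
  { alias: 'org.chat_large', capabilities: ['natural_language', 'tools', 'classification'], context: 128000 },
];

// Same requirements as the portable manifest above.
const portable = compatibleModels(orgModels, {
  capabilities: ['natural_language', 'tools'],
  min_context: 32000,
});
```

A portable agent installs wherever this list is non-empty, which is why it does not depend on any one model alias.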

TESTING PATTERNS

The SDK provides testing utilities designed for non-deterministic LLM outputs.

Semantic Assertions

Test that outputs contain expected concepts, not exact strings:

import { test, expectSemantic } from '@human/agent-sdk/testing';

test('analyzeContract returns risk analysis', async () => {
  const result = await analyzeContract({ contract: sampleContract });
  
  // 85% semantic similarity threshold (configurable)
  await expectSemantic(result).toContain([
    'liability clauses',
    'termination rights',
    'risk assessment',
  ]);
  
  // Override threshold for stricter tests
  await expectSemantic(result, { threshold: 0.95 }).toContain([...]);
});

Golden Output Testing

Record and compare against approved outputs:

import { test, recordGolden } from '@human/agent-sdk/testing';

test('analyzeContract matches golden output', async () => {
  const result = await analyzeContract({ contract: sampleContract });
  
  // First run: Records output, marks as "pending review"
  // Subsequent runs: Compares semantically to approved golden
  await recordGolden('analyze-contract', result, {
    semanticSimilarityThreshold: 0.85,
  });
});

Developer workflow:

# First run records output
$ human-agent test
📸 New golden output recorded: analyze-contract.golden.json
   Status: PENDING REVIEW

# Developer reviews and approves
$ human-agent golden approve analyze-contract
✅ Golden output approved by rick@human.com

# CI enforces
$ human-agent test
✅ analyze-contract: 91% similar to golden (threshold: 85%)

Deterministic Mode (CI/CD)

Record real LLM calls and replay in CI:

// In dev: Record mode
beforeAll(() => {
  ctx.llm.setMode('record');  // Calls real LLM, saves responses
});

// In CI: Replay mode (deterministic)
beforeAll(() => {
  ctx.llm.setMode('replay');  // Uses saved responses
});

Fixtures stored in: fixtures/llm-responses/{input-hash}.json

# Refresh fixtures with current LLM
$ human-agent test --refresh-fixtures
🔄 Refreshing 23 fixtures...
✅ Done. Review changes in fixtures/
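
The record/replay idea reduces to a cache keyed by a hash of the input. A minimal sketch under assumptions: FixtureLlmSketch is a hypothetical class, fixtures are kept in memory instead of fixtures/llm-responses/, calls are synchronous, and the hash is a non-cryptographic placeholder.

```typescript
type LlmMode = 'record' | 'replay';

// djb2-style toy hash: NOT cryptographic, used only to derive a fixture key here.
function toyHash(text: string): string {
  let h = 5381;
  for (let i = 0; i < text.length; i++) {
    h = ((h * 33) ^ text.charCodeAt(i)) >>> 0;
  }
  return h.toString(16);
}

class FixtureLlmSketch {
  private fixtures: Record<string, string> = {};

  constructor(private mode: LlmMode, private callModel: (prompt: string) => string) {}

  setMode(mode: LlmMode): void {
    this.mode = mode;
  }

  complete(prompt: string): string {
    const key = toyHash(prompt);
    if (this.mode === 'replay') {
      // Replay never calls the model; a missing fixture is a hard error.
      if (!Object.prototype.hasOwnProperty.call(this.fixtures, key)) {
        throw new Error('No fixture recorded for prompt hash ' + key);
      }
      return this.fixtures[key];
    }
    const response = this.callModel(prompt);   // record: call through and save
    this.fixtures[key] = response;
    return response;
  }
}

let calls = 0;
const llm = new FixtureLlmSketch('record', (prompt) => {
  calls += 1;
  return 'reply #' + calls + ' to ' + prompt;
});

const recorded = llm.complete('Analyze this contract');
llm.setMode('replay');
const replayed = llm.complete('Analyze this contract');
```

Failing loudly on a missing fixture keeps CI deterministic: a changed prompt shows up as a test failure rather than as a silent live call.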

TIME-TRAVEL DEBUGGING

The SDK records execution history for debugging and replay.

Storage Model

Mode	What's Stored	Retention	Replay?
Metadata (default)	Hashes, timing, status	90 days	❌ No
Full Capture (opt-in)	All inputs, outputs, LLM calls	7 days	✅ Yes

Manifest Configuration

# human-agent.yaml
debugging:
  # Default: metadata only (privacy-safe)
  default_retention: metadata_only
  
  # Opt-in: full data for specific handlers
  full_capture:
    handlers:
      - parse_invoice   # Need to debug this one
    environments:
      - development
      - staging
      # Never production unless explicit
    
  # Retention periods
  retention:
    metadata: 90d
    full_data: 7d
    
  # PII handling
  pii:
    mode: redact        # Auto-redact detected PII
    fields: [email, phone, ssn]

Replay Executions

$ human-agent replay exec-abc123

Replaying execution: exec-abc123 (invoice-processor)
Step 1/5: parser.parse        ✅ Completed (234ms)
Step 2/5: validator.check     ✅ Completed (45ms)
Step 3/5: router.route        ❌ Failed: "No accounting dept found"
                              
Paused at step 3. Options:
  [r] Resume    [s] Step    [e] Edit input    [q] Quit

LLM Response Recording

When full capture is enabled, LLM responses are recorded for exact replay:

{
  step: 'analyze_contract',
  llm: {
    provider: 'openai',
    model: 'gpt-4-turbo',
    prompt: 'Analyze this contract...',
    response: 'The contract contains...',
    tokens: { input: 1500, output: 300 },
    cost: 0.02,
  }
}

Provenance vs Debug Data

Provenance (immutable, append-only, kept for the configured retention period):

Cryptographically signed
Hashes only (no raw content)
Proves WHAT happened

Debug data (opt-in, time-limited):

Full content for replay
Auto-expires after retention period
Shows HOW it happened

PROMPT VERSIONING

Prompts are version-controlled in the code repo and published to a runtime registry.

Prompt File Format

<!-- prompts/analyze-contract.md -->
---
id: analyze-contract
version: 2.1.0
description: Analyze legal contracts for risk
author: rick@human.com
---

# Contract Risk Analysis

Analyze the following contract and identify:
1. Liability clauses
2. Termination rights
3. Financial obligations

{{contract}}

Return analysis as JSON with riskLevel, findings, confidence.

CI Publishing

# .github/workflows/prompts.yml
on:
  push:
    paths: ['prompts/**']
    branches: [main]

jobs:
  publish-prompts:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: human-agent prompts publish

Handler Usage

export const analyzeContract = handler({
  id: 'analyze_contract',
  
  // Pin to specific version (recommended for prod)
  prompt: 'prompts/analyze-contract@2.1.0',
  
  async execute(ctx, input) {
    const prompt = await ctx.prompts.load('analyze-contract');
    return ctx.llm.complete({
      prompt: prompt.render({ contract: input.contract })
    });
  }
});
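
Two small mechanics sit behind the handler above: splitting a pinned reference into id and version, and filling `{{placeholders}}` in the prompt body. The sketch below shows one plausible shape for each; parsePromptRef and render are hypothetical helpers, not the behavior of ctx.prompts.

```typescript
// Split a pinned reference such as 'prompts/analyze-contract@2.1.0' into id and version.
function parsePromptRef(ref: string): { id: string; version?: string } {
  const at = ref.lastIndexOf('@');
  const path = at === -1 ? ref : ref.slice(0, at);
  const version = at === -1 ? undefined : ref.slice(at + 1);
  const id = path.slice(path.lastIndexOf('/') + 1);
  return { id, version };
}

// Replace {{name}} placeholders; a missing variable is an error rather than a silent blank.
function render(template: string, vars: Record<string, string>): string {
  return template.replace(/\{\{\s*(\w+)\s*\}\}/g, (_match: string, name: string) => {
    if (!Object.prototype.hasOwnProperty.call(vars, name)) {
      throw new Error('Missing template variable: ' + name);
    }
    return vars[name];
  });
}

const pinned = parsePromptRef('prompts/analyze-contract@2.1.0');
const unpinned = parsePromptRef('analyze-contract');
const body = render('Analyze the following contract:\n{{contract}}', {
  contract: 'Party A agrees to...',
});
```

Throwing on a missing variable is a design choice for this sketch: a prompt sent with an empty slot tends to fail quietly, which is harder to debug than an exception.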

Version Management

# List versions
$ human-agent prompts versions analyze-contract
v2.1.0 (current) - 2025-12-16 - "Added confidence scoring"
v2.0.0           - 2025-12-01 - "Restructured output format"
v1.0.0           - 2025-11-15 - "Initial version"

# Rollback
$ human-agent prompts rollback analyze-contract --to v2.0.0
⚠️  This will update production. Continue? (y/n): y
✅ Rolled back to v2.0.0

# A/B test
$ human-agent prompts test analyze-contract@v2.1.0 --against v2.0.0
Running 50 test cases...
v2.0.0: 85% quality score, $0.015 avg cost
v2.1.0: 91% quality score, $0.018 avg cost
📊 v2.1.0 is 7% better quality, 20% more expensive

INFRASTRUCTURE PROVISIONING

Philosophy: Infrastructure appears when you use it. No configuration required.

Zero-Config Infrastructure

// Developer writes this:
export const processInvoice = handler({
  id: 'process_invoice',
  
  async execute(ctx, input) {
    // Use database → it appears
    await ctx.db.query('invoices', { id: input.id });
    
    // Use cache → it appears
    await ctx.cache.get('recent');
    
    // Use file storage → it appears  
    await ctx.files.write('/reports/latest.pdf', pdf);
    
    // Use queue → it appears
    await ctx.queue.enqueue('process', task);
  }
});

HUMAN auto-provisions:

Database when ctx.db is used
Cache when ctx.cache is used
Storage when ctx.files is used
Queue when ctx.queue is used

No manifest configuration needed. No sizing. No provisioning.

How It Works

┌─────────────────────────────────────────────────────────────────┐
│                    FIRST DEPLOYMENT                             │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  1. SDK analyzes handler code                                  │
│     - "This handler uses ctx.db and ctx.cache"                 │
│                                                                 │
│  2. Runtime provisions required infrastructure                 │
│     - PostgreSQL (right-sized based on usage)                  │
│     - Redis (right-sized based on usage)                       │
│                                                                 │
│  3. Auto-scaling based on observed patterns                    │
│     - DB connections grow/shrink with load                     │
│     - Cache size adjusts to hit rate                           │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
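
Step 1 of the diagram (detecting which ctx primitives a handler uses) could be approximated with a plain text scan. This is a deliberately naive sketch: detectResources is a hypothetical helper, and a real analyzer would more likely walk the AST so that comments and strings are not counted.

```typescript
type Resource = 'database' | 'cache' | 'storage' | 'queue';

// One marker per primitive, following the "HUMAN auto-provisions" list.
const usageMarkers: { marker: RegExp; resource: Resource }[] = [
  { marker: /\bctx\.db\b/, resource: 'database' },
  { marker: /\bctx\.cache\b/, resource: 'cache' },
  { marker: /\bctx\.files\b/, resource: 'storage' },
  { marker: /\bctx\.queue\b/, resource: 'queue' },
];

// Return the resources whose marker appears anywhere in the handler source.
function detectResources(source: string): Resource[] {
  return usageMarkers.filter(u => u.marker.test(source)).map(u => u.resource);
}

const handlerSource = `
  await ctx.db.query('invoices', { id: input.id });
  await ctx.cache.get('recent');
`;

const needed = detectResources(handlerSource);
```

For the sample source this yields the database and cache entries, matching the "This handler uses ctx.db and ctx.cache" line in the diagram.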

Dev Mode

$ human-agent dev

🚀 Starting HUMAN Agent: invoice-processor

📦 Auto-detected infrastructure needs...
  ✅ PostgreSQL (local container)
  ✅ Redis (local container)

🔗 Ready at http://localhost:3001

Production Deployment

$ git push origin main

🚀 Deploying invoice-processor...

📦 Provisioning infrastructure...
  ✅ PostgreSQL (managed, auto-sized)
  ✅ Redis (managed, auto-sized)

✅ Live at https://invoice-processor.agents.human.dev

Optional: Data Residency Override

# human-agent.yaml (only if required)
compliance:
  data_residency: eu    # Keep data in EU

# That's it. HUMAN figures out:
# - Which regions to use
# - Which services comply
# - Replication strategy

Preview Deployments

Every branch gets isolated infrastructure automatically:

$ git push origin fix-invoice-bug

🚀 Preview deployment created!
   URL: https://invoice-processor-pr-42.agents.human.dev
   
   Infrastructure: Isolated (auto-provisioned)
   Data: Synthetic test data (default)
   Auto-delete: On branch merge/close

No seed files. No database snapshots. HUMAN generates realistic test data.

Optional override:

# human-agent.yaml (only if needed)
preview:
  data: staging_snapshot   # Copy from staging (rare)

SCALING & AGENT POOLS

Philosophy: Serverless by default. Scale-to-zero. SLO-driven. No configuration.

For conceptual architecture (logical instances vs physical replicas, workflow-level scaling, queue-based burst handling), see: 55_multi_agent_runtime_architecture.md - Runtime Scaling Architecture section.

This section covers the SDK developer experience for scaling configuration.

Default: Serverless (Scale-to-Zero)

# human-agent.yaml
name: invoice-processor
capabilities: [finance/invoice/process]

# No scaling config needed. Defaults:
# - Serverless (scale-to-zero when idle)
# - Scale up automatically under load
# - Pay only for invocations

Developers don't configure:

❌ min_instances: 2
❌ max_instances: 20
❌ scale_threshold: 10
❌ scale_down_delay: 300

Runtime handles all scaling automatically.

SLO-Driven Scaling

# human-agent.yaml (only if specific latency requirements)
slo:
  latency:
    p99: 200ms    # "Keep p99 under 200ms"

Runtime automatically:

Monitors p50, p95, p99 latency
Scales up when approaching SLO breach
Scales down when over-provisioned
Pre-warms based on traffic prediction

How It Works

┌─────────────────────────────────────────────────────────────────┐
│                    RUNTIME SCALING CONTROLLER                   │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  SLO: p99 < 500ms (default) or p99 < 200ms (if specified)      │
│                                                                 │
│  IDLE STATE:                                                   │
│  └── 0 instances running (scale-to-zero)                       │
│  └── Cold start on first request (~200ms)                      │
│                                                                 │
│  UNDER LOAD:                                                   │
│  └── Observed: p99 = 180ms, SLO = 200ms                       │
│  └── Status: ✅ Healthy (10% headroom)                         │
│  └── Action: Maintain current instances                        │
│                                                                 │
│  APPROACHING SLO:                                              │
│  └── Observed: p99 = 195ms, SLO = 200ms                       │
│  └── Status: ⚠️ Degrading                                      │
│  └── Action: Scale up proactively                              │
│                                                                 │
│  OVER-PROVISIONED:                                             │
│  └── Observed: p99 = 50ms, SLO = 200ms                        │
│  └── Status: 💰 Over-provisioned                               │
│  └── Action: Scale down to save cost                           │
│                                                                 │
│  TRAFFIC PREDICTION:                                           │
│  └── Learned: "Busy Mon-Fri 9am-5pm, quiet weekends"          │
│  └── Action: Pre-warm before predicted peaks                   │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
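
The states in the diagram map onto a simple comparison between observed p99 and the SLO. The thresholds below (scale up at 95% of the SLO, scale down at 50%) are assumptions chosen to reproduce the three example states; scalingAction and headroomPercent are hypothetical helpers, and traffic prediction is not modelled.

```typescript
type ScalingAction = 'maintain' | 'scale_up' | 'scale_down';

// Assumed thresholds, expressed as a fraction of the SLO.
const SCALE_UP_AT = 0.95;    // p99 at or above 95% of SLO: degrading
const SCALE_DOWN_AT = 0.5;   // p99 at or below 50% of SLO: over-provisioned

function scalingAction(observedP99Ms: number, sloP99Ms: number): ScalingAction {
  const utilization = observedP99Ms / sloP99Ms;
  if (utilization >= SCALE_UP_AT) return 'scale_up';
  if (utilization <= SCALE_DOWN_AT) return 'scale_down';
  return 'maintain';
}

// Headroom as a whole percentage of the SLO.
function headroomPercent(observedP99Ms: number, sloP99Ms: number): number {
  return Math.round((1 - observedP99Ms / sloP99Ms) * 100);
}

const healthy = scalingAction(180, 200);        // 10% headroom
const degrading = scalingAction(195, 200);      // approaching the SLO
const overProvisioned = scalingAction(50, 200); // far below the SLO
```

Scaling up before the SLO is breached, rather than at the breach, is what makes the action proactive.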

Agent Pools (Implicit)

When multiple requests arrive, the runtime creates a pool automatically:

Request 1 ──┐
Request 2 ──┼─► ┌──────────────────────────────────┐
Request 3 ──┤   │        AGENT POOL                │
Request 4 ──┤   │  (auto-managed, ephemeral)       │
Request 5 ──┘   │                                  │
                │  Instance A ← Processing Req 1   │
                │  Instance B ← Processing Req 2   │
                │  Instance C ← Processing Req 3   │
                │  (more spawn as needed)          │
                │                                  │
                └──────────────────────────────────┘

Developers don't:

Configure pool size
Manage instance lifecycle
Think about load balancing

Runtime handles:

Instance creation/destruction
Request routing
Health checks
Automatic recovery

Optional: Keep Warm (Rare)

For latency-critical agents where cold starts aren't acceptable:

# human-agent.yaml
slo:
  latency:
    p99: 50ms     # Very strict SLO
    
warmth:
  min_warm: 1     # Always keep 1 instance ready

# Note: This costs more (always-on instance)

Optional: Burst Capacity (Rare)

For known high-traffic events:

# human-agent.yaml
scaling:
  burst_max: 10000  # Handle up to 10k concurrent
  
# Runtime pre-provisions capacity for bursts

State in Vaults, Not Instances

Instances are ephemeral. State lives in Vaults:

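export const conversationalAgent = handler({
  id: 'conversational_agent',

  async execute(ctx, input) {
    const sessionPath = '/sessions/' + ctx.session.id;

    // ✅ State in vault (survives instance death)
    // Assumption: read() yields nothing for a new session, so fall back to an empty history
    const history = (await ctx.vaults.self.read(sessionPath)) ?? [];
    
    // ❌ State in memory (lost on scale-down)
    // globalHistory[sessionId] = messages; // DON'T DO THIS
    
    // Process...
    const updatedHistory = [...history, input];  // illustrative: append the new turn
    
    // Save state
    await ctx.vaults.self.write(sessionPath, updatedHistory);
  }
});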

Cross-Deployment Routing

Agents can call other agents regardless of where they run:

// Agent A (in HUMAN Cloud) calls Agent B (in Acme's VPC)
await ctx.call.agent('agent://acme.invoice-validator', invoice);

// Runtime handles:
// - Registry lookup
// - Cross-network routing
// - Delegation verification
// - Provenance logging

MULTI-LANGUAGE SDK GENERATION

The SDK is auto-generated from protocol definitions for multiple languages.

Protocol Source of Truth

human/
├── protocol/
│   ├── schemas/
│   │   ├── context.proto       # Protocol Buffers
│   │   ├── handler.proto
│   │   └── agents.proto
│   ├── openapi/
│   │   └── human-api.yaml      # OpenAPI 3.1
│   └── json-schemas/
│       └── *.json
│
├── sdks/                        # Auto-generated
│   ├── typescript/              # @human/agent-sdk
│   ├── python/                  # human-agent-sdk
│   ├── go/                      # github.com/human-protocol/agent-sdk-go
│   └── rust/                    # human-agent-sdk (crates.io)

Generated SDK Examples

TypeScript:

import { handler, ExecutionContext } from '@human/agent-sdk';

export const processInvoice = handler({
  id: 'process_invoice',
  async execute(ctx: ExecutionContext, input: { documentId: string }) {
    const analysis = await ctx.llm.complete({ prompt: '...' });
    await ctx.call.agent('agent://...', { data: analysis });
    return analysis;
  }
});

Python:

from human_agent_sdk import handler, ExecutionContext

@handler(id='process_invoice')
async def process_invoice(ctx: ExecutionContext, document_id: str):
    analysis = await ctx.llm.complete(prompt='...')
    await ctx.call.agent(target='agent://...', input={...})
    return analysis

Go:

package main

import human "github.com/human-protocol/agent-sdk-go"

func ProcessInvoice(ctx human.ExecutionContext, input ProcessInvoiceInput) (*Analysis, error) {
    analysis, err := ctx.LLM.Complete(human.CompleteRequest{Prompt: "..."})
    if err != nil { return nil, err }
    
    _, err = ctx.Call.Agent("agent://...", map[string]interface{}{})
    return analysis, err
}

func init() {
    human.RegisterHandler("process_invoice", ProcessInvoice)
}

CI/CD Auto-Generation

# .github/workflows/generate-sdks.yml
on:
  push:
    paths: ['protocol/**']
    branches: [main]

jobs:
  generate-sdks:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Generate SDKs
        run: |
          human-sdk-gen typescript --output sdks/typescript
          human-sdk-gen python --output sdks/python
          human-sdk-gen go --output sdks/go
          human-sdk-gen rust --output sdks/rust
          
      - name: Test all SDKs
        # Subshells keep each `cd` from leaking into the next command
        run: |
          (cd sdks/typescript && npm test)
          (cd sdks/python && pytest)
          (cd sdks/go && go test ./...)
          (cd sdks/rust && cargo test)
          
      - name: Publish
        run: |
          (cd sdks/typescript && npm publish)
          (cd sdks/python && twine upload dist/*)
          # Go auto-proxied by proxy.golang.org
          (cd sdks/rust && cargo publish)

AGENT-READABLE DOCUMENTATION

API documentation is published in formats optimized for both humans and AI agents.

Documentation Formats

docs.human.dev/api/v1.0.0/
├── index.html           # Human-readable (TypeDoc)
├── openapi.yaml          # Machine-readable (OpenAPI 3.1)
├── llms.txt              # LLM-optimized summary
└── context.json          # Structured for AI parsing

LLM-Optimized Summary (`llms.txt`)

# HUMAN Agent SDK - API Reference for LLMs

## ctx.llm
- `ctx.llm.complete({ prompt, tier? })` - Complete a prompt
- `ctx.llm.stream({ prompt })` - Stream completion
- `ctx.llm.embed(text)` - Generate embeddings

## ctx.call
- `ctx.call.agent(target, input)` - Call another agent directly
- `ctx.call.route({ capability, input })` - Capability-based routing (HumanOS decides)
- `ctx.call.withDelegation({ scopes, budget?, expires? })` - Wrap call with specific delegation

## ctx.oversight
- `ctx.oversight.approve({ action, reason, risk })` - Request approval
- `ctx.oversight.decide({ question, options })` - Present decision
- `ctx.oversight.escalate({ why, findings?, recommendation? })` - Full handoff
- `ctx.oversight.notify(message, { urgency?, channel? })` - Non-blocking notification

## ctx.vaults
- `ctx.vaults.self` - Agent's own vault (always accessible)
- `ctx.vaults.list()` - List accessible vaults
- `ctx.vaults.get(uri)` - Get vault handle
- `vault.read(path)`, `vault.write(path, data)`, `vault.list(prefix?)`

## ctx.workforce
- `ctx.workforce.submit({ type, capability, input, instructions })` - Submit to human pool
- `ctx.workforce.await(taskId)` - Wait for completion

## ctx.capabilities
- `ctx.capabilities.find({ capability, minLevel? })` - Find entities with capability
- `ctx.capabilities.mine()` - Get current agent's capabilities

## ctx.secrets
- `ctx.secrets.get(key)` - Get secret (Passport > Vault > Env cascade)

Structured Context (`context.json`)

{
  "sdk_version": "1.0.0",
  "primitives": {
    "ctx.llm": {
      "methods": ["complete", "stream", "embed"],
      "complete": {
        "signature": "complete(options: CompleteOptions): Promise<CompleteResult>",
        "params": {
          "prompt": "string (required)",
          "tier": "fast | balanced | powerful (default: balanced)"
        },
        "returns": "{ content: string, cost: CostInfo }",
        "example": "await ctx.llm.complete({ prompt: 'Summarize...' })"
      }
    }
  }
}
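As a rough illustration of how tooling might consume the structured context, the sketch below renders one method entry as plain text, similar to the CLI lookup output. The `MethodDoc` type and the `describeMethod` helper are hypothetical names introduced here; they are not part of the SDK, and the data is inlined rather than read from `context.json`.

```typescript
// Shape of one documented method in context.json (mirrors the fields shown above).
interface MethodDoc {
  signature: string;
  params: Record<string, string>;
  returns: string;
  example: string;
}

// Subset of context.json, inlined so the sketch runs without file access.
const sdkContext = {
  sdk_version: "1.0.0",
  primitives: {
    "ctx.llm": {
      methods: ["complete", "stream", "embed"],
      complete: {
        signature: "complete(options: CompleteOptions): Promise<CompleteResult>",
        params: {
          prompt: "string (required)",
          tier: "fast | balanced | powerful (default: balanced)",
        },
        returns: "{ content: string, cost: CostInfo }",
        example: "await ctx.llm.complete({ prompt: 'Summarize...' })",
      } as MethodDoc,
    },
  },
};

// Hypothetical helper: render one method entry as a short plain-text reference.
function describeMethod(primitive: string, method: string, doc: MethodDoc): string {
  const paramLines = Object.keys(doc.params).map(
    (paramName) => `  ${paramName}: ${doc.params[paramName]}`
  );
  return [
    `${primitive}.${method}`,
    "Options:",
    ...paramLines,
    "Returns:",
    `  ${doc.returns}`,
  ].join("\n");
}

const rendered = describeMethod(
  "ctx.llm",
  "complete",
  sdkContext.primitives["ctx.llm"].complete
);
console.log(rendered);
```

Keeping the machine-readable entry as the single source and deriving the text views from it is one way to keep `llms.txt`, `context.json`, and the CLI output from drifting apart.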

CLI Docs Lookup

$ human-agent docs ctx.llm.complete

ctx.llm.complete(options)

Complete a prompt using an auto-routed LLM.

Options:
  prompt: string (required) - The prompt to complete
  tier: 'fast' | 'balanced' | 'powerful' - Model tier (default: balanced)

Returns:
  { content: string, cost: CostInfo }

Example:
  const result = await ctx.llm.complete({
    prompt: 'Summarize this document...',
    tier: 'powerful'
  });

Agent Access to Docs

Agents can query documentation via MCP or API:

// Companion helping a developer
const docs = await mcp.call('human-sdk-docs', {
  query: 'ctx.call.agent',
  format: 'structured'
});

SDK ARCHITECTURE

Core Package Structure

@human/agent-sdk/
├── core/
│   ├── agent.ts              # Base agent class
│   ├── identity.ts           # Passport binding
│   ├── delegation.ts         # Authority management
│   ├── memory.ts             # Vault-bound memory
│   └── lifecycle.ts          # Agent lifecycle management
│
├── muscles/
│   ├── interface.ts          # Muscle base interface
│   ├── registry.ts           # Muscle registration
│   ├── authorization.ts      # Permission checking
│   └── audit.ts              # Action logging
│
├── safety/
│   ├── boundaries.ts         # Safety boundary definitions
│   ├── escalation.ts         # Human escalation triggers
│   ├── guardrails.ts         # Action guardrails
│   └── monitoring.ts         # Safety monitoring
│
├── orchestration/
│   ├── router.ts             # Multi-agent routing
│   ├── handoff.ts            # Agent-to-human handoffs
│   ├── coordination.ts       # Multi-agent coordination
│   └── provenance.ts         # Decision provenance
│
└── integrations/
    ├── humanos.ts            # HumanOS integration
    ├── passport.ts           # Passport API client
    ├── capability-graph.ts   # Capability verification
    └── workforce.ts          # Workforce Cloud integration

@human/connector-sdk/            # Separate package for connectors
├── core/
│   ├── connector.ts          # Base connector interface
│   ├── registry.ts           # Connector registration
│   ├── credentials.ts        # Credential management
│   └── testing.ts            # Test harness for connectors
│
├── interfaces/
│   ├── calendar.ts           # CalendarConnector interface
│   ├── videoconf.ts          # VideoConfConnector interface
│   ├── transcription.ts      # TranscriptionConnector interface
│   ├── scheduling.ts         # SchedulingConnector interface
│   ├── notes.ts              # NotesConnector interface
│   ├── tasks.ts              # TasksConnector interface
│   └── communication.ts      # CommunicationConnector interface
│
├── helpers/
│   ├── oauth.ts              # OAuth 2.0 helpers
│   ├── webhook.ts            # Webhook helpers
│   └── retry.ts              # Retry logic
│
└── templates/                 # Starter templates for new connectors
    ├── calendar/
    ├── videoconf/
    └── generic/

Relationship: Agents → Muscles → Connectors

┌─────────────────────────────────────────────────────────────────┐
│                          AGENT                                   │
│                    (e.g., MeetingFacilitator)                   │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│  Uses abstract capabilities via MUSCLES                          │
│                                                                  │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐          │
│  │   Calendar   │  │  VideoConf   │  │    Notes     │          │
│  │    Muscle    │  │    Muscle    │  │    Muscle    │          │
│  └──────┬───────┘  └──────┬───────┘  └──────┬───────┘          │
│         │                 │                 │                    │
├─────────┼─────────────────┼─────────────────┼────────────────────┤
│         │                 │                 │                    │
│  Muscles delegate to platform-agnostic CONNECTOR INTERFACES      │
│         │                 │                 │                    │
│  ┌──────▼───────┐  ┌──────▼───────┐  ┌──────▼───────┐          │
│  │  Calendar    │  │  VideoConf   │  │    Notes     │          │
│  │  Connector   │  │  Connector   │  │  Connector   │          │
│  │  Interface   │  │  Interface   │  │  Interface   │          │
│  └──────┬───────┘  └──────┬───────┘  └──────┬───────┘          │
│         │                 │                 │                    │
├─────────┼─────────────────┼─────────────────┼────────────────────┤
│         │                 │                 │                    │
│  User configures which VENDOR CONNECTORS to use                  │
│         │                 │                 │                    │
│    ┌────┴────┐       ┌────┴────┐       ┌────┴────┐              │
│    │ Google  │       │  Zoom   │       │ Notion  │              │
│    │Calendar │       │Connector│       │Connector│              │
│    └─────────┘       └─────────┘       └─────────┘              │
│    ┌─────────┐       ┌─────────┐       ┌─────────┐              │
│    │Outlook  │       │ Google  │       │Obsidian │              │
│    │   365   │       │  Meet   │       │Connector│              │
│    └─────────┘       └─────────┘       └─────────┘              │
│                                                                  │
└─────────────────────────────────────────────────────────────────┘

KEY PRINCIPLE: Agents and muscles are vendor-agnostic.
              Connectors are vendor-specific.
              Users choose their connectors.
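The layering above can be sketched in a few lines. This is a simplified illustration, not the SDK's actual classes: `CalendarConnectorSketch`, the two vendor classes, and `muscleForUser` are hypothetical names, and the connectors return canned values instead of calling vendor APIs.

```typescript
// Vendor-agnostic connector contract (a simplified stand-in for CalendarConnector).
interface CalendarConnectorSketch {
  readonly vendor: string;
  createEvent(title: string, startIso: string): { eventId: string; vendor: string };
}

// Vendor-specific connectors. Real ones would call the vendor API;
// these return canned values so the sketch runs offline.
class GoogleCalendarSketch implements CalendarConnectorSketch {
  readonly vendor: string = "google";
  createEvent(title: string, startIso: string) {
    return { eventId: `g-${title}-${startIso}`, vendor: this.vendor };
  }
}

class OutlookCalendarSketch implements CalendarConnectorSketch {
  readonly vendor: string = "outlook";
  createEvent(title: string, startIso: string) {
    return { eventId: `o-${title}-${startIso}`, vendor: this.vendor };
  }
}

// The muscle depends only on the interface, never on a vendor class.
class CalendarMuscleSketch {
  constructor(private readonly connector: CalendarConnectorSketch) {}

  schedule(title: string, startIso: string) {
    return this.connector.createEvent(title, startIso);
  }
}

// User configuration decides which vendor connector backs the muscle.
const connectorsByVendor: Record<string, CalendarConnectorSketch> = {
  google: new GoogleCalendarSketch(),
  outlook: new OutlookCalendarSketch(),
};

function muscleForUser(preferredVendor: string): CalendarMuscleSketch {
  const connector = connectorsByVendor[preferredVendor];
  if (!connector) {
    throw new Error(`No connector configured for vendor: ${preferredVendor}`);
  }
  return new CalendarMuscleSketch(connector);
}

const booked = muscleForUser("outlook").schedule("Standup", "2025-01-06T09:00:00Z");
console.log(booked.vendor);
```

Swapping Outlook for Google changes only the configuration lookup; the agent and the muscle code stay untouched.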

Base Agent Interface

// @human/agent-sdk/core/agent.ts

import type { PassportId, DelegationScope } from "@human/passport-domain";
import type { VaultRef } from "@human/vault";
import type { AuditLogger } from "../muscles/audit";
import type { MuscleRegistry } from "../muscles/registry";
import type { BoundaryPolicy } from "../safety/boundaries";

/**
 * Base interface for all HAIO-compliant agents
 */
export interface HumanAgent {
  /** Unique agent identifier */
  readonly agentId: string;
  
  /** Human-readable agent name */
  readonly name: string;
  
  /** Agent version */
  readonly version: string;

  // ═══════════════════════════════════════════════════════════════
  // IDENTITY & AUTHORITY
  // ═══════════════════════════════════════════════════════════════
  
  /**
   * The Passport this agent operates on behalf of.
   * All actions are attributed to this identity.
   */
  readonly passport: PassportBinding;
  
  /**
   * Delegations granted to this agent.
   * Defines what the agent is authorized to do.
   */
  readonly delegations: DelegationScope[];
  
  /**
   * Check if agent has a specific delegation.
   */
  hasAuthority(scope: DelegationScope): boolean;
  
  /**
   * Request additional delegation from the user.
   */
  requestDelegation(scope: DelegationScope, reason: string): Promise<DelegationResult>;

  // ═══════════════════════════════════════════════════════════════
  // MEMORY & STATE
  // ═══════════════════════════════════════════════════════════════
  
  /**
   * Vault reference for persistent memory.
   * Agent memory is owned by the Passport, not the agent.
   */
  readonly vault: VaultRef;
  
  /**
   * Working memory for current conversation/session.
   */
  readonly workingMemory: WorkingMemory;
  
  /**
   * Save state to vault.
   */
  persistMemory(): Promise<void>;
  
  /**
   * Load state from vault.
   */
  restoreMemory(): Promise<void>;

  // ═══════════════════════════════════════════════════════════════
  // CAPABILITIES (MUSCLES)
  // ═══════════════════════════════════════════════════════════════
  
  /**
   * Registry of muscles available to this agent.
   */
  readonly muscles: MuscleRegistry;
  
  /**
   * Execute a muscle action.
   * Automatically checks authorization and logs action.
   */
  executeAction<T>(
    muscleId: string,
    action: string,
    params: Record<string, unknown>
  ): Promise<ActionResult<T>>;

  // ═══════════════════════════════════════════════════════════════
  // SAFETY & BOUNDARIES
  // ═══════════════════════════════════════════════════════════════
  
  /**
   * Boundary policies that constrain agent behavior.
   */
  readonly boundaries: BoundaryPolicy;
  
  /**
   * Audit logger for all agent actions.
   */
  readonly auditLog: AuditLogger;
  
  /**
   * Escalate to human when agent cannot proceed safely.
   */
  escalate(reason: EscalationReason, context: EscalationContext): Promise<void>;
  
  /**
   * Hand off to another agent or human.
   */
  handoff(to: PassportId | string, context: HandoffContext): Promise<void>;

  // ═══════════════════════════════════════════════════════════════
  // LIFECYCLE
  // ═══════════════════════════════════════════════════════════════
  
  /**
   * Initialize agent (called on startup).
   */
  initialize(): Promise<void>;
  
  /**
   * Process a message/request.
   */
  process(input: AgentInput): Promise<AgentOutput>;
  
  /**
   * Shutdown agent gracefully.
   */
  shutdown(): Promise<void>;
}

/**
 * Passport binding for agent identity
 */
export interface PassportBinding {
  /** The Passport ID the agent operates under */
  passportId: PassportId;
  
  /** The type of entity (Human, LegalEntity, AgentFuture) */
  personType: "Human" | "LegalEntity" | "AgentFuture";
  
  /** Whether this is the owner or a delegate */
  bindingType: "owner" | "delegate";
  
  /** If delegate, who granted the delegation */
  delegatedBy?: PassportId;
  
  /** Expiration of the binding */
  expiresAt?: Date;
}

/**
 * Working memory for session state
 */
export interface WorkingMemory {
  /** Conversation history */
  conversation: ConversationTurn[];
  
  /** Current context/state */
  context: Record<string, unknown>;
  
  /** Pending actions */
  pendingActions: PendingAction[];
  
  /** Clear working memory */
  clear(): void;
  
  /** Add to conversation */
  addTurn(turn: ConversationTurn): void;
}
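A minimal in-memory sketch of `WorkingMemory` and of a binding expiry check may help make the contracts above concrete. `ConversationTurnSketch` and `PendingActionSketch` are hypothetical shapes (the document does not define `ConversationTurn` or `PendingAction`), and `isBindingActive` is an illustrative helper, not an SDK function.

```typescript
// Hypothetical minimal shapes; the real ConversationTurn and PendingAction
// types are not defined in this document.
interface ConversationTurnSketch {
  role: "user" | "agent";
  content: string;
}

interface PendingActionSketch {
  muscleId: string;
  action: string;
}

// In-memory working memory: session state only, nothing is persisted to a vault.
class InMemoryWorkingMemory {
  conversation: ConversationTurnSketch[] = [];
  context: Record<string, unknown> = {};
  pendingActions: PendingActionSketch[] = [];

  addTurn(turn: ConversationTurnSketch): void {
    this.conversation.push(turn);
  }

  clear(): void {
    this.conversation = [];
    this.context = {};
    this.pendingActions = [];
  }
}

// A binding without expiresAt never expires; otherwise it is active
// only while the expiry lies strictly in the future.
function isBindingActive(binding: { expiresAt?: Date }, now: Date): boolean {
  return binding.expiresAt === undefined || binding.expiresAt.getTime() > now.getTime();
}

const sessionMemory = new InMemoryWorkingMemory();
sessionMemory.addTurn({ role: "user", content: "Schedule a meeting with Mike" });
sessionMemory.context["timezone"] = "UTC";
console.log(sessionMemory.conversation.length);
```

Anything that must survive the session would go through `persistMemory()` to the vault; the working memory here is deliberately throwaway.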

Muscle Interface

// @human/agent-sdk/muscles/interface.ts

import type { PassportId, DelegationScope } from "@human/passport-domain";
import type { AuditLogger } from "./audit";

/**
 * Base interface for all muscles (agent capabilities)
 */
export interface Muscle {
  /** Unique muscle identifier */
  readonly muscleId: string;
  
  /** Human-readable name */
  readonly name: string;
  
  /** Description of what this muscle does */
  readonly description: string;
  
  /** Delegations required to use this muscle */
  readonly requiredDelegations: DelegationScope[];
  
  /** Actions available on this muscle */
  readonly actions: MuscleAction[];
  
  /** Audit logger for muscle actions */
  readonly auditLog: AuditLogger;
  
  /**
   * Check if the given passport has authority to use this muscle.
   */
  checkAuthorization(
    actor: PassportId,
    action: string,
    delegations: DelegationScope[]
  ): Promise<AuthorizationResult>;
  
  /**
   * Execute an action on this muscle.
   */
  execute<T>(
    action: string,
    params: Record<string, unknown>,
    context: ExecutionContext
  ): Promise<ActionResult<T>>;
}

/**
 * Defines a single action on a muscle
 */
export interface MuscleAction {
  /** Action identifier */
  id: string;
  
  /** Human-readable name */
  name: string;
  
  /** Description */
  description: string;
  
  /** Required delegation scope */
  requiredScope: DelegationScope;
  
  /** Parameter schema (JSON Schema) */
  parameters: JSONSchema;
  
  /** Return type schema */
  returns: JSONSchema;
  
  /** Whether this action requires human confirmation */
  requiresConfirmation: boolean;
  
  /** Risk level for safety boundaries */
  riskLevel: "low" | "medium" | "high" | "critical";
}

/**
 * Result of a muscle action
 */
export interface ActionResult<T> {
  success: boolean;
  data?: T;
  error?: ActionError;
  
  /** Audit trail for this action */
  audit: {
    actionId: string;
    muscleId: string;
    action: string;
    actor: PassportId;
    timestamp: Date;
    duration: number;
    authorized: boolean;
  };
}
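The sketch below shows one way the "check authorization, run, record audit" flow could fit together. It is synchronous and heavily simplified so it runs as written; the SDK interfaces above are asynchronous. `runAuthorized`, `ActionDef`, and the string-based `Scope` are assumptions made for illustration.

```typescript
type Scope = string; // assumption: scopes are plain strings such as "research.read"

interface ActionDef {
  id: string;
  requiredScope: Scope;
}

interface ActionResultSketch<T> {
  success: boolean;
  data?: T;
  error?: string;
  audit: {
    actionId: string;
    muscleId: string;
    action: string;
    actor: string;
    timestamp: Date;
    duration: number; // milliseconds
    authorized: boolean;
  };
}

let auditCounter = 0;

// Hypothetical wrapper: refuses to run the implementation without the required
// scope, and always returns an audit record, whether the action ran or not.
function runAuthorized<T>(
  muscleId: string,
  actionDef: ActionDef,
  actor: string,
  granted: Scope[],
  impl: () => T
): ActionResultSketch<T> {
  const started = Date.now();
  const authorized = granted.indexOf(actionDef.requiredScope) !== -1;
  const makeAudit = () => ({
    actionId: `act-${++auditCounter}`,
    muscleId,
    action: actionDef.id,
    actor,
    timestamp: new Date(started),
    duration: Date.now() - started,
    authorized,
  });

  if (!authorized) {
    return {
      success: false,
      error: `Missing delegation: ${actionDef.requiredScope}`,
      audit: makeAudit(),
    };
  }
  try {
    const data = impl();
    return { success: true, data, audit: makeAudit() };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { success: false, error: message, audit: makeAudit() };
  }
}

const searchAction: ActionDef = { id: "search", requiredScope: "research.read" };
const searchOutcome = runAuthorized(
  "research", searchAction, "passport:user:rick", ["research.read"], () => ["result-1"]
);
console.log(searchOutcome.success, searchOutcome.audit.authorized);
```

Returning the audit record even on denial means a refused action still leaves a trace, which is the property the `audit` field on `ActionResult` is there to provide.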

Safety Boundaries

// @human/agent-sdk/safety/boundaries.ts

import type { PassportId } from "@human/passport-domain";

/**
 * Defines safety boundaries for agent behavior
 */
export interface BoundaryPolicy {
  /** Rate limits: maximum actions per minute, hour, and day */
  rateLimits: {
    actionsPerMinute: number;
    actionsPerHour: number;
    actionsPerDay: number;
  };
  
  /** Spending limits (if applicable) */
  spendingLimits?: {
    perAction: number;
    perDay: number;
    perMonth: number;
    currency: string;
  };
  
  /** Actions that always require human confirmation */
  requireConfirmation: string[];
  
  /** Actions that are completely forbidden */
  forbidden: string[];
  
  /** Time windows when agent can operate */
  operatingHours?: {
    timezone: string;
    windows: TimeWindow[];
  };
  
  /** Escalation triggers */
  escalationTriggers: EscalationTrigger[];
}

/**
 * Defines when to escalate to human
 */
export interface EscalationTrigger {
  /** Trigger identifier */
  id: string;
  
  /** Condition that triggers escalation */
  condition: EscalationCondition;
  
  /** Who to escalate to */
  escalateTo: PassportId | "owner" | "any_human";
  
  /** Priority of escalation */
  priority: "low" | "medium" | "high" | "critical";
  
  /** Maximum time to wait for human response */
  timeout?: Duration;
  
  /** What to do if timeout reached */
  timeoutAction: "retry" | "abort" | "proceed_with_caution";
}

/**
 * Conditions that can trigger escalation
 */
export type EscalationCondition =
  | { type: "uncertainty"; threshold: number }      // Agent confidence below threshold
  | { type: "risk_level"; level: "high" | "critical" }
  | { type: "spending"; amount: number }
  | { type: "pattern"; pattern: string }            // Regex match on action
  | { type: "consecutive_errors"; count: number }
  | { type: "explicit_request" }                    // Agent decides to escalate
  | { type: "policy_violation"; policyId: string }
  | { type: "custom"; evaluator: (context: EscalationContext) => boolean };
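To make the policy types more tangible, here is a small sketch of how a runtime might evaluate them. It covers a subset of `EscalationCondition` (the `custom`, `explicit_request`, and `policy_violation` variants are left out), and `ActionSnapshot`, `conditionMet`, and `gateAction` are hypothetical names. The comparison rules (for example, spending escalates at or above the amount) are assumptions, since the document does not state them.

```typescript
// Simplified subset of EscalationCondition.
type ConditionSketch =
  | { type: "uncertainty"; threshold: number }
  | { type: "risk_level"; level: "high" | "critical" }
  | { type: "spending"; amount: number }
  | { type: "pattern"; pattern: string }
  | { type: "consecutive_errors"; count: number };

// Hypothetical snapshot of what the runtime knows when an action is about to run.
interface ActionSnapshot {
  action: string;
  confidence: number; // 0..1
  riskLevel: "low" | "medium" | "high" | "critical";
  spend: number;
  consecutiveErrors: number;
}

const RISK_ORDER: string[] = ["low", "medium", "high", "critical"];

function conditionMet(condition: ConditionSketch, snap: ActionSnapshot): boolean {
  switch (condition.type) {
    case "uncertainty":
      return snap.confidence < condition.threshold;
    case "risk_level":
      // Assumption: a trigger on "high" also fires for "critical".
      return RISK_ORDER.indexOf(snap.riskLevel) >= RISK_ORDER.indexOf(condition.level);
    case "spending":
      return snap.spend >= condition.amount;
    case "pattern":
      return new RegExp(condition.pattern).test(snap.action);
    case "consecutive_errors":
      return snap.consecutiveErrors >= condition.count;
  }
}

type Gate = "forbidden" | "needs_confirmation" | "allowed";

// Forbidden wins over confirmation when an action appears in both lists.
function gateAction(
  action: string,
  policy: { forbidden: string[]; requireConfirmation: string[] }
): Gate {
  if (policy.forbidden.indexOf(action) !== -1) return "forbidden";
  if (policy.requireConfirmation.indexOf(action) !== -1) return "needs_confirmation";
  return "allowed";
}

const sampleSnapshot: ActionSnapshot = {
  action: "calendar.conflict_detected",
  confidence: 0.4,
  riskLevel: "medium",
  spend: 20,
  consecutiveErrors: 1,
};
const samplePolicy = {
  forbidden: ["calendar.delete_all"],
  requireConfirmation: ["calendar.delete"],
};
console.log(gateAction("calendar.delete", samplePolicy));
```

Checking the static lists first and the escalation conditions second keeps the cheap, deterministic rules ahead of anything that depends on runtime state.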

Multi-Agent Coordination

// @human/agent-sdk/orchestration/coordination.ts

import type { HumanAgent } from "../core/agent";
import type { PassportId } from "@human/passport-domain";

/**
 * Coordinates multiple agents working together
 */
export interface AgentCoordinator {
  /** Register an agent with the coordinator */
  register(agent: HumanAgent): Promise<void>;
  
  /** Unregister an agent */
  unregister(agentId: string): Promise<void>;
  
  /** Route a request to the appropriate agent */
  route(input: AgentInput, context: RoutingContext): Promise<RoutingDecision>;
  
  /** Hand off from one agent to another */
  handoff(
    from: HumanAgent,
    to: string | PassportId,
    context: HandoffContext
  ): Promise<HandoffResult>;
  
  /** Broadcast a message to multiple agents */
  broadcast(message: CoordinationMessage, targets: string[]): Promise<void>;
  
  /** Get status of all registered agents */
  getStatus(): Promise<AgentStatus[]>;
}

/**
 * Routing decision for incoming requests
 */
export interface RoutingDecision {
  /** Selected agent ID */
  agentId: string;
  
  /** Confidence in this routing */
  confidence: number;
  
  /** Reasoning for selection */
  reasoning: string;
  
  /** Alternative agents that could handle this */
  alternatives: Array<{
    agentId: string;
    confidence: number;
  }>;
}

/**
 * Context for handing off between agents
 */
export interface HandoffContext {
  /** Reason for handoff */
  reason: string;
  
  /** Conversation history to transfer */
  conversation: ConversationTurn[];
  
  /** Relevant context/state */
  context: Record<string, unknown>;
  
  /** Pending actions to transfer */
  pendingActions: PendingAction[];
  
  /** Whether receiving agent can hand back */
  allowHandback: boolean;
}
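A capability-based router that produces a `RoutingDecision`-shaped result could look roughly like this. The scoring rule (fraction of requested capabilities an agent covers) is a naive assumption chosen for illustration, and `AgentProfile`, `scoreAgent`, and `routeByCapability` are hypothetical names.

```typescript
interface AgentProfile {
  agentId: string;
  capabilities: string[];
}

interface RoutingDecisionSketch {
  agentId: string;
  confidence: number;
  reasoning: string;
  alternatives: Array<{ agentId: string; confidence: number }>;
}

// Naive score: fraction of the requested capabilities that the agent covers.
function scoreAgent(agent: AgentProfile, wanted: string[]): number {
  if (wanted.length === 0) return 0;
  const hits = wanted.filter((cap) => agent.capabilities.indexOf(cap) !== -1).length;
  return hits / wanted.length;
}

function routeByCapability(
  agents: AgentProfile[],
  wanted: string[],
  defaultAgentId: string
): RoutingDecisionSketch {
  const ranked = agents
    .map((agent) => ({ agentId: agent.agentId, confidence: scoreAgent(agent, wanted) }))
    .filter((entry) => entry.confidence > 0)
    .sort((a, b) => b.confidence - a.confidence);

  if (ranked.length === 0) {
    return {
      agentId: defaultAgentId,
      confidence: 0,
      reasoning: "No registered agent matched; falling back to the default agent",
      alternatives: [],
    };
  }

  const best = ranked[0];
  return {
    agentId: best.agentId,
    confidence: best.confidence,
    reasoning: `Covers ${Math.round(best.confidence * 100)}% of requested capabilities`,
    alternatives: ranked.slice(1),
  };
}

const demoAgents: AgentProfile[] = [
  { agentId: "scheduler", capabilities: ["calendar.write"] },
  { agentId: "research", capabilities: ["research.read", "research.summarize"] },
];
const decision = routeByCapability(
  demoAgents,
  ["research.read", "research.summarize", "calendar.write"],
  "scheduler"
);
console.log(decision.agentId, decision.reasoning);
```

The `alternatives` list is what makes a later handoff cheap: the runner-up for a mixed request is a natural candidate for the remaining work.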

USAGE EXAMPLES

Creating a Simple Agent

import { createAgent } from "@human/agent-sdk";
// Value import (not `import type`): CalendarMuscle is instantiated with `new` below
import { CalendarMuscle } from "@human/agent-muscles/calendar";

// Define a meeting scheduler agent
const schedulerAgent = createAgent({
  name: "Meeting Scheduler",
  version: "1.0.0",
  
  // Bind to a Passport
  passport: {
    passportId: "passport:human:corp",
    personType: "LegalEntity",
    bindingType: "delegate",
  },
  
  // Define required delegations
  delegations: [
    "calendar.read",
    "calendar.write",
    "notification.send",
  ],
  
  // Configure muscles
  muscles: {
    calendar: new CalendarMuscle({
      providers: ["google", "microsoft"],
    }),
  },
  
  // Define safety boundaries
  boundaries: {
    rateLimits: {
      actionsPerMinute: 10,
      actionsPerHour: 100,
      actionsPerDay: 500,
    },
    requireConfirmation: ["calendar.delete"],
    forbidden: ["calendar.delete_all"],
    escalationTriggers: [
      {
        id: "double-booking",
        condition: { type: "pattern", pattern: "conflict_detected" },
        escalateTo: "owner",
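        priority: "medium",
        // timeoutAction is required by EscalationTrigger; "abort" is chosen here as the conservative option
        timeoutAction: "abort",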
      },
    ],
  },
});

// Process a request
const result = await schedulerAgent.process({
  type: "message",
  content: "Schedule a meeting with Mike tomorrow at 2pm",
  from: "passport:user:rick",
});

Creating a Custom Muscle

import { defineMuscle, type Muscle, type MuscleAction } from "@human/agent-sdk";

// Define a custom research muscle
export const researchMuscle = defineMuscle({
  muscleId: "research",
  name: "Research Assistant",
  description: "Performs research tasks using various sources",
  
  requiredDelegations: ["research.read", "research.summarize"],
  
  actions: [
    {
      id: "search",
      name: "Search",
      description: "Search for information on a topic",
      requiredScope: "research.read",
      parameters: {
        type: "object",
        properties: {
          query: { type: "string" },
          sources: { type: "array", items: { type: "string" } },
          maxResults: { type: "number", default: 10 },
        },
        required: ["query"],
      },
      returns: {
        type: "array",
        items: { $ref: "#/definitions/SearchResult" },
      },
      requiresConfirmation: false,
      riskLevel: "low",
    },
    {
      id: "summarize",
      name: "Summarize",
      description: "Summarize research findings",
      requiredScope: "research.summarize",
      parameters: {
        type: "object",
        properties: {
          content: { type: "string" },
          style: { type: "string", enum: ["brief", "detailed", "executive"] },
        },
        required: ["content"],
      },
      returns: { type: "string" },
      requiresConfirmation: false,
      riskLevel: "low",
    },
  ],
  
  // Implementation (performSearch and performSummarize are helper methods not shown in this example)
  async execute(action, params, context) {
    switch (action) {
      case "search":
        return this.performSearch(params.query, params.sources, params.maxResults);
      case "summarize":
        return this.performSummarize(params.content, params.style);
      default:
        throw new Error(`Unknown action: ${action}`);
    }
  },
});

Multi-Agent Coordination

import { createCoordinator, createAgent } from "@human/agent-sdk";

// Create specialized agents
const schedulerAgent = createAgent({ /* ... */ });
const researchAgent = createAgent({ /* ... */ });
const documentAgent = createAgent({ /* ... */ });

// Create coordinator
const coordinator = createCoordinator({
  routingStrategy: "capability-based",
  defaultAgent: schedulerAgent.agentId,
});

// Register agents
await coordinator.register(schedulerAgent);
await coordinator.register(researchAgent);
await coordinator.register(documentAgent);

// Route incoming requests
const input = {
  type: "message",
  content: "Research the latest trends in AI safety and schedule a meeting to discuss",
  from: "passport:user:rick",
};

// Coordinator decides: research first, then schedule
const routing = await coordinator.route(input, {
  preferredAgents: [],
  context: {},
});

// Execute with handoff
const researchResult = await researchAgent.process({
  ...input,
  content: "Research the latest trends in AI safety",
});

// The handoff target is an agent ID or Passport ID, not an agent instance
await coordinator.handoff(researchAgent, schedulerAgent.agentId, {
  reason: "Research complete, scheduling meeting",
  conversation: researchResult.conversation,
  context: { researchFindings: researchResult.data },
  pendingActions: [],
  allowHandback: false,
});

INTEROP SDK: BUILDING HUMAN ADAPTERS FOR EXTERNAL PLATFORMS

Critical Strategic Capability:

The Agent SDK includes patterns for wrapping and governing agents built on other platforms (n8n, LangChain, OpenAI Assistants, etc.) without requiring rewrites.

This matters for adoption: enterprises can bring existing agents under HAIO identity, delegation checks, and ledger logging while keeping their current agent investments.

The Adapter Interface

// @human/agent-sdk/interop/adapter.ts

import type { PassportId } from "@human/passport-domain";

/**
 * Base interface for platform adapters
 */
export interface PlatformAdapter {
  /** Platform identifier */
  readonly platformId: string;
  
  /** Platform name */
  readonly platformName: string;
  
  /** Identity adapter */
  identity: IdentityAdapter;
  
  /** Delegation adapter */
  delegation: DelegationAdapter;
  
  /** Event/logging adapter */
  events: EventAdapter;
  
  /** Policy hooks */
  policy: PolicyHooks;
}

/**
 * Maps external identities to Passport DIDs
 */
export interface IdentityAdapter {
  /**
   * Map external user ID to Passport DID
   */
  mapUserToPassport(externalUserId: string): Promise<PassportId>;
  
  /**
   * Map external agent to HUMAN agent identity
   */
  mapAgent(externalAgentId: string): Promise<{
    agentDid: PassportId;
    principalDid: PassportId;  // Who does it act for?
  }>;
  
  /**
   * Create bidirectional mapping
   */
  establishMapping(external: string, passport: PassportId): Promise<void>;
}

/**
 * Checks delegations before external agent actions
 */
export interface DelegationAdapter {
  /**
   * Check if action is allowed under current delegations
   */
  checkDelegation(params: {
    agentDid: PassportId;
    action: string;
    context: Record<string, any>;
  }): Promise<DelegationDecision>;
  
  /**
   * Get required capabilities for an action
   */
  getRequiredCapabilities(action: string): string[];
}

/**
 * Streams events to HUMAN ledger
 */
export interface EventAdapter {
  /**
   * Log external agent action to HUMAN ledger
   */
  logEvent(event: ExternalAgentEvent): Promise<void>;
  
  /**
   * Stream events in real-time
   */
  streamEvents(handler: (event: ExternalAgentEvent) => Promise<void>): void;
}

/**
 * HAIO policy enforcement hooks
 */
export interface PolicyHooks {
  /**
   * Called when policy is violated
   */
  onPolicyViolation(params: {
    agentDid: PassportId;
    action: string;
    violation: string;
  }): Promise<void>;
  
  /**
   * Request human approval mid-execution
   */
  requestApproval(params: {
    agentDid: PassportId;
    action: string;
    context: Record<string, any>;
    requiredCapability: string;
  }): Promise<ApprovalResult>;
  
  /**
   * Cancel ongoing execution
   */
  cancelExecution(agentDid: PassportId, reason: string): Promise<void>;
}
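The bidirectional mapping behind `IdentityAdapter` can be sketched with two lookup tables. This in-memory version is synchronous and purely illustrative: `InMemoryIdentityMap` and `PassportIdSketch` are hypothetical, and a real adapter would persist mappings through HUMAN rather than hold them in process memory.

```typescript
type PassportIdSketch = string; // assumption: Passport IDs are DID-like strings

class InMemoryIdentityMap {
  // Object.create(null) avoids accidental hits on inherited keys such as "constructor".
  private externalToPassport: Record<string, PassportIdSketch> = Object.create(null);
  private passportToExternal: Record<string, string> = Object.create(null);

  constructor(private readonly externalSystem: string) {}

  // Records both directions; refuses to silently remap an external ID.
  establishMapping(externalId: string, passport: PassportIdSketch): void {
    const existing = this.externalToPassport[externalId];
    if (existing !== undefined && existing !== passport) {
      throw new Error(
        `${this.externalSystem}:${externalId} is already mapped to ${existing}`
      );
    }
    this.externalToPassport[externalId] = passport;
    this.passportToExternal[passport] = externalId;
  }

  toPassport(externalId: string): PassportIdSketch | undefined {
    return this.externalToPassport[externalId];
  }

  toExternal(passport: PassportIdSketch): string | undefined {
    return this.passportToExternal[passport];
  }
}

const n8nIdentities = new InMemoryIdentityMap("n8n");
n8nIdentities.establishMapping("user-42", "did:human:person:alice");
console.log(n8nIdentities.toPassport("user-42"));
```

Rejecting a conflicting remap, instead of overwriting, keeps earlier ledger entries attributable to the identity they were logged under.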

Example: n8n Platform Adapter

// @human/agent-sdk/interop/adapters/n8n.ts

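import type {
  PlatformAdapter,
  IdentityAdapter,
  DelegationAdapter,
  EventAdapter,
  PolicyHooks,
  ExternalAgentEvent, // assumed to be exported alongside the adapter interfaces
} from "../adapter";
import type { PassportId } from "@human/passport-domain";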
import { HumanClient } from "../../client";

export class N8nAdapter implements PlatformAdapter {
  readonly platformId = "n8n";
  readonly platformName = "n8n Workflow Automation";
  
  constructor(
    private config: {
      n8nApiUrl: string;
      n8nApiKey: string;
      orgPassportId: PassportId;
    },
    private humanClient: HumanClient
  ) {}
  
  // Identity adapter
  identity: IdentityAdapter = {
    // Arrow function so `this` is the N8nAdapter instance rather than this object literal
    mapUserToPassport: async (n8nUserId: string): Promise<PassportId> => {
      // Check if mapping exists
      const existing = await this.humanClient.mappings.get({
        externalSystem: "n8n",
        externalId: n8nUserId
      });
      
      if (existing) {
        return existing.passportId;
      }
      
      // Create new mapping
      const passport = await this.humanClient.passport.resolveOrCreate({
        externalId: n8nUserId,
        externalSystem: "n8n",
        orgDid: this.config.orgPassportId
      });
      
      return passport.id;
    },
    
    // Arrow function: `this.fetchWorkflow` and `this.config` live on the adapter instance
    mapAgent: async (n8nWorkflowId: string): Promise<{ agentDid: PassportId; principalDid: PassportId }> => {
      // Each n8n workflow becomes a HUMAN agent
      const workflow = await this.fetchWorkflow(n8nWorkflowId);
      
      const agentDid = await this.humanClient.agents.registerOrUpdate({
        externalId: n8nWorkflowId,
        name: workflow.name,
        platform: "n8n",
        capabilities: this.mapWorkflowToCapabilities(workflow)
      });
      
      return {
        agentDid,
        principalDid: this.config.orgPassportId
      };
    },
    
    establishMapping: async (n8nId: string, passportId: PassportId): Promise<void> => {
      await this.humanClient.mappings.create({
        externalSystem: "n8n",
        externalId: n8nId,
        passportId
      });
    }
  };
  
  // Delegation adapter
  delegation: DelegationAdapter = {
    // Arrow function so `this` is the N8nAdapter instance rather than this object literal
    checkDelegation: async (params) => {
      return await this.humanClient.humanos.checkDelegation({
        agent: params.agentDid,
        action: params.action,
        context: params.context,
        requiredCapabilities: this.delegation.getRequiredCapabilities(params.action)
      });
    },
    
    getRequiredCapabilities(action: string): string[] {
      // Map n8n actions to HAIO capabilities
      const mapping: Record<string, string[]> = {
        'send_email': ['email_sender'],
        'update_database': ['database_writer'],
        'call_api': ['api_caller'],
        'process_payment': ['payment_processor']
      };
      return mapping[action] || ['generic_workflow_executor'];
    }
  };
  
  // Event adapter
  events: EventAdapter = {
    // Arrow function so `this.humanClient` resolves on the adapter instance
    logEvent: async (event: ExternalAgentEvent): Promise<void> => {
      await this.humanClient.ledger.appendEvent({
        eventType: 'external_agent_action',
        actorDid: event.agentDid,
        action: event.action,
        result: event.result,
        context: event.context,
        timestamp: event.timestamp,
        externalSystem: 'n8n',
        externalEventId: event.externalEventId,
        signature: await this.humanClient.signature.sign(event)
      });
    },
    
    streamEvents: (handler) => {
      // Subscribe to n8n webhook events
      this.subscribeToN8nWebhooks(async (n8nEvent) => {
        const mappedEvent = await this.mapN8nEventToHuman(n8nEvent);
        await handler(mappedEvent);
      });
    }
  };
  
  // Policy hooks
  policy: PolicyHooks = {
    // Arrow function so `this.pauseWorkflow` and `this.events` resolve on the adapter instance
    onPolicyViolation: async (params) => {
      // Halt n8n workflow execution
      await this.pauseWorkflow(params.agentDid, params.violation);
      
      // Log violation
      await this.events.logEvent({
        agentDid: params.agentDid,
        action: params.action,
        result: 'policy_violation',
        context: { violation: params.violation },
        timestamp: new Date(),
        externalEventId: `violation-${Date.now()}`
      });
    },
    
    requestApproval: async (params) => {
      return await this.humanClient.humanos.requestApproval({
        agent: params.agentDid,
        action: params.action,
        context: params.context,
        requiredCapability: params.requiredCapability
      });
    },
    
    cancelExecution: async (agentDid, reason) => {
      await this.pauseWorkflow(agentDid, reason);
    }
  };
  
  // Helper methods
  private async fetchWorkflow(workflowId: string) {
    // Call n8n API to get workflow details
    // ...implementation...
  }
  
  private mapWorkflowToCapabilities(workflow: any): string[] {
    // Analyze workflow nodes to determine capabilities
    // ...implementation...
  }
  
  private async pauseWorkflow(agentDid: PassportId, reason: string) {
    // Call n8n API to pause workflow
    // ...implementation...
  }
  
  private subscribeToN8nWebhooks(handler: (event: any) => Promise<void>) {
    // Set up webhook listener for n8n events
    // ...implementation...
  }
  
  private async mapN8nEventToHuman(n8nEvent: any): Promise<ExternalAgentEvent> {
    // Convert n8n event format to HUMAN event format
    // ...implementation...
  }
}
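The `mapWorkflowToCapabilities` helper is left unimplemented above. One possible approach, sketched here under stated assumptions, is a lookup from workflow node types to capability names. The node type keys and the `WorkflowSketch` shape are illustrative and not taken from the n8n API; the capability names reuse those from `getRequiredCapabilities`.

```typescript
// Hypothetical, simplified workflow shape; a real n8n workflow carries far more detail.
interface WorkflowNodeSketch {
  type: string;
}

interface WorkflowSketch {
  name: string;
  nodes: WorkflowNodeSketch[];
}

// Illustrative node-type to capability table (keys are assumptions).
const NODE_CAPABILITIES: Record<string, string> = {
  emailSend: "email_sender",
  postgres: "database_writer",
  httpRequest: "api_caller",
  stripe: "payment_processor",
};

// Collects the distinct capabilities a workflow needs, in sorted order.
// Unknown node types fall back to the generic executor capability.
function capabilitiesFor(workflow: WorkflowSketch): string[] {
  const found: string[] = [];
  for (const node of workflow.nodes) {
    const known = Object.prototype.hasOwnProperty.call(NODE_CAPABILITIES, node.type);
    const capability = known ? NODE_CAPABILITIES[node.type] : "generic_workflow_executor";
    if (found.indexOf(capability) === -1) {
      found.push(capability);
    }
  }
  return found.sort();
}

const fulfillmentWorkflow: WorkflowSketch = {
  name: "order-fulfillment",
  nodes: [
    { type: "emailSend" },
    { type: "httpRequest" },
    { type: "emailSend" },
    { type: "customScript" },
  ],
};
console.log(capabilitiesFor(fulfillmentWorkflow));
```

Deriving capabilities from the workflow definition, instead of asking operators to declare them, means a workflow that gains a payment node also gains the stricter capability requirement on its next registration.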

Example: LangChain Platform Adapter

// @human/agent-sdk/interop/adapters/langchain.ts

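import type { PlatformAdapter, IdentityAdapter, DelegationAdapter, EventAdapter, PolicyHooks } from "../adapter";
import type { PassportId } from "@human/passport-domain";
import { HumanClient } from "../../client";

export class LangChainAdapter implements PlatformAdapter {
  readonly platformId = "langchain";
  readonly platformName = "LangChain Framework";

  // The four adapters required by PlatformAdapter follow the same pattern as
  // N8nAdapter; their implementations are omitted from this example.
  identity!: IdentityAdapter;
  delegation!: DelegationAdapter;
  events!: EventAdapter;
  policy!: PolicyHooks;

  constructor(
    // Adapter-specific settings; their shape is not specified in this document
    private config: Record<string, unknown>,
    private humanClient: HumanClient
  ) {}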
  
  /**
   * Wrap a LangChain agent with HUMAN governance
   */
  async wrapAgent(langchainAgent: any, config: {
    agentName: string;
    principalDid: PassportId;
    capabilities: string[];
  }) {
    // Register agent in HUMAN
    const agentDid = await this.humanClient.agents.register({
      name: config.agentName,
      platform: "langchain",
      principalDid: config.principalDid,
      capabilities: config.capabilities
    });
    
    // Wrap LangChain tool calls with HUMAN checks
    const originalTools = langchainAgent.tools;
    langchainAgent.tools = originalTools.map((tool: any) => this.wrapTool(tool, agentDid));
    
    // Intercept execution
    const originalRun = langchainAgent.run.bind(langchainAgent);
    langchainAgent.run = async (input: string) => {
      // Log intent
      await this.events.logEvent({
        agentDid,
        action: 'start_execution',
        result: 'success',
        context: { input },
        timestamp: new Date(),
        externalEventId: `exec-${Date.now()}`
      });
      
      // Execute
      const result = await originalRun(input);
      
      // Log completion
      await this.events.logEvent({
        agentDid,
        action: 'complete_execution',
        result: 'success',
        context: { input, result },
        timestamp: new Date(),
        externalEventId: `exec-${Date.now()}-complete`
      });
      
      return result;
    };
    
    return langchainAgent;
  }
  
  private wrapTool(tool: any, agentDid: PassportId) {
    const originalFunc = tool.func;
    
    tool.func = async (...args: any[]) => {
      // Check delegation before tool execution
      const decision = await this.delegation.checkDelegation({
        agentDid,
        action: tool.name,
        context: { args }
      });
      
      if (!decision.allowed) {
        if (decision.requiresHumanApproval) {
          // Request approval
          const approval = await this.policy.requestApproval({
            agentDid,
            action: tool.name,
            context: { args },
            requiredCapability: decision.escalateTo || 'general_approval'
          });
          
          if (!approval.approved) {
            throw new Error(`Action denied: ${approval.reason}`);
          }
        } else {
          throw new Error(`Action forbidden: ${decision.reason}`);
        }
      }
      
      // Execute tool
      const result = await originalFunc(...args);
      
      // Log execution
      await this.events.logEvent({
        agentDid,
        action: tool.name,
        result: 'success',
        context: { args, result },
        timestamp: new Date(),
        externalEventId: `tool-${Date.now()}`
      });
      
      return result;
    };
    
    return tool;
  }
}
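The decision flow inside `wrapTool` does not depend on LangChain and can be sketched on its own. The version below is synchronous so that it runs as written; a real adapter would await the delegation and approval calls. `governTool`, `Governance`, and `DecisionSketch` are hypothetical names. Unlike the listing above, this sketch also logs denied and failed calls, which is a design choice and not something the source prescribes.

```typescript
interface DecisionSketch {
  allowed: boolean;
  requiresHumanApproval?: boolean;
  reason?: string;
}

// Hypothetical, synchronous stand-ins for the delegation, policy, and event adapters.
interface Governance {
  checkDelegation(action: string, args: unknown[]): DecisionSketch;
  requestApproval(action: string, args: unknown[]): { approved: boolean; reason?: string };
  logEvent(entry: { action: string; outcome: "success" | "denied" | "error" }): void;
}

// Wraps a plain function so every call is checked first and logged afterwards.
function governTool<A extends unknown[], R>(
  toolName: string,
  fn: (...args: A) => R,
  gov: Governance
): (...args: A) => R {
  return (...args: A): R => {
    const decision = gov.checkDelegation(toolName, args);
    if (!decision.allowed) {
      // Only ask a human when the decision says approval can unlock the action.
      const approved =
        decision.requiresHumanApproval === true &&
        gov.requestApproval(toolName, args).approved;
      if (!approved) {
        gov.logEvent({ action: toolName, outcome: "denied" });
        throw new Error(`Action denied: ${toolName}`);
      }
    }
    try {
      const result = fn(...args);
      gov.logEvent({ action: toolName, outcome: "success" });
      return result;
    } catch (err) {
      gov.logEvent({ action: toolName, outcome: "error" });
      throw err;
    }
  };
}

const auditTrail: string[] = [];
const demoGovernance: Governance = {
  checkDelegation: (action) =>
    action === "lookup_claim"
      ? { allowed: true }
      : { allowed: false, requiresHumanApproval: true, reason: "Outside delegation" },
  requestApproval: () => ({ approved: false, reason: "Reviewer declined" }),
  logEvent: (entry) => {
    auditTrail.push(`${entry.action}:${entry.outcome}`);
  },
};

const lookupClaim = governTool(
  "lookup_claim",
  (claimId: string) => `claim ${claimId}`,
  demoGovernance
);
console.log(lookupClaim("12345"));
```

Because the wrapper returns a function with the same signature as the original, the calling framework does not need to know that governance was added.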

Using Adapters in Practice

import { N8nAdapter, LangChainAdapter } from "@human/agent-sdk/interop";
import { HumanClient } from "@human/agent-sdk";

// Initialize HUMAN client
const human = new HumanClient({ apiKey: process.env.HUMAN_API_KEY });

// === Example 1: Wrap existing n8n workflows ===

const n8n = new N8nAdapter(
  {
    n8nApiUrl: "https://n8n.acme.com",
    n8nApiKey: process.env.N8N_API_KEY,
    orgPassportId: "did:human:org:acme"
  },
  human
);

// Register all workflows as HUMAN agents
const workflows = await fetchAllN8nWorkflows();
for (const workflow of workflows) {
  const { agentDid } = await n8n.identity.mapAgent(workflow.id);
  console.log(`Registered n8n workflow ${workflow.name} as ${agentDid}`);
}

// Now all n8n workflows are governed by HUMAN:
// - Identity: Each workflow has a Passport DID
// - Delegation: Actions checked against policies
// - Logging: All executions logged to ledger


// === Example 2: Wrap LangChain agent ===

import { initializeAgentExecutorWithOptions } from "langchain/agents";
import { OpenAI } from "langchain/llms/openai";

// Create LangChain agent
const langchainAgent = await initializeAgentExecutorWithOptions(
  tools,
  new OpenAI({ temperature: 0 }),
  { agentType: "zero-shot-react-description" }
);

// Wrap with HUMAN governance
const langchain = new LangChainAdapter({...}, human);
const governedAgent = await langchain.wrapAgent(langchainAgent, {
  agentName: "Insurance Claims Processor",
  principalDid: "did:human:org:insurance-co",
  capabilities: ["claims_review", "payout_approval_under_10k"]
});

// Now LangChain agent is governed by HUMAN
const result = await governedAgent.run("Process claim #12345");
// Every tool call is checked for delegation, logged to ledger

Migration Patterns

Pattern 1: Gradual Migration

// Start: All agents on n8n, zero HUMAN
// Step 1: Wrap with HUMAN-around (identity + logging)
const adapter = new N8nAdapter(config, human);
await adapter.wrapAllWorkflows();

// Step 2: Add policy enforcement
await adapter.policy.enablePolicyChecks(['high-risk-actions']);

// Step 3: Migrate critical workflows to HUMAN-native
const criticalWorkflows = ['payment-processing', 'data-deletion'];
for (const id of criticalWorkflows) {
  await migrateToHumanNative(id);
}

// End: Critical on HUMAN-native, rest wrapped

Pattern 2: Hybrid Deployment

// Some agents native, some wrapped
const agents = [
  // HUMAN-native agents
  await createAgent({ name: "Compliance Reviewer", ... }),
  
  // Wrapped n8n workflows
  await n8n.wrapWorkflow("stripe-fulfillment"),
  
  // Wrapped LangChain agents
  await langchain.wrapAgent(researchAgent, {...})
];

// Coordinator routes across all of them
const coordinator = createCoordinator({ agents });

Benefits of Interop SDK

For Enterprises:

No forced migration off existing platforms
Immediate value: identity, delegation, logging, governance
Clear path to HUMAN-native when ready

For HUMAN:

Shorter sales cycles ("we wrap your stack")
Broader market (works with any platform)
Long-term stickiness (customers see limits of wrapped platforms, migrate to native)

For Developers:

Build adapters for any platform
Consistent interface across all platforms
Monetization via adapter marketplace

See also:

43_haio_developer_architecture.md - Complete interoperability architecture
22_humanos_orchestration_core.md - Orchestration patterns

AGENT AUTHENTICATION

Agents authenticate to HUMAN APIs using the same methods as other clients, but with agent-specific patterns.

How Agents Get Credentials

Scenario	Authentication Method	How It Works
Agent on behalf of user	Delegated Token	User grants delegation → agent receives scoped token
Agent on behalf of org	Service Account	Org creates service account → agent uses API key
Standalone agent	Agent Passport + PAT	Agent has own Passport → generates own tokens

Delegated Authentication (Most Common)

When an agent acts on behalf of a human or organization:

import { HumanClient, DelegationClient } from "@human/agent-sdk";

// Agent requests delegation from user
const delegation = await DelegationClient.request({
  fromPassport: userPassportId,           // Who's granting
  toAgent: agent.agentId,                 // Who's receiving
  scopes: ["calendar:read", "calendar:write"],
  reason: "Schedule meetings on your behalf",
  duration: "30d",
});

// User approves in Companion or app
// → Agent receives delegated credentials

// Agent uses delegated credentials
const client = new HumanClient({
  delegation: delegation.credentials,
  onBehalfOf: userPassportId,
});

// API calls are attributed to user, executed by agent
const events = await client.calendar.list();
// Audit log shows: "Agent X acting on behalf of User Y"

Service Account Authentication (Server-to-Server)

For agents running as backend services:

import { HumanClient } from "@human/agent-sdk";

// Service account key (stored securely, e.g., env var)
const client = new HumanClient({
  apiKey: process.env.HUMAN_SERVICE_KEY,  // hsk_live_...
  type: "service_account",
});

// API calls attributed to service account
const result = await client.capabilities.query({...});

Agent Passport (Autonomous Agents)

Agents can have their own Passports (Person:AgentFuture):

import { AgentPassport, HumanClient } from "@human/agent-sdk";

// Agent with its own identity
const agentPassport = await AgentPassport.create({
  name: "Meeting Facilitator Agent",
  owner: humanPassportId,              // Human who owns/controls this agent
  capabilities: ["scheduling", "transcription"],
  boundaries: agentBoundaries,
});

// Agent generates its own PAT
const agentToken = await agentPassport.createToken({
  scopes: ["calendar:read"],
  expiresIn: "24h",
});

// Agent authenticates as itself
const client = new HumanClient({
  token: agentToken,
  agentPassport: agentPassport.id,
});

Credential Storage in Agents

Agents should never store credentials in code. Use:

// Pick one of the following, depending on how the agent is deployed.

// Environment variables (recommended for service accounts)
const envClient = new HumanClient(); // Auto-reads HUMAN_API_TOKEN

// Vault storage (for delegated credentials)
const credentials = await agent.vault.getCredentials("human_api");
const delegatedClient = new HumanClient({ delegation: credentials });

// Secure credential manager (for production)
const secret = await secretManager.get("human-api-key");
const managedClient = new HumanClient({ apiKey: secret });

Token Refresh (Automatic)

The SDK automatically refreshes tokens before expiry:

const client = new HumanClient({
  delegation: delegatedCredentials,
  // SDK automatically:
  // - Monitors token expiry
  // - Refreshes before expiration
  // - Updates stored credentials
  // - Retries failed requests after refresh
});

// You never need to handle refresh manually
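
The refresh-before-expiry behaviour can be pictured with a small sketch. This is an illustration only, not the SDK's internals: `TokenState`, `needsRefresh`, `withFreshToken`, and the 60-second safety margin are all assumptions made for the example.

```typescript
// Hypothetical shape; the SDK's internal token representation is not specified here.
interface TokenState {
  accessToken: string;
  expiresAt: number; // epoch milliseconds
}

// Assumed safety margin: refresh once less than 60 seconds of lifetime remain.
const REFRESH_MARGIN_MS = 60000;

// True when the token is expired or about to expire.
function needsRefresh(token: TokenState, now: number = Date.now()): boolean {
  return token.expiresAt - now <= REFRESH_MARGIN_MS;
}

// Returns a usable token, calling the caller-supplied `refresh` only when needed.
async function withFreshToken(
  current: TokenState,
  refresh: () => Promise<TokenState>,
  now: number = Date.now()
): Promise<TokenState> {
  return needsRefresh(current, now) ? refresh() : current;
}
```

Checking expiry before each request, rather than on a timer, keeps the sketch stateless. A real client would also need to persist the refreshed credentials and retry a request that failed mid-refresh, as the list above describes.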

HAIO INTEGRATION

Passport Integration

All agents must bind to a Passport:

import { PassportClient } from "@human/agent-sdk/integrations/passport";

const passportClient = new PassportClient({
  endpoint: "https://api.human.protocol/passport",
  credentials: agentCredentials,
});

// Verify agent's Passport binding
const binding = await passportClient.verifyBinding(agent.passport);

// Check delegations
const delegations = await passportClient.getDelegations(
  agent.passport.passportId,
  agent.agentId
);

// Request new delegation
const result = await passportClient.requestDelegation({
  from: agent.passport.passportId,
  to: agent.agentId,
  scope: "calendar.write",
  reason: "Need to create meetings on your behalf",
  duration: "30d",
});

Capability Graph Integration

Agents can query and update capabilities:

import { CapabilityClient } from "@human/agent-sdk/integrations/capability-graph";

const capabilityClient = new CapabilityClient({
  endpoint: "https://api.human.protocol/capability-graph",
});

// Query capabilities for task matching
const capabilities = await capabilityClient.query({
  passport: userPassportId,
  domain: "software-engineering",
  minLevel: 0.7,
});

// Submit evidence of capability demonstration
await capabilityClient.submitEvidence({
  passport: userPassportId,
  capability: "meeting-facilitation",
  evidence: {
    type: "task-completion",
    taskId: meetingId,
    outcome: "successful",
    observedBy: agent.agentId,
  },
});

HumanOS Integration

Agents register with HumanOS for orchestration:

import { HumanOSClient } from "@human/agent-sdk/integrations/humanos";

const humanosClient = new HumanOSClient({
  endpoint: "https://api.human.protocol/humanos",
});

// Register agent with HumanOS
await humanosClient.registerAgent({
  agentId: agent.agentId,
  name: agent.name,
  capabilities: agent.muscles.listCapabilities(),
  boundaries: agent.boundaries,
  passport: agent.passport,
});

// Report action for provenance
await humanosClient.reportAction({
  agentId: agent.agentId,
  action: "calendar.create",
  input: { title: "Meeting", participants: [...] },
  output: { eventId: "..." },
  timestamp: new Date(),
});

// Request human escalation
const escalation = await humanosClient.escalate({
  agentId: agent.agentId,
  reason: "Uncertainty about meeting conflict resolution",
  context: { conflictingEvents: [...] },
  priority: "medium",
  escalateTo: agent.passport.passportId,
});

DOCUMENTATION REQUIREMENTS

For SDK Release

Getting Started Guide - 15 minutes to first agent
Core Concepts - Identity, muscles, safety, coordination
API Reference - Complete TypeScript docs
Example Agents - 6+ reference implementations
Best Practices - Security, performance, testing
Migration Guide - From other agent frameworks

Example Agent Library

Agent	Complexity	Demonstrates
Echo Agent	Trivial	Basic structure
Calendar Agent	Simple	Single muscle
Meeting Facilitator	Medium	Multiple muscles, coordination
Research Assistant	Medium	External APIs, summarization
Document Reviewer	Complex	Multi-step workflows
Workflow Coordinator	Complex	Multi-agent orchestration

RELEASE ROADMAP

Phase 1: Core Primitives (Month 1-2)

Protocol definitions (protobuf + OpenAPI for ctx API)
Unified ctx pattern implementation (llm, agents, db, secrets, memory, etc.)
handler() wrapper with capabilities and delegation
SDK generator for TypeScript (source language)
CLI tool (init, dev, deploy, test, vault, prompts, replay)
Credential cascade (Passport > Vault > Env)
Handler-level secret scoping

Phase 2: Trust & Cost (Month 2-3)

Agent-to-agent delegation chain (auto-scoping, chained provenance)
ctx.call.agent() with delegation validation
LLM tier-based routing (fast/balanced/powerful)
Cost controls with tiered thresholds
Human escalation at budget limits
Passport integration for user credentials

Phase 3: Testing & Debugging (Month 3-4)

Semantic test assertions (expectSemantic)
Golden output recording and approval
LLM fixture recording (record/replay modes)
Time-travel debugging (metadata default, full capture opt-in)
Execution replay CLI
Prompt versioning (repo → registry)
Prompt A/B testing

Phase 4: Infrastructure & Multi-Language (Month 4-5)

Declarative infrastructure (postgres, redis, s3, queue, vector)
Preview deployments (seeded DB, branch URLs)
SDK generator for Python, Go, Rust
CI/CD pipeline for SDK auto-generation
Multi-language SDK testing

Phase 5: Documentation & Launch (Month 5-6)

Agent-readable documentation (OpenAPI, llms.txt, context.json)
Complete API reference (auto-generated, versioned)
Interactive docs with runnable examples
Example agent library (6+ reference agents)
Developer portal with guides
CLI discoverability (human-agent ctx, human-agent docs)
Community launch

SUCCESS METRICS

Adoption Metrics

Metric	Year 1 Target	Measurement
SDK downloads	10,000+	npm/GitHub
Active developers	500+	Unique contributors
Agents built	100+	Registered with HumanOS
GitHub stars	1,000+	Community interest

Quality Metrics

Metric	Target	Measurement
Documentation coverage	100%	All public APIs documented
Test coverage	90%+	Unit + integration tests
Time to first agent	< 15 min	Developer experience testing
Support response time	< 24 hours	Community support

PUBLISHING TO MARKETPLACE

Overview

Agents built with the HUMAN SDK can be published to the Marketplace for discovery and installation by other users.

Publishing flow:

# Develop agent
$ human-agent init my-agent
$ cd my-agent
$ human-agent dev

# Test
$ human-agent test --golden

# Prepare for publishing
$ human-agent manifest

# Publish to Marketplace
$ human-agent publish

Manifest Requirements

Every published agent needs a complete human-agent.yaml:

# Marketplace metadata
marketplace:
  name: "Invoice Processor"
  description: "Extracts and validates invoice data"
  category: "finance"
  keywords: ["invoice", "ocr", "validation"]
  
  # Trust tier target
  trustTier: "verified"  # community, verified, or human_certified
  
  # Pricing
  pricing:
    free:
      executions: 100
      period: "month"
    pro:
      price: 49
      currency: "USD"
      period: "month"
      executions: "unlimited"
  
  # Screenshots and assets
  assets:
    icon: "./assets/icon.png"
    screenshots:
      - "./assets/screenshot1.png"
      - "./assets/screenshot2.png"
    demo_video: "https://youtube.com/..."
  
  # Support
  support:
    documentation: "https://docs.example.com"
    contact: "support@example.com"
    repository: "https://github.com/..."

# Agent capabilities (for discovery)
capabilities:
  - finance/invoice/process
  - document/extraction
  - data/validation

# Required permissions (shown to users during install)
requires:
  scope:
    - read:documents
    - write:structured_data
  escalate:
    - finance/approver  # When amount > $5000
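
A pre-publish check on the `marketplace` block might look like the sketch below. It is a hedged illustration, not the behaviour of `human-agent validate`: the function name and the choice of which fields count as required are assumptions. Only the three `trustTier` values come from the manifest comment above.

```typescript
// Trust tiers listed in the manifest comment above.
const TRUST_TIERS: readonly string[] = ["community", "verified", "human_certified"];

// Subset of the marketplace block; every field is optional because the
// input is unvalidated YAML that has already been parsed into an object.
interface MarketplaceMetadata {
  name?: string;
  description?: string;
  category?: string;
  trustTier?: string;
}

// Returns a list of human-readable problems; an empty list means the check passed.
// Treating name, description, and category as required is an assumption.
function checkMarketplaceMetadata(m: MarketplaceMetadata): string[] {
  const problems: string[] = [];
  if (!m.name) problems.push("marketplace.name is missing");
  if (!m.description) problems.push("marketplace.description is missing");
  if (!m.category) problems.push("marketplace.category is missing");
  if (m.trustTier === undefined || TRUST_TIERS.indexOf(m.trustTier) === -1) {
    problems.push("marketplace.trustTier must be one of: " + TRUST_TIERS.join(", "));
  }
  return problems;
}
```

Returning every problem at once, instead of throwing on the first, lets a publisher fix the manifest in a single pass.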

App Review Process

Upon publishing, the agent enters the App Review pipeline:

Automated Review (<5 minutes for most agents)
- Security Scanner Agent checks for vulnerabilities
- Policy Compliance Agent verifies capability claims
- Quality Assessment Agent scores code quality
- Trust Scoring Agent classifies risk
Approval Decision
- Low risk + good quality → Auto-approved (Community tier)
- Medium risk → Fast-track human review (<4 hours)
- High risk → Full human review (1-3 days)
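
The approval decision above amounts to a small routing function. The sketch below uses hypothetical type and function names. The handling of low-risk agents with poor quality is an assumption, since the list does not cover that case.

```typescript
// Hypothetical types; the actual pipeline is specified in 139_app_review_agent_spec.md.
type RiskLevel = "low" | "medium" | "high";
type ReviewRoute =
  | "auto_approved_community"
  | "fast_track_human_review"
  | "full_human_review";

interface AutomatedReviewResult {
  risk: RiskLevel;
  qualityOk: boolean; // assumed boolean summary of the Quality Assessment score
}

function routeReview(result: AutomatedReviewResult): ReviewRoute {
  if (result.risk === "high") return "full_human_review";
  if (result.risk === "medium") return "fast_track_human_review";
  // Low risk: auto-approve only when quality is good. Sending low-risk,
  // low-quality agents to fast-track review is an assumption.
  return result.qualityOk ? "auto_approved_community" : "fast_track_human_review";
}
```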

Publisher Dashboard

# Check review status
$ human-agent status

Status: APPROVED (Community tier)
Marketplace URL: https://marketplace.human.cloud/agents/invoice-processor

Stats:
├─ Installs: 234
├─ Active users: 189
├─ Invocations (30d): 45,293
├─ Revenue (30d): $2,341
└─ Rating: 4.7 / 5.0 (47 reviews)

Revenue Share

Trust Tier	Listing Fee	Revenue Share	Benefits
Community	Free	Developer: 85% / HUMAN: 15%	Basic listing
Verified	$99/year	Developer: 90% / HUMAN: 10%	Featured, verified badge
HUMAN Certified	$999/year	Developer: 95% / HUMAN: 5%	Top placement, certified badge, SLA
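
As a worked example of the split, the sketch below computes both shares in integer cents. The percentages come from the table; the function name and the decision to round the developer share down are assumptions. Listing fees are not included.

```typescript
type TrustTier = "community" | "verified" | "human_certified";

// Developer share in percent, taken from the revenue share table.
const DEVELOPER_SHARE_PERCENT: Record<TrustTier, number> = {
  community: 85,
  verified: 90,
  human_certified: 95,
};

// Works in integer cents to avoid floating-point drift.
// Rounding the developer share down is an assumption.
function splitRevenue(
  grossCents: number,
  tier: TrustTier
): { developerCents: number; humanCents: number } {
  const developerCents = Math.floor((grossCents * DEVELOPER_SHARE_PERCENT[tier]) / 100);
  return { developerCents, humanCents: grossCents - developerCents };
}
```

For the $2,341 of monthly revenue shown in the dashboard example, `splitRevenue(234100, "community")` gives the developer $1,989.85 and HUMAN $351.15 under these assumptions.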

Best Practices

For approval success:

Comprehensive README and documentation
Usage examples with sample data
Error handling for all failure modes
No hardcoded secrets or API keys
Clear capability claims matching actual functionality

For marketplace success:

Solve a common problem (browse existing agents for gaps)
Competitive pricing (check similar agents)
Responsive support (answer user questions fast)
Regular updates (fix bugs, add features)
Engage with reviews (thank users, address concerns)

For revenue success:

Free tier to get users (100-1000 executions)
Pro tier with real value (unlimited, premium features)
Enterprise tier with custom pricing
Bundle multiple agents into suites
Cross-promote your other agents

SDK Commands for Publishing

# Initialize with marketplace support
$ human-agent init my-agent --marketplace

# Validate manifest before publishing
$ human-agent validate

# Test locally with marketplace configuration
$ human-agent dev --marketplace-mode

# Generate marketplace assets
$ human-agent assets generate

# Publish to marketplace (staging first)
$ human-agent publish --staging

# Promote to production after testing
$ human-agent promote --production

# Update published agent
$ human-agent publish --version 1.1.0

# View marketplace analytics
$ human-agent analytics --period 30d

# Respond to reviews
$ human-agent reviews --respond

See also:

135_agent_marketplace_architecture.md - Complete marketplace architecture
139_app_review_agent_spec.md - App Review technical specification
137_companion_powered_builder.md - Builder Companion integration

AGENT DISCOVERY & CAPABILITY MANAGEMENT

The Challenge: Managing Hundreds of Agents

As organizations mature their agent ecosystem, they face a critical discovery problem:

Symptoms:

"Do we already have an agent that does X?"
"Which agents can process invoices?"
"Am I creating duplicates?"
"How do I know what capabilities we have?"
"Which marketplace agents would help us?"

Without proper discovery: Redundant agents, missed reuse opportunities, shadow AI.

The solution: Agent-aware Capability Graph + intelligent discovery interfaces.

Agent Storage: Multi-Tier Visibility

Agents are stored based on visibility:

┌─────────────────────────────────────────────────────────────┐
│  1. MARKETPLACE AGENTS (Public, Saleable)                   │
│  Storage: Global registry (HUMAN-hosted)                    │
│  Discovery: Anyone can browse                               │
│  Examples: Invoice processors, contract reviewers           │
└─────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────┐
│  2. ORG-PRIVATE AGENTS (Internal, Never Public)             │
│  Storage: Org Vault (file-based)                            │
│  Discovery: Only within org                                 │
│  Examples: Internal workflow agents, proprietary tools      │
└─────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────┐
│  3. SELF-HOSTED ENTERPRISE AGENTS (Behind Firewall)         │
│  Storage: On-premise vault + optional registry sync         │
│  Discovery: Only within enterprise network                  │
│  Examples: Regulated industry agents, air-gapped systems    │
└─────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────┐
│  4. PERSONAL AGENTS (Individual's Vault)                    │
│  Storage: Personal Vault                                    │
│  Discovery: Only by owner                                   │
│  Examples: Personal assistants, custom automations          │
└─────────────────────────────────────────────────────────────┘

Agent Manifest: Visibility Control

Every agent declares its visibility:

# agent.yaml
id: acme-proprietary-workflow
name: ACME Proprietary Workflow Agent
version: 1.0.0
capabilities: [workflow/acme_internal]

# Visibility control
visibility:
  scope: org  # 'public' | 'org' | 'enterprise' | 'personal'
  ownerOrgId: org_acme_corp
  shareWith: []  # Optional: specific org IDs for B2B sharing

# Storage location
storage:
  type: vault  # 'marketplace' | 'vault' | 'self_hosted'
  vaultRef: vault://org_acme_corp/agents/acme-workflow
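
A discovery-time visibility check based on this manifest could be sketched as follows. The `scope`, `ownerOrgId`, and `shareWith` fields mirror the manifest; `ownerPassportId`, `onEnterpriseNetwork`, and the function name are hypothetical additions for illustration.

```typescript
type VisibilityScope = "public" | "org" | "enterprise" | "personal";

interface AgentVisibility {
  scope: VisibilityScope;
  ownerOrgId?: string;
  ownerPassportId?: string; // hypothetical: owner of a personal agent
  shareWith?: string[];
}

interface Viewer {
  passportId: string;
  orgId?: string;
  onEnterpriseNetwork?: boolean; // hypothetical: set by the deployment, not the user
}

// Decides whether `viewer` may see an agent in discovery results.
function canDiscover(v: AgentVisibility, viewer: Viewer): boolean {
  switch (v.scope) {
    case "public":
      return true;
    case "org": {
      if (viewer.orgId === undefined) return false;
      const shared = v.shareWith !== undefined && v.shareWith.indexOf(viewer.orgId) !== -1;
      return viewer.orgId === v.ownerOrgId || shared;
    }
    case "enterprise":
      return viewer.onEnterpriseNetwork === true &&
        viewer.orgId !== undefined &&
        viewer.orgId === v.ownerOrgId;
    case "personal":
      return v.ownerPassportId !== undefined && viewer.passportId === v.ownerPassportId;
  }
}
```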

Vault-Based Agent Storage

Agents are files, not database records:

vault://org_acme_corp/agents/
  acme-invoice-processor/
    agent.yaml              # Manifest (metadata + config)
    handler.js              # Code bundle
    package.json            # Dependencies
    schemas/
      input.schema.json
      output.schema.json
    README.md               # Human-readable docs

  acme-ap-workflow/
    agent.yaml
    handler.js
    ...

Benefits:

✅ Zero-config: Just drop files in vault
✅ No database setup required
✅ Version control friendly
✅ Works offline
✅ Vault is source of truth

Automatic Agent Indexing

How discovery works without databases:

┌─────────────────────────────────────────────────────────────┐
│  1. Agent files live in vaults (source of truth)            │
│     - Org vaults: vault://org_id/agents/*                   │
│     - Personal vaults: vault://did/agents/*                 │
│     - Marketplace: vault://marketplace/agents/*             │
└────────────────────────┬────────────────────────────────────┘
                         │
                         ▼
┌─────────────────────────────────────────────────────────────┐
│  2. Agent Indexer watches vaults (background service)       │
│     - Scans vault://*/agents/ folders                       │
│     - Parses agent.yaml files                               │
│     - Builds in-memory search index                         │
│     - Refreshes on file changes                             │
└────────────────────────┬────────────────────────────────────┘
                         │
                         ▼
┌─────────────────────────────────────────────────────────────┐
│  3. Discovery API queries index (ephemeral, rebuildable)    │
│     - Fast search across all accessible agents              │
│     - Scoped by user's vault access                         │
│     - Index loss = rebuild from vaults (no data loss)       │
└─────────────────────────────────────────────────────────────┘
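
The indexing flow above can be approximated with a small in-memory structure. This is a minimal sketch under stated assumptions: `AgentManifest` contains only a few of the `agent.yaml` fields, and `AgentIndex` with its methods is a hypothetical name, not the Agent Indexer's real interface. Vault scanning and YAML parsing are left out.

```typescript
// Subset of agent.yaml relevant to capability search.
interface AgentManifest {
  id: string;
  name: string;
  capabilities: string[];
}

// Ephemeral index: losing it costs nothing, because rebuild() recreates it
// from the manifests stored in the vaults.
class AgentIndex {
  private byCapability = new Map<string, Set<string>>();
  private manifests = new Map<string, AgentManifest>();

  // Full rebuild from parsed manifests (the vault is the source of truth).
  rebuild(manifests: AgentManifest[]): void {
    this.byCapability.clear();
    this.manifests.clear();
    for (const m of manifests) this.upsert(m);
  }

  // Called when a manifest file is added or changed.
  upsert(m: AgentManifest): void {
    this.remove(m.id);
    this.manifests.set(m.id, m);
    for (const capability of m.capabilities) {
      let ids = this.byCapability.get(capability);
      if (!ids) {
        ids = new Set<string>();
        this.byCapability.set(capability, ids);
      }
      ids.add(m.id);
    }
  }

  // Called when a manifest file is deleted.
  remove(agentId: string): void {
    const existing = this.manifests.get(agentId);
    if (!existing) return;
    for (const capability of existing.capabilities) {
      const ids = this.byCapability.get(capability);
      if (!ids) continue;
      ids.delete(agentId);
      if (ids.size === 0) this.byCapability.delete(capability);
    }
    this.manifests.delete(agentId);
  }

  findByCapability(capability: string): AgentManifest[] {
    const result: AgentManifest[] = [];
    const ids = this.byCapability.get(capability);
    if (!ids) return result;
    ids.forEach((id) => {
      const manifest = this.manifests.get(id);
      if (manifest) result.push(manifest);
    });
    return result;
  }
}
```

Access scoping is not shown here. In the flow above it would be applied by the Discovery API on top of the index results.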

Agent Discovery Service

interface AgentDiscoveryService {
  // Context-aware search
  findAgents(query: AgentQuery, context: DiscoveryContext): Promise<Agent[]>;
  
  // Scoped to what the user can see
  listAvailable(context: DiscoveryContext): Promise<Agent[]>;
  
  // Suggest agents for a capability need
  suggestForCapability(
    capability: string, 
    context: DiscoveryContext
  ): Promise<AgentSuggestion[]>;
  
  // Check for duplicate/overlapping agents
  findSimilarAgents(agentId: string): Promise<AgentSimilarity[]>;
  
  // Get capability coverage for org
  getCapabilityCoverage(orgId: string): Promise<CapabilityCoverage>;
}

interface DiscoveryContext {
  passportId: string;
  orgId?: string;
  vaultAccess: VaultAccessToken[];
  deploymentProfile: 'hosted' | 'hybrid' | 'self_hosted' | 'desktop';
}

interface AgentQuery {
  capabilities?: string[];
  keywords?: string;
  category?: string;
  
  // Scope filters
  includeMarketplace?: boolean;  // Default: true
  includeOrgPrivate?: boolean;   // Default: true if orgId present
  includePersonal?: boolean;     // Default: true
  includeSelfHosted?: boolean;   // Default: true if on-premise
}

interface AgentSuggestion {
  installed: Agent[];       // You already have these
  marketplace: Agent[];     // Available to install
  orgAvailable: Agent[];    // In your org but not installed
}
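
The default rules in the `AgentQuery` comments can be made explicit with a resolver like the one below. The helper and its local types are hypothetical. Reading "on-premise" as the `self_hosted` deployment profile only is also an assumption.

```typescript
// Local copies of the relevant fields so the sketch stands alone.
interface ScopeFlags {
  includeMarketplace?: boolean;
  includeOrgPrivate?: boolean;
  includePersonal?: boolean;
  includeSelfHosted?: boolean;
}

interface ScopeContext {
  orgId?: string;
  deploymentProfile: "hosted" | "hybrid" | "self_hosted" | "desktop";
}

// Fills in every scope flag the caller left undefined.
function resolveScopes(query: ScopeFlags, context: ScopeContext): Required<ScopeFlags> {
  return {
    includeMarketplace: query.includeMarketplace ?? true,
    includeOrgPrivate: query.includeOrgPrivate ?? (context.orgId !== undefined),
    includePersonal: query.includePersonal ?? true,
    // Assumption: only the "self_hosted" profile counts as on-premise.
    includeSelfHosted: query.includeSelfHosted ?? (context.deploymentProfile === "self_hosted"),
  };
}
```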

Companion Agent Discovery

Natural language queries to discover agents:

User: "What agents do I have for processing invoices?"

Companion (querying discovery service):
→ Found 5 agents with invoice processing capabilities:

  📦 Installed (3):
  1. Invoice Fraud Detector [Marketplace, Gold certified]
     - Analyzes invoices for fraud patterns
     - Last used: 2 hours ago
     - 15.2K calls this month
  
  2. ACME AP Workflow [Org-private]
     - ACME-specific accounts payable automation
     - Last used: today
     - 8.9K calls this month
  
  3. My Invoice Parser [Personal]
     - Custom PDF extraction
     - Draft status, not deployed

  🛒 Available in marketplace (2):
  4. QuickBooks Sync Agent [Silver certified]
     - Bi-directional QuickBooks integration
     - $49/mo or 100 calls/mo free
  
  5. Invoice OCR Pro [Bronze certified]
     - Advanced OCR with 50+ templates
     - Free tier available

Would you like to:
  • See usage details for any agent
  • Install a marketplace agent
  • Create a new agent

Query patterns:

"What agents can do X?"
"Do I already have an agent for Y?"
"Show me all finance agents"
"Which agents use this capability?"
"What's available for contract review?"
"Compare agent X and agent Y"
"What capabilities am I missing?"

Agent Capability Graph

Building a semantic capability index:

interface CapabilityIndex {
  capabilities: {
    [capabilityId: string]: {
      name: string;
      description: string;
      installed_agents: AgentSummary[];
      marketplace_agents: AgentSummary[];
      related_capabilities: string[];
      common_use_cases: string[];
    };
  };
}

// Example:
{
  "finance/invoice/process": {
    "name": "Invoice Processing",
    "description": "Extract, validate, and route invoices for approval",
    "installed_agents": [
      {
        "id": "invoice-fraud-detector",
        "name": "Invoice Fraud Detector",
        "certification": "gold",
        "usage_stats": {
          "calls_last_30d": 15234,
          "avg_latency_ms": 1200,
          "success_rate": 0.997
        }
      },
      {
        "id": "acme-ap-workflow",
        "name": "ACME AP Workflow",
        "visibility": "org",
        "usage_stats": {
          "calls_last_30d": 8941
        }
      }
    ],
    "marketplace_agents": [
      {
        "id": "quickbooks-sync",
        "name": "QuickBooks Sync Agent",
        "certification": "silver",
        "pricing": "$49/mo or 100 calls/mo free"
      }
    ],
    "related_capabilities": [
      "finance/accounts_payable",
      "document_processing/invoice",
      "finance/fraud_detection"
    ],
    "common_use_cases": [
      "Invoice validation",
      "PO matching",
      "Approval routing"
    ]
  }
}

Duplicate Detection

Before creating a new agent:

// Check for duplicates
const similar = await agentDiscovery.findSimilarAgents({
  capabilities: ['finance/invoice/process'],
  description: 'Process invoices for ACME',
}, context);

// Returns:
{
  exact_matches: [],  // Same capabilities + description
  high_overlap: [
    {
      agent: acme_ap_workflow,
      overlap_score: 0.85,
      shared_capabilities: ['finance/invoice/process'],
      recommendation: 'Consider extending existing agent'
    }
  ],
  moderate_overlap: [
    {
      agent: invoice_fraud_detector,
      overlap_score: 0.45,
      shared_capabilities: ['document_processing/invoice'],
      recommendation: 'Could use as dependency'
    }
  ]
}
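
How `overlap_score` is computed is not specified here. One simple possibility is Jaccard similarity over the two agents' capability sets, bucketed by thresholds. The sketch below illustrates that idea only: the scoring method, the 0.7 and 0.3 thresholds, and the function names are assumptions. A production score would presumably also weigh description similarity.

```typescript
// Jaccard similarity: |A ∩ B| / |A ∪ B| over capability IDs.
function capabilityOverlap(a: string[], b: string[]): number {
  const setA = new Set<string>(a);
  const setB = new Set<string>(b);
  if (setA.size === 0 && setB.size === 0) return 0;
  let shared = 0;
  setA.forEach((capability) => {
    if (setB.has(capability)) shared++;
  });
  return shared / (setA.size + setB.size - shared);
}

type OverlapBucket = "exact_match" | "high_overlap" | "moderate_overlap" | "none";

// Assumed thresholds, chosen so the example scores above (0.85, 0.45)
// fall in the buckets they are shown in.
function overlapBucket(score: number): OverlapBucket {
  if (score >= 1) return "exact_match";
  if (score >= 0.7) return "high_overlap";
  if (score >= 0.3) return "moderate_overlap";
  return "none";
}
```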

AUTO-GENERATED AGENT DOCUMENTATION

The Vision: API Reference for Agents

Every agent should have API-style documentation like Stripe or Twilio docs.

What developers need:

Capabilities: What does this agent do?
Entrypoints: How do I call it?
Input/Output schemas: What data does it expect/return?
Examples: Show me working code
Reasoning requirements: What models does it need?
Compatibility: Will it work in my org?
Usage stats: How reliable is it?

Auto-Generated from Manifest

Agent documentation is generated from agent.yaml:

# Invoice Fraud Detector

**Publisher:** acme-security-corp  
**Certification:** Gold  
**Visibility:** Marketplace (public)  
**Version:** 1.0.3  
**Last Updated:** 2025-12-15

## Description

Analyzes invoices for fraud patterns using AI + human review.

## Capabilities

- `finance/fraud_detection` - Detect fraudulent transactions
- `document_processing/invoice` - Extract and analyze invoice data

## Installation

```bash
human agent install invoice-fraud-detector
```

Usage

const result = await human.call({
  target: "agent://invoice-fraud-detector.analyze",
  input: {
    document_id: "inv_123",
    vendor_id: "vendor_456",
    amount: 50000,
  },
  delegation: passport.delegate({
    scope: ["read:documents", "write:audit_log"],
  }),
});

// Returns:
// {
//   risk_score: 0.87,  // 0-1
//   fraud_indicators: ["unusual_amount", "new_vendor"],
//   recommended_action: "review",
//   confidence: 0.92
// }

Entrypoints

`analyze`

Analyzes a single invoice for fraud indicators.

Input Schema:

{
  "type": "object",
  "required": ["document_id", "amount"],
  "properties": {
    "document_id": {
      "type": "string",
      "description": "ID of invoice document"
    },
    "vendor_id": {
      "type": "string",
      "description": "Vendor identifier"
    },
    "amount": {
      "type": "number",
      "description": "Invoice amount"
    },
    "context": {
      "type": "object",
      "properties": {
        "previous_invoices": {
          "type": "number",
          "description": "Number of prior invoices from this vendor"
        }
      }
    }
  }
}

Output Schema:

{
  "type": "object",
  "properties": {
    "risk_score": {
      "type": "number",
      "description": "Risk score from 0 (safe) to 1 (high risk)"
    },
    "fraud_indicators": {
      "type": "array",
      "items": { "type": "string" },
      "description": "List of detected fraud indicators"
    },
    "recommended_action": {
      "type": "string",
      "enum": ["approve", "review", "reject"],
      "description": "Recommended next action"
    },
    "confidence": {
      "type": "number",
      "description": "Confidence in assessment (0-1)"
    }
  }
}

Reasoning Requirements

Capabilities: classification, anomaly_detection, natural_language
Min context window: 16,000 tokens
Allows PHI: true
Regulatory domains: finance
Supported profiles: high_safety, standard_safety

Compatibility: Will work with your org if you have:

✅ GPT-4, Claude 3.5 Sonnet, or equivalent
✅ standard_safety or higher reasoning profile
✅ 16K+ context window models

Usage Statistics

Total installs: 1,247
Active orgs: 892
Calls (last 30 days): 2.4M
Avg latency: 1.2s
Success rate: 99.7%
Rating: 4.8/5 (230 reviews)

Pricing

Free tier: 100 calls/month
Pro tier: $49/month (1,000 calls)
Business tier: $199/month (unlimited)
Enterprise: Custom pricing

Support

Documentation: https://docs.acme-security.com/fraud-detector
Contact: support@acme-security.com
Repository: https://github.com/acme/invoice-fraud-detector

Generated from agent manifest on 2025-12-19


Documentation Export Formats

CLI commands:

# Generate Markdown docs
human agent docs generate --format markdown --output ./docs/

# Generate OpenAPI spec
human agent docs generate --format openapi --output ./openapi.yaml

# Generate interactive HTML site
human agent docs generate --format html --output ./agent-docs/

# Generate for specific agents
human agent docs generate invoice-fraud-detector --format markdown

OpenAPI export:

# Auto-generated OpenAPI spec
openapi: 3.0.0
info:
  title: Invoice Fraud Detector Agent
  version: 1.0.3
  description: Analyzes invoices for fraud patterns

paths:
  /agents/invoice-fraud-detector/analyze:
    post:
      summary: Analyze invoice for fraud
      operationId: analyzeInvoice
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/AnalyzeInput'
      responses:
        '200':
          description: Analysis complete
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/AnalyzeOutput'

components:
  schemas:
    AnalyzeInput:
      type: object
      required: [document_id, amount]
      properties:
        document_id:
          type: string
        vendor_id:
          type: string
        amount:
          type: number
    AnalyzeOutput:
      type: object
      properties:
        risk_score:
          type: number
        fraud_indicators:
          type: array
          items:
            type: string
        recommended_action:
          type: string
          enum: [approve, review, reject]

Use with Postman, Swagger UI, etc.

Organization-Wide Agent Reference

Generate reference for all installed agents:

# Export all installed agents
human agent docs generate --scope org --output ./org-agent-reference/

# Creates:
org-agent-reference/
  index.html
  agents/
    invoice-fraud-detector.html
    acme-ap-workflow.html
    contract-risk-analyzer.html
    ...
  capabilities/
    finance.html
    legal.html
    ...
  search.js

Browsable site:

┌─────────────────────────────────────────────────────────────┐
│ ACME Corp Agent Reference                      [Search...]  │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│ Quick Links                                                  │
│ • All Agents (15)                                           │
│ • By Capability                                             │
│ • By Certification                                          │
│ • Usage Statistics                                          │
│                                                              │
│ ──────────────────────────────────────────────────────────  │
│                                                              │
│ Finance Agents (5)                                          │
│                                                              │
│ 🟢 Invoice Fraud Detector [Gold, Marketplace]              │
│    Analyzes invoices for fraud patterns                     │
│    15.2K calls/month • 99.7% success                        │
│    [View Docs] [Usage Stats] [Call Examples]               │
│                                                              │
│ 🟢 ACME AP Workflow [Org-private]                          │
│    ACME-specific accounts payable automation                │
│    8.9K calls/month • 98.2% success                         │
│    [View Docs] [Usage Stats] [Call Examples]               │
│                                                              │
│ ... more agents ...                                         │
│                                                              │
└─────────────────────────────────────────────────────────────┘

IDE INTEGRATION: AGENT-AWARE DEVELOPMENT

Cursor as Agent-Aware IDE

Modern AI-driven IDEs (Cursor, Windsurf, etc.) should be fully aware of the agent ecosystem.

MCP Server for Agent Discovery

HUMAN provides an MCP server that exposes agent information to IDEs:

// packages/mcp-server-agents/src/index.ts

import { Server } from '@modelcontextprotocol/sdk/server/index.js';
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
import {
  CallToolRequestSchema,
  ListResourcesRequestSchema,
  ReadResourceRequestSchema,
} from '@modelcontextprotocol/sdk/types.js';

// agentIndexer, generateAgentDocs, suggestAgentsForIntent, and buildCapabilityIndex
// are HUMAN-internal helpers; their imports are omitted from this listing.

const server = new Server({
  name: 'human-agents-server',
  version: '0.1.0',
}, {
  capabilities: { tools: {}, resources: {} },
});

// Tools: the SDK registers handlers by request schema, not by method string
server.setRequestHandler(CallToolRequestSchema, async (request) => {
  const { name } = request.params;
  const args = request.params.arguments ?? {};

  // Tool: Search for agents by capability
  if (name === 'search_agents') {
    const { capability, keywords, include_marketplace } = args;

    const agents = await agentIndexer.findAgents({
      capabilities: capability ? [capability] : undefined,
      keywords,
      includeMarketplace: include_marketplace,
    });

    return { content: [{ type: 'text', text: JSON.stringify(agents, null, 2) }] };
  }

  // Tool: Full documentation for one agent
  if (name === 'get_agent_details') {
    const { agent_id } = args;
    const agent = await agentIndexer.getAgent(agent_id);
    const docs = await generateAgentDocs(agent);
    return { content: [{ type: 'text', text: docs }] };
  }

  // Tool: Suggest agents that match the code being written
  if (name === 'suggest_agents_for_code') {
    const { code_snippet, intent } = args;
    const suggestions = await suggestAgentsForIntent(intent, code_snippet);
    return { content: [{ type: 'text', text: JSON.stringify(suggestions, null, 2) }] };
  }

  // Fail loudly instead of returning undefined for an unrecognized tool
  throw new Error(`Unknown tool: ${name}`);
});

// Resource: List all installed agents
server.setRequestHandler(ListResourcesRequestSchema, async () => {
  return {
    resources: [
      {
        uri: 'human://agents/installed',
        name: 'Installed Agents',
        mimeType: 'application/json',
      },
      {
        uri: 'human://agents/marketplace',
        name: 'Marketplace Agents',
        mimeType: 'application/json',
      },
      {
        uri: 'human://agents/capabilities',
        name: 'Agent Capabilities Index',
        mimeType: 'application/json',
      },
    ],
  };
});

// Resource: Read agent data
server.setRequestHandler(ReadResourceRequestSchema, async (request) => {
  const { uri } = request.params;

  if (uri === 'human://agents/installed') {
    const agents = await agentIndexer.listInstalled();
    return {
      contents: [{
        uri,
        mimeType: 'application/json',
        text: JSON.stringify(agents, null, 2),
      }],
    };
  }

  if (uri === 'human://agents/capabilities') {
    const capabilityIndex = await buildCapabilityIndex();
    return {
      contents: [{
        uri,
        mimeType: 'application/json',
        text: JSON.stringify(capabilityIndex, null, 2),
      }],
    };
  }

  // human://agents/marketplace would be served the same way (reader not shown)
  throw new Error(`Unknown resource: ${uri}`);
});

// Serve over stdio, the transport used when the IDE launches the server as a command
const transport = new StdioServerTransport();
await server.connect(transport);
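
MCP clients discover tools through a tools/list request before calling them, so the three tool names handled above also need descriptors. A minimal sketch of that listing as plain data follows. The descriptions, the `agentTools` and `listTools` names, and the input schemas (including treating `keywords` as a single string) are assumptions inferred from the handler's argument names, not a published contract:

```typescript
// One entry in a tools/list result: a name, a description, and a JSON Schema for the input.
interface ToolDescriptor {
  name: string;
  description: string;
  inputSchema: {
    type: 'object';
    properties: Record<string, { type: string; description?: string }>;
    required?: string[];
  };
}

// Assumed descriptors for the three tools; adjust to the real argument contracts.
const agentTools: ToolDescriptor[] = [
  {
    name: 'search_agents',
    description: 'Search installed (and optionally marketplace) agents by capability or keywords',
    inputSchema: {
      type: 'object',
      properties: {
        capability: { type: 'string', description: 'Capability ID, e.g. finance/fraud_detection' },
        keywords: { type: 'string', description: 'Free-text search terms (assumed to be a string)' },
        include_marketplace: { type: 'boolean' },
      },
    },
  },
  {
    name: 'get_agent_details',
    description: 'Return generated documentation for one agent',
    inputSchema: {
      type: 'object',
      properties: { agent_id: { type: 'string' } },
      required: ['agent_id'],
    },
  },
  {
    name: 'suggest_agents_for_code',
    description: 'Suggest agents that match a code snippet and a stated intent',
    inputSchema: {
      type: 'object',
      properties: {
        code_snippet: { type: 'string' },
        intent: { type: 'string' },
      },
      required: ['intent'],
    },
  },
];

// Payload a tools/list handler would return.
function listTools(): { tools: ToolDescriptor[] } {
  return { tools: agentTools };
}
```

With the MCP TypeScript SDK, this payload would be returned from a handler registered with `ListToolsRequestSchema`.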

Cursor Configuration

// .cursor/mcp.json
{
  "mcpServers": {
    "human-agents": {
      "command": "node",
      "args": [".human/mcp-server-agents/index.js"],
      "env": {
        "HUMAN_VAULT_PATH": "${workspaceFolder}/.human/vault",
        "HUMAN_ORG_ID": "org_acme_corp"
      }
    }
  }
}

Auto-Updated `.cursorrules`

HUMAN generates .cursorrules from the agent ecosystem:

# .cursorrules (Auto-generated by HUMAN)

Last updated: 2025-12-19T16:30:00Z
Org: ACME Corp (org_acme_corp)
Installed agents: 15 org + 4 personal
Marketplace agents: 1,247 available

## Installed Agents

### Finance (5 agents)
- **invoice-fraud-detector**: Analyzes invoices for fraud patterns
  - Capabilities: finance/fraud_detection, document_processing/invoice
  - Usage: 15.2K calls (last 30d), 99.7% success rate
  - Example: `agent://invoice-fraud-detector.analyze`
  
- **acme-ap-workflow**: ACME-specific accounts payable automation
  - Capabilities: finance/accounts_payable, workflow/approval
  - Usage: 8.9K calls (last 30d), 98.2% success rate
  - Example: `agent://acme-ap-workflow.process`

... [full list of installed agents]

## Marketplace Highlights (Relevant to Your Work)

Based on your codebase analysis, these marketplace agents might help:

- **payment-risk-analyzer** (Silver, $99/mo): Similar to your fraud detector but for payments
- **quickbooks-sync** (Silver, free tier): You're calling QuickBooks API manually in 3 places
- **document-parser-pro** (Bronze, $29/mo): Better than your PDF parsing code

## Coding Guidelines

1. **Agent-first development**: Check for existing agents before implementing
2. **Use MCP**: Query `@human` in chat to search agents
3. **Compose agents**: Build workflows from existing agents when possible
4. **Create agents**: If logic is reusable, make it an agent

## Common Patterns in This Workspace

```typescript
// Invoice processing (standard pattern in this org)
const fraud = await human.call({ target: "agent://invoice-fraud-detector.analyze", ... });
const po = await human.call({ target: "agent://acme-po-validator.validate", ... });
if (fraud.risk_score > 0.7) {
  await human.call({ target: "agent://acme-approval-workflow.route", ... });
}
```

## Capability Map

[Semantic index of all capabilities with installed/marketplace agents]

finance/fraud_detection:

- invoice-fraud-detector (installed, gold)
- payment-risk-analyzer (marketplace, silver)

legal/contract/risk:

- contract-risk-analyzer (installed, silver)
- msla-review-agent (marketplace, gold)

... [full capability graph]


IDE Behaviors with Agent Context

Scenario 1: Writing code that could use an agent

// Developer writes:
async function detectFraud(invoice: Invoice) {
  // User starts writing fraud detection logic

Cursor AI (with agent context) responds:

💡 I noticed you're implementing fraud detection. 

You have an installed agent that does this:
  • invoice-fraud-detector (finance/fraud_detection)
  
Instead of implementing this manually, you can use:

[Shows agent call code with proper types]

This agent is maintained by your security team and already
handles 15,000+ calls/month.

Would you like to use it? [Yes] [Show alternatives] [Continue manual]

Scenario 2: Need capability not installed

async function extractNDATerms(contract: string) {
  // Need to parse NDA contract

Cursor AI:

💡 You don't have an agent for NDA extraction, but there's one 
in the marketplace:

  📦 nda-automation-suite (Gold certified)
     • Extracts terms, parties, dates, obligations
     • Handles 50+ NDA templates
     • $49/month or 100 free calls/month
     • 4.8/5 stars (230 reviews)

Install? [Yes, add to org] [Yes, personal trial] [Show alternatives]

Or I can help you create a custom agent if you need different 
functionality.

Scenario 3: Agent composition

async function processContract(contractPdf: Buffer) {
  // Extract text from PDF
  // Analyze for risks
  // Check compliance
  // Route for approval

Cursor AI:

💡 This workflow combines multiple agent capabilities:

1. PDF extraction: document-parser-pro (installed)
2. Risk analysis: contract-risk-analyzer (installed)  
3. Compliance: compliance-checker (installed)
4. Approval routing: acme-approval-workflow (installed)

Here's a composed implementation:

[Shows full workflow code using 4 agents]

All 4 agents are already installed and working together in your org.
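
All three scenarios depend on matching what the developer is trying to do against the agent index. A minimal sketch of one way to rank candidates, using plain keyword overlap, is shown below. The `IndexedAgent` shape and the `rankAgentsForIntent` name are hypothetical; the actual matcher may well use a semantic index instead:

```typescript
// Hypothetical index entry; field names are illustrative only.
interface IndexedAgent {
  id: string;
  installed: boolean;
  capabilities: string[];
  description: string;
}

// Lowercase word tokens; splits on anything that is not a letter or digit
// and drops very short words such as "in" or "of".
function tokenize(text: string): Set<string> {
  return new Set(text.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 2));
}

// Rank agents by how many intent words appear in their ID, description,
// or capabilities. Ties are broken in favor of installed agents.
function rankAgentsForIntent(intent: string, agents: IndexedAgent[]): IndexedAgent[] {
  const wanted = tokenize(intent);
  return agents
    .map((agent) => {
      const have = tokenize(`${agent.id} ${agent.description} ${agent.capabilities.join(' ')}`);
      let overlap = 0;
      wanted.forEach((token) => {
        if (have.has(token)) overlap += 1;
      });
      return { agent, overlap };
    })
    .filter((scored) => scored.overlap > 0)
    .sort((a, b) => b.overlap - a.overlap || Number(b.agent.installed) - Number(a.agent.installed))
    .map((scored) => scored.agent);
}

const ranked = rankAgentsForIntent('detect fraud in invoice', [
  { id: 'payment-risk-analyzer', installed: false, capabilities: ['finance/fraud_detection'], description: 'Scores payments for risk' },
  { id: 'contract-risk-analyzer', installed: true, capabilities: ['legal/contract/risk'], description: 'Analyzes contracts for legal risks' },
  { id: 'invoice-fraud-detector', installed: true, capabilities: ['finance/fraud_detection'], description: 'Analyzes invoices for fraud patterns' },
]);
```

Keyword overlap is deliberately crude ("detect" does not match "detector"), which is one reason a production matcher would likely add stemming or embeddings.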

Natural Language Agent Queries in IDE

Chat with Cursor about agents:

Developer: "@human what agents do I have for processing contracts?"

Cursor (via MCP):
I found 3 agents for contract processing (2 installed, 1 in the marketplace):

1. **contract-risk-analyzer** (installed, org-wide)
   - Capabilities: legal/contract/risk, document_processing
   - Analyzes contracts for legal risks, unfavorable terms
   - Used 234 times this month
   - Avg latency: 2.3s

2. **compliance-checker** (installed, org-wide)
   - Capabilities: legal/compliance, regulatory/audit
   - Checks contracts against regulatory requirements
   - Used 156 times this month
   - Avg latency: 1.8s

3. **nda-automation-suite** (marketplace, not installed)
   - Capabilities: legal/nda, contract/automation
   - Gold certified, $49/mo or 100 calls/mo free
   - Extracts terms, generates summaries, compares versions
   - 4.8/5 stars

Would you like to:
- See example usage code for any of these
- Install nda-automation-suite
- Create a new contract agent

TypeScript Types from Agent Schemas

Auto-generated types for installed agents:

// Auto-generated: .human/types/agents.d.ts

declare module '@human/agents' {
  export namespace invoiceFraudDetector {
    interface AnalyzeInput {
      document_id: string;
      vendor_id: string;
      amount: number;
      context?: {
        previous_invoices?: number;
      };
    }
    
    interface AnalyzeOutput {
      risk_score: number;
      fraud_indicators: string[];
      recommended_action: 'approve' | 'review' | 'reject';
      confidence: number;
    }
    
    function analyze(input: AnalyzeInput): Promise<AnalyzeOutput>;
  }
  
  export namespace contractRiskAnalyzer {
    // ... types for this agent
  }
}

Developer uses types:

import { invoiceFraudDetector } from '@human/agents';

async function processInvoice(invoiceData: InvoiceData) {
  // TypeScript knows the shape!
  const result = await invoiceFraudDetector.analyze({
    document_id: invoiceData.id,
    vendor_id: invoiceData.vendorId,
    amount: invoiceData.amount,
  });
  
  // Autocomplete works:
  if (result.recommended_action === 'approve') {
    console.log(result.fraud_indicators); // string[]
  }
}
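
The generated declarations can be derived mechanically from each agent's output schema. A minimal sketch of that conversion, covering only the schema features used in the fraud detector example (strings, string enums, numbers, arrays), is shown below. The `toInterface` helper and its types are hypothetical and are not the SDK's actual generator:

```typescript
// Minimal subset of JSON Schema handled by this sketch.
type FieldSchema =
  | { type: 'string'; enum?: string[] }
  | { type: 'number' }
  | { type: 'array'; items: FieldSchema };

interface ObjectSchema {
  properties: Record<string, FieldSchema>;
  required?: string[];
}

// Map one schema field to a TypeScript type expression.
function fieldType(schema: FieldSchema): string {
  switch (schema.type) {
    case 'string':
      return schema.enum ? schema.enum.map((v) => `'${v}'`).join(' | ') : 'string';
    case 'number':
      return 'number';
    case 'array': {
      const inner = fieldType(schema.items);
      // Parenthesize unions so that "'a' | 'b'" becomes "('a' | 'b')[]".
      return inner.includes('|') ? `(${inner})[]` : `${inner}[]`;
    }
  }
}

// Render an interface declaration; fields not listed in `required` become optional.
function toInterface(name: string, schema: ObjectSchema): string {
  const required = new Set(schema.required ?? []);
  const fields = Object.entries(schema.properties).map(
    ([key, field]) => `  ${key}${required.has(key) ? '' : '?'}: ${fieldType(field)};`,
  );
  return [`interface ${name} {`, ...fields, '}'].join('\n');
}

const generated = toInterface('AnalyzeOutput', {
  properties: {
    risk_score: { type: 'number' },
    fraud_indicators: { type: 'array', items: { type: 'string' } },
    recommended_action: { type: 'string', enum: ['approve', 'review', 'reject'] },
  },
  required: ['risk_score', 'fraud_indicators', 'recommended_action'],
});
```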

Zero-Config Workspace Setup

# In your HUMAN workspace:
human workspace init

✓ Scanned vaults (15 org + 4 personal agents)
✓ Built capability index
✓ Started MCP server
✓ Generated .cursorrules
✓ Generated TypeScript types
✓ Configured Cursor integration

Cursor AI is now agent-aware!

Try asking:
  @human what agents do I have?
  @human show me fraud detection agents
  @human help me build an invoice workflow
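
The "Generated .cursorrules" step above amounts to rendering the agent index as markdown. A minimal sketch of the Installed Agents section follows. The `InstalledAgent` shape and the `renderInstalledAgents` name are hypothetical, since the real generator is not specified in this document:

```typescript
// Hypothetical input shape for one installed agent.
interface InstalledAgent {
  id: string;
  summary: string;
  domain: string; // display grouping, e.g. "Finance"
  capabilities: string[];
}

// Render the "## Installed Agents" section, one "###" group per domain.
function renderInstalledAgents(agents: InstalledAgent[]): string {
  const lines: string[] = ['## Installed Agents', ''];
  const domains = Array.from(new Set(agents.map((a) => a.domain))).sort();
  for (const domain of domains) {
    const inDomain = agents.filter((a) => a.domain === domain);
    const noun = inDomain.length === 1 ? 'agent' : 'agents';
    lines.push(`### ${domain} (${inDomain.length} ${noun})`);
    for (const agent of inDomain) {
      lines.push(`- **${agent.id}**: ${agent.summary}`);
      lines.push(`  - Capabilities: ${agent.capabilities.join(', ')}`);
    }
    lines.push('');
  }
  return lines.join('\n');
}

const rulesSection = renderInstalledAgents([
  {
    id: 'invoice-fraud-detector',
    summary: 'Analyzes invoices for fraud patterns',
    domain: 'Finance',
    capabilities: ['finance/fraud_detection', 'document_processing/invoice'],
  },
  {
    id: 'acme-ap-workflow',
    summary: 'ACME-specific accounts payable automation',
    domain: 'Finance',
    capabilities: ['finance/accounts_payable', 'workflow/approval'],
  },
  {
    id: 'contract-risk-analyzer',
    summary: 'Analyzes contracts for legal risks',
    domain: 'Legal',
    capabilities: ['legal/contract/risk'],
  },
]);
```

Usage statistics and example call targets would be appended per agent in the same way once that data is available from the index.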

EXAMPLE USE CASES: PROVING THE SDK

Purpose: Concrete, working examples that demonstrate the SDK solves real problems.

These examples serve as:

Validation - If we can't write the example cleanly, the SDK needs work
Benchmark - Every new SDK feature must have a use case
Onboarding - Developers copy-paste and ship

See also: 106_use_case_library.md for complete examples with full code

Use Case 1: Invoice Processing (Medium Complexity)

Scenario: Email arrives with invoice PDF → Extract data → Request human approval if amount > $5,000 → Update QuickBooks

Time to build: 1 hour
Complexity: Medium
Components: Agent + LLM + Database + Human Approval + External API

Key SDK Features Demonstrated

import { handler } from '@human/agent-sdk';

export const invoiceProcessor = handler({
  id: 'invoice-processor',
  capabilities: ['finance/invoice/process'],
  
  async execute(ctx) {
    const { pdfUrl } = ctx.input;
    
    // 1. Infrastructure-invisible: PDF extraction
    const pdfText = await ctx.call.muscle('pdf-extractor', { url: pdfUrl });
    
    // 2. Infrastructure-invisible: LLM with provider abstraction
    const extraction = await ctx.llm.complete({
      model: 'gpt-4',
      messages: [
        { role: 'system', content: 'Extract invoice data as JSON' },
        { role: 'user', content: pdfText }
      ],
      responseFormat: 'json'
    });
    
    const invoice = JSON.parse(extraction.text);
    
    // 3. Infrastructure-invisible: Database (auto-provisioned)
    const invoiceId = await ctx.db.query(
      'INSERT INTO invoices (vendor, amount, status) VALUES ($1, $2, $3) RETURNING id',
      [invoice.vendor, invoice.amount, 'pending']
    );
    
    // 4. Human-in-the-loop: Conditional approval
    if (invoice.amount > 5000) {
      const approved = await ctx.oversight.approve({
        question: `Approve invoice from ${invoice.vendor} for $${invoice.amount}?`,
        context: invoice,
        requiredCapability: 'finance/invoice-approver',
        timeout: 86400  // 24 hours
      });
      
      if (!approved) {
        await ctx.db.query('UPDATE invoices SET status = $1 WHERE id = $2', ['rejected', invoiceId]);
        return { status: 'rejected' };
      }
    }
    
    // 5. External integration: QuickBooks muscle
    await ctx.call.muscle('quickbooks-create-bill', {
      vendor: invoice.vendor,
      amount: invoice.amount,
      lineItems: invoice.lineItems
    });
    
    // 6. Provenance: All logged automatically
    return {
      status: 'processed',
      invoiceId,
      amount: invoice.amount
    };
  }
});

What This Proves

SDK Principle	Demonstrated How
Infrastructure-invisible	No config for PDF, LLM, DB, or QB
Secure by default	Provenance logged automatically
Scale-to-zero	No min instances configured
Progressive permissions	Approval requested when needed
Provider-agnostic	Could swap LLM model with one line

Result: 60 lines of business logic. Zero infrastructure code.
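
One fragile point in the handler is `JSON.parse(extraction.text)`: the model's output is used for a database insert and an approval threshold without being checked. A minimal sketch of a validation step that could sit between the two is shown below. The `parseInvoice` helper and the `ExtractedInvoice` shape are hypothetical and only cover the fields the handler reads:

```typescript
// Fields the handler above reads from the extraction result.
interface ExtractedInvoice {
  vendor: string;
  amount: number;
  lineItems: unknown[];
}

// Parse and validate raw LLM output; throws a descriptive error on bad input.
function parseInvoice(raw: string): ExtractedInvoice {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch (err) {
    throw new Error('LLM output is not valid JSON');
  }
  if (typeof data !== 'object' || data === null) {
    throw new Error('LLM output is not a JSON object');
  }
  const { vendor, amount, lineItems } = data as Record<string, unknown>;
  if (typeof vendor !== 'string' || vendor.trim() === '') {
    throw new Error('Missing vendor');
  }
  // Reject strings like "6234" as well as NaN, Infinity, and negative totals.
  if (typeof amount !== 'number' || !Number.isFinite(amount) || amount < 0) {
    throw new Error('Invalid amount');
  }
  return { vendor, amount, lineItems: Array.isArray(lineItems) ? lineItems : [] };
}

const parsedInvoice = parseInvoice('{"vendor":"Acme","amount":6234,"lineItems":[]}');
```

In the handler, this would replace the bare `JSON.parse` call, so that a malformed extraction fails before anything is written or routed for approval.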

Use Case 2: Contract Review (High Complexity)

Scenario: Analyze contract → Flag risks → Route to lawyer if high-risk

Time to build: 1 hour
Complexity: High
Components: Agent + LLM + Capability Routing + Risk-Based Escalation

Key SDK Features Demonstrated

import { handler } from '@human/agent-sdk';

export const contractReviewer = handler({
  id: 'contract-reviewer',
  capabilities: ['legal/contract-review'],
  
  async execute(ctx) {
    const { contractText } = ctx.input;
    
    // 1. Model selection: Claude for legal reasoning
    const analysis = await ctx.llm.complete({
      model: 'claude-3-5-sonnet',  // Best for legal
      messages: [
        {
          role: 'system',
          content: `You are a legal contract analyzer. Output as JSON: {riskLevel, concerns, recommendations}`
        },
        { role: 'user', content: contractText }
      ],
      responseFormat: 'json'
    });
    
    const review = JSON.parse(analysis.text);
    
    // 2. Capability-based routing: High risk → lawyer
    if (review.riskLevel === 'high') {
      const lawyerReview = await ctx.oversight.escalate({
        question: 'High-risk contract detected. Please review.',
        context: { contractText, riskLevel: review.riskLevel, concerns: review.concerns },
        requiredCapability: 'legal/contract-attorney',  // Only certified attorneys
        priority: 'high'
      });
      
      return {
        status: 'reviewed',
        riskLevel: 'high',
        humanReview: lawyerReview,
        concerns: review.concerns
      };
    }
    
    // 3. Medium risk: optional review (create review link)
    if (review.riskLevel === 'medium') {
      return {
        status: 'flagged',
        riskLevel: 'medium',
        concerns: review.concerns,
        reviewUrl: ctx.oversight.createReviewLink({ context: review })
      };
    }
    
    // 4. Low risk: auto-approve
    return {
      status: 'approved',
      riskLevel: 'low',
      concerns: review.concerns
    };
  }
});

What This Proves

SDK Principle	Demonstrated How
Capability-first routing	`legal/contract-attorney` ensures qualified reviewer
Risk-based escalation	High → lawyer, medium → optional, low → auto
Model abstraction	Selected Claude for legal reasoning
Escalation vs approval	`escalate()` for urgent, `createReviewLink()` for optional

Use Case 3: Multi-Agent Medical Triage (Very High Complexity)

Scenario: Patient symptoms → Triage agent analyzes → Route to specialist → Book appointment

Time to build: 2 hours
Complexity: Very High
Components: Multi-agent orchestration + Capability routing + External integrations

Key SDK Features Demonstrated

import { handler } from '@human/agent-sdk';

export const medicalTriage = handler({
  id: 'medical-triage',
  capabilities: ['healthcare/triage'],
  
  async execute(ctx) {
    const { patientId, symptoms } = ctx.input;
    
    // 1. Call marketplace agent (triage analysis)
    const triageResult = await ctx.call.agent('marketplace:medical-triage-analyzer', {
      symptoms,
      patientHistory: await getPatientHistory(ctx, patientId)
    });
    
    // 2. Severity-based routing
    switch (triageResult.severity) {
      case 'critical':
        // Immediate escalation to on-call physician
        const onCallResponse = await ctx.oversight.escalate({
          question: 'CRITICAL: Patient requires immediate attention',
          context: { patientId, symptoms, triageAnalysis: triageResult },
          requiredCapability: 'healthcare/physician-on-call',
          priority: 'critical',
          timeout: 300  // 5 minutes
        });
        
        // If no response, call emergency services
        if (!onCallResponse) {
          await ctx.call.muscle('emergency-dispatch', { patientId });
        }
        
        return { status: 'escalated', severity: 'critical' };
      
      case 'urgent':
        // Route to specialist via capability
        const specialist = await ctx.call.agent('cap:healthcare-specialist-router', {
          symptoms,
          specialty: triageResult.recommendedSpecialty
        });
        
        // Book urgent appointment (EMR integration)
        const appointment = await ctx.call.muscle('ehr-schedule-appointment', {
          patientId,
          providerId: specialist.providerId,
          urgency: 'same-day'
        });
        
        return { status: 'scheduled', appointmentTime: appointment.time };
      
      case 'routine':
        // Self-care advice
        const advice = await ctx.call.agent('marketplace:medical-self-care-advisor', {
          symptoms,
          severity: 'routine'
        });
        
        return {
          status: 'advised',
          advice: advice.recommendations,
          scheduleUrl: await ctx.call.muscle('ehr-generate-booking-link', { patientId })
        };
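      
      default:
        // Unknown severity: fail loudly rather than silently returning undefined
        throw new Error(`Unexpected triage severity: ${triageResult.severity}`);
    }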
  }
});

What This Proves

SDK Principle	Demonstrated How
Multi-agent orchestration	3 agents + 3 muscles coordinated
Capability routing	`cap:healthcare-specialist-router` finds right specialist
Marketplace integration	Used 2 marketplace agents
Emergency handling	Critical path with fallback to emergency dispatch
External integrations	EHR scheduling via muscle

Use Case 4: Time-Travel Debugging (Developer Experience)

Scenario: Production agent fails → Developer time-travels to failure → Fixes → Redeploys

Time to debug: 5 minutes (vs 2 hours without time-travel)
Complexity: SDK feature validation
Components: Time-travel debugging + Provenance

The Problem (Without HUMAN SDK)

// Production failure:
// Error: "Invoice approval failed"
// 
// Developer must:
// 1. Search logs (10 min)
// 2. Reconstruct state (20 min)
// 3. Find approval context (15 min)
// 4. Reproduce locally (30 min)
// 5. Debug (45 min)
// 
// Total: 2 hours

The Solution (With HUMAN SDK)

# 1. Open console, see failure (30 seconds)
human console

# 2. Click "Time Travel to Failure" (30 seconds)
# → Automatically reconstructs exact state at failure

# 3. See provenance chain (1 minute)
# → Who: Agent invoice-processor
# → What: Approval request
# → When: 2026-01-03 10:30:15Z
# → Why: Amount exceeded threshold ($6,234 > $5,000)
# → Result: Human reviewer rejected (reason: "Duplicate invoice")

# 4. Identify issue: No duplicate detection (1 minute)

# 5. Fix locally (2 minutes)
# Add duplicate check before approval request

# 6. Deploy (1 minute)
human deploy

Total: 5 minutes
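
The fix in step 5 (a duplicate check before the approval request) could look roughly like the sketch below. It assumes the extraction also yields an invoice number, which the Use Case 1 handler does not currently read. The `InvoiceKey` shape and both helper names are hypothetical:

```typescript
// Fields used to decide whether two invoices are the same.
// invoiceNumber is an assumed field, not part of the Use Case 1 handler.
interface InvoiceKey {
  vendor: string;
  invoiceNumber: string;
  amount: number;
}

// Normalize so that casing and stray whitespace do not hide a duplicate.
function fingerprint(invoice: InvoiceKey): string {
  return [
    invoice.vendor.trim().toLowerCase(),
    invoice.invoiceNumber.trim().toLowerCase(),
    invoice.amount.toFixed(2),
  ].join('|');
}

// True when the candidate matches any previously recorded invoice.
function isDuplicate(candidate: InvoiceKey, existing: InvoiceKey[]): boolean {
  const seen = new Set(existing.map(fingerprint));
  return seen.has(fingerprint(candidate));
}

const recorded: InvoiceKey[] = [
  { vendor: 'Acme Corp', invoiceNumber: 'INV-1001', amount: 6234 },
];
const resubmitted = isDuplicate({ vendor: ' acme corp ', invoiceNumber: 'inv-1001', amount: 6234 }, recorded);
const fresh = isDuplicate({ vendor: 'Acme Corp', invoiceNumber: 'INV-1002', amount: 6234 }, recorded);
```

In the handler, the `existing` list would come from a query against the invoices table, and a duplicate would return early instead of reaching `ctx.oversight.approve`.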

What This Proves

SDK Principle	Demonstrated How
Provenance by default	Every step logged automatically
Time-travel debugging	Reconstruct exact state at failure
Exquisite DX	5 min vs 2 hours to debug
Infrastructure-invisible	No logging config required

Use Case 5: Cost Optimization (Automatic)

Scenario: Agent uses expensive model → SDK automatically optimizes → up to 90% cost reduction

Time to optimize: 0 minutes (automatic)
Complexity: SDK intelligence validation
Components: Model routing + Cost tracking

What Developer Writes

import { handler } from '@human/agent-sdk';

export const questionAnswerer = handler({
  id: 'question-answerer',
  capabilities: ['ai/question-answering'],
  
  async execute(ctx) {
    // Developer just says "use LLM"
    const answer = await ctx.llm.complete({
      messages: [
        { role: 'system', content: 'You are a helpful assistant.' },
        { role: 'user', content: ctx.input.question }
      ]
      // No model specified - SDK chooses
    });
    
    return { answer: answer.text };
  }
});

What SDK Does Automatically

// Behind the scenes (developer never sees this):

// Week 1: Use default model (GPT-4)
// → Tracks: 1000 questions, $50 cost, 2.3s avg latency

// Week 2: SDK learns pattern
// → Analysis: Simple questions, high latency tolerance, no complex reasoning
// → Suggestion: Try GPT-3.5-turbo (10x cheaper)

// Week 3: SDK A/B tests
// → GPT-4: $50/1000 questions, 2.3s latency, 95% quality
// → GPT-3.5-turbo: $5/1000 questions, 1.8s latency, 94% quality
// → Decision: Switch to GPT-3.5-turbo (saves $45/1000 = 90% cost reduction)

// Week 4: Auto-optimized
// → All questions route to GPT-3.5-turbo
// → Complex questions (detected via routing logic) still use GPT-4
// → Developer never configured anything
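
The Week 3 decision can be written as a small rule: switch when the candidate is cheaper and the quality drop stays within a tolerance. A minimal sketch using the figures above is shown below. The `ModelStats` shape, the `chooseModel` name, and the 2-point tolerance are assumptions, not the SDK's actual policy:

```typescript
// A/B test results for one model.
interface ModelStats {
  model: string;
  costPer1000: number; // USD per 1000 questions
  quality: number;     // evaluation score between 0 and 1
}

// Switch to the candidate only if it is cheaper and quality drops by at most
// maxQualityDrop (assumed default: 0.02, i.e. two percentage points).
function chooseModel(
  baseline: ModelStats,
  candidate: ModelStats,
  maxQualityDrop = 0.02,
): { model: string; savingsPct: number } {
  const qualityDrop = baseline.quality - candidate.quality;
  const cheaper = candidate.costPer1000 < baseline.costPer1000;
  if (!cheaper || qualityDrop > maxQualityDrop) {
    return { model: baseline.model, savingsPct: 0 };
  }
  const savingsPct = Math.round(
    ((baseline.costPer1000 - candidate.costPer1000) / baseline.costPer1000) * 100,
  );
  return { model: candidate.model, savingsPct };
}

const gpt4: ModelStats = { model: 'gpt-4', costPer1000: 50, quality: 0.95 };
const gpt35: ModelStats = { model: 'gpt-3.5-turbo', costPer1000: 5, quality: 0.94 };
const decision = chooseModel(gpt4, gpt35);
```

With the Week 3 numbers, the saving is (50 - 5) / 50 = 90% for a one-point quality drop, so the rule selects the cheaper model. A candidate scoring 85% would be rejected under the same tolerance.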

What This Proves

SDK Principle	Demonstrated How
Smart defaults	SDK learns usage patterns
Cost optimization	90% reduction without developer action
Adaptive infrastructure	Model selection improves over time
Developer focus	Write business logic, SDK handles efficiency

Use Case Matrix: SDK Features Demonstrated

Use Case	Infrastructure-Invisible	Human-in-Loop	Multi-Agent	Cost Opt	Time-Travel	Capability Routing
Invoice Processing	✅	✅	⚠️ (muscles)	✅	✅	⚠️
Contract Review	✅	✅	❌	✅	✅	✅
Medical Triage	✅	✅	✅	✅	✅	✅
Time-Travel Debugging	✅	❌	❌	❌	✅	❌
Cost Optimization	✅	❌	❌	✅	❌	❌

Coverage: All core SDK principles demonstrated across 5 use cases.

Validation Checklist

Before shipping any SDK feature, validate against these use cases:

Can invoice processing use it? (general purpose validation)
Can contract review use it? (legal domain validation)
Can medical triage use it? (healthcare + multi-agent validation)
Does it help debugging? (DX validation)
Does it reduce cost? (efficiency validation)

If the answer is "no" to all five → the feature may not be necessary.

Developer Journey Validation

Time	Milestone	Use Case That Proves It
5 min	First agent deployed	Echo agent (not shown here; see KB 106)
15 min	Add database	Invoice processing (simplified)
30 min	Add human approval	Invoice processing (complete)
1 hour	Multi-step workflow	Contract review
2 hours	Multi-agent orchestration	Medical triage
1 week	Production-ready	All 5 use cases + monitoring

Next: Full Use Case Library

For complete, copy-paste-ready examples:

See: kb/106_use_case_library.md - 20+ use cases with full code
See: human-labs/quickstart repo (planned) - Runnable examples with tests

Metadata

File: 105_agent_sdk_architecture.md
Created: November 25, 2025
Version: 1.5 (December 19, 2025 - Added: Agent Discovery & Capability Management, Auto-Generated Agent Documentation, IDE Integration patterns with MCP server, vault-based agent storage, Cursor agent-aware development)
Status: Canonical
Classification: Internal (SDK will be Open Source)

Cross-References:

See: 10_ai_internal_use_and_companion_spec.md - Companion as reference implementation
See: 20_passport_identity_layer.md - Delegation model, credential scoping, identity hierarchy
See: 22_humanos_orchestration_core.md - Agent-to-agent delegation chains, provenance
See: 104_companion_meeting_muscles_spec.md - Meeting muscles extracted to SDK
See: 111_consumer_companion_and_agent_store.md - Agent Store marketplace and consumer distribution
See: 112_extension_connector_gtm_roadmap.md - Complete SDK specifications, 25+ connector examples, developer onboarding flows, CLI tool documentation, and GTM strategy
See: 95_open_source_strategy_and_licensing.md - SDK licensing (Apache 2.0)
See: 43_haio_developer_architecture.md - HAIO protocol for developers
See: 50_human_agent_design.md - Agent design principles
See: 130_agent_design_patterns.md - Orchestration and trust patterns

Key Sections Added (December 2025):

CORE SDK PRIMITIVES - Unified ctx.* API pattern, handler() wrapper
AGENT-TO-AGENT DELEGATION MODEL - Chained delegation with auto-scoping
CREDENTIAL MANAGEMENT - Passport > Vault > Env cascade with handler scoping
LLM COST CONTROLS - Tiered thresholds with human escalation
TESTING PATTERNS - Semantic assertions, golden outputs, deterministic fixtures
TIME-TRAVEL DEBUGGING - Metadata vs full capture, replay system
PROMPT VERSIONING - Repo as source, registry as runtime, A/B testing
INFRASTRUCTURE PROVISIONING - Declarative YAML for DB, cache, storage, queue, vector
MULTI-LANGUAGE SDK GENERATION - Protocol Buffers + OpenAPI → TypeScript, Python, Go, Rust
AGENT-READABLE DOCUMENTATION - OpenAPI, llms.txt, context.json formats

For detailed implementation:

Connector SDK Architecture: Full BaseConnector class, OAuthHelper, governance integration patterns - see 112_extension_connector_gtm_roadmap.md lines 1772-2070
25+ Production Connector Examples: Google Calendar, Gmail, Salesforce, AWS Bedrock, Epic EHR, Bloomberg, and more with complete TypeScript interfaces and use cases
CLI Tool Specification: human init, human dev, human test, human publish with AI-assisted development features
Developer Onboarding Flows: 3-5 day path from discovery to first connector published

Next Review: Monthly during development
