108. DEPLOYMENT MODELS & HOSTING STRATEGY
From Hosted to Self-Hosted Without Rewriting: The Zero-Regret Architecture
"Start on HUMAN-hosted in 5 minutes. Move to your own infra in 5 days, without rewriting your app."
That's the barrier killer.
This document defines how HUMAN supports every deployment model, from 10-person teams to regulated enterprises, through clean architectural boundaries and zero-regret migration paths.
THE DESIGN PRINCIPLE: ZERO-REGRET HOSTING
We keep our core invariants:
- Keys live on devices (Passport rooted in hardware, not our cloud)
- Cloud is coordination + proofs, not raw user data
- Storage is pluggable behind clean adapters
The mantra becomes:
"Start on HUMAN-hosted in 5 minutes. Move to your own infra in 5 days, without rewriting your app."
This solves the fundamental tension:
- SMBs need: "Please don't make me think about infra. Just make it work."
- Enterprises need: "Cool idea, but it has to run in our VPC / DB / SIEM / whatever."
One architecture. Multiple deployment profiles. Same APIs. Same semantics.
WHAT HUMAN HOSTS VS WHAT WE REFUSE TO HOST
Think in 3 layers:
Layer 0: Devices (non-negotiable)
Passport keys live on:
- iPhones, Macs, laptops, Android devices, etc.
We never host:
- Root identity keys
- Raw biometric / behavioral signals
We may host:
- Public keys
- Signed assertions ("this key belongs to Org X, role Y"), but not the keys themselves
This is structural: HUMAN cannot hold your identity even if we wanted to.
Layer 1: HUMAN Control Plane (this is the "managed hosting" we do want)
This is the stuff we can run as HUMAN Cloud or you can self-host:
What Lives Here
- Identity federation: Mapping "IdP user 123" to "Passport subject ABC"
- Capability Graph: Roles, permissions, what a given human/agent is allowed to approve
- HumanOS policy engine: "When an AI tries to do X, require human Y or escalate to group Z" (see the sketch below)
- Attestation / ledger interface: Where "AI did X under these conditions with these humans" gets recorded
This is metadata about trust, not the org's documents/emails/PHI.
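To make that escalation rule concrete, here is a minimal sketch of a policy evaluation, assuming a hypothetical rule shape (the field names are illustrative, not the HumanOS schema):
// Hypothetical policy shape -- field names are illustrative, not the HumanOS schema.
type PolicyDecision = 'allow' | 'require_human_approval' | 'escalate_to_group';

interface PolicyRule {
  action: string;            // e.g. "crm.delete_account"
  maxAutonomousRisk: number; // 0-1: above this, a human must be involved
  approver?: string;         // specific human role, e.g. "account-owner"
  escalationGroup?: string;  // fallback group, e.g. "revops-leads"
}

function evaluate(rule: PolicyRule, riskScore: number, approverAvailable: boolean): PolicyDecision {
  if (riskScore <= rule.maxAutonomousRisk) return 'allow';
  if (rule.approver && approverAvailable) return 'require_human_approval';
  return 'escalate_to_group';
}

// "When an AI tries to do X, require human Y or escalate to group Z"
const rule: PolicyRule = {
  action: 'crm.delete_account',
  maxAutonomousRisk: 0.2,
  approver: 'account-owner',
  escalationGroup: 'revops-leads',
};
console.log(evaluate(rule, 0.7, false)); // "escalate_to_group"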
Deployment Options
We:
- Offer this as managed for SMB / fast starts (HUMAN Cloud)
- Offer hybrid / self-hosted for enterprises who want it in their VPC
Layer 2: Data & Systems (we stay aggressively out of here)
Where the data lives:
- Salesforce, Google Workspace, O365, Epic, internal DBs, S3 buckets, etc.
We don't become their database.
We connect to these via:
- OAuth / service accounts / private links
We only store:
- IDs, hashes, and pointers needed for provenance and policy decisions (see the sketch below)
Result:
- Yes to hosting the trust fabric
- No to becoming their data warehouse
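As an illustration of "IDs, hashes, and pointers", a minimal sketch of a pointer-only record, with hypothetical field names (the real schema lives in the Capability Graph and ledger specs):
// Illustrative only: HUMAN stores references and hashes, never the underlying document.
import { createHash } from 'crypto';

interface ProvenancePointer {
  system: string;        // where the real data lives, e.g. "salesforce", "gdrive", "s3"
  externalId: string;    // opaque ID in that system
  contentHash: string;   // SHA-256 of the content, for tamper-evidence
  policyContext: string[]; // which policies referenced this object
}

function pointerFor(system: string, externalId: string, content: Buffer): ProvenancePointer {
  return {
    system,
    externalId,
    contentHash: createHash('sha256').update(content).digest('hex'),
    policyContext: [],
  };
}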
THE THREE DEPLOYMENT PROFILES
We make this a first-class concept:
Profile 1: Hosted (SMB Default)
What it means:
Everything in HUMAN Cloud, except:
- Keys on devices
- Primary data in their SaaS tools
Who it's for:
- 10–200 person companies
- Teams without dedicated infra people
- Organizations prioritizing speed over control
Setup time: 5 minutes
Monthly cost: Plan-based ($X–$XX per user + governed events)
Migration path: Can move to Hybrid or Self-hosted later with export/import tooling
Profile 2: Hybrid (Enterprise Common)
What it means:
HUMAN control plane in our cloud:
- Policy engine
- Capability Graph
- Federation of identities
Data plane in their infra:
- They run the ledger node(s)
- They host any caches / sensitive stores
Connect via:
- Outbound-only secure tunnels
- Or VPC peering, depending on taste
Who it's for:
- Mid-market to enterprise (200–10,000 employees)
- Organizations with infra teams that still want operational simplicity
- Companies with data residency requirements but flexible on control plane
Setup time: 1β2 days
Monthly cost: Platform fee + usage + optional support
Migration path: Can move to full Self-hosted when compliance requires it
Profile 3: Self-Hosted (Regulated / Gov)
What it means:
We give them:
- Helm charts / Terraform / Ansible
- Reference architecture
- AI-powered installation automation (see "AI-Powered Installation Automation" section)
- Compliance templates (HIPAA, FedRAMP, PCI-DSS, GDPR)
They run (self-hosted in their infrastructure):
- HumanOS orchestration engine
  - Policy engine (escalation rules, safety boundaries)
  - Routing logic (capability-first task assignment)
  - Approval queue service
- Capability Graph (org-scoped view)
  - Internal employee and agent capabilities
  - Skill tracking and growth
  - Capability attestations (org-namespaced)
- Workforce Cloud Runtime (internal routing layer)
  - Agent-to-agent task orchestration
  - Agent-to-human escalation (employees or customers)
  - Capability-based task assignment within organization
  - Internal workflow execution engine
- MARA Runtime (agent execution environment)
  - Agent pods and workloads
  - Agent registry service
  - Execution monitoring
- Ledger nodes (local audit trail)
  - Provenance logs
  - Attestation storage
  - Immutable audit trail
  - Optional federation with HUMAN's public ledger
- All storage (databases, object storage, caches)
  - PostgreSQL (policies, workflows, history)
  - Vector store (agent memory, capability embeddings)
  - Object storage (MinIO, S3)
  - Redis (caching, sessions)
- Monitoring & observability stack
  - Prometheus, Grafana
  - Loki (log aggregation)
  - Tempo (distributed tracing)
They access via API (HUMAN-hosted, optional services):
- Workforce Cloud Global Marketplace (optional, pay-per-task)
  - For routing to external trained humans beyond internal staff
  - 24/7 coverage, surge capacity, specialized expertise
  - Academy-trained workers globally
  - Pricing: Usage-based ($50 per escalation + $75 per human-hour)
  - Use cases: Overflow beyond org's staff, specialized skills, 24/7 ops
- Academy Training Platform (free for individuals, volume pricing for enterprises)
  - Web-based access for employee training
  - Always free for displaced workers (Zero Barriers principle)
  - Enterprise bulk programs: $500/employee/year (volume discounts available)
  - Requirements: Internet connectivity (no self-hosted option by design)
  - Integration: SSO, custom learning paths, capability sync via API
  - See: KB 24 (Academy) for full deployment model
- Global Capability Federation (optional, subscription)
  - Cross-org credential verification
  - Interoperability with other HUMAN deployments
  - Verify capabilities from external organizations
  - Pricing: $10K/year
- Public Ledger Anchoring (included in platform license)
  - Global attestation root of trust
  - Distributed ledger for cross-org verification
  - Customer can run federated ledger nodes
  - Anchoring to HUMAN's public ledger for global validity
Deployment modes:
- Air-Gapped (Full Isolation):
  - Runs WCR, MARA, HumanOS, Capability Graph fully offline
  - No access to Global Marketplace or Academy
  - Local-only ledger (no global verification)
  - Org-only capability tracking
  - Requires internal image registries, Helm mirrors
  Canonical note (Passport growth still happens): Even in full isolation, governed work events still update the local Capability Graph and create local attestations. The Passport evolves in place via an updated CapabilityGraphRoot (pointer to the head of the personal graph in the on-prem vault) and new LedgerRefs (attestation anchors on the on-prem ledger).
  Example data placement (air-gapped):
    personalGraphs:
      storage: on_prem_vault
    evidence:
      storage: on_prem_vault
    attestations:
      storage: on_prem_ledger
    federation:
      enabled: false
- Hybrid (Internal + Global Services):
  - Self-hosted control plane + optional HUMAN Cloud services
  - Internal routing for org's tasks
  - API access to Global Marketplace for overflow
  - Employee access to Academy for training
  - Most common for regulated enterprises
Who it's for:
- Regulated industries (healthcare, finance, government)
- Organizations with strict data sovereignty requirements
- Companies with mature platform engineering teams
- Air-gapped environments (defense, intelligence)
- Multi-national corporations with data residency compliance
Setup time:
- With AI installation assistant: 5-15 minutes (automated)
- With intelligent CLI: 1-2 hours (semi-automated)
- Manual Helm/Terraform: 1-2 weeks (depends on infra complexity)
Monthly cost:
- Platform license: $30K-$150K+/year (based on scale, see KB 34)
- Support contract: Included (24/7 for Enterprise Elite)
- Optional services: Usage-based (Workforce Cloud, Academy bulk, Federation)
- Infrastructure costs: Customer responsibility (compute, storage, networking, AI tokens)
Migration path: This is the end state; no further migration needed
SELF-HOSTED SECURITY BOUNDARIES
Overview
Self-hosted deployments provide maximum control and data sovereignty, but they do not grant identity minting authority or bypass cryptographic safeguards.
This section explicitly defines what self-hosted infrastructure CAN and CANNOT do, and why infrastructure compromise doesn't threaten human sovereignty.
Key Principle: Trust derives from cryptography, not operational control.
What Self-Hosted Infrastructure CAN Do
✅ Identity Verification
- Verify Passport signatures cryptographically
- Validate DID resolution and key ownership
- Check delegation chains for authenticity
- Verify attestation signatures
Why Safe: Verification requires only public keys. No private key access needed.
β Org-Scoped Attestations
- Issue attestations within organizational namespace
- Attest to employment, roles, permissions within the org
- Sign attestations with org's private key (held in org HSM)
Why Safe: Org attestations are namespaced. They don't affect other organizations or create global identity.
✅ Policy Engine & Agent Runtime Hosting
- Run HumanOS policy engine
- Host agent execution environments
- Enforce escalation rules and safety boundaries
- Route tasks based on capability requirements
Why Safe: Policy enforcement is read-only verification. Cannot override cryptographic constraints.
✅ Org and Agent Key Custody
- Hold Org Passport keys in organizational HSMs
- Custody Agent Passport keys under policy constraints
- Manage agent delegation certificates
Why Safe: These are delegated identities, not sovereign identities. They derive authority from humans, not from infrastructure.
What Self-Hosted Infrastructure CANNOT Do
These constraints are cryptographically enforced, not policy-based. Violating them renders the deployment non-compliant with the HUMAN Protocol.
❌ Server-Side Human Passport Creation
Forbidden:
- Minting Human Passports on servers
- Generating human identity keys in infrastructure
- Creating "admin" identities that impersonate humans
Why Forbidden:
- Human Passports MUST be created on-device (Secure Enclave, TEE, hardware key)
- Private keys MUST NEVER leave the device
- Only devices can prove human presence (biometric, passkey)
Technical Enforcement:
- Device attestation required for Human Passport minting
- Ledger rejects Human Passports without device signature
- Other deployments reject server-minted identities
Result: Self-hosted infrastructure physically cannot create human identities that other systems will accept.
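A hedged sketch of the check a compliant ledger node could apply to a mint request; the request shape and attestation fields are assumptions, not the protocol's wire format:
// Hypothetical request shape -- real device attestation formats (Secure Enclave,
// Android TEE, FIDO2) differ, but the enforcement logic is the same: no device
// proof over the new public key, no Human Passport.
interface PassportMintRequest {
  did: string;
  publicKey: Uint8Array;
  deviceAttestation?: {
    platform: 'secure_enclave' | 'android_tee' | 'fido2';
    signature: Uint8Array; // produced on-device over the new public key
  };
}

function acceptMint(
  req: PassportMintRequest,
  verifyDeviceSignature: (req: PassportMintRequest) => boolean,
): boolean {
  if (!req.deviceAttestation) return false;      // server-minted: no device proof at all
  if (!verifyDeviceSignature(req)) return false; // forged or replayed attestation
  return true;                                   // only device-rooted keys are accepted
}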
❌ Admin-Minted "Human" Identities
Forbidden:
- Admins creating "human" accounts for convenience
- Shared credentials representing multiple humans
- Service accounts masquerading as humans
Why Forbidden:
- Violates identity sovereignty (humans own their identity)
- Breaks provenance (can't distinguish human from admin action)
- Creates liability (who is responsible for actions?)
Technical Enforcement:
- Human Passports require device-rooted keys
- Capability Graph rejects capability updates from non-device sources
- Attestations require human signature, not admin signature
Result: Admin convenience cannot override identity architecture.
❌ Shared or Pooled Human Signing Keys
Forbidden:
- Multiple humans sharing one private key
- "Team" identities with shared credentials
- Delegating human signing authority to infrastructure
Why Forbidden:
- Destroys accountability (who signed this?)
- Breaks provenance chain (no attribution)
- Enables impersonation (anyone with key = "you")
Technical Enforcement:
- Each human has unique DID and keypair
- Private keys never exported from device
- Signature verification checks specific DID
Result: Infrastructure cannot hold human private keys, even if admins request it.
❌ Identity Recovery Performed by Infrastructure
Forbidden:
- Admins "recovering" human identity without human authorization
- Resetting human private keys from servers
- Backdoor recovery mechanisms
Why Forbidden:
- Recovery without human = identity theft
- Breaks trust (infrastructure can impersonate)
- Creates legal liability (unauthorized access)
Technical Enforcement:
- Recovery requires guardian quorum (other humans, not servers)
- Recovery process uses threshold cryptography (no single point of failure)
- Ledger logs all recovery attempts
Result: Only humans (via guardian network) can recover human identity. Infrastructure cannot override.
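A minimal sketch of a guardian-quorum check, assuming Ed25519 signatures verified with the tweetnacl package; the threshold, payload, and key encoding are illustrative:
import nacl from 'tweetnacl';

// A recovery request is only honored when at least `threshold` distinct registered
// guardians have signed it. Infrastructure can count signatures; it cannot produce them.
function quorumMet(
  recoveryPayload: Uint8Array,
  signatures: { guardianKeyHex: string; sig: Uint8Array }[],
  registeredGuardianKeysHex: string[],
  threshold: number,
): boolean {
  const valid = new Set<string>();
  for (const { guardianKeyHex, sig } of signatures) {
    if (!registeredGuardianKeysHex.includes(guardianKeyHex)) continue; // unknown signer
    const key = Uint8Array.from(Buffer.from(guardianKeyHex, 'hex'));
    if (nacl.sign.detached.verify(recoveryPayload, sig, key)) valid.add(guardianKeyHex);
  }
  return valid.size >= threshold; // e.g. 3-of-5 guardians
}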
❌ Silent Identity Creation or Modification
Forbidden:
- Creating identities without human approval
- Modifying identity records without signed consent
- "Backdating" identity changes
Why Forbidden:
- Violates consent (humans must approve)
- Breaks provenance (no audit trail)
- Enables fraud (who made this change?)
Technical Enforcement:
- All identity changes require signature from identity owner
- Ledger anchors record creation and modification timestamps
- Unsigned changes rejected by protocol
Result: Infrastructure cannot modify identity, even with "good intentions."
Breach Blast Radius Analysis
Understanding what an attacker gains by compromising different deployment types:
Hosted Profile Breach (HUMAN Cloud Compromise)
What Attacker Gains:
- Disruption of service (DoS)
- Metadata about API usage (traffic patterns)
- Ability to issue fake attestations (rejected by verification)
What Attacker CANNOT Gain:
- Human private keys (never stored server-side)
- Ability to mint Human Passports (device-only)
- Ability to impersonate humans (no private keys)
- Ledger modification (distributed, immutable)
Customer Impact:
- Hosted customers: Service interruption (failover to backup region)
- Self-hosted customers: Zero impact (independent deployments)
Mitigation:
- Multi-region active-active (automatic failover)
- Keys on devices (zero server-side exposure)
- Ledger distribution (no single point of truth)
Hybrid Profile Breach (Data Plane Compromise)
What Attacker Gains:
- Access to customer's ledger nodes (can disrupt sync)
- Access to org attestations (can view org-specific data)
- Potential ability to issue fake org attestations (namespaced)
What Attacker CANNOT Gain:
- Human private keys (on devices)
- Ability to mint Human Passports (device-only)
- Access to other orgs' data (namespace isolation)
- Ability to override human decisions (cryptographically enforced)
Customer Impact:
- Affected customer: Must revoke org key and re-issue attestations
- Other customers: Zero impact (namespace isolation)
- Humans: Zero impact (keys on devices)
Mitigation:
- Org key revocation via ledger broadcast
- Namespace isolation prevents cross-org contamination
- Audit trail reveals all actions during compromise window
Self-Hosted Profile Breach (Full Infrastructure Compromise)
What Attacker Gains:
- Full access to org's deployment (database, services, keys)
- Ability to issue fake org-scoped attestations
- Ability to disrupt org's operations
- Metadata about org's agent usage
What Attacker CANNOT Gain:
- Human private keys (on devices, not in infrastructure)
- Ability to mint Human Passports (device-only)
- Ability to impersonate humans in other orgs (namespace isolation)
- Ability to modify capability records for humans (requires human signature)
- Access to distributed ledger state (replicated across network)
Customer Impact:
- Affected org: Must revoke org key, rebuild infrastructure
- Other orgs: Zero impact (namespace isolation)
- Humans: Identity intact (keys on devices)
Mitigation:
- Human keys never in infrastructure (zero exposure)
- Org key revocation invalidates all attestations
- Other orgs' verification rejects compromised attestations
- Humans can revoke consent and move to new org deployment
Critical Insight: Even complete infrastructure compromise doesn't grant attacker human identity authority.
Open Source Safety Guarantees
Q: Can self-hosted customers modify HUMAN code to bypass these restrictions?
A: No. Protocol compliance is mathematically enforced, not code-enforced.
Why Code Modification Doesn't Grant Authority
- Cryptographic Verification is Protocol-Level
  - Even modified code must verify Ed25519 signatures
  - Invalid signatures are rejected by other nodes
  - Forked implementations cannot interoperate without compliance
- Network Effects Enforce Standards
  - Distributed ledger rejects non-compliant attestations
  - Other deployments ignore invalid signatures
  - Humans choose which implementations to trust
- Device Keys are the Source of Truth
  - Human identity keys live on devices, not in code
  - Infrastructure verifies signatures, doesn't create them
  - Modified infrastructure cannot access device keys
- Interoperability Requires Compliance
  - Non-compliant forks cannot participate in the ledger
  - Attestations from non-compliant deployments are rejected
  - Enterprise customers lose certification
Example Attack (Why It Fails):
// Malicious self-hosted deployment tries to mint a human identity.
// (generateKeyPair and db stand in for the deployment's own crypto library and database.)
async function evilAdminMintHuman() {
  const fakePassport = {
    did: 'did:human:evil-admin-123',
    publicKey: generateKeyPair().publicKey,
    // ... other fields
  };
  await db.passports.insert(fakePassport);
  // ❌ This fails because:
  // 1. No device attestation (requires Secure Enclave signature)
  // 2. DID not registered on distributed ledger
  // 3. Cannot sign with fake private key (device holds real key)
  // 4. Other deployments reject attestations from this DID
  // 5. Humans won't trust this "passport" (no provenance)
  return fakePassport; // Locally stored, but useless
}
Result: Modified code can create database records, but not valid identities.
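For completeness, the verification side that makes the attack fail everywhere else: a sketch of how any compliant node could check an attestation's Ed25519 signature against the DID's ledger-registered key (tweetnacl assumed; structures are illustrative):
import nacl from 'tweetnacl';

interface SignedAttestation {
  issuerDid: string;
  payload: Uint8Array;
  signature: Uint8Array;
}

// resolveKey stands in for a lookup against the distributed ledger's DID registry.
function acceptAttestation(
  att: SignedAttestation,
  resolveKey: (did: string) => Uint8Array | undefined,
): boolean {
  const publicKey = resolveKey(att.issuerDid);
  if (!publicKey) return false; // DID was never registered on the ledger -> reject
  return nacl.sign.detached.verify(att.payload, att.signature, publicKey);
}
// The "evil admin" passport above fails here: its DID resolves to nothing, and it
// cannot produce a signature that verifies against a key held only on a device.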
Comparison: Self-Hosted HUMAN vs Self-Hosted Traditional Identity
| Dimension | Traditional IdP (Okta, Auth0) | HUMAN Self-Hosted |
|---|---|---|
| Identity Creation | Admin creates users | Only devices create humans |
| Key Storage | Server-side (HSM) | Device-only (Secure Enclave) |
| Admin Override | Admins can reset passwords | Admins cannot access human keys |
| Impersonation Risk | Admin can impersonate users | Cryptographically impossible |
| Compromise Blast Radius | All users (admin has master keys) | Org only (humans unaffected) |
| Recovery | Admin initiates | Guardian quorum (other humans) |
| Portability | Vendor lock-in | Globally portable DID |
Why This Matters:
Traditional self-hosted identity gives administrators god-mode access. HUMAN self-hosted gives administrators operational control without identity authority.
This is the architectural innovation that makes self-hosted deployments safe at scale.
Compliance Statement
For regulated industries:
Self-hosted HUMAN deployments comply with:
- HIPAA (patient identity sovereignty)
- GDPR (data subject rights, right to portability)
- eIDAS (qualified electronic signatures)
- SOC 2 (cryptographic key management)
- Zero Trust Architecture (continuous verification, no implicit trust)
Certification:
Self-hosted deployments that violate identity minting rules:
- Lose HUMAN Protocol certification
- Cannot interoperate with HUMAN Cloud or other compliant deployments
- Lose vendor support and updates
- Risk regulatory non-compliance
Audit Trail:
All self-hosted deployments must:
- Log all identity verification events
- Maintain provenance for all attestations
- Participate in distributed ledger (or run private ledger node)
- Submit to periodic compliance audits (for certification)
Why This Architecture Wins
For Enterprises:
- Self-hosting gives control without creating liability
- Infrastructure compromise doesn't expose human identity
- Clear blast radius (org only, not global)
- Regulatory compliance by design
For Humans:
- Identity sovereignty maintained even in self-hosted deployments
- Can leave org without losing identity
- Cannot be impersonated by admins
- Portable across all deployment types
For HUMAN:
- Open source doesn't compromise security
- Self-hosted doesn't fragment protocol
- Network effects reinforce standards
- Trust derives from cryptography, not vendor control
Result: Self-hosted deployments are safe, compliant, and strategically valuable β not a security liability.
AI-POWERED INSTALLATION AUTOMATION
The Problem with Traditional Self-Hosting:
- 1-2 weeks to deploy
- Requires Kubernetes expertise
- 1266 lines of manual YAML
- High error rate, high support burden
- Blocks SMBs from self-hosting
HUMAN's Solution: Installation as a Conversation
Self-hosted HUMAN installs in 5-15 minutes through three automated paths (plus a manual path for advanced cases):
Installation Path 1: Companion Installer (Conversational)
Natural language installation for technical decision-makers:
User: "I want to self-host HUMAN in our AWS VPC for 50 agents with HIPAA compliance"
Companion Installer:
- Detects environment (AWS EKS, RDS available, VPC config)
- Asks 5 clarifying questions (HA requirements, air-gap, integrations)
- Generates optimal configuration (capacity planning, compliance hardening)
- Executes automated installation with human approval at critical steps
- Validates deployment health
- Provides dashboard access + admin credentials
Time: 8-15 minutes
Human involvement: Approve 3-5 critical decisions
Expertise required: Understand business requirements (not YAML)
Installation Path 2: Intelligent CLI
For engineers who prefer CLI:
$ npx @human/installer init
Detecting environment...
✅ AWS EKS cluster detected (us-east-1)
✅ kubectl configured (v1.28)
✅ PostgreSQL RDS available
Configuration wizard (5 questions):
? Agent capacity: 50 agents
? High availability: Yes (multi-AZ)
? Compliance: HIPAA
? Air-gapped: No
Installing HUMAN...
[Progress bars for each component]
Complete (12m 34s)
Time: 10-20 minutes
Human involvement: Answer 5 questions
Expertise required: Basic cloud/k8s familiarity
Installation Path 3: Cloud Marketplace
One-click deployment for enterprises:
- AWS Marketplace → Click "Launch" → HUMAN deployed in 10 minutes
- GCP Marketplace → Same experience
- Azure Marketplace → Same experience
Time: 5-10 minutes (fully automated)
Human involvement: Click "Subscribe"
Expertise required: None
Installation Path 4: Manual (Advanced)
For maximum customization or air-gapped environments with no installer access:
- Follow the detailed implementation spec: setup/agent_deployment_selfhosted_spec.md
- Manual Helm/kubectl commands
- Full control over every configuration detail
Time: 1-2 weeks
Human involvement: Full manual configuration
Expertise required: Deep Kubernetes/infrastructure knowledge
How AI-Powered Installation Works
Environment Detection:
- Cloud provider (AWS, GCP, Azure, bare metal)
- Kubernetes version and capabilities
- Existing infrastructure (databases, storage, monitoring)
- Network configuration (VPC, subnets, security groups)
- Compliance posture (encryption, audit logs)
Configuration Generation:
- Capacity planning (CPU, memory, storage based on agent count)
- Compliance templates (HIPAA, FedRAMP, PCI hardening)
- High availability (multi-AZ, failover, backup)
- Cost optimization (minimum viable resources)
- Security best practices (CIS benchmarks, zero-trust)
Automated Installation:
- Pre-flight validation (capacity, permissions, connectivity)
- Kubernetes resource creation (namespaces, deployments, services)
- Database schema migration
- Secrets management (encryption at rest)
- Network policies (zero-trust networking)
- Monitoring stack deployment (Prometheus, Grafana)
Post-Install Validation:
- All pods healthy
- Database connectivity
- API responsiveness
- Agent registration works
- Storage accessible
- Monitoring operational
Human-in-the-Loop:
- Approve critical decisions (database connection, secrets creation)
- Review generated configuration before apply
- Escalation on errors (with remediation suggestions)
Why This Matters
Before AI-powered installation:
- Self-hosting = enterprise-only (requires dedicated ops team)
- SMBs blocked from data sovereignty
- High support burden for HUMAN
- Slow adoption, high friction
After AI-powered installation:
- Self-hosting = accessible to SMBs
- 5-15 minute setup (vs 1-2 weeks)
- 95% success rate (vs ~60% manual)
- Low support burden
- Fast adoption, low friction
This is Living HAIO: AI agents installing and configuring AI agent infrastructure.
Status: Vision documented (this PRD). Implementation: Q1 2026.
See: KB 50 (Human Agent Design) for agent architecture.
SELF-HOSTED ENTERPRISE REQUIREMENTS
Licensing & Enforcement
License Types:
| License Type | Annual Price | Agent Limit | Support Level | Use Case |
|---|---|---|---|---|
| Development | $0 | 5 | Community | Testing, staging environments |
| Production (Node-Locked) | $30,000 | 200 | Standard | Single datacenter deployment |
| Production (Floating) | $50,000 | 200 | Standard | Multi-datacenter with failover |
| Enterprise (Unlimited) | $100,000+ | Unlimited | Enterprise + TAM | Global deployments, MSPs |
Enforcement Mechanism:
- License key validated on control plane startup
- Cryptographic signature verification
- Phone-home validation (once per 24hr, optional for air-gapped)
- Grace period: 30 days after expiry (with warnings)
- Air-gapped: Offline license validation via signed JWT
License Renewal:
- Automated renewal reminders (90, 60, 30, 7 days)
- Zero-downtime renewal (hot-swap license keys)
- Volume discounts for multi-year contracts
Support & Service Level Agreements
Support Tiers:
| Severity | Response Time | Resolution Target | Channels | Included In |
|---|---|---|---|---|
| P0 (System Down) | <1 hour | <4 hours | Phone, Slack, Email | Enterprise |
| P1 (Critical Impact) | <4 hours | <24 hours | Slack, Email | Standard+ |
| P2 (Moderate Impact) | <8 hours | <3 days | Email, Portal | Standard+ |
| P3 (Low Impact) | <24 hours | <7 days | Portal | All (incl Community) |
Support Access Requirements:
- Standard: Business hours (9-5 local time), email + portal
- Enterprise: 24/7, dedicated Slack channel, phone, TAM assigned
- Community: Forums, GitHub issues, community Slack (best-effort)
Enterprise Support Add-Ons:
- Technical Account Manager (TAM): +$10k/year
- Professional Services: $250/hour
- Onsite Training: $2k/person (2-day workshop)
- Compliance Certification Support: $15k/year (HIPAA, FedRAMP guidance)
Total Cost of Ownership (TCO) Analysis
TCO Comparison: Hosted vs Self-Hosted (50 agents, 3 years)
| Cost Component | HUMAN-Hosted | Self-Hosted |
|---|---|---|
| Software License | $0 (usage-based) | $30k/yr × 3 = $90k |
| Infrastructure | Included | $3.2k/mo × 36 = $115k |
| Operational Labor | Included | 0.5 FTE × 3 yr = $180k |
| Support | Included | $0 (Standard incl) |
| Upgrades | Automated | Included |
| Total (3yr) | ~$180k | ~$385k |
Break-Even Analysis:
- Self-hosted TCO higher for <100 agents
- Break-even at ~150-200 agents (3-year horizon)
- Self-hosted wins for >200 agents OR data sovereignty required
When Self-Hosted Makes Sense:
- Regulated industries (HIPAA, FedRAMP, PCI)
- Air-gapped environments (defense, classified)
- Data sovereignty requirements (EU, China, government)
- Very high scale (>200 agents)
- Existing infrastructure (sunk costs in datacenter)
When Hosted Makes Sense:
- Small deployments (<50 agents)
- Fast time-to-value (no infrastructure burden)
- Variable workloads (pay-as-you-go)
- No ops team available
Reference Architectures
Small Enterprise (5-20 agents):
- Kubernetes: 3 nodes, 4vCPU, 8GB each
- Database: PostgreSQL (8vCPU, 32GB, Multi-AZ)
- Storage: 500GB SSD
- Estimated cost: $1,050/month infrastructure + $30k/yr license
Medium Enterprise (20-100 agents):
- Kubernetes: 10 nodes, 8vCPU, 16GB each
- Database: PostgreSQL (16vCPU, 64GB, Multi-AZ + replicas)
- Storage: 2TB SSD
- Estimated cost: $3,200/month infrastructure + $50k/yr license
Large Enterprise (100-500 agents):
- Kubernetes: 30 nodes, 16vCPU, 32GB each
- Database: PostgreSQL (32vCPU, 128GB, Multi-AZ + read replicas)
- Storage: 10TB SSD
- Multi-region deployment (primary + DR)
- Estimated cost: $12k/month infrastructure + $100k/yr license
Global Deployment (500+ agents, multi-region):
- Kubernetes: 100+ nodes across 3+ regions
- Database: Distributed PostgreSQL (CitusDB or similar)
- Storage: 50TB+ distributed
- Multi-cloud (AWS + Azure for resilience)
- Estimated cost: $50k+/month infrastructure + custom licensing
COMPLIANCE READINESS FOR SELF-HOSTED
HIPAA Compliance
HUMAN provides:
- Encryption at rest (database, storage, secrets)
- Encryption in transit (TLS 1.3)
- Audit logging (all access, all actions)
- Access controls (RBAC, MFA)
- Business Associate Agreement (BAA) template
Customer responsible for:
- Administrative safeguards (policies, training)
- Physical safeguards (datacenter security)
- Technical safeguards (network security, backups)
HIPAA-Specific Configuration:
compliance:
  hipaa:
    enabled: true
    auditLogging:
      retention: 6years # HIPAA requirement
      immutable: true
    encryption:
      algorithm: AES-256-GCM
      keyRotation: 90days
    accessControls:
      mfaRequired: true
      sessionTimeout: 15min
HIPAA Checklist: See compliance document docs/compliance/self-hosted-checklists.md
FedRAMP Compliance (Moderate Baseline)
HUMAN provides:
- Automated compliance configuration templates
- Control implementation documentation
- Continuous monitoring dashboards
- Incident response runbooks
Customer responsible for:
- Full FedRAMP authorization package
- Third-party assessment organization (3PAO) audit
- Continuous monitoring (ConMon) program
FedRAMP Support:
- HUMAN can provide FedRAMP compliance support: $15k/year
- Includes: Control mapping, documentation templates, audit support
Note: Full FedRAMP authorization is a 12-18 month process. HUMAN provides technical controls; customer owns authorization.
PCI-DSS Compliance
Applicable if: Processing, storing, or transmitting cardholder data
HUMAN provides:
- Network segmentation (Kubernetes network policies)
- Encrypted storage and transmission
- Access control and logging
- Vulnerability management guidance
Customer responsible for:
- PCI-DSS compliance validation (QSA or SAQ)
- Cardholder data environment (CDE) segmentation
- Regular penetration testing
GDPR Compliance
HUMAN provides:
- Data portability (export APIs)
- Right to erasure (deletion APIs)
- Data processing agreements (DPA)
- Privacy-by-design architecture
Customer responsible for:
- Lawful basis for processing
- Data subject consent management
- Data protection impact assessments (DPIA)
- GDPR compliance program
AIR-GAPPED OPERATIONS (EXTENDED)
Update Distribution Methods
For environments with no external connectivity:
Method 1: USB Transfer
- Download update bundle from HUMAN portal (authenticated)
- Transfer via USB to air-gapped environment
- Verify cryptographic signature
- Apply via installer CLI
Method 2: Secure FTP
- HUMAN pushes updates to customer-controlled SFTP
- Customer pulls to air-gapped environment
- Signature verification required
Method 3: Courier (High-Security)
- Physical media shipment for classified environments
- Tamper-evident packaging
- Chain-of-custody documentation
Update Bundle Contents
human-v1.2.0-airgapped.tar.gz (signed)
├── helm-charts/          # Versioned Helm charts
├── container-images/     # All Docker images (no registry pulls)
├── database-migrations/  # SQL migration scripts
├── installer-cli/        # Offline installer binary
├── license-validator/    # Offline license validation
├── checksums.txt         # SHA256 of all files
└── signature.sig         # GPG signature for verification
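A sketch of the SHA-256 step of that verification, assuming checksums.txt uses the common "hash  path" line format; GPG verification of signature.sig would happen before this:
import { createHash } from 'crypto';
import { readFileSync } from 'fs';

// checksums.txt lines are assumed to look like: "<sha256-hex>  <relative-path>"
function verifyChecksums(checksumFile: string, bundleRoot: string): string[] {
  const failures: string[] = [];
  for (const line of readFileSync(checksumFile, 'utf8').split('\n')) {
    if (!line.trim()) continue;
    const [expected, relPath] = line.trim().split(/\s+/);
    const actual = createHash('sha256')
      .update(readFileSync(`${bundleRoot}/${relPath}`))
      .digest('hex');
    if (actual !== expected) failures.push(relPath);
  }
  return failures; // an empty array means every file in the bundle matched
}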
Local LLM Integration
For air-gapped environments requiring AI capabilities:
Supported Local LLM Providers:
- Ollama (easiest setup)
- vLLM (high performance)
- LocalAI (model-agnostic)
Configuration:
llm:
  provider: ollama
  endpoint: http://ollama.internal:11434
  model: llama2:70b
  airgapped: true
  fallback: none # No external API calls
Model Distribution:
- Models included in air-gapped bundle OR
- Customer downloads separately and transfers
Air-Gapped Certificate Management
Challenge: No external Certificate Authority (CA) access
Solution: Internal CA
tls:
  ca: internal
  certPath: /etc/human/certs/
  keyPath: /etc/human/keys/
  renewalStrategy: manual # No ACME in air-gapped
Process:
- Generate internal CA (one-time)
- Issue certificates for HUMAN components
- Distribute CA cert to all clients
- Manual renewal before expiry (alerts at 30/60/90 days)
Offline License Validation
Standard licensing: Phone-home validation (24hr interval)
Air-gapped licensing: Signed JWT with long expiry
license:
  type: airgapped
  key: <signed-jwt-with-6month-expiry>
  validation: offline
  renewal: manual # Requires new JWT from HUMAN
Renewal process:
- Generate renewal request (includes deployment ID)
- Transfer request to connected environment
- Submit to HUMAN portal
- Receive new signed JWT
- Transfer back to air-gapped environment
- Apply new license (zero-downtime)
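A sketch of how the offline validation and 30-day grace period could be checked at startup, assuming the jsonwebtoken package and an RS256-signed license (claim names and grace-period math are illustrative):
import jwt from 'jsonwebtoken';

const GRACE_PERIOD_DAYS = 30;

// The HUMAN-issued public key ships with the installer; no network call is needed.
function validateLicense(licenseJwt: string, humanPublicKeyPem: string): 'valid' | 'grace' | 'expired' {
  try {
    // Throws if the signature is invalid or the token is malformed.
    jwt.verify(licenseJwt, humanPublicKeyPem, { algorithms: ['RS256'], ignoreExpiration: true });
  } catch {
    return 'expired'; // a bad signature is treated the same as no license
  }
  const { exp } = jwt.decode(licenseJwt) as { exp: number };
  const now = Date.now() / 1000;
  if (now <= exp) return 'valid';
  if (now <= exp + GRACE_PERIOD_DAYS * 86400) return 'grace'; // keep running, warn loudly
  return 'expired';
}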
Fallback & Degraded Mode
If critical services are unavailable in an air-gapped deployment:
- LLM unavailable → Route to human-only workflow
- Monitoring unavailable → Local logging only
- License validation unavailable → Grace period (30-day warning)
Principle: The system remains operational and degrades gracefully.
PERFORMANCE BENCHMARKS & CAPACITY PLANNING
Capacity Planning Formulas
Kubernetes Nodes:
nodes_required = ceil(agent_count / 10)  # 10 agents per node (8vCPU, 16GB)
               + 3                       # Control plane nodes (HA)
               + 2                       # Monitoring nodes
Database:
db_cpu = max(8, agent_count / 25) # 1 vCPU per 25 agents
db_memory_gb = max(32, agent_count * 0.5) # 500MB per agent
db_storage_gb = max(100, agent_count * 2) # 2GB per agent (logs, history)
Redis:
redis_memory_gb = agent_count * 0.1 # 100MB per agent (session cache)
Network Bandwidth:
bandwidth_mbps = agent_count * 5 # 5 Mbps per active agent
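The same formulas as a small helper, for teams that want to script sizing (a sketch; the constants simply mirror the formulas above):
interface CapacityPlan {
  k8sNodes: number; // workers + HA control plane + monitoring
  dbVcpu: number;
  dbMemoryGb: number;
  dbStorageGb: number;
  redisMemoryGb: number;
  bandwidthMbps: number;
}

function planCapacity(agentCount: number): CapacityPlan {
  return {
    k8sNodes: Math.ceil(agentCount / 10) + 3 + 2, // 10 agents/node + 3 control plane + 2 monitoring
    dbVcpu: Math.max(8, agentCount / 25),
    dbMemoryGb: Math.max(32, agentCount * 0.5),
    dbStorageGb: Math.max(100, agentCount * 2),
    redisMemoryGb: agentCount * 0.1,
    bandwidthMbps: agentCount * 5,
  };
}

console.log(planCapacity(200)); // matches the 200-agent worked example below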
Example: 200 Agent Deployment
Kubernetes:
- Nodes: ceil(200/10) + 3 + 2 = 25 nodes (8vCPU, 16GB each)
- Total: 200 vCPU, 400GB RAM
Database:
- CPU: max(8, 200/25) = 8 vCPU
- Memory: max(32, 200*0.5) = 100GB
- Storage: max(100, 200*2) = 400GB
Redis:
- Memory: 200*0.1 = 20GB (clustered)
Network:
- Bandwidth: 200*5 = 1 Gbps
Estimated Infrastructure Cost:
- ~$8k/month (AWS pricing)
Performance Targets
| Metric | Target | Measurement |
|---|---|---|
| API Latency (p50) | <100ms | Time from request to response |
| API Latency (p99) | <500ms | 99th percentile |
| Agent Registration | <5s | Time to register new agent |
| Task Assignment | <2s | Time from task creation to assignment |
| Database Query (p95) | <50ms | 95th percentile query time |
| Failover Time | <30s | Primary node failure to recovery |
| Throughput | 10k req/s | Sustained request rate (per region) |
Load Testing Recommendations
Before production launch:
# Install k6 load testing tool
$ helm install k6-operator k6/k6-operator
# Run load test (simulates 100 agents)
$ k6 run --vus 100 --duration 30m load-test.js
Checks:
✅ API latency p95 < 200ms
✅ Error rate < 0.1%
✅ Database connections stable
✅ Memory usage < 80%
✅ CPU usage < 70%
Performance Tuning
Database Tuning (PostgreSQL):
-- Increase connection pool
max_connections = 500
-- Optimize for read-heavy workload
shared_buffers = 8GB
effective_cache_size = 24GB
Kubernetes Tuning:
# Horizontal Pod Autoscaling
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
spec:
  minReplicas: 3
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
Redis Tuning:
# Redis cluster mode for >10GB data
redis:
  cluster:
    enabled: true
    nodes: 6 # 3 masters + 3 replicas
  maxmemory: 20gb
  maxmemory-policy: allkeys-lru
Monitoring Key Metrics
Infrastructure:
- CPU utilization (target: <70%)
- Memory utilization (target: <80%)
- Disk IOPS (target: <80% capacity)
- Network throughput
Application:
- API request rate
- API error rate (target: <0.1%)
- API latency (p50, p95, p99)
- Database query time
- Cache hit rate (target: >90%)
Business:
- Active agents
- Tasks completed per hour
- Agent utilization rate
- Escalation rate
MULTI-TENANCY IN SELF-HOSTED DEPLOYMENTS
Use Cases
Managed Service Providers (MSPs):
- MSP operates single HUMAN deployment
- Serves multiple client organizations
- Full isolation between clients
System Integrators (SIs):
- SI deploys HUMAN for multiple divisions/subsidiaries
- Shared infrastructure, isolated data
Holding Companies:
- Parent company runs HUMAN
- Subsidiaries use as tenants
- Centralized billing, distributed usage
Architecture: Namespace Isolation
graph TB
  subgraph HUMAN_Control_Plane [HUMAN Control Plane]
    TenantRouter[Tenant Router]
  end
  subgraph Tenant_A [Tenant A: Acme Corp]
    NS_A[Namespace: tenant-acme]
    Agents_A[Agents 1-50]
    DB_A[Database Schema: acme]
  end
  subgraph Tenant_B [Tenant B: GlobalCo]
    NS_B[Namespace: tenant-globalco]
    Agents_B[Agents 1-100]
    DB_B[Database Schema: globalco]
  end
  TenantRouter --> NS_A
  TenantRouter --> NS_B
  NS_A --> Agents_A
  NS_A --> DB_A
  NS_B --> Agents_B
  NS_B --> DB_B
Isolation Guarantees
Network Isolation:
- Kubernetes NetworkPolicy (deny-all by default)
- Traffic between tenants blocked
- Ingress only via tenant-specific endpoints
Compute Isolation:
- Separate namespaces per tenant
- ResourceQuotas enforced (CPU, memory, pods)
- No shared pods between tenants
Data Isolation:
- Separate database schemas per tenant
- Row-level security (RLS) for shared tables
- Encryption keys unique per tenant
Access Isolation:
- Separate RBAC policies per tenant
- Tenant admins cannot access other tenants
- MSP admin has cross-tenant visibility (audit only)
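A sketch of how the Tenant Router might resolve a request to a tenant namespace and database schema; the subdomain convention and lookup table are assumptions, not the shipped router:
// Illustrative only: maps an incoming request to a tenant namespace and DB schema,
// and refuses anything that does not carry a known tenant identity (deny-by-default).
interface TenantContext {
  tenantId: string;
  namespace: string; // e.g. "tenant-acme"
  dbSchema: string;  // e.g. "acme"
}

const TENANTS: Record<string, TenantContext> = {
  acme: { tenantId: 'acme', namespace: 'tenant-acme', dbSchema: 'acme' },
  globalco: { tenantId: 'globalco', namespace: 'tenant-globalco', dbSchema: 'globalco' },
};

function resolveTenant(hostHeader: string): TenantContext {
  // Assumes tenant-per-subdomain, e.g. acme.human.msp.example
  const sub = hostHeader.split('.')[0];
  const tenant = TENANTS[sub];
  if (!tenant) throw new Error(`Unknown tenant: ${sub}`);
  return tenant;
}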
Kubernetes Configuration
Namespace per Tenant:
apiVersion: v1
kind: Namespace
metadata:
  name: tenant-acme
  labels:
    tenant-id: acme
    msp-managed: "true"
ResourceQuota:
apiVersion: v1
kind: ResourceQuota
metadata:
  name: acme-quota
  namespace: tenant-acme
spec:
  hard:
    requests.cpu: "50"      # 50 vCPU
    requests.memory: 100Gi  # 100GB RAM
    pods: "100"             # Max 100 pods
    persistentvolumeclaims: "10"
NetworkPolicy:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-cross-tenant
  namespace: tenant-acme
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              tenant-id: acme # Only same tenant
Licensing for Multi-Tenant MSPs
MSP License:
- Unlimited tenants
- Agent count = sum across all tenants
- Pricing: $100k/yr base + $200/agent/yr
Example:
- MSP serves 10 clients
- Total agents: 500
- Cost: $100k + (500 × $200) = $200k/yr
Alternative: Per-Tenant Licensing
- Each tenant purchases own license
- MSP provides infrastructure only
- HUMAN bills tenants directly
Security Considerations
MSP Responsibilities:
- Network segmentation enforcement
- Resource quota management
- Monitoring and alerting (per-tenant dashboards)
- Backup and disaster recovery (tenant data isolated)
Tenant Responsibilities:
- Application-level access control (who can use agents)
- Compliance with regulations (HIPAA, etc.)
- Agent configuration and management
HUMAN's Role:
- Provide secure multi-tenant architecture
- License enforcement (per tenant)
- Support MSP and tenants (tiered support model)
ENTERPRISE INTEGRATION PATTERNS
Identity Federation
Supported Protocols:
| Protocol | Use Case | Complexity | Recommended For |
|---|---|---|---|
| SAML 2.0 | Enterprise SSO | Medium | Large enterprises, government |
| OAuth2/OIDC | Modern apps | Low | Tech companies, SaaS |
| LDAP/AD | Legacy systems | High | Traditional enterprises |
SAML 2.0 Configuration:
auth:
  provider: saml
  saml:
    entryPoint: https://idp.acme.com/sso
    issuer: https://human.acme.internal
    cert: /etc/human/saml/idp-cert.pem
    identifierFormat: urn:oasis:names:tc:SAML:1.1:nameid-format:emailAddress
    attributeMapping:
      email: http://schemas.xmlsoap.org/ws/2005/05/identity/claims/emailaddress
      firstName: http://schemas.xmlsoap.org/ws/2005/05/identity/claims/givenname
      lastName: http://schemas.xmlsoap.org/ws/2005/05/identity/claims/surname
LDAP Configuration:
auth:
  provider: ldap
  ldap:
    url: ldaps://ldap.acme.com:636
    bindDN: cn=human-service,ou=services,dc=acme,dc=com
    bindPassword: <secret>
    searchBase: ou=users,dc=acme,dc=com
    searchFilter: (uid={{username}})
    groupSearchBase: ou=groups,dc=acme,dc=com
    groupMemberAttribute: memberOf
Corporate Proxy Support
For enterprises with mandatory proxy:
network:
  proxy:
    http: http://proxy.acme.com:8080
    https: http://proxy.acme.com:8080
    noProxy:
      - localhost
      - 127.0.0.1
      - .acme.internal
      - .svc.cluster.local
    caCerts:
      - /etc/ssl/certs/acme-root-ca.crt
VPN & Private Connectivity
AWS Direct Connect:
- Private connection to HUMAN-hosted (hybrid deployment)
- Latency: <10ms
- Bandwidth: 1-100 Gbps
Azure ExpressRoute:
- Private peering to HUMAN control plane
- Redundant connections across regions
GCP Cloud Interconnect:
- Dedicated interconnect for high throughput
Site-to-Site VPN:
- IPsec tunnels for smaller deployments
- Encrypted traffic over internet
Custom Certificate Authority
For enterprises with internal CA:
tls:
  ca: custom
  customCA:
    rootCert: /etc/human/ca/root.crt
    intermediateCerts:
      - /etc/human/ca/intermediate1.crt
      - /etc/human/ca/intermediate2.crt
  certManager:
    enabled: true
    issuer: acme-internal-ca
SIEM Integration
Supported SIEM Platforms:
Splunk:
logging:
  siem:
    provider: splunk
    endpoint: https://splunk.acme.com:8088
    token: <hec-token>
    index: human_logs
    sourcetype: human:json
Microsoft Sentinel:
logging:
  siem:
    provider: sentinel
    workspaceId: <workspace-id>
    sharedKey: <shared-key>
    logType: HumanAgentLogs
IBM QRadar:
logging:
  siem:
    provider: qradar
    endpoint: https://qradar.acme.com
    syslogPort: 514
    protocol: tcp
DLP Integration
For enterprises with Data Loss Prevention:
security:
  dlp:
    enabled: true
    provider: symantec # or forcepoint, mcafee
    endpoint: https://dlp.acme.com/api
    scanOutbound: true
    blockOnViolation: true
    alertOnSuspicious: true
UPGRADE STRATEGY & BREAKING CHANGES
Release Cadence
| Release Type | Frequency | Version Change | Contents |
|---|---|---|---|
| Major | Annual | 1.0 → 2.0 | Breaking changes, new features |
| Minor | Quarterly | 1.1 → 1.2 | New features, no breaking changes |
| Patch | Monthly | 1.1.1 → 1.1.2 | Bug fixes, security patches |
Semantic Versioning
Format: MAJOR.MINOR.PATCH (e.g., 1.2.3)
- MAJOR: Breaking API changes, requires migration
- MINOR: New features, backward compatible
- PATCH: Bug fixes, security patches
Breaking Change Policy
Announcement: 90 days before release
Migration Guide: Published with announcement
Support: Old version supported for 12 months after new major release
Example Timeline:
- Day 0: Announce v2.0 (breaking changes)
- Day 90: Release v2.0
- Day 90-Day 455: Support both v1.x and v2.x
- Day 455: End support for v1.x
Upgrade Process (Zero-Downtime)
Blue-Green Deployment:
# 1. Deploy new version (green) alongside old (blue)
$ helm install human-v2 human/control-plane \
--namespace human-green \
--set version=2.0.0
# 2. Validate green environment
$ human-installer validate --namespace human-green
✅ All health checks passed
# 3. Switch traffic to green (gradual)
$ kubectl patch ingress human --type merge \
-p '{"spec":{"rules":[{"host":"human.acme.internal","http":{"paths":[{"path":"/","pathType":"Prefix","backend":{"service":{"name":"api-gateway-green","port":{"number":8080}}}}]}}]}}'
# 4. Monitor for 24 hours
# 5. Decommission blue environment
$ helm uninstall human-v1 --namespace human
Compatibility Matrix
| Control Plane | Agent SDK | Database Schema | Supported |
|---|---|---|---|
| 1.2.x | 1.2.x | v1.2 | ✅ Yes |
| 1.2.x | 1.1.x | v1.2 | ✅ Yes (N-1 support) |
| 1.2.x | 1.0.x | v1.2 | ❌ No (upgrade agents) |
| 2.0.x | 1.2.x | v2.0 | ❌ No (breaking change) |
Policy: Control plane supports agent SDK from previous minor version (N-1).
Database Migration Safety
Automated Migrations:
- All migrations tested against copy of production data
- Rollback plan for every migration
- Execution time estimated and validated
Example Migration:
-- Migration: v1.2.0 → v1.3.0
-- Estimated time: 15 minutes (10M rows)
-- Rollback: Available (DROP COLUMN is reversible)
BEGIN;
-- Add new column
ALTER TABLE agents ADD COLUMN status_v2 VARCHAR(50);
-- Migrate data (batched)
UPDATE agents SET status_v2 = status WHERE status_v2 IS NULL;
-- Once validated, drop old column (future migration)
-- ALTER TABLE agents DROP COLUMN status;
COMMIT;
Automated Upgrade Testing
Pre-release validation:
# Run upgrade test suite
$ human-test-upgrade --from 1.2.0 --to 1.3.0
Tests:
✅ Database migration (15m 32s)
✅ API compatibility (all endpoints)
✅ Agent SDK compatibility (1.2.x → 1.3.x)
✅ Zero-downtime switchover
✅ Rollback procedure
✅ Performance benchmarks (no regression)
Result: Safe to upgrade
Upgrade Checklist
Pre-Upgrade:
- Review release notes and migration guide
- Backup all data (database, configs, secrets)
- Test upgrade in staging environment
- Schedule maintenance window (or plan zero-downtime)
- Notify users of potential disruption
During Upgrade:
- Deploy new version (blue-green)
- Run database migrations
- Validate new environment health
- Switch traffic gradually (10% → 50% → 100%)
- Monitor error rates and latency
Post-Upgrade:
- Validate all critical workflows
- Check monitoring dashboards
- Verify agent registration
- Confirm database performance
- Decommission old environment (after 24hr)
DAY 2 OPERATIONS FOR SELF-HOSTED
Operational Responsibilities
| Task | Frequency | Owner | Automation |
|---|---|---|---|
| Database backups | Daily | Customer Ops | Automated (Velero) |
| Security patches | Weekly | Customer Ops | Semi-automated (Helm) |
| Certificate renewal | 30 days before expiry | Customer Ops | Automated (cert-manager) |
| Capacity review | Monthly | Customer Ops | Dashboard-driven |
| Performance tuning | Quarterly | Customer Ops + HUMAN TAM | Guided |
| Disaster recovery drill | Quarterly | Customer Ops | Scripted |
| Compliance audit | Annual | Customer Compliance | HUMAN support available |
Automated Backup Strategy
Velero Configuration:
# Backup schedule
apiVersion: velero.io/v1
kind: Schedule
metadata:
  name: human-daily-backup
spec:
  schedule: "0 2 * * *" # 2 AM daily
  template:
    includedNamespaces:
      - human
      - human-runtime
    storageLocation: aws-s3
    volumeSnapshotLocations:
      - aws-ebs
    ttl: 720h # 30 days retention
Database Backup:
# PostgreSQL automated backup (via cron)
0 2 * * * pg_dump -h postgres.acme.internal -U human human | \
gzip > /backups/human-$(date +\%Y\%m\%d).sql.gz
# Retention: 30 days local, 1 year S3
Disaster Recovery Procedures
Recovery Time Objective (RTO): 1 hour
Recovery Point Objective (RPO): 24 hours
DR Scenario 1: Database Failure
# 1. Promote read replica to primary
$ aws rds promote-read-replica --db-instance-identifier human-db-replica-1
# 2. Update connection string
$ kubectl patch configmap human-config \
-p '{"data":{"DB_HOST":"human-db-replica-1.xyz.rds.amazonaws.com"}}'
# 3. Restart affected pods
$ kubectl rollout restart deployment --all -n human
# RTO: ~15 minutes
DR Scenario 2: Complete Region Failure
# 1. Failover to DR region
$ kubectl config use-context human-dr-us-west-2
# 2. Restore from backup
$ velero restore create --from-backup human-daily-backup-20250101
# 3. Update DNS (Route53 or equivalent)
$ aws route53 change-resource-record-sets --hosted-zone-id Z123 \
--change-batch file://failover-dns.json
# RTO: ~1 hour
DR Drill Procedure
Quarterly drill (4 hours):
- Hour 1: Simulate region failure
  - Take primary region offline (controlled)
  - Measure detection time (<5 min target)
- Hour 2: Execute failover
  - Promote DR region
  - Restore from backup
  - Validate data integrity
- Hour 3: Validate DR environment
  - Run health checks
  - Test agent registration
  - Verify API functionality
- Hour 4: Failback to primary
  - Sync data from DR to primary
  - Switch back to primary region
  - Debrief and document improvements
Monitoring & Alerting
Critical Alerts (Page on-call):
| Alert | Threshold | Response Time |
|---|---|---|
| Control plane down | >50% pods unhealthy | <5 min |
| Database connection failure | >10% error rate | <5 min |
| Disk space critical | >90% full | <15 min |
| Certificate expiring soon | <7 days to expiry | <24 hr |
| License expiring | <30 days to expiry | <24 hr |
Warning Alerts (Review next business day):
| Alert | Threshold | Response Time |
|---|---|---|
| High CPU usage | >80% for 30min | <4 hr |
| High memory usage | >85% for 30min | <4 hr |
| Slow API responses | p95 >500ms | <4 hr |
| Failed backups | 2 consecutive failures | <12 hr |
Performance Tuning (Monthly Review)
Checklist:
- Review database slow query log (optimize queries >100ms)
- Check cache hit rate (target: >90%)
- Analyze resource utilization (CPU, memory, disk)
- Review pod auto-scaling behavior (scale-up/down frequency)
- Check for pod restarts (investigate if >5/day)
- Review API error logs (investigate 4xx/5xx patterns)
Common Operations
Scale Up (Add Capacity):
# Add 10 more nodes
$ eksctl scale nodegroup --cluster=human-prod --nodes=35 --name=human-workers
# Adjust HPA max replicas
$ kubectl patch hpa human-api --patch '{"spec":{"maxReplicas":30}}'
Add Region (Multi-Region):
# Deploy to new region
$ human-installer install --region eu-west-1 --profile multi-region
# Configure cross-region replication
$ human-installer configure-replication \
--primary us-east-1 \
--replica eu-west-1 \
--mode async
Rotate Secrets:
# Rotate database password (zero-downtime)
$ human-installer rotate-secret --name DB_PASSWORD --zero-downtime
Steps:
1. Generate new password
2. Add to database (secondary user)
3. Update application to use new password
4. Remove old password from database
5. Validate no errors
IMPLEMENTATION DETAILS
For engineers building or deploying HUMAN agents, detailed implementation specs are available:
Setup Specifications
Each deployment profile has a complete implementation spec with infrastructure configs, monitoring setup, and deployment procedures:
- Hosted Profile: setup/agent_deployment_hosted_spec.md
  - Zero-config deployment flow
  - What HUMAN manages (infrastructure, monitoring, security)
  - API access and authentication
  - Cost structure and visibility
- Hybrid Profile: setup/agent_deployment_hybrid_spec.md
  - Control plane in HUMAN Cloud, execution in customer VPC
  - Secure tunnel configuration (mTLS, no inbound firewall rules)
  - Monitoring options (push to HUMAN Cloud OR self-hosted)
  - Data residency guarantees
- Self-Hosted Profile: setup/agent_deployment_selfhosted_spec.md
  - Complete infrastructure requirements
  - Helm charts and Terraform modules
  - Database setup and network topology
  - Air-gapped deployment support
Monitoring Configurations
Comprehensive, copy-paste configs for all profiles:
setup/monitoring_configurations.md
- Prometheus scraping configs (self-hosted)
- Grafana dashboard JSONs (fleet overview, cost analytics, audit trail)
- Alert rules (agent down, high error rate, budget alerts)
- Distributed tracing setup (Tempo integration)
- Log aggregation (Loki configuration)
Also see: KB 103 (Monitoring & Observability) for architectural overview and best practices.
Control Plane Architecture
setup/mara_humanos_control_plane_v0.2.md
- Control plane deployment by profile
- Routing, policy engine, approval queue
- Async job system and workflow DAG construction
- Cross-profile consistency guarantees
Agent SDK Patterns
setup/human_agent_sdk_patterns_v0.2.md
- human.call() primitive (works identically across all profiles; see the sketch below)
- Delegation and risk classification
- Context propagation and attestation generation
- Profile-aware SDK configuration
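A hypothetical shape for the human.call() primitive referenced above; the authoritative signature lives in the SDK patterns spec, so treat the names here as assumptions:
// Hypothetical shape only -- the authoritative signature is in the SDK patterns spec.
interface GovernedCallResult {
  status: 'completed' | 'pending_approval' | 'escalated';
  attestationId?: string; // reference recorded on the ledger for the audit trail
}

interface HumanCallClient {
  call(action: string, args: Record<string, unknown>): Promise<GovernedCallResult>;
}

async function refundOrder(human: HumanCallClient, orderId: string, amountUsd: number) {
  // The same call runs against Hosted, Hybrid, or Self-Hosted; only the base URL differs.
  return human.call('payments.refund', { orderId, amountUsd });
}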
Quick Reference
| Need | Document |
|---|---|
| Deploy to Hosted (zero-config) | setup/agent_deployment_hosted_spec.md |
| Deploy to Hybrid (data sovereignty) | setup/agent_deployment_hybrid_spec.md |
| Deploy Self-Hosted (full control) | See implementation spec setup/agent_deployment_selfhosted_spec.md |
| Configure Prometheus/Grafana | setup/monitoring_configurations.md |
| Understand control plane | setup/mara_humanos_control_plane_v0.2.md |
| Build agents | KB 105 (Agent SDK Architecture), KB 130 (Design Patterns) |
MIGRATION PATHS: BORING AND REVERSIBLE
We bake migration paths in from day one:
From Hosted → Hybrid
Trigger: "We need attestations in our data lake" or "Compliance wants ledger in our region"
Process:
- Export Capability Graph state
- Export active policies
- Stand up ledger nodes in their VPC
- Configure HUMAN control plane to point to their ledger
- Test with shadow traffic
- Cut over
Downtime: Minutes (not hours)
Code changes required: Zero (just config)
From Hybrid → Self-Hosted
Trigger: "Audit says we need full operational control" or "We're going multi-cloud"
Process:
- Deploy HumanOS services via Helm/Terraform
- Deploy Capability Graph nodes
- Configure storage adapters (their RDS, S3, etc.)
- Migrate control plane state via export/import
- Point apps to the new HUMAN_BASE_URL
- Decommission hosted control plane
Downtime: Hours (planned maintenance window)
Code changes required: URL + credential changes only
From Hosted → Self-Hosted (Skip Hybrid)
Trigger: "We're a 50-person healthcare startup, just got our first enterprise customer, need HIPAA self-hosted"
Process:
Same as Hosted → Hybrid → Self-Hosted, but done in one shot with migration automation
Downtime: 1 day (planned)
Support: We provide migration engineer + runbooks
STORAGE ADAPTER ARCHITECTURE
Everything that persists state in HUMAN goes through a narrow interface:
The Three Stores
1. GraphStore
Stores: Capability Graph nodes and edges
Interface:
interface GraphStore {
  addNode(node: CapabilityNode): Promise<void>;
  addEdge(edge: CapabilityEdge): Promise<void>;
  queryCapabilities(query: CapabilityQuery): Promise<Capability[]>;
  updateCapability(id: string, update: CapabilityUpdate): Promise<void>;
}
Adapters:
- HumanCloudGraphStore (our multi-tenant infra)
- PostgresGraphStore (customer RDS/Aurora)
- Neo4jGraphStore (native graph DB)
- TigerGraphStore (high-performance alternative)
2. PolicyStore
Stores: HumanOS policies, rules, escalation configs
Interface:
interface PolicyStore {
  storePolicy(policy: Policy): Promise<void>;
  getPolicy(id: string): Promise<Policy>;
  evaluatePolicy(context: PolicyContext): Promise<PolicyDecision>;
  listPolicies(filter: PolicyFilter): Promise<Policy[]>;
}
Adapters:
- HumanCloudPolicyStore
- PostgresPolicyStore
- S3PolicyStore (for large orgs with many policies)
3. LedgerStore
Stores: Attestations, provenance records, audit logs
Interface:
interface LedgerStore {
  anchor(attestation: Attestation): Promise<AnchorReceipt>;
  verify(id: string): Promise<VerificationResult>;
  query(filter: AttestationFilter): Promise<Attestation[]>;
  export(range: TimeRange): Promise<AuditExport>;
}
Adapters:
- HumanCloudLedgerStore (hosted distributed ledger)
- LocalLedgerStore (dev/test)
- PrivateLedgerStore (customer-operated nodes)
- SnowflakeLedgerStore (enterprise data lake integration)
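A sketch of how a deployment could select adapters per profile; the mapping follows the profiles described in this document, while the factory shape itself is an assumption:
// Illustrative only: which adapter backs each store in each profile. The adapter
// class names come from the lists above; the selection table is an assumption.
type StorageProfile = 'hosted' | 'hybrid' | 'self_hosted';

interface AdapterSelection {
  graphStore: string;
  policyStore: string;
  ledgerStore: string;
}

const ADAPTERS: Record<StorageProfile, AdapterSelection> = {
  hosted: {
    graphStore: 'HumanCloudGraphStore',
    policyStore: 'HumanCloudPolicyStore',
    ledgerStore: 'HumanCloudLedgerStore',
  },
  hybrid: {
    graphStore: 'HumanCloudGraphStore',
    policyStore: 'HumanCloudPolicyStore',
    ledgerStore: 'PrivateLedgerStore', // ledger nodes run in the customer VPC
  },
  self_hosted: {
    graphStore: 'PostgresGraphStore', // customer RDS/Aurora
    policyStore: 'PostgresPolicyStore',
    ledgerStore: 'PrivateLedgerStore',
  },
};

console.log(ADAPTERS[(process.env.HUMAN_PROFILE as StorageProfile) ?? 'hosted']);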
ONBOARDING FLOWS BY PROFILE
Hosted Onboarding (SMB)
Step 1: Sign up with Google / O365
- Auto-create HUMAN workspace tied to domain
Step 2: Install Companion
- Browser extension + desktop app
- Generates Passport keys locally on device
Step 3: Pick a starter pack
- "AI customer support with human escalation"
- "AI sales assistant with approval gates"
- "AI recruiting assistant with human screen"
Step 4: Connect existing tools
- OAuth to Gmail, Slack, CRM, etc.
- We store pointers, not content
From their POV: No talk of VPC, DBs, S3 buckets. It just works.
Hybrid Onboarding (Enterprise)
Step 1: Start with Hosted for pilot
- Prove value with real workflows
- Security evaluates during pilot
Step 2: Deploy data plane components
- We provide Terraform modules
- They deploy ledger + caches in VPC
- Establish secure tunnel to HUMAN Cloud
Step 3: Migrate attestations
- Historical data exports to their ledger
- New attestations route to their infra
Step 4: Connect enterprise systems
- SSO integration (Okta, Azure AD)
- Private connectors to internal apps
- VPC peering for sensitive workloads
From their POV: Same app experience, but the control plane stays in our cloud while attestations and the data plane live in theirs.
Self-Hosted Onboarding (Regulated)
Step 1: Architecture review
- HUMAN solutions architect + their platform team
- Define: regions, storage, networking, compliance requirements
Step 2: Deploy via IaC
- Helm charts for Kubernetes
- Terraform for AWS/GCP/Azure
- Ansible for on-prem
Step 3: Configure storage adapters
- Point to their RDS, S3, Neo4j, Snowflake, etc.
- Set retention policies, backup strategies
Step 4: Load test and validate
- Run simulated governance load
- Validate attestation integrity
- Test failover scenarios
Step 5: Cut over production apps
- Update HUMAN_BASE_URL in app configs
- Monitor dashboards for anomalies
From their POV: Full control, full visibility, HUMAN becomes infrastructure they operate.
WHAT STAYS THE SAME ACROSS ALL PROFILES
No matter which deployment profile, these don't change:
1. API Surface
Same REST/GraphQL/gRPC endpoints:
- /v1/passport/*
- /v1/capabilities/*
- /v1/humanos/*
- /v1/attestations/*
2. SDKs
Same client libraries:
import { HumanClient } from '@human/sdk';
const client = new HumanClient({
baseUrl: process.env.HUMAN_BASE_URL // <-- only thing that changes
});
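For example, the same call runs unchanged on every profile (the attestations.query method shown here is illustrative, not a documented SDK method):

// Identical application code on Hosted, Hybrid, and Self-hosted;
// only HUMAN_BASE_URL in the environment differs.
const recentAttestations = await client.attestations.query({
  actor: 'agent:support-bot', // illustrative filter
});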
3. Semantics
Same policy language, same attestation format, same capability model
4. Developer Experience
Same docs, same examples, same onboarding tutorials
Result: Moving between profiles is a URL change, not a rewrite.
PRICING IMPLICATIONS BY PROFILE
Updated: 2025-12-19
The Principle: Self-Hosting Changes Margin Mix, Not Core Engines
Whether HumanOS runs fully on HUMAN Cloud, hybrid, or fully self-hosted, we still charge for governed infrastructure, workforce access, and network effects.
What changes: Who pays for infra and our margin per customer
What doesn't change: Whether we get paid
The sovereign cockpit model means orgs pay for:
- The Platform (HumanOS license, Policy Engine, Reasoning Service)
- The Standards (certification, attestation formats, compliance)
- The Network (optional: workforce services, marketplace, cross-org governance)
They DON'T pay for "permission to make decisions."
See 34_revenue_engines_and_tam.md for complete pricing tiers and revenue model.
Hosted (HUMAN Cloud)
What customer pays us:
- Platform license (based on tier: agents + instance capacity)
  - Free: $0/month (3 agents, 10 instances)
  - Starter: $49/month (10 agents, 50 instances)
  - Professional: $199/month (50 agents, 200 instances)
  - Business: $799/month (200 agents, 800 instances)
  - Enterprise: $2,500+/month (custom)
- Infrastructure included (we run the compute)
- Optional: HUMAN-managed reasoning (we front AI token costs)
- Optional: Workforce services (when available in Phase 2)
What we pay (our COGS):
- Compute per instance-hour (infrastructure costs)
- AI tokens (if HUMAN-managed reasoning)
- Support overhead
⚠️ Pricing Validation Note:
Hosted Infrastructure Costs: Instance-hour allowances and overage pricing require validation against production AWS/GCP costs. The tier structure and features are validated. Self-hosted pricing (below) is fully validated.
Economics:
- Highest touch (we run everything)
- Target margin: 60-70% after scale
- Revenue: Platform license + infrastructure bundled
Customer profile:
- Small businesses (5-100 people)
- No IT/DevOps team
- Want "it just works"
- Comfortable with HUMAN-hosted
Example: 15-person law firm at $199/month Professional tier
- Gets 50 agents, 200 concurrent instances
- HUMAN handles all infrastructure
- Firm focuses on using agents, not running them
Hybrid (HUMAN Cloud + Customer Infrastructure)
What customer pays us:
- Platform license (same tiers as Hosted)
- Partial infrastructure (we host some, they host sensitive workloads)
- BYO keys (typically for on-prem reasoning)
- Optional: Workforce services
What we pay:
- Compute for HUMAN-hosted portion only
- Zero costs for their self-hosted portion
What customer pays (their costs):
- Their own infrastructure (VPC, compute for on-prem agents)
- Their own AI token costs (for on-prem reasoning)
Economics:
- Mixed margins (lower than pure hosted, higher than pure self-hosted)
- Revenue: Platform license + partial infrastructure
- Lower compute costs for us (they run sensitive stuff)
Customer profile:
- Mid-size orgs (100-500 people)
- Some IT capability
- Mix of sensitive and non-sensitive workloads
- Want flexibility (cloud for convenience, on-prem for compliance)
Example: 50-person hospital at $799/month Business tier
- Runs PHI-touching agents on-prem (clinical notes, patient data)
- Runs non-PHI agents on HUMAN Cloud (scheduling, billing)
- Gets HIPAA compliance built-in
- Hybrid = best of both worlds
Self-Hosted (Customer Infrastructure)
What customer pays us:
- Platform license only (based on agents/users/scale)
- No infrastructure fees (they run it)
- No per-instance charges (they pay their own compute)
- Support & certification (annual contract)
  - Premium support included
  - Quarterly business reviews
  - Certification services
- Optional: Workforce services (when available)
- Optional: Marketplace (we take rev share on installed agents)
What we pay:
- Minimal control plane infrastructure (metadata only)
- Support team costs
What customer pays (their costs):
- All infrastructure (VPC, Kubernetes, databases, compute)
- All AI token costs (their BYO keys)
- Their own DevOps/SRE team
Economics:
- Lowest touch for us (they run it)
- Pure software licensing margins (80%+)
- Revenue: License + support + optional services
- Highest ACVs (enterprises pay more for control)
Customer profile:
- Large enterprises (500+ people)
- Mature platform engineering team
- Regulated industries (finance, healthcare, government)
- Want full control and data sovereignty
Example: 500-person bank at $30k/year Enterprise license
- Runs everything on their AWS
- Uses their own LLM cluster
- HUMAN provides: software license, certification, support
- Bank's total cost: $30k license + ~$40k their infra = $70k/year
- Bank's value: Replaced $500k BPO contract + $2M fraud savings = 40x ROI
Pricing Summary Table
| Deployment | Platform License | Infrastructure | Support | Our Margin | Customer ACV |
|---|---|---|---|---|---|
| Hosted | $49-799/mo tiers | Included | Email/Phone | 60-70% | $588-9.6k/year |
| Hybrid | Same tiers | Partial (we host some) | Business | 50-60% | $1k-15k/year |
| Self-Hosted | $30k+/year | Customer pays | Enterprise + TAM | 80%+ | $30k-100k+/year |
Key insight: Self-hosted has highest margin (pure software) but requires enterprise sales motion. Hosted has lower margin but scales via self-serve.
Cost Flows by Deployment Mode
Hosted:
Customer pays: $799/month (Business tier)
- To HUMAN: $799/month
  - Platform license: $799
  - Infrastructure: Included
  - AI tokens: Included (up to allowance)
HUMAN pays:
- Compute: ~$200/month (infrastructure for their agents)
- AI tokens: ~$150/month (reasoning calls)
- Support: ~$50/month (allocated)
- Margin: ~$400/month (50%)
Hybrid:
Customer pays: $799/month + their AWS costs
- To HUMAN: $799/month
  - Platform license: $799
  - Infrastructure: Partial (non-sensitive agents)
  - BYO keys for on-prem reasoning
- To AWS (their bill): ~$300/month
  - VPC for on-prem agents
  - Compute for sensitive workloads
  - Their LLM endpoints
HUMAN pays:
- Compute: ~$100/month (only non-sensitive portion)
- Support: ~$50/month
- Margin: ~$650/month (81%)
Customer total cost: $1,099/month
Self-Hosted:
Customer pays: $30k/year license + their infrastructure
- To HUMAN: $30k/year ($2,500/month)
  - Platform license: $30k
  - Support & certification: Included
  - Infrastructure: $0 (they run it)
- To their cloud provider: ~$40k/year
  - Kubernetes cluster
  - Databases
  - Compute for agents
  - Their LLM cluster
HUMAN pays:
- Support team: ~$300/month (allocated)
- Minimal infrastructure: ~$50/month (control plane metadata)
- Margin: ~$2,150/month (86%)
Customer total cost: $70k/year
Customer value delivered: $2.8M/year (savings + revenue)
ROI: 40x
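The margin arithmetic in the three flows above, as a quick sketch (figures are the illustrative ones from this section, not validated pricing):

// Gross margin per deployment mode, using the illustrative monthly figures above.
interface CostFlow {
  revenue: number; // what the customer pays HUMAN per month
  costs: number[]; // HUMAN's allocated monthly costs for that customer
}

const flows: Record<string, CostFlow> = {
  hosted: { revenue: 799, costs: [200, 150, 50] },     // compute, AI tokens, support
  hybrid: { revenue: 799, costs: [100, 50] },          // partial compute, support
  selfHosted: { revenue: 2500, costs: [300, 50] },     // support team, control plane
};

for (const [mode, { revenue, costs }] of Object.entries(flows)) {
  const margin = revenue - costs.reduce((sum, c) => sum + c, 0);
  const pct = Math.round((margin / revenue) * 100);
  console.log(`${mode}: ~$${margin}/month margin (${pct}%)`);
}
// hosted: ~$399/month (50%), hybrid: ~$649/month (81%), selfHosted: ~$2150/month (86%)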
Why This Model Works
For Small Businesses (Hosted):
- Zero infrastructure burden
- Predictable monthly cost
- Scale up as they grow
- Can migrate to hybrid/self-hosted later if needed
For Mid-Market (Hybrid):
- Best of both worlds
- Keep sensitive data on-prem
- Use cloud for convenience
- Optimize costs (don't pay us for compute they can run cheaper)
For Enterprises (Self-Hosted):
- Full control and sovereignty
- Data never leaves their infrastructure
- Regulatory compliance built-in
- Still get platform innovation (we ship updates)
For HUMAN:
- Hosted = lower margin, higher volume (SMB focus)
- Self-hosted = higher margin, lower volume (enterprise focus)
- Both are profitable at scale
- Revenue model survives regardless of deployment choice
Revenue Impact: Deployment Mix Over Time
Year 1 (Platform Launch):
- 80% Hosted (SMBs discovering product)
- 15% Hybrid (early mid-market)
- 5% Self-Hosted (pilot enterprises)
Year 2 (Enterprise Adoption):
- 60% Hosted (SMB growth continues)
- 25% Hybrid (mid-market standard)
- 15% Self-Hosted (enterprise momentum)
Year 3 (Enterprise Dominance):
- 40% Hosted (by customer count, but lower ACVs)
- 30% Hybrid (sweet spot for many)
- 30% Self-Hosted (by revenue, highest ACVs)
Revenue distribution shifts even as customer mix doesn't:
- Hosted customers: Many, but $49-799/month each
- Self-hosted customers: Few, but $30k-100k/year each
By Year 3:
- 10,000 hosted customers × $200/month avg = $24M ARR
- 500 hybrid customers × $1,000/month avg = $6M ARR
- 200 self-hosted customers × $50k/year avg = $10M ARR
- Total Platform Revenue: $40M ARR
Self-hosted is 2% of customers but 25% of platform revenue (and highest margin).
DECISION CRITERIA: WHICH PROFILE SHOULD A CUSTOMER CHOOSE?
| Factor | Hosted | Hybrid | Self-Hosted |
|---|---|---|---|
| Team Size | <200 | 200–5,000 | >1,000 or regulated |
| Infra Team | None / small | Exists | Mature platform eng |
| Data Sensitivity | Low–Medium | Medium–High | Highest |
| Compliance | General | Industry-specific | Regulated (HIPAA, FedRAMP) |
| Speed to Value | Minutes | Days | Weeks |
| OpEx Preference | High (pay us) | Mixed | Low (run it themselves) |
| CapEx Willingness | None | Some | High |
| Vendor Lock-in Concern | Low | Medium | High |
AVOIDING HOSTING AS A BARRIER
The traditional problem:
- Big enterprises: "Cool idea, but it has to run in our VPC"
- SMB: "Please don't make me think about any of that"
Our solution:
- SMB: "It just works, you never see infra"
- Enterprise: "Same APIs, runs in your VPC when you're ready"
The messaging becomes:
For SMB:
"Start here, you never have to touch infra."
For Enterprise:
"Start here, prove value, then shift into your VPC with the same code."
STORAGE AS NON-ISSUE
When an enterprise says: "We only use Snowflake / RDS / Azure SQL / Splunk"
We say: "Cool β here are the adapters, here's a reference deployment, your apps don't change."
The adapter pattern means:
- HUMAN Cloud: optimized multi-tenant storage
- Customer-hosted: we support their preferred vendors
- Migration: export from ours, import to theirs, done
Result: Storage preference becomes a config option, not a deal-breaker.
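A minimal sketch of what "storage preference as a config option" could look like (the adapter keys and environment variables here are illustrative, not a documented config schema):

// Storage backends selected by configuration; swapping Postgres for Neo4j,
// or adding a Snowflake export, changes this object and the adapter it names,
// not the application code that calls GraphStore / PolicyStore / LedgerStore.
const storageConfig = {
  graphStore: { adapter: 'postgres', dsn: process.env.GRAPH_DB_URL },
  policyStore: { adapter: 'postgres', dsn: process.env.POLICY_DB_URL },
  ledgerStore: { adapter: 'private-node', endpoint: process.env.LEDGER_NODE_URL },
};

export default storageConfig;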
WHY THIS ARCHITECTURE WORKS
1. Clean Boundaries
- Devices own keys (Layer 0)
- HUMAN owns coordination (Layer 1)
- Customers own data (Layer 2)
These layers never blur.
2. Pluggable Storage
- Everything behind narrow interfaces
- Swap PostgreSQL for Neo4j? Config change.
- Add Snowflake export? New adapter.
3. Same Semantics Everywhere
- Hosted, Hybrid, Self-hosted: same protocol
- No "enterprise edition" with different behavior
- Migration is boring (the best kind of boring)
4. Revenue Flexibility
- SMB: SaaS economics (high margin)
- Enterprise: Mixed (medium margin, high ACV)
- Self-hosted: Services (lower margin, highest ACV)
Every segment has a profitable path.
HUMAN'S OWN PRODUCTION INFRASTRUCTURE (4-NINES ARCHITECTURE)
This section describes how HUMAN operates its own Hosted profile infrastructure to achieve 99.99% availability.
Multi-Region Active-Active Architecture
HUMAN targets 99.99% (4 nines) availability = 4.3 minutes downtime/month.
To achieve this, HUMAN operates multi-region active-active (not active-passive):
+--------------------------------------------------------------------+
|                        Global DNS (Route53)                        |
|        Latency-based routing + health checks (10s interval)        |
+------------------+--------------------------------+----------------+
                   |                                |
        +----------v--------+  Replication  +-------v-----------+
        |     US-East-1     |<------------->|     US-West-2     |
        |      (Active)     |    <1s lag    |      (Active)     |
        +-------------------+               +-------------------+
        | - EKS: 10 pods    |               | - EKS: 10 pods    |
        | - Load: 50%       |               | - Load: 50%       |
        | - RDS: Primary    |               | - RDS: Replica    |
        | - Redis: Primary  |               | - Redis: Replica  |
        +---------+---------+               +---------+---------+
                  |                                   |
                  +-----------------+-----------------+
                                    |
                       +------------v------------+
                       |       Global State      |
                       |  - DynamoDB (global)    |
                       |  - S3 (multi-region)    |
                       +-------------------------+
Key Characteristics:
- Both regions serve live traffic (50% each)
- Either region can handle 100% load (capacity buffer)
- Automated failover <30 seconds if one region fails
- No single points of failure (distributed across 3+ AZs per region)
- Data replicated in real-time (<1s lag)
Cost Impact:
- Single region: ~$3,500/month
- Multi-region active-active: ~$7,500/month
- Additional cost: $4,000/month for 4-nines availability
- ROI: Prevents customer SLA breaches and reputation damage
Regional Failover Automation
Terraform Configuration:
# terraform/modules/region/main.tf
module "us_east_1" {
source = "./modules/region"
region = "us-east-1"
environment = "production"
is_primary = true # RDS primary (write)
eks_node_count = 10
rds_instance_class = "db.r6g.2xlarge"
rds_multi_az = true
replicate_to = ["us-west-2"]
}
module "us_west_2" {
source = "./modules/region"
region = "us-west-2"
environment = "production"
is_primary = false # RDS read replica (can be promoted)
eks_node_count = 10
rds_instance_class = "db.r6g.2xlarge"
rds_multi_az = true
replicate_from = "us-east-1"
}
# Route53 health checks
resource "aws_route53_health_check" "us_east_1" {
fqdn = "api.us-east-1.human.ai"
port = 443
type = "HTTPS"
resource_path = "/health"
request_interval = 10 # Check every 10 seconds
failure_threshold = 2 # Fail after 2 consecutive failures (20s)
tags = {
Name = "US-East-1 Health Check"
}
}
resource "aws_route53_health_check" "us_west_2" {
fqdn = "api.us-west-2.human.ai"
port = 443
type = "HTTPS"
resource_path = "/health"
request_interval = 10
failure_threshold = 2
tags = {
Name = "US-West-2 Health Check"
}
}
# Global DNS with latency-based routing
resource "aws_route53_record" "api" {
zone_id = aws_route53_zone.human_ai.id
name = "api.human.ai"
type = "A"
set_identifier = "us-east-1"
latency_routing_policy {
region = "us-east-1"
}
health_check_id = aws_route53_health_check.us_east_1.id
alias {
name = module.us_east_1.load_balancer_dns
zone_id = module.us_east_1.load_balancer_zone_id
evaluate_target_health = true
}
}
resource "aws_route53_record" "api_west" {
zone_id = aws_route53_zone.human_ai.id
name = "api.human.ai"
type = "A"
set_identifier = "us-west-2"
latency_routing_policy {
region = "us-west-2"
}
health_check_id = aws_route53_health_check.us_west_2.id
alias {
name = module.us_west_2.load_balancer_dns
zone_id = module.us_west_2.load_balancer_zone_id
evaluate_target_health = true
}
}
Automated RDS Failover Lambda:
// lambda/regional-failover.ts
// Note: rds, route53, logger, pagerduty, provenance, and ZONE_ID are assumed to be
// pre-initialized AWS clients / internal helpers imported elsewhere in this bundle.
export async function handleRegionalFailover(event: CloudWatchEvent) {
const unhealthyRegion = event.detail.region;
logger.critical({ unhealthyRegion }, 'Regional failover triggered');
if (unhealthyRegion === 'us-east-1') {
// Promote the us-west-2 RDS cluster replica to primary
// (cluster-level promotion call, since a DBClusterIdentifier is passed)
await rds.promoteReadReplicaDBCluster({
DBClusterIdentifier: 'human-production-us-west-2',
});
logger.info('RDS replica promoted to primary');
// Update Route53 weights (100% to us-west-2)
await route53.changeResourceRecordSets({
HostedZoneId: ZONE_ID,
ChangeBatch: {
Changes: [
{
Action: 'UPSERT',
ResourceRecordSet: {
Name: 'api.human.ai',
Type: 'A',
SetIdentifier: 'us-east-1',
Weight: 0, // Stop sending to us-east-1
},
},
{
Action: 'UPSERT',
ResourceRecordSet: {
Name: 'api.human.ai',
Type: 'A',
SetIdentifier: 'us-west-2',
Weight: 100, // Send 100% to us-west-2
},
},
],
},
});
logger.info('Route53 updated to route to us-west-2');
}
// Page on-call
await pagerduty.trigger({
severity: 'critical',
summary: `AUTOMATED REGIONAL FAILOVER: ${unhealthyRegion} -> healthy region`,
details: {
unhealthyRegion,
estimatedDowntime: '20-30 seconds',
rdsPromoted: true,
dnsUpdated: true,
},
});
// Log to provenance
await provenance.log({
actor: 'automation:regional-failover',
action: 'promote_secondary_region',
from: unhealthyRegion,
automated: true,
});
}
Database Multi-Region Strategy
Aurora Global Database:
# Primary cluster (us-east-1)
resource "aws_rds_cluster" "primary" {
cluster_identifier = "human-production-primary"
engine = "aurora-postgresql"
engine_version = "15.3"
engine_mode = "provisioned"
master_username = "human_admin"
master_password = data.aws_secretsmanager_secret_version.db_password.secret_string
# Multi-AZ within region
availability_zones = ["us-east-1a", "us-east-1b", "us-east-1c"]
# Global database for cross-region replication
global_cluster_identifier = "human-production-global"
# Automated backups
backup_retention_period = 30
preferred_backup_window = "03:00-04:00"
# Connection pooling
db_cluster_parameter_group_name = aws_rds_cluster_parameter_group.human_pg.name
# Enable Performance Insights
enabled_cloudwatch_logs_exports = ["postgresql"]
}
# Secondary cluster (us-west-2) - read replica
resource "aws_rds_cluster" "secondary" {
provider = aws.us_west_2
cluster_identifier = "human-production-secondary"
engine = "aurora-postgresql"
engine_version = "15.3"
# Replicate from primary
replication_source_identifier = aws_rds_cluster.primary.arn
# Can be promoted to primary on failover
global_cluster_identifier = "human-production-global"
# Multi-AZ
availability_zones = ["us-west-2a", "us-west-2b", "us-west-2c"]
}
# Connection pooling with pgbouncer
resource "aws_ecs_service" "pgbouncer" {
name = "pgbouncer"
cluster = aws_ecs_cluster.human_production.id
task_definition = aws_ecs_task_definition.pgbouncer.arn
desired_count = 3
load_balancer {
target_group_arn = aws_lb_target_group.pgbouncer.arn
container_name = "pgbouncer"
container_port = 6432
}
}
Replication Lag Monitoring:
# Prometheus alert for replication lag
- alert: RDSReplicationLagHigh
expr: |
aws_rds_replica_lag_seconds{cluster="human-production"} > 5
for: 2m
labels:
severity: warning
annotations:
summary: "RDS replication lag >5 seconds"
description: "Replication lag {{ $value }}s may impact failover RTO"
action: "Investigate replication performance"
Zero-Downtime Deployment with Terraform
Kubernetes Blue/Green via Terraform:
# Blue deployment (current production)
resource "kubernetes_deployment" "companion_api_blue" {
metadata {
name = "companion-api-blue"
labels = {
app = "companion-api"
deployment = "blue"
}
}
spec {
replicas = 10
selector {
match_labels = {
app = "companion-api"
deployment = "blue"
}
}
template {
metadata {
labels = {
app = "companion-api"
deployment = "blue"
version = var.current_version
}
}
spec {
container {
name = "companion-api"
image = "human/companion-api:${var.current_version}"
resources {
requests {
cpu = "500m"
memory = "512Mi"
}
limits {
cpu = "1000m"
memory = "1Gi"
}
}
}
}
}
}
}
# Green deployment (new version, starts at 0 replicas)
resource "kubernetes_deployment" "companion_api_green" {
metadata {
name = "companion-api-green"
labels = {
app = "companion-api"
deployment = "green"
}
}
spec {
replicas = var.deploy_active ? 10 : 0 # Controlled by deploy script
selector {
match_labels = {
app = "companion-api"
deployment = "green"
}
}
template {
metadata {
labels = {
app = "companion-api"
deployment = "green"
version = var.new_version
}
}
spec {
container {
name = "companion-api"
image = "human/companion-api:${var.new_version}"
resources {
requests {
cpu = "500m"
memory = "512Mi"
}
limits {
cpu = "1000m"
memory = "1Gi"
}
}
}
}
}
}
}
# Service points to blue or green
resource "kubernetes_service" "companion_api" {
metadata {
name = "companion-api"
}
spec {
selector = {
app = "companion-api"
deployment = var.active_deployment # "blue" or "green"
}
port {
port = 80
target_port = 3000
}
type = "ClusterIP"
}
}
Deploy Script with Automated Rollback:
#!/bin/bash
# scripts/deploy-production.sh
set -e
NEW_VERSION=$1
# 1. Update green deployment to new version
terraform apply \
-var="new_version=${NEW_VERSION}" \
-var="deploy_active=true" \
-target=kubernetes_deployment.companion_api_green
# 2. Wait for green pods ready
kubectl wait --for=condition=available \
deployment/companion-api-green \
--timeout=5m
# 3. Smoke tests
curl -f http://companion-api-green:3000/health || exit 1
# 4. Switch traffic (instant)
terraform apply \
-var="active_deployment=green"
# 5. Monitor for 5 minutes
sleep 300
ERROR_RATE=$(curl -s "http://prometheus:9090/api/v1/query" \
--data-urlencode 'query=sum(rate(http_requests_total{version="'${NEW_VERSION}'",status=~"5.."}[5m])) / sum(rate(http_requests_total{version="'${NEW_VERSION}'"}[5m]))' \
| jq -r '.data.result[0].value[1]')
if (( $(echo "$ERROR_RATE > 0.01" | bc -l) )); then
echo "β Rollback: error rate ${ERROR_RATE} > 1%"
# Instant rollback
terraform apply -var="active_deployment=blue"
exit 1
fi
# 6. Success - scale down blue
terraform apply -var="deploy_active=false" \
-target=kubernetes_deployment.companion_api_blue
echo "β
Deploy complete"
Infrastructure State Management
Remote State Backend:
# terraform/backend.tf
terraform {
backend "s3" {
bucket = "human-terraform-state"
key = "production/terraform.tfstate"
region = "us-east-1"
# State locking
dynamodb_table = "terraform-state-lock"
encrypt = true
# Versioning enabled on S3 bucket
}
}
# State locking table
resource "aws_dynamodb_table" "terraform_state_lock" {
name = "terraform-state-lock"
billing_mode = "PAY_PER_REQUEST"
hash_key = "LockID"
attribute {
name = "LockID"
type = "S"
}
tags = {
Name = "Terraform State Lock"
Environment = "production"
}
}
Deployment Rollout Timeline
| Month | Milestone | Status |
|---|---|---|
| Month 1 | Deploy US-East + US-West simultaneously | Pending approval (+$4k/month cost) |
| Month 1 | Terraform IaC for all infrastructure | Pending |
| Month 1 | Blue/green deployment automation | Pending |
| Month 2 | Regional failover tested monthly | Pending |
| Month 3 | Chaos engineering: kill region monthly | Pending |
| Month 6 | Add EU-West (3-region active-active) | Planning |
See Also:
- kb/102_performance_engineering_guide.md - 4-nines architecture overview
- kb/103_monitoring_and_observability_setup.md - Multi-region observability
- kb/129_ai_driven_operations_strategy.md - AI-driven deployment automation
CROSS-REFERENCES
- See: 26_hybrid_stack_architecture.md - Conceptual architecture and design philosophy
- See: 49_devops_and_infrastructure_model.md - Operational infrastructure and multi-cloud strategy
- See: 11_engineering_blueprint.md - System layers and component architecture
- See: 107_developer_adoption_playbook.md - How deployment flexibility supports developer GTM
- See: 109_pricing_mechanics_and_billing.md - How deployment profiles affect pricing
- See: 43_haio_developer_architecture.md - API architecture that works across all profiles
Metadata
Created: November 26, 2025
Version: 1.0
Strategic Purpose: Enable every customer segment with zero-regret hosting
Audience: Technical decision-makers, solutions architects, platform teams
Related Docs: 26, 49, 11, 107, 109, 43
Line Count: ~590 lines
Status: Complete - Deployment Models and Hosting Strategy