21. THE CAPABILITY GRAPH ENGINE v0.1

Technical Implementation Specification


The Capability Graph (CG) is not a model, a score, or a database.

It is the first living representation of human capability, built from:

  • real actions
  • real evidence
  • real decisions
  • real collaboration with AI
  • and real demonstrations of judgment

CAPABILITY ENGINE (PROTOCOL-LEVEL): CG ENGINE + HUMANOS CRE

In HUMAN canon, the protocol capability engine is the joint behavior of:

  • Capability Graph Engine: ingests governed events and updates capability (humans, agents, models) from evidence
  • HumanOS Capability Resolution Engine (CRE): uses the Capability Graph + task metadata + risk/policy constraints to route work and decide escalation

Academy is one (high-quality, guided) source of capability evidence. Production work (HumanOS logs), Workforce Cloud execution history, and external attestations are also first-class evidence sources.

How this shows up in Passport: the Capability Graph lives in the actor’s vault; the Passport exposes capability evolution through pointers (CapabilityGraphRoot) and proof references (LedgerRefs) rather than embedding the full graph directly. See: 20_passport_identity_layer.md → “Passport Growth”.

MVP: CAPABILITY-LITE (Foundation Phase, Week 1-2)

See: 15_protocol_foundation_and_build_sequence.md for the canonical build sequence.

Before building the full Capability Graph specification below, we build Capability-Lite — the minimum viable capability tracking that enables real "capability-weighted routing" from Day 1.

What Capability-Lite Includes

Component   Foundation (Week 1-2)                    Full Spec (Wave 2+)
---------   --------------------------------------   --------------------------------------
Nodes       Simple capability strings with weights   Full semantic ontology, taxonomies
Weights     Manual weights (0.0-1.0)                 ML-derived, evidence-weighted
Evidence    Task completion logs                     Multi-source (credentials, work, peer)
Updates     Manual + basic rules                     Continuous learning, time decay
Queries     "Does user X have capability Y?"         Semantic similarity, inference
Proofs      Signed capability assertions             ZK proofs, selective disclosure

Capability-Lite Implementation

// Capability-Lite: Minimum viable capability tracking
interface CapabilityNode {
  id: string;
  passportDid: string;           // Owner's Passport DID
  name: string;                  // e.g., "ai_safety_evaluation", "rlhf_review"
  weight: number;                // 0.0 to 1.0
  evidenceCount: number;         // Number of supporting events
  lastUpdated: Date;
}

interface CapabilityEvidence {
  id: string;
  capabilityId: string;
  type: 'task_completion' | 'training' | 'credential' | 'manual' | 'peer_review';
  description: string;
  metadata: Record<string, unknown>; // Structured context (task IDs, scores, etc.)
  weightImpact: number;              // Delta applied to the capability weight
  timestamp: Date;
  signedBy: string;                  // DID of attestor
  signature: string;                 // Attestor's signature over the record
}

// Core operations (ambient signatures; implementations below)
declare function getCapabilities(passportDid: string): Promise<CapabilityNode[]>;
declare function hasCapability(passportDid: string, name: string, minWeight: number): Promise<boolean>;
declare function updateCapability(passportDid: string, name: string, delta: number, evidence: CapabilityEvidence): Promise<void>;
declare function findQualifiedUsers(capability: string, minWeight: number): Promise<string[]>;
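
To make the intended semantics concrete before the PostgreSQL implementation, here is a minimal in-memory sketch. The store class, key format, and clamping behavior are illustrative assumptions, not canon; the database schema below is the authoritative implementation.

```typescript
// Illustrative in-memory sketch of the Capability-Lite operations.
type LiteNode = { name: string; weight: number; evidenceCount: number };

class InMemoryCapabilityStore {
  private nodes = new Map<string, LiteNode[]>(); // passportDid -> capability nodes

  getCapabilities(passportDid: string): LiteNode[] {
    return this.nodes.get(passportDid) ?? [];
  }

  hasCapability(passportDid: string, name: string, minWeight: number): boolean {
    return this.getCapabilities(passportDid)
      .some(n => n.name === name && n.weight >= minWeight);
  }

  updateCapability(passportDid: string, name: string, delta: number): void {
    const list = this.nodes.get(passportDid) ?? [];
    let node = list.find(n => n.name === name);
    if (!node) {
      node = { name, weight: 0, evidenceCount: 0 };
      list.push(node);
    }
    node.weight = Math.min(1, Math.max(0, node.weight + delta)); // clamp to [0, 1]
    node.evidenceCount += 1;
    this.nodes.set(passportDid, list);
  }
}
```

Two +0.4 updates leave a weight of 0.8, so `hasCapability(did, name, 0.7)` passes and `hasCapability(did, name, 0.9)` fails; the same threshold logic drives the routing queries below.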

Capability-Lite Database Schema

PostgreSQL Implementation:

-- Capability Nodes
CREATE TABLE capability_nodes (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  passport_did TEXT NOT NULL,
  name TEXT NOT NULL,
  category TEXT NOT NULL,
  weight NUMERIC(3,2) NOT NULL CHECK (weight >= 0 AND weight <= 1),
  confidence_interval JSONB,  -- {lower: 0.x, upper: 0.y}
  evidence_count INT DEFAULT 0,
  last_updated TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  created_at TIMESTAMPTZ DEFAULT NOW(),
  deleted_at TIMESTAMPTZ,
  UNIQUE(passport_did, name)
);

-- Evidence Records
CREATE TABLE capability_evidence (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  capability_node_id UUID REFERENCES capability_nodes(id) ON DELETE CASCADE,
  passport_did TEXT NOT NULL,
  evidence_type TEXT NOT NULL CHECK (evidence_type IN ('task_completion', 'training', 'credential', 'manual', 'peer_review')),
  description TEXT,
  metadata JSONB NOT NULL,
  weight_impact NUMERIC(3,2),
  signed_by TEXT NOT NULL,
  signature TEXT NOT NULL,
  recorded_at TIMESTAMPTZ DEFAULT NOW()
);

-- Critical indexes for routing queries
CREATE INDEX idx_capability_nodes_passport ON capability_nodes(passport_did) 
  WHERE deleted_at IS NULL;
CREATE INDEX idx_capability_nodes_name_weight ON capability_nodes(name, weight DESC) 
  WHERE deleted_at IS NULL;
CREATE INDEX idx_capability_nodes_category ON capability_nodes(category)
  WHERE deleted_at IS NULL;
CREATE INDEX idx_capability_evidence_node ON capability_evidence(capability_node_id, recorded_at DESC);
CREATE INDEX idx_capability_evidence_metadata ON capability_evidence USING GIN(metadata);
CREATE INDEX idx_capability_evidence_passport ON capability_evidence(passport_did, recorded_at DESC);

Capability-Lite Query Patterns

Core Queries with Performance:

-- 1. Find users with specific capability above threshold (routing query)
SELECT passport_did, name, weight, confidence_interval
FROM capability_nodes
WHERE name = 'ai_safety_evaluation'
  AND weight >= 0.8
  AND deleted_at IS NULL
ORDER BY weight DESC
LIMIT 50;
-- Uses idx_capability_nodes_name_weight (Index Scan, <10ms at 100K rows)

-- 2. Get all capabilities for a user
SELECT id, name, category, weight, evidence_count, last_updated
FROM capability_nodes
WHERE passport_did = 'did:human:abc123'
  AND deleted_at IS NULL
ORDER BY category, weight DESC;
-- Uses idx_capability_nodes_passport (Index Scan, <5ms)

-- 3. Find users with MULTIPLE capabilities (complex routing)
SELECT cn.passport_did, 
       ARRAY_AGG(cn.name) AS capabilities,
       AVG(cn.weight) AS avg_weight
FROM capability_nodes cn
WHERE cn.name IN ('ai_safety', 'content_moderation', 'rlhf_review')
  AND cn.weight >= 0.7
  AND cn.deleted_at IS NULL
GROUP BY cn.passport_did
HAVING COUNT(DISTINCT cn.name) = 3  -- Must have ALL capabilities
ORDER BY avg_weight DESC
LIMIT 20;
-- Uses idx_capability_nodes_name_weight (Bitmap Index Scan, <50ms at 100K rows)

-- 4. Get evidence history for a capability
SELECT evidence_type, description, metadata, signed_by, recorded_at
FROM capability_evidence
WHERE capability_node_id = '<uuid>'
ORDER BY recorded_at DESC
LIMIT 100;
-- Uses idx_capability_evidence_node (Index Scan, <5ms)

-- 5. Query evidence metadata (e.g., find all task completions for a specific task type)
SELECT ce.passport_did, ce.evidence_type, ce.metadata, ce.recorded_at
FROM capability_evidence ce
WHERE ce.metadata @> '{"taskType": "safety_evaluation"}'
ORDER BY ce.recorded_at DESC
LIMIT 100;
-- Uses idx_capability_evidence_metadata GIN index (Bitmap Index Scan, <20ms)

Capability-Lite Routing Implementation

// How routing uses Capability-Lite
async function routeTask(task: Task): Promise<string> {
  const requiredCapabilities = task.requiredCapabilities;
  const minWeight = task.riskLevel === 'high' ? 0.8 : 0.6;
  
  // Find users with ALL required capabilities above threshold
  const qualified = await findQualifiedForAll(requiredCapabilities, minWeight);
  
  if (qualified.length === 0) {
    throw new NoQualifiedReviewerError(task);
  }
  
  // Among qualified, select by availability (simple for MVP)
  return selectByAvailability(qualified);
}

async function findQualifiedForAll(
  capabilities: string[], 
  minWeight: number
): Promise<string[]> {
  // SQL query with proper indexing
  const result = await db.query(`
    SELECT cn.passport_did, 
           ARRAY_AGG(cn.name) AS capabilities,
           AVG(cn.weight) AS avg_weight
    FROM capability_nodes cn
    WHERE cn.name = ANY($1::text[])
      AND cn.weight >= $2
      AND cn.deleted_at IS NULL
    GROUP BY cn.passport_did
    HAVING COUNT(DISTINCT cn.name) = $3
    ORDER BY avg_weight DESC
  `, [capabilities, minWeight, capabilities.length]);
  
  return result.rows.map(r => r.passport_did);
}

async function updateCapabilityFromEvidence(
  passportDid: string,
  capabilityName: string,
  evidence: CapabilityEvidence
): Promise<void> {
  await db.transaction(async (tx) => {
    // Insert evidence. Assumes the capability node already exists;
    // create it first (INSERT ... ON CONFLICT DO NOTHING) if it may not.
    await tx.query(`
      INSERT INTO capability_evidence (
        capability_node_id, passport_did, evidence_type,
        description, metadata, weight_impact, signed_by, signature
      ) VALUES (
        (SELECT id FROM capability_nodes 
         WHERE passport_did = $1 AND name = $2),
        $1, $3, $4, $5, $6, $7, $8
      )
    `, [passportDid, capabilityName, evidence.type, evidence.description,
        evidence.metadata, evidence.weightImpact, evidence.signedBy, evidence.signature]);
    
    // Update capability weight and evidence count
    await tx.query(`
      UPDATE capability_nodes
      SET weight = LEAST(1.0, weight + $3),
          evidence_count = evidence_count + 1,
          last_updated = NOW()
      WHERE passport_did = $1 AND name = $2
    `, [passportDid, capabilityName, evidence.weightImpact]);
  });
}

Why Capability-Lite Matters

Without Capability-Lite:

  • "Capability-weighted routing" is just marketing
  • No difference from Scale.AI's crowd workers
  • No evidence trail for capability claims

With Capability-Lite:

  • Tasks route to actually qualified reviewers
  • Evidence accumulates with each completed task
  • Real differentiation from Day 1

Capability-Lite is the foundation. The full spec below is the vision.


FULL SPECIFICATION

This section explains how the engine actually works — technically, operationally, mathematically, and architecturally.


PURPOSE OF THE ENGINE

The engine must do four things simultaneously:

1. Observe

Capture meaningful, structured human behavior from:

  • training interactions
  • Workforce Cloud workflows
  • AI/human collaboration events
  • decisions made under HumanOS routing
  • peer interaction
  • moments of care, safety, nuance, escalation

2. Interpret

Turn those signals into:

  • capability nodes
  • weighted edges (strength of evidence)
  • pattern clusters
  • confidence scores
  • time-based decay and reinforcement

3. Represent

Produce a portable capability graph that:

  • lives inside the HUMAN Passport
  • updates continuously
  • is cryptographically anchored
  • is selectively revealable (e.g., "prove I'm qualified for X")

4. Protect

Ensure the graph is:

  • never comparative
  • never used to rank humans
  • never used to exclude someone
  • never owned by an employer
  • free from bias, gaming, or manipulation

CAPABILITY GRAPH STRUCTURE DIAGRAM

graph TB
    subgraph "Human Identity"
        Human[<b>Human</b><br/>via Passport DID]
    end
    
    subgraph "Capability Nodes (Dynamic & Evidence-Based)"
        subgraph "Core Capabilities"
            Judgment[<b>Judgment</b><br/>Weight: 0.85<br/>Evidence: 47 events]
            Empathy[<b>Empathy</b><br/>Weight: 0.78<br/>Evidence: 32 events]
            Safety[<b>Safety Detection</b><br/>Weight: 0.92<br/>Evidence: 58 events]
        end
        
        subgraph "Domain Capabilities"
            Healthcare[<b>Healthcare</b><br/>Weight: 0.68<br/>Evidence: 21 events]
            Legal[<b>Legal Reasoning</b><br/>Weight: 0.55<br/>Evidence: 12 events]
            Technical[<b>Technical Analysis</b><br/>Weight: 0.72<br/>Evidence: 35 events]
        end
        
        subgraph "Emerging Capabilities (Learning)"
            Finance[<b>Financial Analysis</b><br/>Weight: 0.35<br/>Evidence: 5 events<br/><i>Growing</i>]
            Education[<b>Educational Design</b><br/>Weight: 0.28<br/>Evidence: 3 events<br/><i>New</i>]
        end
    end
    
    subgraph "Evidence Sources"
        AcademyEvidence[<b>Academy</b><br/>Training Attestations]
        WorkforceEvidence[<b>Workforce Cloud</b><br/>Task Completions]
        HumanOSEvidence[<b>HumanOS</b><br/>Decision Logs]
        PeerEvidence[<b>Peer Validation</b><br/>Collaborative Evidence]
    end
    
    subgraph "Graph Properties"
        TimeDecay[<b>Time Decay</b><br/>Older evidence<br/>weighs less]
        Reinforcement[<b>Reinforcement</b><br/>Repeated success<br/>increases weight]
        CrossDomain[<b>Cross-Domain Patterns</b><br/>Transfer learning<br/>between capabilities]
    end
    
    Human --> Judgment
    Human --> Empathy
    Human --> Safety
    Human --> Healthcare
    Human --> Legal
    Human --> Technical
    Human --> Finance
    Human --> Education
    
    AcademyEvidence --> Judgment
    AcademyEvidence --> Empathy
    AcademyEvidence --> Safety
    AcademyEvidence --> Finance
    AcademyEvidence --> Education
    
    WorkforceEvidence --> Healthcare
    WorkforceEvidence --> Legal
    WorkforceEvidence --> Technical
    
    HumanOSEvidence --> Judgment
    HumanOSEvidence --> Safety
    
    PeerEvidence --> Empathy
    PeerEvidence --> Technical
    
    Judgment -.->|Correlation| Safety
    Empathy -.->|Correlation| Healthcare
    Technical -.->|Enables| Legal
    Healthcare -.->|Transfer Learning| Finance
    
    TimeDecay -.->|Governs| Judgment
    Reinforcement -.->|Strengthens| Safety
    CrossDomain -.->|Connects| Finance
    
    style Human fill:#2ECC71,stroke:#27AE60,stroke-width:4px,color:#fff
    style Judgment fill:#3498DB,stroke:#2E7CB8,stroke-width:3px,color:#fff
    style Safety fill:#E74C3C,stroke:#C73C2C,stroke-width:3px,color:#fff
    style Finance fill:#F39C12,stroke:#D68910,stroke-width:2px,color:#fff
    style AcademyEvidence fill:#9B59B6,stroke:#8E44AD,stroke-width:2px,color:#fff
    style TimeDecay fill:#95A5A6,stroke:#7F8C8D,stroke-width:2px

Key Graph Features:

  1. Node Weights - Each capability has a weight (0.0-1.0) based on quantity and quality of evidence
  2. Evidence Count - Number of events supporting each capability (transparent provenance)
  3. Time Decay - Older evidence naturally decreases in weight unless reinforced
  4. Reinforcement Learning - Repeated demonstrations strengthen capability nodes
  5. Cross-Domain Edges - Capabilities can correlate or transfer (e.g., healthcare → financial analysis)
  6. Selective Disclosure - Users can prove specific capabilities via zero-knowledge proofs without revealing full graph
  7. Non-Comparative - Each graph is individual; no ranking or comparison with others
  8. Privacy-First - Graph lives in user's Passport, not centralized database
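
Features 3 and 4 (decay and reinforcement) reduce to a simple exponential model. A minimal sketch, consistent with the exp(-decayRate × ageInYears) relevancy factor used in the multi-source fusion section; the 0.15/year rate is an illustrative assumption:

```typescript
// Time decay: a capability's evidence weight shrinks exponentially with age
// unless reinforced by new evidence (which resets the clock).
function decayedWeight(weight: number, decayRatePerYear: number, ageInYears: number): number {
  return weight * Math.exp(-decayRatePerYear * ageInYears);
}
```

With a 0.15/year rate, a 0.80 weight decays to roughly 0.69 after one idle year, while a fresh demonstration restores full freshness.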

MULTI-SOURCE CAPABILITY EVIDENCE ARCHITECTURE

The Capability Graph integrates three types of evidence (foundational credentials, professional experience, and demonstrated work), each serving a distinct purpose.

The strategic insight: Capability Graph doesn't replace traditional credentials—it integrates and validates them, then adds real-time work evidence on top.

This makes HUMAN valuable for:

  • Recent college grads (have education, need practical experience)
  • Displaced white-collar workers (have deep experience in one domain, pivoting to another)
  • Senior experts (have decades of mastery, learning AI collaboration)
  • Career changers (leveraging transferable capabilities)

The Three Evidence Types

Type 1: Foundational Credentials (Traditional Education & Licensing)

Purpose: Establish baseline competency via formal education and professional licensing.

Sources:

  • Universities (Bachelor's, Master's, PhD, professional degrees)
  • Certification bodies (CPA, PMP, AWS Certified, etc.)
  • Licensing boards (MD, JD, PE, RN, etc.)
  • Professional organizations (IEEE, ACM, ABA, etc.)

Verification Method:

  • Cryptographic attestation from issuing institution
  • Institution signs credential with their private key
  • Attestation includes: degree/license, date issued, student/licensee ID
  • Stored in Passport, verified on-chain

Weight Contribution:

  • Initial weight: 0.5-0.8 (high, but not maximum—degrees prove potential, not current ability)
  • Relevance decay: Slow (degrees don't expire, but become less relevant over time without practice)
  • Example: CS degree from MIT → 0.70 initial weight in "software-engineering" capability

Schema:

interface CredentialEvidence {
  type: "credential";
  source: string; // "MIT", "American Board of Radiology"
  credential: string; // "Bachelor of Science in Computer Science"
  specialization?: string; // "Machine Learning"
  issuedDate: Date;
  expirationDate?: Date; // For licenses/certifications that expire
  credentialId: string; // Unique ID from issuer
  verificationStatus: VerificationStatus;
  
  // Cryptographic proof
  issuerSignature: string;
  issuerPublicKey: string;
  attestationHash: string;
  
  // Weight contribution
  contribution: number; // 0.0-1.0
  relevanceDecayRate: number; // How fast this becomes less relevant
  lastValidated: Date;
}

Example: Recent College Grad (Alex)

{
  capabilityId: "software-engineering",
  weight: 0.65,
  evidence: [
    {
      type: "credential",
      source: "University of Michigan",
      credential: "Bachelor of Science in Computer Science",
      issuedDate: "2024-05-15",
      verificationStatus: "issuer_verified",
      issuerSignature: "0x7a8f...", // Cryptographic signature from U-M
      contribution: 0.60, // 60% of total weight comes from degree
      relevanceDecayRate: 0.05 // Decays 5% per year without practice
    },
    {
      type: "credential",
      source: "University of Michigan",
      credential: "Senior Capstone Project",
      description: "Built ML model for fraud detection",
      verificationStatus: "issuer_verified",
      contribution: 0.05
    }
  ],
  lastUpdated: "2024-05-20",
  freshness: 1.0 // Recently updated
}

Type 2: Professional Experience (Employer Attestations & Provenance)

Purpose: Validate practical experience and domain mastery from previous employment.

Sources:

  • Previous employers (HR departments, managers)
  • Clients (for consultants, freelancers)
  • Project collaborators (peer attestations)
  • HUMAN provenance logs (if previous work was done through HUMAN)

Verification Method:

  • Employer attestation: Signed document from HR/manager confirming role, duration, responsibilities
  • Provenance logs: If work was done through HUMAN, cryptographic logs of tasks completed
  • Peer attestations: Colleagues verify collaboration and capability
  • LinkedIn-style verification: Contacts can attest to working together (but weighted lower than formal attestations)

Weight Contribution:

  • Initial weight: 0.6-0.9 (very high—proven experience)
  • Experience weight formula: baseWeight × (years / 10)^0.5 × recencyFactor (the examples below assume recencyFactor = 1.0)
    • 2 years experience: 0.6 × √0.2 = 0.27
    • 5 years experience: 0.6 × √0.5 = 0.42
    • 10 years experience: 0.6 × √1.0 = 0.60
    • 20 years experience: 0.6 × √2.0 = 0.85
  • Relevance decay: Medium (experience stays relevant for 5-10 years in most fields, then needs refreshing)
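
A minimal sketch of the formula above (the worked examples assume recencyFactor = 1.0):

```typescript
// Experience weight: baseWeight × sqrt(years / 10) × recencyFactor.
// The square root gives diminishing returns: doubling years of experience
// does not double the weight.
function experienceWeight(baseWeight: number, years: number, recencyFactor: number): number {
  return baseWeight * Math.sqrt(years / 10) * recencyFactor;
}
```

This reproduces the table: 2 years gives about 0.27, 10 years gives 0.60, and 20 years gives about 0.85.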

Schema:

interface ProfessionalExperienceEvidence {
  type: "professional_experience";
  source: string; // "TechCorp", "Mayo Clinic"
  role: string; // "VP of Marketing", "Attending Radiologist"
  duration: {
    start: Date;
    end: Date;
    yearsExperience: number;
  };
  
  // What they actually did
  responsibilities: string[];
  achievements?: string[];
  projectsCompleted?: number;
  peopleManaged?: number;
  
  // Verification
  employerAttestation?: {
    signedBy: string; // HR director, manager name
    signedDate: Date;
    signature: string; // Cryptographic signature
    attestationDocument: string; // PDF/document hash
  };
  
  provenanceLogs?: {
    tasksCompleted: number;
    averageQuality: number;
    domainsWorked: string[];
  };
  
  peerAttestations?: {
    attestorId: PassportId;
    relationship: "colleague" | "manager" | "client";
    attestationText: string;
    signedDate: Date;
  }[];
  
  // Weight contribution
  contribution: number;
  relevanceDecayRate: number;
}

Example: Displaced White-Collar Worker (Jennifer, Marketing VP)

{
  capabilityId: "strategic-planning",
  weight: 0.88,
  evidence: [
    {
      type: "credential",
      source: "Northwestern University",
      credential: "MBA",
      issuedDate: "2005-06-15",
      contribution: 0.40
    },
    {
      type: "professional_experience",
      source: "TechCorp",
      role: "VP of Marketing",
      duration: {
        start: "2016-03-01",
        end: "2024-11-01",
        yearsExperience: 8
      },
      responsibilities: [
        "Led marketing strategy for $500M revenue division",
        "Managed team of 25",
        "Launched 12 major campaigns"
      ],
      employerAttestation: {
        signedBy: "Jane Doe, Chief People Officer",
        signedDate: "2024-11-15",
        signature: "0x9f2a...",
        attestationDocument: "ipfs://Qm..."
      },
      contribution: 0.48, // 8 years experience = significant weight
      relevanceDecayRate: 0.08 // Decays 8% per year in fast-moving field
    }
  ],
  lastUpdated: "2024-11-20"
}

Example: Senior Expert (Dr. Patel, Radiologist)

{
  capabilityId: "radiology-diagnosis",
  weight: 0.96, // Near-maximum, decades of evidence
  evidence: [
    {
      type: "credential",
      source: "American Board of Radiology",
      credential: "Board Certification in Diagnostic Radiology",
      issuedDate: "2004-07-01",
      expirationDate: "2034-07-01", // Lifetime certification
      verificationStatus: "issuer_verified",
      contribution: 0.50
    },
    {
      type: "credential",
      source: "Johns Hopkins University",
      credential: "Doctor of Medicine",
      issuedDate: "2000-05-20",
      contribution: 0.30
    },
    {
      type: "professional_experience",
      source: "Mayo Clinic",
      role: "Attending Radiologist",
      duration: {
        start: "2004-08-01",
        end: "2024-10-01",
        yearsExperience: 20
      },
      responsibilities: ["Diagnostic radiology", "Training residents", "Research"],
      achievements: ["~50,000 diagnostic reads over career"],
      employerAttestation: {
        signedBy: "Dr. Sarah Johnson, Department Chair",
        signedDate: "2024-10-15",
        signature: "0x3d1b...",
        attestationDocument: "ipfs://Qm..."
      },
      contribution: 0.16, // Even with 20 years, not 100%—needs to stay current
      relevanceDecayRate: 0.03 // Medical knowledge decays slowly but does decay
    }
  ],
  lastUpdated: "2024-11-01",
  freshness: 0.95 // Recently active
}

Type 3: Demonstrated Work (Real-Time Task Performance)

Purpose: Prove current ability through actual task completion. This is the most dynamic and trustworthy evidence type.

Sources:

  • Workforce Cloud tasks (primary source)
  • Academy assessments (training tasks)
  • HumanOS-routed decisions (live work)
  • Peer reviews (quality validation)

Verification Method:

  • HumanOS provenance logs: Every task has cryptographic audit trail
  • Outcome metrics: Accuracy, speed, quality, client satisfaction
  • Peer review: Other workers or experts validate quality
  • A/B ground truth: Some tasks have known correct answers for validation

Weight Contribution:

  • Grows over time: Starts low (0.1-0.3), grows to high (0.8-0.9) as evidence accumulates
  • Formula: baseWeight + (tasksCompleted / 1000)^0.5 × qualityFactor
    • 10 tasks at 90% accuracy: 0.3 + √0.01 × 0.90 = 0.39
    • 100 tasks at 95% accuracy: 0.3 + √0.10 × 0.95 = 0.60
    • 500 tasks at 95% accuracy: 0.3 + √0.50 × 0.95 = 0.97
  • Relevance decay: Fast (if you stop doing the work, weight decays quickly—"use it or lose it")
  • Freshness bonus: Recent work gets higher weight
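
A minimal sketch of the growth formula above, clamped to the 0.0-1.0 weight range used throughout this spec:

```typescript
// Demonstrated-work weight: baseWeight + sqrt(tasksCompleted / 1000) × qualityFactor.
// Grows quickly for the first tasks, then with diminishing returns; clamped at 1.0.
function demonstratedWorkWeight(baseWeight: number, tasksCompleted: number, qualityFactor: number): number {
  return Math.min(1.0, baseWeight + Math.sqrt(tasksCompleted / 1000) * qualityFactor);
}
```

This reproduces the examples: 10 tasks at 90% accuracy gives 0.39, 100 tasks at 95% gives about 0.60, and 500 tasks at 95% gives about 0.97.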

Schema:

interface DemonstratedWorkEvidence {
  type: "demonstrated_work";
  source: "workforce_cloud" | "academy" | "humanos" | "peer_review";
  description: string; // "Completed 500 code reviews"
  
  // Performance metrics
  tasksCompleted: number;
  accuracy?: number; // 0.0-1.0
  averageCompletionTime?: number; // milliseconds
  clientSatisfactionScore?: number; // 0.0-1.0
  peerReviewScore?: number; // 0.0-1.0
  
  // Time distribution
  timeRange: {
    firstTask: Date;
    lastTask: Date;
    totalDuration: number; // milliseconds
  };
  
  // Quality indicators
  errorsDetected: number; // Errors they caught
  errorsMade: number; // Errors they made
  escalationsHandled: number;
  edgeCasesResolved: number;
  
  // Provenance
  provenanceLogs: string[]; // Array of ledger transaction IDs
  
  // Weight contribution
  contribution: number;
  freshnessWeight: number; // Bonus for recent work
  relevanceDecayRate: number; // Decays fast if not maintained
}

Example: Recent College Grad (Alex) After 3 Months Workforce Cloud

{
  capabilityId: "software-engineering",
  weight: 0.82, // Grew from 0.65 initial (degree only)
  evidence: [
    {
      type: "credential",
      source: "University of Michigan",
      credential: "Bachelor of Science in Computer Science",
      contribution: 0.45 // Was 0.60, now diluted by new evidence (but still significant)
    },
    {
      type: "demonstrated_work",
      source: "workforce_cloud",
      description: "Completed 200 code reviews",
      tasksCompleted: 200,
      accuracy: 0.94,
      clientSatisfactionScore: 0.91,
      peerReviewScore: 0.88,
      timeRange: {
        firstTask: "2024-06-01",
        lastTask: "2024-09-01",
        totalDuration: 7776000000 // 90 days
      },
      errorsDetected: 47, // Caught 47 bugs in reviewed code
      errorsMade: 3, // Made 3 mistakes in reviews
      provenanceLogs: ["0xabc123...", "0xdef456...", ...],
      contribution: 0.37, // 37% of weight now comes from proven work
      freshnessWeight: 1.0, // Recent work
      relevanceDecayRate: 0.15 // Decays 15% per year if stops working
    }
  ],
  lastUpdated: "2024-09-01",
  freshness: 1.0
}

Example: Senior Expert (Dr. Patel) After 1 Month AI-Assisted Radiology

{
  capabilityId: "ai-assisted-radiology",
  weight: 0.88, // Rapid growth because foundation is so strong
  evidence: [
    {
      type: "academy_training",
      source: "academy",
      modulesCompleted: [
        "AI Radiology Systems Overview",
        "Reviewing AI Diagnostic Outputs",
        "When to Trust vs. Override AI"
      ],
      totalHours: 25,
      assessmentScores: [0.95, 0.98, 0.92],
      contribution: 0.30 // Training provides foundation
    },
    {
      type: "demonstrated_work",
      source: "workforce_cloud",
      description: "Reviewed 1,000 AI radiology cases",
      tasksCompleted: 1000,
      accuracy: 0.98, // 98% agreement with ground truth
      averageCompletionTime: 180000, // 3 minutes per case (expert speed)
      timeRange: {
        firstTask: "2024-10-01",
        lastTask: "2024-11-01",
        totalDuration: 2592000000 // 30 days
      },
      aiOverrideRate: 0.12, // Corrected AI 12% of the time
      errorsDetected: 120, // Caught 120 AI errors
      errorsMade: 2, // Made 2 mistakes (extremely low)
      provenanceLogs: [...],
      contribution: 0.58, // Demonstrated work carries most weight
      freshnessWeight: 1.0,
      relevanceDecayRate: 0.10
    }
  ],
  // Dr. Patel's radiology-diagnosis capability (0.96) acts as foundation
  transferredFrom: {
    capabilityId: "radiology-diagnosis",
    transferWeight: 0.85 // 85% of that capability transfers to AI-assisted version
  }
}

Weight Calculation: Multi-Source Fusion

How the three evidence types combine:

function calculateCapabilityWeight(
  credentials: CredentialEvidence[],
  experience: ProfessionalExperienceEvidence[],
  demonstratedWork: DemonstratedWorkEvidence[]
): number {
  // 1. Calculate contribution from each evidence type
  const credentialWeight = credentials.reduce((sum, c) => 
    sum + (c.contribution * getRelevancyFactor(c)), 0
  );
  
  const experienceWeight = experience.reduce((sum, e) => 
    sum + (e.contribution * getRelevancyFactor(e)), 0
  );
  
  const workWeight = demonstratedWork.reduce((sum, w) => 
    sum + (w.contribution * w.freshnessWeight * getRelevancyFactor(w)), 0
  );
  
  // 2. Combine with diminishing returns (can't exceed 1.0)
  // Formula: 1 - (1 - cred) × (1 - exp) × (1 - work)
  // This ensures:
  // - Multiple strong signals compound
  // - But never exceed 1.0
  // - Weak signals have less impact
  const combinedWeight = 1 - (
    (1 - credentialWeight) *
    (1 - experienceWeight) *
    (1 - workWeight)
  );
  
  // 3. Apply floor and ceiling
  return Math.max(0.0, Math.min(1.0, combinedWeight));
}

function getRelevancyFactor(evidence: { lastUpdated: Date; relevanceDecayRate: number }): number {
  const ageInYears = (Date.now() - evidence.lastUpdated.getTime()) / (365.25 * 24 * 60 * 60 * 1000);
  return Math.exp(-evidence.relevanceDecayRate * ageInYears);
}

Example calculation for Alex (recent grad after 3 months):

Credential: 0.60 contribution × 1.0 relevancy = 0.60
Experience: 0 (no prior work experience)
Demonstrated: 0.37 contribution × 1.0 freshness × 1.0 relevancy = 0.37

Combined: 1 - (1 - 0.60) × (1 - 0.0) × (1 - 0.37)
        = 1 - (0.40 × 1.0 × 0.63)
        = 1 - 0.252
        = 0.748
        ≈ 0.75 (but Alex actually has 0.82 because of additional evidence from projects)

Key insight: Recent demonstrated work (0.37) combined with degree (0.60) creates 0.75+ weight. This is higher than degree alone (0.60) but still shows the degree matters (without degree, just 0.37 from work alone).
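
The worked numbers can be reproduced directly from the fusion formula:

```typescript
// Noisy-OR style fusion: 1 - (1 - cred) × (1 - exp) × (1 - work).
// Multiple strong signals compound but the result never exceeds 1.0.
function fuseWeights(cred: number, exp: number, work: number): number {
  return 1 - (1 - cred) * (1 - exp) * (1 - work);
}
```

fuseWeights(0.60, 0, 0.37) gives about 0.75, while fuseWeights(0, 0, 0.37) gives about 0.37: work evidence compounds on top of the degree rather than replacing it.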


Strategic Advantages of Multi-Source Architecture

1. Fair to Recent Grads

  • Degree provides strong initial weight (0.6-0.7)
  • But must prove practical ability through demonstrated work
  • Prevents "degree mills" from gaming system (low work performance = low final weight)

2. Fair to Career Changers

  • Previous experience provides foundation
  • Capability Graph identifies transferable capabilities
    • Example: Jennifer's "strategic-planning" (0.88) transfers 70% to "ai-workflow-design" → starts at 0.62 instead of 0.1
  • Academy training fills specific gaps
  • Demonstrated work proves ability in new domain

3. Fair to Senior Experts

  • Decades of experience + credentials provide near-maximum weight (0.9+)
  • Need minimal Academy training (just "AI collaboration" not "learn the field")
  • Start at L5 (expert tier) immediately in Workforce Cloud
  • Demonstrated work maintains and updates weight (prevents stagnation)

4. Anti-Gaming by Design

  • Can't fake credentials: Cryptographically verified by issuing institution
  • Can't fake experience: Requires employer attestation (signed by HR/manager)
  • Can't fake work: Provenance logs every task, peer review validates quality
  • Can't buy weight: Must actually complete tasks at high quality

5. Reveals Hidden Capability

  • Traditional model: "I don't have a degree so I can't prove my ability"
  • Capability Graph: "You don't have a degree, but you completed 1,000 tasks at 95% accuracy—you're proven capable"
  • Example: Jamal (no degree) reaches L4 (expert tier) through 3 years of demonstrated work

6. Keeps Seniors Current

  • Traditional model: "I have 20 years experience" (but when was last time you did the work?)
  • Capability Graph: Checks freshness—if no demonstrated work in 2 years, weight decays
  • Forces continuous learning and practice (anti-stagnation)
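
The capability-transfer arithmetic in advantage 2 is a product of the source capability's weight and a transfer coefficient (in the full graph, the coefficient would come from cross-domain edges; the standalone function here is illustrative):

```typescript
// Transferred starting weight = source capability weight × transfer coefficient.
// E.g. strategic-planning at 0.88 transferring 70% to ai-workflow-design.
function transferredWeight(sourceWeight: number, transferCoefficient: number): number {
  return sourceWeight * transferCoefficient;
}
```

This reproduces Jennifer's example: 0.88 × 0.70 gives roughly 0.62 as the starting weight in the new domain, rather than 0.1.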

INPUT CHANNELS (Where the CG Gets Its Data)

The engine ingests from five primary sources plus three evidence types (credentials, experience, demonstrated work):

1. Academy Training Cycles

Signals include:

  • pattern recognition under time pressure
  • safety judgment calls
  • ethical decision branching
  • cognitive load management
  • escalation detection
  • attention switching
  • error correction

Each training block produces:

  • micro-attestations
  • capability deltas
  • weighted nodes
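
The exact event shape is not fixed by this spec; a hypothetical sketch of a micro-attestation record (all field names below are illustrative assumptions, not canon):

```typescript
// Hypothetical training-block output: one micro-attestation carrying
// a capability delta for the graph to apply.
interface MicroAttestation {
  passportDid: string;
  capability: string;          // e.g. "escalation_detection"
  delta: number;               // signed capability delta applied to the node weight
  signal: "safety_judgment" | "pattern_recognition" | "error_correction";
  attestedBy: string;          // Academy attestor DID
  timestamp: Date;
}
```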

2. Workforce Cloud Assignments

Every real task yields:

  • provenance logs
  • correctness signals
  • escalation rationale
  • cooperation with AI companions
  • outcome quality and timeliness
  • repeatability

These generate:

  • reliability edges
  • situational judgment weights
  • domain-specific capability boosts

3. AI/Human Collaborative Events

Every AI decision that requires a human override or approval produces:

  • risk-class annotations
  • override justification
  • "pattern felt wrong" signals
  • counterfactual expectations

These strengthen:

  • meta-cognition nodes
  • anomaly detection weights
  • trust-sensing edges

4. Peer + Mentor Interactions

When a human:

  • helps someone else
  • mentors a junior worker
  • resolves interpersonal friction
  • contributes to group judgment

We generate:

  • empathy nodes
  • conflict resolution edges
  • communication capability weights

5. HumanOS Routing Context

The fact that HumanOS chose a person for a particular task is itself signal:

  • the system sees them as capable under certain constraints
  • repeated routing builds strong evidence

HumanOS → CG → HumanOS becomes a virtuous loop.
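One minimal sketch of how that loop could feed back into a capability weight (function and field names here are illustrative, not protocol-canonical):

```typescript
// Hypothetical sketch: each HumanOS routing event deposits a small,
// bounded evidence increment on the routed capability.
interface RoutingEvent {
  capabilityId: string;
  outcomeQuality: number; // 0-1, from post-task review
}

function applyRoutingEvidence(
  currentWeight: number,
  event: RoutingEvent,
  learningRate = 0.05 // small: one routing event is weak evidence
): number {
  // Move the weight a small step toward the observed outcome quality,
  // so repeated consistent routing compounds while a single event cannot.
  const next =
    currentWeight + learningRate * (event.outcomeQuality - currentWeight);
  return Math.min(0.95, Math.max(0, next)); // same 0.95 cap as the weight model
}
```

A single event barely moves the weight; a long, consistent routing history pulls it asymptotically toward the observed quality, which is exactly the "repeated routing builds strong evidence" property.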


CAPABILITY VERIFICATION & PROGRESSIVE TRUST MODEL

The Capability Graph accepts capability data from multiple sources with varying levels of trust. Rather than requiring perfect verification upfront, the system implements a progressive trust model that allows capabilities to be claimed early and verified over time.

This pragmatic approach enables:

  • Rapid onboarding (humans don't wait weeks for credential verification)
  • Immediate participation (low-trust tasks available while verification proceeds)
  • Graceful integration (works before every institution has a VC issuer)
  • Continuous improvement (capabilities strengthen with evidence)

Verification Status Taxonomy

Each capability node and evidence item carries a verification status that influences its weight in the graph:

export type VerificationStatus =
  | "self_reported"      // User entered it, no external verification
  | "pending"            // Verification requested but not complete
  | "document_provided"  // User uploaded supporting document
  | "human_verified"     // HUMAN staff manually reviewed
  | "api_verified"       // Automated check against issuer registry
  | "issuer_verified"    // Direct VC from authoritative source
  | "proven"             // Demonstrated through task performance
  | "revoked"            // Issuer revoked the credential
  | "expired";           // Credential past expiration date

The Verification Ladder

Capabilities progress through verification stages, with each stage increasing the capability's weight and routing eligibility:

| Level | Status | Suggested Weight Range | Method | Example |
| --- | --- | --- | --- | --- |
| 0 | self_reported | 0.10 - 0.30 | User claims during onboarding | "I'm a licensed RN in California" |
| 1 | document_provided | 0.30 - 0.50 | User uploads certificate/transcript | PDF scan of nursing license |
| 2 | human_verified | 0.40 - 0.60 | HUMAN staff reviews document | Reviewer confirms license format |
| 3 | api_verified | 0.50 - 0.70 | Automated registry lookup | Query CA Board of Nursing API |
| 4 | issuer_verified | 0.70 - 0.85 | Direct VC from issuer | California Board signs VC |
| 5 | proven | 0.70 - 0.95 | Evidence from completed tasks | 50+ successful triage decisions |

Note: These weight ranges are suggestions for initial implementation. Actual weights should be tuned based on:

  • Domain requirements (healthcare may need higher thresholds than general skills)
  • Risk tolerance (high-stakes tasks require higher verification)
  • Evidence accumulation patterns (multiple weak signals can exceed one strong credential)
  • Observed correlation between verification level and actual performance
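As a starting point, the ladder can be seeded from a simple status-to-range table, with domain risk selecting a point inside each range (values and names are illustrative and meant to be tuned per the notes above):

```typescript
// Illustrative seed values for the verification ladder (not canonical).
const BASE_WEIGHT: Record<string, [number, number]> = {
  self_reported:     [0.10, 0.30],
  document_provided: [0.30, 0.50],
  human_verified:    [0.40, 0.60],
  api_verified:      [0.50, 0.70],
  issuer_verified:   [0.70, 0.85],
  proven:            [0.70, 0.95],
};

// Pick a weight inside the status range, nudged by domain risk:
// high-risk domains (e.g. healthcare) start at the bottom of the range.
function initialWeight(status: string, domainRisk: number /* 0-1 */): number {
  const [lo, hi] = BASE_WEIGHT[status];
  return lo + (hi - lo) * (1 - domainRisk);
}
```

This keeps the tuning surface small: two numbers per status, plus one risk knob per domain.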

Pre-Integration Strategy

Before institutions have native VC issuers, HUMAN uses a bridging strategy:

Phase 1 - Launch (Months 1-6):

  • Accept all self-reported credentials
  • Prompt users to upload supporting documents
  • Manual review by trained HUMAN staff for high-value credentials
  • Basic API checks where available (e.g., state licensing boards)
  • Low routing priority for unverified capabilities

Phase 2 - Partnerships (Months 6-18):

  • Partner with major issuers (universities, licensing boards, employers)
  • Build integration adapters for existing credential systems
  • Automated verification flows via OAuth + data sharing agreements
  • Retroactive upgrades: self-reported → issuer-verified automatically

Phase 3 - Protocol Adoption (18+ months):

  • Direct VC issuance from authoritative sources
  • Zero HUMAN intermediation for credential verification
  • Instant verification for participating institutions
  • Self-reported capabilities become rare edge cases

Evidence Accumulation & Weight Dynamics

Multiple evidence items combine to strengthen capability weight over time:

// Suggested weight calculation (simplified)
function calculateCapabilityWeight(
  capability: CapabilityNode
): number {
  const evidenceWeights = capability.evidence.map(e => {
    // Base quality score from verification status
    let weight = e.qualityScore;
    
    // Time decay: older evidence counts less (unless credential-based)
    if (e.source === 'task_completion' || e.source === 'training') {
      const ageMonths = monthsSince(e.timestamp);
      const decayFactor = Math.pow(0.5, ageMonths / 12); // 12-month half-life
      weight *= decayFactor;
    }
    
    // Credential expiration (expiresAt is an ISO date string in the examples below)
    if (e.expiresAt && new Date(e.expiresAt).getTime() < Date.now()) {
      weight = 0; // Expired credentials contribute nothing
    }
    
    // Revocation
    if (e.verificationStatus === 'revoked') {
      weight = 0;
    }
    
    return weight;
  });
  
  // Combine evidence: diminishing returns (can't just stack weak evidence)
  // Use logarithmic aggregation to reward diverse evidence
  const totalEvidence = evidenceWeights.reduce((sum, w) => sum + w, 0);
  const evidenceDiversity = evidenceWeights.length;
  
  const finalWeight = Math.min(
    0.95, // Cap at 0.95 (perfect certainty is impossible)
    Math.log1p(totalEvidence) / Math.log1p(10) * // Logarithmic scaling
    Math.sqrt(evidenceDiversity) / 3  // Diversity bonus
  );
  
  return finalWeight;
}

Key Principles:

  1. Multiple weak signals rarely equal one strong credential - but enough high-quality task completions can rival credential verification
  2. Time decay for behavioral evidence - Skills atrophy without practice
  3. Credentials don't decay - Until expiration/revocation
  4. Diversity matters - Evidence from multiple sources is stronger than repeated evidence from one source
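To make the cap and the diversity bonus concrete, here is the combination step from the simplified function above, in isolation:

```typescript
// Standalone copy of the combination step from calculateCapabilityWeight,
// to show the 0.95 cap and the diversity bonus by themselves.
function aggregate(evidenceWeights: number[]): number {
  const totalEvidence = evidenceWeights.reduce((sum, w) => sum + w, 0);
  const evidenceDiversity = evidenceWeights.length;
  return Math.min(
    0.95,
    (Math.log1p(totalEvidence) / Math.log1p(10)) *
      (Math.sqrt(evidenceDiversity) / 3)
  );
}

// The cap holds no matter how much evidence accumulates:
const flooded = aggregate(new Array(1000).fill(0.9)); // === 0.95

// Two independent 0.5 signals outweigh a single 1.0 signal of the same
// total mass: the sqrt(diversity) term rewards varied sources.
const diverse = aggregate([0.5, 0.5]);
const single = aggregate([1.0]); // diverse > single
```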

Real-World Onboarding Scenarios

Scenario 1: Nurse with License (Verifiable Credential Flow)

Day 1 - Self-Reported:

{
  capability: "registered-nurse-license-ca",
  weight: 0.20,  // Low - self-reported only
  verificationStatus: "self_reported",
  evidence: [{
    source: "self_reported",
    qualityScore: 0.20,
    description: "California RN License #RN-123456"
  }]
}

→ System behavior: Can browse tasks, cannot be routed to patient care

Day 3 - Issuer Verified: User clicks "Verify with California Board of Nursing" → OAuth flow → Board issues VC

{
  capability: "registered-nurse-license-ca",
  weight: 0.75,  // High - issuer verified
  verificationStatus: "issuer_verified",
  evidence: [
    // ... previous self-reported entry
    {
      source: "credential",
      qualityScore: 0.80,
      verificationStatus: "issuer_verified",
      issuerDID: "did:org:california-board-nursing",
      referenceId: "vc:ca-nursing:RN-123456",
      expiresAt: "2027-12-01"
    }
  ]
}

→ System behavior: Now eligible for triage tasks, HIPAA workflows, higher pay tier

Week 2 - Proven Through Performance: Completes 10 successful triage tasks

{
  capability: "clinical-triage-judgment",
  weight: 0.68,  // Built from observed behavior
  verificationStatus: "proven",
  evidence: [
    {
      source: "task_completion",
      qualityScore: 0.70,
      referenceId: "task:triage-001",
      description: "Emergency triage: correctly escalated chest pain"
    },
    // ... 9 more
  ]
}

→ A new capability emerges that no credential could prove: pattern recognition, escalation sense

Scenario 2: Software Engineer (Gradual Verification)

Day 1 - Resume Import:

{
  workHistory: [{
    employer: "Meta",
    role: "Senior Software Engineer",
    verificationStatus: "self_reported",
    skills: ["Python", "Distributed Systems", "ML Infrastructure"]
  }],
  capabilities: [
    { id: "skill:python", weight: 0.25 },
    { id: "skill:distributed-systems", weight: 0.20 }
  ]
}

→ Low priority in matching

Week 1 - Employer Integration: Meta has HUMAN integration → Issues employment VC

{
  capabilities: [
    { 
      id: "skill:python", 
      weight: 0.65,  // Jumped from 0.25!
      evidence: [{
        source: "credential",
        issuerDID: "did:org:meta",
        verificationStatus: "issuer_verified"
      }]
    }
  ]
}

Week 2 - Performance Exceeds Credentials: Completes 5 ML infrastructure tasks with peer review

{
  capability: "ml-infrastructure-design",
  weight: 0.78,  // Higher than credential verification!
  evidence: [
    { source: "credential", qualityScore: 0.65 }, // Meta VC
    { source: "task_completion", qualityScore: 0.82 },
    { source: "peer_review", qualityScore: 0.85 },
    // ... 3 more tasks
  ]
}

Result: Performance-based evidence can exceed credential-based claims

Scenario 3: Education Without Integration (Stopgap Flow)

Day 1 - Self-Reported Degree:

{
  education: [{
    institution: "Bangalore University",
    degree: "B.S. Computer Science",
    verificationStatus: "self_reported",
    weight: 0.15
  }]
}

Week 1 - Document Upload: User uploads transcript PDF → HUMAN verification service

{
  verificationStatus: "human_verified",
  weight: 0.50,  // Better than self-reported
  evidence: [{
    source: "attestation",
    issuerDID: "did:org:human-verification-service",
    documentHash: "sha256:abc123...",
    notes: "Transcript verified against known Bangalore University format"
  }]
}

Year 2 - University Integration: Bangalore University joins HUMAN → User re-verifies → Weight jumps to 0.75

Handling Expired & Revoked Credentials

Credentials can lose validity over time:

Expiration (Gradual):

  • 90 days before expiration: Prompt user to renew
  • At expiration: Weight drops to 0, routing eligibility lost
  • User can appeal with renewal proof
  • Historical evidence remains in graph (for provenance)

Revocation (Immediate):

  • Issuer publishes revocation to ledger
  • Next graph update detects revocation
  • Weight immediately drops to 0
  • User notified of revocation reason
  • No routing until issue resolved

Example - License Expiration:

// Before expiration
{ capability: "rn-license-ca", weight: 0.80, expiresAt: "2026-06-01" }

// After expiration (June 2, 2026)
{ 
  capability: "rn-license-ca", 
  weight: 0.00,  // No longer valid
  verificationStatus: "expired",
  note: "License expired. Renew to regain routing eligibility."
}

// After renewal (user uploads new license)
{ 
  capability: "rn-license-ca", 
  weight: 0.75,  // Restored
  verificationStatus: "issuer_verified",
  expiresAt: "2028-06-01"
}

Connection to Passkeys & Device Security

All credential verification flows are bound to device-level security:

Identity Flow:

  1. User initiates verification ("Verify my nursing license")
  2. HUMAN redirects to issuer (California Board portal)
  3. User authenticates with issuer (their existing login)
  4. Issuer asks: "Issue credential to which DID?"
  5. User's Passport provides: did:human:sarah-abc123
  6. Issuer signs VC with their private key
  7. VC delivered to Passport, encrypted with user's DeviceKey
  8. Hash anchored to ledger
  9. Capability Graph updated
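Steps 7-8 can be sketched with a plain hash digest (helper names here are illustrative; a real implementation would follow the ledger's anchoring format):

```typescript
import { createHash } from "node:crypto";

// Sketch of the anchoring step: the full VC stays in the user's vault,
// and only its digest goes to the ledger, so neither HUMAN nor the
// ledger ever sees VC content.
function anchorDigest(vcJson: string): string {
  return "sha256:" + createHash("sha256").update(vcJson).digest("hex");
}

// Later verification: recompute the digest from the vault copy and
// compare it against the ledger anchor.
function matchesAnchor(vcJson: string, anchor: string): boolean {
  return anchorDigest(vcJson) === anchor;
}
```

Any tampering with the stored VC breaks the match, while the anchor itself reveals nothing about the credential.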

Security Properties:

  • Issuer never sees user's private keys
  • HUMAN never sees full VC content (only hash)
  • User can revoke access anytime
  • Multi-device sync via encrypted vault
  • Biometric authentication required for high-value credentials

Summary: Progressive Trust in Practice

The HUMAN verification model is:

  1. Accept everything initially - No barriers to entry, but limited privileges
  2. Prompt for verification - Guided flows to official issuers
  3. Upgrade as evidence arrives - VCs, task performance, peer attestations
  4. Continuously reinforce - Every task adds evidence, strengthens weight
  5. Expire when needed - Licenses lapse, credentials revoke, skills atrophy
  6. Behavioral proof can exceed credentials - Demonstrated capability beats claimed capability

This makes HUMAN practical (works before universal integration) while being aspirational (incentivizes verified credentials). The capability graph becomes a living, breathing, evidence-based representation that's more trustworthy than any static resume—and you own it completely.


INTERNAL STRUCTURE OF THE GRAPH

The Capability Graph is composed of:

Nodes (Capabilities)

Each node is a capability primitive, like:

  • pattern recognition
  • escalation sense
  • safety triage
  • ethical judgment
  • domain fluency
  • attention stability
  • ambiguity resolution
  • empathy projection
  • context restoration
  • anomaly detection

Nodes are:

  • extensible
  • modular
  • hierarchical
  • domain-specific when needed

Edges (Evidence Relationships)

Edges represent:

  • how often
  • how strongly
  • and in what context

a capability manifested.

Edges include:

  • timestamp
  • weight
  • context class
  • risk level
  • domain
  • verification source
  • whether escalation occurred
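As an illustrative TypeScript shape (mirroring the attribute list above; field names are not a canonical schema):

```typescript
// Illustrative edge record; one per observed manifestation of a capability.
interface CapabilityEdge {
  capabilityId: string;          // node this evidence attaches to
  timestamp: string;             // ISO-8601 time of the manifestation
  weight: number;                // 0-1 evidence strength
  contextClass: string;          // e.g. "time-pressure", "ambiguous-input"
  riskLevel: "low" | "medium" | "high";
  domain: string;                // e.g. "clinical-triage"
  verificationSource: string;    // which input channel produced it
  escalationOccurred: boolean;   // whether the human escalated
}

const example: CapabilityEdge = {
  capabilityId: "cap:escalation-sense",
  timestamp: "2026-03-01T12:00:00Z",
  weight: 0.6,
  contextClass: "time-pressure",
  riskLevel: "high",
  domain: "clinical-triage",
  verificationSource: "workforce-cloud",
  escalationOccurred: true,
};
```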

Weights (Confidence)

Computed from:

  • repeated evidence
  • cross-channel consistency
  • error/override patterns
  • contextual diversity
  • recency decay

Weights are never used to rank humans — only to help HumanOS route safely.


CAPABILITY TAXONOMY & ONTOLOGY

The Capability Graph needs a semantic understanding of skills, not just flat string labels. The current 5-category system (skill, judgment, experience, trait, certification) is a v0.1 placeholder. This section defines the Living Capability Ontology—a dynamic, semantic taxonomy that evolves with the economy.

The Problem with Flat Taxonomies

Traditional skill taxonomies fail because:

  • No semantic relationships - "Python" and "Machine Learning" often appear together, but flat lists don't capture this
  • No synonyms - "ML", "Machine Learning", "AI/ML" are treated as different skills
  • No hierarchy - "React" is a "JavaScript Framework" is a "Programming Language" is a "Technical Skill"
  • Can't handle emergence - "AI Safety Auditing" didn't exist 2 years ago; how does it enter the taxonomy?
  • Static definitions - "Web Development" in 2010 ≠ "Web Development" in 2025

The HUMAN Capability Ontology

HUMAN uses a multi-layered semantic ontology:

interface CapabilityDefinition {
  // Identity
  id: string;                        // "cap:python-programming"
  canonicalName: string;             // "Python Programming"
  
  // Taxonomy (broad categorization)
  category: CapabilityCategory;      // 'skill', 'judgment', 'experience', 'trait', 'certification'
  subcategory?: string;              // "Programming Languages"
  domain?: string;                   // "Software Engineering"
  
  // Semantic relationships
  synonyms: string[];                // ["Python", "python", "py", "Python3"]
  relatedCapabilities: {
    capabilityId: string;
    relationshipType: 'prerequisite' | 'complementary' | 'specialization' | 'often-paired';
    strength: number;                // 0-1, how strong the relationship
  }[];
  
  // Semantic embedding (for similarity search)
  embedding: number[];               // 768-dim vector from capability description
  
  // Definition
  description: string;               // Rich description for LLM understanding
  examples: string[];                // Example tasks: "Build REST APIs", "Data analysis with pandas"
  
  // Lifecycle
  status: 'emerging' | 'active' | 'evolving' | 'deprecated';
  createdAt: Date;
  updatedAt: Date;
  
  // Usage statistics (for trend detection)
  supplyCount: number;               // How many humans claim this capability
  demandCount: number;               // How many tasks request it
  trendDirection: 'rising' | 'stable' | 'declining';
  
  // Version history (for evolving capabilities)
  semanticDrift: number;             // 0-1, how much meaning has changed over time
  historicalDefinitions?: {
    dateRange: [Date, Date];
    description: string;
    embedding: number[];
  }[];
}

Capability Relationship Types

Capabilities connect to each other in structured ways:

1. Prerequisite - One capability requires another

{
  from: "cap:react-development",
  to: "cap:javascript",
  type: "prerequisite",
  strength: 0.95  // Strong dependency
}

2. Complementary - Often learned/used together

{
  from: "cap:kubernetes",
  to: "cap:docker",
  type: "complementary",
  strength: 0.88
}

3. Specialization - One is a more specific version

{
  from: "cap:pytorch",
  to: "cap:machine-learning",
  type: "specialization",
  strength: 0.92
}

4. Often-Paired - Frequently appear together in job requirements

{
  from: "cap:python",
  to: "cap:data-analysis",
  type: "often-paired",
  strength: 0.85
}

Semantic Embeddings

Every capability has a vector embedding that captures its semantic meaning:

// Generate embedding from capability description + examples
async function generateCapabilityEmbedding(
  capability: CapabilityDefinition
): Promise<number[]> {
  const text = `
    ${capability.canonicalName}
    
    Description: ${capability.description}
    
    Examples:
    ${capability.examples.join('\n')}
    
    Related to: ${capability.relatedCapabilities.map(r => r.capabilityId).join(', ')}
  `;
  
  // Use embedding model (e.g., OpenAI ada-002, Cohere embed-v3)
  const embedding = await embeddingProvider.embed(text);
  
  return embedding; // 768 or 1536 dimensional vector
}

Embeddings enable:

  • Semantic similarity search (find capabilities close to "AI safety")
  • Fuzzy matching (match "ML Engineer" to "Machine Learning Specialist")
  • Cross-language support (embeddings work across languages)
  • Continuous evolution (re-embed as definitions change)
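All of these reduce to one primitive: cosine similarity between embedding vectors. A minimal implementation:

```typescript
// Cosine similarity between two equal-length embedding vectors:
// 1.0 = identical direction, 0.0 = orthogonal (unrelated).
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

With a real embedding model, "ML Engineer" and "Machine Learning Specialist" would land close to 1.0 under this measure; the `<=>` operator in the SQL queries below computes the corresponding cosine distance (1 minus similarity) in the database.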

Storage Architecture

-- Capability definitions table
CREATE TABLE capabilities (
  id TEXT PRIMARY KEY,
  canonical_name TEXT NOT NULL,
  category TEXT NOT NULL,
  subcategory TEXT,
  domain TEXT,
  description TEXT NOT NULL,
  examples JSONB,  -- Array of example tasks
  
  -- Semantic data
  synonyms JSONB,  -- Array of strings
  embedding VECTOR(768),  -- pgvector extension for semantic search
  
  -- Relationships stored separately (see below)
  
  -- Lifecycle
  status TEXT NOT NULL DEFAULT 'active',
  created_at TIMESTAMPTZ DEFAULT NOW(),
  updated_at TIMESTAMPTZ DEFAULT NOW(),
  
  -- Usage statistics
  supply_count INTEGER DEFAULT 0,
  demand_count INTEGER DEFAULT 0,
  trend_direction TEXT,
  semantic_drift DECIMAL DEFAULT 0.0
);

-- Semantic search index (cosine similarity)
CREATE INDEX capabilities_embedding_idx ON capabilities 
  USING ivfflat (embedding vector_cosine_ops)
  WITH (lists = 100);

-- Full-text search on names and synonyms
CREATE INDEX capabilities_name_search_idx ON capabilities 
  USING gin(to_tsvector('english', canonical_name || ' ' || synonyms::text));

-- Capability relationships (edges in the ontology graph)
CREATE TABLE capability_relationships (
  from_capability_id TEXT NOT NULL REFERENCES capabilities(id),
  to_capability_id TEXT NOT NULL REFERENCES capabilities(id),
  relationship_type TEXT NOT NULL,
  strength DECIMAL NOT NULL,
  created_at TIMESTAMPTZ DEFAULT NOW(),
  PRIMARY KEY (from_capability_id, to_capability_id, relationship_type)
);

-- Capability evolution history
CREATE TABLE capability_history (
  capability_id TEXT NOT NULL REFERENCES capabilities(id),
  valid_from TIMESTAMPTZ NOT NULL,
  valid_to TIMESTAMPTZ,
  description TEXT NOT NULL,
  embedding VECTOR(768),
  related_capabilities JSONB,
  PRIMARY KEY (capability_id, valid_from)
);

Example: "Python Programming" in the Ontology

{
  "id": "cap:python-programming",
  "canonicalName": "Python Programming",
  "category": "skill",
  "subcategory": "Programming Languages",
  "domain": "Software Engineering",
  "description": "Ability to write, debug, and maintain code in the Python programming language. Includes understanding of Python syntax, standard library, common frameworks, and best practices.",
  "examples": [
    "Write Python scripts for data processing",
    "Build REST APIs with Flask or FastAPI",
    "Develop data analysis pipelines with pandas",
    "Create machine learning models with scikit-learn"
  ],
  "synonyms": ["Python", "python", "Python3", "py"],
  "relatedCapabilities": [
    {
      "capabilityId": "cap:programming-fundamentals",
      "relationshipType": "prerequisite",
      "strength": 0.85
    },
    {
      "capabilityId": "cap:data-analysis",
      "relationshipType": "often-paired",
      "strength": 0.78
    },
    {
      "capabilityId": "cap:django",
      "relationshipType": "specialization",
      "strength": 0.70
    },
    {
      "capabilityId": "cap:flask",
      "relationshipType": "specialization",
      "strength": 0.72
    }
  ],
  "embedding": [0.023, -0.156, 0.089, ...],  // 768-dim vector
  "status": "active",
  "supplyCount": 45203,  // 45k humans have this capability
  "demandCount": 12456,  // 12k tasks requested it
  "trendDirection": "stable",
  "semanticDrift": 0.15  // Low drift - Python is Python
}

Querying the Ontology

Exact match:

SELECT * FROM capabilities 
WHERE canonical_name = 'Python Programming'
OR synonyms @> '["Python"]';

Semantic search (find capabilities similar to a query):

-- Find capabilities semantically similar to "AI safety auditing"
SELECT 
  c.id,
  c.canonical_name,
  c.description,
  1 - (c.embedding <=> $query_embedding::vector) AS similarity
FROM capabilities c
WHERE 1 - (c.embedding <=> $query_embedding::vector) > 0.70  -- Min 70% similarity
ORDER BY c.embedding <=> $query_embedding::vector  -- Cosine distance (lower = more similar)
LIMIT 20;

Graph traversal (find related capabilities):

-- Find all capabilities related to "Python"
WITH RECURSIVE related AS (
  -- Start with Python
  SELECT id, canonical_name, 0 AS depth
  FROM capabilities
  WHERE id = 'cap:python-programming'
  
  UNION
  
  -- Find capabilities related to capabilities we've found
  SELECT c.id, c.canonical_name, r.depth + 1
  FROM capabilities c
  JOIN capability_relationships cr ON c.id = cr.to_capability_id
  JOIN related r ON cr.from_capability_id = r.id
  WHERE r.depth < 3  -- Max 3 hops
    AND cr.strength > 0.5  -- Only strong relationships
)
SELECT DISTINCT * FROM related;

CAPABILITY DISCOVERY & EVOLUTION

The capability taxonomy must evolve continuously as new skills emerge and old ones become obsolete. This section describes how HUMAN discovers, validates, and tracks capability evolution.

The Capability Lifecycle

┌─────────────┐
│  Candidate  │ ← Detected from tasks, resumes, training
└──────┬──────┘
       │
       ↓ (Human curator approves)
┌─────────────┐
│  Emerging   │ ← New capability, limited evidence
└──────┬──────┘
       │
       ↓ (Usage > threshold)
┌─────────────┐
│   Active    │ ← Mainstream capability, high demand/supply
└──────┬──────┘
       │
       ↓ (Meaning changes over time)
┌─────────────┐
│  Evolving   │ ← Definition shifting (e.g., "Web Dev" 2010→2025)
└──────┬──────┘
       │
       ↓ (Demand drops, replaced by newer skills)
┌─────────────┐
│ Deprecated  │ ← Obsolete (e.g., "Flash Development")
└─────────────┘
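The lifecycle can be encoded as an explicit transition map (illustrative; the direct active → deprecated shortcut reflects the deprecation strategy at the end of this section):

```typescript
// Illustrative encoding of the lifecycle diagram as a transition guard.
type LifecycleStatus =
  | "candidate" | "emerging" | "active" | "evolving" | "deprecated";

const ALLOWED: Record<LifecycleStatus, LifecycleStatus[]> = {
  candidate:  ["emerging"],               // human curator approves
  emerging:   ["active"],                 // usage exceeds threshold
  active:     ["evolving", "deprecated"], // drift, or demand collapse
  evolving:   ["active", "deprecated"],   // re-stabilizes, or dies out
  deprecated: [],                         // terminal
};

function canTransition(from: LifecycleStatus, to: LifecycleStatus): boolean {
  return ALLOWED[from].includes(to);
}
```

Making the transitions explicit prevents, for example, a candidate jumping straight to active without curator approval.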

Discovery Pipeline

HUMAN discovers new capabilities from four sources:

Source 1: Task Requests (Demand Side)

When enterprises submit tasks with capability requirements:

// Enterprise: "Need someone with prompt engineering expertise"
interface TaskRequest {
  description: string;
  requiredCapabilities: string[];  // ["prompt-engineering", "LLM-evaluation"]
}

// System detects: "prompt-engineering" not in taxonomy
async function processNewCapabilityCandidate(capabilityName: string) {
  // 1. Check if already exists (exact or synonym match)
  const existing = await findCapabilityBySynonym(capabilityName);
  if (existing) {
    await incrementDemandCount(existing.id);
    return existing;
  }
  
  // 2. Check semantic similarity to existing capabilities
  const embedding = await embeddingProvider.embed(capabilityName);
  const similar = await findSimilarCapabilities(embedding, { minSimilarity: 0.85 });
  
  if (similar.length > 0) {
    // High similarity - likely a synonym of existing capability
    await addSynonym(similar[0].id, capabilityName);
    return similar[0];
  }
  
  // 3. Genuinely new capability - create candidate
  const candidate = {
    id: generateId(),
    canonicalName: capabilityName,
    status: 'candidate',
    source: 'task_request',
    description: await llm.generate(`Describe the skill: ${capabilityName}`),
    examples: await llm.generate(`Give 3 example tasks for: ${capabilityName}`),
    embedding: embedding,
    demandCount: 1
  };
  
  // 4. Queue for human curator approval
  await queueForCuration(candidate);
  
  return candidate;
}

Source 2: Human Self-Reports (Supply Side)

When humans add capabilities to their profiles:

// User adds: "I'm skilled in AI red-teaming"
async function processHumanCapabilityClaim(
  passportId: string,
  capabilityName: string,
  evidence?: string
) {
  // Similar flow to task requests
  const capability = await findOrCreateCapability(capabilityName);
  
  // Add to human's capability graph
  await addCapabilityToGraph(passportId, capability.id, {
    weight: 0.20,  // Low weight - self-reported
    verificationStatus: 'self_reported',
    source: 'user_claim'
  });
  
  // Increment supply count
  await incrementSupplyCount(capability.id);
}

Source 3: Academy Training Modules

When new Academy courses are created:

// New course: "RAG System Design"
async function createAcademyCourse(course: {
  title: string;
  description: string;
  learningOutcomes: string[];
}) {
  // Extract capabilities from learning outcomes
  const capabilities = await llm.extractCapabilities(course.learningOutcomes);
  
  // For each capability:
  for (const cap of capabilities) {
    const capability = await findOrCreateCapability(cap.name, {
      status: 'active',  // Academy-validated = auto-approve
      description: cap.description,
      examples: course.learningOutcomes,
      source: 'academy'
    });
    
    // Link course to capability
    await linkCourseToCapability(course.id, capability.id);
  }
}

Source 4: Workforce Evidence (Revealed Preferences)

Analyze patterns in successful task completions:

// Batch job: Analyze task completion patterns
async function analyzeWorkforcePatterns() {
  // Find tasks completed by humans with certain capability combinations
  const patterns = await db.query(`
    SELECT
      capability_combo,
      COUNT(*) AS completion_count,
      AVG(quality_score) AS avg_quality
    FROM (
      SELECT
        array_agg(DISTINCT hc.capability_id ORDER BY hc.capability_id) AS capability_combo,
        MAX(t.quality_score) AS quality_score
      FROM task_completions tc
      JOIN human_capabilities hc ON tc.human_passport_id = hc.human_passport_id
      JOIN tasks t ON tc.task_id = t.id
      WHERE tc.completed_at > NOW() - INTERVAL '30 days'
      GROUP BY tc.task_id
      HAVING COUNT(DISTINCT hc.capability_id) > 1  -- Multi-capability tasks
    ) per_task
    GROUP BY capability_combo
  `);
  
  // Look for emergent capability patterns
  for (const pattern of patterns) {
    if (pattern.completion_count > 50 && pattern.avg_quality > 0.8) {
      // Frequent, high-quality pattern - might indicate emergent capability
      const capabilityName = await llm.generate(
        `What capability do these skills represent together: ${pattern.capability_combo}`
      );
      
      await processNewCapabilityCandidate(capabilityName);
    }
  }
}

Curation Workflow

New capability candidates require human review:

interface CurationTask {
  candidateId: string;
  proposedName: string;
  description: string;
  examples: string[];
  source: 'task_request' | 'user_claim' | 'academy' | 'pattern_analysis';
  usageCount: number;  // How many times requested/claimed
  
  // Curator actions
  action?: 'approve' | 'merge' | 'reject';
  mergeIntoCapabilityId?: string;  // If merging with existing
  curatorNotes?: string;
}

// Curator dashboard shows:
// - High-demand candidates (many task requests)
// - Pattern-detected capabilities (strong evidence)
// - Similar existing capabilities (to prevent duplicates)

Curation rules:

  1. Auto-approve if:

    • From Academy (trusted source)
    • Matches known skill taxonomies (LinkedIn, O*NET)
    • High usage count (>100 requests)
  2. Human review if:

    • Ambiguous or vague name
    • Potential duplicate/synonym
    • Low usage but interesting pattern
  3. Auto-reject if:

    • Spam/gibberish
    • Offensive content
    • Already exists as synonym
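A sketch of how these rules could be applied mechanically (thresholds and field names are illustrative, not fixed policy):

```typescript
// Illustrative triage of the curation rules above.
interface Candidate {
  name: string;
  source: "task_request" | "user_claim" | "academy" | "pattern_analysis";
  usageCount: number;
  matchesKnownTaxonomy: boolean; // e.g. found in O*NET / LinkedIn skill lists
  isSynonymOfExisting: boolean;
  looksLikeSpam: boolean;
}

function triageCandidate(
  c: Candidate
): "auto_approve" | "auto_reject" | "human_review" {
  // Auto-reject: spam/gibberish, or already covered by a synonym.
  if (c.looksLikeSpam || c.isSynonymOfExisting) return "auto_reject";
  // Auto-approve: trusted source, known taxonomy, or high demand.
  if (c.source === "academy" || c.matchesKnownTaxonomy || c.usageCount > 100) {
    return "auto_approve";
  }
  // Everything ambiguous goes to a human curator.
  return "human_review";
}
```

Rejection is checked first so that a spammy but heavily requested name cannot slip through on usage count alone.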

Tracking Capability Evolution

Skills change meaning over time. HUMAN tracks this evolution:

interface CapabilityEvolution {
  capabilityId: string;
  
  // Snapshot definitions over time
  historicalDefinitions: {
    dateRange: [Date, Date];
    description: string;
    relatedCapabilities: string[];
    embedding: number[];
    exampleTasks: string[];
  }[];
  
  // Drift metrics
  semanticDrift: number;  // 0-1, cosine distance between first and current embedding
  definitionChangeRate: number;  // How often description changes
  relationshipChurn: number;  // How often related capabilities change
  
  // Trend analysis
  demandTrend: {
    direction: 'rising' | 'stable' | 'declining';
    velocity: number;  // Rate of change
    peakDate?: Date;   // When demand peaked (for declining skills)
  };
  
  // Morphing patterns
  evolvesInto?: string[];   // "Ruby on Rails" → ["Full Stack", "Backend Engineering"]
  replacedBy?: string[];    // "Flash" → ["JavaScript", "HTML5", "Canvas"]
}

Example: "Web Development" Evolution

{
  capabilityId: "cap:web-development",
  historicalDefinitions: [
    {
      dateRange: ["2010-01-01", "2015-12-31"],
      description: "Build websites using HTML, CSS, JavaScript, and server-side languages like PHP or Ruby.",
      relatedCapabilities: ["html", "css", "javascript", "jquery", "php", "mysql"],
      embedding: [0.123, -0.456, ...],  // 2010 embedding
      exampleTasks: [
        "Create responsive website layouts",
        "Build contact forms with PHP",
        "Implement jQuery animations"
      ]
    },
    {
      dateRange: ["2016-01-01", "2020-12-31"],
      description: "Build dynamic web applications using modern JavaScript frameworks, REST APIs, and Node.js.",
      relatedCapabilities: ["react", "angular", "vue", "nodejs", "rest-api", "webpack"],
      embedding: [0.234, -0.345, ...],  // 2016 embedding (drift detected)
      exampleTasks: [
        "Build single-page applications with React",
        "Create REST APIs with Express.js",
        "Implement real-time features with WebSockets"
      ]
    },
    {
      dateRange: ["2021-01-01", "present"],
      description: "Build full-stack applications with modern frameworks (React, Next.js, TypeScript), serverless architectures, and AI integrations.",
      relatedCapabilities: ["react", "nextjs", "typescript", "tailwind", "graphql", "vercel", "ai-apis"],
      embedding: [0.345, -0.234, ...],  // 2021 embedding (significant drift)
      exampleTasks: [
        "Build Next.js apps with TypeScript and Tailwind",
        "Integrate AI APIs (OpenAI, Anthropic) into web apps",
        "Deploy serverless functions to Vercel/Netlify",
        "Implement GraphQL APIs with type safety"
      ]
    }
  ],
  semanticDrift: 0.67,  // 67% change in meaning from 2010 to 2025
  definitionChangeRate: 0.15,  // Redefined ~15% per year
  relationshipChurn: 0.82,  // 82% of related capabilities changed
  demandTrend: {
    direction: 'rising',
    velocity: 0.12  // 12% annual growth in demand
  }
}

Drift detection triggers:

async function detectCapabilityDrift() {
  // Every quarter, re-analyze capability definitions
  const capabilities = await getActiveCapabilities();
  
  for (const cap of capabilities) {
    // Get current usage context (recent tasks, training, claims)
    const recentContext = await getRecentCapabilityContext(cap.id, { days: 90 });
    
    // Generate new description from context
    const newDescription = await llm.generate(
      `Based on recent usage, describe the capability: ${cap.canonicalName}\n\nContext:\n${recentContext}`
    );
    
    // Generate new embedding
    const newEmbedding = await embeddingProvider.embed(newDescription);
    
    // Compare to current embedding
    const similarity = cosineSimilarity(cap.embedding, newEmbedding);
    const drift = 1 - similarity;
    
    if (drift > 0.15) {  // >15% drift threshold
      // Create snapshot in history
      await snapshotCapabilityDefinition(cap, newDescription, newEmbedding);
      
      // Update capability
      await updateCapability(cap.id, {
        description: newDescription,
        embedding: newEmbedding,
        semanticDrift: calculateTotalDrift(cap)
      });
      
      // Notify curators of significant change
      await notifyCurators({
        type: 'capability_evolution',
        capabilityId: cap.id,
        drift: drift,
        message: `"${cap.canonicalName}" has evolved significantly. Review for accuracy.`
      });
    }
  }
}
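Both the drift check above and the clustering logic later in this spec call a `cosineSimilarity` helper that is never defined. A minimal sketch, assuming equal-length, non-zero embedding vectors:

```typescript
// Cosine similarity between two dense embedding vectors.
// Assumption: both vectors have the same dimension and non-zero norm.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

Drift is then simply `1 - cosineSimilarity(oldEmbedding, newEmbedding)`, so identical definitions yield drift 0 and unrelated ones approach 1.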

Deprecation Strategy

When capabilities become obsolete:

async function evaluateDeprecation(capabilityId: string) {
  const capability = await getCapability(capabilityId);
  
  // Signals of obsolescence:
  const signals = {
    demandDropped: capability.demandCount < (capability.historicalPeakDemand * 0.1),  // <10% of peak
    noRecentUsage: capability.lastUsedAt < Date.now() - (365 * 24 * 60 * 60 * 1000),  // 1+ year
    replacementExists: await hasReplacementCapability(capabilityId),
    industryTrend: await checkIndustryTrend(capability.canonicalName)  // External data
  };
  
  if (signals.demandDropped && signals.noRecentUsage) {
    const replacements = await findReplacements(capabilityId);
    
    // Mark as deprecated
    await updateCapability(capabilityId, {
      status: 'deprecated',
      deprecatedAt: new Date(),
      replacementCapabilities: replacements
    });
    
    // Notify humans who have this capability
    await notifyHumansWithCapability(capabilityId, {
      message: `"${capability.canonicalName}" is becoming less relevant. Consider learning: ${replacements.join(', ')}`,
      suggestedTraining: await findRelevantCourses(replacements)
    });
  }
}

Example: Flash Development → Deprecated

{
  "id": "cap:flash-development",
  "canonicalName": "Adobe Flash Development",
  "status": "deprecated",
  "deprecatedAt": "2020-12-31",
  "description": "Historical: Building interactive content and animations using Adobe Flash/ActionScript. No longer supported by major browsers.",
  "replacementCapabilities": [
    "cap:javascript",
    "cap:html5-canvas",
    "cap:webgl",
    "cap:svg-animation"
  ],
  "demandTrend": {
    "direction": "declining",
    "velocity": -0.95,  // 95% decline
    "peakDate": "2010-03-15"
  },
  "supplyCount": 234,  // Still some humans with this skill
  "demandCount": 2     // Almost zero demand
}

THREE-LAYER CAPABILITY ARCHITECTURE

The Capability Graph uses a three-layer architecture to balance developer UX (easy discovery) with trust semantics (structured routing).

The Three Layers

Layer 1: Canonical Capabilities (Global)

Purpose: Standardized capabilities used for trust-aware routing and attestations.

Characteristics:

  • Owned and curated by HUMAN Foundation
  • Full semantic ontology (embeddings, relationships, lifecycle)
  • Used for HumanOS routing decisions
  • Cross-org interoperability
  • Governance tier: Canon

Examples:

  • cap:ai-safety-evaluation
  • cap:clinical-discharge-review
  • cap:contract-review-standard
  • cap:python-programming

Schema:

interface CanonicalCapability extends CapabilityDefinition {
  id: string;  // "cap:python-programming"
  canonicalName: string;
  status: 'emerging' | 'active' | 'evolving' | 'deprecated';
  governanceApproved: boolean;  // Requires curator approval
  crossOrgUsage: number;  // How many orgs use this capability
}

Layer 2: Org Capabilities (Scoped)

Purpose: Organization-specific capabilities that don't (yet) exist in the canonical ontology.

Characteristics:

  • Namespaced: cap:org:<orgSlug>:<slug>
  • Same schema as canonical capabilities (embeddings, relationships, etc.)
  • Curated by org capability admins + Capability Janitor agent
  • Can be promoted to canonical if widely adopted
  • Used for org-local routing only

Examples:

  • cap:org:acme:soc2-readiness-assessment
  • cap:org:healthcorp:patient-intake-specialist
  • cap:org:lawfirm:contract-negotiation-saas

Schema:

interface OrgCapability extends CapabilityDefinition {
  id: string;  // "cap:org:acme:soc2-readiness"
  orgId: string;
  status: 'draft' | 'active' | 'deprecated';
  curatorApproved: boolean;
  usageCount: number;
  
  // Optional: Mapping to canonical
  canonicalEquivalent?: string;  // "cap:security-audit"
  promotionCandidate?: boolean;  // High usage, consider canonical promotion
}

Lifecycle:

Developer proposes
    ↓
Agent drafts definition
    ↓
Org admin approves (or auto-approve for low-risk)
    ↓
Active (available for routing)
    ↓
(If widely used across orgs)
    ↓
Promoted to canonical
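The namespace change at the final promotion step can be sketched as follows — a hypothetical helper assuming the `cap:org:<orgSlug>:<slug>` convention above; the real promotion flow also goes through governance review:

```typescript
// Hypothetical sketch: an approved org capability is re-issued under the
// canonical namespace, and the org-scoped id is kept as an alias so existing
// references keep resolving. Not part of the spec's API.
function promoteToCanonical(orgCapId: string): { canonicalId: string; aliasOf: string } {
  // "cap:org:acme:soc2-readiness" → ["cap", "org", "acme", "soc2-readiness"]
  const parts = orgCapId.split(':');
  // Drop the "org:<orgSlug>" segments to form the canonical id
  const canonicalId = `cap:${parts.slice(3).join(':')}`;
  return { canonicalId, aliasOf: orgCapId };
}
```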

Layer 3: Dev Labels (Freeform)

Purpose: Easy discovery and grouping without polluting the capability ontology.

Characteristics:

  • Freeform strings or key/value pairs
  • NO trust semantics - never used for routing decisions
  • Used only for search, filtering, UI organization
  • Can be added/removed freely
  • No approval required

Examples:

const agent = {
  id: 'agent:soc2-checker',
  labels: ['soc2', 'compliance', 'audit', 'security'],  // Freeform
  capabilities: ['cap:security-audit', 'cap:compliance-review']  // Structured
};

const workflow = {
  id: 'workflow:invoice-processing',
  labels: ['finance', 'accounts-payable', 'automation'],
  steps: [
    {
      requiredCapabilities: ['cap:invoice-extraction', 'cap:accounting-review']
    }
  ]
};

Schema:

interface ResourceLabels {
  resourceId: string;  // Agent, workflow, human, etc.
  labels: string[];    // Freeform strings
  labelKV?: Record<string, string>;  // Optional key-value labels
}

When to Use Each Layer

Use Case Layer Example
Trust-aware routing Canonical or Org HumanOS routes high-risk task to humans with cap:medical-review >= 0.8
Agent discovery Labels Search agents with ['patient', 'billing']
Org-specific workflows Org capabilities Workflow requires cap:org:acme:hipaa-audit
Experimentation Labels Tag prototype agents with ['prototype', 'v2', 'beta']
Cross-org semantics Canonical Workforce Cloud matches humans across enterprises using canonical capabilities

Developer Experience

❌ Wrong: Creating micro-capabilities for everything

// BAD: Pollutes capability ontology
await capabilityGraph.createCapability({
  name: 'Agent that checks SOC-2 readiness for SaaS companies using AWS',
  // This is too specific and will never be reused
});

✅ Right: Use structured capabilities + labels

// GOOD: Reusable capabilities + discoverable labels
const agent = {
  id: 'agent:soc2-checker',
  capabilities: [
    'cap:security-audit',        // Canonical
    'cap:compliance-review',     // Canonical
    'cap:org:acme:soc2-readiness'  // Org-specific if truly unique
  ],
  labels: ['soc2', 'saas', 'aws', 'compliance'],  // Easy discovery
};

// Routing uses capabilities
await humanos.routeTask({
  requiredCapabilities: ['cap:security-audit'],
  minWeight: 0.7
});

// Search uses labels
await agentRegistry.search({
  labels: ['soc2', 'aws'],
  limit: 20
});

Capability vs Label Decision Tree

New requirement detected
    │
    ├─ Is this a trust/routing decision?
    │  ├─ YES → Use Capability (canonical or org)
    │  └─ NO  → Use Label
    │
    ├─ Will this be used across multiple orgs?
    │  ├─ YES → Canonical Capability
    │  └─ NO  → Org Capability or Label
    │
    └─ Is this a one-off descriptor?
       ├─ YES → Label
       └─ NO  → Capability
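The decision tree above can be encoded as a small helper. A sketch only — the input flags and the `chooseLayer` name are assumptions for illustration, not part of the spec:

```typescript
// Hypothetical encoding of the capability-vs-label decision tree.
type LayerChoice = 'canonical' | 'org' | 'label';

function chooseLayer(opts: {
  usedForRouting: boolean;    // Is this a trust/routing decision?
  crossOrg: boolean;          // Will this be used across multiple orgs?
  oneOffDescriptor: boolean;  // Is this a one-off descriptor?
}): LayerChoice {
  // One-off descriptors and non-routing concerns stay as labels
  if (!opts.usedForRouting || opts.oneOffDescriptor) return 'label';
  // Routing decisions become capabilities; cross-org usage implies canonical
  return opts.crossOrg ? 'canonical' : 'org';
}
```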

Integration with Capability Ontology

Canonical capabilities:

  • Full lifecycle management (emerging → active → deprecated)
  • Semantic embeddings for similarity search
  • Relationships (prerequisite, complementary, etc.)
  • Drift detection
  • Governed by HUMAN Foundation

Org capabilities:

  • Same lifecycle and semantics as canonical
  • Governed by org capability admins
  • Capability Janitor monitors for:
    • Duplicates (suggest merge)
    • High-usage candidates (suggest promotion to canonical)
    • Stale capabilities (suggest deprecation)

Labels:

  • No lifecycle (add/remove freely)
  • No semantics (just strings)
  • No governance (anyone can add)
  • Used only for UX

Data Placement

Layer Storage Access Control
Canonical capabilities Global capability ontology table Public read, curator write
Org capabilities Tenant-scoped capability tables Org-scoped read/write
Labels Resource metadata tables Resource owner read/write

Benefits

For Developers:

  • ✅ Easy discovery via labels
  • ✅ No bureaucracy for freeform tags
  • ✅ Reusable capabilities when needed

For HumanOS Routing:

  • ✅ Structured capabilities for trust decisions
  • ✅ No pollution from freeform descriptors
  • ✅ Clear semantics for routing logic

For Capability Graph:

  • ✅ Prevents capability sprawl
  • ✅ Maintains semantic quality
  • ✅ Enables cross-org interoperability

AGENT CAPABILITY PROFILES

Agents (not just humans) need capability profiles to enable capability-based agent routing in inter-agent workflows.

Schema

interface AgentCapabilityProfile {
  // Identity
  agentId: string;  // "agent:invoice-processor"
  agentName: string;
  version: string;
  
  // Capabilities (canonical or org)
  capabilities: {
    capabilityId: string;
    weight: number;  // 0.0-1.0, agent's proficiency
    confidence: number;  // How sure are we of this capability?
    evidenceCount: number;  // Task completions, human approvals
  }[];
  
  // Tools & Muscles
  tools: string[];  // Muscles the agent can invoke
  connectors: string[];  // External services (Stripe, Salesforce, etc.)
  
  // Context
  domains: string[];  // "healthcare", "finance", "legal"
  languages: string[];  // Natural languages supported
  
  // Trust
  trustLevel: 'verified' | 'community' | 'experimental';
  riskTier: 'low' | 'medium' | 'high' | 'critical';
  certifications: string[];  // Org-issued certifications
  
  // Permissions
  permissions: string[];  // What the agent is allowed to do
  
  // Performance
  taskCompletions: number;
  avgQualityScore: number;
  avgLatency: number;  // Milliseconds
  failureRate: number;  // 0.0-1.0
  
  // Activity
  lastActive: Date;
  createdAt: Date;
  updatedAt: Date;
}

Discovery

Agents are discovered via capability queries, just like humans:

// Find agent with capability
const agents = await capabilityGraph.findAgentsWithCapability(
  'cap:contract-review',
  {
    minWeight: 0.8,
    trustLevel: 'verified',
    riskTier: ['low', 'medium']
  }
);

// HumanOS routing (capability-first)
const routingDecision = await humanos.routeTask({
  taskId: 'task_123',
  requiredCapabilities: ['cap:invoice-extraction', 'cap:accounting-review'],
  riskLevel: 'medium',
  resourceTypes: ['agent', 'human']  // Consider both
});

// Result: Agent or human with best capability match

Capability Evolution for Agents

Agents gain capability evidence from:

1. Task Outcomes

// After agent completes task
await capabilityGraph.submitEvidence({
  resourceType: 'agent',
  resourceId: 'agent:invoice-processor',
  evidenceType: 'task_completion',
  taskId: 'task_123',
  capabilitiesDemonstrated: [
    {
      capabilityId: 'cap:invoice-extraction',
      performanceScore: 0.94,  // Quality of work
      reviewerDid: 'did:human:reviewer_xyz'  // Human who approved
    }
  ],
  context: {
    complexity: 'medium',
    timeToComplete: 1200  // ms
  }
});

// Capability Graph updates agent's capability weight
// Based on multi-source evidence algorithm (just like humans)
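The spec defers the weight update to the multi-source evidence algorithm. One plausible shape — shown purely as an assumption, not the canonical algorithm — is a count-weighted running average, where each new piece of evidence carries diminishing influence as the evidence base grows:

```typescript
// Sketch (assumption): blend a new performance score into the current
// capability weight with a learning rate that shrinks as evidence accumulates.
function updateWeight(
  currentWeight: number,
  evidenceCount: number,
  performanceScore: number
): { weight: number; evidenceCount: number } {
  const alpha = 1 / (evidenceCount + 1);  // diminishing learning rate
  const weight = currentWeight + alpha * (performanceScore - currentWeight);
  return { weight, evidenceCount: evidenceCount + 1 };
}
```

With no prior evidence the first score dominates; by the hundredth submission a single task shifts the weight by at most ~1%.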

2. Human Approvals

// Agent proposes action → Human approves
await capabilityGraph.submitEvidence({
  resourceType: 'agent',
  resourceId: 'agent:contract-reviewer',
  evidenceType: 'human_approval',
  capabilitiesDemonstrated: [
    {
      capabilityId: 'cap:contract-review',
      approvedBy: 'did:human:legal_counsel',
      confidence: 0.95  // Human's confidence in agent's work
    }
  ]
});

3. Org Attestations

// Org certifies agent for specific domain
await capabilityGraph.submitEvidence({
  resourceType: 'agent',
  resourceId: 'agent:medical-triage',
  evidenceType: 'attestation',
  capabilitiesDemonstrated: [
    {
      capabilityId: 'cap:medical-triage',
      attestedBy: 'did:org:healthcorp',
      attestationLevel: 'certified',
      validUntil: '2026-12-31'
    }
  ]
});

Agent Manifest Integration

Agents declare capabilities in their manifest (YAML or SDK):

# agent-manifest.yaml
agent:
  id: invoice-processor
  name: Invoice Processing Agent
  version: 1.2.0
  
  capabilities:
    - cap:invoice-extraction
    - cap:data-validation
    - cap:accounting-review
  
  tools:
    - stripe-connector
    - quickbooks-connector
    - email-sender
  
  domains:
    - finance
    - accounts-payable
  
  trustLevel: verified
  riskTier: medium
  
  permissions:
    - read:invoices
    - write:invoice-records
    - call:accounting-review-agents

At registration, agent manifest is parsed into AgentCapabilityProfile:

// Agent SDK does this automatically
const manifest = await loadManifest('agent-manifest.yaml');
const profile = await capabilityGraph.registerAgent({
  agentId: manifest.agent.id,
  capabilities: manifest.agent.capabilities.map(capId => ({
    capabilityId: capId,
    weight: 0.50,  // Initial weight (no evidence yet)
    confidence: 0.70,
    evidenceCount: 0
  })),
  tools: manifest.agent.tools,
  domains: manifest.agent.domains,
  trustLevel: manifest.agent.trustLevel,
  riskTier: manifest.agent.riskTier,
  permissions: manifest.agent.permissions
});

Inter-Agent Capability Routing

When an agent needs to delegate to another agent:

// Inside an agent
export const handler = async (ctx: AgentContext) => {
  // Agent A needs help from agent with contract review capability
  const result = await ctx.call.agent('cap:contract-review', {
    contract: ctx.input.contract,
    requiredCapabilityWeight: 0.8,
    riskLevel: 'high'
  });
  
  // HumanOS routing engine:
  // 1. Queries Capability Graph for agents with 'cap:contract-review >= 0.8'
  // 2. Filters by risk tier (high-risk may require human)
  // 3. Routes to best match (agent or human)
  // 4. Logs provenance
  
  return result;
};

Benefits

For HumanOS:

  • Unified routing for humans AND agents
  • Capability-based agent discovery
  • Trust-aware agent selection

For Developers:

  • Declare capabilities in manifest
  • Automatic capability tracking
  • No manual capability management

For Agents:

  • Capability weights evolve with performance
  • Trust level increases with successful tasks
  • Clear path to "verified" status

CAPABILITY JANITOR (ANTI-SPRAWL)

As org-specific capabilities grow, sprawl becomes a problem. The Capability Janitor is an automated agent that:

  • Detects duplicate or near-duplicate capabilities
  • Suggests merges and aliases
  • Identifies stale capabilities
  • Proposes promotions to canonical

Purpose

Prevent capability graph pollution by:

  1. Clustering similar capabilities (embedding-based)
  2. Detecting duplicates (suggest merge)
  3. Flagging stale capabilities (no usage in 90+ days)
  4. Identifying promotion candidates (high cross-org usage)

Algorithm

async function runCapabilityJanitor(orgId: string) {
  // 1. Get all org capabilities
  const orgCaps = await getOrgCapabilities(orgId, { status: 'active' });
  
  // 2. Cluster by embedding similarity
  const clusters = await clusterByEmbedding(orgCaps, {
    threshold: 0.90,  // 90%+ similarity = potential duplicate
    method: 'cosine'
  });
  
  // 3. For each cluster with multiple capabilities, suggest merge
  for (const cluster of clusters) {
    if (cluster.capabilities.length > 1) {
      // Primary = highest usage (copy before sorting so the cluster isn't mutated)
      const sorted = [...cluster.capabilities].sort((a, b) => b.usageCount - a.usageCount);
      const primary = sorted[0];
      const duplicates = sorted.slice(1);
      
      await suggestMerge({
        orgId,
        primary: primary.id,
        duplicates: duplicates.map(d => d.id),
        reason: `High semantic similarity (avg ${cluster.avgSimilarity})`,
        impact: {
          affectedAgents: await countAgentsUsingCapabilities(duplicates.map(d => d.id)),
          affectedWorkflows: await countWorkflowsUsingCapabilities(duplicates.map(d => d.id))
        },
        suggestedAction: 'merge_into_primary_and_create_aliases'
      });
    }
  }
  
  // 4. Flag stale capabilities (no usage in 90 days)
  const staleThreshold = Date.now() - (90 * 24 * 60 * 60 * 1000);
  const stale = orgCaps.filter(c => c.lastUsed < staleThreshold);
  
  for (const cap of stale) {
    await suggestDeprecation({
      orgId,
      capabilityId: cap.id,
      reason: 'No usage in 90 days',
      usageHistory: await getCapabilityUsageHistory(cap.id, { days: 180 }),
      suggestedAction: cap.usageCount > 0 
        ? 'archive_with_migration_path'  // Used before, might return
        : 'delete'  // Never used, safe to remove
    });
  }
  
  // 5. Identify high-usage org caps → suggest canonical promotion
  const highUsage = orgCaps.filter(c => 
    c.usageCount > 100 &&  // Used frequently
    c.crossAgentUsage > 10  // Used by many agents
  );
  
  for (const cap of highUsage) {
    // Check if similar canonical capability exists
    const canonicalSimilar = await findCanonicalCapabilitySimilar(cap.embedding, { threshold: 0.85 });
    
    if (canonicalSimilar.length === 0) {
      // No similar canonical → suggest promotion
      await suggestCanonicalPromotion({
        orgId,
        capabilityId: cap.id,
        reason: 'High usage, no canonical equivalent',
        usageStats: {
          usageCount: cap.usageCount,
          crossAgentUsage: cap.crossAgentUsage,
          avgQuality: await getAvgCapabilityQuality(cap.id)
        },
        suggestedAction: 'promote_to_canonical_with_governance_review'
      });
    } else {
      // Similar canonical exists → suggest mapping
      await suggestCanonicalMapping({
        orgId,
        capabilityId: cap.id,
        canonicalId: canonicalSimilar[0].id,
        similarity: canonicalSimilar[0].similarity,
        suggestedAction: 'map_org_cap_to_canonical_and_deprecate'
      });
    }
  }
}

Clustering Algorithm

interface CapabilityCluster {
  capabilities: OrgCapability[];
  avgSimilarity: number;
  centroid: number[];  // Average embedding
}

async function clusterByEmbedding(
  capabilities: OrgCapability[],
  options: { threshold: number; method: 'cosine' | 'euclidean' }
): Promise<CapabilityCluster[]> {
  const clusters: CapabilityCluster[] = [];
  const visited = new Set<string>();
  
  for (const cap of capabilities) {
    if (visited.has(cap.id)) continue;
    
    // Find all capabilities similar to this one
    const similar = capabilities.filter(other => {
      if (visited.has(other.id)) return false;
      const similarity = cosineSimilarity(cap.embedding, other.embedding);
      return similarity >= options.threshold;
    });
    
    if (similar.length > 1) {
      // Calculate cluster centroid
      const embeddings = similar.map(c => c.embedding);
      const centroid = averageEmbedding(embeddings);
      
      // Calculate average pairwise similarity
      const similarities = [];
      for (let i = 0; i < similar.length; i++) {
        for (let j = i + 1; j < similar.length; j++) {
          similarities.push(cosineSimilarity(similar[i].embedding, similar[j].embedding));
        }
      }
      const avgSimilarity = similarities.reduce((a, b) => a + b, 0) / similarities.length;
      
      clusters.push({
        capabilities: similar,
        avgSimilarity,
        centroid
      });
      
      // Mark as visited
      similar.forEach(c => visited.add(c.id));
    }
  }
  
  return clusters;
}
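The `averageEmbedding` helper used to compute the cluster centroid above is not defined in this spec. A minimal sketch, assuming at least one vector and equal dimensions throughout:

```typescript
// Element-wise mean of a set of equal-length embedding vectors.
function averageEmbedding(embeddings: number[][]): number[] {
  const dim = embeddings[0].length;
  const centroid = new Array<number>(dim).fill(0);
  for (const vec of embeddings) {
    for (let i = 0; i < dim; i++) centroid[i] += vec[i];
  }
  return centroid.map(sum => sum / embeddings.length);
}
```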

Admin Workflow

Org admins receive quarterly Capability Janitor reports:

interface CapabilityJanitorReport {
  orgId: string;
  reportDate: Date;
  
  // Suggested merges
  merges: {
    primary: OrgCapability;
    duplicates: OrgCapability[];
    reason: string;
    impact: { affectedAgents: number; affectedWorkflows: number };
    action: 'approve' | 'reject' | 'defer';
  }[];
  
  // Stale capabilities
  staleCapabilities: {
    capability: OrgCapability;
    daysSinceLastUse: number;
    suggestedAction: 'archive' | 'delete';
    action: 'approve' | 'reject' | 'defer';
  }[];
  
  // Promotion candidates
  promotionCandidates: {
    capability: OrgCapability;
    usageStats: { usageCount: number; crossAgentUsage: number; avgQuality: number };
    suggestedCanonicalName: string;
    action: 'nominate' | 'reject' | 'defer';
  }[];
  
  // Canonical mappings
  canonicalMappings: {
    orgCapability: OrgCapability;
    canonicalCapability: CanonicalCapability;
    similarity: number;
    action: 'approve_mapping' | 'reject' | 'defer';
  }[];
}

Admin dashboard:

  • Review suggestions one-by-one
  • Approve/reject with one click
  • Bulk actions for obvious cases
  • Defer for manual review

Auto-Merge Rules

Some cases can be auto-merged without human approval:

async function shouldAutoMerge(cluster: CapabilityCluster): Promise<boolean> {
  // Auto-merge if:
  // 1. Very high similarity (>95%)
  // 2. Low impact (affects <5 agents)
  // 3. Recent creation (all capabilities created in last 30 days)
  
  const veryHighSimilarity = cluster.avgSimilarity > 0.95;
  const lowImpact = await countAgentsUsingCapabilities(
    cluster.capabilities.map(c => c.id)
  ) < 5;
  const recentCreation = cluster.capabilities.every(c => 
    Date.now() - c.createdAt.getTime() < (30 * 24 * 60 * 60 * 1000)
  );
  
  return veryHighSimilarity && lowImpact && recentCreation;
}

Execution Schedule

// Capability Janitor runs:
// - Weekly: Duplicate detection (quick wins)
// - Monthly: Stale capability flagging
// - Quarterly: Canonical promotion suggestions
// - Ad-hoc: On-demand when org cap count > threshold

const schedule = {
  duplicateDetection: 'weekly',
  staleDetection: 'monthly',
  promotionSuggestions: 'quarterly',
  onDemand: (orgCapCount) => orgCapCount > 100
};

Benefits

For Org Admins:

  • ✅ Automatic cleanup suggestions
  • ✅ No manual capability management needed
  • ✅ Prevents sprawl before it's a problem

For Capability Graph:

  • ✅ Maintains semantic quality
  • ✅ Prevents duplicate capabilities
  • ✅ Identifies canonical candidates

For Developers:

  • ✅ Cleaner capability search
  • ✅ Fewer "which capability do I use?" decisions
  • ✅ Auto-aliasing handles edge cases

DATA RESIDENCY BY DEPLOYMENT PROFILE

The Capability Graph respects data sovereignty across all three deployment profiles.

Data Placement

Data Type Hosted Hybrid Self-Hosted
Capability ontology HUMAN Cloud (global) Mirrored to customer Customer-controlled
Personal capability graphs Encrypted vault (HUMAN) Customer vault Customer vault
Capability evidence Encrypted vault (HUMAN) Customer vault Customer vault
Attestations Ledger (HUMAN-managed) Customer ledger + optional federation Customer ledger (air-gapped or federated)
Capability queries HUMAN Cloud Customer edge/cloud Customer edge/cloud
Org capabilities HUMAN Cloud (tenant-scoped) Customer database Customer database
Agent capability profiles HUMAN Cloud (tenant-scoped) Customer database Customer database

Sync Model: Device → Edge → Regional Cloud

The Capability Graph operates in a tiered sync model to support offline operation and low-latency queries:

┌─────────────────────────────────────────────────────────────────┐
│                     CAPABILITY GRAPH SYNC                        │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  DEVICE (Offline-Capable)                                       │
│  ├─ Full personal capability graph                             │
│  ├─ Cached org capabilities (relevant to user)                 │
│  ├─ Cached canonical ontology (for search)                     │
│  └─ Pending evidence submissions (queued for sync)             │
│                            │                                    │
│                            ↓ (sync when online)                 │
│                                                                 │
│  EDGE (CDN / Regional)                                          │
│  ├─ Cached capability profiles (for routing)                   │
│  ├─ Canonical ontology (full, frequently refreshed)            │
│  ├─ Org capability summaries (for fast lookup)                 │
│  └─ Recent evidence cache (TTL 5 min)                          │
│                            │                                    │
│                            ↓ (complex queries)                  │
│                                                                 │
│  REGIONAL CLOUD (Authoritative)                                 │
│  ├─ Full capability ontology (canonical + org)                 │
│  ├─ All personal capability graphs (encrypted)                 │
│  ├─ Capability evidence store                                  │
│  ├─ Capability Graph inference engine                          │
│  └─ Cross-region sync (eventual consistency)                   │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Data Flows

Evidence Submission (Device → Cloud)

// On device (can work offline)
await capabilityGraph.submitEvidence({
  passportDid: 'did:human:abc123',
  evidenceType: 'task_completion',
  capabilitiesDemonstrated: [
    { capabilityId: 'cap:python', performanceScore: 0.91 }
  ],
  // Queue if offline
  syncStrategy: 'queue_if_offline'
});

// Evidence stored locally in encrypted vault
// Synced to regional cloud when online
// Capability weights updated in cloud
// Updated profile synced back to device

Capability Query (Edge-First)

// HumanOS routing query (edge-first)
const matches = await capabilityGraph.findResourcesWithCapability(
  'cap:medical-review',
  { minWeight: 0.8, resourceTypes: ['human', 'agent'] }
);

// Execution:
// 1. Check edge cache (if profiles cached)
// 2. If cache miss or stale → query regional cloud
// 3. Cache result at edge for future queries
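Steps 1–3 above follow the standard cache-aside pattern. A sketch with a hypothetical `EdgeCache` interface and regional-cloud query function — the names are assumed for illustration, not part of the spec's API:

```typescript
// Cache-aside sketch (assumptions: EdgeCache and the cloud query function
// are illustrative stand-ins for the real edge/regional interfaces).
interface EdgeCache<T> {
  get(key: string): T | undefined;
  set(key: string, value: T, ttlMs: number): void;
}

async function edgeFirstQuery<T>(
  key: string,
  cache: EdgeCache<T>,
  queryRegionalCloud: (key: string) => Promise<T>
): Promise<T> {
  const cached = cache.get(key);                 // 1. Check edge cache
  if (cached !== undefined) return cached;
  const result = await queryRegionalCloud(key);  // 2. Cache miss → regional cloud
  cache.set(key, result, 5 * 60 * 1000);         // 3. Cache at edge (TTL 5 min)
  return result;
}
```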

Hybrid Profile: Data Never Leaves VPC

In Hybrid deployment, capability data stays in customer infrastructure:

// Capability Graph configuration (Hybrid)
const config = {
  deploymentProfile: 'hybrid',
  
  // Ontology: Mirror from HUMAN Cloud (read-only)
  ontology: {
    source: 'mirror',
    syncFrom: 'https://ontology.human.ai',
    syncInterval: '1h',
    localPath: '/var/lib/human/ontology'
  },
  
  // Personal graphs: Customer vault (read/write)
  personalGraphs: {
    storage: 'customer_vault',
    encryption: 'customer_keys',
    backups: 'customer_controlled'
  },
  
  // Org capabilities: Customer database
  orgCapabilities: {
    storage: 'customer_database',
    endpoint: 'postgres.acme.internal'
  },
  
  // Evidence: Customer vault
  evidence: {
    storage: 'customer_vault',
    retentionPolicy: 'customer_defined'
  },
  
  // Attestations: Customer ledger (optional federation)
  attestations: {
    storage: 'customer_ledger',
    federation: {
      enabled: true,  // Can federate with HUMAN public ledger
      mode: 'write_only'  // Push hashes to HUMAN, keep data local
    }
  }
};

Self-Hosted Profile: Full Air-Gap Support

In Self-Hosted deployment, no external connectivity required:

// Capability Graph configuration (Self-Hosted, Air-Gapped)
const config = {
  deploymentProfile: 'selfhosted',
  
  // Ontology: Customer fork (can diverge from HUMAN)
  ontology: {
    source: 'customer_fork',
    initialImport: 'canonical_v1.0.0',  // One-time import
    updates: 'manual',  // Controlled updates via USB/sneakernet
    customCapabilities: 'allowed'  // Customer can extend ontology
  },
  
  // All data stays on-prem
  personalGraphs: { storage: 'on_prem_vault' },
  orgCapabilities: { storage: 'on_prem_database' },
  evidence: { storage: 'on_prem_vault' },
  attestations: { storage: 'on_prem_ledger', federation: { enabled: false } },
  
  // No external connectivity
  externalConnectivity: {
    enabled: false,
    updateChannel: 'disabled',
    telemetry: 'disabled'
  }
};

Privacy Guarantees

Profile Personal Data Location Org Data Location Ontology Location
Hosted Encrypted HUMAN vault Tenant-scoped HUMAN DB HUMAN Cloud
Hybrid Customer vault Customer DB Mirrored to customer
Self-Hosted Customer vault Customer DB Customer-controlled

Key Principle: In Hybrid and Self-Hosted deployments, capability data never persists on HUMAN infrastructure — it remains in customer-controlled storage.


Compliance

This architecture supports:

  • GDPR (data residency in EU)
  • HIPAA (healthcare data stays in customer VPC)
  • CCPA (California data residency)
  • FedRAMP (air-gapped self-hosted)
  • DoD (ITAR compliance via self-hosted)

SEMANTIC CAPABILITY MATCHING

Traditional keyword matching fails for capability-based routing: "Machine Learning" doesn't match "ML", and "Python Developer" doesn't match "Python Programming". HUMAN uses semantic matching powered by embeddings.

The Matching Problem

Naive string matching:

// Task requires: ["machine-learning", "healthcare"]
// Human A has: ["ML", "medical-data-analysis"]
// Result: NO MATCH ❌ (even though human is perfect!)

// Human B has: ["machine-learning", "finance"]
// Result: PARTIAL MATCH (but wrong domain)

Semantic matching:

// Task embedding: [0.234, -0.567, ...]
// Human A capabilities:
//   - "ML" embedding: [0.245, -0.554, ...] → 96% similar to "machine-learning"
//   - "medical-data-analysis" embedding: [0.123, 0.456, ...] → 89% similar to "healthcare"
// Result: STRONG MATCH ✅ (95% overall)

// Human B capabilities:
//   - "machine-learning" embedding: [0.234, -0.567, ...] → 100% similar
//   - "finance" embedding: [-0.345, 0.123, ...] → 25% similar to "healthcare"
// Result: WEAK MATCH (62% overall) ⚠️

Semantic Matching Algorithm

interface CapabilityMatchResult {
  human: Human;
  overallScore: number;               // 0-1, aggregate match quality
  capabilityMatches: {
    requiredCapability: string;
    matchedCapability: CapabilityNode;
    similarity: number;               // 0-1, semantic similarity
    weight: number;                   // Human's capability weight
    combinedScore: number;            // similarity * weight
  }[];
  weakestLink: number;                // Minimum similarity across required capabilities
  strengths: string[];                // Areas where human excels
  gaps: string[];                     // Missing or weak capabilities
}

async function matchHumansToTask(
  taskRequirements: {
    requiredCapabilities: string[];
    minSimilarity?: number;           // Default 0.70
    minWeight?: number;               // Default 0.50
    domainContext?: string;           // E.g., "healthcare", "finance"
  },
  candidateHumans: Human[]
): Promise<CapabilityMatchResult[]> {
  
  const minSim = taskRequirements.minSimilarity ?? 0.70;
  const minWeight = taskRequirements.minWeight ?? 0.50;
  
  // 1. Get embeddings for required capabilities
  const requiredEmbeddings = await Promise.all(
    taskRequirements.requiredCapabilities.map(async cap => {
      // Try exact match first
      const existing = await getCapabilityByName(cap);
      if (existing) return { name: cap, embedding: existing.embedding };
      
      // Generate embedding for ad-hoc capability name
      return { name: cap, embedding: await embeddingProvider.embed(cap) };
    })
  );
  
  // 2. For each human, compute match score
  const matches = await Promise.all(
    candidateHumans.map(async human => {
      const humanCapabilities = await getHumanCapabilityGraph(human.passportId);
      
      // For each required capability, find best match in human's graph
      const capabilityMatches = requiredEmbeddings.map(req => {
        // Find human's capability with highest semantic similarity
        const bestMatch = humanCapabilities.nodes.reduce(
          (best, humanCap) => {
            const similarity = cosineSimilarity(req.embedding, humanCap.embedding);
            const combinedScore = similarity * humanCap.weight;  // Factor in capability weight
            
            return combinedScore > best.combinedScore
              ? { capability: humanCap, similarity, combinedScore }
              : best;
          },
          { capability: null, similarity: 0, combinedScore: 0 }
        );
        
        return {
          requiredCapability: req.name,
          matchedCapability: bestMatch.capability,
          similarity: bestMatch.similarity,
          weight: bestMatch.capability?.weight ?? 0,
          combinedScore: bestMatch.combinedScore
        };
      });
      
      // 3. Aggregate scoring
      const validMatches = capabilityMatches.filter(m => 
        m.similarity >= minSim && m.weight >= minWeight
      );
      
      // All required capabilities must have valid matches
      if (validMatches.length < requiredEmbeddings.length) {
        return null;  // Human missing critical capabilities
      }
      
      // Overall score: weighted average of combined scores
      const overallScore = validMatches.reduce((sum, m) => sum + m.combinedScore, 0) / validMatches.length;
      
      // Weakest link: minimum similarity (chain is only as strong as weakest link)
      const weakestLink = Math.min(...validMatches.map(m => m.similarity));
      
      // Identify strengths and gaps
      const strengths = capabilityMatches
        .filter(m => m.combinedScore > 0.85)
        .map(m => m.matchedCapability.name);
      
      const gaps = capabilityMatches
        .filter(m => m.combinedScore < 0.60)
        .map(m => m.requiredCapability);
      
      return {
        human,
        overallScore,
        capabilityMatches,
        weakestLink,
        strengths,
        gaps
      };
    })
  );
  
  // 4. Filter nulls (humans who don't meet requirements) and sort by score
  return matches
    .filter(m => m !== null)
    .sort((a, b) => b.overallScore - a.overallScore);
}
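The matcher above relies on a `cosineSimilarity` helper that is not defined in this spec. A minimal implementation for dense `number[]` embeddings (the in-memory embedding representation is an assumption here) could look like:

```typescript
// Cosine similarity between two equal-length dense vectors.
// Returns a value in [-1, 1]; 1 means identical direction.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`Dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // degenerate (all-zero) embedding
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

In production the similarity is more likely computed inside the vector store (as in the pgvector queries later in this document); the in-process version is useful for scoring a candidate's small capability graph without round-trips.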

Match Quality Tiers

function categorizeMatchQuality(match: CapabilityMatchResult): string {
  if (match.overallScore >= 0.90 && match.weakestLink >= 0.85) {
    return 'exceptional';  // Perfect fit, no weaknesses
  } else if (match.overallScore >= 0.80 && match.weakestLink >= 0.70) {
    return 'strong';       // Very good fit, minor gaps acceptable
  } else if (match.overallScore >= 0.70 && match.weakestLink >= 0.60) {
    return 'adequate';     // Meets requirements, some training may help
  } else if (match.overallScore >= 0.60) {
    return 'marginal';     // Risky, significant gaps
  } else {
    return 'poor';         // Should not be routed
  }
}

Domain-Aware Matching

Context matters. "Data analysis" in healthcare ≠ "data analysis" in finance:

async function domainAwareMatching(
  taskRequirements: {
    requiredCapabilities: string[];
    domainContext: string;  // "healthcare", "finance", "legal", etc.
  },
  candidateHumans: Human[]
): Promise<CapabilityMatchResult[]> {
  
  // 1. Standard semantic matching
  const baseMatches = await matchHumansToTask(taskRequirements, candidateHumans);
  
  // 2. Apply domain boost/penalty
  const domainAdjustedMatches = baseMatches.map(match => {
    const humanDomains = extractDomains(match.human.capabilityGraph);
    
    // Check if human has experience in the required domain
    const domainExperience = humanDomains.find(d => 
      d.domain === taskRequirements.domainContext
    );
    
    if (domainExperience) {
      // Boost score for domain expertise
      const domainBoost = domainExperience.weight * 0.15;  // Up to 15% boost
      match.overallScore = Math.min(1.0, match.overallScore + domainBoost);
      match.strengths.push(`${taskRequirements.domainContext} domain expertise`);
    } else {
      // Penalty for lack of domain experience
      const domainPenalty = 0.10;  // 10% penalty
      match.overallScore = Math.max(0, match.overallScore - domainPenalty);
      match.gaps.push(`Limited ${taskRequirements.domainContext} experience`);
    }
    
    return match;
  });
  
  // 3. Re-sort after domain adjustment
  return domainAdjustedMatches.sort((a, b) => b.overallScore - a.overallScore);
}

Fuzzy Synonym Detection

Handle variations in capability names:

async function fuzzyCapabilityMatch(
  queryCapability: string,
  threshold: number = 0.85
): Promise<CapabilityDefinition[]> {
  // 1. Exact synonym match
  const exactMatch = await db.query(`
    SELECT * FROM capabilities
    WHERE canonical_name = $1
    OR synonyms @> $2::jsonb
  `, [queryCapability, JSON.stringify([queryCapability])]);
  
  if (exactMatch.length > 0) return exactMatch;
  
  // 2. Fuzzy text match (Levenshtein distance)
  const fuzzyTextMatch = await db.query(`
    SELECT *, levenshtein(canonical_name, $1) AS distance
    FROM capabilities
    WHERE levenshtein(canonical_name, $1) < 3  -- Max 2 character difference
    ORDER BY distance
    LIMIT 5
  `, [queryCapability]);
  
  if (fuzzyTextMatch.length > 0) return fuzzyTextMatch;
  
  // 3. Semantic similarity (embedding-based)
  const queryEmbedding = await embeddingProvider.embed(queryCapability);
  const semanticMatch = await db.query(`
    SELECT 
      *,
      1 - (embedding <=> $1::vector) AS similarity
    FROM capabilities
    WHERE 1 - (embedding <=> $1::vector) > $2
    ORDER BY embedding <=> $1::vector
    LIMIT 10
  `, [queryEmbedding, threshold]);
  
  return semanticMatch;
}

Real-World Example: Healthcare Task Routing

// Task: Medical record review for triage
const taskRequirements = {
  requiredCapabilities: [
    "healthcare-triage",
    "medical-record-analysis",
    "HIPAA-compliance"
  ],
  domainContext: "healthcare",
  minSimilarity: 0.75,
  minWeight: 0.70  // High bar for healthcare
};

const candidateHumans = await getAvailableHumans();

const matches = await domainAwareMatching(taskRequirements, candidateHumans);

// Results:
[
  {
    human: { passportId: "did:human:sarah-rn", displayName: "Sarah J." },
    overallScore: 0.94,
    weakestLink: 0.91,  // Minimum similarity across the three matches
    capabilityMatches: [
      {
        requiredCapability: "healthcare-triage",
        matchedCapability: { name: "Clinical Triage", weight: 0.92 },
        similarity: 0.96,
        combinedScore: 0.88
      },
      {
        requiredCapability: "medical-record-analysis",
        matchedCapability: { name: "EHR Review", weight: 0.85 },
        similarity: 0.91,
        combinedScore: 0.77
      },
      {
        requiredCapability: "HIPAA-compliance",
        matchedCapability: { name: "HIPAA Certified", weight: 0.95 },
        similarity: 0.98,  // Nearly perfect match
        combinedScore: 0.93
      }
    ],
    strengths: [
      "Clinical Triage",
      "HIPAA Certified",
      "Healthcare domain expertise"
    ],
    gaps: []
  },
  {
    human: { passportId: "did:human:mike-emt", displayName: "Mike T." },
    overallScore: 0.78,
    weakestLink: 0.72,
    capabilityMatches: [
      {
        requiredCapability: "healthcare-triage",
        matchedCapability: { name: "Emergency Medical Response", weight: 0.88 },
        similarity: 0.82,
        combinedScore: 0.72
      },
      {
        requiredCapability: "medical-record-analysis",
        matchedCapability: { name: "Patient Assessment", weight: 0.70 },
        similarity: 0.72,
        combinedScore: 0.50  // Weaker here
      },
      {
        requiredCapability: "HIPAA-compliance",
        matchedCapability: { name: "Healthcare Privacy Training", weight: 0.65 },
        similarity: 0.88,
        combinedScore: 0.57
      }
    ],
    strengths: ["Emergency Medical Response"],
    gaps: ["Limited medical record analysis experience"]
  }
]

// Sarah (RN) is routed to the task. Under the strict thresholds above
// (minSimilarity 0.75, minWeight 0.70), Mike (EMT) would be filtered out
// entirely; he is shown here to illustrate what a weaker match looks like.

CAPABILITY-BASED ACCESS CONTROL

Capabilities don't just route work—they gate access to resources, knowledge, and privileges. This section defines how HUMAN uses capabilities for fine-grained access control.

Access Control Model

Traditional access control: "Is user X allowed to access resource Y?"

Capability-based access control: "Does user X have the required capabilities to access resource Y?"

interface AccessPolicy {
  resourceId: string;               // What's being protected
  resourceType: 'kb_document' | 'task_tier' | 'system_feature' | 'data_set' | 'api_endpoint';
  
  // Capability requirements
  requiredCapabilities: {
    capabilityId: string;
    minWeight: number;              // Minimum capability weight (0-1)
    minVerification?: VerificationStatus;  // Minimum verification level
  }[];
  
  // Logical operators
  operator: 'AND' | 'OR' | 'THRESHOLD';  // How to combine requirements
  threshold?: number;               // For THRESHOLD: how many capabilities needed (e.g., "2 of 3")
  
  // Additional constraints
  constraints?: {
    requiredPassportKind?: PassportKind[];  // E.g., only Founders can access
    minTrustLevel?: number;         // Overall trust score
    geographicRestrictions?: string[];  // Jurisdictional limits
    timeRestrictions?: {
      allowedHours?: string;        // E.g., "09:00-17:00"
      allowedDays?: string[];       // E.g., ["Monday", "Tuesday"]
    };
  };
  
  // Audit
  createdBy: PassportId;
  createdAt: Date;
  updatedAt: Date;
  rationale: string;                // Why this policy exists
}

Access Check Algorithm

async function checkAccess(
  actor: Passport,
  resourceId: string
): Promise<AccessDecision> {
  // 1. Get access policy for resource
  const policy = await getAccessPolicy(resourceId);
  if (!policy) {
    // No policy = default deny
    return { allowed: false, reason: 'No access policy defined' };
  }
  
  // 2. Check passport kind and constraints
  if (policy.constraints) {
    if (policy.constraints.requiredPassportKind && 
        !policy.constraints.requiredPassportKind.includes(actor.kind)) {
      return { allowed: false, reason: 'Passport kind not authorized' };
    }
    
    // Check time restrictions, geo restrictions, etc.
    const constraintCheck = await evaluateConstraints(actor, policy.constraints);
    if (!constraintCheck.passed) {
      return { allowed: false, reason: constraintCheck.reason };
    }
  }
  
  // 3. Get actor's capabilities
  const actorCapabilities = await getHumanCapabilityGraph(actor.id);
  
  // 4. Check each required capability
  const capabilityChecks = policy.requiredCapabilities.map(req => {
    const actorCap = actorCapabilities.nodes.find(c => c.id === req.capabilityId);
    
    if (!actorCap) {
      return {
        capability: req.capabilityId,
        satisfied: false,
        reason: 'Capability not present'
      };
    }
    
    if (actorCap.weight < req.minWeight) {
      return {
        capability: req.capabilityId,
        satisfied: false,
        reason: `Capability weight ${actorCap.weight} below required ${req.minWeight}`
      };
    }
    
    if (req.minVerification && 
        !meetsVerificationRequirement(actorCap.verificationStatus, req.minVerification)) {
      return {
        capability: req.capabilityId,
        satisfied: false,
        reason: `Verification level ${actorCap.verificationStatus} insufficient`
      };
    }
    
    return {
      capability: req.capabilityId,
      satisfied: true
    };
  });
  
  // 5. Apply logical operator
  const allowed = evaluateLogicalOperator(
    policy.operator,
    capabilityChecks,
    policy.threshold
  );
  
  // 6. Log access attempt
  await logAccessAttempt({
    actorId: actor.id,
    resourceId,
    allowed,
    capabilityChecks,
    timestamp: new Date()
  });
  
  return {
    allowed,
    reason: allowed ? 'Access granted' : 'Capability requirements not met',
    missingCapabilities: capabilityChecks.filter(c => !c.satisfied)
  };
}

function evaluateLogicalOperator(
  operator: 'AND' | 'OR' | 'THRESHOLD',
  checks: { satisfied: boolean }[],
  threshold?: number
): boolean {
  switch (operator) {
    case 'AND':
      return checks.every(c => c.satisfied);
    
    case 'OR':
      return checks.some(c => c.satisfied);
    
    case 'THRESHOLD':
      const satisfiedCount = checks.filter(c => c.satisfied).length;
      return satisfiedCount >= (threshold ?? checks.length);
  }
}

Real-World Access Policies

Example 1: KB Document Access (PHI Data)

{
  resourceId: "kb:patient-health-information",
  resourceType: "kb_document",
  requiredCapabilities: [
    {
      capabilityId: "cap:hipaa-compliance",
      minWeight: 0.75,
      minVerification: "issuer_verified"  // Must have official HIPAA certification
    },
    {
      capabilityId: "cap:healthcare-license",
      minWeight: 0.80,
      minVerification: "issuer_verified"  // Must have verified healthcare license
    }
  ],
  operator: "AND",  // Must satisfy BOTH
  constraints: {
    requiredPassportKind: ["Founder", "InternalTeam", "PartnerExternal"],
    geographicRestrictions: ["US", "CA"]  // PHI access limited to US and Canada jurisdictions
  },
  rationale: "PHI requires HIPAA training and healthcare licensure"
}

Example 2: High-Value Task Access

{
  resourceId: "task-tier:enterprise-ml-architecture",
  resourceType: "task_tier",
  requiredCapabilities: [
    {
      capabilityId: "cap:ml-systems",
      minWeight: 0.70  // Strong ML systems knowledge
    },
    {
      capabilityId: "cap:distributed-systems",
      minWeight: 0.65
    },
    {
      capabilityId: "cap:production-experience",
      minWeight: 0.60
    }
  ],
  operator: "AND",
  constraints: {
    minTrustLevel: 0.80  // High trust score required
  },
  rationale: "High-stakes ML infrastructure design requires proven expertise"
}

Example 3: Feature Access (Threshold Model)

{
  resourceId: "feature:advanced-analytics-dashboard",
  resourceType: "system_feature",
  requiredCapabilities: [
    { capabilityId: "cap:data-analysis", minWeight: 0.65 },
    { capabilityId: "cap:statistics", minWeight: 0.60 },
    { capabilityId: "cap:data-visualization", minWeight: 0.60 },
    { capabilityId: "cap:sql", minWeight: 0.55 }
  ],
  operator: "THRESHOLD",
  threshold: 2,  // Must have at least 2 of the 4 capabilities
  rationale: "Analytics dashboard requires data literacy, but not all specific skills"
}
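Plugged into the THRESHOLD branch of `evaluateLogicalOperator` above, Example 3 plays out as follows for a hypothetical actor whose weights clear the bar on only two of the four listed capabilities (the actor's weights are illustrative):

```typescript
// Evaluate Example 3's THRESHOLD policy: access is granted when at
// least `threshold` of the capability checks are satisfied.
function thresholdSatisfied(satisfiedFlags: boolean[], threshold: number): boolean {
  return satisfiedFlags.filter(Boolean).length >= threshold;
}

// Hypothetical actor:
//   cap:data-analysis       weight 0.70 >= 0.65  ✓
//   cap:statistics          weight 0.40 <  0.60  ✗
//   cap:data-visualization  weight 0.30 <  0.60  ✗
//   cap:sql                 weight 0.80 >= 0.55  ✓
const checks = [true, false, false, true];

thresholdSatisfied(checks, 2); // 2 of 4 satisfied → access granted
thresholdSatisfied(checks, 3); // a stricter threshold would deny
```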

Example 4: Time-Restricted Access

{
  resourceId: "data:financial-transactions",
  resourceType: "data_set",
  requiredCapabilities: [
    {
      capabilityId: "cap:financial-analysis",
      minWeight: 0.70,
      minVerification: "issuer_verified"
    }
  ],
  operator: "AND",
  constraints: {
    timeRestrictions: {
      allowedHours: "09:00-17:00",  // Business hours only
      allowedDays: ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"]
    },
    geographicRestrictions: ["US"],
    minTrustLevel: 0.85
  },
  rationale: "Financial data access restricted to business hours for audit trail"
}

Dynamic Access Elevation

Capabilities can be temporarily elevated for specific tasks:

interface TemporaryAccessGrant {
  actorId: PassportId;
  resourceId: string;
  grantedBy: PassportId;
  reason: string;
  expiresAt: Date;
  
  // Elevated capabilities (temporary boost)
  elevatedCapabilities?: {
    capabilityId: string;
    temporaryWeight: number;  // Override weight for this grant
  }[];
}

// Example: Junior developer needs access to production for emergency fix
{
  actorId: "did:human:junior-dev",
  resourceId: "api:production-database",
  grantedBy: "did:human:senior-engineer",
  reason: "Emergency hotfix for payment processing bug",
  expiresAt: "2025-12-01T20:00:00Z",  // 2 hours
  elevatedCapabilities: [
    {
      capabilityId: "cap:production-access",
      temporaryWeight: 0.75  // Boost from 0.30 to 0.75 for this session
    }
  ]
}
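At check time, a grant like this can be folded into the capability lookup. A sketch, assuming the resolution rule is "take the higher of the base weight and any unexpired elevation" (the `effectiveWeight` helper and that rule are illustrative, not part of the spec above):

```typescript
interface Elevation { capabilityId: string; temporaryWeight: number }
interface Grant { expiresAt: Date; elevatedCapabilities?: Elevation[] }

// Effective weight = max(base weight, any unexpired temporary
// elevation for this capability). Expired grants are ignored.
function effectiveWeight(
  baseWeight: number,
  capabilityId: string,
  grants: Grant[],
  now: Date = new Date()
): number {
  let weight = baseWeight;
  for (const grant of grants) {
    if (grant.expiresAt <= now) continue; // grant has lapsed
    for (const elev of grant.elevatedCapabilities ?? []) {
      if (elev.capabilityId === capabilityId) {
        weight = Math.max(weight, elev.temporaryWeight);
      }
    }
  }
  return weight;
}
```

With the junior-developer grant above, `effectiveWeight(0.30, "cap:production-access", grants)` resolves to 0.75 until `expiresAt`, then falls back to 0.30.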

Capability-Gated KB Access

Different KB documents require different capabilities:

// Governance tier + capability requirements
const kbAccessPolicies = {
  "Canon": {
    requiredCapabilities: [
      { capabilityId: "cap:strategic-thinking", minWeight: 0.70 }
    ],
    requiredPassportKind: ["Founder", "InternalTeam"]
  },
  
  "Working": {
    requiredCapabilities: [
      { capabilityId: "cap:product-knowledge", minWeight: 0.50 }
    ],
    requiredPassportKind: ["Founder", "InternalTeam", "PartnerExternal"]
  },
  
  "Public": {
    // No capability requirements, but maybe basic trust level
    constraints: { minTrustLevel: 0.20 }
  },
  
  // Document-specific overrides
  "kb:ml-infrastructure-design": {
    requiredCapabilities: [
      { capabilityId: "cap:ml-systems", minWeight: 0.65 },
      { capabilityId: "cap:architecture-design", minWeight: 0.60 }
    ],
    operator: "AND"
  }
};

Audit Trail

Every access decision is logged:

interface AccessAuditLog {
  id: string;
  timestamp: Date;
  actorId: PassportId;
  resourceId: string;
  resourceType: string;
  decision: 'granted' | 'denied';
  
  // What capabilities were checked
  requiredCapabilities: {
    capabilityId: string;
    required: { minWeight: number; minVerification?: string };
    actual: { weight: number; verificationStatus: string };
    satisfied: boolean;
  }[];
  
  // Why access was granted/denied
  reason: string;
  missingCapabilities?: string[];
  
  // Session context
  sessionId?: string;
  ipAddress?: string;
  deviceId?: string;
}

This audit trail enables:

  • Compliance: Prove who accessed what and why
  • Security: Detect unusual access patterns
  • Capability insights: See which capabilities are most frequently required
  • Training recommendations: Identify common capability gaps
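As a sketch of the security use above, a periodic job could scan the audit log for actors accumulating denials (the `flagSuspiciousActors` helper and both thresholds are illustrative):

```typescript
interface LogEntry { actorId: string; decision: 'granted' | 'denied' }

// Flag actors whose denial rate exceeds `maxDenialRate` over at least
// `minAttempts` access attempts — a crude "probing for access" signal.
function flagSuspiciousActors(
  logs: LogEntry[],
  minAttempts = 5,
  maxDenialRate = 0.5
): string[] {
  const stats = new Map<string, { total: number; denied: number }>();
  for (const entry of logs) {
    const s = stats.get(entry.actorId) ?? { total: 0, denied: 0 };
    s.total++;
    if (entry.decision === 'denied') s.denied++;
    stats.set(entry.actorId, s);
  }
  return [...stats.entries()]
    .filter(([, s]) => s.total >= minAttempts && s.denied / s.total > maxDenialRate)
    .map(([actorId]) => actorId);
}
```

A real deployment would more likely express this as a query over the `AccessAuditLog` store and feed the result into the anomaly pipeline described later.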

COMPUTATION MODEL

The engine runs on a three-pass update model:

Pass 1 — Immediate Update (Fast Path)

Triggered by a workflow event.

Produces:

  • lightweight edge updates
  • provisional capability deltas
  • timestamped micro-attestations

Must complete in under 50 ms to keep workflows smooth.

Pass 2 — Contextual Update (Slow Path)

Runs asynchronously.

Calculates:

  • pattern clusters
  • long-range capability arcs
  • cross-domain generalization
  • bias correction

Pass 3 — Periodic Reconciliation (Scheduled)

Daily or weekly.

Performs:

  • weight smoothing
  • anomaly detection
  • gaming detection
  • drift correction
  • deletion and revocation propagation
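The trigger-to-pass mapping above can be sketched as a small dispatcher (the handler wiring is hypothetical; only the mapping itself comes from the model):

```typescript
type Pass = 'immediate' | 'contextual' | 'reconciliation';

// Map an engine trigger to the passes it runs: a workflow event runs
// the fast path synchronously and enqueues the slow path; only the
// scheduler drives periodic reconciliation.
function selectPasses(trigger: 'workflow_event' | 'scheduled'): Pass[] {
  return trigger === 'workflow_event'
    ? ['immediate', 'contextual'] // Pass 1 now (<50 ms budget), Pass 2 queued
    : ['reconciliation'];         // Pass 3, daily or weekly batch
}
```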

PRIVACY ARCHITECTURE

All capability data is:

  • stored locally in the Human Vault
  • anchored in the ledger by hash only
  • selectively revealable via zk-like mechanisms

People can reveal:

  • capabilities relevant to a role

without revealing:

  • how they were gained
  • where they were gained
  • or what tasks they did

No employer owns the graph.
No system can retain it after revocation.
No algorithm can reverse-engineer private context.


SELECTIVE DISCLOSURE ENGINE

Examples:

Prove capability in "AI safety triage" → reveal only the node + confidence
→ do NOT reveal workflows, errors, history, or training pathways.

Prove compliance with a role requirement → reveal exactly the needed subgraph
→ redact everything else.

Prove you worked for a company → reveal a signed employment attestation
→ no dates unless you choose

The graph is yours. Always.
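The commit/reveal mechanics behind "anchored in the ledger by hash only" can be sketched with a salted hash per node (a simple stand-in for the zk-like mechanism, not an actual zero-knowledge proof; Node's `crypto` module assumed):

```typescript
import { createHash, randomBytes } from 'crypto';

// Commit: the ledger stores only sha256(salt || node JSON).
// The node data and salt stay in the Human Vault.
function commitNode(node: object, salt: string): string {
  return createHash('sha256').update(salt + JSON.stringify(node)).digest('hex');
}

// Reveal: the holder discloses one node plus its salt; a verifier
// recomputes the hash and compares it to the ledger anchor.
function verifyReveal(node: object, salt: string, anchored: string): boolean {
  return commitNode(node, salt) === anchored;
}

const salt = randomBytes(16).toString('hex');
const node = { id: 'cap:ai-safety-triage', weight: 0.82 }; // node + confidence only
const anchor = commitNode(node, salt); // only this hash goes on the ledger
verifyReveal(node, salt, anchor);      // proves the node without exposing the rest
```

Because each node is committed with its own salt, revealing one node leaks nothing about siblings, and without the salt the anchor cannot be dictionary-attacked from guessed node contents.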


GAMING PREVENTION & ANTI-SPECIFICATION-GAMING MEASURES

Source: Stuart Russell (value alignment research - specification gaming)

The Capability Graph is only valuable if it's tamper-proof and gaming-resistant. Without rigorous anti-gaming measures, humans could artificially inflate capability weights, destroying enterprise trust and graph integrity.

HUMAN implements comprehensive, multi-layered gaming prevention that makes fraudulent capability claims economically irrational and technically infeasible.

The Gaming Threat Model

Potential gaming vectors:

  1. Credential farming - Rapidly completing easy tasks to inflate scores
  2. Collusion - Peers artificially vouching for each other
  3. Bot assistance - Using AI to complete human assessments
  4. Context manipulation - Cherry-picking favorable task contexts
  5. Temporal gaming - Timing completions to exploit system patterns
  6. Multi-account gaming - Creating multiple identities to game reputation
  7. Social engineering - Manipulating review processes

Anti-Gaming Architecture

1. Task Diversity Requirements

Principle: Capability requires success across varied contexts, not just one test.

interface CapabilityDiversityRequirement {
  capabilityId: string;
  
  // Evidence must span multiple dimensions
  minContextVariety: {
    domains: number;           // Must demonstrate in N different domains
    taskTypes: number;         // Must complete N different task types
    complexityLevels: number;  // Must succeed at various difficulty levels
    timeWindows: number;       // Must perform over N distinct time periods
  };
  
  // Prevents "farming" the same easy task repeatedly
  maxRepetitionWeight: number;  // Cap weight from repeated similar tasks
}

Example - Healthcare Triage:

{
  capabilityId: "cap:healthcare-triage",
  minContextVariety: {
    domains: 3,           // Pediatrics, adult, geriatric
    taskTypes: 4,         // Assessment, escalation, documentation, communication
    complexityLevels: 3,  // Routine, urgent, emergency
    timeWindows: 6        // At least 6 separate weeks of performance
  },
  maxRepetitionWeight: 0.40  // Repeated similar tasks capped at 40% of total weight
}

Implementation:

  • Task completion events tagged with context fingerprints
  • Weight calculation penalizes repetition
  • Capability weight plateaus until diversity threshold met
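The repetition penalty can be sketched by splitting evidence into novel-context and repeated-context portions and capping the latter (the split itself is an assumption; `maxRepetitionWeight` comes from the interface above):

```typescript
// Cap the contribution of repeated-context evidence so that at most
// `maxRepetitionWeight` of the raw evidence total can come from
// re-running similar tasks.
function applyRepetitionCap(
  novelWeight: number,        // weight earned in distinct contexts
  repeatedWeight: number,     // weight earned from near-duplicate tasks
  maxRepetitionWeight: number // e.g., 0.40 for healthcare triage
): number {
  const uncapped = novelWeight + repeatedWeight;
  const cappedRepeat = Math.min(repeatedWeight, uncapped * maxRepetitionWeight);
  return Math.min(1.0, novelWeight + cappedRepeat);
}

// Heavy farming: 0.3 novel + 0.5 repeated → repetition capped, ≈ 0.62
applyRepetitionCap(0.3, 0.5, 0.40);
// Diverse evidence: 0.5 novel + 0.1 repeated → no cap applies, 0.6
applyRepetitionCap(0.5, 0.1, 0.40);
```

The effect is the plateau described above: once the repetition cap binds, the only way to keep growing the weight is new-context evidence.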

2. Peer Review & Random Auditing

Principle: 10% of capability assessments randomly audited by other humans.

interface PeerReviewProcess {
  // Random audit selection
  auditProbability: number;     // 10% of all assessments
  
  // Reviewer selection criteria
  reviewerRequirements: {
    minCapabilityWeight: number;      // Reviewer must have higher capability
    noConflictOfInterest: boolean;    // No prior collaboration with subject
    geographicDiversity: boolean;     // Prefer reviewers from different regions
  };
  
  // Review process
  blindReview: boolean;               // Reviewer doesn't see original score
  reviewCriteria: string[];           // Specific rubric for review
  
  // Dispute resolution
  thresholdForEscalation: number;     // If reviewer disagrees by >20%, escalate
  tiebreaker: 'third_reviewer' | 'trust_and_safety_team';
}

Audit triggers:

  • Random selection (10% baseline)
  • High-value capabilities (healthcare, finance) → 20% audit rate
  • Rapid capability gain (>0.20 weight increase in 30 days) → 50% audit rate
  • Anomaly detection flags → 100% audit rate

Reviewer compensation:

  • Paid per review (aligned incentive to be thorough)
  • Own reputation at stake (bad reviews damage their graph)
  • Blind reviews prevent social pressure

3. Temporal Validation & Capability Decay

Principle: Capabilities decay without continued demonstration. Skills atrophy.

interface CapabilityDecayModel {
  capabilityId: string;
  decayFunction: 'exponential' | 'linear' | 'step';
  
  // Time-based decay parameters
  halfLife: number;             // Months until weight halves (if not reinforced)
  minRetentionRate: number;     // Floor below which capability removed
  
  // Reinforcement resets decay
  reinforcementEvents: {
    taskCompletion: { decayReset: 'full' | 'partial', months: number };
    training: { decayReset: 'partial', months: number };
    peerValidation: { decayReset: 'partial', months: number };
  };
  
  // Exemptions (credentials don't decay until expiration)
  exemptIfCredentialBacked: boolean;
}

Example decay curves:

| Capability Type | Half-Life | Decay Function | Rationale |
|---|---|---|---|
| Technical Skills (Python, SQL) | 12 months | Exponential | Skills rust without practice |
| Judgment Capabilities (Triage, Safety) | 18 months | Linear | Judgment degrades slower but steadily |
| Certifications (HIPAA, CPR) | No decay | Step (expires at cert expiration) | Binary valid/expired |
| Soft Skills (Communication, Empathy) | 24 months | Linear | Stable but can degrade |

Anti-gaming benefit:

  • Can't "farm" capability once and coast forever
  • Forces continuous demonstration
  • Aligns with real-world skill maintenance
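For the exponential variant, the decayed weight follows directly from the half-life: a node reinforced `t` months ago keeps `w0 * 0.5^(t / halfLife)`. A sketch, treating `minRetentionRate` as the removal floor per the interface above:

```typescript
// Exponential capability decay: the weight halves every
// `halfLifeMonths` since the last reinforcement; once it falls below
// `minRetentionRate` the node is removed (returned as 0 here).
function decayedWeight(
  baseWeight: number,
  monthsSinceReinforcement: number,
  halfLifeMonths: number,   // e.g., 12 for technical skills
  minRetentionRate: number  // removal floor
): number {
  const w = baseWeight * Math.pow(0.5, monthsSinceReinforcement / halfLifeMonths);
  return w < minRetentionRate ? 0 : w;
}

// A Python skill at 0.80, untouched for one half-life (12 months) → 0.40
decayedWeight(0.80, 12, 12, 0.10);
// After four half-lives it falls to 0.05, below the floor → removed (0)
decayedWeight(0.80, 48, 12, 0.10);
```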

4. Anomaly Detection & Pattern Analysis

Principle: Statistical analysis flags suspicious patterns.

interface AnomalyDetectionSystem {
  // Velocity anomalies
  rapidGain: {
    threshold: number;           // >0.30 weight gain in <30 days = suspicious
    compareToPopulation: boolean; // Compare to other humans gaining same capability
  };
  
  // Perfection anomalies
  unrealisticSuccess: {
    threshold: number;           // 100% success rate over 50+ tasks = suspicious
    expectedErrorRate: number;   // Humans make mistakes; 0 errors is a red flag
  };
  
  // Temporal anomalies
  offHoursPatterns: {
    detectBotPatterns: boolean;  // Activity at 3 AM every night = bot?
    regularityThreshold: number; // Too-regular timing patterns
  };
  
  // Social anomalies
  collusion: {
    detectPeerReviewClusters: boolean;  // Same 3 people always review each other?
    maxReviewOverlap: number;            // Max % of reviews from same reviewers
  };
  
  // Context anomalies
  taskSimilarity: {
    detectRepetition: boolean;   // Completing nearly-identical tasks repeatedly
    maxSimilarity: number;       // Tasks >95% similar flagged
  };
}

Anomaly response workflow:

async function handleAnomalyDetection(
  passportId: PassportId,
  capabilityId: string,
  anomalyType: string,
  confidence: number
) {
  if (confidence > 0.90) {
    // High confidence anomaly - immediate quarantine
    await quarantineCapability(passportId, capabilityId, {
      reason: anomalyType,
      confidence: confidence,
      status: 'under_review'
    });
    
    // Notify Trust & Safety team
    await notifyTrustAndSafety({
      passportId,
      capabilityId,
      anomalyType,
      confidence,
      priority: 'high'
    });
    
    // Human review required before un-quarantine
    await createReviewTask({
      type: 'capability_integrity_review',
      subject: passportId,
      capability: capabilityId,
      evidence: await gatherAnomalyEvidence(passportId, capabilityId)
    });
  } else if (confidence > 0.70) {
    // Medium confidence - increase audit rate
    await flagForEnhancedAudit(passportId, capabilityId, {
      auditRate: 0.50,  // 50% of future tasks audited
      duration: '60 days',
      reason: anomalyType
    });
  } else {
    // Low confidence - log for pattern monitoring
    await logAnomalySignal(passportId, capabilityId, anomalyType, confidence);
  }
}

5. Multi-Source Validation (Cross-Channel Consistency)

Principle: Combine evidence from Academy, Workforce Cloud, and external attestations.

interface MultiSourceValidation {
  capabilityId: string;
  
  // Evidence sources must corroborate
  requiredSources: {
    academy: { minTasks: number; minSuccessRate: number };
    workforceCloud: { minTasks: number; minQualityScore: number };
    peerReview: { minReviews: number; minConsensus: number };
    externalAttestation?: { minCredentials: number; minTrustLevel: number };
  };
  
  // Cross-source consistency check
  maxVariance: number;  // If sources disagree by >20%, flag for review
  
  // Weighting by source reliability
  sourceWeights: {
    academy: number;              // Training ≠ real performance
    workforceCloud: number;       // Real tasks = highest weight
    peerReview: number;           // Social validation
    externalAttestation: number;  // Official credentials
  };
}

Example - Software Engineering:

{
  capabilityId: "cap:full-stack-development",
  requiredSources: {
    academy: { minTasks: 10, minSuccessRate: 0.80 },
    workforceCloud: { minTasks: 5, minQualityScore: 0.75 },
    peerReview: { minReviews: 3, minConsensus: 0.70 },
    externalAttestation: { minCredentials: 0, minTrustLevel: 0.60 }  // Optional
  },
  maxVariance: 0.20,  // Academy says 0.90, Workforce says 0.60 = flag for review
  sourceWeights: {
    academy: 0.20,              // Training is lowest weight
    workforceCloud: 0.50,       // Real performance is highest
    peerReview: 0.20,           // Social validation
    externalAttestation: 0.10   // Credentials add credibility but not primary
  }
}

Anti-gaming benefit:

  • Can't just ace Academy training (must perform in real tasks)
  • Can't just game Workforce Cloud (peers must validate)
  • Can't just get peer vouches (must have training + performance)
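The blend of source weights and the variance check can be sketched against the software-engineering configuration above (the `blendSources` helper and score shapes are illustrative):

```typescript
interface SourceScores {
  academy: number;
  workforceCloud: number;
  peerReview: number;
  externalAttestation: number;
}

// Weighted blend of per-source capability scores; flagged for human
// review when the best and worst sources disagree by more than
// `maxVariance`.
function blendSources(
  scores: SourceScores,
  weights: SourceScores,
  maxVariance: number
): { weight: number; flagged: boolean } {
  const values = Object.values(scores);
  const spread = Math.max(...values) - Math.min(...values);
  const weight =
    scores.academy * weights.academy +
    scores.workforceCloud * weights.workforceCloud +
    scores.peerReview * weights.peerReview +
    scores.externalAttestation * weights.externalAttestation;
  return { weight, flagged: spread > maxVariance };
}

// Academy says 0.90 but real tasks say 0.60 → spread 0.30 > 0.20: flagged.
blendSources(
  { academy: 0.90, workforceCloud: 0.60, peerReview: 0.70, externalAttestation: 0.65 },
  { academy: 0.20, workforceCloud: 0.50, peerReview: 0.20, externalAttestation: 0.10 },
  0.20
);
```

Note how the 0.50 weight on Workforce Cloud pulls the blended score toward real performance (≈ 0.69 here) rather than the inflated training signal.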

6. Economic Disincentives

Principle: Make gaming economically irrational.

Cost of gaming > Benefit of inflated capability:

| Gaming Vector | Cost to Gamer | Detection Probability | Expected Loss |
|---|---|---|---|
| Credential farming | Time wasted on repetitive tasks | 90% (diversity check fails) | Lost time + quarantine |
| Collusion | Coordinating with peers | 85% (pattern detection) | Both parties banned |
| Bot assistance | Risk of platform ban | 95% (timing anomalies) | Loss of all capability data |
| Multi-account gaming | Infrastructure cost | 99% (device fingerprinting) | All accounts banned |

Penalties:

  1. First offense: Capability quarantine (30 days)
  2. Second offense: Capability revoked, 90-day probation
  3. Third offense: Account suspension, potential ban
  4. Severe cases: Permanent ban + notification to ecosystem partners

Incentive alignment:

  • Honest capability building is faster than gaming
  • Real capability = real earnings in Workforce Cloud
  • Quarantine = lost earning opportunity
  • Reputation damage is permanent

7. Transparency & Explainability

Principle: Humans must understand HOW capabilities are assessed to trust the system.

Public documentation:

  • Capability assessment criteria (see Capability Cards below)
  • Weighting formulas (no black boxes)
  • Audit processes (how peer review works)
  • Appeal processes (how to dispute a quarantine)

Per-capability transparency:

interface CapabilityTransparencyReport {
  capabilityId: string;
  
  // How was this capability assessed?
  assessmentMethod: {
    evidenceSources: string[];           // Academy, Workforce, Peer, Credential
    diversityRequirements: object;       // Context variety needed
    decayModel: object;                  // How capability decays
    peerReviewRate: number;              // % of assessments audited
  };
  
  // What's measured vs. NOT measured?
  measuredAttributes: string[];
  excludedAttributes: string[];
  
  // Fairness considerations
  biasAudit: {
    lastAuditDate: Date;
    auditor: string;                     // Third-party auditor
    findings: string;
  };
  
  // How to improve this capability
  improvementPath: {
    recommendedTraining: string[];
    requiredTasks: string[];
    estimatedTimeToMastery: string;
  };
}

Anti-gaming benefit:

  • Transparency reduces "find the loophole" gaming
  • Humans see legitimate path is faster
  • Audit visibility deters collusion

Implementation Timeline

| Phase | Measures | Timeline |
|---|---|---|
| v0.1 (Launch) | Task diversity, temporal decay, basic anomaly detection | Month 9 |
| v0.2 (Post-Launch) | Peer review, multi-source validation | Month 12 |
| v0.3 (Scale) | Advanced anomaly detection, ML-based pattern analysis | Month 18 |
| Ongoing | Quarterly bias audits, continuous improvement | Quarterly |

Success Metrics

How we measure anti-gaming effectiveness:

| Metric | Target | Measurement |
|---|---|---|
| False positive rate | <5% | % of legitimate capability gains flagged as suspicious |
| False negative rate | <2% | % of gaming attempts that slip through (from audits) |
| Appeal success rate | 15-20% | % of quarantines overturned (healthy = some false positives) |
| Time to detection | <7 days | Median days from gaming attempt to detection |
| Recidivism rate | <10% | % of flagged users who game again after penalty |

Integration with Other Systems

Anti-gaming measures connect to:

  • HumanOS - Quarantined capabilities excluded from routing
  • Workforce Cloud - Suspended users lose task access
  • Academy - Gaming detection informs training improvements
  • Passport - Integrity flags visible to enterprises (with consent)
  • Ledger - Attestation revocations propagated globally

Research Partnership Opportunity

Stuart Russell (UC Berkeley) - Advise on specification gaming prevention, value alignment implementation

Potential collaboration:

  • Review anti-gaming architecture
  • Co-author paper on practical value alignment in capability systems
  • Advisory board position (Technical Advisory Board tier)

See: kb/86_academic_and_thought_leader_engagement_strategy.md - Stuart Russell engagement plan


This comprehensive anti-gaming system is designed from the ground up to keep the Capability Graph tamper-resistant, fair, and transparent, making it a trustworthy representation of human capability.


CAPABILITY CARDS: TRANSPARENCY & FAIRNESS LAYER

Source: Timnit Gebru (Model Cards for AI fairness) - Applied to human capabilities

Problem: If humans don't understand HOW they're being assessed, they can't trust the system. Black-box capability assessment = algorithmic bias risk.

Solution: Every capability has a public "Capability Card"—a transparency document that explains:

  • How it's assessed
  • What IS measured
  • What is NOT measured
  • Fairness considerations
  • How to improve it

Capability Cards are "nutrition labels" for capabilities—making the system explainable, auditable, and fair.

Why Capability Cards Matter

Traditional assessment problems:

  • ❌ Opaque (humans don't know how they're evaluated)
  • ❌ Biased (hidden assumptions favor certain demographics)
  • ❌ Unfair (some humans disadvantaged by assessment design)
  • ❌ Unauditable (no way to verify fairness)

Capability Cards solve all four:

  • Transparent: Humans see exactly how assessment works
  • Fair: Explicit about what's excluded (age, gender, credentials)
  • Auditable: Third parties can review methodology
  • Trustworthy: Enterprises know what they're routing on

Capability Card Template

# Capability Card: [Capability Name]

**ID:** cap:[capability-id]  
**Category:** [skill | judgment | experience | trait | certification]  
**Version:** 1.0  
**Last Reviewed:** [Date]  
**Reviewed By:** [DAIR Institute | Internal Trust & Safety | Third-Party Auditor]

---

## ASSESSMENT METHOD

**How is this capability measured?**

- **Training Evidence:** [Academy module completions, simulations]
- **Workforce Evidence:** [Real task completions, quality scores]
- **Peer Validation:** [Random audits, peer reviews]
- **External Attestation:** [Credentials, licenses, certifications]

**Weight Calculation:**
- Academy evidence: 20% of weight
- Workforce evidence: 50% of weight (real performance)
- Peer validation: 20% of weight
- External attestation: 10% of weight

**Diversity Requirements:**
[Explain task diversity, context variety, temporal validation]

---

## WHAT IS MEASURED

**Explicit list of assessed attributes:**

1. [Attribute 1]: [How it's measured]
2. [Attribute 2]: [How it's measured]
3. [Attribute 3]: [How it's measured]

**Example Tasks:**
- [Task example 1]
- [Task example 2]
- [Task example 3]

---

## WHAT IS NOT MEASURED

**Explicit exclusions (prevents proxy discrimination):**

- ❌ **Years of experience** - Not a proxy for capability
- ❌ **Educational credentials** - Not assessment criteria (unless required by regulation)
- ❌ **Speed** - Quality over speed
- ❌ **Age, gender, race, nationality** - Never factored into capability weight
- ❌ **Employment history** - Past employers don't determine capability
- ❌ **Socioeconomic status** - No bias based on background

---

## FAIRNESS CONSIDERATIONS

**How we ensure fairness:**

1. **Diverse testing contexts:** [Capability tested across varied scenarios to prevent cultural bias]
2. **Multiple pathways:** [Various ways to demonstrate capability—not one narrow test]
3. **Bias auditing:** [Quarterly audits for demographic disparities]
4. **Accessibility:** [Accommodations for disabilities, language support]
5. **Appeal process:** [How to dispute assessment]

**Bias Audit Results:**
- Last audit: [Date]
- Auditor: [DAIR Institute / Third party]
- Findings: [Summary of audit—any disparities detected?]
- Remediation: [Actions taken if bias found]

---

## HOW TO IMPROVE THIS CAPABILITY

**Recommended path to mastery:**

1. **Academy Training:** [Recommended modules]
2. **Practice Tasks:** [Workforce Cloud task types that build this capability]
3. **Peer Learning:** [Mentorship, collaboration opportunities]
4. **External Resources:** [Courses, certifications, books]

**Estimated Time to Proficiency:** [Realistic timeline]

**Current Supply/Demand:**
- Humans with this capability: [Count]
- Enterprise demand: [High | Medium | Low]
- Earning potential: [$ range for tasks requiring this capability]

---

## REVISION HISTORY

| Version | Date | Changes | Reviewer |
|---------|------|---------|----------|
| 1.0 | [Date] | Initial capability card | [Reviewer] |

---

**Questions or Concerns?**  
Contact: trust-and-safety@human.xyz  
Appeal Process: [Link to appeal form]

Example: Clinical Judgment (Nursing)

# Capability Card: Clinical Judgment (Nursing)

**ID:** cap:clinical-judgment-nursing  
**Category:** judgment  
**Version:** 1.2  
**Last Reviewed:** November 15, 2025  
**Reviewed By:** DAIR Institute (Third-Party Audit)

---

## ASSESSMENT METHOD

**How is this capability measured?**

- **Training Evidence:** Academy simulated patient scenarios (15+ scenarios across age groups)
- **Workforce Evidence:** Real triage decisions in HumanOS workflows (10+ successful escalations)
- **Peer Validation:** Random audit by licensed RNs (10% of assessments reviewed)
- **External Attestation:** Active RN license (verified through state board API)

**Weight Calculation:**
- Academy evidence: 15% (simulations ≠ real patients)
- Workforce evidence: 60% (real triage performance)
- Peer validation: 15% (expert review)
- External attestation: 10% (license validity)

**Diversity Requirements:**
- Minimum 3 patient demographics (pediatric, adult, geriatric)
- Minimum 4 acuity levels (routine, urgent, emergency, critical)
- Minimum 3 clinical contexts (inpatient, outpatient, emergency)

---

## WHAT IS MEASURED

1. **Decision accuracy under uncertainty** - Correctly identifying patient conditions when information is incomplete
2. **Evidence-based reasoning** - Using clinical guidelines and protocols appropriately
3. **Patient safety protocols** - Following escalation procedures for high-risk situations
4. **Communication clarity** - Documenting decisions clearly for other care team members
5. **Escalation appropriateness** - Knowing when to call a physician vs. handle independently

**Example Tasks:**
- Review patient vitals and determine if ER visit needed
- Assess medication side effects and escalate if dangerous
- Triage incoming patients by acuity level
- Document assessment findings for care team

---

## WHAT IS NOT MEASURED

- ❌ **Years of nursing experience** - Not a capability proxy (new RNs can have strong judgment)
- ❌ **Nursing school prestige** - Where you trained doesn't determine capability
- ❌ **Speed of triage** - Quality over speed (fast but wrong = dangerous)
- ❌ **Patient satisfaction scores** - Nice bedside manner ≠ clinical judgment
- ❌ **Age, gender, race** - Never factored into assessment
- ❌ **Employment history** - Past hospital employment irrelevant

---

## FAIRNESS CONSIDERATIONS

**How we ensure fairness:**

1. **Diverse patient demographics:** Scenarios include varied ages, genders, races, socioeconomic backgrounds
2. **No cultural bias:** Scenarios reviewed by diverse nursing panel to eliminate cultural assumptions
3. **Multiple pathways:** Academy training, real-world performance, or peer validation can all demonstrate capability
4. **Accessibility:** Scenarios available in multiple formats (text, audio description) for nurses with disabilities
5. **Language support:** Clinical judgment assessed in nurse's primary language

**Bias Audit Results:**
- Last audit: October 2025
- Auditor: DAIR Institute (Dr. Timnit Gebru's team)
- Findings: No statistically significant demographic disparities detected (p > 0.05)
- Remediation: N/A (passed audit)

---

## HOW TO IMPROVE THIS CAPABILITY

**Recommended path to mastery:**

1. **Academy Training:**
   - Complete "Clinical Triage Foundations" module (4 hours)
   - Complete "Escalation Decision-Making" module (3 hours)
   - Pass 20 simulated patient scenarios (varies by performance)

2. **Practice Tasks:**
   - Start with low-acuity triage tasks in Workforce Cloud
   - Progress to urgent/emergency scenarios as weight improves
   - Request peer mentorship from high-weight RNs

3. **External Resources:**
   - AACN Clinical Judgment Model (free online)
   - Tanner's Clinical Judgment Model (research paper)
   - State nursing board continuing education

**Estimated Time to Proficiency:** 60-90 days (depends on prior experience)

**Current Supply/Demand:**
- Humans with this capability: 1,247 (weight >0.70)
- Enterprise demand: **HIGH** (healthcare triage is top-requested)
- Earning potential: $45-75/hour for high-weight capability

---

## REVISION HISTORY

| Version | Date | Changes | Reviewer |
|---------|------|---------|----------|
| 1.0 | June 2025 | Initial capability card | Internal Trust & Safety |
| 1.1 | August 2025 | Added peer validation requirement | DAIR Institute |
| 1.2 | October 2025 | Passed third-party bias audit | DAIR Institute |

---

**Questions or Concerns?**  
Contact: trust-and-safety@human.xyz  
Appeal Process: https://human.xyz/appeal
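The weight calculations declared in these cards (for clinical judgment: 15% Academy, 60% workforce, 15% peer, 10% attestation) suggest a normalized evidence blend, sketched here with illustrative names:

```typescript
// Blend per-source evidence scores (0.0-1.0) into a single capability
// weight using a card's declared percentages. Names are illustrative.
type EvidenceSource = 'academy' | 'workforce' | 'peer' | 'attestation';

function blendWeight(
  scores: Partial<Record<EvidenceSource, number>>,
  mix: Record<EvidenceSource, number> // e.g. { academy: 0.15, workforce: 0.6, ... }
): number {
  let total = 0;
  let mixSum = 0;
  for (const source of Object.keys(mix) as EvidenceSource[]) {
    const score = scores[source];
    if (score !== undefined) {
      total += score * mix[source];
      mixSum += mix[source];
    }
  }
  // Renormalize over present sources so a missing source (e.g. no
  // external attestation yet) doesn't silently drag the weight down.
  return mixSum > 0 ? total / mixSum : 0;
}
```

With the clinical-judgment mix, a nurse scoring 0.7 (Academy), 0.9 (workforce), 0.8 (peer), and 1.0 (attestation) blends to 0.865. The renormalization is a design choice of this sketch, not a documented behavior of the engine.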

Implementation Architecture

Storage:

interface CapabilityCard {
  capabilityId: string;
  version: string;
  lastReviewed: Date;
  reviewedBy: string;
  
  // Assessment method
  assessmentMethod: {
    trainingSources: string[];
    workforceSources: string[];
    peerValidation: object;
    externalAttestation?: object;
    weightCalculation: Record<string, number>;
    diversityRequirements: object;
  };
  
  // What's measured
  measuredAttributes: {
    name: string;
    description: string;
    howMeasured: string;
  }[];
  
  exampleTasks: string[];
  
  // What's NOT measured (critical for fairness)
  excludedAttributes: {
    attribute: string;
    rationale: string;
  }[];
  
  // Fairness
  fairnessConsiderations: {
    diverseTesting: string;
    multiplePathways: string;
    biasAuditing: string;
    accessibility: string;
    appealProcess: string;
  };
  
  biasAuditResults: {
    lastAudit: Date;
    auditor: string;
    findings: string;
    remediation?: string;
  };
  
  // Improvement path
  improvementPath: {
    academyModules: string[];
    practiceTaskTypes: string[];
    peerLearning: string[];
    externalResources: string[];
    estimatedTimeToMastery: string;
  };
  
  supplyDemand: {
    humanCount: number;
    demandLevel: 'high' | 'medium' | 'low';
    earningPotential: string;
  };
  
  // History
  revisionHistory: {
    version: string;
    date: Date;
    changes: string;
    reviewer: string;
  }[];
}

Access:

  • Public: All Capability Cards publicly accessible (transparency)
  • Web UI: Browse capability cards at https://human.xyz/capabilities/[capability-id]
  • API: GET /api/v1/capabilities/{capabilityId}/card
  • In-app: View card from Capability Graph UI
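A client-side sketch of calling the card endpoint. Only the path shape is documented above; the base URL parameter and function names are assumptions:

```typescript
// Build the documented card path; the base URL is caller-supplied.
function capabilityCardUrl(baseUrl: string, capabilityId: string): string {
  return `${baseUrl}/api/v1/capabilities/${encodeURIComponent(capabilityId)}/card`;
}

// Minimal fetch wrapper; error handling kept intentionally simple.
async function fetchCapabilityCard(
  baseUrl: string,
  capabilityId: string
): Promise<unknown> {
  const res = await fetch(capabilityCardUrl(baseUrl, capabilityId));
  if (!res.ok) {
    throw new Error(`Card fetch failed: ${res.status}`);
  }
  return res.json();
}
```

Note that capability IDs like `cap:clinical-judgment-nursing` contain a colon, so percent-encoding the path segment matters.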

Use Cases

1. Human transparency: "Why is my 'Python Programming' weight only 0.65?" → Check Capability Card → See assessment method → Understand need for more diverse tasks

2. Enterprise audit: "How do you assess 'Clinical Judgment'? Show me." → Share Capability Card → Enterprise reviews methodology → Builds trust

3. Regulatory compliance: EU AI Act requires explainability for automated decision systems → Capability Cards satisfy transparency requirements

4. Bias detection: Third-party auditors review Capability Cards for fairness → DAIR Institute audits quarterly → Findings published

5. Improvement guidance: "How do I get better at 'Financial Analysis'?" → Capability Card shows recommended training path

Regulatory Alignment

EU AI Act (High-Risk AI Systems):

  • ✅ Transparency requirement (Capability Cards provide it)
  • ✅ Human oversight (peer review in assessment)
  • ✅ Documentation (revision history, audit trail)
  • ✅ Bias mitigation (explicit fairness section)

Timeline for EU AI Act prep: Month 12 (before Act enforcement)

Research Partnership Opportunity

Timnit Gebru (DAIR Institute) - Fairness & transparency advisor

Potential collaboration:

  • Review Capability Card template design
  • Conduct third-party bias audits (quarterly)
  • Co-author paper: "Capability Cards: Model Cards for Human Assessment Systems"
  • Advisory board position (Fairness & Equity Advisory tier)

Talking points:

  • "We're applying your Model Cards framework to human capability assessment"
  • "Every capability has a transparency document—no black boxes"
  • "Would love DAIR Institute to audit our fairness approach"

See: kb/86_academic_and_thought_leader_engagement_strategy.md - Timnit Gebru engagement plan

Implementation Timeline

| Phase | Deliverable | Timeline |
|-------|-------------|----------|
| v0.1 (Design) | Capability Card template design | Month 9 |
| v0.2 (Pilot) | First 10 Capability Cards (top capabilities) | Month 10 |
| v0.3 (Public) | All active capabilities have cards, web UI | Month 12 |
| v0.4 (Audit) | Third-party bias audit (DAIR Institute) | Month 15 |
| Ongoing | Quarterly updates, annual comprehensive audit | Quarterly |

Success Metrics

| Metric | Target | Measurement |
|--------|--------|-------------|
| Coverage | 100% of active capabilities | % with published cards |
| Transparency score | >4.5/5 | User survey: "Do you understand how you're assessed?" |
| Audit pass rate | 100% | % of capabilities passing bias audit |
| Appeal rate | <5% | % of assessments appealed (healthy = some appeals) |
| Improvement adoption | >60% | % of users who follow improvement path after viewing card |

COMPREHENSIVE BIAS MITIGATION ARCHITECTURE

Constitutional Mandate: "It is not for exclusion. It is for revelation." (Principle Two)

HUMAN addresses bias not as a compliance checkbox, but as a foundational architectural constraint. The entire Capability Graph is designed to make bias structurally impossible.

1. Architectural Exclusions (Cannot Be Measured)

The system is designed so these factors cannot influence capability assessment:

The engine NEVER evaluates:

  • Race - Not captured, not stored, not evaluated
  • Gender - Not a capability factor
  • Age - Never used as an assessment factor
  • Disability - Accommodations provided, not penalized
  • Geography - Location ≠ capability
  • Language - Multilingual assessment available
  • Educational pedigree - Where you learned ≠ what you can do
  • Employment history - Past employers don't determine capability
  • Socioeconomic status - No bias based on background
  • Years of experience - Not a proxy for capability
  • Speed - Quality over speed

Only demonstrated human capability.

These exclusions are hardcoded into Capability Cards (see above) and enforced through:

  • Schema design (fields don't exist)
  • Capability Card transparency (explicit exclusion lists)
  • Third-party audits (verify exclusions are enforced)
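A sketch of what "fields don't exist" means in practice: an illustrative evidence schema with no demographic fields, plus a defensive sanitizer for data arriving from external sources. All names here are assumptions:

```typescript
// The evidence schema simply has no demographic fields, so demographic
// data cannot flow into weight updates. Illustrative sketch.
interface CapabilityEvidence {
  capabilityId: string;
  evidenceType: 'work' | 'training' | 'credential' | 'peer_attestation';
  qualitySignal: number;   // 0.0-1.0 outcome signal
  context: string;         // used for task-diversity tracking
  timestamp: string;
  // No age, gender, race, nationality, pedigree, or employer fields:
  // the schema cannot represent them, so the engine cannot use them.
}

// Belt-and-suspenders: strip forbidden keys from untyped external input
// before it ever reaches the typed schema.
const FORBIDDEN_KEYS = ['age', 'gender', 'race', 'nationality', 'employer', 'education'] as const;

function sanitizeEvidence(raw: Record<string, unknown>): Record<string, unknown> {
  const clean: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(raw)) {
    if (!(FORBIDDEN_KEYS as readonly string[]).includes(key)) {
      clean[key] = value;
    }
  }
  return clean;
}
```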

2. Anti-Exclusion Design Principles

No Scoring or Ranking:

  • Capabilities are weights (0.0-1.0), not scores
  • No leaderboards or comparative rankings
  • No "top performer" lists
  • Revelation of ability, not judgment

Multiple Pathways to Demonstrate Capability:

  • Academy training (structured learning)
  • Workforce performance (real-world tasks)
  • Peer validation (community review)
  • External attestation (credentials, licenses)
  • No single narrow test that favors certain demographics

Equal Opportunity to Demonstrate:

  • Capabilities assessed across diverse contexts
  • Cultural bias testing in scenario design
  • Accessibility accommodations (visual, audio, cognitive)
  • Multilingual assessment support
  • No time-based pressure that disadvantages certain groups

3. Transparency & Auditability

Capability Cards (Model Cards for Human Assessment):

Every capability has a public transparency document showing:

  • Assessment method - How it's measured
  • What IS measured - Explicit attributes
  • What is NOT measured - Explicit exclusions (prevents hidden bias)
  • Fairness considerations - How fairness is ensured
  • Bias audit results - Third-party audit findings (published quarterly)
  • Appeal process - How to dispute assessment

Example: See "Capability Card: Clinical Judgment (Nursing)" above for complete template.

Public Accessibility:

  • All Capability Cards publicly browsable
  • API access: GET /api/v1/capabilities/{capabilityId}/card
  • Web UI: https://human.xyz/capabilities/[capability-id]

4. Third-Party Bias Audits

Research Partnership: DAIR Institute (Dr. Timnit Gebru)

HUMAN commits to quarterly fairness audits by external AI ethics researchers:

Audit Scope:

  • Demographic disparity analysis (statistical testing)
  • Hidden proxy detection (correlations with protected attributes)
  • Assessment methodology review (fairness of design)
  • Capability Card accuracy verification
  • Remediation recommendations
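As a sketch of the statistical testing step, a two-proportion z-test over attainment rates is one minimal form such an audit could take. This assumes auditors hold opt-in demographic labels collected solely for auditing, since the engine itself never stores them; all names are illustrative:

```typescript
// Two-proportion z-test: are attainment rates for a capability
// significantly different between two audit groups?
function twoProportionZ(
  pass1: number, n1: number,   // group 1: attained / total
  pass2: number, n2: number    // group 2: attained / total
): number {
  const p1 = pass1 / n1;
  const p2 = pass2 / n2;
  const pooled = (pass1 + pass2) / (n1 + n2);
  const se = Math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2));
  return se === 0 ? 0 : (p1 - p2) / se;
}

// |z| > 1.96 corresponds to p < 0.05 (two-sided), the threshold
// referenced in the bias audit results above.
function disparityDetected(z: number): boolean {
  return Math.abs(z) > 1.96;
}
```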

Audit Results:

  • Published on each Capability Card
  • Findings shared publicly (transparency)
  • Remediation actions tracked and verified
  • Appeals process informed by audit findings

Timeline:

  • Month 15: First comprehensive DAIR Institute audit
  • Quarterly: Ongoing bias audits
  • Annual: Comprehensive fairness certification

See: kb/86_academic_and_thought_leader_engagement_strategy.md - DAIR Institute engagement plan

5. Capability-First Routing (Prevents Discrimination)

HumanOS routing follows Principle Twelve: Capability-First, Cost-Informed:

1. Filter to resources that CAN do the work (capability threshold)
2. Among capable resources, consider cost
3. Never route purely on cost (prevents "race to the bottom")
4. Log every routing decision (auditability)
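The four steps above can be sketched as a single routing function. All type and function names here are illustrative assumptions, not the HumanOS API; the real engine works over the full Capability Graph rather than a flat candidate list:

```typescript
// Capability-first, cost-informed routing sketch.
interface Candidate {
  id: string;
  capabilityWeight: number;  // weight for the required capability
  hourlyCost: number;
}

interface RoutingDecision {
  chosen: Candidate;
  consideredIds: string[];   // alternatives, for auditability
  reason: string;            // logged with the decision
}

function routeCapabilityFirst(
  candidates: Candidate[],
  minWeight: number
): RoutingDecision | null {
  // 1. Filter to resources that CAN do the work (capability threshold)
  const capable = candidates.filter(c => c.capabilityWeight >= minWeight);
  if (capable.length === 0) return null;  // escalate; never lower the bar
  // 2-3. Among capable resources only, consider cost (never cost alone)
  const chosen = capable.reduce((a, b) => (b.hourlyCost < a.hourlyCost ? b : a));
  // 4. Return a record suitable for provenance logging
  return {
    chosen,
    consideredIds: capable.map(c => c.id),
    reason: `capability >= ${minWeight}; lowest cost among capable`,
  };
}
```

When no candidate clears the threshold the function returns null rather than relaxing the bar: the correct response is escalation, not routing to the cheapest near-miss.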

This prevents:

  • Cost-driven discrimination (choosing cheapest worker regardless of fit)
  • Bias toward certain demographics (routing based on proxies)
  • Exploitative wage pressure (capability ensures fair matching)

Every routing decision is cryptographically logged with full provenance:

  • Why was this person chosen?
  • What capability weights influenced the decision?
  • Were there alternative matches? Why weren't they chosen?

See: kb/13_foundational_principles.md - Principle Twelve (Capability-First Routing)
See: kb/35_capability_routing_pattern.md - Complete routing architecture

6. Academy as Equity Infrastructure

Zero barriers to capability development (Principle Six):

  • Free forever for individuals - No socioeconomic gatekeeping
  • Multimodal learning - Text, voice, visual (accommodates learning styles)
  • Multilingual - Reaches global populations
  • Directly connected to paid work - Training → capability → income
  • No credential requirements - Demonstrated ability > pedigree

Academy prevents bias amplification:

  • Displaced workers (those most harmed by AI) get free training
  • No "wealthy-only re-skilling" that would worsen inequality
  • Training quality is identical regardless of background
  • Performance-based assessment (not demographic proxies)

Research Foundation (Timnit Gebru): "AI displacement will hit marginalized communities hardest. HUMAN's Academy is anti-exclusion infrastructure—free training that breaks the cycle."

See: kb/24_human_academy.md - Complete Academy architecture

7. The Rule of Threes: Ethical Constraint Layer

Every feature must satisfy:

  1. Good for the human - Does this increase agency and opportunity?
  2. Good for HUMAN - Is this sustainable and aligned with our mission?
  3. Good for humankind - Does this benefit society or create harm?

Any feature that creates bias fails this test and is architecturally blocked.

Examples:

  • ❌ Using years of experience as capability proxy → Fails human (disadvantages career-changers) + humankind (amplifies age bias)
  • ❌ Charging for re-skilling training → Fails human (barriers for displaced) + humankind (worsens inequality)
  • ❌ Ranking humans competitively → Fails human (surveillance culture) + humankind (exclusion tool)

The Rule of Threes is the immune system that prevents bias from being introduced even under investor or market pressure.

See: kb/13_foundational_principles.md - Principle Three (Rule of Threes)

8. Cryptographic Verifiability & Appeals

Every capability assessment is verifiable:

  • Provenance logging (who assessed, when, why)
  • Evidence pointers (what demonstrations contributed)
  • Audit trails (full history of capability evolution)
  • Appeal process (humans can contest assessments)

Appeals Process:

  1. Human submits appeal with rationale
  2. Trust & Safety reviews evidence
  3. Peer validators (qualified humans) re-evaluate
  4. Decision published with explanation
  5. If assessment was wrong, remediation applied retroactively
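The five-step lifecycle above could be tracked with a record like the following; the types and field names are illustrative, not the production schema:

```typescript
// Illustrative appeal record mirroring the five-step process.
type AppealStatus =
  | 'submitted'          // 1. human submits appeal with rationale
  | 'under_review'       // 2. Trust & Safety reviews evidence
  | 'peer_reevaluation'  // 3. qualified peers re-evaluate
  | 'decided';           // 4. decision published

interface Appeal {
  appealId: string;
  capabilityId: string;
  rationale: string;
  status: AppealStatus;
  decision?: {
    upheld: boolean;       // was the original assessment correct?
    explanation: string;   // published with the decision
    remediation?: string;  // 5. retroactive remediation if wrong
  };
}

// Status transitions return a new record, keeping history immutable.
function advance(appeal: Appeal, next: AppealStatus): Appeal {
  return { ...appeal, status: next };
}
```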

This prevents:

  • Black-box algorithmic discrimination
  • Hidden bias in automated systems
  • Inability to contest unfair treatment

Regulatory Alignment:

  • EU AI Act - High-risk AI systems require explainability (Capability Cards provide it)
  • GDPR - Right to explanation (full provenance available)
  • Algorithmic accountability laws - Third-party audits satisfy requirements

Why This Architecture Works

HUMAN doesn't "address bias"—the system is designed to make bias structurally impossible:

| Bias Risk | Traditional Systems | HUMAN's Architecture |
|-----------|---------------------|----------------------|
| Hidden proxies | Demographic data influences decisions | Proxies architecturally excluded (can't be measured) |
| Black-box scoring | Opaque algorithms, no explanation | Capability Cards explain every assessment |
| Gaming/manipulation | Resume inflation, credential fraud | Multi-source evidence, cryptographic verification |
| Lack of accountability | No audit trail, no appeals | Full provenance logging, appeals process |
| Exclusionary design | Single narrow tests favor certain groups | Multiple pathways, diverse contexts |
| Socioeconomic barriers | Expensive training gates opportunity | Free Academy, performance-based assessment |
| Cost-driven discrimination | Cheapest resource wins | Capability-first routing, cost-informed |

Research Foundation

Timnit Gebru (DAIR Institute) - Fairness & Bias Research:

"AI systems perpetuate and amplify existing biases. We need structural fairness, not just algorithmic tweaks."

HUMAN's response to Gebru's research:

  • Capability Graph focuses on demonstrated ability, not proxies (education, pedigree)
  • Capability Cards apply Gebru's Model Cards framework to human assessment
  • Regular third-party audits (DAIR Institute partnership)
  • Academy provides equity infrastructure (free training for displaced workers)

See: kb/85_strategic_frameworks_and_research_foundation.md - Complete Gebru framework analysis

Summary: Bias Mitigation is Constitutional

Bias prevention isn't a feature—it's a constitutional principle (Principle Two: Anti-Exclusion).

The entire architecture enforces:

  1. Explicit exclusions → Demographics can't be measured
  2. Transparency → Every assessment is explainable
  3. Third-party audits → External verification (DAIR Institute)
  4. Multiple pathways → No single narrow test
  5. Free training → No socioeconomic gatekeeping
  6. Capability-first routing → No cost-driven discrimination
  7. Cryptographic logging → Every decision is auditable
  8. Constitutional constraints → Rule of Threes blocks harmful features

Investors cannot override it.
Founders cannot compromise it.
Market pressure cannot erode it.
AI cannot bypass it.

This is what makes HUMAN trustworthy.


HUMAN-ONLY GUARANTEES

The CG guarantees:

  • You can always see your capability graph.
  • You can always edit/remove sensitive nodes.
  • You can always revoke access instantly.
  • You can lock your graph with biometric consent.
  • You can delete your graph and take it with you (an enterprise cannot do either).

The Capability Graph is a human possession, not an enterprise asset.


HOW HUMANOS USES THE GRAPH

HumanOS uses the graph to:

  • route work safely
  • escalate at the right time
  • prevent overwhelm
  • match tasks to capability
  • identify mentorship opportunities
  • guide AI boundaries

Example:
If a person shows high "ambiguity triage," HumanOS may route early-warning tasks to them.
If they show strong "ethical escalation," HumanOS uses them for high-stakes checks.

Never exploitative.
Always protective.


WHY THE CAPABILITY GRAPH WINS

Because it achieves what no one else is even attempting:

  • Representing human capability with dignity and truth
  • Protecting people from algorithmic judgment
  • Enabling AI to recognize when it needs a human
  • Giving enterprises safe, verifiable human oversight
  • Lifting people with gaps, nonlinear histories, or invisible experience
  • Making growth continuous, guided, and owned by the person

The CG is the missing counterpart to AI models.

AI has weights.
Humans need capability graphs.

This is the match.


EDGE CACHING & LOCAL CAPABILITY DATA

See: 49_devops_and_infrastructure_model.md for complete edge/device architecture.

Where Capability Data Lives

| Data Type | Location | Update Frequency |
|-----------|----------|------------------|
| Full capability graph | User's Passport Vault (device) | Real-time |
| Capability profile summary | Edge cache | 1-5 minute TTL |
| Capability proofs | Device (user-generated ZK proofs) | On-demand |
| Public attestations | Edge + Ledger | On attestation |
| Matching indices | Regional cloud | Batch updated |

Edge Caching Strategy

// Capability profile caching at edge

const CAPABILITY_CACHE_POLICY = {
  // Public profile (safe to cache)
  profileSummary: {
    ttl: "5m",
    staleWhileRevalidate: "1h",
    key: (did) => `capability:profile:${did}`
  },
  
  // Capability existence check (for routing)
  capabilityCheck: {
    ttl: "1m",
    key: (did, capability) => `capability:check:${did}:${capability}`
  },
  
  // Attestation verification (stable)
  attestation: {
    ttl: "24h",  // Attestations don't change
    key: (attestationId) => `capability:attestation:${attestationId}`
  }
};
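As an illustration of honoring these TTLs, a minimal in-memory cache might look like the following. This is a sketch, not the edge runtime's actual API:

```typescript
// Minimal TTL cache sketch for edge capability lookups.
class TtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();

  constructor(private ttlMs: number) {}

  // `now` is injectable for testing; defaults to wall-clock time.
  get(key: string, now: number = Date.now()): V | undefined {
    const entry = this.entries.get(key);
    if (!entry || entry.expiresAt <= now) return undefined;  // expired or missing
    return entry.value;
  }

  set(key: string, value: V, now: number = Date.now()): void {
    this.entries.set(key, { value, expiresAt: now + this.ttlMs });
  }
}

// 1-minute TTL, matching the capabilityCheck policy above
const checkCache = new TtlCache<boolean>(60_000);
```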

On-Device Capability Access

Users can access and prove capabilities without cloud:

// On-device capability proof generation

class LocalCapabilityProver {
  async proveCapability(
    capability: string,
    verifier: DID,
    minLevel: number
  ): Promise<ZKProof> {
    // 1. Read capability from local vault
    const localGraph = await this.vault.getCapabilityGraph();
    const capNode = localGraph.find(capability);
    
    // 2. Generate ZK proof on-device
    const proof = await zkSnark.prove({
      statement: `I have ${capability} at level >= ${minLevel}`,
      witness: capNode,
      publicInputs: [verifier, capability, minLevel]
    });
    
    // No cloud call needed — proof is self-contained
    return proof;
  }
}

Offline Capability Queries

HumanOS can make routing decisions offline using cached capability data:

// Offline capability matching

async function matchLocalCapabilities(task: Task): Promise<MatchResult> {
  // 1. Get local capability cache
  const localCache = await device.getCapabilityCache();
  
  // 2. Check if user can handle task
  const requiredCapabilities = extractRequirements(task);
  const matches = requiredCapabilities.every(cap => 
    localCache.has(cap.id) && localCache.get(cap.id).level >= cap.minLevel
  );
  
  // 3. Return match (will sync provenance when online)
  return {
    matched: matches,
    offline: true,
    syncRequired: true
  };
}

Result: Capability verification and proof generation work entirely on-device. Cloud is only needed for:

  • Complex multi-user matching (Workforce Cloud)
  • Global capability index updates
  • Cross-user attestation verification

CAPABILITY GRAPHS AS UNIVERSAL ABSTRACTION

The Capability Graph is not just for humans.

The same abstraction that tracks human skills, credentials, and evidence must apply to ALL resources in the HUMAN ecosystem:

AI Models Have Capability Graphs

Every AI model has a capability profile:

interface AIModelCapabilityProfile {
  provider: 'anthropic' | 'openai' | 'deepseek' | 'google' | 'local';
  model: string;                     // 'claude-sonnet-4', 'gpt-4o'
  mode: ModelMode;                   // streaming, batch, extended_thinking
  
  // Same capability node structure as humans!
  capabilities: {
    reasoning: CapabilityScore;      // Chain-of-thought, analysis
    coding: CapabilityScore;         // Code generation, debugging
    synthesis: CapabilityScore;      // Combining sources, creativity
    factualRecall: CapabilityScore;  // Knowledge retrieval
    instruction: CapabilityScore;    // Following complex instructions
    safety: CapabilityScore;         // Refusal appropriateness
    speed: CapabilityScore;          // Latency characteristics
  };
  
  // Constraints
  contextWindow: number;
  outputLimit: number;
  supportedModalities: string[];     // text, code, image, audio
  
  // Known issues (critical for routing!)
  knownWeaknesses: string[];         // "hallucinates dates", "poor at math"
  avoidFor: string[];                // Task types to avoid
  
  // Cost - real dollars, not tokens
  pricing: {
    inputPerMillionTokens: number;
    outputPerMillionTokens: number;
    perMinute?: number;              // For realtime voice
  };
}

Agents Have Capability Graphs

Every agent in the HUMAN ecosystem has:

  • Tools — What tools does this agent have access to?
  • Domains — What knowledge domains is it trained for?
  • Permissions — What actions can it take?
  • Trust Level — Based on track record and verification

Services Have Capability Graphs

External services also have queryable profiles:

  • SLAs — What latency and availability guarantees?
  • Failure Modes — What are known failure patterns?
  • Capacity — What throughput can it handle?
  • Cost — What does it cost per operation?
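Extending the pattern of `AIModelCapabilityProfile` above, agent and service profiles could be sketched with the same structure. The field names here are assumptions, not canonical schema:

```typescript
// Illustrative profiles mirroring AIModelCapabilityProfile.
interface AgentCapabilityProfile {
  agentId: string;
  tools: string[];         // tools the agent can invoke
  domains: string[];       // knowledge domains it is trained for
  permissions: string[];   // actions it is allowed to take
  trustLevel: 'untrusted' | 'provisional' | 'trusted';  // from track record
}

interface ServiceCapabilityProfile {
  serviceId: string;
  sla: { p99LatencyMs: number; availability: number };  // e.g. 0.999
  knownFailureModes: string[];    // known failure patterns
  maxThroughputPerSecond: number; // capacity
  costPerOperation: number;       // dollars, not tokens
}
```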

The Universal Query Interface

The same query interface works for ALL resource types:

interface CapabilityQuery {
  requiredCapabilities: string[];    // What capabilities are needed
  minConfidence?: number;            // Minimum capability score
  constraints?: {
    maxCost?: number;                // Dollar budget
    maxLatency?: number;             // Time budget
    safetyRequirements?: string[];   // Non-negotiable guardrails
  };
}

// Works for humans, AI models, agents, and services
function findMatchingResources(
  query: CapabilityQuery,
  resourcePool: Resource[]
): Resource[] {
  return resourcePool.filter(r => 
    meetsCapabilityRequirements(r.capabilities, query.requiredCapabilities) &&
    meetsConstraints(r, query.constraints)
  );
}

Why This Unification Matters

  1. Single routing primitive — The Universal Routing Primitive uses capability graphs for all resource types
  2. Consistent evaluation — Same logic for matching humans to tasks and models to queries
  3. Explainable routing — "We chose Claude because it scored higher on reasoning"
  4. Learning loop — Quality feedback updates capability profiles for all resources

See: 35_capability_routing_pattern.md — The capability-first routing pattern


ANTI-SOCIAL-CREDIT TECHNICAL GUARANTEES

Critical Design Commitment:

The Capability Graph is designed to REVEAL human capability, not RANK humans. This is not aspirational—it's enforced through technical design and governance.

The Risk We're Defending Against

Social credit systems:

  • Rank and sort humans
  • Create hierarchies and scores
  • Enable discrimination and gatekeeping
  • Are owned/controlled by central authorities
  • Lack individual consent and control

The Capability Graph is fundamentally different:

  • Evidence-based, not score-based
  • Opt-in and consent-driven
  • Human-owned and portable
  • Context-specific, not universal ranking
  • Designed for revelation, not exclusion

Technical Safeguards (Implemented)

1. No Global Leaderboards (Ever)

Rule: The system will NEVER display:

  • "Top 100 nurses"
  • "Best JavaScript developers"
  • "Highest-rated reviewers"
  • Any global ranking or comparative scoring

Technical enforcement:

  • No API endpoints that return ranked lists of humans
  • UI explicitly prohibits comparative displays
  • Analytics aggregated only (no individual rankings)

Exception: Resource routing (internal to HumanOS) uses capability matching for task assignment, but this is NEVER exposed as a ranking.

// ❌ FORBIDDEN
function getTopDevelopers(count: number): Developer[] {
  // This will never exist
}

// ✅ ALLOWED
function findQualifiedDevelopers(
  requiredCapabilities: string[],
  minConfidence: number
): Developer[] {
  // Returns qualified candidates, not ranked list
  // Order is NOT significant
}

2. No Numeric Scores Visible to Third Parties

Rule: Capability weights (0.0-1.0) are internal system values, NEVER displayed to:

  • Other humans
  • Enterprises
  • Third-party services
  • Anyone except the capability owner

What IS visible:

  • "Has demonstrated capability in X"
  • Evidence pointers (credentials, work history, attestations)
  • Context-specific qualifications
  • Confidence intervals (e.g., "High confidence in medical triage")

What is NOT visible:

  • "0.87 capability score"
  • Numeric comparisons between people
  • Percentile rankings

// ❌ FORBIDDEN - Exposing numeric scores
interface PublicCapabilityView {
  name: string;
  score: number;  // NO
}

// ✅ ALLOWED - Evidence-based disclosure
interface PublicCapabilityView {
  name: string;
  evidencePointers: Evidence[];
  confidenceLevel: 'low' | 'medium' | 'high';
  lastDemonstrated: Date;
}
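
One way to keep numeric weights internal while still disclosing something useful is to bucket them at the API boundary. A sketch; the threshold values are placeholders for illustration, not protocol constants:

```typescript
type ConfidenceLevel = 'low' | 'medium' | 'high';

// Internal weights never cross the disclosure boundary; only the bucket does.
// The 0.75 / 0.4 cut points are illustrative, not canonical values.
function toConfidenceLevel(internalWeight: number): ConfidenceLevel {
  if (internalWeight >= 0.75) return 'high';
  if (internalWeight >= 0.4) return 'medium';
  return 'low';
}
```

Bucketing is deliberately lossy: a third party learns "high confidence in medical triage" but cannot reconstruct "0.87" or compare two people numerically.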

3. Capability Assertions Require Evidence Pointers

Rule: Every capability claim must link to verifiable evidence:

  • Work completed (provenance logs)
  • Training completed (Academy records)
  • Credentials earned (external attestations)
  • Peer attestations (signed by other Passport holders)

No unsupported claims:

  • Can't just say "I'm good at X"
  • Must point to WHERE and WHEN capability was demonstrated
  • Evidence is cryptographically signed

interface CapabilityAssertion {
  capability: string;
  evidence: Evidence[];  // REQUIRED, not optional
  attestedBy: PassportDID;
  timestamp: Date;
  signature: string;
}

interface Evidence {
  type: 'work' | 'training' | 'credential' | 'peer_attestation';
  source: string;  // Verifiable pointer
  date: Date;
  signature: string;
}
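
A write-time guard can enforce the evidence requirement. This sketch checks only that evidence is non-empty and that each pointer has a source; signature verification against `attestedBy` is elided, and the loosened parameter shape stands in for the CapabilityAssertion interface above:

```typescript
// Reject any assertion that arrives without at least one evidence pointer.
// A real implementation would also verify `signature` against `attestedBy`
// before accepting the write.
function validateAssertion(a: {
  capability: string;
  evidence: { type: string; source: string }[];
}): string[] {
  const errors: string[] = [];
  if (!a.evidence || a.evidence.length === 0) {
    errors.push('capability assertion requires at least one evidence pointer');
  }
  for (const e of a.evidence ?? []) {
    if (!e.source) {
      errors.push(`evidence of type "${e.type}" is missing a verifiable source`);
    }
  }
  return errors;
}
```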

4. Regular Bias Audits

Commitment: Independent audits for:

  • Algorithmic bias in capability inference
  • Demographic disparities in capability weights
  • Access patterns (who gets opportunities?)
  • Evidence collection fairness

Conducted by:

  • External ethics advisory board members
  • Third-party algorithmic fairness researchers
  • Scheduled quarterly (minimum)

Audit findings:

  • Published publicly (aggregated, anonymized)
  • Action plan for any identified bias
  • System updates tracked and documented

Enforcement:

  • Ethics board has authority to flag concerns
  • HUMAN Labs must respond within 30 days
  • Foundation (post-transition) has veto power on standards
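
One concrete audit check for demographic disparities can be sketched as follows. The aggregation is deliberately group-level only (no individual rankings), and both the grouping field and the gap threshold are illustrative assumptions, not canonical audit parameters:

```typescript
// Aggregate-only disparity check: the audit sees group means, never
// individual weights or rankings. The 0.1 gap threshold is a placeholder.
function flagWeightDisparity(
  samples: { group: string; weight: number }[],
  threshold = 0.1
): string[] {
  const sums = new Map<string, { total: number; n: number }>();
  for (const s of samples) {
    const acc = sums.get(s.group) ?? { total: 0, n: 0 };
    acc.total += s.weight;
    acc.n += 1;
    sums.set(s.group, acc);
  }
  const means = Array.from(sums, ([g, a]) => ({ g, mean: a.total / a.n }));
  const flags: string[] = [];
  for (let i = 0; i < means.length; i++) {
    for (let j = i + 1; j < means.length; j++) {
      if (Math.abs(means[i].mean - means[j].mean) > threshold) {
        flags.push(
          `mean-weight gap between groups "${means[i].g}" and "${means[j].g}" exceeds ${threshold}`
        );
      }
    }
  }
  return flags;
}
```

A flag is a trigger for investigation (the 30-day response obligation above), not proof of bias: a gap can also reflect uneven evidence collection, which is itself one of the audit dimensions.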

5. Public Commitment Document

"The Capability Truth Commitment" (Published at launch)

HUMAN commits to:

  1. Never ranking humans globally
  2. Never exposing numeric capability scores to third parties
  3. Requiring evidence for all capability assertions
  4. Conducting regular bias audits
  5. Publishing audit findings publicly
  6. Maintaining human ownership of capability data
  7. Enabling opt-out and data deletion at any time
  8. Prohibiting capability data from being sold
  9. Enforcing these principles via governance structure
  10. Transitioning governance to independent Foundation

This is a binding commitment, not marketing.

Governance Safeguards

Phase 1: HUMAN Labs (Seed to Series A)

Who controls standards:

  • HUMAN Labs proposes capability standards
  • Ethics advisory board provides oversight
  • All decisions logged publicly
  • Community can review and provide feedback

Constraints:

  • No standards that enable ranking
  • No standards that expose numeric scores
  • All standards must include evidence requirements
  • Bias audit findings must be addressed

Phase 2: Governance Council (Series A to B)

Who controls standards:

  • Multi-stakeholder governance council
  • Labs proposes, council reviews/approves
  • Includes: enterprises, researchers, privacy advocates
  • Public comment period (30 days minimum)

Constraints:

  • Same as Phase 1, plus:
  • Multi-party consensus for breaking changes
  • Independent ethics board has veto power on discriminatory standards

Phase 3: Foundation (Series B+)

Who controls standards:

  • Independent Foundation
  • Separated from HUMAN Labs operations
  • Community governance with elected board
  • HUMAN Labs becomes one stakeholder among many

Constraints:

  • Constitutional commitment to anti-ranking principles
  • Foundation charter prohibits social credit uses
  • Governance structure prevents capture by any single party

User Rights (Always Enforced)

Every human with a Capability Graph has the right to:

  1. View their own data (always free)
  2. Export their data (portable format)
  3. Delete their data (right to be forgotten)
  4. Contest inaccuracies (dispute resolution process)
  5. Control visibility (who sees what capabilities)
  6. Opt out of inference (no ML-derived capabilities without consent)
  7. Understand decisions (why was I routed/not routed for a task?)
  8. Appeal routing decisions (human review available)

These are not features. These are rights.
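
Right 5 (control visibility) implies a per-capability policy evaluated on every read. The audience taxonomy and policy shape below are assumptions for illustration, not canonical types:

```typescript
type Audience = 'self' | 'trusted_peer' | 'enterprise' | 'public';

// Owner-controlled policy: each capability lists the audiences allowed
// to see it. Default-deny: a capability with no entry is owner-only.
type VisibilityPolicy = Record<string, Audience[]>;

function canView(
  policy: VisibilityPolicy,
  capability: string,
  viewer: Audience
): boolean {
  if (viewer === 'self') return true; // the owner always sees their own data
  return (policy[capability] ?? []).includes(viewer);
}
```

Default-deny is the important design choice: disclosure requires an explicit grant from the capability owner, which is what makes visibility a right rather than a setting the platform can flip.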

Why This Matters

The Capability Graph will be attacked.

Critics will say:

  • "This is just LinkedIn with crypto"
  • "You're building a social credit system"
  • "This enables discrimination"
  • "Who decides what capabilities matter?"

Our defense:

  1. Technical: We've designed anti-ranking into the system
  2. Governance: Independent oversight prevents misuse
  3. Rights: Users control their data and can opt out
  4. Transparency: Public audits, published findings, open governance
  5. Evidence: Not our word—external ethics board validates

This is not theoretical. This is operational.

Cross-References

  • See: 13_foundational_principles.md - Principle Two: "It is not for exclusion. It is for revelation."
  • See: 48_governance_model_and_constitutional_layer.md - Governance transition timeline
  • See: kb/internal/founder-decisions-2025-12-11.md - Decision #1: Social Credit Defense

Metadata

Source Sections:

  • Lines 32,241-32,593: SECTION 81 — The Capability Graph Engine v0.1

Merge Strategy: Extract directly (single comprehensive spec)

Strategic Purposes:

  • Building (primary)
  • Companion
  • Product Vision

Cross-References:

  • See: 35_capability_routing_pattern.md - The capability-first routing pattern
  • See: 04_the_five_systems.md - Capability Graph overview
  • See: 05_the_human_protocol.md - Graph in the loop
  • See: 20_passport_identity_layer.md - Identity integration
  • See: 22_humanos_orchestration_core.md - How HumanOS uses the graph
  • See: 24_human_academy.md - How Academy feeds the graph
  • See: 25_workforce_cloud.md - How work updates the graph
  • See: 49_devops_and_infrastructure_model.md - Edge caching for capability profiles
  • See: 65_cost_controls_and_ai_optimization.md - LLM routing using capability profiles
  • See: 97_api_specification_capability_graph.md - Capability Graph API

Line Count: ~355 lines
Extracted: November 24, 2025
Version: 2.0 (Complete Reorganization)