21. THE CAPABILITY GRAPH ENGINE v0.1
Technical Implementation Specification
The Capability Graph (CG) is not a model, a score, or a database.
It is the first living representation of human capability, built from:
- real actions
- real evidence
- real decisions
- real collaboration with AI
- and real demonstrations of judgment
CAPABILITY ENGINE (PROTOCOL-LEVEL): CG ENGINE + HUMANOS CRE
In HUMAN canon, the protocol capability engine is the joint behavior of:
- Capability Graph Engine: ingests governed events and updates capability (humans, agents, models) from evidence
- HumanOS Capability Resolution Engine (CRE): uses the Capability Graph + task metadata + risk/policy constraints to route work and decide escalation
Academy is one (high-quality, guided) source of capability evidence. Production work (HumanOS logs), Workforce Cloud execution history, and external attestations are also first-class evidence sources.
How this shows up in Passport: the Capability Graph lives in the actor’s vault; the Passport exposes capability evolution through pointers (CapabilityGraphRoot) and proof references (LedgerRefs) rather than embedding the full graph directly. See: 20_passport_identity_layer.md → “Passport Growth”.
MVP: CAPABILITY-LITE (Foundation Phase, Week 1-2)
See: 15_protocol_foundation_and_build_sequence.md for the canonical build sequence.
Before building the full Capability Graph specification below, we build Capability-Lite — the minimum viable capability tracking that enables real "capability-weighted routing" from Day 1.
What Capability-Lite Includes
| Component | Foundation (Week 1-2) | Full Spec (Wave 2+) |
|---|---|---|
| Nodes | Simple capability strings with weights | Full semantic ontology, taxonomies |
| Weights | Manual weights (0.0-1.0) | ML-derived, evidence-weighted |
| Evidence | Task completion logs | Multi-source (credentials, work, peer) |
| Updates | Manual + basic rules | Continuous learning, time decay |
| Queries | "Does user X have capability Y?" | Semantic similarity, inference |
| Proofs | Signed capability assertions | ZK proofs, selective disclosure |
Capability-Lite Implementation
// Capability-Lite: Minimum viable capability tracking
interface CapabilityNode {
id: string;
passportDid: string; // Owner's Passport DID
name: string; // e.g., "ai_safety_evaluation", "rlhf_review"
weight: number; // 0.0 to 1.0
evidenceCount: number; // Number of supporting events
lastUpdated: Date;
}
interface CapabilityEvidence {
  id: string;
  capabilityId: string;
  type: 'task_completion' | 'training' | 'manual';
  description: string;
  metadata: Record<string, unknown>; // Structured context (task ID, scores, etc.)
  weightImpact: number;              // Delta applied to the capability weight
  timestamp: Date;
  signedBy: string;                  // DID of attestor
  signature: string;                 // Attestor's signature over this record
}
// Core operations
async function getCapabilities(passportDid: string): Promise<CapabilityNode[]>;
async function hasCapability(passportDid: string, name: string, minWeight: number): Promise<boolean>;
async function updateCapability(passportDid: string, name: string, delta: number, evidence: CapabilityEvidence): Promise<void>;
async function findQualifiedUsers(capability: string, minWeight: number): Promise<string[]>;
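As a sanity check, the four core operations can be sketched against a simple in-memory store. This is illustrative only; a real implementation backs these operations with the PostgreSQL schema that follows, and the async/Promise signatures are dropped here for brevity.

```typescript
// Illustrative in-memory sketch of the Capability-Lite core operations.
type Node = { passportDid: string; name: string; weight: number; evidenceCount: number };

const store = new Map<string, Node>(); // key: `${passportDid}:${name}`

function updateCapability(passportDid: string, name: string, delta: number): void {
  const key = `${passportDid}:${name}`;
  const node = store.get(key) ?? { passportDid, name, weight: 0, evidenceCount: 0 };
  node.weight = Math.min(1, Math.max(0, node.weight + delta)); // clamp to [0.0, 1.0]
  node.evidenceCount += 1;
  store.set(key, node);
}

function hasCapability(passportDid: string, name: string, minWeight: number): boolean {
  const node = store.get(`${passportDid}:${name}`);
  return node !== undefined && node.weight >= minWeight;
}

function findQualifiedUsers(name: string, minWeight: number): string[] {
  // Mirrors the routing query: filter by capability and threshold, best first.
  return [...store.values()]
    .filter(n => n.name === name && n.weight >= minWeight)
    .sort((a, b) => b.weight - a.weight)
    .map(n => n.passportDid);
}

updateCapability('did:human:abc123', 'ai_safety_evaluation', 0.85);
```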
Capability-Lite Database Schema
PostgreSQL Implementation:
-- Capability Nodes
CREATE TABLE capability_nodes (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
passport_did TEXT NOT NULL,
name TEXT NOT NULL,
category TEXT NOT NULL,
weight NUMERIC(3,2) NOT NULL CHECK (weight >= 0 AND weight <= 1),
confidence_interval JSONB, -- {lower: 0.x, upper: 0.y}
evidence_count INT DEFAULT 0,
last_updated TIMESTAMPTZ NOT NULL DEFAULT NOW(),
created_at TIMESTAMPTZ DEFAULT NOW(),
deleted_at TIMESTAMPTZ,
UNIQUE(passport_did, name)
);
-- Evidence Records
CREATE TABLE capability_evidence (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
capability_node_id UUID REFERENCES capability_nodes(id) ON DELETE CASCADE,
passport_did TEXT NOT NULL,
evidence_type TEXT NOT NULL CHECK (evidence_type IN ('task_completion', 'training', 'credential', 'manual', 'peer_review')),
description TEXT,
metadata JSONB NOT NULL,
weight_impact NUMERIC(3,2),
signed_by TEXT NOT NULL,
signature TEXT NOT NULL,
recorded_at TIMESTAMPTZ DEFAULT NOW()
);
-- Critical indexes for routing queries
CREATE INDEX idx_capability_nodes_passport ON capability_nodes(passport_did)
WHERE deleted_at IS NULL;
CREATE INDEX idx_capability_nodes_name_weight ON capability_nodes(name, weight DESC)
WHERE deleted_at IS NULL;
CREATE INDEX idx_capability_nodes_category ON capability_nodes(category)
WHERE deleted_at IS NULL;
CREATE INDEX idx_capability_evidence_node ON capability_evidence(capability_node_id, recorded_at DESC);
CREATE INDEX idx_capability_evidence_metadata ON capability_evidence USING GIN(metadata);
CREATE INDEX idx_capability_evidence_passport ON capability_evidence(passport_did, recorded_at DESC);
Capability-Lite Query Patterns
Core Queries with Performance:
-- 1. Find users with specific capability above threshold (routing query)
SELECT passport_did, name, weight, confidence_interval
FROM capability_nodes
WHERE name = 'ai_safety_evaluation'
AND weight >= 0.8
AND deleted_at IS NULL
ORDER BY weight DESC
LIMIT 50;
-- Uses idx_capability_nodes_name_weight (Index Scan, <10ms at 100K rows)
-- 2. Get all capabilities for a user
SELECT id, name, category, weight, evidence_count, last_updated
FROM capability_nodes
WHERE passport_did = 'did:human:abc123'
AND deleted_at IS NULL
ORDER BY category, weight DESC;
-- Uses idx_capability_nodes_passport (Index Scan, <5ms)
-- 3. Find users with MULTIPLE capabilities (complex routing)
SELECT cn.passport_did,
ARRAY_AGG(cn.name) AS capabilities,
AVG(cn.weight) AS avg_weight
FROM capability_nodes cn
WHERE cn.name IN ('ai_safety', 'content_moderation', 'rlhf_review')
AND cn.weight >= 0.7
AND cn.deleted_at IS NULL
GROUP BY cn.passport_did
HAVING COUNT(DISTINCT cn.name) = 3 -- Must have ALL capabilities
ORDER BY avg_weight DESC
LIMIT 20;
-- Uses idx_capability_nodes_name_weight (Bitmap Index Scan, <50ms at 100K rows)
-- 4. Get evidence history for a capability
SELECT evidence_type, description, metadata, signed_by, recorded_at
FROM capability_evidence
WHERE capability_node_id = '<uuid>'
ORDER BY recorded_at DESC
LIMIT 100;
-- Uses idx_capability_evidence_node (Index Scan, <5ms)
-- 5. Query evidence metadata (e.g., find all task completions for a specific task type)
SELECT ce.passport_did, ce.evidence_type, ce.metadata, ce.recorded_at
FROM capability_evidence ce
WHERE ce.metadata @> '{"taskType": "safety_evaluation"}'
ORDER BY ce.recorded_at DESC
LIMIT 100;
-- Uses idx_capability_evidence_metadata GIN index (Bitmap Index Scan, <20ms)
Capability-Lite Routing Implementation
// How routing uses Capability-Lite
async function routeTask(task: Task): Promise<string> {
const requiredCapabilities = task.requiredCapabilities;
const minWeight = task.riskLevel === 'high' ? 0.8 : 0.6;
// Find users with ALL required capabilities above threshold
const qualified = await findQualifiedForAll(requiredCapabilities, minWeight);
if (qualified.length === 0) {
throw new NoQualifiedReviewerError(task);
}
// Among qualified, select by availability (simple for MVP)
return selectByAvailability(qualified);
}
async function findQualifiedForAll(
capabilities: string[],
minWeight: number
): Promise<string[]> {
// SQL query with proper indexing
const result = await db.query(`
SELECT cn.passport_did,
ARRAY_AGG(cn.name) AS capabilities,
AVG(cn.weight) AS avg_weight
FROM capability_nodes cn
WHERE cn.name = ANY($1::text[])
AND cn.weight >= $2
AND cn.deleted_at IS NULL
GROUP BY cn.passport_did
HAVING COUNT(DISTINCT cn.name) = $3
ORDER BY avg_weight DESC
`, [capabilities, minWeight, capabilities.length]);
return result.rows.map(r => r.passport_did);
}
async function updateCapabilityFromEvidence(
passportDid: string,
capabilityName: string,
evidence: CapabilityEvidence
): Promise<void> {
await db.transaction(async (tx) => {
// Insert evidence (assumes a capability_nodes row already exists for this
// passport/name pair; otherwise the subselect yields NULL and the UPDATE is a no-op)
await tx.query(`
INSERT INTO capability_evidence (
capability_node_id, passport_did, evidence_type,
description, metadata, weight_impact, signed_by, signature
) VALUES (
(SELECT id FROM capability_nodes
WHERE passport_did = $1 AND name = $2),
$1, $3, $4, $5, $6, $7, $8
)
`, [passportDid, capabilityName, evidence.type, evidence.description,
evidence.metadata, evidence.weightImpact, evidence.signedBy, evidence.signature]);
// Update capability weight and evidence count
await tx.query(`
UPDATE capability_nodes
SET weight = LEAST(1.0, weight + $3),
evidence_count = evidence_count + 1,
last_updated = NOW()
WHERE passport_did = $1 AND name = $2
`, [passportDid, capabilityName, evidence.weightImpact]);
});
}
Why Capability-Lite Matters
Without Capability-Lite:
- "Capability-weighted routing" is just marketing
- No difference from Scale.AI's crowd workers
- No evidence trail for capability claims
With Capability-Lite:
- Tasks route to actually qualified reviewers
- Evidence accumulates with each completed task
- Real differentiation from Day 1
Capability-Lite is the foundation. The full spec below is the vision.
FULL SPECIFICATION
This section explains how the engine actually works — technically, operationally, mathematically, and architecturally.
PURPOSE OF THE ENGINE
The engine must do four things simultaneously:
1. Observe
Capture meaningful, structured human behavior from:
- training interactions
- Workforce Cloud workflows
- AI/human collaboration events
- decisions made under HumanOS routing
- peer interaction
- moments of care, safety, nuance, escalation
2. Interpret
Turn those signals into:
- capability nodes
- weighted edges (strength of evidence)
- pattern clusters
- confidence scores
- time-based decay and reinforcement
3. Represent
Produce a portable capability graph that:
- lives inside the HUMAN Passport
- updates continuously
- is cryptographically anchored
- is selectively revealable (e.g., "prove I'm qualified for X")
4. Protect
Ensure the graph is:
- never comparative
- never used to rank humans
- never used to exclude someone
- never owned by an employer
- free from bias, gaming, or manipulation
CAPABILITY GRAPH STRUCTURE DIAGRAM
graph TB
subgraph "Human Identity"
Human[<b>Human</b><br/>via Passport DID]
end
subgraph "Capability Nodes (Dynamic & Evidence-Based)"
subgraph "Core Capabilities"
Judgment[<b>Judgment</b><br/>Weight: 0.85<br/>Evidence: 47 events]
Empathy[<b>Empathy</b><br/>Weight: 0.78<br/>Evidence: 32 events]
Safety[<b>Safety Detection</b><br/>Weight: 0.92<br/>Evidence: 58 events]
end
subgraph "Domain Capabilities"
Healthcare[<b>Healthcare</b><br/>Weight: 0.68<br/>Evidence: 21 events]
Legal[<b>Legal Reasoning</b><br/>Weight: 0.55<br/>Evidence: 12 events]
Technical[<b>Technical Analysis</b><br/>Weight: 0.72<br/>Evidence: 35 events]
end
subgraph "Emerging Capabilities (Learning)"
Finance[<b>Financial Analysis</b><br/>Weight: 0.35<br/>Evidence: 5 events<br/><i>Growing</i>]
Education[<b>Educational Design</b><br/>Weight: 0.28<br/>Evidence: 3 events<br/><i>New</i>]
end
end
subgraph "Evidence Sources"
AcademyEvidence[<b>Academy</b><br/>Training Attestations]
WorkforceEvidence[<b>Workforce Cloud</b><br/>Task Completions]
HumanOSEvidence[<b>HumanOS</b><br/>Decision Logs]
PeerEvidence[<b>Peer Validation</b><br/>Collaborative Evidence]
end
subgraph "Graph Properties"
TimeDecay[<b>Time Decay</b><br/>Older evidence<br/>weighs less]
Reinforcement[<b>Reinforcement</b><br/>Repeated success<br/>increases weight]
CrossDomain[<b>Cross-Domain Patterns</b><br/>Transfer learning<br/>between capabilities]
end
Human --> Judgment
Human --> Empathy
Human --> Safety
Human --> Healthcare
Human --> Legal
Human --> Technical
Human --> Finance
Human --> Education
AcademyEvidence --> Judgment
AcademyEvidence --> Empathy
AcademyEvidence --> Safety
AcademyEvidence --> Finance
AcademyEvidence --> Education
WorkforceEvidence --> Healthcare
WorkforceEvidence --> Legal
WorkforceEvidence --> Technical
HumanOSEvidence --> Judgment
HumanOSEvidence --> Safety
PeerEvidence --> Empathy
PeerEvidence --> Technical
Judgment -.->|Correlation| Safety
Empathy -.->|Correlation| Healthcare
Technical -.->|Enables| Legal
Healthcare -.->|Transfer Learning| Finance
TimeDecay -.->|Governs| Judgment
Reinforcement -.->|Strengthens| Safety
CrossDomain -.->|Connects| Finance
style Human fill:#2ECC71,stroke:#27AE60,stroke-width:4px,color:#fff
style Judgment fill:#3498DB,stroke:#2E7CB8,stroke-width:3px,color:#fff
style Safety fill:#E74C3C,stroke:#C73C2C,stroke-width:3px,color:#fff
style Finance fill:#F39C12,stroke:#D68910,stroke-width:2px,color:#fff
style AcademyEvidence fill:#9B59B6,stroke:#8E44AD,stroke-width:2px,color:#fff
style TimeDecay fill:#95A5A6,stroke:#7F8C8D,stroke-width:2px
Key Graph Features:
- Node Weights - Each capability has a weight (0.0-1.0) based on quantity and quality of evidence
- Evidence Count - Number of events supporting each capability (transparent provenance)
- Time Decay - Older evidence naturally decreases in weight unless reinforced
- Reinforcement Learning - Repeated demonstrations strengthen capability nodes
- Cross-Domain Edges - Capabilities can correlate or transfer (e.g., healthcare → financial analysis)
- Selective Disclosure - Users can prove specific capabilities via zero-knowledge proofs without revealing full graph
- Non-Comparative - Each graph is individual; no ranking or comparison with others
- Privacy-First - Graph lives in user's Passport, not centralized database
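The time-decay and reinforcement properties above can be sketched as follows. The half-life model and the 0.1 learning rate are illustrative assumptions, not protocol constants; the full decay function is specified later in this document.

```typescript
// Hedged sketch of the decay/reinforcement dynamics described above.
function decayedWeight(weight: number, daysSinceEvidence: number, halfLifeDays = 365): number {
  // Exponential time decay: evidence loses half its weight every halfLifeDays.
  return weight * Math.pow(0.5, daysSinceEvidence / halfLifeDays);
}

function reinforce(weight: number, evidenceQuality: number): number {
  // Diminishing returns: strong nodes gain less from each new demonstration,
  // so the weight asymptotically approaches (but never exceeds) 1.0.
  return Math.min(1, weight + (1 - weight) * evidenceQuality * 0.1);
}
```

Under these assumed parameters, a 0.80 node with no new evidence for a year decays to 0.40, while reinforcing a 0.50 node with perfect-quality evidence lifts it to 0.55.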
MULTI-SOURCE CAPABILITY EVIDENCE ARCHITECTURE
The Capability Graph integrates three types of evidence, each serving a distinct purpose:
The strategic insight: the Capability Graph doesn't replace traditional credentials; it integrates and validates them, then adds real-time work evidence on top.
This makes HUMAN valuable for:
- Recent college grads (have education, need practical experience)
- Displaced white-collar workers (have deep experience in one domain, pivoting to another)
- Senior experts (have decades of mastery, learning AI collaboration)
- Career changers (leveraging transferable capabilities)
The Three Evidence Types
Type 1: Foundational Credentials (Traditional Education & Licensing)
Purpose: Establish baseline competency via formal education and professional licensing.
Sources:
- Universities (Bachelor's, Master's, PhD, professional degrees)
- Certification bodies (CPA, PMP, AWS Certified, etc.)
- Licensing boards (MD, JD, PE, RN, etc.)
- Professional organizations (IEEE, ACM, ABA, etc.)
Verification Method:
- Cryptographic attestation from issuing institution
- Institution signs credential with their private key
- Attestation includes: degree/license, date issued, student/licensee ID
- Stored in Passport, verified on-chain
Weight Contribution:
- Initial weight: 0.5-0.8 (high, but not maximum—degrees prove potential, not current ability)
- Relevance decay: Slow (degrees don't expire, but become less relevant over time without practice)
- Example: CS degree from MIT → 0.70 initial weight in "software-engineering" capability
Schema:
interface CredentialEvidence {
type: "credential";
source: string; // "MIT", "American Board of Radiology"
credential: string; // "Bachelor of Science in Computer Science"
specialization?: string; // "Machine Learning"
issuedDate: Date;
expirationDate?: Date; // For licenses/certifications that expire
credentialId: string; // Unique ID from issuer
verificationStatus: VerificationStatus;
// Cryptographic proof
issuerSignature: string;
issuerPublicKey: string;
attestationHash: string;
// Weight contribution
contribution: number; // 0.0-1.0
relevanceDecayRate: number; // How fast this becomes less relevant
lastValidated: Date;
}
Example: Recent College Grad (Alex)
{
capabilityId: "software-engineering",
weight: 0.65,
evidence: [
{
type: "credential",
source: "University of Michigan",
credential: "Bachelor of Science in Computer Science",
issuedDate: "2024-05-15",
verificationStatus: "issuer_verified",
issuerSignature: "0x7a8f...", // Cryptographic signature from U-M
contribution: 0.60, // 60% of total weight comes from degree
relevanceDecayRate: 0.05 // Decays 5% per year without practice
},
{
type: "credential",
source: "University of Michigan",
credential: "Senior Capstone Project",
description: "Built ML model for fraud detection",
verificationStatus: "issuer_verified",
contribution: 0.05
}
],
lastUpdated: "2024-05-20",
freshness: 1.0 // Recently updated
}
Type 2: Professional Experience (Employer Attestations & Provenance)
Purpose: Validate practical experience and domain mastery from previous employment.
Sources:
- Previous employers (HR departments, managers)
- Clients (for consultants, freelancers)
- Project collaborators (peer attestations)
- HUMAN provenance logs (if previous work was done through HUMAN)
Verification Method:
- Employer attestation: Signed document from HR/manager confirming role, duration, responsibilities
- Provenance logs: If work was done through HUMAN, cryptographic logs of tasks completed
- Peer attestations: Colleagues verify collaboration and capability
- LinkedIn-style verification: Contacts can attest to working together (but weighted lower than formal attestations)
Weight Contribution:
- Initial weight: 0.6-0.9 (very high—proven experience)
- Experience weight formula:
baseWeight × (years / 10)^0.5 × recencyFactor
- 2 years experience: 0.6 × √0.2 = 0.27
- 5 years experience: 0.6 × √0.5 = 0.42
- 10 years experience: 0.6 × √1.0 = 0.60
- 20 years experience: 0.6 × √2.0 = 0.85
- Relevance decay: Medium (experience stays relevant for 5-10 years in most fields, then needs refreshing)
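The experience weight formula above can be written directly. Here recencyFactor defaults to 1.0 (a current or recent role); the fusion step later in this document is what caps the final combined weight.

```typescript
// Experience weight per the formula above: baseWeight × √(years / 10) × recencyFactor.
function experienceWeight(baseWeight: number, years: number, recencyFactor = 1.0): number {
  return baseWeight * Math.sqrt(years / 10) * recencyFactor;
}
```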
Schema:
interface ProfessionalExperienceEvidence {
type: "professional_experience";
source: string; // "TechCorp", "Mayo Clinic"
role: string; // "VP of Marketing", "Attending Radiologist"
duration: {
start: Date;
end: Date;
yearsExperience: number;
};
// What they actually did
responsibilities: string[];
achievements?: string[];
projectsCompleted?: number;
peopleManaged?: number;
// Verification
employerAttestation?: {
signedBy: string; // HR director, manager name
signedDate: Date;
signature: string; // Cryptographic signature
attestationDocument: string; // PDF/document hash
};
provenanceLogs?: {
tasksCompleted: number;
averageQuality: number;
domainsWorked: string[];
};
peerAttestations?: {
attestorId: PassportId;
relationship: "colleague" | "manager" | "client";
attestationText: string;
signedDate: Date;
}[];
// Weight contribution
contribution: number;
relevanceDecayRate: number;
}
Example: Displaced White-Collar Worker (Jennifer, Marketing VP)
{
capabilityId: "strategic-planning",
weight: 0.88,
evidence: [
{
type: "credential",
source: "Northwestern University",
credential: "MBA",
issuedDate: "2005-06-15",
contribution: 0.40
},
{
type: "professional_experience",
source: "TechCorp",
role: "VP of Marketing",
duration: {
start: "2016-03-01",
end: "2024-11-01",
yearsExperience: 8
},
responsibilities: [
"Led marketing strategy for $500M revenue division",
"Managed team of 25",
"Launched 12 major campaigns"
],
employerAttestation: {
signedBy: "Jane Doe, Chief People Officer",
signedDate: "2024-11-15",
signature: "0x9f2a...",
attestationDocument: "ipfs://Qm..."
},
contribution: 0.48, // 8 years experience = significant weight
relevanceDecayRate: 0.08 // Decays 8% per year in fast-moving field
}
],
lastUpdated: "2024-11-20"
}
Example: Senior Expert (Dr. Patel, Radiologist)
{
capabilityId: "radiology-diagnosis",
weight: 0.96, // Near-maximum, decades of evidence
evidence: [
{
type: "credential",
source: "American Board of Radiology",
credential: "Board Certification in Diagnostic Radiology",
issuedDate: "2004-07-01",
expirationDate: "2034-07-01", // Lifetime certification
verificationStatus: "issuer_verified",
contribution: 0.50
},
{
type: "credential",
source: "Johns Hopkins University",
credential: "Doctor of Medicine",
issuedDate: "2000-05-20",
contribution: 0.30
},
{
type: "professional_experience",
source: "Mayo Clinic",
role: "Attending Radiologist",
duration: {
start: "2004-08-01",
end: "2024-10-01",
yearsExperience: 20
},
responsibilities: ["Diagnostic radiology", "Training residents", "Research"],
achievements: ["~50,000 diagnostic reads over career"],
employerAttestation: {
signedBy: "Dr. Sarah Johnson, Department Chair",
signedDate: "2024-10-15",
signature: "0x3d1b...",
attestationDocument: "ipfs://Qm..."
},
contribution: 0.16, // Even with 20 years, not 100%—needs to stay current
relevanceDecayRate: 0.03 // Medical knowledge decays slowly but does decay
}
],
lastUpdated: "2024-11-01",
freshness: 0.95 // Recently active
}
Type 3: Demonstrated Work (Real-Time Task Performance)
Purpose: Prove current ability through actual task completion. This is the most dynamic and trustworthy evidence type.
Sources:
- Workforce Cloud tasks (primary source)
- Academy assessments (training tasks)
- HumanOS-routed decisions (live work)
- Peer reviews (quality validation)
Verification Method:
- HumanOS provenance logs: Every task has cryptographic audit trail
- Outcome metrics: Accuracy, speed, quality, client satisfaction
- Peer review: Other workers or experts validate quality
- A/B ground truth: Some tasks have known correct answers for validation
Weight Contribution:
- Grows over time: Starts low (0.1-0.3), grows to high (0.8-0.9) as evidence accumulates
- Formula:
baseWeight + (tasksCompleted / 1000)^0.5 × qualityFactor
- 10 tasks at 90% accuracy: 0.3 + √0.01 × 0.90 = 0.39
- 100 tasks at 95% accuracy: 0.3 + √0.10 × 0.95 = 0.60
- 500 tasks at 95% accuracy: 0.3 + √0.50 × 0.95 = 0.97
- Relevance decay: Fast (if you stop doing the work, weight decays quickly—"use it or lose it")
- Freshness bonus: Recent work gets higher weight
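The demonstrated-work formula above, with an explicit 1.0 cap added (the 500-task example sits just under it):

```typescript
// Demonstrated-work weight per the formula above, capped at 1.0.
function demonstratedWorkWeight(baseWeight: number, tasksCompleted: number, qualityFactor: number): number {
  return Math.min(1, baseWeight + Math.sqrt(tasksCompleted / 1000) * qualityFactor);
}
```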
Schema:
interface DemonstratedWorkEvidence {
type: "demonstrated_work";
source: "workforce_cloud" | "academy" | "humanos" | "peer_review";
description: string; // "Completed 500 code reviews"
// Performance metrics
tasksCompleted: number;
accuracy?: number; // 0.0-1.0
averageCompletionTime?: number; // milliseconds
clientSatisfactionScore?: number; // 0.0-1.0
peerReviewScore?: number; // 0.0-1.0
// Time distribution
timeRange: {
firstTask: Date;
lastTask: Date;
totalDuration: number; // milliseconds
};
// Quality indicators
errorsDetected: number; // Errors they caught
errorsMade: number; // Errors they made
escalationsHandled: number;
edgeCasesResolved: number;
// Provenance
provenanceLogs: string[]; // Array of ledger transaction IDs
// Weight contribution
contribution: number;
freshnessWeight: number; // Bonus for recent work
relevanceDecayRate: number; // Decays fast if not maintained
}
Example: Recent College Grad (Alex) After 3 Months Workforce Cloud
{
capabilityId: "software-engineering",
weight: 0.82, // Grew from 0.65 initial (degree only)
evidence: [
{
type: "credential",
source: "University of Michigan",
credential: "Bachelor of Science in Computer Science",
contribution: 0.45 // Was 0.60, now diluted by new evidence (but still significant)
},
{
type: "demonstrated_work",
source: "workforce_cloud",
description: "Completed 200 code reviews",
tasksCompleted: 200,
accuracy: 0.94,
clientSatisfactionScore: 0.91,
peerReviewScore: 0.88,
timeRange: {
firstTask: "2024-06-01",
lastTask: "2024-09-01",
totalDuration: 7776000000 // 90 days
},
errorsDetected: 47, // Caught 47 bugs in reviewed code
errorsMade: 3, // Made 3 mistakes in reviews
provenanceLogs: ["0xabc123...", "0xdef456...", ...],
contribution: 0.37, // 37% of weight now comes from proven work
freshnessWeight: 1.0, // Recent work
relevanceDecayRate: 0.15 // Decays 15% per year if stops working
}
],
lastUpdated: "2024-09-01",
freshness: 1.0
}
Example: Senior Expert (Dr. Patel) After 1 Month AI-Assisted Radiology
{
capabilityId: "ai-assisted-radiology",
weight: 0.88, // Rapid growth because foundation is so strong
evidence: [
{
type: "academy_training",
source: "academy",
modulesCompleted: [
"AI Radiology Systems Overview",
"Reviewing AI Diagnostic Outputs",
"When to Trust vs. Override AI"
],
totalHours: 25,
assessmentScores: [0.95, 0.98, 0.92],
contribution: 0.30 // Training provides foundation
},
{
type: "demonstrated_work",
source: "workforce_cloud",
description: "Reviewed 1,000 AI radiology cases",
tasksCompleted: 1000,
accuracy: 0.98, // 98% agreement with ground truth
averageCompletionTime: 180000, // 3 minutes per case (expert speed)
timeRange: {
firstTask: "2024-10-01",
lastTask: "2024-11-01",
totalDuration: 2592000000 // 30 days
},
aiOverrideRate: 0.12, // Corrected AI 12% of the time
errorsDetected: 120, // Caught 120 AI errors
errorsMade: 2, // Made 2 mistakes (extremely low)
provenanceLogs: [...],
contribution: 0.58, // Demonstrated work carries most weight
freshnessWeight: 1.0,
relevanceDecayRate: 0.10
}
],
// Dr. Patel's radiology-diagnosis capability (0.96) acts as foundation
transferredFrom: {
capabilityId: "radiology-diagnosis",
transferWeight: 0.85 // 85% of that capability transfers to AI-assisted version
}
}
Weight Calculation: Multi-Source Fusion
How the three evidence types combine:
function calculateCapabilityWeight(
credentials: CredentialEvidence[],
experience: ProfessionalExperienceEvidence[],
demonstratedWork: DemonstratedWorkEvidence[]
): number {
// 1. Calculate contribution from each evidence type
const credentialWeight = credentials.reduce((sum, c) =>
sum + (c.contribution * getRelevancyFactor(c)), 0
);
const experienceWeight = experience.reduce((sum, e) =>
sum + (e.contribution * getRelevancyFactor(e)), 0
);
const workWeight = demonstratedWork.reduce((sum, w) =>
sum + (w.contribution * w.freshnessWeight * getRelevancyFactor(w)), 0
);
// 2. Combine with diminishing returns (can't exceed 1.0)
// Formula: 1 - (1 - cred) × (1 - exp) × (1 - work)
// This ensures:
// - Multiple strong signals compound
// - But never exceed 1.0
// - Weak signals have less impact
const combinedWeight = 1 - (
(1 - credentialWeight) *
(1 - experienceWeight) *
(1 - workWeight)
);
// 3. Apply floor and ceiling
return Math.max(0.0, Math.min(1.0, combinedWeight));
}
function getRelevancyFactor(evidence: Evidence): number {
  // Date arithmetic needs an explicit .getTime() in TypeScript
  const ageInYears = (Date.now() - evidence.lastUpdated.getTime()) / (365.25 * 24 * 60 * 60 * 1000);
  return Math.exp(-evidence.relevanceDecayRate * ageInYears);
}
Example calculation for Alex (recent grad after 3 months):
Credential: 0.60 contribution × 1.0 relevancy = 0.60
Experience: 0 (no prior work experience)
Demonstrated: 0.37 contribution × 1.0 freshness × 1.0 relevancy = 0.37
Combined: 1 - (1 - 0.60) × (1 - 0.0) × (1 - 0.37)
= 1 - (0.40 × 1.0 × 0.63)
= 1 - 0.252
= 0.748
≈ 0.75 (but Alex actually has 0.82 because of additional evidence from projects)
Key insight: Recent demonstrated work (0.37) combined with degree (0.60) creates 0.75+ weight. This is higher than degree alone (0.60) but still shows the degree matters (without degree, just 0.37 from work alone).
Strategic Advantages of Multi-Source Architecture
1. Fair to Recent Grads
- Degree provides strong initial weight (0.6-0.7)
- But must prove practical ability through demonstrated work
- Prevents "degree mills" from gaming system (low work performance = low final weight)
2. Fair to Career Changers
- Previous experience provides foundation
- Capability Graph identifies transferable capabilities
- Example: Jennifer's "strategic-planning" (0.88) transfers 70% to "ai-workflow-design" → starts at 0.62 instead of 0.1
- Academy training fills specific gaps
- Demonstrated work proves ability in new domain
3. Fair to Senior Experts
- Decades of experience + credentials provide near-maximum weight (0.9+)
- Need minimal Academy training (just "AI collaboration" not "learn the field")
- Start at L5 (expert tier) immediately in Workforce Cloud
- Demonstrated work maintains and updates weight (prevents stagnation)
4. Anti-Gaming by Design
- Can't fake credentials: Cryptographically verified by issuing institution
- Can't fake experience: Requires employer attestation (signed by HR/manager)
- Can't fake work: Provenance logs every task, peer review validates quality
- Can't buy weight: Must actually complete tasks at high quality
5. Reveals Hidden Capability
- Traditional model: "I don't have a degree so I can't prove my ability"
- Capability Graph: "You don't have a degree, but you completed 1,000 tasks at 95% accuracy—you're proven capable"
- Example: Jamal (no degree) reaches L4 (expert tier) through 3 years of demonstrated work
6. Keeps Seniors Current
- Traditional model: "I have 20 years of experience" (but when was the last time you did the work?)
- Capability Graph: Checks freshness—if no demonstrated work in 2 years, weight decays
- Forces continuous learning and practice (anti-stagnation)
INPUT CHANNELS (Where the CG Gets Its Data)
The engine ingests from five primary sources plus three evidence types (credentials, experience, demonstrated work):
1. Academy Training Cycles
Signals include:
- pattern recognition under time pressure
- safety judgment calls
- ethical decision branching
- cognitive load management
- escalation detection
- attention switching
- error correction
Each training block produces:
- micro-attestations
- capability deltas
- weighted nodes
2. Workforce Cloud Assignments
Every real task yields:
- provenance logs
- correctness signals
- escalation rationale
- cooperation with AI companions
- outcome quality and timeliness
- repeatability
These generate:
- reliability edges
- situational judgment weights
- domain-specific capability boosts
3. AI/Human Collaborative Events
Every AI decision that requires a human override or approval produces:
- risk-class annotations
- override justification
- "pattern felt wrong" signals
- counterfactual expectations
These strengthen:
- meta-cognition nodes
- anomaly detection weights
- trust-sensing edges
4. Peer + Mentor Interactions
When a human:
- helps someone else
- mentors a junior worker
- resolves interpersonal friction
- contributes to group judgment
We generate:
- empathy nodes
- conflict resolution edges
- communication capability weights
5. HumanOS Routing Context
The fact that HumanOS chose a person for a particular task is itself signal:
- the system sees them as capable under certain constraints
- repeated routing builds strong evidence
HumanOS → CG → HumanOS becomes a virtuous loop.
CAPABILITY VERIFICATION & PROGRESSIVE TRUST MODEL
The Capability Graph accepts capability data from multiple sources with varying levels of trust. Rather than requiring perfect verification upfront, the system implements a progressive trust model that allows capabilities to be claimed early and verified over time.
This pragmatic approach enables:
- Rapid onboarding (humans don't wait weeks for credential verification)
- Immediate participation (low-trust tasks available while verification proceeds)
- Graceful integration (works before every institution has a VC issuer)
- Continuous improvement (capabilities strengthen with evidence)
Verification Status Taxonomy
Each capability node and evidence item carries a verification status that influences its weight in the graph:
export type VerificationStatus =
| "self_reported" // User entered it, no external verification
| "pending" // Verification requested but not complete
| "document_provided" // User uploaded supporting document
| "human_verified" // HUMAN staff manually reviewed
| "api_verified" // Automated check against issuer registry
| "issuer_verified" // Direct VC from authoritative source
| "proven" // Demonstrated through task performance
| "revoked" // Issuer revoked the credential
| "expired"; // Credential past expiration date
The Verification Ladder
Capabilities progress through verification stages, with each stage increasing the capability's weight and routing eligibility:
| Level | Status | Suggested Weight Range | Method | Example |
|---|---|---|---|---|
| 0 | self_reported | 0.10 - 0.30 | User claims during onboarding | "I'm a licensed RN in California" |
| 1 | document_provided | 0.30 - 0.50 | User uploads certificate/transcript | PDF scan of nursing license |
| 2 | human_verified | 0.40 - 0.60 | HUMAN staff reviews document | Reviewer confirms license format |
| 3 | api_verified | 0.50 - 0.70 | Automated registry lookup | Query CA Board of Nursing API |
| 4 | issuer_verified | 0.70 - 0.85 | Direct VC from issuer | California Board signs VC |
| 5 | proven | 0.70 - 0.95 | Evidence from completed tasks | 50+ successful triage decisions |
Note: These weight ranges are suggestions for initial implementation. Actual weights should be tuned based on:
- Domain requirements (healthcare may need higher thresholds than general skills)
- Risk tolerance (high-stakes tasks require higher verification)
- Evidence accumulation patterns (multiple weak signals can exceed one strong credential)
- Observed correlation between verification level and actual performance
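As a concrete starting point, the ladder can be encoded as a lookup table. This is a minimal sketch assuming the suggested ranges above; the names (`VERIFICATION_LADDER`, `clampToLadder`) and the clamping policy are illustrative, not protocol-defined.

```typescript
// Suggested base-weight bands per verification status (assumptions mirroring
// the ladder table above; tune per domain and risk tolerance).
type VerificationBand = {
  level: number;
  minWeight: number;
  maxWeight: number;
};

const VERIFICATION_LADDER: Record<string, VerificationBand> = {
  self_reported:     { level: 0, minWeight: 0.10, maxWeight: 0.30 },
  document_provided: { level: 1, minWeight: 0.30, maxWeight: 0.50 },
  human_verified:    { level: 2, minWeight: 0.40, maxWeight: 0.60 },
  api_verified:      { level: 3, minWeight: 0.50, maxWeight: 0.70 },
  issuer_verified:   { level: 4, minWeight: 0.70, maxWeight: 0.85 },
  proven:            { level: 5, minWeight: 0.70, maxWeight: 0.95 },
};

// Clamp a proposed weight into the band allowed by the evidence's status.
// Statuses with no band (revoked, expired, unknown) contribute nothing.
function clampToLadder(status: string, proposed: number): number {
  const band = VERIFICATION_LADDER[status];
  if (!band) return 0;
  return Math.min(band.maxWeight, Math.max(band.minWeight, proposed));
}
```

The clamp means a self-reported claim can never route like a verified credential, no matter how confident the raw signal looks.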
Pre-Integration Strategy
Before institutions have native VC issuers, HUMAN uses a bridging strategy:
Phase 1 - Launch (Months 1-6):
- Accept all self-reported credentials
- Prompt users to upload supporting documents
- Manual review by trained HUMAN staff for high-value credentials
- Basic API checks where available (e.g., state licensing boards)
- Low routing priority for unverified capabilities
Phase 2 - Partnerships (Months 6-18):
- Partner with major issuers (universities, licensing boards, employers)
- Build integration adapters for existing credential systems
- Automated verification flows via OAuth + data sharing agreements
- Retroactive upgrades: self-reported → issuer-verified automatically
Phase 3 - Protocol Adoption (18+ months):
- Direct VC issuance from authoritative sources
- Zero HUMAN intermediation for credential verification
- Instant verification for participating institutions
- Self-reported capabilities become rare edge cases
Evidence Accumulation & Weight Dynamics
Multiple evidence items combine to strengthen capability weight over time:
// Suggested weight calculation (simplified)
function monthsSince(timestamp: string | Date): number {
  const ms = Date.now() - new Date(timestamp).getTime();
  return ms / (1000 * 60 * 60 * 24 * 30);
}

function calculateCapabilityWeight(
  capability: CapabilityNode
): number {
  const evidenceWeights = capability.evidence.map(e => {
    // Base quality score from verification status
    let weight = e.qualityScore;
    // Time decay: older evidence counts less (unless credential-based)
    if (e.source === 'task_completion' || e.source === 'training') {
      const ageMonths = monthsSince(e.timestamp);
      const decayFactor = Math.exp(-ageMonths / 12); // 12-month decay constant (~8-month half-life)
      weight *= decayFactor;
    }
    // Credential expiration: expired credentials contribute nothing
    if (e.expiresAt && new Date(e.expiresAt).getTime() < Date.now()) {
      weight = 0;
    }
    // Revocation
    if (e.verificationStatus === 'revoked') {
      weight = 0;
    }
    return weight;
  });
  // Combine evidence with diminishing returns (can't just stack weak evidence);
  // logarithmic aggregation rewards diverse evidence
  const totalEvidence = evidenceWeights.reduce((sum, w) => sum + w, 0);
  const evidenceDiversity = evidenceWeights.length;
  const finalWeight = Math.min(
    0.95, // Cap at 0.95 (perfect certainty is impossible)
    (Math.log1p(totalEvidence) / Math.log1p(10)) * // Logarithmic scaling
      (Math.sqrt(evidenceDiversity) / 3)           // Diversity bonus
  );
  return finalWeight;
}
Key Principles:
- One strong credential outweighs many weak signals - but accumulated task completions can rival credential verification
- Time decay for behavioral evidence - Skills atrophy without practice
- Credentials don't decay - Until expiration/revocation
- Diversity matters - Evidence from multiple sources is stronger than repeated evidence from one source
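To make the diminishing-returns behavior concrete, here is a self-contained sketch of the aggregation formula above, run on assumed inputs: ten stacked weak self-reports from one source versus three strong, diverse evidence items.

```typescript
// Self-contained aggregation sketch (numbers are illustrative assumptions):
// logarithmic scaling plus a diversity bonus means stacking identical weak
// signals saturates, while diverse strong evidence approaches (but never
// reaches) the 0.95 cap.
function aggregateEvidence(weights: number[], sourceCount: number): number {
  const total = weights.reduce((sum, w) => sum + w, 0);
  return Math.min(
    0.95,
    (Math.log1p(total) / Math.log1p(10)) * (Math.sqrt(sourceCount) / 3)
  );
}

// Ten identical weak self-reports, all from one source...
const stacked = aggregateEvidence(Array(10).fill(0.2), 1);
// ...versus three strong evidence items from three distinct sources.
const diverse = aggregateEvidence([0.8, 0.82, 0.85], 3);
```

Despite having more raw evidence mass, the stacked claim scores well below the diverse one, which is exactly the property the principles above call for.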
Real-World Onboarding Scenarios
Scenario 1: Nurse with License (Verifiable Credential Flow)
Day 1 - Self-Reported:
{
capability: "registered-nurse-license-ca",
weight: 0.20, // Low - self-reported only
verificationStatus: "self_reported",
evidence: [{
source: "self_reported",
qualityScore: 0.20,
description: "California RN License #RN-123456"
}]
}
→ System behavior: Can browse tasks, cannot be routed to patient care
Day 3 - Issuer Verified: User clicks "Verify with California Board of Nursing" → OAuth flow → Board issues VC
{
capability: "registered-nurse-license-ca",
weight: 0.75, // High - issuer verified
verificationStatus: "issuer_verified",
evidence: [
// ... previous self-reported entry
{
source: "credential",
qualityScore: 0.80,
verificationStatus: "issuer_verified",
issuerDID: "did:org:california-board-nursing",
referenceId: "vc:ca-nursing:RN-123456",
expiresAt: "2027-12-01"
}
]
}
→ System behavior: Now eligible for triage tasks, HIPAA workflows, higher pay tier
Week 2 - Proven Through Performance: Completes 10 successful triage tasks
{
capability: "clinical-triage-judgment",
weight: 0.68, // Built from observed behavior
verificationStatus: "proven",
evidence: [
{
source: "task_completion",
qualityScore: 0.70,
referenceId: "task:triage-001",
description: "Emergency triage: correctly escalated chest pain"
},
// ... 9 more
]
}
→ New capability emerges that no credential could prove: pattern recognition, escalation sense
Scenario 2: Software Engineer (Gradual Verification)
Day 1 - Resume Import:
{
workHistory: [{
employer: "Meta",
role: "Senior Software Engineer",
verificationStatus: "self_reported",
skills: ["Python", "Distributed Systems", "ML Infrastructure"]
}],
capabilities: [
{ id: "skill:python", weight: 0.25 },
{ id: "skill:distributed-systems", weight: 0.20 }
]
}
→ Low priority in matching
Week 1 - Employer Integration: Meta has HUMAN integration → Issues employment VC
{
capabilities: [
{
id: "skill:python",
weight: 0.65, // Jumped from 0.25!
evidence: [{
source: "credential",
issuerDID: "did:org:meta",
verificationStatus: "issuer_verified"
}]
}
]
}
Week 2 - Performance Exceeds Credentials: Completes 5 ML infrastructure tasks with peer review
{
capability: "ml-infrastructure-design",
weight: 0.78, // Higher than credential verification!
evidence: [
{ source: "credential", qualityScore: 0.65 }, // Meta VC
{ source: "task_completion", qualityScore: 0.82 },
{ source: "peer_review", qualityScore: 0.85 },
// ... 3 more tasks
]
}
→ Result: Performance-based evidence can exceed credential-based claims
Scenario 3: Education Without Integration (Stopgap Flow)
Day 1 - Self-Reported Degree:
{
education: [{
institution: "Bangalore University",
degree: "B.S. Computer Science",
verificationStatus: "self_reported",
weight: 0.15
}]
}
Week 1 - Document Upload: User uploads transcript PDF → HUMAN verification service
{
verificationStatus: "human_verified",
weight: 0.50, // Better than self-reported
evidence: [{
source: "attestation",
issuerDID: "did:org:human-verification-service",
documentHash: "sha256:abc123...",
notes: "Transcript verified against known Bangalore University format"
}]
}
Year 2 - University Integration: Bangalore University joins HUMAN → User re-verifies → Weight jumps to 0.75
Handling Expired & Revoked Credentials
Credentials can lose validity over time:
Expiration (Gradual):
- 90 days before expiration: Prompt user to renew
- At expiration: Weight drops to 0, routing eligibility lost
- User can appeal with renewal proof
- Historical evidence remains in graph (for provenance)
Revocation (Immediate):
- Issuer publishes revocation to ledger
- Next graph update detects revocation
- Weight immediately drops to 0
- User notified of revocation reason
- No routing until issue resolved
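The expiration and revocation rules above could be applied in a graph-update pass roughly like this; the `EvidenceItem` shape and field names are assumptions for illustration, not the canonical schema.

```typescript
// Illustrative sketch: apply validity rules to a single evidence item.
interface EvidenceItem {
  verificationStatus: string;
  expiresAt?: string; // ISO date string
  weight: number;
}

function applyValidityRules(e: EvidenceItem, now: Date): EvidenceItem {
  // Revocation: immediate, weight drops to zero.
  if (e.verificationStatus === "revoked") {
    return { ...e, weight: 0 };
  }
  // Expiration: weight drops to zero once the credential lapses; the
  // evidence item itself stays in the graph for provenance.
  if (e.expiresAt && new Date(e.expiresAt) < now) {
    return { ...e, verificationStatus: "expired", weight: 0 };
  }
  return e;
}
```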
Example - License Expiration:
// Before expiration
{ capability: "rn-license-ca", weight: 0.80, expiresAt: "2026-06-01" }
// After expiration (June 2, 2026)
{
capability: "rn-license-ca",
weight: 0.00, // No longer valid
verificationStatus: "expired",
note: "License expired. Renew to regain routing eligibility."
}
// After renewal (user uploads new license)
{
capability: "rn-license-ca",
weight: 0.75, // Restored
verificationStatus: "issuer_verified",
expiresAt: "2028-06-01"
}
Connection to Passkeys & Device Security
All credential verification flows are bound to device-level security:
Identity Flow:
- User initiates verification ("Verify my nursing license")
- HUMAN redirects to issuer (California Board portal)
- User authenticates with issuer (their existing login)
- Issuer asks: "Issue credential to which DID?"
- User's Passport provides did:human:sarah-abc123
- Issuer signs VC with their private key
- VC delivered to Passport, encrypted with user's DeviceKey
- Hash anchored to ledger
- Capability Graph updated
Security Properties:
- Issuer never sees user's private keys
- HUMAN never sees full VC content (only hash)
- User can revoke access anytime
- Multi-device sync via encrypted vault
- Biometric authentication required for high-value credentials
Summary: Progressive Trust in Practice
The HUMAN verification model is:
- Accept everything initially - No barriers to entry, but limited privileges
- Prompt for verification - Guided flows to official issuers
- Upgrade as evidence arrives - VCs, task performance, peer attestations
- Continuously reinforce - Every task adds evidence, strengthens weight
- Expire when needed - Licenses lapse, credentials revoke, skills atrophy
- Behavioral proof can exceed credentials - Demonstrated capability beats claimed capability
This makes HUMAN practical (works before universal integration) while being aspirational (incentivizes verified credentials). The capability graph becomes a living, breathing, evidence-based representation that's more trustworthy than any static resume—and you own it completely.
INTERNAL STRUCTURE OF THE GRAPH
The Capability Graph is composed of:
Nodes (Capabilities)
Each node is a capability primitive, like:
- pattern recognition
- escalation sense
- safety triage
- ethical judgment
- domain fluency
- attention stability
- ambiguity resolution
- empathy projection
- context restoration
- anomaly detection
Nodes are:
- extensible
- modular
- hierarchical
- domain-specific when needed
Edges (Evidence Relationships)
Edges represent:
- how often
- how strongly
- and in what context
a capability manifested.
Edges include:
- timestamp
- weight
- context class
- risk level
- domain
- verification source
- whether escalation occurred
Weights (Confidence)
Computed from:
- repeated evidence
- cross-channel consistency
- error/override patterns
- contextual diversity
- recency decay
Weights are never used to rank humans — only to help HumanOS route safely.
CAPABILITY TAXONOMY & ONTOLOGY
The Capability Graph needs a semantic understanding of skills, not just flat string labels. The current 5-category system (skill, judgment, experience, trait, certification) is a v0.1 placeholder. This section defines the Living Capability Ontology—a dynamic, semantic taxonomy that evolves with the economy.
The Problem with Flat Taxonomies
Traditional skill taxonomies fail because:
- No semantic relationships - "Python" and "Machine Learning" often appear together, but flat lists don't capture this
- No synonyms - "ML", "Machine Learning", "AI/ML" are treated as different skills
- No hierarchy - "React" is a "JavaScript Framework" is a "Programming Language" is a "Technical Skill"
- Can't handle emergence - "AI Safety Auditing" didn't exist 2 years ago; how does it enter the taxonomy?
- Static definitions - "Web Development" in 2010 ≠ "Web Development" in 2025
The HUMAN Capability Ontology
HUMAN uses a multi-layered semantic ontology:
interface CapabilityDefinition {
// Identity
id: string; // "cap:python-programming"
canonicalName: string; // "Python Programming"
// Taxonomy (broad categorization)
category: CapabilityCategory; // 'skill', 'judgment', 'experience', 'trait', 'certification'
subcategory?: string; // "Programming Languages"
domain?: string; // "Software Engineering"
// Semantic relationships
synonyms: string[]; // ["Python", "python", "py", "Python3"]
relatedCapabilities: {
capabilityId: string;
relationshipType: 'prerequisite' | 'complementary' | 'specialization' | 'often-paired';
strength: number; // 0-1, how strong the relationship
}[];
// Semantic embedding (for similarity search)
embedding: number[]; // 768-dim vector from capability description
// Definition
description: string; // Rich description for LLM understanding
examples: string[]; // Example tasks: "Build REST APIs", "Data analysis with pandas"
// Lifecycle
status: 'emerging' | 'active' | 'evolving' | 'deprecated';
createdAt: Date;
updatedAt: Date;
// Usage statistics (for trend detection)
supplyCount: number; // How many humans claim this capability
demandCount: number; // How many tasks request it
trendDirection: 'rising' | 'stable' | 'declining';
// Version history (for evolving capabilities)
semanticDrift: number; // 0-1, how much meaning has changed over time
historicalDefinitions?: {
dateRange: [Date, Date];
description: string;
embedding: number[];
}[];
}
Capability Relationship Types
Capabilities connect to each other in structured ways:
1. Prerequisite - One capability requires another
{
from: "cap:react-development",
to: "cap:javascript",
type: "prerequisite",
strength: 0.95 // Strong dependency
}
2. Complementary - Often learned/used together
{
from: "cap:kubernetes",
to: "cap:docker",
type: "complementary",
strength: 0.88
}
3. Specialization - One is a more specific version
{
from: "cap:pytorch",
to: "cap:machine-learning",
type: "specialization",
strength: 0.92
}
4. Often-Paired - Frequently appear together in job requirements
{
from: "cap:python",
to: "cap:data-analysis",
type: "often-paired",
strength: 0.85
}
Semantic Embeddings
Every capability has a vector embedding that captures its semantic meaning:
// Generate embedding from capability description + examples
async function generateCapabilityEmbedding(
capability: CapabilityDefinition
): Promise<number[]> {
const text = `
${capability.canonicalName}
Description: ${capability.description}
Examples:
${capability.examples.join('\n')}
Related to: ${capability.relatedCapabilities.map(r => r.capabilityId).join(', ')}
`;
// Use embedding model (e.g., OpenAI ada-002, Cohere embed-v3)
const embedding = await embeddingProvider.embed(text);
return embedding; // 768 or 1536 dimensional vector
}
Embeddings enable:
- Semantic similarity search (find capabilities close to "AI safety")
- Fuzzy matching (match "ML Engineer" to "Machine Learning Specialist")
- Cross-language support (embeddings work across languages)
- Continuous evolution (re-embed as definitions change)
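For intuition, the similarity mechanics behind fuzzy matching can be sketched in a few lines. The vectors here are tiny toy examples; production embeddings are 768+ dimensional and would be queried through the database's vector index rather than in application code.

```typescript
// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Fuzzy matching: return the closest capability above a similarity floor,
// or null if nothing clears it (then the query becomes a new candidate).
function bestMatch(
  query: number[],
  candidates: { id: string; embedding: number[] }[],
  minSimilarity = 0.7
): string | null {
  let best: { id: string; sim: number } | null = null;
  for (const c of candidates) {
    const sim = cosineSimilarity(query, c.embedding);
    if (sim >= minSimilarity && (!best || sim > best.sim)) {
      best = { id: c.id, sim };
    }
  }
  return best ? best.id : null;
}
```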
Storage Architecture
-- Capability definitions table
CREATE TABLE capabilities (
id TEXT PRIMARY KEY,
canonical_name TEXT NOT NULL,
category TEXT NOT NULL,
subcategory TEXT,
domain TEXT,
description TEXT NOT NULL,
examples JSONB, -- Array of example tasks
-- Semantic data
synonyms JSONB, -- Array of strings
embedding VECTOR(768), -- pgvector extension for semantic search
-- Relationships stored separately (see below)
-- Lifecycle
status TEXT NOT NULL DEFAULT 'active',
created_at TIMESTAMPTZ DEFAULT NOW(),
updated_at TIMESTAMPTZ DEFAULT NOW(),
-- Usage statistics
supply_count INTEGER DEFAULT 0,
demand_count INTEGER DEFAULT 0,
trend_direction TEXT,
semantic_drift DECIMAL DEFAULT 0.0
);
-- Semantic search index (cosine similarity)
CREATE INDEX capabilities_embedding_idx ON capabilities
USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100);
-- Full-text search on names and synonyms
CREATE INDEX capabilities_name_search_idx ON capabilities
USING gin(to_tsvector('english', canonical_name || ' ' || synonyms::text));
-- Capability relationships (edges in the ontology graph)
CREATE TABLE capability_relationships (
from_capability_id TEXT NOT NULL REFERENCES capabilities(id),
to_capability_id TEXT NOT NULL REFERENCES capabilities(id),
relationship_type TEXT NOT NULL,
strength DECIMAL NOT NULL,
created_at TIMESTAMPTZ DEFAULT NOW(),
PRIMARY KEY (from_capability_id, to_capability_id, relationship_type)
);
-- Capability evolution history
CREATE TABLE capability_history (
capability_id TEXT NOT NULL REFERENCES capabilities(id),
valid_from TIMESTAMPTZ NOT NULL,
valid_to TIMESTAMPTZ,
description TEXT NOT NULL,
embedding VECTOR(768),
related_capabilities JSONB,
PRIMARY KEY (capability_id, valid_from)
);
Example: "Python Programming" in the Ontology
{
"id": "cap:python-programming",
"canonicalName": "Python Programming",
"category": "skill",
"subcategory": "Programming Languages",
"domain": "Software Engineering",
"description": "Ability to write, debug, and maintain code in the Python programming language. Includes understanding of Python syntax, standard library, common frameworks, and best practices.",
"examples": [
"Write Python scripts for data processing",
"Build REST APIs with Flask or FastAPI",
"Develop data analysis pipelines with pandas",
"Create machine learning models with scikit-learn"
],
"synonyms": ["Python", "python", "Python3", "py"],
"relatedCapabilities": [
{
"capabilityId": "cap:programming-fundamentals",
"relationshipType": "prerequisite",
"strength": 0.85
},
{
"capabilityId": "cap:data-analysis",
"relationshipType": "often-paired",
"strength": 0.78
},
{
"capabilityId": "cap:django",
"relationshipType": "specialization",
"strength": 0.70
},
{
"capabilityId": "cap:flask",
"relationshipType": "specialization",
"strength": 0.72
}
],
"embedding": [0.023, -0.156, 0.089, ...], // 768-dim vector
"status": "active",
"supplyCount": 45203, // 45k humans have this capability
"demandCount": 12456, // 12k tasks requested it
"trendDirection": "stable",
"semanticDrift": 0.15 // Low drift - Python is Python
}
Querying the Ontology
Exact match:
SELECT * FROM capabilities
WHERE canonical_name = 'Python Programming'
OR synonyms @> '["Python"]';
Semantic search (find capabilities similar to a query):
-- Find capabilities semantically similar to "AI safety auditing"
SELECT
c.id,
c.canonical_name,
c.description,
1 - (c.embedding <=> $query_embedding::vector) AS similarity
FROM capabilities c
WHERE 1 - (c.embedding <=> $query_embedding::vector) > 0.70 -- Min 70% similarity
ORDER BY c.embedding <=> $query_embedding::vector -- Cosine distance (lower = more similar)
LIMIT 20;
Graph traversal (find related capabilities):
-- Find all capabilities related to "Python"
WITH RECURSIVE related AS (
-- Start with Python
SELECT id, canonical_name, 0 AS depth
FROM capabilities
WHERE id = 'cap:python-programming'
UNION
-- Find capabilities related to capabilities we've found
SELECT c.id, c.canonical_name, r.depth + 1
FROM capabilities c
JOIN capability_relationships cr ON c.id = cr.to_capability_id
JOIN related r ON cr.from_capability_id = r.id
WHERE r.depth < 3 -- Max 3 hops
AND cr.strength > 0.5 -- Only strong relationships
)
SELECT DISTINCT * FROM related;
CAPABILITY DISCOVERY & EVOLUTION
The capability taxonomy must evolve continuously as new skills emerge and old ones become obsolete. This section describes how HUMAN discovers, validates, and tracks capability evolution.
The Capability Lifecycle
┌─────────────┐
│ Candidate │ ← Detected from tasks, resumes, training
└──────┬──────┘
│
↓ (Human curator approves)
┌─────────────┐
│ Emerging │ ← New capability, limited evidence
└──────┬──────┘
│
↓ (Usage > threshold)
┌─────────────┐
│ Active │ ← Mainstream capability, high demand/supply
└──────┬──────┘
│
↓ (Meaning changes over time)
┌─────────────┐
│ Evolving │ ← Definition shifting (e.g., "Web Dev" 2010→2025)
└──────┬──────┘
│
↓ (Demand drops, replaced by newer skills)
┌─────────────┐
│ Deprecated │ ← Obsolete (e.g., "Flash Development")
└─────────────┘
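One way to enforce this lifecycle is an explicit transition table, so invalid jumps (e.g. candidate → deprecated) can be rejected. The table below follows the diagram, plus an assumed evolving → active edge for capabilities whose definitions re-stabilize; treat it as a sketch, not canon.

```typescript
// Lifecycle states and the transitions the diagram above permits.
type LifecycleStatus =
  | "candidate"
  | "emerging"
  | "active"
  | "evolving"
  | "deprecated";

const ALLOWED_TRANSITIONS: Record<LifecycleStatus, LifecycleStatus[]> = {
  candidate: ["emerging"],            // human curator approves
  emerging: ["active"],               // usage crosses threshold
  active: ["evolving", "deprecated"], // meaning shifts, or demand collapses
  evolving: ["active", "deprecated"], // re-stabilizes (assumed), or replaced
  deprecated: [],                     // terminal state
};

function canTransition(from: LifecycleStatus, to: LifecycleStatus): boolean {
  return ALLOWED_TRANSITIONS[from].includes(to);
}
```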
Discovery Pipeline
HUMAN discovers new capabilities from four sources:
Source 1: Task Requests (Demand Side)
When enterprises submit tasks with capability requirements:
// Enterprise: "Need someone with prompt engineering expertise"
interface TaskRequest {
description: string;
requiredCapabilities: string[]; // ["prompt-engineering", "LLM-evaluation"]
}
// System detects: "prompt-engineering" not in taxonomy
async function processNewCapabilityCandidate(capabilityName: string) {
// 1. Check if already exists (exact or synonym match)
const existing = await findCapabilityBySynonym(capabilityName);
if (existing) {
await incrementDemandCount(existing.id);
return existing;
}
// 2. Check semantic similarity to existing capabilities
const embedding = await embeddingProvider.embed(capabilityName);
  const similar = await findSimilarCapabilities(embedding, { minSimilarity: 0.85 });
if (similar.length > 0) {
// High similarity - likely a synonym of existing capability
await addSynonym(similar[0].id, capabilityName);
return similar[0];
}
// 3. Genuinely new capability - create candidate
const candidate = {
id: generateId(),
canonicalName: capabilityName,
status: 'candidate',
source: 'task_request',
description: await llm.generate(`Describe the skill: ${capabilityName}`),
examples: await llm.generate(`Give 3 example tasks for: ${capabilityName}`),
embedding: embedding,
demandCount: 1
};
// 4. Queue for human curator approval
await queueForCuration(candidate);
return candidate;
}
Source 2: Human Self-Reports (Supply Side)
When humans add capabilities to their profiles:
// User adds: "I'm skilled in AI red-teaming"
async function processHumanCapabilityClaim(
passportId: string,
capabilityName: string,
evidence?: string
) {
// Similar flow to task requests
const capability = await findOrCreateCapability(capabilityName);
// Add to human's capability graph
await addCapabilityToGraph(passportId, capability.id, {
weight: 0.20, // Low weight - self-reported
verificationStatus: 'self_reported',
source: 'user_claim'
});
// Increment supply count
await incrementSupplyCount(capability.id);
}
Source 3: Academy Training Modules
When new Academy courses are created:
// New course: "RAG System Design"
async function createAcademyCourse(course: {
title: string;
description: string;
learningOutcomes: string[];
}) {
// Extract capabilities from learning outcomes
const capabilities = await llm.extractCapabilities(course.learningOutcomes);
// For each capability:
for (const cap of capabilities) {
const capability = await findOrCreateCapability(cap.name, {
status: 'active', // Academy-validated = auto-approve
description: cap.description,
examples: course.learningOutcomes,
source: 'academy'
});
// Link course to capability
await linkCourseToCapability(course.id, capability.id);
}
}
Source 4: Workforce Evidence (Revealed Preferences)
Analyze patterns in successful task completions:
// Batch job: Analyze task completion patterns
async function analyzeWorkforcePatterns() {
// Find tasks completed by humans with certain capability combinations
const patterns = await db.query(`
SELECT
array_agg(DISTINCT hc.capability_id) as capability_combo,
COUNT(*) as completion_count,
AVG(t.quality_score) as avg_quality
FROM task_completions tc
JOIN human_capabilities hc ON tc.human_passport_id = hc.human_passport_id
JOIN tasks t ON tc.task_id = t.id
WHERE tc.completed_at > NOW() - INTERVAL '30 days'
GROUP BY tc.task_id
HAVING COUNT(DISTINCT hc.capability_id) > 1 -- Multi-capability tasks
`);
// Look for emergent capability patterns
for (const pattern of patterns) {
if (pattern.completion_count > 50 && pattern.avg_quality > 0.8) {
// Frequent, high-quality pattern - might indicate emergent capability
const capabilityName = await llm.generate(
`What capability do these skills represent together: ${pattern.capability_combo}`
);
await processNewCapabilityCandidate(capabilityName);
}
}
}
Curation Workflow
New capability candidates require human review:
interface CurationTask {
candidateId: string;
proposedName: string;
description: string;
examples: string[];
source: 'task_request' | 'user_claim' | 'academy' | 'pattern_analysis';
usageCount: number; // How many times requested/claimed
// Curator actions
action?: 'approve' | 'merge' | 'reject';
mergeIntoCapabilityId?: string; // If merging with existing
curatorNotes?: string;
}
// Curator dashboard shows:
// - High-demand candidates (many task requests)
// - Pattern-detected capabilities (strong evidence)
// - Similar existing capabilities (to prevent duplicates)
Curation rules:
- Auto-approve if:
  - From Academy (trusted source)
  - Matches known skill taxonomies (LinkedIn, O*NET)
  - High usage count (>100 requests)
- Human review if:
  - Ambiguous or vague name
  - Potential duplicate/synonym
  - Low usage but interesting pattern
- Auto-reject if:
  - Spam/gibberish
  - Offensive content
  - Already exists as synonym
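The curation rules read naturally as a triage function. A hedged sketch, assuming the thresholds above and boolean pre-checks (spam detection, synonym lookup, taxonomy matching) computed elsewhere:

```typescript
// Illustrative triage of a capability candidate against the curation rules.
// Field names and the >100 threshold mirror the rules above; the boolean
// pre-checks are assumed to be produced by upstream services.
interface CandidateSignals {
  source: "task_request" | "user_claim" | "academy" | "pattern_analysis";
  usageCount: number;
  matchesKnownTaxonomy: boolean; // e.g. LinkedIn, O*NET lookup
  isSpam: boolean;               // spam/gibberish/offensive detector
  hasExistingSynonym: boolean;   // already in ontology under another name
}

function triageCandidate(
  c: CandidateSignals
): "auto_approve" | "auto_reject" | "human_review" {
  // Reject first: spam and duplicates never reach a curator.
  if (c.isSpam || c.hasExistingSynonym) return "auto_reject";
  // Trusted or well-evidenced candidates skip review.
  if (c.source === "academy" || c.matchesKnownTaxonomy || c.usageCount > 100) {
    return "auto_approve";
  }
  // Everything else goes to the curator dashboard.
  return "human_review";
}
```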
Tracking Capability Evolution
Skills change meaning over time. HUMAN tracks this evolution:
interface CapabilityEvolution {
capabilityId: string;
// Snapshot definitions over time
historicalDefinitions: {
dateRange: [Date, Date];
description: string;
relatedCapabilities: string[];
embedding: number[];
exampleTasks: string[];
}[];
// Drift metrics
semanticDrift: number; // 0-1, cosine distance between first and current embedding
definitionChangeRate: number; // How often description changes
relationshipChurn: number; // How often related capabilities change
// Trend analysis
demandTrend: {
direction: 'rising' | 'stable' | 'declining';
velocity: number; // Rate of change
peakDate?: Date; // When demand peaked (for declining skills)
};
// Morphing patterns
evolvesInto?: string[]; // "Ruby on Rails" → ["Full Stack", "Backend Engineering"]
replacedBy?: string[]; // "Flash" → ["JavaScript", "HTML5", "Canvas"]
}
Example: "Web Development" Evolution
{
capabilityId: "cap:web-development",
historicalDefinitions: [
{
dateRange: ["2010-01-01", "2015-12-31"],
description: "Build websites using HTML, CSS, JavaScript, and server-side languages like PHP or Ruby.",
relatedCapabilities: ["html", "css", "javascript", "jquery", "php", "mysql"],
embedding: [0.123, -0.456, ...], // 2010 embedding
exampleTasks: [
"Create responsive website layouts",
"Build contact forms with PHP",
"Implement jQuery animations"
]
},
{
dateRange: ["2016-01-01", "2020-12-31"],
description: "Build dynamic web applications using modern JavaScript frameworks, REST APIs, and Node.js.",
relatedCapabilities: ["react", "angular", "vue", "nodejs", "rest-api", "webpack"],
embedding: [0.234, -0.345, ...], // 2016 embedding (drift detected)
exampleTasks: [
"Build single-page applications with React",
"Create REST APIs with Express.js",
"Implement real-time features with WebSockets"
]
},
{
dateRange: ["2021-01-01", "present"],
description: "Build full-stack applications with modern frameworks (React, Next.js, TypeScript), serverless architectures, and AI integrations.",
relatedCapabilities: ["react", "nextjs", "typescript", "tailwind", "graphql", "vercel", "ai-apis"],
embedding: [0.345, -0.234, ...], // 2021 embedding (significant drift)
exampleTasks: [
"Build Next.js apps with TypeScript and Tailwind",
"Integrate AI APIs (OpenAI, Anthropic) into web apps",
"Deploy serverless functions to Vercel/Netlify",
"Implement GraphQL APIs with type safety"
]
}
],
semanticDrift: 0.67, // 67% change in meaning from 2010 to 2025
definitionChangeRate: 0.15, // Redefined ~15% per year
relationshipChurn: 0.82, // 82% of related capabilities changed
demandTrend: {
direction: 'rising',
velocity: 0.12 // 12% annual growth in demand
}
}
Drift detection triggers:
async function detectCapabilityDrift() {
// Every quarter, re-analyze capability definitions
const capabilities = await getActiveCapabilities();
for (const cap of capabilities) {
// Get current usage context (recent tasks, training, claims)
    const recentContext = await getRecentCapabilityContext(cap.id, { days: 90 });
// Generate new description from context
const newDescription = await llm.generate(
`Based on recent usage, describe the capability: ${cap.canonicalName}\n\nContext:\n${recentContext}`
);
// Generate new embedding
const newEmbedding = await embeddingProvider.embed(newDescription);
// Compare to current embedding
const similarity = cosineSimilarity(cap.embedding, newEmbedding);
const drift = 1 - similarity;
if (drift > 0.15) { // >15% drift threshold
// Create snapshot in history
await snapshotCapabilityDefinition(cap, newDescription, newEmbedding);
// Update capability
await updateCapability(cap.id, {
description: newDescription,
embedding: newEmbedding,
semanticDrift: calculateTotalDrift(cap)
});
// Notify curators of significant change
await notifyCurators({
type: 'capability_evolution',
capabilityId: cap.id,
drift: drift,
message: `"${cap.canonicalName}" has evolved significantly. Review for accuracy.`
});
}
}
}
Deprecation Strategy
When capabilities become obsolete:
async function evaluateDeprecation(capabilityId: string) {
  const capability = await getCapability(capabilityId);
  // Signals of obsolescence:
  const signals = {
    demandDropped: capability.demandCount < capability.historicalPeakDemand * 0.1, // <10% of peak
    noRecentUsage: capability.lastUsedAt < Date.now() - 365 * 24 * 60 * 60 * 1000, // 1+ year
    replacementExists: await hasReplacementCapability(capabilityId),
    industryTrend: await checkIndustryTrend(capability.canonicalName) // External data
  };
  if (signals.demandDropped && signals.noRecentUsage) {
    const replacements = await findReplacements(capabilityId);
    // Mark as deprecated
    await updateCapability(capabilityId, {
      status: 'deprecated',
      deprecatedAt: new Date(),
      replacementCapabilities: replacements
    });
    // Notify humans who have this capability
    await notifyHumansWithCapability(capabilityId, {
      message: `"${capability.canonicalName}" is becoming less relevant. Consider learning: ${replacements.join(', ')}`,
      suggestedTraining: await findRelevantCourses(replacements)
    });
  }
}
Example: Flash Development → Deprecated
{
"id": "cap:flash-development",
"canonicalName": "Adobe Flash Development",
"status": "deprecated",
"deprecatedAt": "2020-12-31",
"description": "Historical: Building interactive content and animations using Adobe Flash/ActionScript. No longer supported by major browsers.",
"replacementCapabilities": [
"cap:javascript",
"cap:html5-canvas",
"cap:webgl",
"cap:svg-animation"
],
"demandTrend": {
"direction": "declining",
"velocity": -0.95, // 95% decline
"peakDate": "2010-03-15"
},
"supplyCount": 234, // Still some humans with this skill
"demandCount": 2 // Almost zero demand
}
THREE-LAYER CAPABILITY ARCHITECTURE
The Capability Graph uses a three-layer architecture to balance developer UX (easy discovery) with trust semantics (structured routing).
The Three Layers
Layer 1: Canonical Capabilities (Global)
Purpose: Standardized capabilities used for trust-aware routing and attestations.
Characteristics:
- Owned and curated by HUMAN Foundation
- Full semantic ontology (embeddings, relationships, lifecycle)
- Used for HumanOS routing decisions
- Cross-org interoperability
- Governance tier: Canon
Examples:
- cap:ai-safety-evaluation
- cap:clinical-discharge-review
- cap:contract-review-standard
- cap:python-programming
Schema:
interface CanonicalCapability extends CapabilityDefinition {
id: string; // "cap:python-programming"
canonicalName: string;
status: 'emerging' | 'active' | 'evolving' | 'deprecated';
governanceApproved: boolean; // Requires curator approval
crossOrgUsage: number; // How many orgs use this capability
}
Layer 2: Org Capabilities (Scoped)
Purpose: Organization-specific capabilities that don't (yet) exist in the canonical ontology.
Characteristics:
- Namespaced: cap:org:&lt;orgSlug&gt;:&lt;slug&gt;
- Same schema as canonical capabilities (embeddings, relationships, etc.)
- Curated by org capability admins + Capability Janitor agent
- Can be promoted to canonical if widely adopted
- Used for org-local routing only
Examples:
- cap:org:acme:soc2-readiness-assessment
- cap:org:healthcorp:patient-intake-specialist
- cap:org:lawfirm:contract-negotiation-saas
Schema:
interface OrgCapability extends CapabilityDefinition {
id: string; // "cap:org:acme:soc2-readiness"
orgId: string;
status: 'draft' | 'active' | 'deprecated';
curatorApproved: boolean;
usageCount: number;
// Optional: Mapping to canonical
canonicalEquivalent?: string; // "cap:security-audit"
promotionCandidate?: boolean; // High usage, consider canonical promotion
}
Lifecycle:
Developer proposes
↓
Agent drafts definition
↓
Org admin approves (or auto-approve for low-risk)
↓
Active (available for routing)
↓
(If widely used across orgs)
↓
Promoted to canonical
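The final promotion step could be gated by a simple candidacy check. The thresholds below (50 uses, 3 requesting orgs) are illustrative assumptions, not protocol constants, and the field names are hypothetical.

```typescript
// Sketch: when does an org capability become a canonical-promotion candidate?
interface OrgCapabilityUsage {
  usageCount: number;             // routing/task uses within the org
  distinctOrgsRequesting: number; // other orgs asking for an equivalent skill
  curatorApproved: boolean;       // org admin has approved the definition
}

function isPromotionCandidate(cap: OrgCapabilityUsage): boolean {
  return (
    cap.curatorApproved &&
    cap.usageCount >= 50 &&
    cap.distinctOrgsRequesting >= 3
  );
}
```

Flagged candidates would still go through the canonical curation workflow described earlier; this check only decides when to surface them.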
Layer 3: Dev Labels (Freeform)
Purpose: Easy discovery and grouping without polluting the capability ontology.
Characteristics:
- Freeform strings or key/value pairs
- NO trust semantics - never used for routing decisions
- Used only for search, filtering, UI organization
- Can be added/removed freely
- No approval required
Examples:
const agent = {
id: 'agent:soc2-checker',
labels: ['soc2', 'compliance', 'audit', 'security'], // Freeform
capabilities: ['cap:security-audit', 'cap:compliance-review'] // Structured
};
const workflow = {
id: 'workflow:invoice-processing',
labels: ['finance', 'accounts-payable', 'automation'],
steps: [
{
requiredCapabilities: ['cap:invoice-extraction', 'cap:accounting-review']
}
]
};
Schema:
interface ResourceLabels {
resourceId: string; // Agent, workflow, human, etc.
labels: string[]; // Freeform strings
labelKV?: Record<string, string>; // Optional key-value labels
}
When to Use Each Layer
| Use Case | Layer | Example |
|---|---|---|
| Trust-aware routing | Canonical or Org | HumanOS routes high-risk task to humans with cap:medical-review >= 0.8 |
| Agent discovery | Labels | Search agents with ['patient', 'billing'] |
| Org-specific workflows | Org capabilities | Workflow requires cap:org:acme:hipaa-audit |
| Experimentation | Labels | Tag prototype agents with ['prototype', 'v2', 'beta'] |
| Cross-org semantics | Canonical | Workforce Cloud matches humans across enterprises using canonical capabilities |
Developer Experience
❌ Wrong: Creating micro-capabilities for everything
// BAD: Pollutes capability ontology
await capabilityGraph.createCapability({
name: 'Agent that checks SOC-2 readiness for SaaS companies using AWS',
// This is too specific and will never be reused
});
✅ Right: Use structured capabilities + labels
// GOOD: Reusable capabilities + discoverable labels
const agent = {
id: 'agent:soc2-checker',
capabilities: [
'cap:security-audit', // Canonical
'cap:compliance-review', // Canonical
'cap:org:acme:soc2-readiness' // Org-specific if truly unique
],
labels: ['soc2', 'saas', 'aws', 'compliance'], // Easy discovery
};
// Routing uses capabilities
await humanos.routeTask({
requiredCapabilities: ['cap:security-audit'],
minWeight: 0.7
});
// Search uses labels
await agentRegistry.search({
labels: ['soc2', 'aws'],
limit: 20
});
Capability vs Label Decision Tree
New requirement detected
│
├─ Is this a trust/routing decision?
│ ├─ YES → Use Capability (canonical or org)
│ └─ NO → Use Label
│
├─ Will this be used across multiple orgs?
│ ├─ YES → Canonical Capability
│ └─ NO → Org Capability or Label
│
└─ Is this a one-off descriptor?
├─ YES → Label
└─ NO → Capability
Integration with Capability Ontology
Canonical capabilities:
- Full lifecycle management (emerging → active → deprecated)
- Semantic embeddings for similarity search
- Relationships (prerequisite, complementary, etc.)
- Drift detection
- Governed by HUMAN Foundation
Org capabilities:
- Same lifecycle and semantics as canonical
- Governed by org capability admins
- Capability Janitor monitors for:
- Duplicates (suggest merge)
- High-usage candidates (suggest promotion to canonical)
- Stale capabilities (suggest deprecation)
Labels:
- No lifecycle (add/remove freely)
- No semantics (just strings)
- No governance (anyone can add)
- Used only for UX
Data Placement
| Layer | Storage | Access Control |
|---|---|---|
| Canonical capabilities | Global capability ontology table | Public read, curator write |
| Org capabilities | Tenant-scoped capability tables | Org-scoped read/write |
| Labels | Resource metadata tables | Resource owner read/write |
Benefits
For Developers:
- ✅ Easy discovery via labels
- ✅ No bureaucracy for freeform tags
- ✅ Reusable capabilities when needed
For HumanOS Routing:
- ✅ Structured capabilities for trust decisions
- ✅ No pollution from freeform descriptors
- ✅ Clear semantics for routing logic
For Capability Graph:
- ✅ Prevents capability sprawl
- ✅ Maintains semantic quality
- ✅ Enables cross-org interoperability
AGENT CAPABILITY PROFILES
Agents (not just humans) need capability profiles to enable capability-based agent routing in inter-agent workflows.
Schema
interface AgentCapabilityProfile {
// Identity
agentId: string; // "agent:invoice-processor"
agentName: string;
version: string;
// Capabilities (canonical or org)
capabilities: {
capabilityId: string;
weight: number; // 0.0-1.0, agent's proficiency
confidence: number; // How sure are we of this capability?
evidenceCount: number; // Task completions, human approvals
}[];
// Tools & Muscles
tools: string[]; // Muscles the agent can invoke
connectors: string[]; // External services (Stripe, Salesforce, etc.)
// Context
domains: string[]; // "healthcare", "finance", "legal"
languages: string[]; // Natural languages supported
// Trust
trustLevel: 'verified' | 'community' | 'experimental';
riskTier: 'low' | 'medium' | 'high' | 'critical';
certifications: string[]; // Org-issued certifications
// Permissions
permissions: string[]; // What the agent is allowed to do
// Performance
taskCompletions: number;
avgQualityScore: number;
avgLatency: number; // Milliseconds
failureRate: number; // 0.0-1.0
// Activity
lastActive: Date;
createdAt: Date;
updatedAt: Date;
}
Discovery
Agents are discovered via capability queries, just like humans:
// Find agent with capability
const agents = await capabilityGraph.findAgentsWithCapability(
'cap:contract-review',
{
minWeight: 0.8,
trustLevel: 'verified',
riskTier: ['low', 'medium']
}
);
// HumanOS routing (capability-first)
const routingDecision = await humanos.routeTask({
taskId: 'task_123',
requiredCapabilities: ['cap:invoice-extraction', 'cap:accounting-review'],
riskLevel: 'medium',
resourceTypes: ['agent', 'human'] // Consider both
});
// Result: Agent or human with best capability match
Capability Evolution for Agents
Agents gain capability evidence from:
1. Task Outcomes
// After agent completes task
await capabilityGraph.submitEvidence({
resourceType: 'agent',
resourceId: 'agent:invoice-processor',
evidenceType: 'task_completion',
taskId: 'task_123',
capabilitiesDemonstrated: [
{
capabilityId: 'cap:invoice-extraction',
performanceScore: 0.94, // Quality of work
reviewerDid: 'did:human:reviewer_xyz' // Human who approved
}
],
context: {
complexity: 'medium',
timeToComplete: 1200 // ms
}
});
// Capability Graph updates agent's capability weight
// Based on multi-source evidence algorithm (just like humans)
2. Human Approvals
// Agent proposes action → Human approves
await capabilityGraph.submitEvidence({
resourceType: 'agent',
resourceId: 'agent:contract-reviewer',
evidenceType: 'human_approval',
capabilitiesDemonstrated: [
{
capabilityId: 'cap:contract-review',
approvedBy: 'did:human:legal_counsel',
confidence: 0.95 // Human's confidence in agent's work
}
]
});
3. Org Attestations
// Org certifies agent for specific domain
await capabilityGraph.submitEvidence({
resourceType: 'agent',
resourceId: 'agent:medical-triage',
evidenceType: 'attestation',
capabilitiesDemonstrated: [
{
capabilityId: 'cap:medical-triage',
attestedBy: 'did:org:healthcorp',
attestationLevel: 'certified',
validUntil: '2026-12-31'
}
]
});
Agent Manifest Integration
Agents declare capabilities in their manifest (YAML or SDK):
# agent-manifest.yaml
agent:
id: invoice-processor
name: Invoice Processing Agent
version: 1.2.0
capabilities:
- cap:invoice-extraction
- cap:data-validation
- cap:accounting-review
tools:
- stripe-connector
- quickbooks-connector
- email-sender
domains:
- finance
- accounts-payable
trustLevel: verified
riskTier: medium
permissions:
- read:invoices
- write:invoice-records
- call:accounting-review-agents
At registration, the agent manifest is parsed into an AgentCapabilityProfile:
// Agent SDK does this automatically
const manifest = await loadManifest('agent-manifest.yaml');
const profile = await capabilityGraph.registerAgent({
agentId: manifest.agent.id,
capabilities: manifest.agent.capabilities.map(capId => ({
capabilityId: capId,
weight: 0.50, // Initial weight (no evidence yet)
confidence: 0.70,
evidenceCount: 0
})),
tools: manifest.agent.tools,
domains: manifest.agent.domains,
trustLevel: manifest.agent.trustLevel,
riskTier: manifest.agent.riskTier,
permissions: manifest.agent.permissions
});
Inter-Agent Capability Routing
When an agent needs to delegate to another agent:
// Inside an agent
export const handler = async (ctx: AgentContext) => {
// Agent A needs help from agent with contract review capability
const result = await ctx.call.agent('cap:contract-review', {
contract: ctx.input.contract,
requiredCapabilityWeight: 0.8,
riskLevel: 'high'
});
// HumanOS routing engine:
// 1. Queries Capability Graph for agents with 'cap:contract-review >= 0.8'
// 2. Filters by risk tier (high-risk may require human)
// 3. Routes to best match (agent or human)
// 4. Logs provenance
return result;
};
Benefits
For HumanOS:
- Unified routing for humans AND agents
- Capability-based agent discovery
- Trust-aware agent selection
For Developers:
- Declare capabilities in manifest
- Automatic capability tracking
- No manual capability management
For Agents:
- Capability weights evolve with performance
- Trust level increases with successful tasks
- Clear path to "verified" status
CAPABILITY JANITOR (ANTI-SPRAWL)
As org-specific capabilities grow, sprawl becomes a problem. The Capability Janitor is an automated agent that:
- Detects duplicate or near-duplicate capabilities
- Suggests merges and aliases
- Identifies stale capabilities
- Proposes promotions to canonical
Purpose
Prevent capability graph pollution by:
- Clustering similar capabilities (embedding-based)
- Detecting duplicates (suggest merge)
- Flagging stale capabilities (no usage in 90+ days)
- Identifying promotion candidates (high cross-org usage)
Algorithm
async function runCapabilityJanitor(orgId: string) {
// 1. Get all org capabilities
const orgCaps = await getOrgCapabilities(orgId, { status: 'active' });
// 2. Cluster by embedding similarity
const clusters = await clusterByEmbedding(orgCaps, {
threshold: 0.90, // 90%+ similarity = potential duplicate
method: 'cosine'
});
// 3. For each cluster with multiple capabilities, suggest merge
for (const cluster of clusters) {
if (cluster.capabilities.length > 1) {
// Primary = highest usage (copy before sorting so the cluster isn't mutated)
const sorted = [...cluster.capabilities].sort((a, b) => b.usageCount - a.usageCount);
const primary = sorted[0];
const duplicates = sorted.slice(1);
await suggestMerge({
orgId,
primary: primary.id,
duplicates: duplicates.map(d => d.id),
reason: `High semantic similarity (avg ${cluster.avgSimilarity})`,
impact: {
affectedAgents: await countAgentsUsingCapabilities(duplicates.map(d => d.id)),
affectedWorkflows: await countWorkflowsUsingCapabilities(duplicates.map(d => d.id))
},
suggestedAction: 'merge_into_primary_and_create_aliases'
});
}
}
// 4. Flag stale capabilities (no usage in 90 days)
const staleThreshold = Date.now() - (90 * 24 * 60 * 60 * 1000);
const stale = orgCaps.filter(c => c.lastUsed.getTime() < staleThreshold);
for (const cap of stale) {
await suggestDeprecation({
orgId,
capabilityId: cap.id,
reason: 'No usage in 90 days',
usageHistory: await getCapabilityUsageHistory(cap.id, { days: 180 }),
suggestedAction: cap.usageCount > 0
? 'archive_with_migration_path' // Used before, might return
: 'delete' // Never used, safe to remove
});
}
// 5. Identify high-usage org caps → suggest canonical promotion
const highUsage = orgCaps.filter(c =>
c.usageCount > 100 && // Used frequently
c.crossAgentUsage > 10 // Used by many agents
);
for (const cap of highUsage) {
// Check if similar canonical capability exists
const canonicalSimilar = await findCanonicalCapabilitySimilar(cap.embedding, { threshold: 0.85 });
if (canonicalSimilar.length === 0) {
// No similar canonical → suggest promotion
await suggestCanonicalPromotion({
orgId,
capabilityId: cap.id,
reason: 'High usage, no canonical equivalent',
usageStats: {
usageCount: cap.usageCount,
crossAgentUsage: cap.crossAgentUsage,
avgQuality: await getAvgCapabilityQuality(cap.id)
},
suggestedAction: 'promote_to_canonical_with_governance_review'
});
} else {
// Similar canonical exists → suggest mapping
await suggestCanonicalMapping({
orgId,
capabilityId: cap.id,
canonicalId: canonicalSimilar[0].id,
similarity: canonicalSimilar[0].similarity,
suggestedAction: 'map_org_cap_to_canonical_and_deprecate'
});
}
}
}
Clustering Algorithm
interface CapabilityCluster {
capabilities: OrgCapability[];
avgSimilarity: number;
centroid: number[]; // Average embedding
}
async function clusterByEmbedding(
capabilities: OrgCapability[],
options: { threshold: number; method: 'cosine' | 'euclidean' }
): Promise<CapabilityCluster[]> {
const clusters: CapabilityCluster[] = [];
const visited = new Set<string>();
for (const cap of capabilities) {
if (visited.has(cap.id)) continue;
// Find all capabilities similar to this one
const similar = capabilities.filter(other => {
if (visited.has(other.id)) return false;
const similarity = cosineSimilarity(cap.embedding, other.embedding);
return similarity >= options.threshold;
});
if (similar.length > 1) {
// Calculate cluster centroid
const embeddings = similar.map(c => c.embedding);
const centroid = averageEmbedding(embeddings);
// Calculate average pairwise similarity
const similarities = [];
for (let i = 0; i < similar.length; i++) {
for (let j = i + 1; j < similar.length; j++) {
similarities.push(cosineSimilarity(similar[i].embedding, similar[j].embedding));
}
}
const avgSimilarity = similarities.reduce((a, b) => a + b, 0) / similarities.length;
clusters.push({
capabilities: similar,
avgSimilarity,
centroid
});
// Mark as visited
similar.forEach(c => visited.add(c.id));
}
}
return clusters;
}
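The clustering code assumes `cosineSimilarity` and `averageEmbedding` helpers that aren't defined above. A minimal self-contained sketch of both:

```typescript
// Cosine similarity between two embedding vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom; // Guard zero vectors
}

// Element-wise mean of a set of embeddings (the cluster centroid).
function averageEmbedding(embeddings: number[][]): number[] {
  const dim = embeddings[0].length;
  const centroid = new Array(dim).fill(0);
  for (const e of embeddings) {
    for (let i = 0; i < dim; i++) centroid[i] += e[i] / embeddings.length;
  }
  return centroid;
}
```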
Admin Workflow
Org admins receive quarterly Capability Janitor reports:
interface CapabilityJanitorReport {
orgId: string;
reportDate: Date;
// Suggested merges
merges: {
primary: OrgCapability;
duplicates: OrgCapability[];
reason: string;
impact: { affectedAgents: number; affectedWorkflows: number };
action: 'approve' | 'reject' | 'defer';
}[];
// Stale capabilities
staleCapabilities: {
capability: OrgCapability;
daysSinceLastUse: number;
suggestedAction: 'archive' | 'delete';
action: 'approve' | 'reject' | 'defer';
}[];
// Promotion candidates
promotionCandidates: {
capability: OrgCapability;
usageStats: { usageCount: number; crossAgentUsage: number; avgQuality: number };
suggestedCanonicalName: string;
action: 'nominate' | 'reject' | 'defer';
}[];
// Canonical mappings
canonicalMappings: {
orgCapability: OrgCapability;
canonicalCapability: CanonicalCapability;
similarity: number;
action: 'approve_mapping' | 'reject' | 'defer';
}[];
}
Admin dashboard:
- Review suggestions one-by-one
- Approve/reject with one click
- Bulk actions for obvious cases
- Defer for manual review
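When an admin approves a merge, a handler along these lines could apply it. This is a sketch: `createAlias` and `deprecateCapability` are hypothetical store operations standing in for whatever persistence layer backs the capability tables.

```typescript
// Hypothetical handler for an approved merge suggestion.
interface MergeDecision {
  primary: string;
  duplicates: string[];
  action: 'approve' | 'reject' | 'defer';
}

async function applyMergeDecision(
  decision: MergeDecision,
  store: {
    createAlias: (from: string, to: string) => Promise<void>;
    deprecateCapability: (id: string) => Promise<void>;
  }
): Promise<string[]> {
  if (decision.action !== 'approve') return []; // Reject/defer: no-op
  const merged: string[] = [];
  for (const dup of decision.duplicates) {
    // Alias first so routing never sees a gap, then deprecate the duplicate.
    await store.createAlias(dup, decision.primary);
    await store.deprecateCapability(dup);
    merged.push(dup);
  }
  return merged;
}
```

Aliasing before deprecation matches the `merge_into_primary_and_create_aliases` action suggested by the Janitor: existing agents and workflows referencing a duplicate keep resolving through the alias.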
Auto-Merge Rules
Some cases can be auto-merged without human approval:
async function shouldAutoMerge(cluster: CapabilityCluster): Promise<boolean> {
// Auto-merge if:
// 1. Very high similarity (>95%)
// 2. Low impact (affects <5 agents)
// 3. Recent creation (all capabilities created in last 30 days)
const veryHighSimilarity = cluster.avgSimilarity > 0.95;
const lowImpact = await countAgentsUsingCapabilities(
cluster.capabilities.map(c => c.id)
) < 5;
const recentCreation = cluster.capabilities.every(c =>
Date.now() - c.createdAt.getTime() < (30 * 24 * 60 * 60 * 1000)
);
return veryHighSimilarity && lowImpact && recentCreation;
}
Execution Schedule
// Capability Janitor runs:
// - Weekly: Duplicate detection (quick wins)
// - Monthly: Stale capability flagging
// - Quarterly: Canonical promotion suggestions
// - Ad-hoc: On-demand when org cap count > threshold
const schedule = {
duplicateDetection: 'weekly',
staleDetection: 'monthly',
promotionSuggestions: 'quarterly',
onDemand: (orgCapCount) => orgCapCount > 100
};
Benefits
For Org Admins:
- ✅ Automatic cleanup suggestions
- ✅ No manual capability management needed
- ✅ Prevents sprawl before it's a problem
For Capability Graph:
- ✅ Maintains semantic quality
- ✅ Prevents duplicate capabilities
- ✅ Identifies canonical candidates
For Developers:
- ✅ Cleaner capability search
- ✅ Fewer "which capability do I use?" decisions
- ✅ Auto-aliasing handles edge cases
DATA RESIDENCY BY DEPLOYMENT PROFILE
The Capability Graph respects data sovereignty across all three deployment profiles.
Data Placement
| Data Type | Hosted | Hybrid | Self-Hosted |
|---|---|---|---|
| Capability ontology | HUMAN Cloud (global) | Mirrored to customer | Customer-controlled |
| Personal capability graphs | Encrypted vault (HUMAN) | Customer vault | Customer vault |
| Capability evidence | Encrypted vault (HUMAN) | Customer vault | Customer vault |
| Attestations | Ledger (HUMAN-managed) | Customer ledger + optional federation | Customer ledger (air-gapped or federated) |
| Capability queries | HUMAN Cloud | Customer edge/cloud | Customer edge/cloud |
| Org capabilities | HUMAN Cloud (tenant-scoped) | Customer database | Customer database |
| Agent capability profiles | HUMAN Cloud (tenant-scoped) | Customer database | Customer database |
Sync Model: Device → Edge → Regional Cloud
The Capability Graph operates in a tiered sync model to support offline operation and low-latency queries:
┌─────────────────────────────────────────────────────────────────┐
│ CAPABILITY GRAPH SYNC │
├─────────────────────────────────────────────────────────────────┤
│ │
│ DEVICE (Offline-Capable) │
│ ├─ Full personal capability graph │
│ ├─ Cached org capabilities (relevant to user) │
│ ├─ Cached canonical ontology (for search) │
│ └─ Pending evidence submissions (queued for sync) │
│ │ │
│ ↓ (sync when online) │
│ │
│ EDGE (CDN / Regional) │
│ ├─ Cached capability profiles (for routing) │
│ ├─ Canonical ontology (full, frequently refreshed) │
│ ├─ Org capability summaries (for fast lookup) │
│ └─ Recent evidence cache (TTL 5 min) │
│ │ │
│ ↓ (complex queries) │
│ │
│ REGIONAL CLOUD (Authoritative) │
│ ├─ Full capability ontology (canonical + org) │
│ ├─ All personal capability graphs (encrypted) │
│ ├─ Capability evidence store │
│ ├─ Capability Graph inference engine │
│ └─ Cross-region sync (eventual consistency) │
│ │
└─────────────────────────────────────────────────────────────────┘
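The device tier's "pending evidence submissions" queue can be sketched as a simple offline buffer that drains on reconnect. The class and the `upload` callback are illustrative names; in practice the queue would persist to the encrypted vault rather than memory.

```typescript
// Illustrative offline queue for the device tier in the diagram above.
interface PendingEvidence {
  passportDid: string;
  evidenceType: string;
  payload: unknown;
}

class EvidenceQueue {
  private pending: PendingEvidence[] = [];

  enqueue(ev: PendingEvidence): void {
    this.pending.push(ev); // Persisted to the encrypted vault in practice
  }

  // Drain on reconnect; items whose upload fails stay queued for next sync.
  async flush(upload: (ev: PendingEvidence) => Promise<boolean>): Promise<number> {
    const stillPending: PendingEvidence[] = [];
    let synced = 0;
    for (const ev of this.pending) {
      if (await upload(ev)) synced++;
      else stillPending.push(ev);
    }
    this.pending = stillPending;
    return synced;
  }

  get size(): number { return this.pending.length; }
}
```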
Data Flows
Evidence Submission (Device → Cloud)
// On device (can work offline)
await capabilityGraph.submitEvidence({
passportDid: 'did:human:abc123',
evidenceType: 'task_completion',
capabilitiesDemonstrated: [
{ capabilityId: 'cap:python', performanceScore: 0.91 }
],
// Queue if offline
syncStrategy: 'queue_if_offline'
});
// Evidence stored locally in encrypted vault
// Synced to regional cloud when online
// Capability weights updated in cloud
// Updated profile synced back to device
Capability Query (Edge-First)
// HumanOS routing query (edge-first)
const matches = await capabilityGraph.findResourcesWithCapability(
'cap:medical-review',
{ minWeight: 0.8, resourceTypes: ['human', 'agent'] }
);
// Execution:
// 1. Check edge cache (if profiles cached)
// 2. If cache miss or stale → query regional cloud
// 3. Cache result at edge for future queries
Hybrid Profile: Data Never Leaves VPC
In Hybrid deployment, capability data stays in customer infrastructure:
// Capability Graph configuration (Hybrid)
const config = {
deploymentProfile: 'hybrid',
// Ontology: Mirror from HUMAN Cloud (read-only)
ontology: {
source: 'mirror',
syncFrom: 'https://ontology.human.ai',
syncInterval: '1h',
localPath: '/var/lib/human/ontology'
},
// Personal graphs: Customer vault (read/write)
personalGraphs: {
storage: 'customer_vault',
encryption: 'customer_keys',
backups: 'customer_controlled'
},
// Org capabilities: Customer database
orgCapabilities: {
storage: 'customer_database',
endpoint: 'postgres.acme.internal'
},
// Evidence: Customer vault
evidence: {
storage: 'customer_vault',
retentionPolicy: 'customer_defined'
},
// Attestations: Customer ledger (optional federation)
attestations: {
storage: 'customer_ledger',
federation: {
enabled: true, // Can federate with HUMAN public ledger
mode: 'write_only' // Push hashes to HUMAN, keep data local
}
}
};
Self-Hosted Profile: Full Air-Gap Support
In Self-Hosted deployment, no external connectivity required:
// Capability Graph configuration (Self-Hosted, Air-Gapped)
const config = {
deploymentProfile: 'selfhosted',
// Ontology: Customer fork (can diverge from HUMAN)
ontology: {
source: 'customer_fork',
initialImport: 'canonical_v1.0.0', // One-time import
updates: 'manual', // Controlled updates via USB/sneakernet
customCapabilities: 'allowed' // Customer can extend ontology
},
// All data stays on-prem
personalGraphs: { storage: 'on_prem_vault' },
orgCapabilities: { storage: 'on_prem_database' },
evidence: { storage: 'on_prem_vault' },
attestations: { storage: 'on_prem_ledger', federation: { enabled: false } },
// No external connectivity
externalConnectivity: {
enabled: false,
updateChannel: 'disabled',
telemetry: 'disabled'
}
};
Privacy Guarantees
| Profile | Personal Data Location | Org Data Location | Ontology Location |
|---|---|---|---|
| Hosted | Encrypted HUMAN vault | Tenant-scoped HUMAN DB | HUMAN Cloud |
| Hybrid | Customer vault | Customer DB | Mirrored to customer |
| Self-Hosted | Customer vault | Customer DB | Customer-controlled |
Key Principle: In Hybrid and Self-Hosted profiles, capability data never sticks to HUMAN infrastructure — it remains portable and customer-controlled.
Compliance
This architecture supports:
- ✅ GDPR (data residency in EU)
- ✅ HIPAA (healthcare data stays in customer VPC)
- ✅ CCPA (California data residency)
- ✅ FedRAMP (air-gapped self-hosted)
- ✅ DoD (ITAR compliance via self-hosted)
SEMANTIC CAPABILITY MATCHING
Traditional keyword matching fails for capability-based routing. "Machine Learning" doesn't match "ML", "Python Developer" doesn't match "Python Programming". HUMAN uses semantic matching powered by embeddings.
The Matching Problem
Naive string matching:
// Task requires: ["machine-learning", "healthcare"]
// Human A has: ["ML", "medical-data-analysis"]
// Result: NO MATCH ❌ (even though human is perfect!)
// Human B has: ["machine-learning", "finance"]
// Result: PARTIAL MATCH (but wrong domain)
Semantic matching:
// Task embedding: [0.234, -0.567, ...]
// Human A capabilities:
// - "ML" embedding: [0.245, -0.554, ...] → 96% similar to "machine-learning"
// - "medical-data-analysis" embedding: [0.123, 0.456, ...] → 89% similar to "healthcare"
// Result: STRONG MATCH ✅ (95% overall)
// Human B capabilities:
// - "machine-learning" embedding: [0.234, -0.567, ...] → 100% similar
// - "finance" embedding: [-0.345, 0.123, ...] → 25% similar to "healthcare"
// Result: WEAK MATCH (62% overall) ⚠️
Semantic Matching Algorithm
interface CapabilityMatchResult {
human: Human;
overallScore: number; // 0-1, aggregate match quality
capabilityMatches: {
requiredCapability: string;
matchedCapability: CapabilityNode;
similarity: number; // 0-1, semantic similarity
weight: number; // Human's capability weight
combinedScore: number; // similarity * weight
}[];
weakestLink: number; // Minimum similarity across required capabilities
strengths: string[]; // Areas where human excels
gaps: string[]; // Missing or weak capabilities
}
async function matchHumansToTask(
taskRequirements: {
requiredCapabilities: string[];
minSimilarity?: number; // Default 0.70
minWeight?: number; // Default 0.50
domainContext?: string; // E.g., "healthcare", "finance"
},
candidateHumans: Human[]
): Promise<CapabilityMatchResult[]> {
const minSim = taskRequirements.minSimilarity ?? 0.70;
const minWeight = taskRequirements.minWeight ?? 0.50;
// 1. Get embeddings for required capabilities
const requiredEmbeddings = await Promise.all(
taskRequirements.requiredCapabilities.map(async cap => {
// Try exact match first
const existing = await getCapabilityByName(cap);
if (existing) return { name: cap, embedding: existing.embedding };
// Generate embedding for ad-hoc capability name
return { name: cap, embedding: await embeddingProvider.embed(cap) };
})
);
// 2. For each human, compute match score
const matches = await Promise.all(
candidateHumans.map(async human => {
const humanCapabilities = await getHumanCapabilityGraph(human.passportId);
// For each required capability, find best match in human's graph
const capabilityMatches = requiredEmbeddings.map(req => {
// Find human's capability with highest semantic similarity
const bestMatch = humanCapabilities.nodes.reduce(
(best, humanCap) => {
const similarity = cosineSimilarity(req.embedding, humanCap.embedding);
const combinedScore = similarity * humanCap.weight; // Factor in capability weight
return combinedScore > best.combinedScore
? { capability: humanCap, similarity, combinedScore }
: best;
},
{ capability: null, similarity: 0, combinedScore: 0 }
);
return {
requiredCapability: req.name,
matchedCapability: bestMatch.capability,
similarity: bestMatch.similarity,
weight: bestMatch.capability?.weight ?? 0,
combinedScore: bestMatch.combinedScore
};
});
// 3. Aggregate scoring
const validMatches = capabilityMatches.filter(m =>
m.similarity >= minSim && m.weight >= minWeight
);
// All required capabilities must have valid matches
if (validMatches.length < requiredEmbeddings.length) {
return null; // Human missing critical capabilities
}
// Overall score: weighted average of combined scores
const overallScore = validMatches.reduce((sum, m) => sum + m.combinedScore, 0) / validMatches.length;
// Weakest link: minimum similarity (chain is only as strong as weakest link)
const weakestLink = Math.min(...validMatches.map(m => m.similarity));
// Identify strengths and gaps
const strengths = capabilityMatches
.filter(m => m.combinedScore > 0.85)
.map(m => m.matchedCapability.name);
const gaps = capabilityMatches
.filter(m => m.combinedScore < 0.60)
.map(m => m.requiredCapability);
return {
human,
overallScore,
capabilityMatches,
weakestLink,
strengths,
gaps
};
})
);
// 4. Filter nulls (humans who don't meet requirements) and sort by score
return matches
.filter(m => m !== null)
.sort((a, b) => b.overallScore - a.overallScore);
}
Match Quality Tiers
function categorizeMatchQuality(match: CapabilityMatchResult): string {
if (match.overallScore >= 0.90 && match.weakestLink >= 0.85) {
return 'exceptional'; // Perfect fit, no weaknesses
} else if (match.overallScore >= 0.80 && match.weakestLink >= 0.70) {
return 'strong'; // Very good fit, minor gaps acceptable
} else if (match.overallScore >= 0.70 && match.weakestLink >= 0.60) {
return 'adequate'; // Meets requirements, some training may help
} else if (match.overallScore >= 0.60) {
return 'marginal'; // Risky, significant gaps
} else {
return 'poor'; // Should not be routed
}
}
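Applied to concrete results, the tiers fall out directly from the two fields the function reads. The sketch below is a trimmed restatement of the tiering logic above, using the scores from the healthcare routing example later in this section.

```typescript
// Trimmed restatement of the tiering above: only the two fields it reads.
function tier(m: { overallScore: number; weakestLink: number }): string {
  if (m.overallScore >= 0.90 && m.weakestLink >= 0.85) return 'exceptional';
  if (m.overallScore >= 0.80 && m.weakestLink >= 0.70) return 'strong';
  if (m.overallScore >= 0.70 && m.weakestLink >= 0.60) return 'adequate';
  if (m.overallScore >= 0.60) return 'marginal';
  return 'poor';
}

// The two candidates from the healthcare example in this section:
tier({ overallScore: 0.94, weakestLink: 0.88 }); // 'exceptional' (Sarah)
tier({ overallScore: 0.78, weakestLink: 0.65 }); // 'adequate' (Mike)
```

Note the weakest-link gate: Mike's 0.78 overall would otherwise read as borderline-strong, but his 0.65 minimum similarity caps him at 'adequate'.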
Domain-Aware Matching
Context matters. "Data analysis" in healthcare ≠ "data analysis" in finance:
async function domainAwareMatching(
taskRequirements: {
requiredCapabilities: string[];
domainContext: string; // "healthcare", "finance", "legal", etc.
},
candidateHumans: Human[]
): Promise<CapabilityMatchResult[]> {
// 1. Standard semantic matching
const baseMatches = await matchHumansToTask(taskRequirements, candidateHumans);
// 2. Apply domain boost/penalty
const domainAdjustedMatches = baseMatches.map(match => {
const humanDomains = extractDomains(match.human.capabilityGraph);
// Check if human has experience in the required domain
const domainExperience = humanDomains.find(d =>
d.domain === taskRequirements.domainContext
);
if (domainExperience) {
// Boost score for domain expertise
const domainBoost = domainExperience.weight * 0.15; // Up to 15% boost
match.overallScore = Math.min(1.0, match.overallScore + domainBoost);
match.strengths.push(`${taskRequirements.domainContext} domain expertise`);
} else {
// Penalty for lack of domain experience
const domainPenalty = 0.10; // 10% penalty
match.overallScore = Math.max(0, match.overallScore - domainPenalty);
match.gaps.push(`Limited ${taskRequirements.domainContext} experience`);
}
return match;
});
// 3. Re-sort after domain adjustment
return domainAdjustedMatches.sort((a, b) => b.overallScore - a.overallScore);
}
Fuzzy Synonym Detection
Handle variations in capability names:
async function fuzzyCapabilityMatch(
queryCapability: string,
threshold: number = 0.85
): Promise<CapabilityDefinition[]> {
// 1. Exact synonym match
const exactMatch = await db.query(`
SELECT * FROM capabilities
WHERE canonical_name = $1
OR synonyms @> $2::jsonb
`, [queryCapability, JSON.stringify([queryCapability])]);
if (exactMatch.length > 0) return exactMatch;
// 2. Fuzzy text match (Levenshtein distance)
const fuzzyTextMatch = await db.query(`
SELECT *, levenshtein(canonical_name, $1) AS distance
FROM capabilities
WHERE levenshtein(canonical_name, $1) < 3 -- Max 2 character difference
ORDER BY distance
LIMIT 5
`, [queryCapability]);
if (fuzzyTextMatch.length > 0) return fuzzyTextMatch;
// 3. Semantic similarity (embedding-based)
const queryEmbedding = await embeddingProvider.embed(queryCapability);
const semanticMatch = await db.query(`
SELECT
*,
1 - (embedding <=> $1::vector) AS similarity
FROM capabilities
WHERE 1 - (embedding <=> $1::vector) > $2
ORDER BY embedding <=> $1::vector
LIMIT 10
`, [queryEmbedding, threshold]);
return semanticMatch;
}
Real-World Example: Healthcare Task Routing
// Task: Medical record review for triage
const taskRequirements = {
requiredCapabilities: [
"healthcare-triage",
"medical-record-analysis",
"HIPAA-compliance"
],
domainContext: "healthcare",
minSimilarity: 0.75,
minWeight: 0.70 // High bar for healthcare
};
const candidateHumans = await getAvailableHumans();
const matches = await domainAwareMatching(taskRequirements, candidateHumans);
// Results:
[
{
human: { passportId: "did:human:sarah-rn", displayName: "Sarah J." },
overallScore: 0.94,
weakestLink: 0.88,
capabilityMatches: [
{
requiredCapability: "healthcare-triage",
matchedCapability: { name: "Clinical Triage", weight: 0.92 },
similarity: 0.96,
combinedScore: 0.88
},
{
requiredCapability: "medical-record-analysis",
matchedCapability: { name: "EHR Review", weight: 0.85 },
similarity: 0.91,
combinedScore: 0.77
},
{
requiredCapability: "HIPAA-compliance",
matchedCapability: { name: "HIPAA Certified", weight: 0.95 },
similarity: 0.98, // Nearly perfect match
combinedScore: 0.93
}
],
strengths: [
"Clinical Triage",
"HIPAA Certified",
"Healthcare domain expertise"
],
gaps: []
},
{
human: { passportId: "did:human:mike-emt", displayName: "Mike T." },
overallScore: 0.78,
weakestLink: 0.65,
capabilityMatches: [
{
requiredCapability: "healthcare-triage",
matchedCapability: { name: "Emergency Medical Response", weight: 0.88 },
similarity: 0.82,
combinedScore: 0.72
},
{
requiredCapability: "medical-record-analysis",
matchedCapability: { name: "Patient Assessment", weight: 0.70 },
similarity: 0.72,
combinedScore: 0.50 // Weaker here
},
{
requiredCapability: "HIPAA-compliance",
matchedCapability: { name: "Healthcare Privacy Training", weight: 0.65 },
similarity: 0.88,
combinedScore: 0.57
}
],
strengths: ["Emergency Medical Response"],
gaps: ["Limited medical record analysis experience"]
}
]
// Sarah (RN) routed to task, Mike (EMT) held in reserve
CAPABILITY-BASED ACCESS CONTROL
Capabilities don't just route work; they also gate access to resources, knowledge, and privileges. This section defines how HUMAN uses capabilities for fine-grained access control.
Access Control Model
Traditional access control: "Is user X allowed to access resource Y?"
Capability-based access control: "Does user X have the required capabilities to access resource Y?"
interface AccessPolicy {
resourceId: string; // What's being protected
resourceType: 'kb_document' | 'task_tier' | 'system_feature' | 'data_set' | 'api_endpoint';
// Capability requirements
requiredCapabilities: {
capabilityId: string;
minWeight: number; // Minimum capability weight (0-1)
minVerification?: VerificationStatus; // Minimum verification level
}[];
// Logical operators
operator: 'AND' | 'OR' | 'THRESHOLD'; // How to combine requirements
threshold?: number; // For THRESHOLD: how many capabilities needed (e.g., "2 of 3")
// Additional constraints
constraints?: {
requiredPassportKind?: PassportKind[]; // E.g., only Founders can access
minTrustLevel?: number; // Overall trust score
geographicRestrictions?: string[]; // Jurisdictional limits
timeRestrictions?: {
allowedHours?: string; // E.g., "09:00-17:00"
allowedDays?: string[]; // E.g., ["Monday", "Tuesday"]
};
};
// Audit
createdBy: PassportId;
createdAt: Date;
updatedAt: Date;
rationale: string; // Why this policy exists
}
Access Check Algorithm
async function checkAccess(
actor: Passport,
resourceId: string
): Promise<AccessDecision> {
// 1. Get access policy for resource
const policy = await getAccessPolicy(resourceId);
if (!policy) {
// No policy = default deny
return { allowed: false, reason: 'No access policy defined' };
}
// 2. Check passport kind and constraints
if (policy.constraints) {
if (policy.constraints.requiredPassportKind &&
!policy.constraints.requiredPassportKind.includes(actor.kind)) {
return { allowed: false, reason: 'Passport kind not authorized' };
}
// Check time restrictions, geo restrictions, etc.
const constraintCheck = await evaluateConstraints(actor, policy.constraints);
if (!constraintCheck.passed) {
return { allowed: false, reason: constraintCheck.reason };
}
}
// 3. Get actor's capabilities
const actorCapabilities = await getHumanCapabilityGraph(actor.id);
// 4. Check each required capability
const capabilityChecks = policy.requiredCapabilities.map(req => {
const actorCap = actorCapabilities.nodes.find(c => c.id === req.capabilityId);
if (!actorCap) {
return {
capability: req.capabilityId,
satisfied: false,
reason: 'Capability not present'
};
}
if (actorCap.weight < req.minWeight) {
return {
capability: req.capabilityId,
satisfied: false,
reason: `Capability weight ${actorCap.weight} below required ${req.minWeight}`
};
}
if (req.minVerification &&
!meetsVerificationRequirement(actorCap.verificationStatus, req.minVerification)) {
return {
capability: req.capabilityId,
satisfied: false,
reason: `Verification level ${actorCap.verificationStatus} insufficient`
};
}
return {
capability: req.capabilityId,
satisfied: true
};
});
// 5. Apply logical operator
const allowed = evaluateLogicalOperator(
policy.operator,
capabilityChecks,
policy.threshold
);
// 6. Log access attempt
await logAccessAttempt({
actorId: actor.id,
resourceId,
allowed,
capabilityChecks,
timestamp: new Date()
});
return {
allowed,
reason: allowed ? 'Access granted' : 'Capability requirements not met',
missingCapabilities: capabilityChecks.filter(c => !c.satisfied)
};
}
function evaluateLogicalOperator(
operator: 'AND' | 'OR' | 'THRESHOLD',
checks: { satisfied: boolean }[],
threshold?: number
): boolean {
switch (operator) {
case 'AND':
return checks.every(c => c.satisfied);
case 'OR':
return checks.some(c => c.satisfied);
    case 'THRESHOLD': {
      const satisfiedCount = checks.filter(c => c.satisfied).length;
      return satisfiedCount >= (threshold ?? checks.length);
    }
}
}
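To make the operator semantics concrete, here is a self-contained copy of the logic with a "2 of 4" threshold policy evaluated against a set of checks (the check shape is simplified to just the `satisfied` flag):

```typescript
type Check = { satisfied: boolean };

// Same operator logic as above, reproduced so the example is self-contained.
function evaluateLogicalOperator(
  operator: 'AND' | 'OR' | 'THRESHOLD',
  checks: Check[],
  threshold?: number
): boolean {
  switch (operator) {
    case 'AND': return checks.every(c => c.satisfied);
    case 'OR': return checks.some(c => c.satisfied);
    case 'THRESHOLD': {
      const satisfiedCount = checks.filter(c => c.satisfied).length;
      return satisfiedCount >= (threshold ?? checks.length);
    }
  }
}

// "2 of 4" threshold policy: two satisfied checks are enough.
const checks: Check[] = [
  { satisfied: true }, { satisfied: false },
  { satisfied: true }, { satisfied: false },
];
const allowed = evaluateLogicalOperator('THRESHOLD', checks, 2); // true
```

The same check set fails AND (not all satisfied) but passes OR, which is why high-stakes policies in the examples below use AND while broad feature gates use THRESHOLD.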
Real-World Access Policies
Example 1: KB Document Access (PHI Data)
{
resourceId: "kb:patient-health-information",
resourceType: "kb_document",
requiredCapabilities: [
{
capabilityId: "cap:hipaa-compliance",
minWeight: 0.75,
minVerification: "issuer_verified" // Must have official HIPAA certification
},
{
capabilityId: "cap:healthcare-license",
minWeight: 0.80,
minVerification: "issuer_verified" // Must have verified healthcare license
}
],
operator: "AND", // Must satisfy BOTH
constraints: {
requiredPassportKind: ["Founder", "InternalTeam", "PartnerExternal"],
geographicRestrictions: ["US", "CA"] // HIPAA is US law; Canadian access governed by equivalent privacy rules (e.g., PIPEDA)
},
rationale: "PHI requires HIPAA training and healthcare licensure"
}
Example 2: High-Value Task Access
{
resourceId: "task-tier:enterprise-ml-architecture",
resourceType: "task_tier",
requiredCapabilities: [
{
capabilityId: "cap:ml-systems",
minWeight: 0.70 // Strong ML systems knowledge
},
{
capabilityId: "cap:distributed-systems",
minWeight: 0.65
},
{
capabilityId: "cap:production-experience",
minWeight: 0.60
}
],
operator: "AND",
constraints: {
minTrustLevel: 0.80 // High trust score required
},
rationale: "High-stakes ML infrastructure design requires proven expertise"
}
Example 3: Feature Access (Threshold Model)
{
resourceId: "feature:advanced-analytics-dashboard",
resourceType: "system_feature",
requiredCapabilities: [
{ capabilityId: "cap:data-analysis", minWeight: 0.65 },
{ capabilityId: "cap:statistics", minWeight: 0.60 },
{ capabilityId: "cap:data-visualization", minWeight: 0.60 },
{ capabilityId: "cap:sql", minWeight: 0.55 }
],
operator: "THRESHOLD",
threshold: 2, // Must have at least 2 of the 4 capabilities
rationale: "Analytics dashboard requires data literacy, but not all specific skills"
}
Example 4: Time-Restricted Access
{
resourceId: "data:financial-transactions",
resourceType: "data_set",
requiredCapabilities: [
{
capabilityId: "cap:financial-analysis",
minWeight: 0.70,
minVerification: "issuer_verified"
}
],
operator: "AND",
constraints: {
timeRestrictions: {
allowedHours: "09:00-17:00", // Business hours only
allowedDays: ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"]
},
geographicRestrictions: ["US"],
minTrustLevel: 0.85
},
rationale: "Financial data access restricted to business hours for audit trail"
}
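The checkAccess algorithm above delegates to evaluateConstraints, which this spec does not define. A minimal sketch of its time-restriction portion, assuming allowedHours is a half-open "HH:MM-HH:MM" window evaluated in the server's local timezone (all names and shapes here are illustrative):

```typescript
interface TimeRestrictions {
  allowedHours?: string;  // "HH:MM-HH:MM", e.g. "09:00-17:00"
  allowedDays?: string[]; // e.g. ["Monday", "Tuesday"]
}

const DAY_NAMES = ['Sunday', 'Monday', 'Tuesday', 'Wednesday',
                   'Thursday', 'Friday', 'Saturday'];

// True if `now` satisfies the policy's time restrictions.
// Assumes hours are evaluated in the server's local timezone.
function withinTimeWindow(now: Date, t: TimeRestrictions): boolean {
  if (t.allowedDays && !t.allowedDays.includes(DAY_NAMES[now.getDay()])) {
    return false;
  }
  if (t.allowedHours) {
    const toMinutes = (s: string) => {
      const [h, m] = s.split(':').map(Number);
      return h * 60 + m;
    };
    const [start, end] = t.allowedHours.split('-').map(toMinutes);
    const nowMinutes = now.getHours() * 60 + now.getMinutes();
    if (nowMinutes < start || nowMinutes >= end) return false;
  }
  return true;
}
```

A production version would also need an explicit policy timezone; "business hours" means something different for the actor, the resource owner, and the auditor.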
Dynamic Access Elevation
Capabilities can be temporarily elevated for specific tasks:
interface TemporaryAccessGrant {
actorId: PassportId;
resourceId: string;
grantedBy: PassportId;
reason: string;
expiresAt: Date;
// Elevated capabilities (temporary boost)
elevatedCapabilities?: {
capabilityId: string;
temporaryWeight: number; // Override weight for this grant
}[];
}
// Example: Junior developer needs access to production for emergency fix
{
actorId: "did:human:junior-dev",
resourceId: "api:production-database",
grantedBy: "did:human:senior-engineer",
reason: "Emergency hotfix for payment processing bug",
expiresAt: "2025-12-01T20:00:00Z", // two hours after the grant was issued
elevatedCapabilities: [
{
capabilityId: "cap:production-access",
temporaryWeight: 0.75 // Boost from 0.30 to 0.75 for this session
}
]
}
Capability-Gated KB Access
Different KB documents require different capabilities:
// Governance tier + capability requirements
const kbAccessPolicies = {
"Canon": {
requiredCapabilities: [
{ capabilityId: "cap:strategic-thinking", minWeight: 0.70 }
],
requiredPassportKind: ["Founder", "InternalTeam"]
},
"Working": {
requiredCapabilities: [
{ capabilityId: "cap:product-knowledge", minWeight: 0.50 }
],
requiredPassportKind: ["Founder", "InternalTeam", "PartnerExternal"]
},
"Public": {
// No capability requirements, but maybe basic trust level
constraints: { minTrustLevel: 0.20 }
},
// Document-specific overrides
"kb:ml-infrastructure-design": {
requiredCapabilities: [
{ capabilityId: "cap:ml-systems", minWeight: 0.65 },
{ capabilityId: "cap:architecture-design", minWeight: 0.60 }
],
operator: "AND"
}
};
Audit Trail
Every access decision is logged:
interface AccessAuditLog {
id: string;
timestamp: Date;
actorId: PassportId;
resourceId: string;
resourceType: string;
decision: 'granted' | 'denied';
// What capabilities were checked
requiredCapabilities: {
capabilityId: string;
required: { minWeight: number; minVerification?: string };
actual: { weight: number; verificationStatus: string };
satisfied: boolean;
}[];
// Why access was granted/denied
reason: string;
missingCapabilities?: string[];
// Session context
sessionId?: string;
ipAddress?: string;
deviceId?: string;
}
This audit trail enables:
- Compliance: Prove who accessed what and why
- Security: Detect unusual access patterns
- Capability insights: See which capabilities are most frequently required
- Training recommendations: Identify common capability gaps
COMPUTATION MODEL
The engine runs on a three-pass update model:
Pass 1 — Immediate Update (Fast Path)
Triggered by a workflow event.
Produces:
- lightweight edge updates
- provisional capability deltas
- timestamped micro-attestations
Must complete in under 50 ms to keep workflows responsive.
Pass 2 — Contextual Update (Slow Path)
Runs asynchronously.
Calculates:
- pattern clusters
- long-range capability arcs
- cross-domain generalization
- bias correction
Pass 3 — Periodic Reconciliation (Scheduled)
Daily or weekly.
Performs:
- weight smoothing
- anomaly detection
- gaming detection
- drift correction
- deletion and revocation propagation
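The three passes compose as one synchronous fast path plus deferred work. A structural sketch, assuming an in-process queue for the slow path (queue, map, and function names are illustrative, and the Pass 2 body is stubbed):

```typescript
// Structural sketch of the three-pass model: Pass 1 runs inline on each
// event; Pass 2 drains a queue asynchronously; Pass 3 runs on a schedule.
type WorkflowEvent = { actorId: string; capabilityId: string; delta: number };

const slowPathQueue: WorkflowEvent[] = [];
const provisionalDeltas = new Map<string, number>();

// Pass 1 - immediate update: cheap, provisional, stays well under 50 ms.
function onWorkflowEvent(ev: WorkflowEvent): void {
  const key = `${ev.actorId}:${ev.capabilityId}`;
  provisionalDeltas.set(key, (provisionalDeltas.get(key) ?? 0) + ev.delta);
  slowPathQueue.push(ev); // hand off to Pass 2
}

// Pass 2 - contextual update: drains the queue asynchronously.
function runSlowPath(): number {
  const processed = slowPathQueue.length;
  slowPathQueue.length = 0; // pattern clustering etc. would happen here
  return processed;
}
```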
PRIVACY ARCHITECTURE
All capability data is:
- stored locally in the Human Vault
- anchored in the ledger by hash only
- selectively revealable via zk-like mechanisms
People can reveal:
- capabilities relevant to a role
without revealing:
- how they were gained
- where they were gained
- or what tasks they did
No employer owns the graph.
No system can retain it after revocation.
No algorithm can reverse-engineer private context.
SELECTIVE DISCLOSURE ENGINE
Examples:
Prove capability in "AI safety triage"
→ reveal only the node + confidence
→ do NOT reveal workflows, errors, history, or training pathways.
Prove compliance with a role requirement
→ reveal exactly the needed subgraph
→ redact everything else.
Prove you worked for a company
→ reveal a signed employment attestation
→ no dates unless you choose
The graph is yours. Always.
GAMING PREVENTION & ANTI-SPECIFICATION-GAMING MEASURES
Source: Stuart Russell (value alignment research - specification gaming)
The Capability Graph is only valuable if it's tamper-proof and gaming-resistant. Without rigorous anti-gaming measures, humans could artificially inflate capability weights, destroying enterprise trust and graph integrity.
HUMAN implements comprehensive, multi-layered gaming prevention that makes fraudulent capability claims economically irrational and technically infeasible.
The Gaming Threat Model
Potential gaming vectors:
- Credential farming - Rapidly completing easy tasks to inflate scores
- Collusion - Peers artificially vouching for each other
- Bot assistance - Using AI to complete human assessments
- Context manipulation - Cherry-picking favorable task contexts
- Temporal gaming - Timing completions to exploit system patterns
- Multi-account gaming - Creating multiple identities to game reputation
- Social engineering - Manipulating review processes
Anti-Gaming Architecture
1. Task Diversity Requirements
Principle: Capability requires success across varied contexts, not just one test.
interface CapabilityDiversityRequirement {
capabilityId: string;
// Evidence must span multiple dimensions
minContextVariety: {
domains: number; // Must demonstrate in N different domains
taskTypes: number; // Must complete N different task types
complexityLevels: number; // Must succeed at various difficulty levels
timeWindows: number; // Must perform over N distinct time periods
};
// Prevents "farming" the same easy task repeatedly
maxRepetitionWeight: number; // Cap weight from repeated similar tasks
}
Example - Healthcare Triage:
{
capabilityId: "cap:healthcare-triage",
minContextVariety: {
domains: 3, // Pediatrics, adult, geriatric
taskTypes: 4, // Assessment, escalation, documentation, communication
complexityLevels: 3, // Routine, urgent, emergency
timeWindows: 6 // At least 6 separate weeks of performance
},
maxRepetitionWeight: 0.40 // Repeated similar tasks capped at 40% of total weight
}
Implementation:
- Task completion events tagged with context fingerprints
- Weight calculation penalizes repetition
- Capability weight plateaus until diversity threshold met
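One way to realize maxRepetitionWeight is to split evidence into novel-context and repeated-context contributions and cap the latter. A sketch under that assumption (the aggregation function is illustrative, not the canonical weight formula):

```typescript
// Hypothetical weight aggregation honoring maxRepetitionWeight: evidence
// from near-duplicate task contexts contributes at most that fraction of
// the final capability weight, so farming one task has diminishing returns.
function cappedWeight(
  novelEvidence: number,      // weight earned in diverse contexts (0-1)
  repeatedEvidence: number,   // weight earned from near-identical tasks (0-1)
  maxRepetitionWeight: number // e.g. 0.40
): number {
  const repeated = Math.min(repeatedEvidence, maxRepetitionWeight);
  return Math.min(1, novelEvidence + repeated);
}

// cappedWeight(0.20, 0.70, 0.40) caps at 0.60 rather than 0.90:
// 0.30 of the farmed evidence is simply discarded.
```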
2. Peer Review & Random Auditing
Principle: 10% of capability assessments randomly audited by other humans.
interface PeerReviewProcess {
// Random audit selection
auditProbability: number; // 10% of all assessments
// Reviewer selection criteria
reviewerRequirements: {
minCapabilityWeight: number; // Reviewer must have higher capability
noConflictOfInterest: boolean; // No prior collaboration with subject
geographicDiversity: boolean; // Prefer reviewers from different regions
};
// Review process
blindReview: boolean; // Reviewer doesn't see original score
reviewCriteria: string[]; // Specific rubric for review
// Dispute resolution
thresholdForEscalation: number; // If reviewer disagrees by >20%, escalate
tiebreaker: 'third_reviewer' | 'trust_and_safety_team';
}
Audit triggers:
- Random selection (10% baseline)
- High-value capabilities (healthcare, finance) → 20% audit rate
- Rapid capability gain (>0.20 weight increase in 30 days) → 50% audit rate
- Anomaly detection flags → 100% audit rate
Reviewer compensation:
- Paid per review (aligned incentive to be thorough)
- Own reputation at stake (bad reviews damage their graph)
- Blind reviews prevent social pressure
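The escalating audit triggers above compose naturally as "take the strictest applicable rate". A sketch with thresholds taken from the list (the function name and input shape are illustrative):

```typescript
// Audit rate selection per the triggers above: strictest applicable wins.
function auditRate(opts: {
  highValueDomain: boolean;      // healthcare, finance, ...
  weightGainLast30Days: number;  // e.g. 0.25
  anomalyFlagged: boolean;
}): number {
  if (opts.anomalyFlagged) return 1.0;               // anomaly flag -> 100%
  if (opts.weightGainLast30Days > 0.20) return 0.5;  // rapid gain -> 50%
  if (opts.highValueDomain) return 0.2;              // high-value -> 20%
  return 0.1;                                        // 10% baseline
}
```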
3. Temporal Validation & Capability Decay
Principle: Capabilities decay without continued demonstration. Skills atrophy.
interface CapabilityDecayModel {
capabilityId: string;
decayFunction: 'exponential' | 'linear' | 'step';
// Time-based decay parameters
halfLife: number; // Months until weight halves (if not reinforced)
minRetentionRate: number; // Floor below which capability removed
// Reinforcement resets decay
reinforcementEvents: {
taskCompletion: { decayReset: 'full' | 'partial', months: number };
training: { decayReset: 'partial', months: number };
peerValidation: { decayReset: 'partial', months: number };
};
// Exemptions (credentials don't decay until expiration)
exemptIfCredentialBacked: boolean;
}
Example decay curves:
| Capability Type | Half-Life | Decay Function | Rationale |
|---|---|---|---|
| Technical Skills (Python, SQL) | 12 months | Exponential | Skills rust without practice |
| Judgment Capabilities (Triage, Safety) | 18 months | Linear | Judgment degrades more slowly but steadily |
| Certifications (HIPAA, CPR) | No decay | Step (expires at cert expiration) | Binary valid/expired |
| Soft Skills (Communication, Empathy) | 24 months | Linear | Stable but can degrade |
Anti-gaming benefit:
- Can't "farm" capability once and coast forever
- Forces continuous demonstration
- Aligns with real-world skill maintenance
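For the exponential case, weight after t months without reinforcement follows w · 0.5^(t / halfLife), with removal once it falls below the retention floor. A minimal sketch of that curve (return-type convention is an assumption):

```typescript
// Exponential capability decay: weight halves every `halfLifeMonths`
// without reinforcement; below `minRetention` the capability is removed
// (signaled here by returning null).
function decayedWeight(
  weight: number,
  monthsSinceReinforcement: number,
  halfLifeMonths: number,
  minRetention: number
): number | null {
  const w = weight * Math.pow(0.5, monthsSinceReinforcement / halfLifeMonths);
  return w < minRetention ? null : w;
}

// A 0.80 technical skill (12-month half-life) untouched for 12 months
// decays to 0.40; after 48 months it drops below a 0.10 floor and is removed.
```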
4. Anomaly Detection & Pattern Analysis
Principle: Statistical analysis flags suspicious patterns.
interface AnomalyDetectionSystem {
// Velocity anomalies
rapidGain: {
threshold: number; // >0.30 weight gain in <30 days = suspicious
compareToPopulation: boolean; // Compare to other humans gaining same capability
};
// Perfection anomalies
unrealisticSuccess: {
threshold: number; // 100% success rate over 50+ tasks = suspicious
expectedErrorRate: number; // Humans make mistakes; 0 errors is a red flag
};
// Temporal anomalies
offHoursPatterns: {
detectBotPatterns: boolean; // Activity at 3 AM every night = bot?
regularityThreshold: number; // Too-regular timing patterns
};
// Social anomalies
collusion: {
detectPeerReviewClusters: boolean; // Same 3 people always review each other?
maxReviewOverlap: number; // Max % of reviews from same reviewers
};
// Context anomalies
taskSimilarity: {
detectRepetition: boolean; // Completing nearly-identical tasks repeatedly
maxSimilarity: number; // Tasks >95% similar flagged
};
}
Anomaly response workflow:
async function handleAnomalyDetection(
passportId: PassportId,
capabilityId: string,
anomalyType: string,
confidence: number
) {
if (confidence > 0.90) {
// High confidence anomaly - immediate quarantine
await quarantineCapability(passportId, capabilityId, {
reason: anomalyType,
confidence: confidence,
status: 'under_review'
});
// Notify Trust & Safety team
await notifyTrustAndSafety({
passportId,
capabilityId,
anomalyType,
confidence,
priority: 'high'
});
// Human review required before un-quarantine
await createReviewTask({
type: 'capability_integrity_review',
subject: passportId,
capability: capabilityId,
evidence: await gatherAnomalyEvidence(passportId, capabilityId)
});
} else if (confidence > 0.70) {
// Medium confidence - increase audit rate
await flagForEnhancedAudit(passportId, capabilityId, {
auditRate: 0.50, // 50% of future tasks audited
duration: '60 days',
reason: anomalyType
});
} else {
// Low confidence - log for pattern monitoring
await logAnomalySignal(passportId, capabilityId, anomalyType, confidence);
}
}
5. Multi-Source Validation (Cross-Channel Consistency)
Principle: Combine evidence from Academy, Workforce Cloud, and external attestations.
interface MultiSourceValidation {
capabilityId: string;
// Evidence sources must corroborate
requiredSources: {
academy: { minTasks: number; minSuccessRate: number };
workforceCloud: { minTasks: number; minQualityScore: number };
peerReview: { minReviews: number; minConsensus: number };
externalAttestation?: { minCredentials: number; minTrustLevel: number };
};
// Cross-source consistency check
maxVariance: number; // If sources disagree by >20%, flag for review
// Weighting by source reliability
sourceWeights: {
academy: number; // Training ≠ real performance
workforceCloud: number; // Real tasks = highest weight
peerReview: number; // Social validation
externalAttestation: number; // Official credentials
};
}
Example - Software Engineering:
{
capabilityId: "cap:full-stack-development",
requiredSources: {
academy: { minTasks: 10, minSuccessRate: 0.80 },
workforceCloud: { minTasks: 5, minQualityScore: 0.75 },
peerReview: { minReviews: 3, minConsensus: 0.70 },
externalAttestation: { minCredentials: 0, minTrustLevel: 0.60 } // Optional
},
maxVariance: 0.20, // Academy says 0.90, Workforce says 0.60 = flag for review
sourceWeights: {
academy: 0.20, // Training is lowest weight
workforceCloud: 0.50, // Real performance is highest
peerReview: 0.20, // Social validation
externalAttestation: 0.10 // Credentials add credibility but not primary
}
}
Anti-gaming benefit:
- Can't just ace Academy training (must perform in real tasks)
- Can't just game Workforce Cloud (peers must validate)
- Can't just get peer vouches (must have training + performance)
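The source weights and maxVariance check from the example above combine as a weighted sum plus a spread test. A sketch (the per-source score shape is simplified to one number per source):

```typescript
type Source = 'academy' | 'workforceCloud' | 'peerReview' | 'externalAttestation';

// Combine per-source capability estimates using the source weights above;
// flag for review when the sources' estimates spread wider than maxVariance.
function combineSources(
  scores: Record<Source, number>,
  weights: Record<Source, number>,
  maxVariance: number
): { weight: number; flaggedForReview: boolean } {
  const sources = Object.keys(scores) as Source[];
  const weight = sources.reduce((sum, s) => sum + scores[s] * weights[s], 0);
  const vals = sources.map(s => scores[s]);
  const spread = Math.max(...vals) - Math.min(...vals);
  return { weight, flaggedForReview: spread > maxVariance };
}

// Academy 0.90 vs Workforce 0.60 (spread 0.30 > 0.20) -> flagged for review
```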
6. Economic Disincentives
Principle: Make gaming economically irrational.
Cost of gaming > Benefit of inflated capability:
| Gaming Vector | Cost to Gamer | Detection Probability | Expected Loss |
|---|---|---|---|
| Credential farming | Time wasted on repetitive tasks | 90% (diversity check fails) | Lost time + quarantine |
| Collusion | Coordinating with peers | 85% (pattern detection) | Both parties banned |
| Bot assistance | Risk of platform ban | 95% (timing anomalies) | Loss of all capability data |
| Multi-account gaming | Infrastructure cost | 99% (device fingerprinting) | All accounts banned |
Penalties:
- First offense: Capability quarantine (30 days)
- Second offense: Capability revoked, 90-day probation
- Third offense: Account suspension, potential ban
- Severe cases: Permanent ban + notification to ecosystem partners
Incentive alignment:
- Honest capability building is faster than gaming
- Real capability = real earnings in Workforce Cloud
- Quarantine = lost earning opportunity
- Reputation damage is permanent
7. Transparency & Explainability
Principle: Humans must understand HOW capabilities are assessed to trust the system.
Public documentation:
- Capability assessment criteria (see Capability Cards below)
- Weighting formulas (no black boxes)
- Audit processes (how peer review works)
- Appeal processes (how to dispute a quarantine)
Per-capability transparency:
interface CapabilityTransparencyReport {
capabilityId: string;
// How was this capability assessed?
assessmentMethod: {
evidenceSources: string[]; // Academy, Workforce, Peer, Credential
diversityRequirements: object; // Context variety needed
decayModel: object; // How capability decays
peerReviewRate: number; // % of assessments audited
};
// What's measured vs. NOT measured?
measuredAttributes: string[];
excludedAttributes: string[];
// Fairness considerations
biasAudit: {
lastAuditDate: Date;
auditor: string; // Third-party auditor
findings: string;
};
// How to improve this capability
improvementPath: {
recommendedTraining: string[];
requiredTasks: string[];
estimatedTimeToMastery: string;
};
}
Anti-gaming benefit:
- Transparency reduces "find the loophole" gaming
- Humans see legitimate path is faster
- Audit visibility deters collusion
Implementation Timeline
| Phase | Measures | Timeline |
|---|---|---|
| v0.1 (Launch) | Task diversity, temporal decay, basic anomaly detection | Month 9 |
| v0.2 (Post-Launch) | Peer review, multi-source validation | Month 12 |
| v0.3 (Scale) | Advanced anomaly detection, ML-based pattern analysis | Month 18 |
| Ongoing | Quarterly bias audits, continuous improvement | Quarterly |
Success Metrics
How we measure anti-gaming effectiveness:
| Metric | Target | Measurement |
|---|---|---|
| False positive rate | <5% | % of legit capability gains flagged as suspicious |
| False negative rate | <2% | % of gaming attempts that slip through (from audits) |
| Appeal success rate | 15-20% | % of quarantines overturned (healthy = some false positives) |
| Time to detection | <7 days | Median days from gaming attempt to detection |
| Recidivism rate | <10% | % of flagged users who game again after penalty |
Integration with Other Systems
Anti-gaming measures connect to:
- HumanOS - Quarantined capabilities excluded from routing
- Workforce Cloud - Suspended users lose task access
- Academy - Gaming detection informs training improvements
- Passport - Integrity flags visible to enterprises (with consent)
- Ledger - Attestation revocations propagated globally
Research Partnership Opportunity
Stuart Russell (UC Berkeley) - Advise on specification gaming prevention, value alignment implementation
Potential collaboration:
- Review anti-gaming architecture
- Co-author paper on practical value alignment in capability systems
- Advisory board position (Technical Advisory Board tier)
See: kb/86_academic_and_thought_leader_engagement_strategy.md - Stuart Russell engagement plan
This comprehensive anti-gaming system ensures the Capability Graph remains the most trustworthy representation of human capability ever built—because it's designed from the ground up to be tamper-proof, fair, and transparent.
CAPABILITY CARDS: TRANSPARENCY & FAIRNESS LAYER
Source: Timnit Gebru (Model Cards for AI fairness) - Applied to human capabilities
Problem: If humans don't understand HOW they're being assessed, they can't trust the system. Black-box capability assessment = algorithmic bias risk.
Solution: Every capability has a public "Capability Card"—a transparency document that explains:
- How it's assessed
- What IS measured
- What is NOT measured
- Fairness considerations
- How to improve it
Capability Cards are "nutrition labels" for capabilities—making the system explainable, auditable, and fair.
Why Capability Cards Matter
Traditional assessment problems:
- ❌ Opaque (humans don't know how they're evaluated)
- ❌ Biased (hidden assumptions favor certain demographics)
- ❌ Unfair (some humans disadvantaged by assessment design)
- ❌ Unauditable (no way to verify fairness)
Capability Cards solve all four:
- ✅ Transparent: Humans see exactly how assessment works
- ✅ Fair: Explicit about what's excluded (age, gender, credentials)
- ✅ Auditable: Third parties can review methodology
- ✅ Trustworthy: Enterprises know what they're routing on
Capability Card Template
# Capability Card: [Capability Name]
**ID:** cap:[capability-id]
**Category:** [skill | judgment | experience | trait | certification]
**Version:** 1.0
**Last Reviewed:** [Date]
**Reviewed By:** [DAIR Institute | Internal Trust & Safety | Third-Party Auditor]
---
## ASSESSMENT METHOD
**How is this capability measured?**
- **Training Evidence:** [Academy module completions, simulations]
- **Workforce Evidence:** [Real task completions, quality scores]
- **Peer Validation:** [Random audits, peer reviews]
- **External Attestation:** [Credentials, licenses, certifications]
**Weight Calculation:**
- Academy evidence: 20% of weight
- Workforce evidence: 50% of weight (real performance)
- Peer validation: 20% of weight
- External attestation: 10% of weight
**Diversity Requirements:**
[Explain task diversity, context variety, temporal validation]
---
## WHAT IS MEASURED
**Explicit list of assessed attributes:**
1. [Attribute 1]: [How it's measured]
2. [Attribute 2]: [How it's measured]
3. [Attribute 3]: [How it's measured]
**Example Tasks:**
- [Task example 1]
- [Task example 2]
- [Task example 3]
---
## WHAT IS NOT MEASURED
**Explicit exclusions (prevents proxy discrimination):**
- ❌ **Years of experience** - Not a proxy for capability
- ❌ **Educational credentials** - Not assessment criteria (unless required by regulation)
- ❌ **Speed** - Quality over speed
- ❌ **Age, gender, race, nationality** - Never factored into capability weight
- ❌ **Employment history** - Past employers don't determine capability
- ❌ **Socioeconomic status** - No bias based on background
---
## FAIRNESS CONSIDERATIONS
**How we ensure fairness:**
1. **Diverse testing contexts:** [Capability tested across varied scenarios to prevent cultural bias]
2. **Multiple pathways:** [Various ways to demonstrate capability—not one narrow test]
3. **Bias auditing:** [Quarterly audits for demographic disparities]
4. **Accessibility:** [Accommodations for disabilities, language support]
5. **Appeal process:** [How to dispute assessment]
**Bias Audit Results:**
- Last audit: [Date]
- Auditor: [DAIR Institute / Third party]
- Findings: [Summary of audit—any disparities detected?]
- Remediation: [Actions taken if bias found]
---
## HOW TO IMPROVE THIS CAPABILITY
**Recommended path to mastery:**
1. **Academy Training:** [Recommended modules]
2. **Practice Tasks:** [Workforce Cloud task types that build this capability]
3. **Peer Learning:** [Mentorship, collaboration opportunities]
4. **External Resources:** [Courses, certifications, books]
**Estimated Time to Proficiency:** [Realistic timeline]
**Current Supply/Demand:**
- Humans with this capability: [Count]
- Enterprise demand: [High | Medium | Low]
- Earning potential: [$ range for tasks requiring this capability]
---
## REVISION HISTORY
| Version | Date | Changes | Reviewer |
|---------|------|---------|----------|
| 1.0 | [Date] | Initial capability card | [Reviewer] |
---
**Questions or Concerns?**
Contact: trust-and-safety@human.xyz
Appeal Process: [Link to appeal form]
Example: Clinical Judgment (Nursing)
# Capability Card: Clinical Judgment (Nursing)
**ID:** cap:clinical-judgment-nursing
**Category:** judgment
**Version:** 1.2
**Last Reviewed:** November 15, 2025
**Reviewed By:** DAIR Institute (Third-Party Audit)
---
## ASSESSMENT METHOD
**How is this capability measured?**
- **Training Evidence:** Academy simulated patient scenarios (15+ scenarios across age groups)
- **Workforce Evidence:** Real triage decisions in HumanOS workflows (10+ successful escalations)
- **Peer Validation:** Random audit by licensed RNs (10% of assessments reviewed)
- **External Attestation:** Active RN license (verified through state board API)
**Weight Calculation:**
- Academy evidence: 15% (simulations ≠ real patients)
- Workforce evidence: 60% (real triage performance)
- Peer validation: 15% (expert review)
- External attestation: 10% (license validity)
**Diversity Requirements:**
- Minimum 3 patient demographics (pediatric, adult, geriatric)
- Minimum 4 acuity levels (routine, urgent, emergency, critical)
- Minimum 3 clinical contexts (inpatient, outpatient, emergency)
---
## WHAT IS MEASURED
1. **Decision accuracy under uncertainty** - Correctly identifying patient conditions when information is incomplete
2. **Evidence-based reasoning** - Using clinical guidelines and protocols appropriately
3. **Patient safety protocols** - Following escalation procedures for high-risk situations
4. **Communication clarity** - Documenting decisions clearly for other care team members
5. **Escalation appropriateness** - Knowing when to call a physician vs. handle independently
**Example Tasks:**
- Review patient vitals and determine if ER visit needed
- Assess medication side effects and escalate if dangerous
- Triage incoming patients by acuity level
- Document assessment findings for care team
---
## WHAT IS NOT MEASURED
- ❌ **Years of nursing experience** - Not a capability proxy (new RNs can have strong judgment)
- ❌ **Nursing school prestige** - Where you trained doesn't determine capability
- ❌ **Speed of triage** - Quality over speed (fast but wrong = dangerous)
- ❌ **Patient satisfaction scores** - Nice bedside manner ≠ clinical judgment
- ❌ **Age, gender, race** - Never factored into assessment
- ❌ **Employment history** - Past hospital employment irrelevant
---
## FAIRNESS CONSIDERATIONS
**How we ensure fairness:**
1. **Diverse patient demographics:** Scenarios include varied ages, genders, races, socioeconomic backgrounds
2. **No cultural bias:** Scenarios reviewed by diverse nursing panel to eliminate cultural assumptions
3. **Multiple pathways:** Academy training, real-world performance, or peer validation can all demonstrate capability
4. **Accessibility:** Scenarios available in multiple formats (text, audio description) for nurses with disabilities
5. **Language support:** Clinical judgment assessed in nurse's primary language
**Bias Audit Results:**
- Last audit: October 2025
- Auditor: DAIR Institute (Dr. Timnit Gebru's team)
- Findings: No statistically significant demographic disparities detected (p > 0.05)
- Remediation: N/A (passed audit)
---
## HOW TO IMPROVE THIS CAPABILITY
**Recommended path to mastery:**
1. **Academy Training:**
- Complete "Clinical Triage Foundations" module (4 hours)
- Complete "Escalation Decision-Making" module (3 hours)
- Pass 20 simulated patient scenarios (varies by performance)
2. **Practice Tasks:**
- Start with low-acuity triage tasks in Workforce Cloud
- Progress to urgent/emergency scenarios as weight improves
- Request peer mentorship from high-weight RNs
3. **External Resources:**
- AACN Clinical Judgment Model (free online)
- Tanner's Clinical Judgment Model (research paper)
- State nursing board continuing education
**Estimated Time to Proficiency:** 60-90 days (depends on prior experience)
**Current Supply/Demand:**
- Humans with this capability: 1,247 (weight >0.70)
- Enterprise demand: **HIGH** (healthcare triage is top-requested)
- Earning potential: $45-75/hour for high-weight capability
---
## REVISION HISTORY
| Version | Date | Changes | Reviewer |
|---------|------|---------|----------|
| 1.0 | June 2025 | Initial capability card | Internal Trust & Safety |
| 1.1 | August 2025 | Added peer validation requirement | DAIR Institute |
| 1.2 | October 2025 | Passed third-party bias audit | DAIR Institute |
---
**Questions or Concerns?**
Contact: trust-and-safety@human.xyz
Appeal Process: https://human.xyz/appeal
Implementation Architecture
Storage:
interface CapabilityCard {
capabilityId: string;
version: string;
lastReviewed: Date;
reviewedBy: string;
// Assessment method
assessmentMethod: {
trainingSources: string[];
workforceSources: string[];
peerValidation: object;
externalAttestation?: object;
weightCalculation: Record<string, number>;
diversityRequirements: object;
};
// What's measured
measuredAttributes: {
name: string;
description: string;
howMeasured: string;
}[];
exampleTasks: string[];
// What's NOT measured (critical for fairness)
excludedAttributes: {
attribute: string;
rationale: string;
}[];
// Fairness
fairnessConsiderations: {
diverseTesting: string;
multiplePathways: string;
biasAuditing: string;
accessibility: string;
appealProcess: string;
};
biasAuditResults: {
lastAudit: Date;
auditor: string;
findings: string;
remediation?: string;
};
// Improvement path
improvementPath: {
academyModules: string[];
practiceTaskTypes: string[];
peerLearning: string[];
externalResources: string[];
estimatedTimeToMastery: string;
};
supplyDemand: {
humanCount: number;
demandLevel: 'high' | 'medium' | 'low';
earningPotential: string;
};
// History
revisionHistory: {
version: string;
date: Date;
changes: string;
reviewer: string;
}[];
}
Access:
- Public: All Capability Cards publicly accessible (transparency)
- Web UI: Browse capability cards at https://human.xyz/capabilities/[capability-id]
- API: GET /api/v1/capabilities/{capabilityId}/card
- In-app: View card from Capability Graph UI
Use Cases
1. Human transparency: "Why is my 'Python Programming' weight only 0.65?" → Check Capability Card → See assessment method → Understand need for more diverse tasks
2. Enterprise audit: "How do you assess 'Clinical Judgment'? Show me." → Share Capability Card → Enterprise reviews methodology → Builds trust
3. Regulatory compliance: EU AI Act requires explainability for automated decision systems → Capability Cards satisfy transparency requirements
4. Bias detection: Third-party auditors review Capability Cards for fairness → DAIR Institute audits quarterly → Findings published
5. Improvement guidance: "How do I get better at 'Financial Analysis'?" → Capability Card shows recommended training path
Regulatory Alignment
EU AI Act (High-Risk AI Systems):
- ✅ Transparency requirement (Capability Cards provide it)
- ✅ Human oversight (peer review in assessment)
- ✅ Documentation (revision history, audit trail)
- ✅ Bias mitigation (explicit fairness section)
Timeline for EU AI Act prep: Month 12 (before Act enforcement)
Research Partnership Opportunity
Timnit Gebru (DAIR Institute) - Fairness & transparency advisor
Potential collaboration:
- Review Capability Card template design
- Conduct third-party bias audits (quarterly)
- Co-author paper: "Capability Cards: Model Cards for Human Assessment Systems"
- Advisory board position (Fairness & Equity Advisory tier)
Talking points:
- "We're applying your Model Cards framework to human capability assessment"
- "Every capability has a transparency document—no black boxes"
- "Would love DAIR Institute to audit our fairness approach"
See: kb/86_academic_and_thought_leader_engagement_strategy.md - Timnit Gebru engagement plan
Implementation Timeline
| Phase | Deliverable | Timeline |
|---|---|---|
| v0.1 (Design) | Capability Card template design | Month 9 |
| v0.2 (Pilot) | First 10 Capability Cards (top capabilities) | Month 10 |
| v0.3 (Public) | All active capabilities have cards, web UI | Month 12 |
| v0.4 (Audit) | Third-party bias audit (DAIR Institute) | Month 15 |
| Ongoing | Quarterly updates, annual comprehensive audit | Quarterly |
Success Metrics
| Metric | Target | Measurement |
|---|---|---|
| Coverage | 100% of active capabilities | % with published cards |
| Transparency score | >4.5/5 | User survey: "Do you understand how you're assessed?" |
| Audit pass rate | 100% | % of capabilities passing bias audit |
| Appeal rate | <5% | % of assessments appealed (healthy = some appeals) |
| Improvement adoption | >60% | % of users who follow improvement path after viewing card |
COMPREHENSIVE BIAS MITIGATION ARCHITECTURE
Constitutional Mandate: "It is not for exclusion. It is for revelation." (Principle Two)
HUMAN addresses bias not as a compliance checkbox, but as a foundational architectural constraint. The entire Capability Graph is designed to make bias structurally impossible.
1. Architectural Exclusions (Cannot Be Measured)
The system is designed so these factors cannot influence capability assessment:
The engine NEVER evaluates:
- ❌ Race - Not captured, not stored, not evaluated
- ❌ Gender - Not a capability factor
- ❌ Age - Not correlated with capability
- ❌ Disability - Accommodations provided, not penalized
- ❌ Geography - Location ≠ capability
- ❌ Language - Multilingual assessment available
- ❌ Educational pedigree - Where you learned ≠ what you can do
- ❌ Employment history - Past employers don't determine capability
- ❌ Socioeconomic status - No bias based on background
- ❌ Years of experience - Not a proxy for capability
- ❌ Speed - Quality over speed
Only demonstrated human capability.
These exclusions are hardcoded into Capability Cards (see above) and enforced through:
- Schema design (fields don't exist)
- Capability Card transparency (explicit exclusion lists)
- Third-party audits (verify exclusions are enforced)
2. Anti-Exclusion Design Principles
No Scoring or Ranking:
- Capabilities are weights (0.0-1.0), not scores
- No leaderboards or comparative rankings
- No "top performer" lists
- Revelation of ability, not judgment
Multiple Pathways to Demonstrate Capability:
- Academy training (structured learning)
- Workforce performance (real-world tasks)
- Peer validation (community review)
- External attestation (credentials, licenses)
- No single narrow test that favors certain demographics
Equal Opportunity to Demonstrate:
- Capabilities assessed across diverse contexts
- Cultural bias testing in scenario design
- Accessibility accommodations (visual, audio, cognitive)
- Multilingual assessment support
- No time-based pressure that disadvantages certain groups
3. Transparency & Auditability
Capability Cards (Model Cards for Human Assessment):
Every capability has a public transparency document showing:
- ✅ Assessment method - How it's measured
- ✅ What IS measured - Explicit attributes
- ✅ What is NOT measured - Explicit exclusions (prevents hidden bias)
- ✅ Fairness considerations - How fairness is ensured
- ✅ Bias audit results - Third-party audit findings (published quarterly)
- ✅ Appeal process - How to dispute assessment
Example: See "Capability Card: Clinical Judgment (Nursing)" above for complete template.
Public Accessibility:
- All Capability Cards publicly browsable
- API access: GET /api/v1/capabilities/{capabilityId}/card
- Web UI: https://human.xyz/capabilities/[capability-id]
4. Third-Party Bias Audits
Research Partnership: DAIR Institute (Dr. Timnit Gebru)
HUMAN commits to quarterly fairness audits by external AI ethics researchers:
Audit Scope:
- Demographic disparity analysis (statistical testing)
- Hidden proxy detection (correlations with protected attributes)
- Assessment methodology review (fairness of design)
- Capability Card accuracy verification
- Remediation recommendations
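The "demographic disparity analysis" item could include a standard selection-rate screen such as the four-fifths rule. This would run on audit-time data held by the external auditor (HUMAN itself does not store demographics); group labels and the 0.8 threshold are illustrative:

```typescript
// Four-fifths rule screen: flag any group whose selection rate falls
// below 80% of the best-performing group's rate.
// Runs only on auditor-supplied, audit-time data (assumption).
interface GroupOutcome {
  group: string;    // label supplied by the external auditor
  selected: number; // e.g., routed to a task
  total: number;
}

function fourFifthsScreen(groups: GroupOutcome[]): string[] {
  const rates = groups.map(g => ({ group: g.group, rate: g.selected / g.total }));
  const maxRate = Math.max(...rates.map(r => r.rate));
  // Flag groups under 80% of the highest observed rate.
  return rates.filter(r => r.rate < 0.8 * maxRate).map(r => r.group);
}
```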
Audit Results:
- Published on each Capability Card
- Findings shared publicly (transparency)
- Remediation actions tracked and verified
- Appeals process informed by audit findings
Timeline:
- Month 15: First comprehensive DAIR Institute audit
- Quarterly: Ongoing bias audits
- Annual: Comprehensive fairness certification
See: kb/86_academic_and_thought_leader_engagement_strategy.md - DAIR Institute engagement plan
5. Capability-First Routing (Prevents Discrimination)
HumanOS routing follows Principle Twelve: Capability-First, Cost-Informed:
1. Filter to resources that CAN do the work (capability threshold)
2. Among capable resources, consider cost
3. Never route purely on cost (prevents "race to the bottom")
4. Log every routing decision (auditability)
This prevents:
- Cost-driven discrimination (choosing cheapest worker regardless of fit)
- Bias toward certain demographics (routing based on proxies)
- Exploitative wage pressure (capability ensures fair matching)
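The capability-first, cost-informed ordering above can be sketched as follows. The resource shape and threshold value are assumptions for illustration, not canonical schema:

```typescript
// Sketch of Principle Twelve ordering (hypothetical types).
interface RoutableResource {
  id: string;
  capabilities: Record<string, number>; // internal weights, 0.0-1.0
  costPerTask: number;
}

// Step 1: filter to resources that CAN do the work.
// Step 2: among capable resources, prefer lower cost.
// Cost is never consulted before the capability filter.
function routeCapabilityFirst(
  pool: RoutableResource[],
  required: string,
  threshold: number
): RoutableResource[] {
  return pool
    .filter(r => (r.capabilities[required] ?? 0) >= threshold)
    .sort((a, b) => a.costPerTask - b.costPerTask);
}
```

Note that the cheapest resource in the pool is excluded outright if it falls below the capability threshold, which is exactly what prevents the "race to the bottom".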
Every routing decision is cryptographically logged with full provenance:
- Why was this person chosen?
- What capability weights influenced the decision?
- Were there alternative matches? Why weren't they chosen?
See: kb/13_foundational_principles.md - Principle Twelve (Capability-First Routing)
See: kb/35_capability_routing_pattern.md - Complete routing architecture
6. Academy as Equity Infrastructure
Zero barriers to capability development (Principle Six):
- ✅ Free forever for individuals - No socioeconomic gatekeeping
- ✅ Multimodal learning - Text, voice, visual (accommodates learning styles)
- ✅ Multilingual - Reaches global populations
- ✅ Directly connected to paid work - Training → capability → income
- ✅ No credential requirements - Demonstrated ability > pedigree
Academy prevents bias amplification:
- Displaced workers (those most harmed by AI) get free training
- No "wealthy-only re-skilling" that would worsen inequality
- Training quality is identical regardless of background
- Performance-based assessment (not demographic proxies)
Research Foundation (Timnit Gebru): "AI displacement will hit marginalized communities hardest. HUMAN's Academy is anti-exclusion infrastructure—free training that breaks the cycle."
See: kb/24_human_academy.md - Complete Academy architecture
7. The Rule of Threes: Ethical Constraint Layer
Every feature must satisfy:
- ✔ Good for the human - Does this increase agency and opportunity?
- ✔ Good for HUMAN - Is this sustainable and aligned with our mission?
- ✔ Good for humankind - Does this benefit society or create harm?
Any feature that creates bias fails this test and is architecturally blocked.
Examples:
- ❌ Using years of experience as capability proxy → Fails human (disadvantages career-changers) + humankind (amplifies age bias)
- ❌ Charging for re-skilling training → Fails human (barriers for displaced) + humankind (worsens inequality)
- ❌ Ranking humans competitively → Fails human (surveillance culture) + humankind (exclusion tool)
The Rule of Threes is the immune system that prevents bias from being introduced even under investor or market pressure.
See: kb/13_foundational_principles.md - Principle Three (Rule of Threes)
8. Cryptographic Verifiability & Appeals
Every capability assessment is verifiable:
- Provenance logging (who assessed, when, why)
- Evidence pointers (what demonstrations contributed)
- Audit trails (full history of capability evolution)
- Appeal process (humans can contest assessments)
Appeals Process:
- Human submits appeal with rationale
- Trust & Safety reviews evidence
- Peer validators (qualified humans) re-evaluate
- Decision published with explanation
- If assessment was wrong, remediation applied retroactively
This prevents:
- Black-box algorithmic discrimination
- Hidden bias in automated systems
- Inability to contest unfair treatment
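The appeals process above can be modeled as a small state machine. The state names mirror the five listed steps and are illustrative, not a canonical schema:

```typescript
// Appeal lifecycle sketch (illustrative states, not canon).
type AppealState =
  | 'submitted'         // human submits appeal with rationale
  | 'under_review'      // Trust & Safety reviews evidence
  | 'peer_reevaluation' // qualified peer validators re-evaluate
  | 'decided'           // decision published with explanation
  | 'remediated';       // retroactive fix if assessment was wrong

const APPEAL_TRANSITIONS: Record<AppealState, AppealState[]> = {
  submitted: ['under_review'],
  under_review: ['peer_reevaluation'],
  peer_reevaluation: ['decided'],
  decided: ['remediated'],
  remediated: []
};

function canTransition(from: AppealState, to: AppealState): boolean {
  return APPEAL_TRANSITIONS[from].includes(to);
}
```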
Regulatory Alignment:
- ✅ EU AI Act - High-risk AI systems require explainability (Capability Cards provide it)
- ✅ GDPR - Right to explanation (full provenance available)
- ✅ Algorithmic accountability laws - Third-party audits satisfy requirements
Why This Architecture Works
HUMAN doesn't "address bias"—the system is designed to make bias structurally impossible:
| Bias Risk | Traditional Systems | HUMAN's Architecture |
|---|---|---|
| Hidden proxies | Demographic data influences decisions | Proxies architecturally excluded (can't be measured) |
| Black-box scoring | Opaque algorithms, no explanation | Capability Cards explain every assessment |
| Gaming/manipulation | Resume inflation, credential fraud | Multi-source evidence, cryptographic verification |
| Lack of accountability | No audit trail, no appeals | Full provenance logging, appeals process |
| Exclusionary design | Single narrow tests favor certain groups | Multiple pathways, diverse contexts |
| Socioeconomic barriers | Expensive training gates opportunity | Free Academy, performance-based assessment |
| Cost-driven discrimination | Cheapest resource wins | Capability-first routing, cost-informed |
Research Foundation
Timnit Gebru (DAIR Institute) - Fairness & Bias Research:
"AI systems perpetuate and amplify existing biases. We need structural fairness, not just algorithmic tweaks."
HUMAN's response to Gebru's research:
- Capability Graph focuses on demonstrated ability, not proxies (education, pedigree)
- Capability Cards apply Gebru's Model Cards framework to human assessment
- Regular third-party audits (DAIR Institute partnership)
- Academy provides equity infrastructure (free training for displaced workers)
See: kb/85_strategic_frameworks_and_research_foundation.md - Complete Gebru framework analysis
Summary: Bias Mitigation is Constitutional
Bias prevention isn't a feature—it's a constitutional principle (Principle Two: Anti-Exclusion).
The entire architecture enforces:
- ✅ Explicit exclusions → Demographics can't be measured
- ✅ Transparency → Every assessment is explainable
- ✅ Third-party audits → External verification (DAIR Institute)
- ✅ Multiple pathways → No single narrow test
- ✅ Free training → No socioeconomic gatekeeping
- ✅ Capability-first routing → No cost-driven discrimination
- ✅ Cryptographic logging → Every decision is auditable
- ✅ Constitutional constraints → Rule of Threes blocks harmful features
Investors cannot override it.
Founders cannot compromise it.
Market pressure cannot erode it.
AI cannot bypass it.
This is what makes HUMAN trustworthy.
HUMAN-ONLY GUARANTEES
The CG guarantees:
- You can always see your capability graph.
- You can always edit/remove sensitive nodes.
- You can always revoke access instantly.
- You can lock your graph with biometric consent.
- You can delete your graph and take it with you (enterprise cannot).
The Capability Graph is a human possession, not an enterprise asset.
HOW HUMANOS USES THE GRAPH
HumanOS uses the graph to:
- route work safely
- escalate at the right time
- prevent overwhelm
- match tasks to capability
- identify mentorship opportunities
- guide AI boundaries
Example:
If a person shows high "ambiguity triage," HumanOS may route early-warning tasks to them.
If they show strong "ethical escalation," HumanOS uses them for high-stakes checks.
Never exploitative.
Always protective.
WHY THE CAPABILITY GRAPH WINS
Because it achieves what no one else is even attempting:
- Representing human capability with dignity and truth
- Protecting people from algorithmic judgment
- Enabling AI to recognize when it needs a human
- Giving enterprises safe, verifiable human oversight
- Lifting people with gaps, nonlinear histories, or invisible experience
- Making growth continuous, guided, and owned by the person
The CG is the missing counterpart to AI models.
AI has weights.
Humans need capability graphs.
This is the match.
EDGE CACHING & LOCAL CAPABILITY DATA
See: 49_devops_and_infrastructure_model.md for complete edge/device architecture.
Where Capability Data Lives
| Data Type | Location | Update Frequency |
|---|---|---|
| Full capability graph | User's Passport Vault (device) | Real-time |
| Capability profile summary | Edge cache | 1-5 minute TTL |
| Capability proofs | Device (user-generated ZK proofs) | On-demand |
| Public attestations | Edge + Ledger | On attestation |
| Matching indices | Regional cloud | Batch updated |
Edge Caching Strategy
```typescript
// Capability profile caching at edge
const CAPABILITY_CACHE_POLICY = {
  // Public profile (safe to cache)
  profileSummary: {
    ttl: "5m",
    staleWhileRevalidate: "1h",
    key: (did) => `capability:profile:${did}`
  },
  // Capability existence check (for routing)
  capabilityCheck: {
    ttl: "1m",
    key: (did, capability) => `capability:check:${did}:${capability}`
  },
  // Attestation verification (stable)
  attestation: {
    ttl: "24h", // Attestations don't change
    key: (attestationId) => `capability:attestation:${attestationId}`
  }
};
```
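A minimal sketch of how an edge node might apply the profileSummary policy (fresh within TTL, stale-while-revalidate within the one-hour window), with a toy in-memory Map standing in for the real edge cache:

```typescript
// Illustrative stale-while-revalidate lookup; the store, fetch
// callback, and millisecond constants are assumptions for the sketch.
interface CacheEntry<T> { value: T; storedAt: number }

const TTL_MS = 5 * 60_000;  // "5m" fresh window
const SWR_MS = 60 * 60_000; // "1h" stale-while-revalidate window

async function getProfileSummary<T>(
  store: Map<string, CacheEntry<T>>,
  did: string,
  fetchOrigin: () => Promise<T>,
  now: number = Date.now()
): Promise<T> {
  const key = `capability:profile:${did}`; // matches the policy key above
  const entry = store.get(key);
  // Fresh: serve directly from the edge.
  if (entry && now - entry.storedAt < TTL_MS) return entry.value;
  // Stale but within SWR window: serve stale, refresh in background.
  if (entry && now - entry.storedAt < TTL_MS + SWR_MS) {
    void fetchOrigin().then(v => store.set(key, { value: v, storedAt: Date.now() }));
    return entry.value;
  }
  // Expired or missing: fetch synchronously from origin.
  const fresh = await fetchOrigin();
  store.set(key, { value: fresh, storedAt: now });
  return fresh;
}
```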
On-Device Capability Access
Users can access and prove capabilities without cloud:
```typescript
// On-device capability proof generation
class LocalCapabilityProver {
  async proveCapability(
    capability: string,
    verifier: DID,
    minLevel: number
  ): Promise<ZKProof> {
    // 1. Read capability from local vault
    const localGraph = await this.vault.getCapabilityGraph();
    const capNode = localGraph.find(capability);

    // 2. Generate ZK proof on-device
    const proof = await zkSnark.prove({
      statement: `I have ${capability} at level >= ${minLevel}`,
      witness: capNode,
      publicInputs: [verifier, capability, minLevel]
    });

    // No cloud call needed — proof is self-contained
    return proof;
  }
}
```
Offline Capability Queries
HumanOS can make routing decisions offline using cached capability data:
```typescript
// Offline capability matching
async function matchLocalCapabilities(task: Task): Promise<MatchResult> {
  // 1. Get local capability cache
  const localCache = await device.getCapabilityCache();

  // 2. Check if user can handle task
  const requiredCapabilities = extractRequirements(task);
  const matches = requiredCapabilities.every(cap =>
    localCache.has(cap.id) && localCache.get(cap.id).level >= cap.minLevel
  );

  // 3. Return match (will sync provenance when online)
  return {
    matched: matches,
    offline: true,
    syncRequired: true
  };
}
```
Result: Capability verification and proof generation work entirely on-device. Cloud is only needed for:
- Complex multi-user matching (Workforce Cloud)
- Global capability index updates
- Cross-user attestation verification
CAPABILITY GRAPHS AS UNIVERSAL ABSTRACTION
The Capability Graph is not just for humans.
The same abstraction that tracks human skills, credentials, and evidence must apply to ALL resources in the HUMAN ecosystem:
AI Models Have Capability Graphs
Every AI model has a capability profile:
```typescript
interface AIModelCapabilityProfile {
  provider: 'anthropic' | 'openai' | 'deepseek' | 'google' | 'local';
  model: string;                    // 'claude-sonnet-4', 'gpt-4o'
  mode: ModelMode;                  // streaming, batch, extended_thinking

  // Same capability node structure as humans!
  capabilities: {
    reasoning: CapabilityScore;     // Chain-of-thought, analysis
    coding: CapabilityScore;        // Code generation, debugging
    synthesis: CapabilityScore;     // Combining sources, creativity
    factualRecall: CapabilityScore; // Knowledge retrieval
    instruction: CapabilityScore;   // Following complex instructions
    safety: CapabilityScore;        // Refusal appropriateness
    speed: CapabilityScore;         // Latency characteristics
  };

  // Constraints
  contextWindow: number;
  outputLimit: number;
  supportedModalities: string[];    // text, code, image, audio

  // Known issues (critical for routing!)
  knownWeaknesses: string[];        // "hallucinates dates", "poor at math"
  avoidFor: string[];               // Task types to avoid

  // Cost - real dollars, not tokens
  pricing: {
    inputPerMillionTokens: number;
    outputPerMillionTokens: number;
    perMinute?: number;             // For realtime voice
  };
}
```
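One way a router might consume this profile is to honor avoidFor as a hard filter before any cost comparison. This is a sketch under assumed types (ModelProfileLite is a hypothetical slice of the profile above):

```typescript
// Pre-filter models by declared weaknesses, then order by input cost.
// ModelProfileLite is an illustrative subset of AIModelCapabilityProfile.
interface ModelProfileLite {
  model: string;
  avoidFor: string[];
  pricing: { inputPerMillionTokens: number; outputPerMillionTokens: number };
}

function eligibleModels(
  pool: ModelProfileLite[],
  taskType: string
): ModelProfileLite[] {
  return pool
    .filter(m => !m.avoidFor.includes(taskType)) // hard exclusion first
    .sort((a, b) => a.pricing.inputPerMillionTokens - b.pricing.inputPerMillionTokens);
}
```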
Agents Have Capability Graphs
Every agent in the HUMAN ecosystem has:
- Tools — What tools does this agent have access to?
- Domains — What knowledge domains is it trained for?
- Permissions — What actions can it take?
- Trust Level — Based on track record and verification
Services Have Capability Graphs
External services also have queryable profiles:
- SLAs — What latency and availability guarantees?
- Failure Modes — What are known failure patterns?
- Capacity — What throughput can it handle?
- Cost — What does it cost per operation?
The Universal Query Interface
The same query interface works for ALL resource types:
```typescript
interface CapabilityQuery {
  requiredCapabilities: string[];  // What capabilities are needed
  minConfidence?: number;          // Minimum capability score
  constraints?: {
    maxCost?: number;              // Dollar budget
    maxLatency?: number;           // Time budget
    safetyRequirements?: string[]; // Non-negotiable guardrails
  };
}

// Works for humans, AI models, agents, and services
function findMatchingResources(
  query: CapabilityQuery,
  resourcePool: Resource[]
): Resource[] {
  return resourcePool.filter(r =>
    meetsCapabilityRequirements(r.capabilities, query.requiredCapabilities) &&
    meetsConstraints(r, query.constraints)
  );
}
```
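findMatchingResources above delegates to two predicates it does not define. Minimal versions might look like the following, assuming capabilities is a name-to-weight map and the constraint fields shown above (the Resource fields here are assumptions):

```typescript
// Illustrative predicate implementations for the universal query sketch.
interface Resource {
  capabilities: Record<string, number>;
  costPerTask?: number;   // assumed field, not canonical
  p50LatencyMs?: number;  // assumed field, not canonical
}

interface Constraints { maxCost?: number; maxLatency?: number }

function meetsCapabilityRequirements(
  caps: Record<string, number>,
  required: string[],
  minConfidence = 0
): boolean {
  // Every required capability must be present at or above the floor.
  return required.every(name => (caps[name] ?? 0) >= minConfidence);
}

function meetsConstraints(r: Resource, c?: Constraints): boolean {
  if (!c) return true;
  if (c.maxCost !== undefined && (r.costPerTask ?? 0) > c.maxCost) return false;
  if (c.maxLatency !== undefined && (r.p50LatencyMs ?? 0) > c.maxLatency) return false;
  return true;
}
```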
Why This Unification Matters
- Single routing primitive — The Universal Routing Primitive uses capability graphs for all resource types
- Consistent evaluation — Same logic for matching humans to tasks and models to queries
- Explainable routing — "We chose Claude because it scored higher on reasoning"
- Learning loop — Quality feedback updates capability profiles for all resources
See: 35_capability_routing_pattern.md — The capability-first routing pattern
ANTI-SOCIAL-CREDIT TECHNICAL GUARANTEES
Critical Design Commitment:
The Capability Graph is designed to REVEAL human capability, not RANK humans. This is not aspirational—it's enforced through technical design and governance.
The Risk We're Defending Against
Social credit systems:
- Rank and sort humans
- Create hierarchies and scores
- Enable discrimination and gatekeeping
- Are owned/controlled by central authorities
- Lack individual consent and control
The Capability Graph is fundamentally different:
- Evidence-based, not score-based
- Opt-in and consent-driven
- Human-owned and portable
- Context-specific, not universal ranking
- Designed for revelation, not exclusion
Technical Safeguards (Implemented)
1. No Global Leaderboards (Ever)
Rule: The system will NEVER display:
- "Top 100 nurses"
- "Best JavaScript developers"
- "Highest-rated reviewers"
- Any global ranking or comparative scoring
Technical enforcement:
- No API endpoints that return ranked lists of humans
- UI explicitly prohibits comparative displays
- Analytics aggregated only (no individual rankings)
Exception: Resource routing (internal to HumanOS) uses capability matching for task assignment, but this is NEVER exposed as a ranking.
```typescript
// ❌ FORBIDDEN
function getTopDevelopers(count: number): Developer[] {
  // This will never exist
}

// ✅ ALLOWED
function findQualifiedDevelopers(
  requiredCapabilities: string[],
  minConfidence: number
): Developer[] {
  // Returns qualified candidates, not a ranked list
  // Order is NOT significant
}
```
2. No Numeric Scores Visible to Third Parties
Rule: Capability weights (0.0-1.0) are internal system values, NEVER displayed to:
- Other humans
- Enterprises
- Third-party services
- Anyone except the capability owner
What IS visible:
- "Has demonstrated capability in X"
- Evidence pointers (credentials, work history, attestations)
- Context-specific qualifications
- Confidence intervals (e.g., "High confidence in medical triage")
What is NOT visible:
- "0.87 capability score"
- Numeric comparisons between people
- Percentile rankings
```typescript
// ❌ FORBIDDEN - Exposing numeric scores
interface PublicCapabilityView {
  name: string;
  score: number; // NO
}

// ✅ ALLOWED - Evidence-based disclosure
interface PublicCapabilityView {
  name: string;
  evidencePointers: Evidence[];
  confidenceLevel: 'low' | 'medium' | 'high';
  lastDemonstrated: Date;
}
```
3. Capability Assertions Require Evidence Pointers
Rule: Every capability claim must link to verifiable evidence:
- Work completed (provenance logs)
- Training completed (Academy records)
- Credentials earned (external attestations)
- Peer attestations (signed by other Passport holders)
No unsupported claims:
- Can't just say "I'm good at X"
- Must point to WHERE and WHEN capability was demonstrated
- Evidence is cryptographically signed
```typescript
interface CapabilityAssertion {
  capability: string;
  evidence: Evidence[]; // REQUIRED, not optional
  attestedBy: PassportDID;
  timestamp: Date;
  signature: string;
}

interface Evidence {
  type: 'work' | 'training' | 'credential' | 'peer_attestation';
  source: string; // Verifiable pointer
  date: Date;
  signature: string;
}
```
4. Regular Bias Audits
Commitment: Independent audits for:
- Algorithmic bias in capability inference
- Demographic disparities in capability weights
- Access patterns (who gets opportunities?)
- Evidence collection fairness
Conducted by:
- External ethics advisory board member
- Third-party algorithmic fairness researchers
- Scheduled quarterly (minimum)
Audit findings:
- Published publicly (aggregated, anonymized)
- Action plan for any identified bias
- System updates tracked and documented
Enforcement:
- Ethics board has authority to flag concerns
- HUMAN Labs must respond within 30 days
- Foundation (post-transition) has veto power on standards
5. Public Commitment Document
"The Capability Truth Commitment" (Published at launch)
HUMAN commits to:
- Never ranking humans globally
- Never exposing numeric capability scores to third parties
- Requiring evidence for all capability assertions
- Conducting regular bias audits
- Publishing audit findings publicly
- Maintaining human ownership of capability data
- Enabling opt-out and data deletion at any time
- Prohibiting capability data from being sold
- Enforcing these principles via governance structure
- Transitioning governance to independent Foundation
This is a binding commitment, not marketing.
Governance Safeguards
Phase 1: HUMAN Labs (Seed to Series A)
Who controls standards:
- HUMAN Labs proposes capability standards
- Ethics advisory board provides oversight
- All decisions logged publicly
- Community can review and provide feedback
Constraints:
- No standards that enable ranking
- No standards that expose numeric scores
- All standards must include evidence requirements
- Bias audit findings must be addressed
Phase 2: Governance Council (Series A to B)
Who controls standards:
- Multi-stakeholder governance council
- Labs proposes, council reviews/approves
- Includes: enterprises, researchers, privacy advocates
- Public comment period (30 days minimum)
Constraints:
- Same as Phase 1, plus:
- Multi-party consensus for breaking changes
- Independent ethics board has veto power on discriminatory standards
Phase 3: Foundation (Series B+)
Who controls standards:
- Independent Foundation
- Separated from HUMAN Labs operations
- Community governance with elected board
- HUMAN Labs becomes one stakeholder among many
Constraints:
- Constitutional commitment to anti-ranking principles
- Foundation charter prohibits social credit uses
- Governance structure prevents capture by any single party
User Rights (Always Enforced)
Every human with a Capability Graph has the right to:
- View their own data (always free)
- Export their data (portable format)
- Delete their data (right to be forgotten)
- Contest inaccuracies (dispute resolution process)
- Control visibility (who sees what capabilities)
- Opt out of inference (no ML-derived capabilities without consent)
- Understand decisions (why was I routed/not routed for a task?)
- Appeal routing decisions (human review available)
These are not features. These are rights.
Why This Matters
The Capability Graph will be attacked.
Critics will say:
- "This is just LinkedIn with crypto"
- "You're building a social credit system"
- "This enables discrimination"
- "Who decides what capabilities matter?"
Our defense:
- Technical: We've designed anti-ranking into the system
- Governance: Independent oversight prevents misuse
- Rights: Users control their data and can opt out
- Transparency: Public audits, published findings, open governance
- Evidence: Not our word—external ethics board validates
This is not theoretical. This is operational.
Cross-References
- See: 13_foundational_principles.md - Principle Two: "It is not for exclusion. It is for revelation."
- See: 48_governance_model_and_constitutional_layer.md - Governance transition timeline
- See: kb/internal/founder-decisions-2025-12-11.md - Decision #1: Social Credit Defense
Metadata
Source Sections:
- Lines 32,241-32,593: SECTION 81 — The Capability Graph Engine v0.1
Merge Strategy: Extract directly (single comprehensive spec)
Strategic Purposes:
- Building (primary)
- Companion
- Product Vision
Cross-References:
- See: 35_capability_routing_pattern.md - The capability-first routing pattern
- See: 04_the_five_systems.md - Capability Graph overview
- See: 05_the_human_protocol.md - Graph in the loop
- See: 20_passport_identity_layer.md - Identity integration
- See: 22_humanos_orchestration_core.md - How HumanOS uses the graph
- See: 24_human_academy.md - How Academy feeds the graph
- See: 25_workforce_cloud.md - How work updates the graph
- See: 49_devops_and_infrastructure_model.md - Edge caching for capability profiles
- See: 65_cost_controls_and_ai_optimization.md - LLM routing using capability profiles
- See: 97_api_specification_capability_graph.md - Capability Graph API
Line Count: ~355 lines
Extracted: November 24, 2025
Version: 2.0 (Complete Reorganization)