HUMΛN
Architecture

Protocol-Level Prompt Management: Why AI Prompts Deserve First-Class Identity

HUMΛN Team · 12 min · Technical + General

Every AI system runs on prompts. They shape how models think, what they produce, and whether the output is useful or garbage. And yet, across the industry, prompts are treated like throwaway strings — hardcoded in source files, invisible to monitoring, ungoverned by access control, and impossible to improve systematically.

This is the equivalent of deploying production code without version control. It worked in 2005. It doesn't work now.

HUMΛN takes a fundamentally different approach. In the HUMΛN Agent Operating System (HAIO), prompts are first-class managed artifacts — with identity, governance, observability, and a self-improving refinement loop baked into the protocol layer.

This isn't a feature. It's an architectural decision that changes how AI systems are built, operated, and improved.

The Problem: Prompts as Dark Matter

Consider what happens in a typical AI system today:

// Somewhere in agent code...
const response = await llm.complete({
  messages: [{
    role: 'system',
    content: 'You are a helpful assistant that analyzes contracts. Focus on risks and obligations.'
  }]
});

This prompt has no identity. No version. No access control. No telemetry. When the response quality degrades, you can't trace it back to which prompt was used. When someone edits it, there's no audit trail. When you want to know which prompts are costing the most, you're blind.

Now multiply this by hundreds of agents across dozens of teams. You get:

  • No governance: Anyone can edit any prompt with no review process
  • No observability: Which prompts are used how often? Which ones produce the best results? Unknown.
  • No composition: Want to layer a persona, a domain lens, and a task prompt? Copy-paste and hope for the best.
  • No cost visibility: A prompt stack might consume 4,000 tokens per call. You won't know until the invoice arrives.
  • No improvement loop: Prompts get written once and rot. There's no mechanism to identify underperformers and propose fixes.

This is the state of prompt management across the AI industry. It's a governance vacuum.

The HUMΛN Approach: Prompts as Managed Artifacts

In HUMΛN's HAIO protocol, every prompt has:

  1. A canonical URI — unique, namespaced, version-pinned identity
  2. Delegation-based access control — who can read, write, publish, and roll back
  3. Schema-validated inputs — typed variables with defaults and required fields
  4. Inheritance and composition — multi-layer prompt stacks with provenance
  5. Protocol-level telemetry — every LLM call records which prompts were used
  6. A self-improving feedback loop — telemetry drives refinement proposals

Let's walk through each one.

Prompt URI Namespace Model

Every prompt in HUMΛN has a canonical URI, following the same addressing model as agent identities:

prompt://core/companion.canon.root-persona@0.1
prompt://org/acme/legal.contract-analysis@2.1.0
prompt://marketplace/human.companion.task.summarize@0.1

The URI encodes three things: scope (core, org, marketplace), namespace (hierarchical dot-notation), and version (semver).
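As an illustration, the URI shape above can be parsed mechanically. This is a sketch only — the regex, the `PromptUri` type, and the field names are assumptions, not HAIO's actual implementation:

```typescript
// Illustrative parser for the prompt URI shape described above.
// The scope/namespace@version structure comes from the article;
// the regex and types are assumptions.
interface PromptUri {
  scope: string;      // 'core' | 'org' | 'marketplace'
  orgId?: string;     // present only for org-scoped prompts
  namespace: string;  // hierarchical dot-notation, e.g. 'legal.contract-analysis'
  version: string;    // semver, e.g. '2.1.0'
}

function parsePromptUri(uri: string): PromptUri {
  const match = uri.match(/^prompt:\/\/(core|marketplace|org\/([^/]+))\/([^@]+)@(.+)$/);
  if (!match) throw new Error(`Invalid prompt URI: ${uri}`);
  const [, scopePart, orgId, namespace, version] = match;
  return {
    scope: scopePart.startsWith('org/') ? 'org' : scopePart,
    ...(orgId ? { orgId } : {}),
    namespace,
    version,
  };
}
```

Parsing `prompt://org/acme/legal.contract-analysis@2.1.0` with this sketch yields the org scope, the `acme` org ID, the namespace, and the pinned version as separate fields.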

Three scopes exist, forming a hierarchy where each layer can extend the one above:

graph TD
  Core["prompt://core/<br/>HUMΛN stewards · Protocol-level<br/>Always human-approved"]
  Org["prompt://org/acme/<br/>Org-scoped · Extends core<br/>Delegation-gated access"]
  MP["prompt://marketplace/<br/>Publisher templates<br/>Versioned and installable"]
  Agent["Your Agent<br/>ctx.prompts.load()"]
  Core -->|extends| Org
  MP -->|installed into| Org
  Org -->|resolution priority| Agent
  Core -.->|fallback resolution| Agent

  • Core (prompt://core/): Protocol-level prompts maintained by HUMΛN stewards. These define the canonical behaviour of the system — personas, lenses, foundational tasks. Changes require human approval, always.
  • Org (prompt://org/{orgId}/): Organisation-level prompts customised for specific teams and use cases. Organisations can extend core prompts with local overrides.
  • Marketplace (prompt://marketplace/): Published prompt templates that organisations can install and customise.

Short-Form Resolution

Developers don't need to write full URIs in code. The system resolves short keys within the agent's org context:

// Developer writes:
const prompt = await ctx.prompts.load('contract-analysis');

// System resolves (org-first, then core):
// → prompt://org/acme/legal.contract-analysis@active

Resolution order: (1) org-level prompt matching the short name, (2) core prompt, (3) explicit full URI (no resolution needed).

This is the "Magic by Default, Control When Needed" principle in action: 90% of the time, short keys are all you need. When you need to pin to a specific version or cross-org prompt, the full URI is always available.
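That org-first resolution order can be sketched as a small lookup function. The registry maps and the function name here are illustrative stand-ins for the real database-backed resolver:

```typescript
// Sketch of the org-first resolution order described above.
// The in-memory maps are stand-ins for the real prompt store.
type PromptRegistry = Map<string, string>; // short name → full URI

function resolvePrompt(
  key: string,
  orgPrompts: PromptRegistry,
  corePrompts: PromptRegistry,
): string {
  // (3) Explicit full URIs need no resolution.
  if (key.startsWith('prompt://')) return key;
  // (1) An org-level prompt matching the short name wins...
  const org = orgPrompts.get(key);
  if (org) return org;
  // (2) ...otherwise fall back to a core prompt.
  const core = corePrompts.get(key);
  if (core) return core;
  throw new Error(`Unresolvable prompt key: ${key}`);
}
```

With an org registry containing `contract-analysis`, the short key resolves to the org URI even if a core prompt with the same name exists — which is exactly the override behaviour the hierarchy is designed for.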

Delegation-Based Access Control

Every prompt operation is gated by delegation scopes. This uses the same Capability Boundary Engine (CBE) pattern as all other HUMΛN resources:

Scope                          What It Grants
prompt:read:*                  Read any prompt in the org
prompt:read:companion.task.*   Read only companion task prompts
prompt:write:*                 Edit or create any prompt
prompt:publish:*               Publish prompt versions to the database
prompt:rollback:*              Roll back to a previous version
prompt:admin:*                 Cross-org administration (stewards only)

Wildcard matching works at the namespace level. An agent with prompt:read:legal.* can load any prompt in the legal namespace but can't access companion.task.* prompts.
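A minimal sketch of that matching rule, assuming scope strings are colon-separated resource:action:namespace triples as shown in the table (the helper name and the exact `*` semantics are assumptions):

```typescript
// Sketch of namespace-level wildcard matching for delegation scopes.
// Assumes scopes have the shape resource:action:namespace,
// e.g. 'prompt:read:legal.*'.
function scopeMatches(granted: string, requested: string): boolean {
  const [gRes, gAct, gNs] = granted.split(':');
  const [rRes, rAct, rNs] = requested.split(':');
  if (gRes !== rRes || gAct !== rAct) return false;
  if (gNs === '*') return true;                  // prompt:read:* matches everything
  if (gNs.endsWith('.*')) {                      // prompt:read:legal.* matches legal.<anything>
    return rNs.startsWith(gNs.slice(0, -1));     // keep the trailing dot when comparing
  }
  return gNs === rNs;                            // otherwise require an exact namespace
}
```

Under this sketch, `prompt:read:legal.*` grants `prompt:read:legal.contract-analysis` but rejects `prompt:read:companion.task.summarize`, matching the behaviour described above.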

This enforcement happens at every layer:

  • API endpoints: Middleware checks delegation before serving any prompt data
  • ctx.prompts: The SDK verifies scopes before returning content to agent code
  • CLI: Server-side enforcement — the CLI passes the auth token, the server checks scopes

No listing or editing of prompts you're not authorised to see. Period.

Inheritance and Composition

Prompts can extend other prompts, creating layered compositions:

# prompts/orgs/acme/legal/contract-analysis.md
---
id: contract-analysis
namespace: legal
type: task
scope: org
extends: prompt://core/companion.canon.root-persona
inputSchema:
  contract: { type: string, required: true }
  focus_areas: { type: string, required: false, default: "risks, obligations" }
version: '1.0.0'
---
Analyze the following contract with focus on {{focus_areas}}.

Contract:
{{contract}}

This org-level prompt extends the core persona, inheriting its identity and principles while adding domain-specific instructions. The system resolves the full inheritance chain, detects cycles, and produces an EffectivePrompt with full layer provenance.
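The chain resolution and cycle detection described above can be sketched as follows, assuming each prompt record carries an optional extends reference. The types and names are illustrative, not the HAIO API:

```typescript
// Sketch of inheritance-chain resolution with cycle detection.
// Assumes each stored prompt optionally points at a parent via `extendsUri`.
interface PromptRecord {
  uri: string;
  extendsUri?: string;
  body: string;
}

// Returns the chain root-first, so later layers override earlier ones.
function resolveChain(
  uri: string,
  store: Map<string, PromptRecord>,
): PromptRecord[] {
  const chain: PromptRecord[] = [];
  const seen = new Set<string>();
  let current: string | undefined = uri;
  while (current) {
    if (seen.has(current)) throw new Error(`Inheritance cycle at ${current}`);
    seen.add(current);
    const record = store.get(current);
    if (!record) throw new Error(`Missing prompt: ${current}`);
    chain.unshift(record); // parents first, leaf last
    current = record.extendsUri;
  }
  return chain;
}
```

An EffectivePrompt would then be assembled from this root-first chain, with each layer's URI and version recorded as provenance.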

For complex interactions, agents compose multiple prompts explicitly:

const composed = await ctx.prompts.compose([
  'root-persona',       // Core identity
  'lens-legal',         // Domain lens
  'contract-analysis',  // Specific task
], { variables: { contract: input.contract } });

The composed result includes metadata about every layer — which prompts contributed, their URIs, versions, and scopes. This metadata flows through the entire LLM call into provenance.
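The {{variable}} substitution used in the template above, together with schema defaults and required-field checks, might look like this sketch. The schema shape is inferred from the front-matter example; none of this is HAIO's actual renderer:

```typescript
// Sketch of {{variable}} substitution with schema defaults and
// required-field validation. The FieldSpec shape is an assumption
// inferred from the inputSchema front-matter above.
interface FieldSpec {
  type: string;
  required: boolean;
  default?: string;
}

function renderPrompt(
  template: string,
  schema: Record<string, FieldSpec>,
  variables: Record<string, string>,
): string {
  // Reject calls that omit a required variable before rendering anything.
  for (const [name, spec] of Object.entries(schema)) {
    if (spec.required && !(name in variables)) {
      throw new Error(`Missing required variable: ${name}`);
    }
  }
  // Substitute provided values, falling back to schema defaults.
  return template.replace(/\{\{(\w+)\}\}/g, (_, name) =>
    variables[name] ?? schema[name]?.default ?? '',
  );
}
```

Rendering the contract-analysis template with only `contract` supplied would fill `focus_areas` from its declared default, "risks, obligations".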

Version Control and the CLI

Prompts are version-controlled with full publish/rollback/deprecate lifecycle:

# Development workflow (local, no delegation needed)
human prompts lint                              # Validate schema and inheritance
human prompts render contract-analysis \
  --var contract="Sample text"                  # Preview rendered output
human prompts cost contract-analysis \
  --model gpt-4o                                # "~1,200 tokens, ~$0.006/call"

# Publishing (requires prompt:publish scope)
human prompts publish contract-analysis         # Publishes v1.0.0 to DB

# Operations
human prompts versions contract-analysis        # List all versions
human prompts rollback contract-analysis \
  --to 1.0.0                                    # Instant rollback
human prompts deprecate contract-analysis \
  --version 0.9.0                               # Mark old version

Published prompts are stored in a database with immutable version history. In development, file-based prompts take effect immediately. In production, published DB versions take precedence. This dual-mode design means developers get instant feedback during authoring while production gets the stability of versioned releases.
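The per-call estimate that `human prompts cost` prints can be approximated as token count times price. A rough sketch, using a ~4-characters-per-token heuristic and an illustrative price per million tokens — real tokenizers and model prices differ:

```typescript
// Rough cost estimate in the spirit of `human prompts cost`.
// The 4-chars-per-token heuristic and the price argument are
// assumptions, not real tokenizer output or current model pricing.
function estimateCost(
  renderedPrompt: string,
  usdPerMillionTokens: number,
): { tokens: number; usdPerCall: number } {
  const tokens = Math.ceil(renderedPrompt.length / 4); // ~4 chars/token heuristic
  return {
    tokens,
    usdPerCall: (tokens / 1_000_000) * usdPerMillionTokens,
  };
}
```

A 4,800-character rendered prompt at an assumed $5 per million tokens comes out to roughly 1,200 tokens and $0.006 per call — the same order of magnitude as the CLI output shown above.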

Why This Matters for the Industry

Prompts are the new source code for AI systems. They determine what models produce, how they reason, and whether the output meets quality standards. Treating them as unmanaged strings is a governance failure that the industry will eventually reckon with.

HUMΛN's approach — identity, governance, composition, observability, and self-improvement at the protocol level — is what prompt management looks like when you design it from first principles rather than bolting it on as an afterthought.

The prompt management system is part of the HAIO protocol. It works for every agent, not just the Companion. It's delegation-gated, not permission-free. And it closes the loop between usage, performance, and improvement.

Prompts deserve better than hardcoded strings. At HUMΛN, they get it.


This is the first in a three-part series on HUMΛN's prompt management architecture. Next: The Self-Improving Prompt Loop — how telemetry, model affinity, and the Prompt Refinement Agent create a virtuous cycle of continuous improvement.