From Inline Strings to ctx.prompts: A Developer's Guide to HUMΛN Prompt Management
If you've built agents on HUMΛN before, you've probably written prompts like this:
```typescript
const result = await ctx.llm.complete({
  messages: [{
    role: 'system',
    content: 'You are a helpful assistant that summarizes documents concisely.'
  }, {
    role: 'user',
    content: `Summarize this: ${input.document}`
  }]
});
```
It works. But it has problems: no version control, no access control, no telemetry, no way to improve it systematically, and no provenance — if something goes wrong, you can't trace which prompt produced the result.
HUMΛN's ctx.prompts API changes this. This guide walks through the full developer workflow: authoring prompts, validating them, composing multi-layer stacks, publishing versions, and wiring telemetry — all with real code examples.
Step 1: Author a Prompt File
Prompts live as markdown files with YAML frontmatter. Create a file in the prompts directory:
```markdown
# prompts/orgs/YOUR_ORG/research/document-summary.md
---
id: document-summary
namespace: research
type: task
scope: org
extends: prompt://core/companion.canon.root-persona
inputSchema:
  document: { type: string, required: true, description: "Document text to summarize" }
  style: { type: string, required: false, default: "concise bullets" }
  max_length: { type: string, required: false, default: "200 words" }
version: '1.0.0'
---

Summarize the following document in {{style}} format.
Keep the summary under {{max_length}}.

Focus on:
- Key findings and conclusions
- Action items if any
- Critical data points

Document:
{{document}}
```
Key elements:
- `id`: Short identifier used in code (`ctx.prompts.load('document-summary')`)
- `namespace`: Hierarchical grouping (`research`, `legal.contracts`, `companion.task`)
- `extends`: Parent prompt for inheritance — this task inherits the core persona
- `inputSchema`: Typed variables with required/optional flags and defaults
- `version`: Semver — immutable once published
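The frontmatter fields can be modeled roughly as a TypeScript shape. This is an illustrative sketch based only on the example file above; the unions for `type` and `scope` are assumptions, and the SDK's actual types may differ:

```typescript
// Hypothetical shape of prompt frontmatter, inferred from the example file above.
interface PromptFrontmatter {
  id: string;                          // short key used by ctx.prompts.load()
  namespace: string;                   // hierarchical grouping, e.g. 'research'
  type: 'task' | 'persona' | 'lens';   // assumed union; the real set may be larger
  scope: 'core' | 'org' | 'user';      // assumed union
  extends?: string;                    // parent prompt URI for inheritance
  inputSchema?: Record<string, {
    type: string;
    required?: boolean;
    default?: string;
    description?: string;
  }>;
  version: string;                     // semver, immutable once published
}

const documentSummary: PromptFrontmatter = {
  id: 'document-summary',
  namespace: 'research',
  type: 'task',
  scope: 'org',
  extends: 'prompt://core/companion.canon.root-persona',
  inputSchema: {
    document: { type: 'string', required: true },
    style: { type: 'string', required: false, default: 'concise bullets' },
  },
  version: '1.0.0',
};
```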
Step 2: Validate and Preview
Before publishing, validate your prompt locally:
```shell
# Lint all prompts — checks schema, inheritance chains, variable consistency
pnpm prompt:lint

# Render a specific prompt with test variables
pnpm prompt:render document-summary \
  --var document="The quarterly report shows 23% revenue growth..." \
  --var style="executive brief"

# Estimate token count and cost
pnpm prompt:cost document-summary --model gpt-4o
# Output: ~850 tokens, ~$0.004/call (input) + ~$0.012/call (output est.)
```
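For a quick sanity check before running the CLI, the common "about four characters per token" heuristic gives a ballpark figure. This is a rough approximation for illustration, not the tokenizer the platform actually uses:

```typescript
// Rough token estimate: ~4 characters per token for English text.
// A common heuristic only; exact counts require the model's real tokenizer.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

const sample = 'Summarize the following document in concise bullets format.';
console.log(estimateTokens(sample));
```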
The linter catches common mistakes:
- Missing required variables in the template body
- Variables in the template that aren't declared in inputSchema
- Broken inheritance chains (parent prompt doesn't exist)
- Schema type mismatches
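The second check can be sketched in a few lines: extract `{{placeholders}}` from the template body and compare them against the declared schema keys. This is an illustrative sketch, not the linter's actual implementation:

```typescript
// Sketch of one lint rule: find {{variables}} used in the template body
// that are not declared in the frontmatter's inputSchema.
function findUndeclaredVariables(template: string, schemaKeys: string[]): string[] {
  const used = new Set<string>();
  for (const match of template.matchAll(/\{\{\s*(\w+)\s*\}\}/g)) {
    used.add(match[1]);
  }
  return [...used].filter((name) => !schemaKeys.includes(name));
}

const template = 'Summarize in {{style}} format.\nKeep under {{max_length}}.\n{{document}}';
console.log(findUndeclaredVariables(template, ['document', 'style']));
// ['max_length'] — used in the template but not declared
```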
Step 3: Load and Render in Agent Code
In your agent handler, use ctx.prompts instead of inline strings:
```typescript
import { AgentHandler } from '@human/agent-sdk';

export const handler: AgentHandler = async (ctx, input) => {
  // Load prompt — resolves short key to full URI within org context
  // Delegation checked: agent must have prompt:read:research.document-summary
  const prompt = await ctx.prompts.load('document-summary');

  // Render with validated variables
  // Required variables enforced, defaults applied for optional ones
  const rendered = prompt.render({
    document: input.document,
    style: input.style ?? 'concise bullets',
  });

  // LLM call with prompt identity threaded into provenance
  const result = await ctx.llm.complete({
    messages: [{ role: 'system', content: rendered }],
    promptMetadata: prompt.toCallMetadata(),
  });

  return result;
};
```
What happens under the hood:
- `load('document-summary')` resolves to `prompt://org/{orgId}/research.document-summary@active`
- Delegation is verified: the agent's scopes must include `prompt:read:research.document-summary` (or `prompt:read:*`)
- `render()` validates variables against `inputSchema`, applies defaults, and substitutes `{{placeholders}}`
- `toCallMetadata()` generates a `PromptCallMetadata` object with full URI, version, and composition info
- The metadata flows through `ctx.llm.complete()` into provenance, so this call is traceable
Step 4: Compose Multi-Layer Prompts
For complex interactions, compose multiple prompts into a single stack:
```typescript
export const handler: AgentHandler = async (ctx, input) => {
  // Compose three layers — all delegation-checked individually
  const composed = await ctx.prompts.compose([
    'root-persona',      // prompt://core/companion.canon.root-persona@0.1
    'lens-research',     // prompt://org/acme/research.lens@1.0
    'document-summary',  // prompt://org/acme/research.document-summary@1.0
  ], {
    variables: {
      document: input.document,
      style: 'structured analysis',
    },
  });

  // composed.content = concatenated layers with separators
  // composed.metadata = all URIs, all versions, all layers
  // composed.estimatedTokens = total token estimate

  const result = await ctx.llm.complete({
    messages: [{ role: 'system', content: composed.content }],
    promptMetadata: composed.metadata, // Full provenance of all layers
  });

  return result;
};
```
The composition metadata records which prompts contributed at which layer, so provenance shows the complete picture — not just "a system prompt was used" but exactly which persona, lens, and task prompt combined to produce this response.
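To make that concrete, a per-layer metadata record might look roughly like this. The shape and field names below are assumptions for illustration, not the SDK's actual `PromptCallMetadata` type:

```typescript
// Hypothetical shape of composition metadata, for illustration only.
interface ComposedLayer {
  uri: string;    // full prompt URI, including version
  scope: string;  // e.g. 'core' or 'org'
  layer: number;  // position in the composed stack
}

const metadata: { layers: ComposedLayer[] } = {
  layers: [
    { uri: 'prompt://core/companion.canon.root-persona@0.1', scope: 'core', layer: 0 },
    { uri: 'prompt://org/acme/research.lens@1.0', scope: 'org', layer: 1 },
    { uri: 'prompt://org/acme/research.document-summary@1.0', scope: 'org', layer: 2 },
  ],
};

// Provenance can then answer "which prompts produced this response?" precisely:
for (const l of metadata.layers) {
  console.log(`layer ${l.layer}: ${l.uri}`);
}
```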
Step 5: Wire Feedback Signals
Agents can report on prompt quality to feed the telemetry loop:
```typescript
export const handler: AgentHandler = async (ctx, input) => {
  const prompt = await ctx.prompts.load('document-summary');
  const rendered = prompt.render({ document: input.document });

  const result = await ctx.llm.complete({
    messages: [{ role: 'system', content: rendered }],
    promptMetadata: prompt.toCallMetadata(),
  });

  // Evaluate output quality and feed back.
  // JSON.parse throws on unparseable output, so wrap it; otherwise the
  // negative signal for "unparseable" would never be recorded.
  let summaryOk = false;
  try {
    const parsed = JSON.parse(result.content);
    summaryOk = Boolean(parsed.summary && parsed.summary.length > 50);
  } catch {
    summaryOk = false;
  }

  if (summaryOk) {
    // Good result — positive signal
    await ctx.llm.recordPromptFeedback({
      provenanceId: result.provenanceId,
      signal: 'positive',
      source: 'agent',
      detail: 'Produced structured summary with sufficient detail',
    });
  } else {
    // Poor result — negative signal
    await ctx.llm.recordPromptFeedback({
      provenanceId: result.provenanceId,
      signal: 'negative',
      source: 'agent',
      detail: 'Summary too short or unparseable',
    });
  }

  return result;
};
```
These signals accumulate and feed into the Prompt Refinement Agent, which uses them to identify underperforming prompts and propose improvements.
Step 6: Publish and Manage Versions
When your prompt is ready for production:
```shell
# Publish to the database (requires prompt:publish scope)
human prompts publish document-summary
# Output:
#   Published: prompt://org/YOUR_ORG/research.document-summary@1.0.0
#   Active version updated. Previous: none (first publish)

# Later, update the prompt file and publish again
human prompts publish document-summary
# Output:
#   Published: prompt://org/YOUR_ORG/research.document-summary@1.1.0
#   Active version updated. Previous: 1.0.0

# Check version history
human prompts versions document-summary
#   1.1.0  active     2026-02-14  published by rick@human.dev
#   1.0.0  available  2026-02-10  published by rick@human.dev

# Rollback if the new version causes issues
human prompts rollback document-summary --to 1.0.0
#   Rolled back: document-summary@1.0.0 is now active

# Check performance
human prompts performance document-summary
#   Calls: 234 (last 7 days) | Avg tokens: 847 | Positive: 81%
```
In development, file-based prompts take effect immediately: edit the file and the agent picks up the change on its next call. In production, the published DB version takes precedence. This gives you instant iteration during development with the stability of versioned releases in production.
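That precedence rule can be sketched as a small resolver. This is an illustrative sketch under stated assumptions; the function and field names are invented here, not the platform's actual loader:

```typescript
// Sketch of dev/prod prompt resolution precedence:
// in development, prefer the local file; in production, prefer the published DB version.
interface PromptSource {
  content: string;
  origin: 'file' | 'db';
}

function resolvePrompt(
  env: 'development' | 'production',
  file: PromptSource | null,
  published: PromptSource | null,
): PromptSource | null {
  if (env === 'development') {
    return file ?? published;  // instant iteration: local edits win
  }
  return published ?? file;    // stability: the versioned release wins
}

const file = { content: 'draft template', origin: 'file' as const };
const db = { content: 'published template', origin: 'db' as const };
console.log(resolvePrompt('development', file, db)?.origin); // 'file'
console.log(resolvePrompt('production', file, db)?.origin);  // 'db'
```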
Step 7: Use Effective Prompts (Inheritance Resolution)
If you need the fully-resolved prompt (with inheritance applied):
```typescript
const effective = await ctx.prompts.getEffective('document-summary');

if (effective) {
  // effective.content = core persona + org task merged
  // effective.layers = [{scope: 'core', ...}, {scope: 'org', ...}]
  // effective.effectiveVersion = hash of all contributing versions
  console.log(`Resolved ${effective.layers.length} layers`);
}
```
This resolves the full inheritance chain (child -> parent -> grandparent), merging content with provenance tracking at every level.
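The chain walk can be sketched recursively: follow `extends` links from child to root, then merge content root-first so each child layer builds on its parents. This is illustrative only; the real resolver also handles versioning, delegation checks, and cycle detection:

```typescript
// Sketch of inheritance resolution: walk extends links up to the root,
// then return layers root-first so child content lands last.
interface PromptRecord {
  uri: string;
  extends?: string;  // parent prompt URI
  content: string;
}

function resolveChain(
  uri: string,
  registry: Map<string, PromptRecord>,
): PromptRecord[] {
  const record = registry.get(uri);
  if (!record) throw new Error(`Unknown prompt: ${uri}`);
  const parents = record.extends ? resolveChain(record.extends, registry) : [];
  return [...parents, record];  // root first, child last
}

const registry = new Map<string, PromptRecord>([
  ['prompt://core/root', { uri: 'prompt://core/root', content: 'persona' }],
  ['prompt://org/task', { uri: 'prompt://org/task', extends: 'prompt://core/root', content: 'task' }],
]);

const chain = resolveChain('prompt://org/task', registry);
console.log(chain.map((r) => r.content).join('\n---\n'));
// persona
// ---
// task
```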
Migration Guide: Inline to Managed
Converting existing agents from inline prompts to ctx.prompts:
Before (inline)
```typescript
export const handler: AgentHandler = async (ctx, input) => {
  const result = await ctx.llm.complete({
    messages: [{
      role: 'system',
      content: 'You are a helpful assistant that summarizes documents concisely.'
    }, {
      role: 'user',
      content: `Summarize this: ${input.document}`
    }]
  });
  return result;
};
```
After (managed)
```typescript
export const handler: AgentHandler = async (ctx, input) => {
  const prompt = await ctx.prompts.load('document-summary');
  const rendered = prompt.render({ document: input.document });

  const result = await ctx.llm.complete({
    messages: [{ role: 'system', content: rendered }],
    promptMetadata: prompt.toCallMetadata(),
  });

  // Default to 0 so the comparison is well-typed when parsed is absent
  if ((result.parsed?.confidence ?? 0) > 0.8) {
    await ctx.llm.recordPromptFeedback({
      provenanceId: result.provenanceId,
      signal: 'positive',
      source: 'agent',
    });
  }

  return result;
};
```
The migration gives you: version control, delegation-based access, schema validation, token cost estimation, full provenance, and participation in the telemetry/refinement loop — for the cost of a few extra lines of code.
Quick Reference: ctx.prompts API
| Method | Description | Delegation |
|---|---|---|
| `load(id)` | Load prompt by short key or full URI | `prompt:read:{key}` |
| `compose(ids, opts)` | Compose multiple prompts into a stack | `prompt:read` for each |
| `getEffective(key)` | Resolve inheritance chain | `prompt:read` for chain |
| `list(filter?)` | List accessible prompts | Filtered by `prompt:read` |
| `estimateTokens(id)` | Estimate token count | `prompt:read:{key}` |

| Method on LoadedPrompt | Description |
|---|---|
| `render(variables)` | Substitute and validate variables |
| `toCallMetadata()` | Generate PromptCallMetadata for LLM calls |
What's Next
With managed prompts, your agents participate in a system-wide improvement loop. The Prompt Refinement Agent monitors telemetry, model affinity data optimizes routing, and improvement proposals surface for human review.
You write the prompt once. The system makes it better over time.
This is the third in a three-part series on HUMΛN's prompt management architecture. Previously: Protocol-Level Prompt Management and The Self-Improving Prompt Loop.
Prompt Management — Part 3 of 3