CLI & SDK Scaffolding for HUMΛN Workflows — From Zero to Production in 15 Minutes
The problem with building workflows from scratch
Building a production-grade agentic workflow has a lot of surface area:
- 6–8 agents with consistent SDK compliance
- Typed event contracts for inter-agent communication
- Feedback emission at every step (mandatory)
- Manifest declaration (workflow + CP + Companion + Workforce)
- Marketplace bundle packaging
- Test fixtures and mock contexts
- Database migrations for config layers
If you start from scratch, a first-pass implementation takes a full day. Most of that time is scaffolding — not product logic.
The HUMΛN CLI eliminates the scaffolding. Here's how.
Creating a new workflow
human workflow create my-intelligence-workflow
What this generates:
packages/agents-reference/src/agents/my-intelligence-workflow/
├── types.ts # Domain-specific signal/artifact types
├── source-scout.ts # Source monitoring agent (template)
├── trust-safety.ts # Trust & safety gate (template)
├── signal-normalizer.ts # Signal normalization (template)
├── signal-judge.ts # Signal scoring (template)
├── opportunity-router.ts # Routing (template)
├── artifact-workers.ts # Artifact generation (template)
├── delivery-orchestrator.ts # Delivery (template)
├── learning-engine.ts # Learning (template)
├── workflow.ts # Orchestrator (template)
├── manifest.ts # humanos.workflow.v1 manifest (template)
├── marketplace-bundle.ts # Bundle definition (template)
├── event-contracts.ts # Typed event payloads (template)
├── install-wizard.ts # Install wizard steps (template)
├── agent-manifests.ts # Agent manifests (template)
├── connector-manifests.ts # Connector manifests (template)
└── index.ts # Public exports
Every template is pre-wired with:
- Correct ctx.* SDK calls (no OpenAI imports, no direct connector HTTP)
- Mandatory ctx.events.emit('humanos.<workflow>.feedback', {...}) at end of every agent
- Canonical @canon-deviation comment pattern for deviations
- Full type safety from the shared types file
Interactive prompts:
- What kind of sources does this workflow monitor? (web/api/event/manual)
- What personas need to be served? (enter comma-separated roles)
- Which artifact types will be generated? (select from registry)
- Does any artifact type require human review? (yes → HITL scaffolding added)
- Install preset count? (1–4)
With smart defaults, most builders just hit enter 5 times.
Creating a new agent
human agent create signals-my-custom-agent --workflow signals
Generates my-custom-agent.ts inside the workflow directory with:
import { handler } from '@human/agent-sdk';
import type { ExecutionContext } from '@human/agent-sdk';
export const AGENT_ID = 'hmn-signals-my-custom-agent';
export const VERSION = '1.0.0';
// Declare the capability strings this agent satisfies.
// The orchestrator routes to this agent by capability (not by ID directly).
export const CAPABILITIES = ['signals.my_custom_action'];
interface Input {
workflow_run_id: string;
org_did: string;
// TODO: add your domain-specific fields
}
interface Output {
// TODO: define your output
}
const execute = async (ctx: ExecutionContext, input: Input): Promise<Output> => {
const { workflow_run_id, org_did } = input;
ctx.log.info('my-custom-agent: starting', { workflow_run_id, org_did });
// TODO: your implementation
// MANDATORY: emit feedback at end of every run — this feeds the learning engine.
// Use a typed feedback_type, not a free-form string.
await ctx.events.emit('humanos.signals.feedback', {
feedback_type: 'my_custom_signal',
source: 'my-custom-agent',
workflow_run_id: input.workflow_run_id,
org_did: input.org_did,
agent_id: AGENT_ID,
signal_strength: 1.0,
metadata: {
// add metrics that matter for learning
},
});
return { /* output */ };
};
export default handler({ name: AGENT_ID, version: VERSION, capabilities: CAPABILITIES, execute });
The feedback emission scaffold is always included. You can't accidentally forget it.
SDK helpers for common patterns
The @human/agent-sdk package ships first-class helpers for every canonical pattern.
Making LLM calls (provider-agnostic)
// ctx.llm.complete is provider-agnostic. Never import OpenAI or Anthropic directly.
const result = await ctx.llm.complete({
prompt: [
{ role: 'system', content: systemPrompt },
{ role: 'user', content: `Analyze this signal: ${JSON.stringify(signal)}` },
],
temperature: 0.3, // factual extraction: low temperature
maxTokens: 512,
});
const analysis = result.content; // string response
const tokens = result.usage.totalTokens; // camelCase — not total_tokens
const cost = result.cost.usd; // cost tracking built in
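When an agent asks the model for structured output, result.content still arrives as a plain string. A small defensive parser keeps that handling consistent across agents — a sketch, where parseLlmJson is a hypothetical helper (not an SDK export) and the fence-stripping behavior is an assumption about how models tend to wrap JSON:

```typescript
// Hypothetical helper (not part of the SDK): defensively parse a JSON object
// out of an LLM response, tolerating markdown code fences around the payload.
export function parseLlmJson<T>(content: string): T | null {
  // Strip optional ```json ... ``` fences the model may add around the object.
  const stripped = content
    .trim()
    .replace(/^```(?:json)?\s*/i, '')
    .replace(/```$/, '')
    .trim();
  try {
    return JSON.parse(stripped) as T;
  } catch {
    return null; // caller decides: retry, escalate, or fall back
  }
}
```

A null return is a signal, not an error: the caller can retry with a stricter prompt or hand the case to ctx.escalate.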
Reading config (magic by default)
Config is provided through input as part of the resolved execution envelope. The orchestrator passes the org's effective config to each agent. Agents read from input — they don't call a config service directly.
// ✅ Read from input (resolved envelope — magic by default)
const threshold = input.thresholds?.min_confidence ?? 0.65;
const digestCadence = input.preferences?.digest_cadence ?? 'daily';
// The 10% override path: operators configure via the CP config form or
// the reference-config-pack.ts primitives. Agents just read input.
This keeps agents stateless and testable — no service calls needed for unit tests.
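The defaulted reads above can be collected into one small helper per workflow so every agent resolves config identically. A sketch under assumptions — resolveConfig and the partial input shape are illustrative names, not SDK exports:

```typescript
// Hypothetical helper: resolve config values from the input envelope with
// explicit defaults, so every agent in the workflow reads config the same way.
interface ResolvedInput {
  thresholds?: { min_confidence?: number };
  preferences?: { digest_cadence?: 'immediate' | 'daily' | 'weekly' };
}

export function resolveConfig(input: ResolvedInput) {
  return {
    minConfidence: input.thresholds?.min_confidence ?? 0.65,
    digestCadence: input.preferences?.digest_cadence ?? 'daily',
  } as const;
}
```

Because the defaults live in one place, changing a workflow-wide default is a one-line edit instead of a hunt across eight agents.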
Calling connectors (platform-mediated)
// ✅ Always call connectors via ctx.call.agent — never direct HTTP
const result = await ctx.call.agent(
'connector.human-web-capture.content.capture',
{
urls: ['https://example.com/blog/post'],
org_did: input.org_did,
}
);
// Each connector is a HUMΛN agent. The platform handles auth, rate limiting,
// credential injection, and retry. Your code handles the result.
Creating artifacts with full provenance
const startMs = Date.now();
const result = await ctx.llm.complete({ prompt: messages, temperature: 0.4, maxTokens: 800 });
// ctx.artifacts.create uses: kind, body, production, metadata
const artifact = await ctx.artifacts.create({
kind: 'signals.executive_brief', // ← 'kind', not 'artifact_type'
body: {
what_changed: result.content,
confidence: signal.confidence,
// ... artifact payload
},
production: {
total_cost_usd: result.cost.usd,
total_tokens: result.usage.totalTokens, // ← camelCase
wall_time_ms: Date.now() - startMs,
agent_chain: [{
agent_id: AGENT_ID,
agent_version: VERSION,
role: 'primary',
model_id: 'platform-default',
cost_usd: result.cost.usd,
tokens_used: {
promptTokens: result.usage.promptTokens,
completionTokens: result.usage.completionTokens,
totalTokens: result.usage.totalTokens,
},
duration_ms: Date.now() - startMs,
outcome: 'success',
}],
},
metadata: {
signal_id: signal.signal_id,
source_urls: signal.evidence_urls,
confidence: signal.confidence,
},
});
const artifactId = (artifact as { artifact_id: string }).artifact_id;
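Every artifact-creating agent repeats the same cost/token/timing bookkeeping, so it can be worth factoring the production block into a helper. A sketch — buildProduction and the LlmResult shape are assumptions, not SDK exports:

```typescript
// Hypothetical helper: build the production provenance block from an LLM
// result, so every artifact-creating agent reports cost/tokens/time identically.
interface LlmResult {
  content: string;
  usage: { promptTokens: number; completionTokens: number; totalTokens: number };
  cost: { usd: number };
}

export function buildProduction(
  result: LlmResult,
  startMs: number,
  agentId: string,
  agentVersion: string,
) {
  const durationMs = Date.now() - startMs;
  return {
    total_cost_usd: result.cost.usd,
    total_tokens: result.usage.totalTokens, // camelCase usage in, snake_case block out
    wall_time_ms: durationMs,
    agent_chain: [{
      agent_id: agentId,
      agent_version: agentVersion,
      role: 'primary',
      model_id: 'platform-default',
      cost_usd: result.cost.usd,
      tokens_used: { ...result.usage },
      duration_ms: durationMs,
      outcome: 'success',
    }],
  };
}
```

With this in place, the ctx.artifacts.create call above shrinks to kind, body, metadata, and production: buildProduction(result, startMs, AGENT_ID, VERSION).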
Requesting approval (HITL gate)
// ctx.approval.request routes an artifact to Workforce Cloud as a work item.
// The renderer_id must match a declared workforce_module.work_item_renderers[].id
// This is fire-and-forget — the pipeline continues; the artifact waits in Workforce.
if (workforceInstallationId) {
await ctx.approval.request({
installation_id: workforceInstallationId,
renderer_id: 'prd-review', // matches manifest.workforce_module renderer
artifact_id: artifactId,
urgency: 'high', // 'low' | 'medium' | 'high' | 'critical'
metadata: {
entity: signal.entity_name,
signal_confidence: signal.confidence,
workflow_run_id: input.workflow_run_id,
},
});
}
Calling another agent
// Cross-agent call: two arguments — the target (a capability string or agent ID) + the input payload
const result = await ctx.call.agent(
'human-agent-marketing-writer',
{
brief: contentBriefPayload,
source_artifact_id: contentBriefArtifact.artifact_id,
org_did: input.org_did,
}
);
Requesting human escalation
// ctx.escalate is the Fourth Law in practice: AI must know when it doesn't know.
// Use the builder functions from signals-escalation-schemas.ts for typed schemas.
await ctx.escalate({
reason: 'Signal confidence below threshold — human review needed',
question: `Should this ${signal.signal_type} signal from ${signal.entity_name} be routed?`,
context: {
signal_id: signal.signal_id,
confidence: signal.confidence,
evidence_urls: signal.evidence_urls,
},
allowedActions: [
{ id: 'approve_route', label: 'Approve — route this signal' },
{ id: 'suppress', label: 'Suppress this signal' },
{ id: 'boost_confidence', label: 'Boost confidence weight for this source' },
],
requiredCapability: 'signals.confidence.review',
priority: 'medium',
routingMode: 'workforce_cloud',
schema: {
displayAs: 'confidence_review',
title: 'Low-Confidence Signal Review',
description: `Confidence ${signal.confidence} is below threshold ${minConfidence}`,
},
});
Persistent memory
// ctx.memory.persistent uses a key-value store. Values are always strings.
// Parse to your type; serialize back to string on write.
const rawCount = await ctx.memory.persistent.get(`signal_count_${signal.entity_name}`);
const count = rawCount !== null ? parseInt(rawCount, 10) : 0;
await ctx.memory.persistent.set(
`signal_count_${signal.entity_name}`,
String(count + 1) // ← always serialize to string
);
Emitting feedback (canonical pattern)
// Always at the end of every agent's execute function.
// feedback_type is typed — use domain-specific names, not generic strings.
// signal_strength is 0.0–1.0: ratio of successful/expected outcomes.
await ctx.events.emit('humanos.signals.feedback', {
feedback_type: 'verdict_signal',
source: 'signal-judge',
workflow_run_id: input.workflow_run_id,
org_did: input.org_did,
agent_id: AGENT_ID,
signal_strength: passedCount / totalCount,
metadata: {
total_signals: signals.length,
passed: passedCount,
suppressed: signals.length - passedCount,
avg_confidence: avgConfidence,
},
});
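Since signal_strength must stay in 0.0–1.0 and a run can legitimately process zero signals, guard the ratio before emitting rather than risk NaN. A minimal sketch; signalStrength is an assumed helper name:

```typescript
// Hypothetical guard: keep signal_strength in [0, 1] and handle the
// 0-of-0 case (an empty run) without emitting NaN to the learning engine.
export function signalStrength(passed: number, total: number): number {
  if (total <= 0) return 0;
  return Math.min(1, Math.max(0, passed / total));
}
```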
Creating a new install preset
human workflow add-preset signals --name "Partnership Watch"
Appends a new preset to marketplace-bundle.ts:
'partnership-watch': {
id: 'partnership-watch',
name: 'Partnership Watch',
description: 'Monitor partnership ecosystem for integration opportunities',
icon: '🤝',
default_config: {
sources: { approved_domains: [], news_topics: ['partnership', 'integration', 'api'] },
routing: {
persona_weights: {
founder: { strategic_alignment: 1.0, tech_signals: 0.5, market_intelligence: 0.8 },
developer: { strategic_alignment: 0.3, tech_signals: 1.0, market_intelligence: 0.4 },
},
},
workers_enabled: ['executive_brief', 'integration_planner'],
preferences: { digest_cadence: 'daily', delivery_mode: 'immediate' },
},
},
Creating a Workforce work item renderer
human workflow add-renderer signals --artifact-type signals.custom_draft
Appends to the workforce_module.work_item_renderers array in manifest.ts:
{
id: 'custom-draft-review',
label: 'Review Custom Draft',
artifact_kinds: ['signals.custom_draft'], // array — can handle multiple kinds
renderer_type: 'review_artifact',
actions: [
{ id: 'approve', label: 'Approve', effect: 'release_to_delivery' },
{ id: 'revise', label: 'Request revision', effect: 'create_revised_artifact' },
{ id: 'dismiss', label: 'Dismiss', effect: 'dismiss_with_feedback' },
],
feedback_emit: true,
},
And adds the matching ctx.approval.request() scaffold to artifact-workers.ts:
await ctx.approval.request({
installation_id: input.workforce_installation_id,
renderer_id: 'custom-draft-review', // must match renderer id above
artifact_id: artifactId,
urgency: 'medium',
metadata: { /* context for reviewer */ },
});
Testing your workflow locally
The agent SDK ships with a test context factory that gives you a MockExecutionContext implementing every ctx.* call without any live infrastructure. Import from @human/agent-sdk:
import { handler, type ExecutionContext } from '@human/agent-sdk';
import { describe, expect, it, vi } from 'vitest';
// For testing, construct a minimal mock context:
const mockCtx = {
llm: {
complete: vi.fn().mockResolvedValue({
content: 'Mock LLM response',
usage: { promptTokens: 100, completionTokens: 50, totalTokens: 150 },
cost: { usd: 0.0003 },
}),
},
events: { emit: vi.fn().mockResolvedValue(undefined) },
artifacts: {
create: vi.fn().mockResolvedValue({ artifact_id: 'art_test_123' }),
},
memory: {
persistent: {
get: vi.fn().mockResolvedValue(null),
set: vi.fn().mockResolvedValue(undefined),
},
},
call: {
agent: vi.fn().mockResolvedValue({ data: { /* mock output */ } }),
human: vi.fn().mockResolvedValue(undefined),
},
approval: { request: vi.fn().mockResolvedValue(undefined) },
escalate: vi.fn().mockResolvedValue(undefined),
log: {
info: vi.fn(),
warn: vi.fn(),
error: vi.fn(),
},
} as unknown as ExecutionContext;
describe('Signal Judge', () => {
it('escalates low-confidence signals to human review', async () => {
const lowConfidenceSignal = {
signal_id: 'sig_001',
confidence: 0.4, // below the 0.72 threshold
signal_type: 'competitor_launch',
entity_name: 'Competitor Corp',
urgency: 'high',
// ...
};
// execute is the agent's execute function; export it from the agent module so tests can import it.
await execute(mockCtx, {
signals: [lowConfidenceSignal],
org_did: 'did:org:test',
workflow_run_id: 'run_001',
});
// Verify escalation was triggered
expect(mockCtx.escalate).toHaveBeenCalledWith(
expect.objectContaining({
reason: expect.stringContaining('confidence'),
priority: expect.any(String),
})
);
// Verify feedback was emitted (mandatory)
expect(mockCtx.events.emit).toHaveBeenCalledWith(
'humanos.signals.feedback',
expect.objectContaining({ feedback_type: 'verdict_signal' })
);
});
});
No live API keys. No database. Runs in < 100ms per test. The mock pattern is consistent across all 8 agents — copy the mockCtx shape above and adjust vi.fn() return values for each test scenario.
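The copy-and-adjust step can itself be folded into a small factory. This sketch uses hand-rolled recording stubs so it runs without vitest; in real tests, replace stub() with vi.fn(). makeMockCtx and stub are assumed names, not SDK exports:

```typescript
// Hypothetical factory for the mock context above, using hand-rolled
// recording stubs. Each stub records its call arguments in .calls and
// resolves to a fixed value — the same shape vi.fn().mockResolvedValue gives.
type Stub = ((...args: unknown[]) => Promise<unknown>) & { calls: unknown[][] };

function stub(returnValue?: unknown): Stub {
  const calls: unknown[][] = [];
  const fn = (async (...args: unknown[]) => {
    calls.push(args);
    return returnValue;
  }) as Stub;
  fn.calls = calls;
  return fn;
}

export function makeMockCtx() {
  return {
    llm: {
      complete: stub({
        content: 'Mock LLM response',
        usage: { promptTokens: 100, completionTokens: 50, totalTokens: 150 },
        cost: { usd: 0.0003 },
      }),
    },
    events: { emit: stub() },
    artifacts: { create: stub({ artifact_id: 'art_test_123' }) },
    memory: { persistent: { get: stub(null), set: stub() } },
    call: { agent: stub({ data: {} }), human: stub() },
    approval: { request: stub() },
    escalate: stub(),
    log: { info: () => {}, warn: () => {}, error: () => {} },
  };
}

// Per-test override: spread the base and replace one stub, e.g.
// const ctx = { ...makeMockCtx(), artifacts: { create: stub({ artifact_id: 'art_x' }) } };
```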
The time math
| Task | Without CLI | With CLI |
|---|---|---|
| New workflow scaffold | 4 hours | 5 minutes |
| New agent | 45 minutes | 3 minutes |
| Install preset | 30 minutes | 2 minutes |
| Workforce renderer | 20 minutes | 2 minutes |
| Test harness setup | 60 minutes | Built in |
| Total for full workflow | ~8 hours | ~45 minutes |
At 10 workflows, that's 73 hours saved. The CLI pays for itself after the first feature.
What's coming next in the CLI
- human workflow test --live — runs a full workflow against a sandbox environment with mocked connectors
- human workflow publish — packages the workflow as a marketplace bundle and validates the manifest
- human workflow fork signals — forks the Signals reference workflow into your namespace with all references updated
- human agent swap signals --slot executive_brief --with my-custom-brief — swaps a worker binding in the manifest
Start building: npm install -g @human/cli → human workflow create → ship something worth monitoring.
Community: community.builtwithhuman.com
Signals reference workflow — Part 3 of 3