Agents Should Learn. But Not Like This.
The tension everyone feels
Founders want agents that improve. Security and compliance teams fear silent behavior change. Both are right.
The industry’s default is not malice — it is convenience: stash embeddings in a side table, tweak routing in application code, call it “learning,” and move on. That is how you ship usefulness without inspectability — the trap this series is written to avoid.
What “wrong” looks like
- Hidden memory: outcomes never become durable, auditable feedback tied to scope.
- Silent adaptation: preferences move without a proposal, approval, or rollback path.
- Policy by folklore: thresholds and gates live in constants and Slack threads instead of GET /v1/humanos/policy/effective (see the sketch after this list).
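As a counterpoint to policy-by-folklore, here is a minimal read-side sketch against that endpoint. Only the path comes from this post; the host, bearer-token auth, and untyped response shape are assumptions for illustration.

```typescript
// Read-side sketch: resolve thresholds and gates from the API instead of
// hard-coding them. Only the path is from this post; the base URL, auth
// scheme, and response shape are illustrative assumptions.
const BASE_URL = "https://humanos.example.com"; // assumption: your HumanOS host

async function getEffectivePolicy(): Promise<unknown> {
  const res = await fetch(`${BASE_URL}/v1/humanos/policy/effective`, {
    headers: { Authorization: `Bearer ${process.env.HUMANOS_TOKEN}` }, // assumed auth scheme
  });
  if (!res.ok) throw new Error(`policy fetch failed: ${res.status}`);
  return res.json(); // the deterministic merge happens server-side
}
```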
What “right” starts with
HUMΛN’s answer is not “don’t learn.” It is that learning and adaptation are first-class HumanOS capabilities: feedback is append-only, tuning is materialized with provenance, policy is merged deterministically, and preferences are kept distinct from policy.
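To make those four properties concrete, here is one way they could look as data shapes. Every type and field name below is an assumption for illustration; only the properties themselves come from this series.

```typescript
// Illustrative shapes only: all names and fields below are assumptions,
// chosen to make the four properties concrete.

// Feedback is append-only: events are recorded once, never edited in place.
interface FeedbackEvent {
  readonly id: string;
  readonly scope: string;      // which agent, tenant, or task the signal is tied to
  readonly outcome: "accepted" | "rejected" | "corrected";
  readonly recordedAt: string; // ISO 8601; immutability is what makes it auditable
}

// Tuning is materialized with provenance: each applied change points back to
// the evidence and approval that produced it, which is what enables rollback.
interface TuningRecord {
  readonly parameter: string;
  readonly value: number;
  readonly derivedFromEvents: readonly string[]; // FeedbackEvent ids (provenance)
  readonly approvedVia: string;                  // proposal id from the approval flow
}

// Preferences are distinct from policy: preferences may drift with feedback,
// while policy is merged deterministically and gates what agents may do.
interface Preference {
  key: string;
  value: string;
  source: "learned" | "user-set";
}
interface PolicyRule {
  key: string;
  value: string;
  precedence: number; // fixed merge order yields a deterministic effective policy
}
```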
You will see the same REST surface everywhere in this series — for example POST /v1/humanos/feedback/events and POST /v1/humanos/learning/proposals — because path accuracy is trust for developers who copy-paste.
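In that spirit, a hedged sketch of the write side, assuming Node 18+ with global fetch. The two paths are the ones named above; the request bodies, field names, and auth scheme are illustrative guesses, not the documented API.

```typescript
// Write-side sketch against the two paths named above. Bodies, field names,
// and auth are illustrative assumptions.
const BASE_URL = "https://humanos.example.com"; // assumption: your HumanOS host

async function post(path: string, body: unknown): Promise<unknown> {
  const res = await fetch(`${BASE_URL}${path}`, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${process.env.HUMANOS_TOKEN}`, // assumed auth scheme
    },
    body: JSON.stringify(body),
  });
  if (!res.ok) throw new Error(`POST ${path} failed: ${res.status}`);
  return res.json();
}

// 1. Record an outcome as durable, scoped, append-only feedback.
await post("/v1/humanos/feedback/events", {
  scope: "agent:support-triage", // hypothetical scope value
  outcome: "corrected",
  recordedAt: new Date().toISOString(),
});

// 2. Propose a tuning change instead of silently adapting: it takes effect
//    only after approval, and the proposal gives you a rollback handle.
await post("/v1/humanos/learning/proposals", {
  parameter: "routing.confidence_threshold", // hypothetical parameter
  proposedValue: 0.85,
  derivedFromEvents: ["evt_123"],            // provenance back to feedback events
});
```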
Next in the series
Part 2, “Memory, feedback, adaptation, capability,” names the four ideas people collapse into “learning.”
Related (HumanOS primitives): Learning, tuning, and rollback · Policy threshold and config