Source: wiki synthesis: Loop Engineering (Cobus Greyling), Loop Engineering (Addy Osmani’s Essay), Reflecting on a Year of Claude Code (Cherny & Wu), Dynamic Workflows, 12-Factor Agents, Clawd Chief, The Verification Frontier
Across 2026 the wiki kept documenting the same shift from two unconnected directions. From inside Anthropic, Boris Cherny describes a career that went source code → agent → loop: “I don’t talk to an agent anymore. I talk to a loop… and it prompts Claude for me.” From the tool-agnostic OSS world, Cobus Greyling and Addy Osmani name the identical move — “replacing yourself as the person who prompts the agent; design the system that does it instead” — and ship it as a pattern catalog with primitives, a readiness ladder, and a failure-mode index. This article connects those threads: the loop has become the unit of agentic work, the leverage point migrated prompt → harness → loop, and the single primitive that turns a loop from reckless to shippable is the maker/checker verifier — which is the verification frontier thesis applied one level down.
Key Takeaways
- One pattern, two vocabularies. Claude-Code-native practice (/loop, /goal, Routines, Dynamic Workflows) and tool-agnostic frameworks (loop-engineering, 12-Factor Agents, Clawd Chief’s “agents are cron jobs and markdown files”) are describing the same control system. loop-engineering is the explicit naming and generalization of what Anthropic’s team already does: its own sources.md cites Cherny and Osmani as the seed.
- The leverage ladder is prompt → harness → loop. Prompt engineering tunes one turn; harness engineering sets up one session; loop engineering schedules + verifies + persists across many sessions. Each rung is a strictly larger unit of leverage.
- There is a shared loop anatomy. Every source assembles the same parts: a scheduler, a triage skill, durable state, isolated worktrees, a maker/checker split, connectors, and a human gate. The vocabulary differs; the wiring doesn’t.
- Verification is the gate that unlocks autonomy. loop-engineering’s L0→L3 readiness ladder is the same trust gradient the verification frontier describes: “no verification → no autonomy.” Cherny’s team earned trust in auto-mode the same way loop-engineering demands a verifier: by turning red-team attacks into evals.
- The failure modes are now catalogued. What used to be folklore (infinite fix loops, verifier theater, token burn, notification fatigue, cognitive surrender) is a documented S1/S2/S3 catalog you can design against — the operational complement to the verification thesis.
The leverage ladder: prompt → harness → loop
The connecting spine is a migration of what an engineer interacts with, and both clusters draw the same three rungs:
| Rung | Unit | Claude-Code framing | Tool-agnostic framing |
|---|---|---|---|
| Prompt | one turn | the message you type | (the thing loop engineering replaces) |
| Harness | one session | minimal system prompt + tools; “be a context minimalist” | Osmani’s agent harness engineering — the single-session sandbox |
| Loop | many sessions over time | “talk to a loop / routine, it prompts Claude” (Cherny’s 2nd leap) | loop-engineering: schedule + state + verification chain |
Cherny frames it as two leaps — source code → agent, then agent → loop. Osmani’s vocabulary makes the harness/loop boundary explicit: Harness = single session setup; Loop = harness + schedule + state + verification. Same ladder, two namings. Karpathy’s vibe-coding → agentic-engineering is the third independent statement of the same progression.
Two namings of one pattern
The payoff of connecting these articles is seeing that the Claude-native and tool-agnostic vocabularies are a one-to-one map:
| The move | Claude-Code-native (claude-ai) | Tool-agnostic (agents-agentic-systems) |
|---|---|---|
| Self-prompting heartbeat | scheduled tasks / Routines / /loop | Automations / Scheduling primitive |
| Encode intent once, read every run | “write the fix to CLAUDE.md or a skill” so Claude “runs forever” | Skills primitive (pays down intent debt) |
| Safe parallelism | desktop app auto work-tree cloning | Worktrees (isolation: worktree) |
| Don’t grade your own homework | auto-mode classifier; “can the agent run the thing?” | Sub-agents maker/checker split; verifier on a stronger model |
| Durable spine outside the chat | repo state / issue trackers | Memory / State primitive (STATE.md) |
| Reach real tools | MCP connectors | Plugins & Connectors (MCP) primitive |
| Orchestrate many at once | Dynamic Workflows (hundreds–thousands of agents) | the seven patterns + multi-loop coordination |
Ryan Carson’s Clawd Chief, the “agents are cron jobs and markdown files” twin of Cherny’s loop, sits exactly in the middle of this table: two load-bearing markdown files (priority-map, auto-resolver) plus a 15-minute schedule make a loop with the scheduling, skills, and state primitives and a deferred verifier.
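The primitives in the table can be wired into a single scheduled run. The following is a minimal Python sketch under stated assumptions, not any source's implementation: a JSON file stands in for STATE.md, and `run_tick`, `Outcome`, `MAX_ATTEMPTS`, and the stand-in maker and checker in `demo` are hypothetical names chosen for illustration. A real loop would call an agent in an isolated worktree where the stubs are.

```python
import json
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Callable

MAX_ATTEMPTS = 3  # hypothetical cap; guards against an infinite fix loop


@dataclass
class Outcome:
    task: str
    status: str    # "done" or "escalated"
    attempts: int


def run_tick(state_path: Path,
             maker: Callable[[str, int], str],
             checker: Callable[[str, str], bool]) -> list[Outcome]:
    """One scheduled run: read durable state, work each open task, write state back."""
    if state_path.exists():
        state = json.loads(state_path.read_text())
    else:
        state = {"open": [], "log": []}

    outcomes = []
    for task in state["open"]:
        status = "escalated"  # default: hand the task to the human gate
        for attempts in range(1, MAX_ATTEMPTS + 1):
            draft = maker(task, attempts)
            # The checker is a separate callable: the maker never grades itself.
            if checker(task, draft):
                status = "done"
                break
        outcomes.append(Outcome(task, status, attempts))

    # Escalated tasks stay open so the next run (or a human) sees them.
    state["open"] = [o.task for o in outcomes if o.status == "escalated"]
    state["log"].extend(f"{o.task}: {o.status} after {o.attempts}" for o in outcomes)
    state_path.write_text(json.dumps(state, indent=2))
    return outcomes


def demo() -> list[Outcome]:
    """Run one tick with toy stand-ins for the maker and checker."""
    def maker(task: str, attempt: int) -> str:
        return f"{task} v{attempt}"

    def checker(task: str, draft: str) -> bool:
        # Toy rule: only the second draft of one task passes review.
        return task == "fix flaky test" and draft.endswith("v2")

    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "state.json"
        path.write_text(json.dumps({"open": ["fix flaky test", "rewrite auth"], "log": []}))
        return run_tick(path, maker, checker)
```

In this toy run the first task passes on its second attempt and the second task exhausts its attempts and is escalated. The scheduler itself (cron, a Routine, a GitHub Action) is deliberately left outside the sketch: it only needs to call `run_tick` on a cadence.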
Verification is what makes a loop shippable
This is where the connection closes back onto the verification frontier. That article’s law — AI loops compound where verification is cheap and stall where it’s expensive — is the theory; loop engineering is the operating manual for the cheap-verification case:
- The readiness ladder is a verification gradient. L1 report-only needs no verifier (it doesn’t act). L2 assisted requires a separate verifier + worktree + max-attempts. L3 unattended requires proven verification and cost observability. You climb the ladder exactly as fast as your verification surface grows — Jeremy’s “no verification → no autonomy” stated as a checklist.
- Maker/checker is cheap verification at the loop level. loop-engineering’s “the implementer must never grade its own homework” is the same adversarial-verification pattern Dynamic Workflows bakes into its orchestrator (a separate agent checks each finding) and that AutoAgent reduces to a reward file. “Verifier Theater” (verifier approves, CI fails) is the named failure when this verification is fake.
- Anthropic earned auto-mode trust the loop-engineering way. Cherny’s team “collected thousands of agent trajectories, brought in red-teamers, turned attacks into evals, iterated until all were denied.” That is building the verifier before granting autonomy — the same discipline loop-engineering’s checklist enforces, at frontier scale.
The corollary: as generation gets cheap, the loop’s bottleneck moves to review (Amdahl’s law), which is why loop-engineering treats comprehension debt and cognitive surrender as first-class S2 failures — the human-judgment frontier the verification thesis says stays expensive to verify.
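The Amdahl's-law point can be made concrete with a little arithmetic. This is an illustrative sketch only: the 60/40 split between generation and review time is an assumed number, not a measurement from any of the sources, and `overall_speedup` is a hypothetical helper name.

```python
def overall_speedup(generation_share: float, generation_speedup: float) -> float:
    """Amdahl's law: speedup = 1 / ((1 - p) + p / s).

    p is the share of cycle time spent generating; s is how much faster
    generation becomes. The review share (1 - p) is assumed not to speed up.
    """
    if not 0.0 <= generation_share <= 1.0:
        raise ValueError("generation_share must be between 0 and 1")
    if generation_speedup <= 0:
        raise ValueError("generation_speedup must be positive")
    return 1.0 / ((1.0 - generation_share) + generation_share / generation_speedup)


# Assumed split: 60% of a change's cycle time is generation, 40% is review.
for s in (2, 10, 1000):
    print(f"generation {s}x faster -> whole cycle {overall_speedup(0.6, s):.2f}x faster")
```

Under that assumed split, no amount of generation speedup can push the whole cycle past 1 / 0.4 = 2.5x, which is why the remaining review time becomes the thing worth engineering.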
A related sub-pattern: loops that improve themselves
One branch of the agentic-systems cluster runs the loop on the harness itself: Browserbase Autobrowse, Reflexio, and AutoAgent all hill-climb on a cheap criterion, then graduate a successful run into a reusable SKILL.md/playbook. That’s the Skills primitive being written by the loop instead of by you — the same converge-then-graduate ratchet Karpathy’s AutoResearch walkthrough describes, and the mechanism by which a loop pays down its own intent debt over time.
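The converge-then-graduate ratchet can be sketched in a few lines. This is a toy illustration, not how Autobrowse, Reflexio, or AutoAgent are implemented: the step pool, the scoring rule, and the names `score`, `propose`, `hill_climb`, and `graduate` are all assumptions made for the example, and the "skill" is returned as text rather than written to a real SKILL.md.

```python
from typing import Callable, Optional

# Hypothetical task: the steps a successful browsing run must contain.
REQUIRED = {"open page", "accept cookies", "search", "extract table"}
POOL = ["open page", "scroll", "accept cookies", "search", "hover", "extract table"]


def score(playbook: list[str]) -> float:
    """Cheap criterion: reward required steps, penalize unnecessary ones."""
    steps = set(playbook)
    return len(steps & REQUIRED) - 0.5 * len(steps - REQUIRED)


def propose(best: list[str], i: int) -> list[str]:
    """Deterministic proposal: try appending the i-th step from the pool."""
    step = POOL[i % len(POOL)]
    return best if step in best else best + [step]


def hill_climb(score_fn: Callable[[list[str]], float],
               propose_fn: Callable[[list[str], int], list[str]],
               start: list[str], steps: int) -> tuple[list[str], float]:
    """Keep a candidate only when it strictly improves the score."""
    best, best_score = start, score_fn(start)
    for i in range(steps):
        candidate = propose_fn(best, i)
        candidate_score = score_fn(candidate)
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score
    return best, best_score


def graduate(name: str, playbook: list[str], best_score: float,
             threshold: float) -> Optional[str]:
    """Render the converged run as skill text, or None if it did not converge."""
    if best_score < threshold:
        return None
    lines = [f"# {name}", ""] + [f"{n}. {step}" for n, step in enumerate(playbook, 1)]
    return "\n".join(lines)
```

The design choice worth noticing is the threshold in `graduate`: a run is only promoted to a reusable playbook once the cheap criterion says it converged, so the loop writes skills from successes rather than from every attempt.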
What combining them enables
- A shared checklist for any loop, on any tool. loop-engineering’s design checklist and failure catalog are tool-agnostic, so they apply equally to a Claude Code /loop, a Codex Automation, or a GitHub Action: a portable rubric the Claude-native docs never wrote down.
- A cost lens the AIOS pattern lacks. The AIOS pattern gets you context + skills + cadence; loop engineering adds the operating layer the AIOS leaves implicit: token budgets, cadence-as-multiplier math, kill switches, run logs.
- A vocabulary that survives tool switches. Because the primitives map cleanly across Grok / Claude Code / Codex / Actions, a team can move runtimes without re-learning the pattern — the same portability bet 12-Factor Agents makes for LLM apps generally.
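The cadence-as-multiplier math and the kill switch are simple enough to sketch. The 15-minute interval echoes the Clawd Chief schedule mentioned earlier; the per-run token count, the daily cap, and the `LoopBudget` class are assumptions for illustration, not figures or APIs from any source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoopBudget:
    interval_minutes: int   # how often the scheduler fires
    tokens_per_run: int     # assumed average cost of one run
    daily_token_cap: int    # hypothetical budget ceiling

    def __post_init__(self) -> None:
        if self.interval_minutes <= 0:
            raise ValueError("interval_minutes must be positive")

    @property
    def runs_per_day(self) -> int:
        return (24 * 60) // self.interval_minutes

    @property
    def tokens_per_day(self) -> int:
        # Cadence is the multiplier: a small per-run cost times many runs.
        return self.runs_per_day * self.tokens_per_run

    def should_kill(self, tokens_spent_today: int) -> bool:
        """Kill switch: refuse the next run if it would exceed the daily cap."""
        return tokens_spent_today + self.tokens_per_run > self.daily_token_cap


budget = LoopBudget(interval_minutes=15, tokens_per_run=20_000, daily_token_cap=1_000_000)
print(budget.runs_per_day, budget.tokens_per_day)
```

With these assumed numbers the loop fires 96 times a day and would spend 1,920,000 tokens uncapped, nearly double the cap, so the kill switch would trip partway through the day. Checking before each run, rather than after, keeps the overshoot at zero.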
Try It
- Place every automation on the readiness ladder. For each /loop, Routine, or scheduled agent you run, label it L1/L2/L3 and check it has the verification its rung requires. Anything acting at L2+ without a separate verifier is a Verifier-Theater risk; fix that first.
- Run the failure catalog as a pre-mortem. Before enabling a new loop, walk the S1/S2/S3 list (infinite fix loop, token burn, notification fatigue, over-reach, escalation failure) and name the mitigation for each. This is the cheapest insurance loop engineering offers.
- Invest in the verifier, not the generator. Per the verification frontier: the biggest unlock on a stuck loop is usually a cheaper verification surface — a test suite, an eval, an adversarial-review subagent — not a better prompt.
- Encode the fix, not the correction. Adopt Cherny’s rule across your loops: when the agent errs, write the lesson into CLAUDE.md or a skill (the Skills primitive) so the loop stops re-making it. That is the difference between a loop that drifts and one that compounds.
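The first item, placing every automation on the readiness ladder, can be run as a small audit. This sketch encodes the rung requirements as the article states them (L1 needs no verifier; L2 needs a separate verifier, a worktree, and max-attempts; L3 needs proven verification and cost observability). Treating L3 as cumulative over L2 is an assumption, and the safeguard keys, `Automation`, and `audit` are hypothetical names.

```python
from dataclasses import dataclass

L2_SAFEGUARDS = {"separate_verifier", "worktree", "max_attempts"}

# Assumption: each rung inherits the safeguards of the rung below it.
REQUIREMENTS = {
    "L1": set(),
    "L2": L2_SAFEGUARDS,
    "L3": L2_SAFEGUARDS | {"proven_verification", "cost_observability"},
}


@dataclass(frozen=True)
class Automation:
    name: str
    rung: str                  # "L1", "L2", or "L3"
    safeguards: frozenset      # safeguard keys this automation actually has


def missing_safeguards(automation: Automation) -> set:
    """Safeguards the rung requires that the automation does not have."""
    return REQUIREMENTS[automation.rung] - automation.safeguards


def audit(automations: list) -> dict:
    """Map each under-protected automation to its sorted list of gaps."""
    report = {}
    for automation in automations:
        gaps = missing_safeguards(automation)
        if gaps:
            report[automation.name] = sorted(gaps)
    return report


fleet = [
    Automation("nightly-triage", "L1", frozenset()),
    Automation("pr-fixer", "L2", frozenset({"worktree"})),
]
print(audit(fleet))
```

Here the report-only loop passes and the L2 loop is flagged for lacking a separate verifier and an attempt cap, which is exactly the Verifier-Theater risk the first item says to fix first.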
Related
- Loop Engineering — Addy Osmani’s Essay — the canonical primary essay this synthesis names as a seed; “replace yourself as the prompter,” five primitives across Claude Code + Codex.
- Loop Engineering (Cobus Greyling) — the tool-agnostic pattern catalog at the center of this synthesis: six primitives, seven patterns, readiness ladder, failure modes.
- Verifier-First Loops — the maker/checker verifier discipline this article calls the shippability gate, stated as a pre-flight checklist.
- Should You Build a Loop? — the cost/security operating layer (four-condition test, token math, security tax) that complements the readiness ladder.
- Reflecting on a Year of Claude Code — Cherny’s “source → agent → loop” two-leaps framing and the first-party “my job is to write loops.”
- The Verification Frontier — the law beneath the readiness ladder: loops compound only where verification is cheap.
- Dynamic Workflows — the Claude-native orchestrate-many-agents mechanism with adversarial verification built in.
- 12-Factor Agents — “own your control flow / context / prompts”; the same discipline for LLM apps, portable across tools.
- Clawd Chief — “agents are cron jobs and markdown files”; the loop pattern at solo-founder scale.
- The 2026 Claude Code AIOS Pattern — the surrounding OS (context + skills + cadence); loops are its orchestration layer.
- From Vibe Coding to Agentic Engineering — Karpathy’s independent statement of the same source → agent → loop progression.
- Agent Loops (topic): the hands-on learning path for this material; the Write Loops, Not Prompts explainer traces the ReAct → AutoGPT → Ralph Loop → /goal lineage behind the prompt → harness → loop shift.
Open Questions
- Independent convergence or shared lineage? loop-engineering explicitly cites Cherny and Osmani, so the OSS naming is partly downstream of Anthropic practice rather than a fully independent discovery. Clawd Chief and 12-Factor look more parallel. The exact dependency graph is fuzzy.^[inferred]
- Where do /loop, /goal, and “routine” precisely differ? The first-party sources use the terms loosely; the crisp primitive boundaries live in scheduled tasks + Routines, not in the loop-engineering catalog.
- Does the loop become the unit at team scale? Every source here is single-operator or single-repo. Whether “the team talks to loops” holds for multi-operator orgs is the same open question the AIOS pattern flags.