Source: wiki synthesis: Loop Engineering (Cobus Greyling), Loop Engineering (Addy Osmani’s Essay), Reflecting on a Year of Claude Code (Cherny & Wu), Dynamic Workflows, 12-Factor Agents, Clawd Chief, The Verification Frontier

Across 2026 the wiki kept documenting the same shift from two unconnected directions. From inside Anthropic, Boris Cherny describes a career that went source code → agent → loop: “I don’t talk to an agent anymore. I talk to a loop… and it prompts Claude for me.” From the tool-agnostic OSS world, Cobus Greyling and Addy Osmani name the identical move — “replacing yourself as the person who prompts the agent; design the system that does it instead” — and ship it as a pattern catalog with primitives, a readiness ladder, and a failure-mode index. This article connects those threads: the loop has become the unit of agentic work, the leverage point migrated prompt → harness → loop, and the single primitive that turns a loop from reckless to shippable is the maker/checker verifier — which is the verification frontier thesis applied one level down.

Key Takeaways

  • One pattern, two vocabularies. Claude-Code-native practice (/loop, /goal, Routines, Dynamic Workflows) and tool-agnostic frameworks (loop-engineering, 12-Factor Agents, Clawd Chief’s “agents are cron jobs and markdown files”) are describing the same control system. loop-engineering is the explicit naming/generalization of what Anthropic’s team already does — its own sources.md cites Cherny and Osmani as the seed.
  • The leverage ladder is prompt → harness → loop. Prompt engineering tunes one turn; harness engineering sets up one session; loop engineering schedules + verifies + persists across many sessions. Each rung is a strictly larger unit of leverage.
  • There is a shared loop anatomy. Every source assembles the same parts: a scheduler, a triage skill, durable state, isolated worktrees, a maker/checker split, connectors, and a human gate. The vocabulary differs; the wiring doesn’t.
  • Verification is the gate that unlocks autonomy. loop-engineering’s L0→L3 readiness ladder is the same trust gradient the verification frontier describes: “no verification → no autonomy.” Boris earned trust in auto-mode the same way loop-engineering demands a verifier — by turning red-team attacks into evals.
  • The failure modes are now catalogued. What used to be folklore (infinite fix loops, verifier theater, token burn, notification fatigue, cognitive surrender) is a documented S1/S2/S3 catalog you can design against — the operational complement to the verification thesis.

The leverage ladder: prompt → harness → loop

The connecting spine is a migration of what an engineer interacts with, and both clusters draw the same three rungs:

| Rung | Unit | Claude-Code framing | Tool-agnostic framing |
| --- | --- | --- | --- |
| Prompt | one turn | the message you type | (the thing loop engineering replaces) |
| Harness | one session | minimal system prompt + tools; “be a context minimalist” | Osmani’s agent harness engineering: the single-session sandbox |
| Loop | many sessions over time | “talk to a loop / routine, it prompts Claude” (Cherny’s 2nd leap) | loop-engineering: schedule + state + verification chain |

Cherny frames it as two leaps: source code → agent, then agent → loop. Osmani’s vocabulary makes the harness/loop boundary explicit: Harness = single session setup; Loop = harness + schedule + state + verification. Same ladder, two namings. Karpathy’s vibe-coding → agentic-engineering is the third independent statement of the same progression.
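
Osmani’s equation (Loop = harness + schedule + state + verification) can be sketched as a small data structure. This is a minimal illustration, not any tool’s actual API: `Loop`, `tick`, `stub_harness`, and the state keys are hypothetical names, and the harness is a stub standing in for a real agent session.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Loop:
    """Loop = harness + schedule + state + verification (all names hypothetical)."""
    harness: Callable[[str, dict], str]        # one session: (prompt, state) -> output
    verify: Callable[[str], bool]              # separate checker for the session output
    interval_minutes: int                      # the schedule; a real scheduler would honor it
    state: dict = field(default_factory=dict)  # durable state carried across sessions

    def tick(self, prompt: str) -> bool:
        """Run one scheduled session and persist the output only if it verifies."""
        output = self.harness(prompt, self.state)
        accepted = self.verify(output)
        self.state["runs"] = self.state.get("runs", 0) + 1
        if accepted:
            self.state["last_accepted"] = output
        return accepted


def stub_harness(prompt: str, state: dict) -> str:
    # Stand-in for a single agent session; a real harness would call a model.
    return f"{prompt} (run {state.get('runs', 0) + 1})"


loop = Loop(harness=stub_harness, verify=lambda out: "triage" in out, interval_minutes=15)
results = [loop.tick("triage open issues") for _ in range(3)]
```

The point of the sketch is the boundary: the harness knows nothing about cadence or history, while the loop owns the schedule, the state, and the decision to accept a run.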

Two namings of one pattern

The payoff of connecting these articles is seeing that the Claude-native and tool-agnostic vocabularies are a one-to-one map:

| The move | Claude-Code-native (claude-ai) | Tool-agnostic (agents-agentic-systems) |
| --- | --- | --- |
| Self-prompting heartbeat | scheduled tasks / Routines / /loop | Automations / Scheduling primitive |
| Encode intent once, read every run | “write the fix to CLAUDE.md or a skill” so Claude “runs forever” | Skills primitive (pays down intent debt) |
| Safe parallelism | desktop app auto work-tree cloning | Worktrees (isolation: worktree) |
| Don’t grade your own homework | auto-mode classifier; “can the agent run the thing?” | Sub-agents maker/checker split; verifier on a stronger model |
| Durable spine outside the chat | repo state / issue trackers | Memory / State primitive (STATE.md) |
| Reach real tools | MCP connectors | Plugins & Connectors (MCP) primitive |
| Orchestrate many at once | Dynamic Workflows (hundreds–thousands of agents) | the seven patterns + multi-loop coordination |

Ryan Carson’s Clawd Chief, the “agents are cron jobs and markdown files” twin of Cherny’s practice, sits exactly in the middle of this table: two load-bearing markdown files (priority-map, auto-resolver) plus a 15-minute schedule form a loop with the scheduling, skills, and state primitives and a deferred verifier.

Verification is what makes a loop shippable

This is where the connection closes back onto the verification frontier. That article’s law — AI loops compound where verification is cheap and stall where it’s expensive — is the theory; loop engineering is the operating manual for the cheap-verification case:

  • The readiness ladder is a verification gradient. L1 report-only needs no verifier (it doesn’t act). L2 assisted requires a separate verifier + worktree + max-attempts. L3 unattended requires proven verification and cost observability. You climb the ladder exactly as fast as your verification surface grows — Jeremy’s “no verification → no autonomy” stated as a checklist.
  • Maker/checker is cheap verification at the loop level. loop-engineering’s “the implementer must never grade its own homework” is the same adversarial-verification pattern Dynamic Workflows bakes into its orchestrator (a separate agent checks each finding) and that AutoAgent reduces to a reward file. “Verifier Theater” (verifier approves, CI fails) is the named failure when this verification is fake.
  • Anthropic earned auto-mode trust the loop-engineering way. Cherny’s team “collected thousands of agent trajectories, brought in red-teamers, turned attacks into evals, iterated until all were denied.” That is building the verifier before granting autonomy — the same discipline loop-engineering’s checklist enforces, at frontier scale.
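
The maker/checker split with a max-attempts cap can be sketched in a few lines. This is a hedged illustration under stated assumptions: `maker_checker`, `stub_make`, and `stub_check` are hypothetical names, the maker and checker are stubs rather than model calls, and worktree isolation is left out for brevity.

```python
from typing import Callable, Optional


def maker_checker(
    make: Callable[[str, Optional[str]], str],
    check: Callable[[str], Optional[str]],
    task: str,
    max_attempts: int = 3,
) -> dict:
    """Alternate maker and checker until the checker approves or attempts run out.

    `check` returns None to approve, or a feedback string to reject.
    """
    feedback: Optional[str] = None
    for attempt in range(1, max_attempts + 1):
        candidate = make(task, feedback)  # the maker never sees its own grade as final
        feedback = check(candidate)       # a separate checker decides
        if feedback is None:
            return {"status": "approved", "attempts": attempt, "result": candidate}
    # Cap reached: hand off to the human gate instead of looping forever.
    return {"status": "escalated", "attempts": max_attempts, "feedback": feedback}


def stub_make(task: str, feedback: Optional[str]) -> str:
    # Stand-in maker: addresses feedback only once it has been given.
    return task + " with tests" if feedback else task


def stub_check(candidate: str) -> Optional[str]:
    # Stand-in checker: rejects anything that ships without tests.
    return None if "with tests" in candidate else "add tests"


outcome = maker_checker(stub_make, stub_check, "fix flaky login")
stuck = maker_checker(lambda t, f: t, stub_check, "fix flaky login", max_attempts=2)
```

The cap is what separates a bounded retry from an infinite fix loop: a maker that ignores feedback ends in `escalated` rather than burning tokens indefinitely.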

The corollary: as generation gets cheap, the loop’s bottleneck moves to review (Amdahl’s law), which is why loop-engineering treats comprehension debt and cognitive surrender as first-class S2 failures — the human-judgment frontier the verification thesis says stays expensive to verify.

One branch of the agentic-systems cluster runs the loop on the harness itself: Browserbase Autobrowse, Reflexio, and AutoAgent all hill-climb on a cheap criterion, then graduate a successful run into a reusable SKILL.md/playbook. That’s the Skills primitive being written by the loop instead of by you — the same converge-then-graduate ratchet Karpathy’s AutoResearch walkthrough describes, and the mechanism by which a loop pays down its own intent debt over time.
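
The converge-then-graduate ratchet can be illustrated with a toy example. This is not how Autobrowse, Reflexio, or AutoAgent are implemented; `converge`, `graduate`, `coverage`, and the keyword-based criterion are all hypothetical stand-ins for a cheap scoring signal.

```python
from typing import Callable, Iterable, Optional, Tuple


def converge(candidates: Iterable[str], score: Callable[[str], float]) -> Tuple[Optional[str], float]:
    """Ratchet: keep a candidate only when it beats the best score so far."""
    best, best_score = None, float("-inf")
    for candidate in candidates:
        s = score(candidate)
        if s > best_score:
            best, best_score = candidate, s
    return best, best_score


def graduate(name: str, playbook: Optional[str], best_score: float, threshold: float) -> Optional[str]:
    """Return SKILL.md-style text once the run clears the threshold, else None."""
    if playbook is None or best_score < threshold:
        return None
    return f"# {name}\n\n{playbook}\n"


# Hypothetical cheap criterion: fraction of required steps the playbook mentions.
REQUIRED_STEPS = ("open page", "wait for table", "export csv")


def coverage(playbook: str) -> float:
    return sum(step in playbook for step in REQUIRED_STEPS) / len(REQUIRED_STEPS)


attempts = [
    "open page",
    "open page; export csv",
    "open page; wait for table; export csv",
    "open page; wait for table",  # a regression: the ratchet ignores it
]
best, best_score = converge(attempts, coverage)
skill = graduate("export-report", best, best_score, threshold=1.0)
```

Only a run that clears the threshold is written out as a skill, so later regressions never overwrite the graduated playbook.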

What combining them enables

  • A shared checklist for any loop, on any tool. loop-engineering’s design checklist + failure catalog are tool-agnostic, so they apply equally to a Claude Code /loop, a Codex Automation, or a GitHub Action — a portable rubric the Claude-native docs never wrote down.
  • A cost lens the AIOS pattern lacks. The AIOS pattern gets you context + skills + cadence; loop engineering adds the operating layer the AIOS leaves implicit — token budgets, cadence-as-multiplier math, kill switches, run logs.
  • A vocabulary that survives tool switches. Because the primitives map cleanly across Grok / Claude Code / Codex / Actions, a team can move runtimes without re-learning the pattern — the same portability bet 12-Factor Agents makes for LLM apps generally.
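
The cadence-as-multiplier math mentioned above is simple enough to write down. The sketch below assumes a flat per-token price and a fixed token count per run; `LoopBudget`, its field names, and every number are illustrative assumptions, not figures from any source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoopBudget:
    """Back-of-envelope cost model for a scheduled loop (all values hypothetical)."""
    interval_minutes: int
    tokens_per_run: int
    usd_per_million_tokens: float
    daily_cap_usd: float

    @property
    def runs_per_day(self) -> float:
        # Cadence is the multiplier: halving the interval doubles the runs.
        return 24 * 60 / self.interval_minutes

    @property
    def daily_cost_usd(self) -> float:
        return self.runs_per_day * self.tokens_per_run * self.usd_per_million_tokens / 1_000_000

    def over_budget(self) -> bool:
        """Kill-switch condition: projected daily spend exceeds the cap."""
        return self.daily_cost_usd > self.daily_cap_usd


every_15_min = LoopBudget(interval_minutes=15, tokens_per_run=50_000,
                          usd_per_million_tokens=10.0, daily_cap_usd=25.0)
hourly = LoopBudget(interval_minutes=60, tokens_per_run=50_000,
                    usd_per_million_tokens=10.0, daily_cap_usd=25.0)
```

With these made-up numbers, the same loop costs four times as much at a 15-minute cadence as at an hourly one, which is why the schedule belongs in the budget review and not only in the cron line.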

Try It

  1. Place every automation on the readiness ladder. For each /loop, Routine, or scheduled agent you run, label it L1/L2/L3 and check it has the verification its rung requires. Anything acting at L2+ without a separate verifier is a Verifier-Theater risk — fix that first.
  2. Run the failure catalog as a pre-mortem. Before enabling a new loop, walk the S1/S2/S3 list (infinite fix loop, token burn, notification fatigue, over-reach, escalation failure) and name the mitigation for each. This is the cheapest insurance loop engineering offers.
  3. Invest in the verifier, not the generator. Per the verification frontier: the biggest unlock on a stuck loop is usually a cheaper verification surface — a test suite, an eval, an adversarial-review subagent — not a better prompt.
  4. Encode the fix, not the correction. Adopt Cherny’s rule across your loops: when the agent errs, write the lesson into CLAUDE.md or a skill (the Skills primitive) so the loop stops re-making it — the difference between a loop that drifts and one that compounds.
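
Step 1 can be run as a small audit script. The sketch below encodes the rung requirements described earlier (L2: separate verifier, worktree, max-attempts; L3: proven verification and cost observability) and assumes, as an interpretation, that L3 also inherits the L2 safeguards. The safeguard identifiers and automation names are hypothetical.

```python
from typing import Dict, List, Tuple

# Safeguards each rung requires. Treating L3 as cumulative over L2 is an assumption.
REQUIRED: Dict[str, set] = {
    "L1": set(),  # report-only: it does not act, so no verifier is needed
    "L2": {"separate_verifier", "worktree", "max_attempts"},
    "L3": {"separate_verifier", "worktree", "max_attempts",
           "proven_verification", "cost_observability"},
}


def audit(automations: Dict[str, Tuple[str, List[str]]]) -> Dict[str, List[str]]:
    """Map each under-protected automation to the sorted safeguards it is missing."""
    gaps = {}
    for name, (rung, safeguards) in automations.items():
        missing = REQUIRED[rung] - set(safeguards)
        if missing:
            gaps[name] = sorted(missing)
    return gaps


automations = {
    "nightly-report": ("L1", []),
    "dep-bump": ("L2", ["worktree", "max_attempts"]),
    "auto-triage": ("L3", ["separate_verifier", "worktree",
                           "max_attempts", "proven_verification"]),
}
gaps = audit(automations)
```

Anything that shows up in `gaps` is acting above the rung its verification supports; the first fix is either to add the missing safeguard or to demote the loop.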

Open Questions

  • Independent convergence or shared lineage? loop-engineering explicitly cites Cherny and Osmani, so the OSS naming is partly downstream of Anthropic practice rather than a fully independent discovery. Clawd Chief and 12-Factor look more parallel. The exact dependency graph is fuzzy.^[inferred]
  • Where do /loop, /goal, and “routine” precisely differ? The first-party sources use the terms loosely; the crisp primitive boundaries live in scheduled tasks + Routines, not in the loop-engineering catalog.
  • Does the loop become the unit at team scale? Every source here is single-operator or single-repo. Whether “the team talks to loops” holds for multi-operator orgs is the same open question the AIOS pattern flags.