Agent loops are small programs (or slash-commands, workflows, schedules) that prompt an agent for you, read what it produced, decide whether it’s done, and reprompt until it is. This topic is a hands-on learning path for understanding and safely building them: the lineage, the mental model, the cost controls that keep them from burning your budget, the verification discipline that makes them trustworthy, and concrete loops you can run this week.

The leverage claim that started the topic (Boris Cherny, creator of Claude Code): “I don’t prompt Claude anymore. I have loops running that prompt Claude… My job is to write loops.” This topic is where the wiki learns that craft.

Start Here — a learning path

  1. Mental model + lineage: Write Loops, Not Prompts (explainer). What a loop is, how we got here (ReAct → AutoGPT → Ralph Loop → /goal → agent loops), the 3 cost controls, and 3 beginner starter loops. Begin here.
  2. The thesis (primary source): Loop Engineering — Addy Osmani’s Essay. The canonical essay that named the practice: “replace yourself as the prompter,” the five building blocks, and the argument that they now ship inside both Claude Code and Codex.
  3. The reference catalog: Loop Engineering (Cobus Greyling). Six primitives, seven production patterns, the L0→L3 readiness ladder, and a failure-mode catalog you can design against. The deeper “how to build it right” layer (built on the essay above).
  4. The verification discipline: Verifier-First Loops. Write the verifier before you launch the loop; proof outside the agent (tests, screenshots, artifacts). The topic’s load-bearing open question, answered.
  5. The decision + economics: Should You Build a Loop?. The four-condition test, the token-cost math, and the security tax, including when not to build one.
  6. The synthesis: The Loop Is the Unit of Work. Why the leverage point moved prompt → harness → loop, and how the Claude-native and tool-agnostic vocabularies are one pattern.
  7. The first-party source: Reflecting on a Year of Claude Code. Boris Cherny & Cat Wu on “talk to a loop, not an agent,” /babysit, and auto-mode.

Articles in this topic

  • Write Loops, Not Prompts — Agent Loops Explained — lineage (ReAct / AutoGPT / Ralph Loop / /goal), the one-sentence definition, three cost controls (max iterations · no-progress detection · token ceilings), skills-as-the-thing-that-makes-loops-work, three starter loops (issue-backlog · front-end-verification · code-review/babysit), and the readiness gate. Start here.
  • Loop Engineering — Addy Osmani’s Essay (the canonical thesis) — the primary source that named loop engineering (X Article, 2026-06-08, 1.8M views): “replace yourself as the prompter,” the five building blocks (+ memory) mapped onto both Claude Code and Codex, a worked daily-triage loop example, and the residual risks (verification, comprehension debt, cognitive surrender). The Cobus repo below is built on this.
  • Loop Engineering — Cobus Greyling’s Cross-Tool Pattern Reference + CLIs — the reference catalog: six primitives, seven production patterns, the L0→L3 readiness ladder, an S1/S2/S3 failure-mode catalog, cross-tool examples (Grok / Claude Code / Codex / GitHub Actions), and the loop-audit/loop-init/loop-cost npm CLIs.
  • Verifier-First Loops — Proof Outside the Agent — the verification discipline (omarsar0 · alphabatcher · Karpathy): write the verifier first (done-condition · per-pass check · saved artifact · failure→retry), keep proof outside the agent’s self-report, split plan/execute/evaluate across model families, and prefer multimodal goals. Anchored on Karpathy’s “if you can’t evaluate, you can’t auto-research it.”
  • Should You Build a Loop? The Four-Condition Test, Cost Math & Security Tax — the decision/economics layer (plutos_eth): the four-condition test for whether to build a loop at all, token-cost ranges, cost-per-accepted-change, four silent failure modes (Ralph Wiggum · self-preference · agentic laziness · goal drift), and a 30-day loop security checklist.
  • The Loop Is the Unit of Work — the synthesis: how the leverage point migrated prompt → harness → loop, why the Claude-native and tool-agnostic vocabularies are one pattern, and why the maker/checker verifier is what makes a loop shippable (the verification gradient = the readiness ladder).
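
The three cost controls and the verifier-first rule named in the articles above fit into one small driver. What follows is a minimal Python sketch under stated assumptions, not code taken from any of the linked articles: `run_loop`, `LoopResult`, the `agent` and `verifier` callables, and the default limits are all hypothetical names and values chosen for illustration. No-progress detection uses the crudest possible signal here (identical output on consecutive passes); a real loop would more likely compare diffs or test results.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

# Hypothetical signatures, not any real agent API:
#   agent(prompt)    -> (output_text, tokens_used_by_this_call)
#   verifier(output) -> (passed, feedback_for_the_next_prompt)
Agent = Callable[[str], Tuple[str, int]]
Verifier = Callable[[str], Tuple[bool, str]]


@dataclass
class LoopResult:
    status: str  # "done", "max_iterations", "no_progress", or "token_ceiling"
    iterations: int
    tokens_used: int
    output: Optional[str]


def run_loop(
    task: str,
    agent: Agent,
    verifier: Verifier,
    max_iterations: int = 5,       # cost control 1: hard cap on passes
    token_ceiling: int = 50_000,   # cost control 3: total token budget
    max_stalled: int = 2,          # cost control 2: identical passes tolerated
) -> LoopResult:
    prompt = task
    tokens_used = 0
    stalled = 0
    last_output: Optional[str] = None

    for i in range(1, max_iterations + 1):
        output, cost = agent(prompt)
        tokens_used += cost

        # Proof outside the agent: the verifier decides "done",
        # never the agent's own report.
        passed, feedback = verifier(output)
        if passed:
            return LoopResult("done", i, tokens_used, output)

        # Token ceiling is checked after each pass, so the budget can be
        # overshot by at most one call.
        if tokens_used >= token_ceiling:
            return LoopResult("token_ceiling", i, tokens_used, output)

        # No-progress detection: count consecutive passes whose output is
        # identical to the previous one.
        stalled = stalled + 1 if output == last_output else 0
        if stalled >= max_stalled:
            return LoopResult("no_progress", i, tokens_used, output)
        last_output = output

        # Failure -> retry: feed the verifier's feedback into the next prompt.
        prompt = f"{task}\n\nPrevious attempt failed verification:\n{feedback}"

    return LoopResult("max_iterations", max_iterations, tokens_used, last_output)


# Demo with stand-in callables (no model is called).
_attempts = iter(["draft", "draft v2", "final"])
demo = run_loop(
    "write it",
    agent=lambda prompt: (next(_attempts), 100),
    verifier=lambda out: (out == "final", "not final yet"),
)
print(demo.status, demo.iterations, demo.tokens_used)  # done 3 300
```

Returning a `status` rather than raising lets whatever schedules the loop tell a verified result apart from a run that only stopped because a budget ran out, which is where a human would step back in.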

Core mechanisms (Claude Code)

  • Dynamic Workflows — orchestration logic that lives outside the agent (a JS function), with per-sub-agent token caps and max-iteration variables.
  • [[claude-ai/claude-code-goal-command-walkthrough|/goal Walkthrough]] — the completion-condition loop (a Ralph Loop that knows when it’s done).
  • [[claude-ai/scheduled-tasks|/loop & scheduled tasks]] and Routines — the scheduling heartbeat that turns a one-shot into a standing loop.

The hard parts (open questions)

Verification is the load-bearing, still-underdeveloped piece — against what standard do you review the loop’s output? See The Verification Frontier and this topic’s research agenda for the open questions (handling ambiguity, where planning lives, where the human steps back in).