Source: ai-research/addyosmani-loop-engineering-essay-2026-06-14.md — Author: Addy Osmani (@addyosmani) · URL: https://x.com/addyosmani/status/2064127981161959567 (X Article) · Posted: 2026-06-08 · 1.8M views. Claims about product capabilities (Codex Automations tab, Claude Code /loop / /goal / worktrees) are first-hand author observations; the bcherny and steipete quotes are reproduced verbatim and corroborated elsewhere in the wiki.
This is the canonical primary essay that named and popularized loop engineering, the source from which the rest of the moment (including Cobus Greyling’s tooling repo, built the next day) derives. Osmani’s thesis: “Loop engineering is replacing yourself as the person who prompts the agent. You design the system that does it instead.” A loop is a recursive goal: you define a purpose and the agent iterates until done. His sharpest, most-cited observation is that this is “not really a tool thing anymore”: the five building blocks now ship inside both Claude Code and Codex, so you design a loop shape that survives whichever tool you happen to be in.
Key Takeaways
- The leverage point moved, the work didn’t get easier. For two years you wrote a prompt, read the diff, prompted again — you were the loop. Now you build a small system that finds the work, hands it out, checks it, records what’s done, and decides the next thing — and you let it poke the agents. Osmani frames this as one floor above agent harness engineering: “the harness, but it runs on a timer, it spawns little helpers, and it feeds itself.”
- The named originators. @steipete: “You shouldn’t be prompting coding agents anymore. You should be designing loops that prompt your agents.” @bcherny (head of Claude Code): “I don’t prompt Claude anymore. I have loops running that prompt Claude… My job is to write loops.” (Same quote sourced in Reflecting on a Year of Claude Code.)
- Five building blocks + memory — and both products have all five. The capability is identical across Claude Code and Codex; only the names differ. This product-symmetry argument is Osmani’s distinctive contribution.
- Osmani is openly skeptical. “Its still early, I’m skeptical and you absolutely have to be careful about token costs… concerns re: slop are valid.” Prompting directly is “still effective” — “it’s all about finding the right balance.”
- The residual risks sharpen as the loop improves, not the reverse: verification (“‘done’ is a claim and not a proof”), comprehension debt, and cognitive surrender. “Build the loop. Stay the engineer.”
The Five Building Blocks (+ Memory)
Osmani’s framing maps each primitive onto both products — the heart of the essay. (For the deeper pattern catalog, readiness ladder, and CLIs built on top of these, see the Cobus Greyling reference.)
| Block | Job | Claude Code | Codex |
|---|---|---|---|
| Automations (the heartbeat) | Schedule discovery + triage so it’s a loop, not a one-off | /loop (cadence), cron tasks, hooks, GitHub Actions | Automations tab → Triage inbox; empty runs self-archive |
| Worktrees | Parallel agents that can’t collide on files | git worktree, --worktree flag, isolation: worktree on subagents | Built-in worktree support |
| Skills | Project knowledge written once, read every run | SKILL.md folder; pays down intent debt | $skill-name / /skills; ship across repos as a plugin |
| Plugins & Connectors (MCP) | Reach real tools — issues, DB, staging API, Slack | MCP servers | MCP — same connector usually works in both |
| Sub-agents (maker/checker) | The one who writes ≠ the one who checks | subagents in .claude/agents/, agent teams | agents as TOML in .codex/agents/ (per-agent model + effort) |
| + Memory / State | Durable spine outside the conversation — “the agent forgets, the repo doesnt” | STATE.md / markdown / Linear board | same |
The in-session primitives that are the whole point: /loop re-runs on a cadence; /goal keeps going until a condition you wrote is verified true by a separate small model, “so the agent that wrote the code isnt the one grading it.” Codex has /goal too (pause/resume/clear). This is the maker/checker split applied to the stop condition itself.
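As a rough illustration of that stop-condition split, the sketch below iterates a maker until an independent checker accepts the result or an iteration budget runs out. It is not how /goal is implemented in either product; `run_until_goal`, `toy_maker`, `toy_checker`, and the `max_iterations` budget are hypothetical names chosen for this example.

```python
from typing import Callable


def run_until_goal(
    maker: Callable[[str], str],
    checker: Callable[[str], bool],
    goal: str,
    max_iterations: int = 5,
) -> tuple[bool, int]:
    """Call the maker repeatedly until a separate checker verifies the goal.

    Hypothetical helper: the maker never grades its own output, and the
    loop stops after max_iterations to cap cost.
    """
    prompt = goal
    for attempt in range(1, max_iterations + 1):
        artifact = maker(prompt)
        if checker(artifact):  # "done" is verified, not claimed
            return True, attempt
        # Feed the rejection back so the next attempt has context.
        prompt = f"{goal}\nPrevious attempt was rejected: {artifact!r}"
    return False, max_iterations


# Toy stand-ins for real agents (assumptions for the demo only).
calls: list[str] = []


def toy_maker(prompt: str) -> str:
    calls.append(prompt)
    return "tests pass" if len(calls) >= 3 else "tests fail"


def toy_checker(artifact: str) -> bool:
    return artifact == "tests pass"


done, used = run_until_goal(toy_maker, toy_checker, "make the test suite green")
print(done, used)  # True 3
```

The iteration cap matters as much as the checker: without it, a goal the checker can never accept would burn tokens indefinitely.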
The Worked Example (what one loop looks like)
An automation runs every morning. Its prompt calls a triage skill that reads yesterday’s CI failures, open issues, and recent commits, and writes findings to a state file (markdown or Linear). For each finding worth doing, the thread opens an isolated worktree and sends a sub-agent to draft the fix; a second sub-agent reviews it against the project skills and existing tests. Connectors open the PR and update the ticket. Anything it can’t handle lands in the triage inbox. The state file remembers what was tried, so tomorrow’s run picks up where today stopped.
“You designed it one time. You did not prompt any of those steps… and its the same loop in Codex or in Claude Code because the pieces are the same pieces.”
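The shape of that morning run can be sketched in plain Python, under stated assumptions. Every name here (`Finding`, `LoopState`, `run_morning_loop`, and the toy callables) is hypothetical, and the real triage, drafting, review, and PR steps would be skills, sub-agents, and connectors rather than local functions. Skipping findings already recorded in state is one reading of "picks up where today stopped", not something the essay specifies.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Finding:
    id: str
    summary: str


@dataclass
class LoopState:
    """Durable record of what each run tried, keyed by finding id."""
    outcomes: dict[str, str] = field(default_factory=dict)


def run_morning_loop(
    triage: Callable[[], list[Finding]],
    draft_fix: Callable[[Finding], str],
    review: Callable[[Finding, str], bool],
    open_pr: Callable[[Finding, str], None],
    state: LoopState,
) -> list[str]:
    """One pass: triage, draft, independent review, PR or human inbox."""
    inbox: list[str] = []
    for finding in triage():
        if finding.id in state.outcomes:  # already tried on an earlier run
            continue
        patch = draft_fix(finding)        # maker
        if review(finding, patch):        # checker is a separate callable
            open_pr(finding, patch)
            state.outcomes[finding.id] = "pr-opened"
        else:
            state.outcomes[finding.id] = "needs-human"
            inbox.append(finding.id)
    return inbox


# Toy stand-ins (assumptions for the demo only).
findings = [Finding("ci-1", "flaky test"), Finding("iss-2", "null deref")]
opened: list[str] = []


def toy_triage() -> list[Finding]:
    return findings


def toy_draft(f: Finding) -> str:
    return f"patch for {f.id}"


def toy_review(f: Finding, patch: str) -> bool:
    return f.id.startswith("ci")


def toy_open_pr(f: Finding, patch: str) -> None:
    opened.append(f.id)


state = LoopState()
inbox = run_morning_loop(toy_triage, toy_draft, toy_review, toy_open_pr, state)
print(inbox, state.outcomes)
```

Running the same function again with the same `state` does no new work, which is the property the state file is there to provide.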
The Vocabulary Osmani Standardizes
The essay is also where much of the agentic-dev glossary comes from (the Cobus repo inherits it): agent harness engineering (the single-session sandbox), the factory model (the system that builds the software), intent debt (cold-start guesses, paid down by skills), comprehension debt (gap between what exists and what you understand), the orchestration tax (human cost of coordinating parallel agents), and code agent orchestra / adversarial code review (different agents for explore / implement / verify).
Try It
- Read this first, then the tooling. This essay is the why and the product-symmetry map; the Cobus reference is the how (L0→L3 ladder, seven patterns, loop-audit / loop-init / loop-cost CLIs). Read the verifier-first discipline before you let anything run unattended.
- Pick one block to start: scheduling + one triage skill + a state file is the minimum viable loop. Add worktrees, connectors, and the checker sub-agent only once the prior version proves its value.^[inferred: Osmani lists the blocks but the “minimum viable loop” ordering is the wiki’s synthesis across this essay and the Cobus repo]
- Map it onto primitives you already have in Claude Code: /loop, /goal, scheduled tasks/Routines, isolation: worktree, and subagents. This essay’s claim is you don’t need to write-and-maintain a bash pile anymore; the pieces ship in the product.
- Hold the line on review. Osmani’s own caveat: if he stopped reviewing the code or relied entirely on automated loops, “my product’s quality would suffer… a downward spiral.” Use the loop on work you understand; don’t use it to avoid understanding.
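For the state-file piece of a minimum viable loop, a small sketch may help. It assumes a hypothetical line format (`- date | item id | outcome`) appended to a markdown file named STATE.md; the essay names the file but prescribes no format, and `record` and `already_tried` are illustrative helpers, not part of any product.

```python
import tempfile
from datetime import date
from pathlib import Path


def record(state_file: Path, item_id: str, outcome: str, day: date) -> None:
    """Append one outcome line to the markdown state file."""
    line = f"- {day.isoformat()} | {item_id} | {outcome}\n"
    with state_file.open("a", encoding="utf-8") as fh:
        fh.write(line)


def already_tried(state_file: Path) -> set[str]:
    """Return the ids recorded by earlier runs (empty if no file yet)."""
    if not state_file.exists():
        return set()
    tried: set[str] = set()
    for line in state_file.read_text(encoding="utf-8").splitlines():
        parts = [p.strip() for p in line.removeprefix("- ").split("|")]
        if len(parts) == 3:  # ignore anything not in the assumed format
            tried.add(parts[1])
    return tried


# Demo in a temporary directory; ids and dates are made up.
with tempfile.TemporaryDirectory() as tmp:
    state_file = Path(tmp) / "STATE.md"
    record(state_file, "ci-1", "pr-opened", date(2026, 6, 9))
    record(state_file, "iss-2", "needs-human", date(2026, 6, 9))
    tried = already_tried(state_file)
    todo = [i for i in ["ci-1", "iss-2", "iss-3"] if i not in tried]

print(todo)  # ['iss-3']
```

Keeping the file append-only and human-readable means the same record serves the next scheduled run and the engineer reviewing what the loop did.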
Related
- Loop Engineering — Cobus Greyling’s Reference + CLIs — the tooling layer built on top of this essay (created 2026-06-09, the day after); six-primitive taxonomy, L0→L3 readiness ladder, failure-mode catalog, npm CLIs.
- Write Loops, Not Prompts — the topic’s beginner entry point (lineage, three cost controls, three starter loops).
- Verifier-First Loops — the verification deep-dive (omarsar0 · alphabatcher · Karpathy): write the verifier before you launch the loop.
- Should You Build a Loop? The Four-Condition Test, Cost Math & Security Tax — the decision/economics layer (plutos_eth): when not to build one.
- The Loop Is the Unit of Work — the synthesis: how the leverage point migrated prompt → harness → loop.
- agent-skills (Addy Osmani) — Osmani’s companion essay; “skills pay down intent debt” is the primitive made concrete.
- Reflecting on a Year of Claude Code — Boris Cherny & Cat Wu, the first-party “my job is to write loops” source.
- Dynamic Workflows (Claude Code) — the Claude-Code-native loop mechanism this essay maps patterns onto.
Open Questions
- Osmani’s product-symmetry claim (“both products have all five now”) is a 2026-06-08 snapshot; the Codex Automations tab and Claude Code primitives are both moving fast — re-verify the parity claim on refresh.^[inferred]
- The full X Article is login-gated; body reproduced via Grok x_search. Verbatim wording is high-confidence (corroborated by the derivative Cobus repo and independently-known bcherny/steipete quotes) but not directly machine-extracted.