Source: ai-research/plutos-eth-loops-vs-fable5-gap-2026-06-14.md · Author: plutos (@plutos_eth) · URL: https://x.com/plutos_eth/status/2064470776611504188 (X Article: “I Stopped Prompting Claude and Started Building Loops. Here’s the Gap Fable 5 Opened.”) · Posted: 2026-06-09 · 7.7K views. Mythos capabilities + costs are attributed to Anthropic’s Frontier Red Team and VentureBeat; token-cost estimates are the author’s standalone assertions; failure modes are credited to Geoffrey Huntley and Addy Osmani.

The six building blocks of a loop are by now well-covered (Osmani’s essay is the origin; the Cobus reference is the catalog). This article captures @plutos_eth’s distinct contribution, the decision and risk layer: whether you should build a loop at all, what it actually costs, how it fails silently, and the security tax nobody budgets for. Its blunt headline: “most developers don’t need a loop yet,” and a loop pointed at the wrong task “costs more than it returns, forever.”

Key Takeaways

  • The four-condition test — miss one box, keep it a manual prompt: (1) the task repeats at least weekly (a loop amortizes its setup across runs); (2) verification is automated (a test/type-check/linter/build that can fail the work without you in the room); (3) your token budget can absorb the waste (“obvious to people with effectively free tokens, reckless to people on a $20 consumer plan — both groups are right”); (4) the agent has a senior engineer’s tools (logs, repro env, the ability to run what it writes).
  • The economics are unforgiving. A single-agent loop on a medium task burns 50k–200k tokens; a fleet with an orchestrator + 3 specialists 500k–2M; a daily-scheduled loop millions a week. “Loops re-read context, retry, and explore — they spend whether or not the run ships anything.”^[author’s standalone estimates]
  • The only metric that matters: cost per accepted change — not tokens spent, not tasks attempted. “If fewer than half the loop’s outputs survive your review unchanged, you’re doing the review work the loop was supposed to remove, and the loop is losing.”
  • Four silent failure modes, all fixed by an objective gate: the Ralph Wiggum loop (Geoffrey Huntley — emits “done” early, exits on half-done work, keeps spending), self-preferential bias (maker grades its own homework), agentic laziness (“done enough” at partial completion), and goal drift (constraints evaporate by turn 47 as summarization loses them). “Not a verifier with an opinion” — a pass/fail test, a build, a zero/non-zero linter. Goal drift’s fix is a standing VISION.md re-read every run.
  • The security tax sharpens as the loop speeds up. Unattended loops merge insecure code on autopilot; community skills are injection vectors; long-running loops scatter secrets into debug logs; permission scope creep (“just one write permission, never re-audited”) is the quiet killer.
  • The skill gap of 2026: a prompt engineer writes better instructions and is the feedback loop; a loop engineer writes the VISION.md, the gate, and the stop condition, then walks away and trusts the verifier. “The tools are identical. The mindset isn’t.”
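
The objective gate that fixes the four failure modes can be sketched as a function that trusts only exit codes. This is a minimal Python sketch under stated assumptions, not the author’s implementation: `run_gate` and `timeout_s` are hypothetical names, and the commands passed in (test runner, type checker, linter, build) are whatever the project already uses.

```python
import subprocess
import sys
from typing import Sequence


def run_gate(checks: Sequence[Sequence[str]], timeout_s: float = 600.0) -> bool:
    """Return True only if every check command exits with status 0.

    The gate has no opinion: it never reads the agent's output,
    only the exit code of each command.
    """
    if not checks:
        # Assumption: an empty gate is a configuration error, not a pass.
        raise ValueError("gate needs at least one check")
    for cmd in checks:
        try:
            result = subprocess.run(
                list(cmd), capture_output=True, text=True, timeout=timeout_s
            )
        except subprocess.TimeoutExpired:
            print(f"GATE FAIL (timeout): {' '.join(cmd)}")
            return False
        if result.returncode != 0:
            print(f"GATE FAIL (exit {result.returncode}): {' '.join(cmd)}")
            return False
    return True
```

A loop built this way would call `run_gate` after each iteration and ignore the agent’s own “done” message; only a True result ends the run. Stopping at the first failing check keeps the wasted spend per iteration small, at the price of not reporting later failures.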

The Security Tax — 30-Day Loop Checklist

The article’s most reusable artifact (re-run every 30 days):

  • Gate includes SAST + dependency audit + secret scanning
  • No skill auto-install — read the source before adding one
  • Verbose logging OFF in production loops; sanitize what’s logged
  • Permissions re-audited; remove every scope added “temporarily”
  • Human approval gate on merge / deploy / dependency change
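
The “sanitize what’s logged” item can be sketched as a redacting filter for Python’s standard `logging` module. This is an illustrative sketch, not a complete secret scanner: the two patterns, `sanitize`, and `RedactingFilter` are hypothetical, and a real loop would pair this with a dedicated secret-scanning tool in the gate.

```python
import logging
import re

# Hypothetical patterns for illustration; real secrets come in many more shapes.
KEY_VALUE = re.compile(
    r"(?i)\b(api[_-]?key|token|secret|password)\b(\s*[=:]\s*)(\S+)"
)
AWS_KEY_ID = re.compile(r"\bAKIA[0-9A-Z]{16}\b")  # AWS access key ID shape


def sanitize(line: str) -> str:
    """Replace anything that looks like a credential with [REDACTED]."""
    line = KEY_VALUE.sub(r"\1\2[REDACTED]", line)
    return AWS_KEY_ID.sub("[REDACTED]", line)


class RedactingFilter(logging.Filter):
    """Logging filter that sanitizes each record before it is emitted."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Format the message first so secrets passed as args are covered too.
        record.msg = sanitize(record.getMessage())
        record.args = None
        return True  # keep the record, now redacted
```

Attaching the filter to a handler with `handler.addFilter(RedactingFilter())` would apply it to everything that handler writes. Pattern-based redaction misses anything it has no pattern for, which is why the checklist also says to keep verbose logging off.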

The skill-injection point corroborates the wiki’s skill-security thread: “one audit found credentials leaking in hundreds of public skills out of seventeen thousand” — the same risk surface SkillSpector scans for. A loop that auto-installs community skills inherits every prompt injection in their descriptions.

The Mythos Framing (and a wiki clarification)

The article’s hook: Anthropic’s unreleased Mythos Preview red-team model autonomously found a 27-year-old OpenBSD DoS bug, a 16-year-old FFmpeg flaw (on a line fuzzers had hit 5M times), and a Linux-kernel root chain — and was deemed too dangerous to release (Project Glasswing restricted access). Per VentureBeat, the discovery campaign cost ~50 — “the expensive part wasn’t intelligence, it was the system around the model — the loop.” That cost asymmetry is the article’s argument for why loop engineering, not model access, is the leverage you can actually pick up this month.

Fable 5 ≠ Mythos Preview

The title says “Fable 5” but the body describes the unreleased Mythos Preview red-team model. The wiki’s record: Fable 5 (Mythos-class) did ship 2026-06-09, while Mythos Preview is the separate internal frontier-reference model. The cyber findings + costs map to the Mythos cyber-capabilities story, not to the shipped Fable 5. Treat the article’s “best model locked in a vault” as the red-team model, not the product.^[wiki clarification — the source conflates the two]

Try It

  • Run the four-condition test before building anything. If the task doesn’t recur weekly, has no automated gate, would blow your token budget, or the agent lacks real tooling — keep it a prompt. This is the cheapest decision in the whole topic.
  • Instrument cost-per-accepted-change from run one. It’s the single number that tells you whether the loop is winning; tokens-spent and tasks-attempted are vanity metrics.
  • Build in order, not all at once: one manual run that works → a skill → an automation → a state file → a gate → then schedule it. “Skip ahead and you’re paying for a swarm before you have a single run that works.”
  • Put the security checklist on a 30-day calendar reminder the moment a loop goes unattended.
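
The first two steps above can be sketched in a few lines of Python. This is a hedged illustration, not the author’s tooling: `LoopCandidate`, `LoopLedger`, and their field names are hypothetical, and the cost unit (dollars, tokens, or anything else) is whatever you choose to track.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class LoopCandidate:
    """The four conditions from the article, answered honestly per task."""
    repeats_weekly: bool
    has_automated_gate: bool
    budget_absorbs_waste: bool
    agent_has_real_tools: bool


def should_build_loop(c: LoopCandidate) -> bool:
    """Miss one box and the task stays a manual prompt."""
    return all(
        (c.repeats_weekly, c.has_automated_gate,
         c.budget_absorbs_waste, c.agent_has_real_tools)
    )


@dataclass
class LoopLedger:
    """Running totals for cost per accepted change."""
    total_cost: float = 0.0
    runs: int = 0
    accepted_unchanged: int = 0

    def record(self, cost: float, accepted_unchanged: bool) -> None:
        # Every run costs something, whether or not its output ships.
        self.total_cost += cost
        self.runs += 1
        if accepted_unchanged:
            self.accepted_unchanged += 1

    def cost_per_accepted_change(self) -> float:
        if self.accepted_unchanged == 0:
            return math.inf  # spending with nothing accepted yet
        return self.total_cost / self.accepted_unchanged

    def is_losing(self) -> bool:
        # The article's threshold: fewer than half survive review unchanged.
        return self.runs > 0 and self.accepted_unchanged * 2 < self.runs
```

For example, three runs costing 2.0, 3.0, and 1.0 with only the first accepted unchanged give a cost per accepted change of 6.0, and `is_losing()` returns True because one of three is under half.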

Open Questions

  • The token-cost ranges (50k–200k / 500k–2M / millions-per-week) are the author’s experience-based estimates, not measured benchmarks — treat as directional.
  • 7.7K views, and the author is not an established authority. The cited claims (Mythos costs/finds via VentureBeat + Red Team) are verifiable; the prescriptive economics are opinion. Weighted accordingly (confidence: medium).