Source: raw/I_figured_out_the_best_way_to_vibe_code.md (YouTube — Matthew Berman, “I figured out the best way to vibe code”, https://www.youtube.com/watch?v=wwfJlSF34n8)

A practitioner walkthrough of how Matthew Berman — a creator who uses every major agentic coding tool professionally — actually structures his AI coding workflow. Berman’s thesis: beginners prompt, wait, review, and prompt again; experts automate the whole loop. Most of the conceptual spine here (a loop is a trigger + repeated action + goal; worktrees; multi-model; agents.md/skills) is already covered in depth by the rest of this topic, so this article focuses on Berman’s distinctive, concrete contributions: a free Loop Library, three named loops he runs nightly, the test/docs/logging flywheel, the automation race-condition trick, and his honest “still unsolved” call on parallel merge-and-deploy. As a creator video, the macro framing (“there are levels to AI coding”) is opinion; the loops and setups below are the reproducible parts.

Key Takeaways

  • The expert move is automating the prompt-review-reprompt cycle, not prompting faster. Berman’s primary harnesses are Cursor and Codex; he rates Claude Code, Devin, and Factory highly too (he uses Claude Code less only because he burns quota fast).
  • New named resource: the Loop Library — a free, community-submittable catalog of reusable agent loops at signals.forwardfuture.ai/loop-library, announced for the first time in this video. It is the Berman-curated counterpart to Cobus Greyling’s pattern registry.
  • A loop = three things: (1) a trigger to start it, (2) an action it repeats, (3) a goal that stops it. (Same definition the rest of this topic uses — see Write Loops, Not Prompts.)
  • Automations and loops pair up: an automation fires an agent prompt on a trigger (schedule or GitHub event); a loop is the repeated-until-goal action it kicks off. Cursor and Codex both ship automations as a first-class feature.
  • The flywheel: keep 100% test coverage, always-fresh docs, and exhaustive logging in every codebase. Each is maintainable by a standing automation — and full logs are what let an agent autonomously diagnose and fix production errors.
  • Parallel merge + deploy at agent scale is genuinely unsolved (Berman’s words, after talking to the OpenAI and Cursor teams). A dozen agents racing to main serialize on CI/deploy and force each other to rebase and restart. Partial workaround: batch commits.
  • Go cloud when running many agents in parallel; stay local for speed, control, and newest features. Berman is migrating his whole workflow to cloud agents because 12-20 local agents crush his machine.
  • Encode the multi-model routing as a skill (his example: plan with Fable, write code with Composer, review with GPT-5.5) — driven by speed and cost, not capability worship.

The Loop Library (the new thing here)

Berman built and is hosting a free Loop Library at signals.forwardfuture.ai/loop-library — loops he uses, loops others use, and a submission path for your own generalized loops. It is meant to be a concrete antidote to “handwavy theoretical” loop talk. Treat it as a companion to the wiki’s existing loop references rather than a replacement: where loop-engineering gives a taxonomy + readiness ladder + failure catalog, the Loop Library is a copy-and-run recipe collection.

Three loops Berman actually runs

Each is a standing automation (scheduled) wrapping a goal-bounded loop. These are the concrete recipes worth stealing:

  • Overnight docs sweep. Trigger: nightly (e.g. 1:00 AM automation). Action: review the full codebase, compare yesterday’s changes against the docs, update any gaps (public README + internal docs). Goal: docs reflect the latest code; open a PR with the changes.
  • Sub-50ms page-load loop. Trigger: manual/on-demand. Action: load every page, modal, and sidebar; for anything over 50 ms, optimize queries and the site. Goal: every surface loads in under 50 ms — “continue until everything loads in under 50 milliseconds.” Berman let this run for hours and reports the app ended up “lightning fast.” (This is the most novel recipe — a performance loop with a hard numeric stop condition.)
  • Production error sweep. Trigger: nightly. Action: read production logs, find errors, analyze the cause, write a fix. Goal: a PR waiting for every error by morning. Depends on full log coverage (see the flywheel below).

Automations: trigger + prompt + tools (and the race-condition trick)

In both Cursor and Codex, an automation needs a trigger, a prompt (instructions), and optionally memories / tools / MCP servers. Berman’s worked example automates code-review follow-through:

  1. Trigger: GitHub “pull request opened.”
  2. Problem: the PR opens before the code-review bot finishes reviewing.
  3. The trick: write the dependency in plain English — “wait until you see [the reviewer’s] comments on the PR” — and the agent will literally wait. Then: “go through each comment, address them, push the new code back to the PR.”
  4. Cursor auto-detects tools the automation needs (here, the GitHub “comment on pull request” tool) and prompts you to enable them.

Codex is equivalent: build the automation via natural-language chat or manually (title, prompt, repo, schedule, memories, tools). The reusable insight is the “wait until you see X” pattern for sequencing an automation behind an async dependency without polling logic.
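
For contrast, this is roughly the explicit polling logic that the plain-English instruction replaces. It is a sketch under stated assumptions, not how Cursor or Codex implement waiting: `fetch_comment_authors` is a hypothetical callable standing in for whatever lists the comments on a PR, and the reviewer name and timing values are illustrative.

```python
import time
from typing import Callable, List


def wait_for_review(
    fetch_comment_authors: Callable[[], List[str]],  # hypothetical: authors of comments on the PR
    reviewer: str,
    timeout_s: float = 600.0,
    interval_s: float = 15.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Return True once the reviewer has commented, False on timeout."""
    deadline = clock() + timeout_s
    while True:
        if reviewer in fetch_comment_authors():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


# Toy run: the review bot's comment shows up on the third check.
_responses = iter([[], ["alice"], ["alice", "review-bot"]])
found = wait_for_review(
    lambda: next(_responses),
    reviewer="review-bot",
    sleep=lambda _seconds: None,  # skip real waiting in the demo
)
```

Writing the dependency into the prompt hands the timeout, retry, and "has the review landed yet" decisions to the agent, which is convenient but also means those decisions are no longer explicit in your own code.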

Sponsor disclosure

The code-review tool in Berman’s example is Greptile (the video’s paid sponsor): connect it to a repo and it auto-reviews every PR with a summary, a 0-5 merge-confidence score, a change flowchart, and copy-paste fix prompts. The reusable part is the review → auto-fix → resubmit automation pattern, which is vendor-independent (it works with any PR-review bot, including Claude Code’s own /autofix-pr cloud CI loop). Treat the specific tool endorsement as sponsored.

The test + docs + logging flywheel

Berman’s “no reason to have suboptimal code anymore” argument rests on three standing automations:

  • Tests: an automation checks coverage; if it is not 100%, it writes tests until it is.
  • Docs: the overnight docs sweep keeps every change documented.
  • Logging: log everything (a 7- or 30-day window is cheap) — because exhaustive logs are the raw material the production-error-sweep loop consumes. No logs, no autonomous error fixing.

The three reinforce each other: perfect tests + perfect docs + perfect logs make the codebase legible enough for agents to maintain it unattended.^[inferred — Berman states the three practices and calls them a flywheel; the “legibility enables unattended maintenance” framing connects them to this topic’s verification thesis]
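
The error sweep only works if the logs are something an agent can parse. As a minimal sketch, assuming JSON-lines logs with `level`, `route`, and `message` fields (these field names are illustrative assumptions, not from the video), the following groups errors so that a sweep could open one fix per distinct error instead of one per log line:

```python
import json
from collections import Counter
from typing import Iterable, List, Tuple


def distinct_errors(log_lines: Iterable[str]) -> List[Tuple[Tuple[str, str], int]]:
    """Count ERROR entries per (route, message), most frequent first."""
    counts: Counter = Counter()
    for line in log_lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not structured log entries
        if not isinstance(entry, dict):
            continue  # valid JSON, but not a log record
        if entry.get("level") == "ERROR":
            counts[(entry.get("route", "?"), entry.get("message", "?"))] += 1
    return counts.most_common()


sample_logs = [
    '{"level": "INFO", "route": "/login", "message": "ok"}',
    '{"level": "ERROR", "route": "/checkout", "message": "card token missing"}',
    '{"level": "ERROR", "route": "/checkout", "message": "card token missing"}',
    '{"level": "ERROR", "route": "/profile", "message": "avatar upload timed out"}',
    "not json at all",
]
errors = distinct_errors(sample_logs)
```

Here `errors` holds two entries, with the repeated checkout failure first, which is the kind of deduplicated worklist a nightly sweep could turn into PRs.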

The unsolved problem: parallel merge & deploy

Berman is candid that merging and deploying many parallel agents is not solved — a rare honest negative result for a creator video, and one the rest of this topic only touches obliquely:

  • A dozen agents all trying to land on main near-simultaneously serialize on CI/deploy. Each merge invalidates the others, which must rebase, re-run tests, and merge again — then re-trigger CI/deploy. They “stumble over each other” and lock the commit/deploy process.
  • Partial workaround — batch commits: open many PRs, then let a single agent review all the changes, combine them, and merge + deploy once. “Far from perfect.”
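
A rough back-of-the-envelope model shows why batching helps. It assumes the worst case, where every merge to main forces every still-open PR to rebase and rerun CI; real pipelines with merge queues or non-conflicting changes will do better. The function names are illustrative.

```python
def serial_ci_runs(agents: int) -> int:
    """Worst case: each merge forces every still-open PR to rebase and rerun CI."""
    # The k-th PR to land has run CI k times: once at first, plus once per earlier merge.
    return sum(range(1, agents + 1))


def batched_ci_runs(agents: int) -> int:
    """Each PR runs CI once, then one combined change is tested and merged."""
    return agents + 1


for n in (3, 12, 20):
    print(n, serial_ci_runs(n), batched_ci_runs(n))
```

Under these assumptions a dozen agents cost 78 CI runs when landing one by one versus 13 when batched, and the gap grows quadratically with the number of agents.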
  • Berman claims Cursor announced (the day of recording) it is building its own Git alternative purpose-built for agent-scale deployment.^[Berman’s claim, as stated in the video — not independently verified here]

Cloud vs local agents (his decision criteria)

| Criterion | Cloud agents | Local agents |
| --- | --- | --- |
| Parallelism | Effectively unlimited (datacenter, not your CPU/RAM) | Bound by your machine; with 12-20 agents it “slows to a crawl” |
| Isolation | Each agent gets a fully isolated environment, so no cross-agent file conflicts | Even per-agent worktrees hit “weird edge cases” |
| Access | From anywhere, incl. mobile apps | Tied to the machine (remote control possible, but bandwidth-bound) |
| Speed | Pays spin-up latency per agent | Faster; the environment is always ready |
| Control / visibility | Less direct | More; you see files change locally |
| Feature freshness | Newest features arrive later | Newest features ship here first |

Cloud-specific perk he calls out: Cursor cloud agents auto-record a video + screenshots of the changes so you can verify visually instead of trusting a text summary. Setup caveat: give a cloud agent the full environment (.env.local, client secrets, env vars) you’d give a local one — treat it as a real environment. Berman’s net lean: moving his entire workflow to cloud.

Already covered elsewhere (pointers, not repetition)

Berman also walks through fundamentals this topic and claude-ai/ already document — read those for depth:

  • What a loop is → Write Loops, Not Prompts and The Loop Is the Unit of Work.
  • Worktrees for parallel agents (“a second working folder… resolve conflicts at merge”) → Worktrees; also primitive #2 in loop-engineering.
  • agents.md / CLAUDE.md / Cursor rules (commit style, message format, model personality, coding prefs; Cursor “rules” write to agents.md and auto-capture learned preferences) → CLAUDE.md primer.
  • Skills (“anything you do more than once, make it a skill”; off-the-shelf skills install by pasting a URL; agents auto-discover which skill to use at runtime; use cases: repeated prompts, domain rules, tool instructions, quality gates) → agent-skills (Addy Osmani), whose spec → PRD → implement → test → QA → deploy lifecycle matches the ~61k-star “agent skills” repo Berman recommends.^[inferred: Berman names an off-the-shelf “agent skills” repo at ~61k stars with that exact lifecycle; the wiki’s documented match is Osmani’s agent-skills]
  • Multi-model routing as a skill (plan/execute/review across model families for speed + cost) → Verifier-First Loops (split plan/execute/evaluate across models) and modes are tool subsets.
  • Agent Loops (topic index) — the learning path this article slots into as a practitioner case study.
  • Write Loops, Not Prompts — the one-sentence loop definition + three beginner starter loops; the conceptual prerequisite to Berman’s recipes.
  • Loop Engineering (Cobus Greyling) — the taxonomy/readiness/failure-catalog reference; Berman’s Loop Library is the recipe-collection counterpart.
  • Loop Library (Forward Future) — the dedicated article on the Loop Library resource itself (catalog + the architecture-satisfaction loop example).
  • Should You Build a Loop? — the cost/security decision layer behind running nightly loops like Berman’s.
  • Verifier-First Loops — the verification discipline that hardens the docs/error/test loops above.
  • Worktrees — the parallel-isolation mechanism Berman relies on locally.
  • Reflecting on a Year of Claude Code — Boris Cherny’s first-party “my job is to write loops” + /babysit PR loop, the same automation thesis from inside Anthropic.

Try It

  1. Bookmark the Loop Library (signals.forwardfuture.ai/loop-library) and lift one recipe — the production-error-sweep or overnight-docs-sweep is the lowest-risk place to start.
  2. Stand up the flywheel prerequisites first: turn on exhaustive logging (7-30 day retention) and a coverage check, so an error-sweep loop has logs to read and a test bar to hold.
  3. Build one automation in Cursor or Codex: trigger = “pull request opened”, prompt = “wait until you see the review comments on the PR, then address each one and push the fix back,” and enable the GitHub comment tool when prompted.
  4. Bound every loop with a goal, not just a schedule — e.g. “continue until every page loads under 50 ms” — so it stops on a verifiable condition (pair with the cost controls in Write Loops, Not Prompts).
  5. Decide cloud vs local by parallelism: if you routinely run 10+ agents, move them to cloud (and provision their .env/secrets); keep local for fast, high-control single-agent work.

Open Questions

  • Parallel merge-and-deploy has no clean solution yet; whether Cursor’s claimed agent-scale Git alternative (or batch-commit conventions) actually fixes it is unverified.
  • The Loop Library is new and thinly populated at launch; the durability and breadth of its recipes are unproven.^[inferred]