Dynamic Workflows in Claude Code — Anthropic Announcement (GA 2026-06-10)

Source: Introducing dynamic workflows in Claude Code (Anthropic blog, claude.com/blog/introducing-dynamic-workflows-in-claude-code; official docs code.claude.com/docs/en/workflows); operator walkthroughs raw/Opus_4.8_is_NOT_Claude_s_biggest_release_today_Ultracode_and_Dynamic_Workflows.md and raw/Claude_Code_Dynamic_Workflows_Clearly_Explained.md; best-practices deep-dive by Thariq Shihipar (@trq212, Anthropic Claude Code team), 2061907337154367865 — also published on the Claude Blog; trigger-word update via raw/x-bookmarks-recent-digest-2026-06-04.md (@ClaudeDevs, 2026-06-03); concurrency-cap + deep-research-verify specifics via raw/Claude_Code_Just_Dropped_Workflows_An_Actual_Game_Changer.md (2026-06-05); the Bun rewrite’s now-shipped production status, cost, and split-context review specifics via raw/reddit-1uru4zg.md (Jarred Sumner’s first-party bun.com/blog/bun-in-rust, r/ClaudeCode, 2026-07-09)

The official Anthropic announcement of dynamic workflows — the feature the wiki previously covered only via a third-party walkthrough (the /workflows Walkthrough). Claude dynamically writes orchestration scripts that run tens to hundreds of parallel subagents in a single session, checking its work (agents refute each other until answers converge) before anything reaches you. Built for the work too big for one pass — a bug hunt across a whole service, a migration touching hundreds of files, a plan stress-tested from every angle — and explicitly framed as turning quarters of work into days. Reached general availability on 2026-06-10 (announced by @claudeai during the Code with Claude Tokyo updates; previously a research preview since late May); enabled via the ultracode setting (on by default for Max/Team/API).

Key Takeaways

What it is: Claude plans dynamically from your prompt, breaks the task into subtasks, and fans out across tens-to-hundreds of parallel subagents in one session — writing the orchestration script itself rather than you authoring it. (This is the official-announcement framing of the same Workflow primitive the /workflows walkthrough dissects at the workflow.js level.)
The self-checking loop is the differentiator: agents address the problem from independent angles, other agents try to refute what they found, and the run iterates until the answers converge — “how a workflow reaches results a single pass can’t.” Adversarial verification is built into the primitive, not bolted on.
Resumable by design: progress is saved as the run goes, so an interrupted job picks up where it left off instead of restarting. Coordination happens outside the conversation, so the plan stays on track no matter how big the task gets (no orchestrator context-rot).
Built for long-running work — runs can extend into hours and days, doing complex engineering that previously took weeks.
Token cost is real: dynamic workflows consume meaningfully more usage than a typical Claude Code session. The first time a workflow triggers, Claude Code shows what’s about to run and asks you to confirm. Start on a scoped task to calibrate.
Turn on auto mode for the best experience (it’s the permission layer that lets long parallel runs proceed without per-action prompts).
Flagship proof point — the Bun Zig→Rust port (Jarred Sumner): 99.8% of the existing test suite passing, ~750,000 lines of Rust, 11 days first-commit-to-merge, with hundreds of agents in parallel and two reviewers on each file.
Two ways to start: ask Claude to create a workflow, or turn on the Claude Code setting ultracode.
Trigger-word update (2026-06-03): the explicit trigger keyword changed from “workflow” to “ultracode” — “use a workflow for this” still works when the intent is clear, but incidental mentions of “workflow” no longer kick off a dynamic workflow (a false-positive fix from user feedback). Say “ultracode” to trigger one explicitly (@ClaudeDevs, 2026-06-03).
The Claude Code team’s own best-practices guide (Thariq Shihipar, ~2026-06-02) frames workflows as the fix for three single-context failure modes — agentic laziness, self-preferential bias, goal drift — and ships a reusable six-pattern taxonomy (classify-and-act, fan-out-and-synthesize, adversarial verification, generate-and-filter, tournament, loop-until-done) plus an example-prompt library skewed toward non-coding work. See Best practices from the Claude Code team below.
Operator note — when a loop actually pays off (the “GP-loop,” 2026-06-09). [YouTube signal — Greg Isenberg pod] A bounded-loop discipline that sharpens the loop-until-done pattern above and counterweights open-ended /goal loops: a Claude Code skill runs a code-review loop gated by an external scorer — the agent checks the GitHub PR, reads the review, fixes it, re-pushes, and won’t stop until the review scores ≥4/5 (or it caps at 5 iterations, then gives up). The load-bearing heuristic: loops only pay off in a confined process with a fixed, binary feedback signal (“where the output is black-or-white with no creativity” — i.e., code review), and they break down past ~1,000 LOC per push (too much for the agent to fully review → split into multiple PRs). A concrete boundary on autonomy: the scorer and the LOC ceiling are what keep the loop from running away. (Source: raw/WTF_Is_an_AI_Agent_Loop_Genius_or_Hype.md.)
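The bounded loop in the operator note above can be sketched as follows. This is a minimal illustration, not the skill's actual code: `score_review` and `apply_fixes` are hypothetical callbacks standing in for "read the PR review" and "fix and re-push", and the three constants simply restate the thresholds quoted in the note.

```python
from dataclasses import dataclass
from typing import Callable, Optional

MAX_ITERATIONS = 5        # cap from the note: give up after 5 rounds
PASSING_SCORE = 4         # stop once the external review scores >= 4 out of 5
MAX_LOC_PER_PUSH = 1000   # past ~1,000 changed lines, split into several PRs


@dataclass
class LoopResult:
    status: str                 # "passed", "gave_up", or "split_required"
    iterations: int
    final_score: Optional[int]


def run_review_loop(
    changed_loc: int,
    score_review: Callable[[int], int],
    apply_fixes: Callable[[int], None],
) -> LoopResult:
    """Run a fix/re-push loop gated by an external scorer (hypothetical callbacks)."""
    if changed_loc > MAX_LOC_PER_PUSH:
        # Too large for the reviewer to cover fully: do not start the loop.
        return LoopResult("split_required", 0, None)

    score: Optional[int] = None
    for iteration in range(1, MAX_ITERATIONS + 1):
        score = score_review(iteration)
        if score >= PASSING_SCORE:
            return LoopResult("passed", iteration, score)
        if iteration < MAX_ITERATIONS:
            apply_fixes(iteration)  # only fix when another scoring round follows
    return LoopResult("gave_up", MAX_ITERATIONS, score)


if __name__ == "__main__":
    scripted_scores = {1: 2, 2: 3, 3: 4}  # made-up review scores for the demo
    print(run_review_loop(420, lambda i: scripted_scores[i], lambda i: None))
```

The design point is that both exits are decided outside the agent: the scorer defines "done" and the iteration cap defines "stop trying", so the loop cannot talk itself into continuing.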

How it works

Plan dynamically — Claude reads your prompt and decomposes it into subtasks on the fly (no pre-authored script required).
Fan out — work is distributed across subagents running in parallel.
Check before folding in — results are verified before they’re merged into the answer. Independent agents attack the problem from different angles; other agents try to refute their findings; the run keeps iterating until the answers converge.
Return one coordinated answer — you come back to a single result, not a pile of subagent transcripts. Because coordination lives outside the conversation, the main session’s context never fills with intermediate state (the “token tax” the walkthrough details).
Resume on interruption — progress is checkpointed continuously; an interrupted run continues rather than restarting.
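The steps above can be sketched in a few lines of Python. This is an illustration of the shape only, under stated assumptions: it is not the workflow.js API, threads stand in for subagents, and `investigate`, `refute`, and the JSON journal file are hypothetical stand-ins for the agent calls and the checkpoint store.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Callable


def run_workflow(
    subtasks: list[str],
    investigate: Callable[[str], list[str]],
    refute: Callable[[str], bool],
    journal_path: Path,
) -> list[str]:
    """Fan out over subtasks, keep unrefuted findings, checkpoint each result."""
    # Resume: load whatever an earlier, interrupted run already finished.
    journal: dict[str, list[str]] = {}
    if journal_path.exists():
        journal = json.loads(journal_path.read_text())

    pending = [task for task in subtasks if task not in journal]

    def work(task: str) -> tuple[str, list[str]]:
        candidates = investigate(task)                        # one isolated "agent"
        verified = [c for c in candidates if not refute(c)]   # a separate checker
        return task, verified

    # Fan out: pending subtasks run in parallel worker threads.
    with ThreadPoolExecutor(max_workers=8) as pool:
        for task, verified in pool.map(work, pending):
            journal[task] = verified
            journal_path.write_text(json.dumps(journal))      # checkpoint per subtask

    # Synthesize: one merged, de-duplicated answer in subtask order.
    merged: list[str] = []
    for task in subtasks:
        for finding in journal[task]:
            if finding not in merged:
                merged.append(finding)
    return merged
```

Calling `run_workflow` a second time with the same journal path skips every subtask already recorded, which is the "picks up where it left off" behavior in miniature. The caller only ever sees the merged list, never the per-agent intermediate output.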

Admins can disable workflows org-wide through managed settings.

The Bun rewrite — what it unlocks at scale

Anthropic’s headline example is Jarred Sumner’s port of Bun from Zig to Rust, run entirely through dynamic workflows (not yet in production at announcement):

One workflow mapped the correct Rust lifetime for every struct field in the Zig codebase.
The next wrote every .rs file as a behavior-identical port of its .zig counterpart — hundreds of agents in parallel, with two reviewers on each file.
A fix loop then drove the build and test suite until both ran clean.
An overnight workflow addressed unnecessary data copies and opened a PR for each for final human review.

Result: ~750,000 lines of Rust, 99.8% of the existing test suite passing, 11 days from first commit to merge. This is the same Bun rewrite referenced in The Capability Curve — now attributed concretely to dynamic workflows as the mechanism.

Update (2026-07-09) — now shipped, with cost + orchestration specifics. Jarred Sumner’s first-party writeup (bun.com/blog/bun-in-rust, surfaced via raw/reddit-1uru4zg.md) resolves the “not yet in production” caveat and adds hard numbers: the port ran on ~50 dynamic workflows kept running continuously, holding 64 Claudes in parallel (orchestration he says he’d otherwise have had to hand-build a harness for), at roughly $165k of pre-release Fable 5 usage priced at API rates, over 11 days. The review discipline is the adversarial split-context pattern stated explicitly: 1 implementer + 2+ adversarial reviewers per implementer, where the implementer never reviews and the reviewer never implements — precisely because “the Claude that wrote the code wants it merged” (the same author-merge bias humans have), so the reviewer’s only job is to refute. Now in production: Bun v1.4.0 (first Rust version, superseding Zig v1.3.14) is in canary — 128 bugs fixed, ~20% smaller binary, memory leaks eliminated — and Claude Code v2.1.181+ already runs on the Rust port (startup ~10% faster on Linux). See Fable 5’s long-horizon proof-point section for the model-capability framing.
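The split-context rule (an agent either implements or reviews, never both) is easy to state as an assignment check. The sketch below is a hypothetical illustration, not Bun's harness: the agent names, the `Assignment` type, and the round-robin scheduling are all assumptions made for the example.

```python
from dataclasses import dataclass
from itertools import cycle


@dataclass(frozen=True)
class Assignment:
    file: str
    implementer: str
    reviewers: tuple[str, ...]


def assign_roles(
    files: list[str],
    implementers: list[str],
    reviewers: list[str],
    reviewers_per_file: int = 2,
) -> list[Assignment]:
    """Give each file one implementer and N distinct reviewers from disjoint pools."""
    reviewers = list(dict.fromkeys(reviewers))  # drop duplicates, keep order
    if not implementers:
        raise ValueError("at least one implementer is required")
    if set(implementers) & set(reviewers):
        raise ValueError("an agent may implement or review, never both")
    if len(reviewers) < reviewers_per_file:
        raise ValueError("not enough distinct reviewers")

    next_implementer = cycle(implementers)
    assignments = []
    for index, file in enumerate(files):
        # Rotate through the reviewer pool so review load spreads evenly.
        start = (index * reviewers_per_file) % len(reviewers)
        picked = tuple(
            reviewers[(start + k) % len(reviewers)]
            for k in range(reviewers_per_file)
        )
        assignments.append(Assignment(file, next(next_implementer), picked))
    return assignments
```

Rejecting any overlap between the two pools up front is the whole point: the author-merge bias described above is removed structurally rather than by asking an agent to be impartial about its own code.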

Update (2026-07-16) — Anthropic’s own migration-methodology post adds token/behavior specifics, plus one figure worth flagging. Anthropic’s first-party “How Anthropic runs large-scale code migrations with Claude Code” (claude.com/blog/ai-code-migration, the newest post on the blog at publish time) frames the Bun port as one of two flagship examples for a generalized six-step migration process — see the new dedicated article How Anthropic Runs Large-Scale Code Migrations with Claude Code for the full methodology. New/reconciling specifics: exact token spend 5.9 billion uncached input tokens + 690 million output tokens ≈ $165,000 at API pricing (matches the $165k figure above, now itemized); test-suite pass rate improved to 100% passing in CI before merge (up from the 99.8% cited at the original 2026-05-28 announcement — plausibly resolved between announcement and this later post, not a contradiction); 19 regressions surfaced post-merge, all since fixed; 19% smaller binary on Linux and Windows (consistent with the “~20%” cited above, just more precise); ~4% of the Rust code sits in unsafe blocks, mostly single-line pointer operations at C/C++ boundaries; a 2,000-repeated-build memory benchmark dropped from 6,745 MB to 609 MB; and 2-5% faster across HTTP serving and workloads like next build and tsc. One figure is genuinely in tension and left unresolved: this post states “a million lines of code were produced” for the port, vs. the ~750,000 lines of Rust cited above from the original announcement and Jarred’s own July 9 blog. ^[ambiguous] A plausible reconciliation is that “produced” counts cumulative/discarded output across the stress-test and fix loops rather than final merged size, but neither source states this — the two first-party figures stand uncorrected against each other pending a future source that clarifies which (if either) is the precise final-state number.
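A few derived figures make the itemized numbers easier to compare. The inputs below are the figures quoted above; everything computed from them is back-of-envelope arithmetic, and the "blended rate" in particular is a derived average across input and output tokens, not a published price.

```python
INPUT_TOKENS = 5.9e9        # uncached input tokens (quoted above)
OUTPUT_TOKENS = 690e6       # output tokens (quoted above)
TOTAL_COST_USD = 165_000    # approximate spend at API pricing (quoted above)
DAYS = 11                   # first commit to merge (quoted above)
MEM_BEFORE_MB = 6_745       # 2,000-repeated-build benchmark, before
MEM_AFTER_MB = 609          # same benchmark, after


def derived_figures() -> dict[str, float]:
    """Back-of-envelope ratios derived from the quoted figures."""
    total_tokens = INPUT_TOKENS + OUTPUT_TOKENS
    return {
        # Average dollars per million tokens across input and output combined.
        "blended_usd_per_million_tokens": TOTAL_COST_USD / (total_tokens / 1e6),
        "usd_per_day": TOTAL_COST_USD / DAYS,
        "output_share_of_tokens": OUTPUT_TOKENS / total_tokens,
        "memory_reduction_fraction": 1 - MEM_AFTER_MB / MEM_BEFORE_MB,
        "memory_reduction_factor": MEM_BEFORE_MB / MEM_AFTER_MB,
    }


if __name__ == "__main__":
    for name, value in derived_figures().items():
        print(f"{name}: {value:,.4f}")
```

Roughly: about $25 per million tokens blended, $15,000 per day, output at about 10% of all uncached tokens, and a memory benchmark that fell by about 91% (an 11x reduction).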

Availability & getting started

Research preview, available today in: Claude Code CLI, Desktop, and the VS Code extension, plus the Claude API, Amazon Bedrock, Vertex AI, and Microsoft Foundry.
Plan gating:
- Max / Team (and Claude Code via the API): on by default.
- Enterprise: off by default at launch — admin enables in Claude Code settings.
- Enterprise admins can also disable via managed settings.
Enable / start: ask Claude to create a workflow, or turn on the ultracode setting. Pair with auto mode for best results.
Docs: code.claude.com/docs/en/workflows.

Customer use (early access)

Klarna — strongest results on discovery and review across large codebases: identifying dead code and cleanup opportunities that traditional static analysis missed, speeding maintenance/refactoring.
CyberAgent — “fills the gap between firing off a single subagent and building out a full agent team. Plan to implementation just flows, so we can trust longer runs without losing visibility.”

The two quotes bracket the sweet spot: bigger than one subagent, lighter than a standing agent team — discovery/review sweeps and plan→implementation runs.

Best practices from the Claude Code team (Thariq Shihipar, 2026-06-02)

Thariq Shihipar (@trq212) and Sid Bidasaria (@sidbid), members of technical staff at Anthropic on the Claude Code team, published a best-practices deep-dive ~a week after launch (“my initial workflows experiences and learnings”; also on the Claude Blog). It is the primary-source companion to the announcement above — the framing the team itself uses. The authors’ own caveat: “best practices are still developing! Dynamic workflows often use more tokens, so think carefully about when and how to use them.”

Why a workflow beats one big context window — three failure modes

The default harness has to plan and execute in the same context window, which is great for coding but “can break down over long-running, massively parallel and/or highly structured adversarial tasks.” The longer Claude works in one context, the more it hits three named failure modes — and a workflow combats each by “orchestrating separate Claudes with their own context windows and focused, isolated goals”:

Agentic laziness — stopping before a multi-part task is finished and declaring done after partial progress (the author’s example: “addressing 20 of the 50 items in a security review”).
Self-preferential bias — Claude’s tendency to prefer its own results/findings, especially when asked to verify or judge them against a rubric. (The fix: have a separate agent do the verification.)
Goal drift — gradual loss of fidelity to the original objective across many turns, especially after compaction — each lossy summarization can drop edge-case requirements or “don’t do X” constraints.

These are the same failure modes Anthropic documents for the model itself in the Opus 4.8 system card (lazy-investigation, overconfidence/self-preference, goal-drift). Dynamic workflows are the harness-level mitigation: isolation + focused goals + cross-agent verification, instead of trusting one long context to police itself.

Six reusable patterns

The dev-recommended vocabulary for what a workflow’s orchestration script should do:

Pattern	What it does
Classify-and-act	A classifier agent decides the task/output type and routes to specialized agents, behaviors, or final determinations.
Fan-out-and-synthesize	Split into many parallel subtasks (isolated agents); a synthesizer step then merges their structured outputs and acts as a synchronization barrier.
Adversarial verification	For each agent’s output, a separate spawned agent adversarially checks it against a rubric or criteria.
Generate-and-filter	Generate many candidates, then filter / dedupe / select by rubric, verification, or quality tests to keep only the best.
Tournament	Multiple agents attempt the same task with different approaches; a judge runs pairwise comparisons until one winner remains.
Loop until done	Re-spawn agents in a loop until a dynamic stop condition (no new findings, no remaining errors) rather than a fixed iteration count.
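Two of the patterns in the table, tournament and loop until done, reduce to small control-flow skeletons. The sketch below is illustrative only: `judge` and `step` are hypothetical callbacks standing in for agent calls, and `max_rounds` is a safety cap added for the example, not part of the pattern as described.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def tournament(candidates: list[T], judge: Callable[[T, T], T]) -> T:
    """Single elimination: judge pairs until one winner remains."""
    if not candidates:
        raise ValueError("need at least one candidate")
    remaining = list(candidates)
    while len(remaining) > 1:
        next_round = [
            judge(remaining[i], remaining[i + 1])
            for i in range(0, len(remaining) - 1, 2)
        ]
        if len(remaining) % 2 == 1:
            next_round.append(remaining[-1])  # odd one out gets a bye
        remaining = next_round
    return remaining[0]


def loop_until_done(
    step: Callable[[set[str]], set[str]],
    max_rounds: int = 20,
) -> set[str]:
    """Re-run `step` until a round produces no new findings."""
    seen: set[str] = set()
    for _ in range(max_rounds):
        new_findings = step(seen) - seen
        if not new_findings:
            break  # dynamic stop condition: nothing new this round
        seen |= new_findings
    return seen
```

In the tournament, the number of judge calls is always one fewer than the number of candidates, which is what makes pairwise comparison affordable for ranking tasks. In the loop, the stop condition depends on what the last round found rather than on a fixed count.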

Example prompts (straight from the author)

The post’s most reusable artifact is a library of natural-language prompts that trigger good workflows — and most are not coding tasks:

“This test fails maybe 1 in 50 runs. Set up a workflow to reproduce it, form theories and adversarially test them in worktrees /goal don’t stop until one theory works.”
“Go through my last 50 sessions and mine them for corrections I keep making and turn the recurring ones into CLAUDE.md rules.”
“Dig through incidents in Slack for the past six months and find recurring root causes where nobody has filed a ticket.”
“Take my business plan and run a workflow where different agents tear it apart from an investor’s, a customer’s, and a competitor’s perspective.”
“Here’s a folder of 80 resumes, rank them for the backend role and double-check the top ten. Interview me using the AskUserQuestion tool for a rubric.”
“I need a name for this CLI tool. Brainstorm a bunch of options and run a tournament to pick the top 3.”
“Rename our User model to Account everywhere.”
“Go through my blog post draft and verify every technical claim against the codebase — I don’t want to ship anything wrong.”

“Often even more useful for non-technical work”

Thariq’s explicit claim: “workflows are sometimes even more useful for non-technical work.” The use-case catalog spans migrations/refactors (the Bun port above), deep research (the built-in /deep-research skill), deep verification, sorting (e.g. support tickets by severity via tournament/pairwise comparison), memory & rule adherence (mining sessions into CLAUDE.md rules), root-cause investigation (independent hypotheses to defeat self-preferential bias), triaging at scale (quarantine patterns), exploration & taste (design, naming, rubrics), evals, and model/intelligence routing (the workflow picks Haiku vs Opus per agent).

When NOT to use one

Workflows are new… they are not needed for every task and may end up using significantly more tokens. For regular coding tasks, try and ask yourself: does it really need more compute?

Tips

Prompting: lean on the named patterns; the ultracode trigger lets Claude decide when to spin one up; say “quick workflow” for a lighter run.
Combine with /goal and /loop for autonomous, repeat-until-done behavior.
Token budgets: cap spend by prompting a budget inline — “use 10k tokens” sets the cap. (First concrete budget-control syntax beyond the walkthrough’s budgets knob.)
Saving & sharing: press s in the workflow menu to save; check the .js into ~/.claude/workflows, or distribute via a skill — put the workflow files in the skill folder and reference them in SKILL.md. Prompt Claude to treat a shared workflow as a template, not a verbatim script, for flexibility.

Hands-on demo (operator field test — 2026-05-29)

A RoboNuggets hands-on run of dynamic workflows + ultracode on Max, confirming and quantifying the announcement claims:

ultracode defined in practice: a Claude-Code effort setting = extra-high effort + Claude decides on its own whether to invoke a dynamic workflow. It sits on the effort slider after max; the VS Code extension turns the toggle purple, and typing workflows in the terminal triggers a bespoke animation (Anthropic coded a custom UI cue for the release).
/workflows is a live status monitor: mid-run it shows the orchestrator’s drafted multi-phase plan (e.g. Phase 1 audit / Phase 2 planning / synthesis), per-agent completion state, and per-agent token consumption — the way to observe a long-running job’s progress.
Observed fan-out scale, real numbers: a 3-site brand-audit fanned out 9 audit agents → 13 live-fetch agents; an open-ended bug audit under ultracode did a pre-assessment as a sole agent first, then fanned out 8 parallel auditors, then 88 parallel sub-agents for the verification step — 96 total sub-agents for one bug report. At 96, the run lands just under the hundred mark, so it corroborates the low end of the announcement’s “tens to hundreds” range rather than literally demonstrating hundreds; the orchestrator’s own words: “orchestrate a fan-out audit with adversarial per-finding verification.”
Token cost, measured: the two heavy runs moved the account’s weekly rate-limit usage from 2% → 6% (≈4 percentage points for two tasks) — concrete backing for the “meaningfully more usage” warning. ^[inferred — operator opinion that Anthropic should show absolute token counts, not a percentage]
Orchestrator behavior: acts as a manager — “using the wait productively, pre-building the report generator so the moment data lands it can produce deliverables fast.” Output was technically rich but “vanilla white-paper”; a second design-system pass was needed to make it presentable (a workflow produces the analysis, not the polish).
Field thesis: “Opus 4.8 is a great incremental release, but the way we work is dictated more by updates to the harness” — ultracode + dynamic workflows are the real unlock, not the benchmarks. ^[inferred — creator’s editorial thesis]

Second operator walkthrough (2026-05-31) — three concrete deltas

A second hands-on explainer (skill-audit run: 41 Haiku scoring agents → one Opus synthesis agent, ~5M input tokens, HTML “worst-to-best” skills ranking) surfaces operational specifics the announcement and the RoboNuggets run don’t:

/deep-research is a built-in workflow-backed command. It automatically invokes a dynamic workflow — spins up parallel research agents, has them vote on each claim, and returns a cited deep-research report. The first concrete named command that triggers a workflow on its own (distinct from typing “set me up a dynamic workflow”). ^[inferred — single-creator claim, not in the Anthropic announcement]
Workflow .js files save to a global location by default — redirect them explicitly. The generated script lands in the global Claude Code working directory, not the current project; the operator had to tell it to save into the project’s .claude/workflows/ to keep the reusable workflow with the repo. Non-obvious gotcha for anyone expecting the <name>.js path inside the project. ^[inferred — single-creator observation]
Pin all subagents to Haiku as the primary cost lever. Beyond “start scoped,” the explicit cost discipline is bound the scope, name the deliverable, and put every subagent on Haiku — most workflow spend is input tokens (cheaper than output), so a Haiku fan-out keeps a hundred-agent run affordable. Backs the “half my $200/month plan in one ~30-min prompt” anecdote from an unbounded desktop-wide crawl. ^[inferred — single-creator cost framing]
/goal vs workflow = depth vs width. The creator frames /goal as a loop (re-runs until done == true, can run 24h+) and a workflow as a width play (N agents fan out, each executes a fixed plan slice, results synthesize once — no per-iteration convergence check). Note this is a usage heuristic, not the official line: Anthropic’s announcement explicitly describes workflows as self-checking (agents refute findings until answers converge), so the “no convergence loop” framing is the creator’s simplification, not a contradiction of the primitive. ^[inferred — creator’s mental model; reconciled against the official self-checking claim above]

Third operator walkthrough (2026-06-05) — concurrency caps + deep-research verify internals

A third hands-on explainer (“Claude Code Just Dropped Workflows”) retreads the now-covered ground (workflow.js orchestrator, journal/resumability, ultracode auto-trigger, Max/Team-on vs Pro/Enterprise-off, model-per-phase) but adds two falsifiable operational specifics absent from the announcement and the prior walkthroughs:

Hard concurrency caps: 16 agents concurrent (max), 1,000 agents total per run. This bounds the “tens-to-hundreds in parallel” / “96 total sub-agents observed” framing above with an actual ceiling — excess agents queue and run as slots free; the 1,000-total is a per-run lifetime cap. ^[inferred — single unnamed creator; aligns with Anthropic’s documented Workflow limits but confirm against code.claude.com/docs/en/workflows before relying]
The /deep-research verify stage, quantified: after fanning out parallel searches and deduping, it verifies the top ~25 claims with 3 independent verify agents each, and a claim is killed before synthesis if 2 of its 3 verifiers refute it. This is the concrete adversarial-verification mechanic behind the “agents vote on each claim” note above. ^[inferred — single-creator detail]
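Both specifics above map onto familiar primitives: a semaphore for the concurrency ceiling and a majority vote for the verify stage. The sketch below is a minimal illustration under those assumptions. The constants restate the creator-reported caps, `run_capped` and `claim_survives` are hypothetical names, and nothing here reflects how Claude Code actually schedules agents.

```python
import asyncio
from typing import Awaitable, Callable

MAX_CONCURRENT = 16   # reported concurrent-agent cap (single-creator claim)
MAX_TOTAL = 1_000     # reported per-run lifetime cap (single-creator claim)


async def run_capped(
    jobs: list[Callable[[], Awaitable[object]]],
    max_concurrent: int = MAX_CONCURRENT,
    max_total: int = MAX_TOTAL,
) -> tuple[list[object], int]:
    """Run all jobs, never more than max_concurrent at once; return results and peak."""
    if len(jobs) > max_total:
        raise ValueError("run exceeds the total-agent cap")
    slots = asyncio.Semaphore(max_concurrent)
    running = 0
    peak = 0

    async def guarded(job: Callable[[], Awaitable[object]]) -> object:
        nonlocal running, peak
        async with slots:              # excess jobs queue here until a slot frees
            running += 1
            peak = max(peak, running)
            try:
                return await job()
            finally:
                running -= 1

    results = await asyncio.gather(*(guarded(job) for job in jobs))
    return list(results), peak


def claim_survives(refute_votes: list[bool]) -> bool:
    """A claim is dropped when at least 2 of its verifiers refute it."""
    return sum(refute_votes) < 2
```

With 40 queued jobs, the observed peak stays at 16 while all 40 results still come back in submission order, which is the "excess agents queue and run as slots free" behavior described above.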

Why it matters

It closes the “official source” gap the walkthrough flagged in its Open Questions — the feature is now formally announced, named (dynamic workflows), and given a stable enablement path (ultracode).
Adversarial convergence is now a first-class primitive. “Agents try to refute what they found until answers converge” is the same multi-angle/verify pattern the wiki documents in verification-loop autonomy and the Ara workshop — here it’s baked into the orchestrator.
Resumability + out-of-conversation coordination is what makes hours/days-long runs viable — it removes the orchestrator-context-rot ceiling that capped model-as-orchestrator patterns.
Fits the long-running-agent design space alongside /goal (single-target autonomous loop) and agent teams (persistent multi-instance coordination). Dynamic workflows are the structured, dynamically-planned, self-verifying fan-out in the middle.
The token-cost warning is load-bearing — this is a power tool that bills like one; the confirm-before-first-run gate and “start scoped” guidance are the official discipline.

Try It

Confirm access: on Max/Team or via the API, dynamic workflows are on by default — turn on ultracode (or just ask Claude to create a workflow). On Enterprise, ask your admin to enable them.
Turn on auto mode before starting a long run.
Start scoped to calibrate token usage — a single-service bug hunt or a dead-code sweep (Klarna’s use case) is the canonical first run. Expect the confirm-before-run prompt on the first trigger.
Use it for the right shape of work: big migrations, whole-codebase discovery/review, or a plan you want stress-tested from every angle — not one-off single-pass tasks.
For the mechanics (authoring workflow.js, the agent/pipeline/schema primitives, budgets, per-subagent skip/retry), read the /workflows walkthrough — it’s the hands-on companion to this announcement.
Read the official docs at code.claude.com/docs/en/workflows for the current (GA) surface.

Refresh — Boris Cherny’s autonomous-Opus playbook (first-party, 2026-06)

[X signal — @bcherny 2026-06] Two first-party posts from Claude Code’s creator frame dynamic workflows inside the broader “run Opus autonomously for hours/days” playbook. Five tips (2063792263067754658, opening “Opus is the best model for long-running work”): (1) auto mode for permissions so Claude doesn’t stop to ask; (2) dynamic workflows to orchestrate hundreds/thousands of agents on one task; (3) /goal or /loop to nudge Claude to keep going until done; (4) Claude Code in the cloud (desktop/mobile app) so you can close your laptop; (5) give Claude a way to self-verify end-to-end — Claude in Chrome for web, iOS/Android simulator MCP for mobile, or a way to start the full web server/service for backend work. Tip 5 (the self-verification harness) is the most additive over this article’s existing coverage. Separately, his “Claude Code, one year after GA” reflection (2064034799711588805; echoed by @ClaudeDevs) reports the usage shift toward auto mode, routines that proactively fix bugs before the user sees them, coding from a phone, verification best practices, and loops — linking a @bcherny × @_catwu video — now transcribed and articled → Reflecting on a Year of Claude Code.

Refresh — Dynamic workflows reach GA (first-party, 2026-06-10)

[X signal — @claudeai 2026-06-10] Anthropic confirmed dynamic workflows in Claude Code are now generally available (announced alongside the Code with Claude Tokyo updates), resolving this article’s prior “research preview” status. The same announcement paired it with two Claude Managed Agents features (public beta): scheduled deployments (agents run on timers — nightly data syncs, weekly scans) and environment variable vaults (tools receive placeholders; real keys are injected only at the network boundary for allowed domains, so the model never sees secrets — the productized form of the “secret-injection mechanism” noted in Code with Claude Tokyo / Week 25). Source: raw/x-account-claudeai-2064741174317924421.md.
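The vault idea (tools hold placeholders, real keys appear only at the network boundary for allowed domains) can be illustrated with a small substitution function. This is a hypothetical sketch of the concept, not the Managed Agents implementation: the `{{vault:NAME}}` placeholder syntax, the function name, and the data shapes are all assumptions.

```python
import re
from urllib.parse import urlsplit

# Hypothetical placeholder syntax; the real format is not stated in the source.
PLACEHOLDER = re.compile(r"\{\{vault:([A-Z0-9_]+)\}\}")


def inject_at_boundary(
    url: str,
    headers: dict[str, str],
    vault: dict[str, str],
    allowed_hosts: dict[str, set[str]],
) -> dict[str, str]:
    """Swap placeholders for real secrets, but only for allow-listed destinations."""
    host = urlsplit(url).hostname or ""

    def resolve(match: re.Match) -> str:
        name = match.group(1)
        if host not in allowed_hosts.get(name, set()):
            # The secret never leaves the vault for an unlisted host.
            raise PermissionError(f"{name} may not be sent to {host!r}")
        return vault[name]

    return {key: PLACEHOLDER.sub(resolve, value) for key, value in headers.items()}
```

Everything upstream of this function, including the model and the tool code, only ever handles the placeholder string, so a leaked transcript or a prompt-injected request to an unlisted host exposes nothing usable.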

Related

/workflows Walkthrough — the hands-on deep-dive (workflow.js anatomy, agent/pipeline/schema, budgets, 3 worked examples). This article is the official announcement; that one is the mechanics.
/goal Walkthrough — the autonomous-loop primitive; the sibling long-running-agent design.
Claude Code Agent Teams — persistent multi-instance coordination; dynamic workflows sit between a single subagent and a full agent team (the CyberAgent framing).
The Capability Curve — context for the Bun Zig→Rust rewrite as a capability proof point.
When AI Builds Itself — Recursive Self-Improvement — the structural “why” behind the harness race: Anthropic’s internal data on AI accelerating AI development (8× code/quarter, 76% open-ended task success), with dynamic workflows as the mechanism that turns “quarters of work into days.”
Claude Opus 4.8 + System Card — the model-level failure modes (lazy-investigation, self-preference, goal-drift) that Thariq’s workflow rationale mirrors at the harness level.
How We Claude Code (Ara) — agent-native verification practices that complement workflow self-checking.
Week 23 Release Digest — the v2.1.154 release that flipped dynamic workflows on in the CLI (ultracode).
Claude Security Plugin — a shipped first-party consumer of this primitive: the scan is a dynamic workflow, which is why the plugin’s floor is v2.1.154 on a paid plan.
Week 22 Release Digest — surrounding Claude Code release context.
CLI Reference — auto-mode and related surfaces dynamic workflows compose with.
Loop Engineering (Cobus Greyling) — the tool-agnostic pattern reference that maps these Claude Code loop primitives (/loop, /goal, scheduled tasks) onto Grok / Codex / GitHub Actions, with a readiness ladder and failure-mode catalog.
Loop Engineering: Getting Started with Loops — Anthropic’s own post naming dynamic workflows as the mechanism a “proactive loop” composes in, alongside /goal and /schedule, for its parallel-worktree-plus-judge example.
The Loop Is the Unit of Work — the synthesis: dynamic workflows are the Claude-native “orchestrate many” rung of the prompt → harness → loop ladder, with verification as the gate.
How Anthropic Runs Large-Scale Code Migrations with Claude Code — Anthropic’s own generalized six-step methodology, using this article’s Bun port (plus a second example, Mike Krieger’s Python→TypeScript migration) as its two flagship case studies.

Open Questions

ultracode vs CLAUDE_CODE_WORKFLOWS=1. The walkthrough cited the env-var enablement (v2.1.147 era); the official post names the ultracode setting and “on by default for Max/Team/API.” Update (W23): Claude Code v2.1.154 shipped ultracode as the live effort-slider setting (= xhigh + Claude auto-deciding whether to invoke a workflow), and the RoboNuggets demo above confirms its in-CLI behavior. The relationship to the older CLAUDE_CODE_WORKFLOWS=1 env var (deprecated / aliased / distinct) is still unconfirmed.
Exact token/usage accounting. “Meaningfully more usage” and the confirm-before-run gate are stated, but no per-workflow metering detail or budget-unit definition is given. Update (2026-06-02): Thariq’s post documents an inline budget prompt — “use 10k tokens” sets a hard cap — the first concrete user-facing control beyond the walkthrough’s budgets knob. Update (2026-06-10): Claude Code’s /usage view now shows a per-component token breakdown — which skills, MCP servers, and plugins are consuming tokens (per @bcherny, who notes badly-behaved plugins are the usual culprit; raw/x-account-bcherny-2064507661845246349.md). Component-level visibility exists; per-workflow metering/reporting detail is still unspecified.
~~Bun port production status~~ (resolved 2026-07-09): explicitly “not yet in production” at announcement; Jarred Sumner’s writeup has since published, and Bun v1.4.0, the first Rust version, is in canary (see the 2026-07-09 update in the Bun rewrite section above).
~~Research-preview → GA timeline~~ — resolved: GA on 2026-06-10 (see the Refresh section above). Whether Pro plans get access is still not stated.

Jonathon's AI Wiki
