Claude Code `/workflows` — Deterministic Multi-Agent Orchestration Walkthrough

Source: raw/Anthropic_Just_Dropped_the_Update_Everyone_s_Been_Waiting_For.md — YouTube transcript (video ID c0gVowvMR-g, fetched 2026-05-25 via inbox-refresh wiki-inbox playlist). Creator unattributed in the transcript itself; channel references “my newsletter linked below” and “my workflow creator skill on GitHub” but does not state the channel name in body. Pairs with the digest-level coverage in Week 22 Release Digest which names /workflows as shipped in Claude Code v2.1.147 (off by default; enabled via CLAUDE_CODE_WORKFLOWS=1 env var).

Worked-example walkthrough of Claude Code’s new Workflow tool — the deterministic multi-agent orchestration primitive that replaces the previous model-as-orchestrator pattern with a code-as-orchestrator pattern. Workflows are defined in JavaScript files at .claude/workflows/<name>.js. Each workflow declares phases, schemas, and agents; sub-agent outputs flow phase-to-phase without re-entering the main session’s context. This kills the token tax, removes the orchestrator’s “gets sloppier the longer it runs” failure mode, and unlocks conditionals, loops, and budgets on the workflow itself (real JavaScript) without involving a model in the control-flow decision.

Key Takeaways

/workflows is a deterministic multi-agent orchestrator. Each workflow is a JavaScript file (the source calls it workflow.js) stored at .claude/workflows/<name>.js. The orchestrator is code, not a model. Sub-agent outputs pass directly from phase to phase without round-tripping through the main session’s model context.
The killer feature is killing the token tax. In the prior model-as-orchestrator pattern, every sub-agent result went back through the main session before being passed to the next sub-agent. Run 10-15 sub-agents and the main session’s context fills with intermediate results — and the orchestrator gets sloppier as its window fills. With workflow.js, intermediate results never enter the main session.
Off by default; opt-in. CLAUDE_CODE_WORKFLOWS=1 enables the tool. Then /workflows lists running + completed workflows in the current session; you press enter on any of them to see per-stage detail, tool calls, and tokens-per-agent.
Three primitives: agent (one fresh sub-agent per call), pipeline (streams items through stages: the next stage starts on each item as soon as that item exits the prior stage, not when all items finish), and schema (structured-output contract returned to the next phase).
Plus four behavioral knobs: phaseLog (live view of what’s happening), arguments (pass values into the workflow from the slash command, e.g., min_users 20), budgets (hard token cap to prevent runaway loops, e.g., while (budget.remaining > 50000) { … }), and automatic per-sub-agent retry (up to 3 times on failure; you can also press X to skip a sub-agent, or retry it manually).
Conditionals and loops are real JavaScript. Want to spawn a verifier sub-agent only if the review failed? if (review.issues) { fix(); verify(); }. Want to loop until no more dead code is found, capped at 8 rounds? while (round++ < 8 && dead.length) { … }. The control flow is code — no model latency or non-determinism in the orchestration decisions.
Three worked examples in the source. (1) Triage Sentry — pulls unresolved issues via Sentry MCP, filters to those affecting more than min_users users (default 20), fixes each, verifies each fix. 7 sub-agents, ~400K tokens. (2) Dead code sweep — while loop up to 8 rounds: find unused code → list per schema → if any, remove one by one; exit early when none found. (3) Personalized outreach — load leads from CSV → 8 parallel research sub-agents (one per lead) → pipeline streams each researched lead into a message-writer sub-agent that starts as soon as the first research completes. Demonstrates the pipeline-vs-batch distinction.
Background workflows + multi-workflow concurrency. Workflows run as background tasks in the session. You can keep chatting with Claude while a workflow runs; you can also trigger another workflow in parallel. Pause/resume with P. Per-sub-agent skip with X. The session is not blocked.
Bare /workflows lists past + running. No-argument invocation shows the workflow history dashboard — elapsed time, turn count, tokens used per workflow.
Per-agent model selection. Each agent can run on a different model. The default is the main session’s model (Opus 4.7 in the demo); you can specify highQ (Opus-tier) for expensive steps or a cheaper tier for lighter ones. Pattern: run the cheap research model first, and escalate to the expensive one only in a conditional fallback.
There’s a community workflow-creator skill that teaches Claude Code how to author workflow.js files (creator references their GitHub; URL not in transcript). Expect an official Anthropic workflow-creator skill once the feature is announced — until then, the community version is the path. See Open Questions for the attribution gap.

What workflows solve — the token tax

Before /workflows, the canonical multi-step pattern was:

Main session orchestrates: it tells sub-agent #1 to do step A.
Sub-agent #1 returns its result to the main session.
Main session reads the result, decides what to do next, tells sub-agent #2 to do step B with the result.
Sub-agent #2 returns to the main session.
… repeat for sub-agents #3 through n.

Every result passes through the orchestrator’s context window. That’s the token tax: each intermediate result enters main-session context twice (once when received, once when passed on). With 10-15 sub-agents in a workflow, the main session’s window fills with intermediate state it doesn’t need long-term — and the orchestrator’s own quality degrades as its window fills (same context-rot mechanism as troubleshooting drift).
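
As a rough illustration of that arithmetic, the sketch below counts the tokens that intermediate results add to the main session under each pattern. The helper name, the 5,000-token result size, and the choice of 12 sub-agents are illustrative assumptions, not figures from the source. The only things taken from the text are that each result is counted twice under model-as-orchestrator and not at all under a workflow.

```javascript
// Hypothetical helper: estimates main-session tokens consumed by intermediate results.
// Assumes every sub-agent returns roughly the same number of tokens (a simplification).
function mainSessionTokens(subAgents, tokensPerResult, pattern) {
  if (pattern === "model-as-orchestrator") {
    // Each result enters the main context twice: once received, once passed on.
    return subAgents * tokensPerResult * 2;
  }
  // Code-as-orchestrator: results flow phase-to-phase and never enter the main session.
  return 0;
}

const modelCost = mainSessionTokens(12, 5000, "model-as-orchestrator");
const codeCost = mainSessionTokens(12, 5000, "code-as-orchestrator");

console.log(modelCost); // 120000
console.log(codeCost);  // 0
```

Real results vary in size, so treat the output as an order-of-magnitude estimate rather than a measurement.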

Two other failures stack on top:

No visibility. A workflow can run for 20-30 minutes producing a wall of scrolling text. The operator has no idea which sub-agent is currently active or what state the workflow is in.
Conditionals get worse with drift. The longer the orchestrator runs, the more likely it skips a “spawn verifier only if review failed” branch — because by the time it gets there, the main context has accumulated enough noise that the conditional check itself becomes unreliable.

The /workflows solution: make the orchestrator code, not a model. A workflow.js file passes sub-agent outputs directly to the next sub-agent’s prompt without involving the main session. The orchestrator becomes a function call graph evaluated by code; the model only does what models are good at — judgment inside each agent call.

The three primitives — agent / pipeline / schema

Primitive	Purpose	Example
`agent`	Spawn one fresh sub-agent. Each call gets a clean context. Can run multiple in parallel by calling `agent` from inside `Promise.all`.	`agent({ prompt: "Review this code", schema: VerdictSchema })`
`pipeline`	Stream items through stages. As soon as one item exits stage 1, stage 2 starts on it — without waiting for the other items to finish stage 1.	`pipeline([researchAgent, writeAgent], leads)`
`schema`	Structured-output contract returned by an agent. Next phase references the schema fields by name.	`const Issues = schema({ id, title, user_count })`

Plus four behavioral knobs:

Knob	Purpose
`phaseLog`	Live view of what’s happening — appears in the `/workflows` UI.
`arguments`	Values passed in from the slash command (`/triage-sentry min_users 20`). Defaults declared in the workflow’s meta.
`budgets`	Token / cost ceiling. Example: `while (budget.remaining > 50000) { fix_bug() }` prevents runaway loops.
Automatic retry	Failed sub-agents (e.g., MCP server timeout) auto-retry up to 3 times. Operators can also press `X` to skip or manually retry.
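
To make the budget knob concrete, here is a minimal sketch of a budget-guarded loop that runs outside Claude Code. The source shows only `budget.remaining`; the plain object standing in for the runtime budget, the `fixBug` stub, and the 40,000-token cost per fix are all assumptions made for illustration.

```javascript
// Stand-in for the runtime's budget object; only `remaining` appears in the source.
const budget = { remaining: 200000 };

// Hypothetical fix step: pretends each fix costs a fixed number of tokens.
function fixBug(costPerFix) {
  budget.remaining -= costPerFix;
}

let fixesRun = 0;
// The guard is ordinary JavaScript, so the loop stops once the reserve is reached.
while (budget.remaining > 50000) {
  fixBug(40000);
  fixesRun += 1;
}

console.log(fixesRun);         // 4
console.log(budget.remaining); // 40000
```

The check runs before each iteration, so the final iteration can dip below the threshold. Pick a threshold that leaves room for one more step.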

Anatomy of a `workflow.js` file

The source walks through manual authoring (recommending you have Claude Code itself author the file in practice). The shape:

// .claude/workflows/triage-sentry.js
 
export const meta = {
  name: "triage-sentry",
  description: "Triage and fix Sentry issues affecting >N users",
  phases: ["load_issues", "fix", "verify"],
};
 
export const args = {
  min_users: { default: 20 },  // /triage-sentry min_users 30 overrides
};
 
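// schema() is assumed to be supplied by the workflow runtime; the source does not show an import.
// Issues describes one issue; load_issues is expected to return an array of these.
const Issues = schema({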
  id: "string",
  title: "string",
  user_count: "number",
});
 
const Verdict = schema({
  fixed: "boolean",
  notes: "string",
});
 
export default async function workflow({ agent, pipeline, args, phaseLog }) {
  // Phase 1: pull issues
  const issues = await agent({
    name: "load_issues",
    prompt: "Use Sentry MCP to list unresolved issues. Return id, title, user_count for each.",
    schema: Issues,
  });
 
  // Plain JS filter — no model in the loop
  const big = issues.filter(i => i.user_count > args.min_users);
  phaseLog(`${big.length} issues above threshold of ${args.min_users} users`);
 
  if (big.length === 0) {
    return { fixed: 0, notes: "No issues above threshold." };
  }
 
  // Phase 2 + 3: pipeline through fix → verify
  const results = await pipeline([
    async (issue) => agent({
      name: "fix",
      prompt: `Investigate and fix Sentry issue ${issue.id}: ${issue.title}. Affected ${issue.user_count} users.`,
      schema: Verdict,
    }),
    async (verdict) => agent({
      name: "verify",
      prompt: `Verify the fix described: ${verdict.notes}. Confirm it actually resolves the issue.`,
      schema: Verdict,
    }),
  ], big);
 
  return {
    found: big.length,
    fixed: results.filter(r => r.fixed).length,
    details: results,
  };
}

Everything else is plain JavaScript: variables, conditionals, filters. agent, pipeline, and schema (plus the phaseLog helper) are the only workflow-specific pieces; the rest is the host language.

Three worked examples

(1) Triage Sentry

Pulls unresolved Sentry issues via Sentry MCP, filters by the min_users threshold (default 20), spawns a fix sub-agent per issue, then a verify sub-agent per fix. In the demo run: 25 unresolved issues found, 3 above the 20-user threshold, 3 fix sub-agents in parallel, 3 verify sub-agents downstream. Total: 7 sub-agents (1 loader + 3 fix + 3 verify), ~400K tokens.

The non-obvious detail: the workflow calls phaseLog("3 issues above threshold") before the fix phase starts, so the operator sees the gating decision before any expensive work happens.
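
The gating step and the sub-agent count from the demo can be reproduced offline. In the sketch below the 25 issues and their user counts are made up; only the shape (25 found, 3 above a 20-user threshold, 7 sub-agents in total) comes from the source.

```javascript
// Hypothetical stand-in for the load_issues result: 25 issues with made-up user counts.
const issues = Array.from({ length: 25 }, (_, i) => ({
  id: `ISSUE-${i + 1}`,
  title: `Sample issue ${i + 1}`,
  // Three large issues (100, 200, 300 users); the rest affect 0-9 users.
  user_count: i < 3 ? 100 * (i + 1) : i % 10,
}));

const minUsers = 20;
// Plain JS filter: this is the gating decision, made without a model call.
const big = issues.filter((issue) => issue.user_count > minUsers);

// One loader, then one fix and one verify sub-agent per issue above the threshold.
const subAgents = 1 + big.length + big.length;

console.log(`${big.length} issues above threshold`); // 3 issues above threshold
console.log(subAgents);                               // 7
```

Because the filter is ordinary code, the count of expensive sub-agents is known before any of them is spawned.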

(2) Dead code sweep

while loop up to 8 rounds:

let round = 0;
while (round++ < 8) {
  const dead = await agent({
    name: `scan_round_${round}`,
    prompt: "Find unused functions / unused imports / unreachable branches in the codebase.",
    schema: DeadCodeSchema,
  });
  if (dead.length === 0) break;
  for (const item of dead) {
    await agent({
      name: `remove_${item.id}`,
      prompt: `Remove dead code at ${item.path}:${item.line}. Verify no callers exist.`,
    });
  }
}

The loop exits early when a round finds no dead code. The cap of 8 is a safety budget. Per-item removal sub-agents run sequentially (each verifies-no-callers before deletion). Use Promise.all over dead if you want parallel removal — but the source shows sequential because removals can interact (one removal may invalidate another).
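
The loop's two exit conditions (a clean round, or the cap of 8) can be tried without any model by injecting stub functions. This is a sketch under stated assumptions: `sweep`, `fakeScan`, and `fakeRemove` are hypothetical names, and in a real workflow the scan and remove steps would be agent calls.

```javascript
// Runs the sweep with injected scan/remove functions so the control flow can be tried offline.
async function sweep(scan, remove, maxRounds = 8) {
  let roundsRun = 0;
  let removed = 0;
  for (let round = 1; round <= maxRounds; round++) {
    roundsRun = round;
    const dead = await scan(round);
    if (dead.length === 0) break; // early exit: a clean round ends the sweep
    for (const item of dead) {
      await remove(item); // sequential, because removals can interact
      removed += 1;
    }
  }
  return { roundsRun, removed };
}

// Stub scanner: two rounds find dead code, the third comes back clean.
const findings = [["a.js:10", "b.js:4"], ["c.js:7"], []];
const fakeScan = async (round) => findings[round - 1] ?? [];
const fakeRemove = async (item) => item;

sweep(fakeScan, fakeRemove).then((result) => console.log(result));
// { roundsRun: 3, removed: 3 }
```

A scanner that never comes back clean stops after 8 rounds, which is the safety cap doing its job.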

(3) Personalized lead outreach

Demonstrates pipeline’s streaming semantics — the source explicitly contrasts this with a batched approach:

const leads = await loadLeadsFromCSV(args.leads_file);  // 8 leads
 
await pipeline([
  async (lead) => agent({
    name: "research",
    prompt: `Research this lead: ${lead.name} at ${lead.company}. Find recent news, role detail, mutual connections.`,
    schema: LeadIntelSchema,
  }),
  async (intel) => agent({
    name: "write",
    prompt: `Write a personalized outreach message to ${intel.name} based on this intel: ${JSON.stringify(intel)}.`,
    schema: MessageSchema,
  }),
], leads);

Result: 8 research sub-agents spawn in parallel. As each one finishes, the corresponding write sub-agent starts immediately on the streamed-out intel — not after all 8 finish. That’s pipeline-streaming vs batched-fan-out. In the demo, the time-to-first-message was bounded by one research call, not by the slowest of eight.
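
The difference between streaming and batched fan-out is easiest to see in a toy implementation. The two functions below are not the Claude Code runtime; they are minimal stand-ins written for this comparison, with hypothetical names, fake stages, and timer delays in place of real sub-agents.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Streaming: each item walks through all stages on its own, so a fast item
// reaches stage 2 while slower items are still in stage 1.
async function pipelineSketch(stages, items) {
  return Promise.all(items.map(async (item) => {
    let value = item;
    for (const stage of stages) value = await stage(value);
    return value;
  }));
}

// Batched fan-out: stage 2 starts only after every item has finished stage 1.
async function batchedSketch(stages, items) {
  let values = items;
  for (const stage of stages) values = await Promise.all(values.map((v) => stage(v)));
  return values;
}

// Hypothetical stages that record when research finishes and when writing starts.
function makeStages(log) {
  const research = async (lead) => {
    await sleep(lead.researchMs);
    log.push(`research-done:${lead.name}`);
    return lead;
  };
  const write = async (lead) => {
    log.push(`write-start:${lead.name}`);
    await sleep(5);
    return `message for ${lead.name}`;
  };
  return [research, write];
}

const leads = [{ name: "fast", researchMs: 10 }, { name: "slow", researchMs: 80 }];

async function compare() {
  const streamed = [];
  const batched = [];
  await pipelineSketch(makeStages(streamed), leads);
  await batchedSketch(makeStages(batched), leads);
  return { streamed, batched };
}

compare().then(({ streamed, batched }) => {
  console.log(streamed); // write-start:fast appears before research-done:slow
  console.log(batched);  // both research-done entries appear before any write-start
});
```

In the streamed log the first message starts after the fastest research call; in the batched log it waits for the slowest one.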

Per-lead model escalation is also shown: research starts on a cheaper tier, escalates to highQ (Opus 4.7) only if the cheaper tier couldn’t find the contact details. That conditional is plain JavaScript.
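
A minimal sketch of that escalation branch, assuming the agent call accepts a `model` option (the source names highQ but does not show the option key). The "cheap" tier name, the `contact` field, and the stub agent are invented for illustration.

```javascript
// Research one lead on a cheaper tier first, escalating only when contact details are missing.
// `agent` is injected so the branch can be exercised with a stub.
async function researchLead(agent, lead) {
  const first = await agent({ name: "research", model: "cheap", prompt: `Research ${lead.name}` });
  if (first.contact) return first; // the cheap tier was enough
  // Plain JavaScript decides the escalation; no model is asked whether to retry.
  return agent({ name: "research_escalated", model: "highQ", prompt: `Research ${lead.name}` });
}

// Stub agent: the cheap tier only finds contact details for the lead named Ada.
const stubAgent = async ({ model, prompt }) => {
  const found = model === "highQ" || prompt.includes("Ada");
  return { model, contact: found ? "found@example.com" : null };
};

researchLead(stubAgent, { name: "Ada" }).then((r) => console.log(r.model));   // cheap
researchLead(stubAgent, { name: "Grace" }).then((r) => console.log(r.model)); // highQ
```

The expensive tier is only paid for when the cheap attempt comes back without the field the next stage needs.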

Live UX — background workflows + multi-workflow concurrency

/workflows opens the workflow dashboard for the current session:

Live view per workflow — name, current phase, sub-agent count (running / completed), tokens used.
Press enter on any workflow to drill into per-stage detail: tool calls, prompts, returned schemas, errors.
Pause / resume with P.
Per-sub-agent X / R — skip a sub-agent or retry it manually (in addition to the 3-try auto-retry).
Concurrency — multiple workflows can run simultaneously in the same session. The main session is not blocked; you can keep chatting with Claude or trigger more workflows while existing ones run.
Background status — long-running workflows tag themselves as background tasks (like backgrounded subagents). You can leave them running and run claude agents --json from another terminal to see live state (per the v2.1.145 surface in Week 22 Digest).
History persistence — past workflows from the session stay in the /workflows history view; navigate any past run to see the per-stage execution tree.

Why this matters operationally

Long-running workflows don’t degrade. Because the orchestrator is code, “the workflow gets sloppier after 15 sub-agents” stops being a thing. You can chain 20, 50, 100 sub-agents without main-session context filling — the long-running-agent ceiling shifts from “tokens in the orchestrator window” to “tokens per sub-agent + total session budget.”
Conditionals and loops become reliable. A model-as-orchestrator might skip a conditional after enough context drift. A JavaScript if doesn’t drift. This is the same shift the `/goal` walkthrough flags at a different level: /goal lets you write a verifiable end-condition; /workflows lets you write the verifiable control flow that gets there.
Token-budget control finally exists. A guard such as budget.remaining > 50000 is a real hard cap, enforced by the workflow runtime. The model can’t “decide to keep going” past the budget. This closes the open-ended-loop failure mode of long-running /goal sessions.
Composable with /goal and auto-mode. /workflows is for repeatable, structured, multi-agent tasks. /goal is for single-target, autonomous-loop tasks where you don’t yet know the shape of the work. auto-mode is the permission-handling layer underneath both. The three together cover the full long-running-agent design space.
One-off tasks should NOT use workflows. The source is explicit: if you’ll only do this once, just chat with Claude directly. Workflows pay off when you’ll re-run the same shape repeatedly (daily Sentry triage, every-PR dead-code sweep, per-new-lead outreach drafting).

Try It

Enable workflows. CLAUDE_CODE_WORKFLOWS=1 claude (or export the env var in your shell rc). Confirm /workflows is available in the slash-command menu.
Start with bare /workflows to see the dashboard before authoring anything. At this point the workflow history will be empty.
Have Claude author your first workflow. Don’t write workflow.js by hand. Ask Claude Code: “Look through my recent sessions and find a 3-5 step task I’ve done multiple times. Write a workflow.js for it under .claude/workflows/.” The community workflow-creator skill (referenced in the source) is one option; an official Anthropic version is likely once the feature is announced.
Pick a daily ritual to start. Daily Sentry / Linear / GitHub-issue triage are the canonical first workflows — they’re (a) repeatable, (b) fan-out-shaped, (c) tolerant of automatic per-sub-agent retry. Use the Sentry example above as your starting template.
Add a budget. Wrap any while loop in a budget guard: while (budget.remaining > N) { … }. This is cheap insurance against runaway loops and bad terminating conditions.
Run two workflows concurrently once you have one working. Trigger the same workflow twice with different arguments — verify the dashboard shows both running independently, with no cross-contamination of sub-agent output.
Skip a sub-agent manually. While a workflow is running, open /workflows, drill into the running workflow, find a sub-agent that is currently executing, and press X to skip it. Verify the workflow continues correctly without that sub-agent’s output. This is a useful test of your workflow’s graceful-degradation behavior.

Open Questions

Creator attribution missing. The transcript repeatedly references “my GitHub” / “my newsletter” but never names the channel or operator. The YouTube ID is c0gVowvMR-g. Worth resolving on next ingest: which YouTuber published this walkthrough? Voice + structure suggest it could be one of the regular Claude Code creators (Nate Herk / Chris / David Tech / similar) but no direct evidence in the transcript.
schema type detail. The source shows schemas at a high level (field names + types) but does not enumerate the supported primitive types, whether schemas can nest, or how validation failures surface. Worth a fetch of the official /workflows docs when they publish.
pipeline ordering guarantees. Pipeline-streaming starts stage 2 on each item as it exits stage 1. What’s the ordering guarantee in stage 2? If item A finishes stage 1 before item B but item B’s stage 2 finishes first, does the workflow result preserve original ordering or completion ordering? The source doesn’t specify.
Budget unit. budget.remaining > 50000 — is this tokens, dollars, API calls, or something else? The source uses “tokens” colloquially but doesn’t formalize the unit.
Workflow composition. Can one workflow call another workflow? Or only sub-agents? If composition is supported, what’s the syntax — await workflow("other-workflow.js", args)?
Workflow-creator skill URL. The creator’s GitHub skill is referenced but not linked in the transcript body. Worth surfacing the URL on the source’s video description on next refresh.
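
Until the pipeline ordering guarantee is documented, a workflow can make itself independent of it by carrying each item's input position through the stages and sorting at the end. The sketch below is a defensive pattern, not documented runtime behavior; the helper names and the `index`/`value` fields are assumptions.

```javascript
// Tag each item with its input position before it enters the pipeline.
function tagWithIndex(items) {
  return items.map((item, index) => ({ index, item }));
}

// Restore input order from results that may have arrived in completion order.
function restoreOrder(results) {
  return [...results].sort((a, b) => a.index - b.index).map((r) => r.value);
}

const tagged = tagWithIndex(["lead-A", "lead-B", "lead-C"]);

// Hypothetical results, listed in the order stage 2 happened to finish (C, A, B).
const completed = [tagged[2], tagged[0], tagged[1]].map(({ index, item }) => ({
  index,
  value: `message for ${item}`,
}));

console.log(restoreOrder(completed));
// [ 'message for lead-A', 'message for lead-B', 'message for lead-C' ]
```

If the runtime turns out to preserve input order, the sort is a harmless no-op.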

Adjacent pattern — self-verification via Chrome DevTools MCP

[Reddit signal — r/ClaudeCode 2026-05-28] Source: raw/reddit-1tpvd4s.md (26 score / 22 comments, OP Stock-Silver432, Resource flair, “Making Claude check its own work with 3x’d my output quality”). Operator describes a frontend verification loop that pairs naturally with the /workflows deterministic-orchestration primitive: at the end of any UI-changing task, Claude (1) navigates to the page it just changed via Chrome DevTools MCP, (2) screenshots at mobile/tablet/desktop widths, (3) looks at the screenshots itself (not “take a screenshot for me to review” — “take a screenshot, look at it yourself, tell me what’s wrong”), (4) clicks through the flow (open modal, submit form, hit edge cases), (5) fixes what’s broken and re-screenshots to confirm. Claimed ~3x first-pass quality. Honest tradeoff section: token cost rises (3 viewports × re-screenshot per fix = real chunk of context), wall-clock time per task increases, emulated viewports aren’t real iOS Safari so device-specific bugs still slip through. Suggested guardrail: gate the verification loop to UI-touching changes only, not every task. Pairs with this article’s workflow.js primitives — the self-screenshot-and-verify loop is a natural agent + pipeline shape inside a workflow (screenshot → vision-analyze → fix-if-broken → re-screenshot → exit-when-clean).

Related

Dynamic Workflows in Claude Code — Anthropic Announcement — the official Anthropic announcement of this feature (now named “dynamic workflows”, enabled via the ultracode setting); resolves this article’s “expect an official announcement” open question. That article is the official framing + availability + the Bun Zig→Rust flagship example; this one is the hands-on mechanics.
Week 22 Release Digest — digest-level coverage of /workflows (v2.1.147); this article is the deep-dive companion
`/goal` Walkthrough — the autonomous-loop primitive; pairs with /workflows (structured-orchestration) as the two long-running-agent designs
Agent Teams — the broader multi-agent surface; /workflows is the deterministic orchestration layer on top
Claude Code Subagents — the primitive /workflows orchestrates
CLI Reference — auto-mode, agents --json, and other surfaces /workflows composes with
Scheduled Tasks — cron + /workflows for daily auto-triage / auto-sweep patterns
Context Management in Claude Code — the token-tax failure mode /workflows closes
Troubleshooting Claude — the context-rot mechanism that motivates moving the orchestrator from model to code

Jonathon's AI Wiki
