Claude Agent SDK — How the Agent Loop Works

Source: ai-research/claude-agent-sdk-agent-loop-2026-06-16.md (https://code.claude.com/docs/en/agent-sdk/agent-loop); plus raw/reddit-1ueuh0h.md (r/ClaudeCode harness-rebuild write-up) for the 2026-06-25 design-decisions note.

Anthropic’s official Agent SDK doc explaining the message lifecycle that powers SDK agents. The Agent SDK lets you embed the same autonomous loop that powers Claude Code in your own application — a standalone package (no Claude Code CLI required) that gives programmatic control over tools, permissions, cost limits, and output. This page is the primary-source, mechanical companion to the broader Claude Agent SDK concept article (whose loop description was previously inferred).

Key Takeaways

The loop is five steps. (1) Receive prompt (+ system prompt, tool defs, history) → SDK yields a SystemMessage subtype "init". (2) Claude evaluates and responds with text and/or tool-call requests → AssistantMessage. (3) SDK executes the tools and feeds results back → UserMessage. (4) Steps 2-3 repeat; each full cycle is one turn. (5) When Claude responds with no tool calls, the SDK yields a final AssistantMessage then a ResultMessage (final text, token usage, cost, session ID).
A “turn” is one round trip and runs without yielding to your code. A trivial prompt is 1-2 turns; “refactor the auth module and update the tests” can chain dozens of tool calls across many turns. The loop ends only when Claude emits output with no tool calls.
Five core message types drive the whole lifecycle: SystemMessage (subtypes init / compact_boundary / informational / worker_shutting_down), AssistantMessage (per Claude response), UserMessage (per tool result), StreamEvent (only with partial messages on), ResultMessage (end of loop). Iterate the stream to completion — trailing events (e.g. prompt_suggestion) can arrive after the result.
Built-in tools mirror Claude Code: Read/Edit/Write, Glob/Grep, Bash, WebSearch/WebFetch, ToolSearch (on-demand tool loading), and orchestration tools Agent/Skill/AskUserQuestion/TaskCreate/TaskUpdate. Extend via MCP servers, custom tool handlers, and project skills.
You hold the leash, not the steering wheel. Claude decides which tools to call; you decide whether they run — via allowed_tools (auto-approve), disallowed_tools (hard block), and permission_mode. Read-only tools run in parallel; state-mutating tools (Edit/Write/Bash) run sequentially.
Budget the loop in production. max_turns caps tool-use round trips; max_budget_usd caps spend. Either limit returns a ResultMessage with subtype error_max_turns / error_max_budget_usd. Without limits an open-ended prompt (“improve this codebase”) can run long.
Context accumulates and auto-compacts. The window never resets between turns — system prompt, tool defs, history, and (large) tool outputs all pile up; stable prefixes are prompt-cached. Near the limit the SDK summarizes older history and emits a compact_boundary. Put durable rules in CLAUDE.md (re-injected every request), not the initial prompt (may be summarized away).

The loop, end to end

For the prompt “Fix the failing tests in auth.ts” a full session looks like: SDK sends the prompt and yields the init SystemMessage, then —

Turn 1: Claude calls Bash (npm test) → AssistantMessage with the call, then UserMessage with output (three failures).
Turn 2: Claude calls Read on auth.ts + auth.test.ts → file contents returned.
Turn 3: Claude calls Edit to fix, then Bash to re-run tests; all pass.
Final turn: Claude produces text only (“Fixed the auth bug, all three tests pass now”) → final AssistantMessage + ResultMessage with cost and usage.

Four turns: three with tool calls, one final text-only response.

Controlling the run

Turns & budget: max_turns / maxTurns (tool-use round trips), max_budget_usd / maxBudgetUsd (spend cap). Setting a budget is a good production default.
Effort (low / medium / high / xhigh / max): how much reasoning Claude applies per turn. Low for file lookups; xhigh recommended for coding/agentic on Fable 5 and Opus 4.7+; max for deep multi-step work. Independent of extended thinking. Set per-session in query() or per-subagent on AgentDefinition.
Permission mode: default (uncovered tools hit your approval callback), acceptEdits (auto-approve file edits + common fs commands), plan (explore/plan, never edit source), dontAsk (only pre-approved rules run), auto (TS-only model classifier), bypassPermissions (run everything allowed — isolated environments only, not as root).
Model: defaults to Claude Code’s default for your auth/subscription; pin explicitly (e.g. model="claude-sonnet-4-6") for cheaper/faster agents.

Handling the result

ResultMessage.subtype is the primary termination check: success (only subtype with a result text field), error_max_turns, error_max_budget_usd, error_during_execution, error_max_structured_output_retries. All subtypes carry total_cost_usd, usage, num_turns, and session_id (guard for None on error paths in Python). A separate stop_reason (end_turn / max_tokens / refusal) explains why the model stopped on its final turn — check stop_reason == "refusal" to detect declines.

Context, sessions, hooks

Automatic compaction: summarizes older history near the limit; emits compact_boundary. Customize via a free-form “what to preserve” section in CLAUDE.md, a PreCompact hook (archive the transcript), or a manual /compact prompt.
Keep context lean: offload subtasks to subagents (fresh context; only the final response returns to the parent), scope each subagent’s tool set, lean on MCP tool search (defers MCP schemas), and drop effort to low for routine tasks.
Sessions: capture ResultMessage.session_id to resume (full prior context restored) or fork into an alternate approach. Python’s ClaudeSDKClient manages session IDs automatically.
Hooks fire at loop points — PreToolUse (validate/block before run), PostToolUse (audit/side effects), UserPromptSubmit, Stop, SubagentStart/SubagentStop, PreCompact. They run in your process (no context cost) and can short-circuit the loop (a rejecting PreToolUse hands Claude the rejection instead of executing).

Harness-design decisions, from rebuilding a CC-style harness on the raw Messages API (2026-06-25). A community write-up (r/ClaudeCode, raw/reddit-1ueuh0h.md) reports five decisions the harness quietly makes for you — behavioral inferences from a clean-room rebuild, not a look at CC’s source. Beyond hooks-as-killswitch (covered above — PreToolUse can block a call, not just log it):

CLAUDE.md layers on top of a base system prompt; it does not replace it. Instructions that seem “ignored” may be fighting an already-established base — write to override, not merely to state.
Subagent fan-out needs an abort tree. A parent abort should cascade down to every child, but a child dying should notify up without auto-killing the parent — otherwise one wedged child hangs the whole session.
Parallel subagents want a DAG, not a for-loop. Run everything with no unmet dependencies in parallel, gate the next layer, and fast-fail a node’s dependents on failure; a flat “loop and await all” both wastes parallelism and hangs on one bad node.
“Done” must be an explicit terminal state (done / blocked / needs-input, with evidence) or the agent trails off into “I could also…”; pair it with gating irreversible actions (deletes, pushes) behind explicit recent intent.

Try It

Run the canonical example. The “find and fix the bug causing test failures in the auth module” agent: allowed_tools=["Read","Edit","Bash","Glob","Grep"], setting_sources=["project"] (loads CLAUDE.md/skills/hooks), max_turns=30, effort="high". Iterate query() and branch on ResultMessage.subtype.
Make it safe before autonomous. Add max_budget_usd, switch permission_mode to acceptEdits for a dev machine (gates non-fs Bash behind allow rules), and add a PreToolUse hook to block dangerous commands.
Make it resumable. Persist ResultMessage.session_id; on error_max_turns, resume that session with a higher limit instead of restarting.
Pick the message granularity you need: ResultMessage only (final output/cost), AssistantMessage (per-turn progress), or enable partial messages for live StreamEvent deltas.

Claude Agent SDK — Official Toolkit for Building Custom Agents — the parent concept article (what the SDK is, vs the Anthropic SDK and Claude Code, billing). This page is its primary-source loop mechanics.
Hooks — the deterministic guardrail layer the loop fires (PreToolUse/PostToolUse/PreCompact/etc.).
Subagents — fresh-context offload, the loop’s main context-efficiency lever.
MCP — external tools/resources surfaced into the loop; MCP tool search defers schemas.
Certification Technical Reference — the stop_reason → tool_use → execute → re-call loop as an exam-grade primitive.
Loop Engineering (Addy Osmani) — the general “replace yourself as the prompter” pattern this SDK loop implements in first-party form.
The Loop Is the Unit of Work — the synthesis tying first-party loops to the broader agent-loop thesis.
Verifier-First Loops — the verification discipline that makes an unbounded loop safe to run.
Temporal — Durable Agentic Loop — the same Claude tool-use loop with durable execution (crash recovery + Temporal-owned retries) instead of an in-process loop.
12-Factor Agents — HumanLayer’s community framework describing the same “own your control flow / context / prompts” architecture this SDK loop implements in first-party form; see the factor-by-factor audit for how they compare directly.
Running the Agentic Loop — In-Process, Durable, or Hosted — the cross-topic synthesis placing this in-process SDK runtime against Temporal (durable) and Managed Agents (hosted), and the stateless-harness rule all three share.

Open Questions

Per-SDK event parity. The TypeScript SDK yields extra observability events (hook events, tool progress, rate limits, task notifications) and the auto permission mode that Python does not yet support — the doc flags the gap but not a timeline.
Default model resolution. “Claude Code’s default” depends on auth method and subscription; the doc doesn’t enumerate which model that resolves to per plan.

Jonathon's AI Wiki

Explorer

Claude Agent SDK — How the Agent Loop Works

Key Takeaways

The loop, end to end

Controlling the run

Handling the result

Context, sessions, hooks

Try It

Open Questions

Graph View

Table of Contents

Backlinks

Jonathon's AI Wiki

Explorer

Claude Agent SDK — How the Agent Loop Works

Key Takeaways

The loop, end to end

Controlling the run

Handling the result

Context, sessions, hooks

Try It

Related

Open Questions

Graph View

Table of Contents

Backlinks