Source: ai-research/claude-agent-sdk-agent-loop-2026-06-16.md (https://code.claude.com/docs/en/agent-sdk/agent-loop)
Anthropic’s official Agent SDK doc explaining the message lifecycle that powers SDK agents. The Agent SDK lets you embed the same autonomous loop that powers Claude Code in your own application — a standalone package (no Claude Code CLI required) that gives programmatic control over tools, permissions, cost limits, and output. This page is the primary-source, mechanical companion to the broader Claude Agent SDK concept article (whose loop description was previously inferred).
Key Takeaways
- The loop is five steps. (1) Receive prompt (+ system prompt, tool defs, history) → SDK yields a SystemMessage with subtype "init". (2) Claude evaluates and responds with text and/or tool-call requests → AssistantMessage. (3) SDK executes the tools and feeds results back → UserMessage. (4) Steps 2-3 repeat; each full cycle is one turn. (5) When Claude responds with no tool calls, the SDK yields a final AssistantMessage then a ResultMessage (final text, token usage, cost, session ID).
- A “turn” is one round trip and runs without yielding to your code. A trivial prompt is 1-2 turns; “refactor the auth module and update the tests” can chain dozens of tool calls across many turns. The loop ends only when Claude emits output with no tool calls.
- Five core message types drive the whole lifecycle: SystemMessage (subtypes init/compact_boundary/informational/worker_shutting_down), AssistantMessage (per Claude response), UserMessage (per tool result), StreamEvent (only with partial messages on), ResultMessage (end of loop). Iterate the stream to completion: trailing events (e.g. prompt_suggestion) can arrive after the result.
- Built-in tools mirror Claude Code: Read/Edit/Write, Glob/Grep, Bash, WebSearch/WebFetch, ToolSearch (on-demand tool loading), and orchestration tools Agent/Skill/AskUserQuestion/TaskCreate/TaskUpdate. Extend via MCP servers, custom tool handlers, and project skills.
- You hold the leash, not the steering wheel. Claude decides which tools to call; you decide whether they run, via allowed_tools (auto-approve), disallowed_tools (hard block), and permission_mode. Read-only tools run in parallel; state-mutating tools (Edit/Write/Bash) run sequentially.
- Budget the loop in production. max_turns caps tool-use round trips; max_budget_usd caps spend. Either limit returns a ResultMessage with subtype error_max_turns/error_max_budget_usd. Without limits an open-ended prompt (“improve this codebase”) can run long.
- Context accumulates and auto-compacts. The window never resets between turns: system prompt, tool defs, history, and (large) tool outputs all pile up; stable prefixes are prompt-cached. Near the limit the SDK summarizes older history and emits a compact_boundary. Put durable rules in CLAUDE.md (re-injected every request), not the initial prompt (may be summarized away).
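The budgeting takeaway above can be illustrated with a toy model. This is a minimal sketch, not the SDK: `run_with_limits` and its `turn_costs` argument are hypothetical, and the point at which the real SDK checks its limits (before or after a turn exceeds them) is an assumption here. Only the subtype strings come from the doc.

```python
from typing import Optional, Sequence, Tuple


def run_with_limits(
    turn_costs: Sequence[float],
    max_turns: Optional[int] = None,
    max_budget_usd: Optional[float] = None,
) -> Tuple[str, int, float]:
    """Toy model of a capped loop (hypothetical helper, not an SDK function).

    turn_costs lists the USD cost of each tool-use turn the agent would take
    if left unconstrained. Returns (subtype, turns_used, spent_usd).
    """
    spent = 0.0
    turns = 0
    for cost in turn_costs:
        # Assumption: limits are checked before starting the next turn.
        if max_turns is not None and turns >= max_turns:
            return ("error_max_turns", turns, spent)
        if max_budget_usd is not None and spent + cost > max_budget_usd:
            return ("error_max_budget_usd", turns, spent)
        turns += 1
        spent += cost
    # The agent ran out of tool calls on its own: normal termination.
    return ("success", turns, spent)


unbounded = run_with_limits([0.25] * 4)
turn_capped = run_with_limits([0.25] * 10, max_turns=3)
budget_capped = run_with_limits([0.25] * 10, max_budget_usd=0.5)
```

Setting both caps is the cheap default: whichever trips first ends the run, and the caller can tell them apart from the subtype.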
The loop, end to end
For the prompt “Fix the failing tests in auth.ts” a full session looks like this. The SDK sends the prompt and yields the init SystemMessage, then:
- Turn 1: Claude calls Bash (npm test) → AssistantMessage with the call, then UserMessage with output (three failures).
- Turn 2: Claude calls Read on auth.ts + auth.test.ts → file contents returned.
- Turn 3: Claude calls Edit to fix, then Bash to re-run tests; all pass.
- Final turn: Claude produces text only (“Fixed the auth bug, all three tests pass now”) → final AssistantMessage + ResultMessage with cost and usage.
Four turns: three with tool calls, one final text-only response.
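The session above can be replayed with a small simulation. This is a sketch under stated assumptions: `Message`, `SCRIPT`, and `run_loop` are stand-ins invented for illustration (the real SDK classes carry more fields), the model is replaced by a fixed script, and turn 3 is treated as a single response carrying two tool calls.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Message:
    """Hypothetical stand-in for the SDK message classes."""
    kind: str      # "SystemMessage", "AssistantMessage", "UserMessage", "ResultMessage"
    detail: str
    tool_calls: List[str] = field(default_factory=list)


# Scripted stand-in for the model: one (text, tool calls) pair per Claude response.
SCRIPT = [
    ("Running the tests first.", ["Bash: npm test"]),
    ("Reading the source and the tests.", ["Read: auth.ts", "Read: auth.test.ts"]),
    ("Applying the fix and re-running.", ["Edit: auth.ts", "Bash: npm test"]),
    ("Fixed the auth bug, all three tests pass now", []),
]


def run_loop(prompt: str) -> Iterator[Message]:
    yield Message("SystemMessage", "init")                # step 1: session starts
    turns = 0
    for text, calls in SCRIPT:
        turns += 1
        yield Message("AssistantMessage", text, calls)    # step 2: Claude responds
        if not calls:                                     # step 5: no tool calls ends the loop
            yield Message("ResultMessage", f"success after {turns} turns")
            return
        for call in calls:                                # step 3: one result per tool call
            yield Message("UserMessage", f"result of {call}")


messages = list(run_loop("Fix the failing tests in auth.ts"))
kinds = [m.kind for m in messages]
```

The only exit condition in the sketch is a response with no tool calls, which is the termination rule the doc describes.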
Controlling the run
- Turns & budget: max_turns/maxTurns (tool-use round trips), max_budget_usd/maxBudgetUsd (spend cap). Setting a budget is a good production default.
- Effort (low/medium/high/xhigh/max): how much reasoning Claude applies per turn. Low for file lookups; xhigh recommended for coding/agentic on Fable 5 and Opus 4.7+; max for deep multi-step work. Independent of extended thinking. Set per-session in query() or per-subagent on AgentDefinition.
- Permission mode: default (uncovered tools hit your approval callback), acceptEdits (auto-approve file edits + common fs commands), plan (explore/plan, never edit source), dontAsk (only pre-approved rules run), auto (TS-only model classifier), bypassPermissions (run everything allowed; isolated environments only, not as root).
- Model: defaults to Claude Code’s default for your auth/subscription; pin explicitly (e.g. model="claude-sonnet-4-6") for cheaper/faster agents.
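How the allow list, deny list, and permission mode might combine can be sketched as a single decision function. Treat this as a simplified mental model: `decide` is hypothetical, the precedence order (deny list, then allow list, then mode) is an assumption rather than documented behavior, the "common fs commands" part of acceptEdits is ignored, and the TypeScript-only auto mode is left out.

```python
from typing import Set

EDIT_TOOLS = {"Edit", "Write"}
KNOWN_MODES = {"default", "acceptEdits", "plan", "dontAsk", "bypassPermissions"}


def decide(
    tool: str,
    allowed_tools: Set[str],
    disallowed_tools: Set[str],
    permission_mode: str = "default",
) -> str:
    """Return "allow", "deny", or "ask" (hypothetical model, not SDK code)."""
    if permission_mode not in KNOWN_MODES:
        raise ValueError(f"unsupported permission mode: {permission_mode}")
    if tool in disallowed_tools:          # hard block wins over everything (assumed)
        return "deny"
    if tool in allowed_tools:             # auto-approved
        return "allow"
    if permission_mode == "bypassPermissions":
        return "allow"
    if permission_mode == "acceptEdits" and tool in EDIT_TOOLS:
        return "allow"
    if permission_mode == "plan" and tool in EDIT_TOOLS:
        return "deny"                     # plan mode never edits source
    if permission_mode == "dontAsk":
        return "deny"                     # only pre-approved rules run
    return "ask"                          # falls through to your approval callback


blocked = decide("Bash", {"Read"}, {"Bash"}, "bypassPermissions")
approved = decide("Read", {"Read"}, set(), "dontAsk")
prompted = decide("Bash", set(), set(), "default")
```

Even in this simplified form, the deny list is the one control that no mode can override, which is why it suits commands that must never run.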
Handling the result
ResultMessage.subtype is the primary termination check: success (only subtype with a result text field), error_max_turns, error_max_budget_usd, error_during_execution, error_max_structured_output_retries. All subtypes carry total_cost_usd, usage, num_turns, and session_id (guard for None on error paths in Python). A separate stop_reason (end_turn / max_tokens / refusal) explains why the model stopped on its final turn — check stop_reason == "refusal" to detect declines.
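A termination check along these lines could look like the sketch below. `FakeResult` is a hypothetical stand-in that mirrors only the field names mentioned above, not the SDK's actual class, and `summarize` is illustrative. Checking for a refusal before the subtype is a choice made here, since the doc does not say which subtype accompanies a refusal.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeResult:
    """Hypothetical stand-in mirroring the ResultMessage fields named in the text."""
    subtype: str
    result: Optional[str] = None
    stop_reason: Optional[str] = None
    total_cost_usd: Optional[float] = None
    num_turns: Optional[int] = None
    session_id: Optional[str] = None


def summarize(msg: FakeResult) -> str:
    # Guard for None on error paths, as the text advises for Python.
    cost = msg.total_cost_usd if msg.total_cost_usd is not None else 0.0
    if msg.stop_reason == "refusal":
        return "declined"
    if msg.subtype == "success":
        # Only the success subtype carries result text.
        return f"ok: {msg.result} (${cost:.2f})"
    if msg.subtype in ("error_max_turns", "error_max_budget_usd"):
        return f"limit hit ({msg.subtype}); resume session {msg.session_id}"
    return f"failed: {msg.subtype}"


ok = summarize(FakeResult("success", result="done", stop_reason="end_turn", total_cost_usd=0.5))
capped = summarize(FakeResult("error_max_turns", session_id="abc"))
refused = summarize(FakeResult("success", result="", stop_reason="refusal"))
```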
Context, sessions, hooks
- Automatic compaction: summarizes older history near the limit; emits compact_boundary. Customize via a free-form “what to preserve” section in CLAUDE.md, a PreCompact hook (archive the transcript), or a manual /compact prompt.
- Keep context lean: offload subtasks to subagents (fresh context; only the final response returns to the parent), scope each subagent’s tool set, lean on MCP tool search (defers MCP schemas), and drop effort to low for routine tasks.
- Sessions: capture ResultMessage.session_id to resume (full prior context restored) or fork into an alternate approach. Python’s ClaudeSDKClient manages session IDs automatically.
- Hooks fire at loop points: PreToolUse (validate/block before run), PostToolUse (audit/side effects), UserPromptSubmit, Stop, SubagentStart/SubagentStop, PreCompact. They run in your process (no context cost) and can short-circuit the loop (a rejecting PreToolUse hands Claude the rejection instead of executing).
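The PreToolUse guard described above can be sketched as a plain function. The signature and the returned dictionary shape are hypothetical, chosen for illustration rather than taken from the SDK's hook interface, and the Bash input key `command` and the blocked patterns are assumptions as well.

```python
import re
from typing import Dict

# Illustrative patterns only; a real deny list would be tuned to the environment.
BLOCKED_PATTERNS = [
    r"\brm\s+-rf\b",
    r"\bgit\s+push\s+--force\b",
    r"\bsudo\b",
]


def pre_tool_use(tool_name: str, tool_input: Dict[str, str]) -> Dict[str, object]:
    """Hypothetical hook body: decide whether a tool call may run."""
    if tool_name != "Bash":
        return {"allow": True, "reason": ""}
    command = tool_input.get("command", "")  # assumed input key for Bash
    for pattern in BLOCKED_PATTERNS:
        if re.search(pattern, command):
            # In the real loop, Claude would receive this rejection
            # instead of the command's output.
            return {"allow": False, "reason": f"blocked by pattern {pattern!r}"}
    return {"allow": True, "reason": ""}


dangerous = pre_tool_use("Bash", {"command": "rm -rf /tmp/build"})
harmless = pre_tool_use("Bash", {"command": "npm test"})
non_bash = pre_tool_use("Read", {"file_path": "auth.ts"})
```

Pattern matching like this is a coarse filter, so it works best alongside a restrictive permission mode rather than as the only safeguard.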
Try It
- Run the canonical example. The “find and fix the bug causing test failures in the auth module” agent: allowed_tools=["Read","Edit","Bash","Glob","Grep"], setting_sources=["project"] (loads CLAUDE.md/skills/hooks), max_turns=30, effort="high". Iterate query() and branch on ResultMessage.subtype.
- Make it safe before making it autonomous. Add max_budget_usd, switch permission_mode to acceptEdits for a dev machine (gates non-fs Bash behind allow rules), and add a PreToolUse hook to block dangerous commands.
- Make it resumable. Persist ResultMessage.session_id; on error_max_turns, resume that session with a higher limit instead of restarting.
- Pick the message granularity you need: ResultMessage only (final output/cost), AssistantMessage (per-turn progress), or enable partial messages for live StreamEvent deltas.
Related
- Claude Agent SDK — Official Toolkit for Building Custom Agents — the parent concept article (what the SDK is, vs the Anthropic SDK and Claude Code, billing). This page is its primary-source loop mechanics.
- Hooks — the deterministic guardrail layer the loop fires (PreToolUse/PostToolUse/PreCompact/etc.).
- Subagents — fresh-context offload, the loop’s main context-efficiency lever.
- MCP — external tools/resources surfaced into the loop; MCP tool search defers schemas.
- Certification Technical Reference — the stop_reason → tool_use → execute → re-call loop as an exam-grade primitive.
- Loop Engineering (Addy Osmani) — the general “replace yourself as the prompter” pattern this SDK loop implements in first-party form.
- The Loop Is the Unit of Work — the synthesis tying first-party loops to the broader agent-loop thesis.
- Verifier-First Loops — the verification discipline that makes an unbounded loop safe to run.
- Temporal — Durable Agentic Loop — the same Claude tool-use loop with durable execution (crash recovery + Temporal-owned retries) instead of an in-process loop.
- Running the Agentic Loop — In-Process, Durable, or Hosted — the cross-topic synthesis placing this in-process SDK runtime against Temporal (durable) and Managed Agents (hosted), and the stateless-harness rule all three share.
Open Questions
- Per-SDK event parity. The TypeScript SDK yields extra observability events (hook events, tool progress, rate limits, task notifications) and offers the auto permission mode, neither of which Python supports yet; the doc flags the gap but gives no timeline.
- Default model resolution. “Claude Code’s default” depends on auth method and subscription; the doc doesn’t enumerate which model that resolves to per plan.