Source: ai-research/temporal-agentic-loop-claude-python-2026-06-16.md (https://docs.temporal.io/ai-cookbook/agentic-loop-tool-call-claude-python)
Temporal’s AI-cookbook recipe implements the Claude tool-use agentic loop as a durable Temporal Workflow in Python. Where the Agent SDK loop and the loop-engineering essays focus on the loop’s logic, this recipe adds the reliability dimension: the loop survives worker/process crashes, and every Claude call and tool execution is a retryable Temporal Activity. The headline discipline — let Temporal own retries, not the Anthropic client.
Key Takeaways
- The loop is a Workflow; the side effects are Activities.
AgentWorkflowholds the deterministic agentic loop;claude_responses.create(the Claude/Anthropic API call) andtool_invoker.dynamic_tool_activity(tool execution) are Temporal Activities — so they get automatic retries/timeouts and the loop itself is replay-safe. - Standard Claude tool-use mechanics inside. Each pass calls Claude with the accumulated
messageshistory; if the response hastype: "tool_use"content blocks, the full assistant response is appended, all tools execute, results are added as a user message, and the loop repeats; when Claude returns no tool calls, the text response is returned. - Durability is the whole point. Because Temporal checkpoints/replays Workflow state, a worker crash mid-loop resumes exactly where it left off — no lost session, no restart-from-scratch. This is what makes long-running or overnight agents production-grade.
- Let Temporal handle retries, not the Anthropic SDK. Per the recipe, “retries are handled by Temporal and not by the Anthropic client library… client retries can interfere with correct and durable error handling and recovery.” Disable client-side retry and put the retry policy on the Activity.
- Determinism boundary is explicit. The Workflow imports helpers via
with workflow.unsafe.imports_passed_through():— Temporal Workflows must be deterministic, so all non-deterministic/side-effecting work lives in Activities.
Why it matters for agent loops
- It’s the durability/reliability dimension of the loop — complementary to the verification discipline (verifier-first) and the cost/security discipline (economics & security). A loop that can’t survive a crash isn’t production-grade. ^[inferred]
- It’s the roll-your-own counterpart to Anthropic’s Managed Agents session-decoupling, which solves the same crash-recovery problem with an external event log +
wake()(see Scaling Managed Agents). ^[inferred]
Implementation
Tool/Service: Temporal (durable execution) + Anthropic Python SDK — cookbook recipe (Python).
Setup: run a Temporal server (localhost:7233), set ANTHROPIC_API_KEY, pip install temporalio anthropic, start the worker, then start a Workflow.
# worker.py — dispatches the loop (Workflow) and the Claude/tool Activities
from temporalio.client import Client
from temporalio.worker import Worker
from workflows.agent import AgentWorkflow
from activities import claude_responses, tool_invoker
from temporalio.contrib.pydantic import pydantic_data_converter
from concurrent.futures import ThreadPoolExecutor
client = await Client.connect("localhost:7233", data_converter=pydantic_data_converter)
worker = Worker(
client,
task_queue="tool-invoking-agent-claude-python-task-queue",
workflows=[AgentWorkflow],
activities=[claude_responses.create, tool_invoker.dynamic_tool_activity],
activity_executor=ThreadPoolExecutor(max_workers=10),
)
await worker.run()Integration notes:
AgentWorkflow(the loop) lives inworkflows/agent.py; Activities inactivities/; tool defs viatools.get_toolsandhelpers.tool_helpers(imported through the determinism boundary).- The Claude call and each tool invocation are separate Activities, so a tool failure retries without restarting the whole loop.
pydantic_data_convertergives typed payloads across the Workflow/Activity boundary.
Try It
- Watch it survive a crash. Stand up a Temporal dev server, run the recipe with the default
"Tell me about recursion"query, kill the worker mid-run, and restart it — the loop resumes from the last checkpoint instead of starting over. - Move the Claude call into an Activity and turn OFF the SDK’s client retries — set the retry policy on the Temporal Activity instead. This is the recipe’s core correctness rule.
- Add a tool as a new Activity (the
dynamic_tool_activitypattern) and confirm a tool failure retries independently of the loop. - Compare reliability models with the first-party Agent SDK loop (in-process,
max_turns/max_budget_usdbounded) — same loop logic, very different crash-recovery story.
Related
- Claude Agent SDK — How the Agent Loop Works — the same Claude tool-use loop, first-party and in-process; this Temporal recipe adds durable execution on top of the identical logic.
- Scaling Managed Agents — Anthropic’s hosted answer to the same crash-recovery problem (external session log +
wake()), vs rolling your own with Temporal. - Should You Build a Loop? — reliability is part of the production-readiness/security tax this recipe pays down.
- Verifier-First Loops — the correctness dimension; Temporal is the durability dimension of the same loop.
- The Loop Is the Unit of Work — the synthesis the durable loop slots into.
- Agents & Agentic Systems — the broader orchestration-framework home.
- Running the Agentic Loop — In-Process, Durable, or Hosted — the cross-topic synthesis placing this durable runtime against the in-process SDK loop and hosted Managed Agents.
Open Questions
- The exact Claude model id and whether the recipe streams responses weren’t captured (
docs.temporal.ioblocks WebFetch; content via partial Tavily extract + corroborating search snippets). - Tool-Activity idempotency (avoiding double side effects on retry) is implied by Temporal’s execution model but not detailed in the captured excerpt. ^[inferred]