Source: ai-research/temporal-agentic-loop-claude-python-2026-06-16.md (https://docs.temporal.io/ai-cookbook/agentic-loop-tool-call-claude-python)

Temporal’s AI-cookbook recipe implements the Claude tool-use agentic loop as a durable Temporal Workflow in Python. Where the Agent SDK loop and the loop-engineering essays focus on the loop’s logic, this recipe adds the reliability dimension: the loop survives worker/process crashes, and every Claude call and tool execution is a retryable Temporal Activity. The headline discipline — let Temporal own retries, not the Anthropic client.

Key Takeaways

  • The loop is a Workflow; the side effects are Activities. AgentWorkflow holds the deterministic agentic loop; claude_responses.create (the Claude/Anthropic API call) and tool_invoker.dynamic_tool_activity (tool execution) are Temporal Activities — so they get automatic retries/timeouts and the loop itself is replay-safe.
  • Standard Claude tool-use mechanics inside. Each pass calls Claude with the accumulated messages history; if the response has type: "tool_use" content blocks, the full assistant response is appended, all tools execute, results are added as a user message, and the loop repeats; when Claude returns no tool calls, the text response is returned.
  • Durability is the whole point. Because Temporal checkpoints/replays Workflow state, a worker crash mid-loop resumes exactly where it left off — no lost session, no restart-from-scratch. This is what makes long-running or overnight agents production-grade.
  • Let Temporal handle retries, not the Anthropic SDK. Per the recipe, “retries are handled by Temporal and not by the Anthropic client library… client retries can interfere with correct and durable error handling and recovery.” Disable client-side retry and put the retry policy on the Activity.
  • Determinism boundary is explicit. The Workflow imports helpers via with workflow.unsafe.imports_passed_through(): — Temporal Workflows must be deterministic, so all non-deterministic/side-effecting work lives in Activities.

Why it matters for agent loops

  • It’s the durability/reliability dimension of the loop — complementary to the verification discipline (verifier-first) and the cost/security discipline (economics & security). A loop that can’t survive a crash isn’t production-grade. ^[inferred]
  • It’s the roll-your-own counterpart to Anthropic’s Managed Agents session-decoupling, which solves the same crash-recovery problem with an external event log + wake() (see Scaling Managed Agents). ^[inferred]

Implementation

Tool/Service: Temporal (durable execution) + Anthropic Python SDK — cookbook recipe (Python).

Setup: run a Temporal server (localhost:7233), set ANTHROPIC_API_KEY, pip install temporalio anthropic, start the worker, then start a Workflow.

# worker.py — dispatches the loop (Workflow) and the Claude/tool Activities
from temporalio.client import Client
from temporalio.worker import Worker
from workflows.agent import AgentWorkflow
from activities import claude_responses, tool_invoker
from temporalio.contrib.pydantic import pydantic_data_converter
from concurrent.futures import ThreadPoolExecutor
 
client = await Client.connect("localhost:7233", data_converter=pydantic_data_converter)
worker = Worker(
    client,
    task_queue="tool-invoking-agent-claude-python-task-queue",
    workflows=[AgentWorkflow],
    activities=[claude_responses.create, tool_invoker.dynamic_tool_activity],
    activity_executor=ThreadPoolExecutor(max_workers=10),
)
await worker.run()

Integration notes:

  • AgentWorkflow (the loop) lives in workflows/agent.py; Activities in activities/; tool defs via tools.get_tools and helpers.tool_helpers (imported through the determinism boundary).
  • The Claude call and each tool invocation are separate Activities, so a tool failure retries without restarting the whole loop.
  • pydantic_data_converter gives typed payloads across the Workflow/Activity boundary.

Try It

  1. Watch it survive a crash. Stand up a Temporal dev server, run the recipe with the default "Tell me about recursion" query, kill the worker mid-run, and restart it — the loop resumes from the last checkpoint instead of starting over.
  2. Move the Claude call into an Activity and turn OFF the SDK’s client retries — set the retry policy on the Temporal Activity instead. This is the recipe’s core correctness rule.
  3. Add a tool as a new Activity (the dynamic_tool_activity pattern) and confirm a tool failure retries independently of the loop.
  4. Compare reliability models with the first-party Agent SDK loop (in-process, max_turns/max_budget_usd bounded) — same loop logic, very different crash-recovery story.

Open Questions

  • The exact Claude model id and whether the recipe streams responses weren’t captured (docs.temporal.io blocks WebFetch; content via partial Tavily extract + corroborating search snippets).
  • Tool-Activity idempotency (avoiding double side effects on retry) is implied by Temporal’s execution model but not detailed in the captured excerpt. ^[inferred]