Scaling Managed Agents — Anthropic's Brain / Hands / Session Architecture

Source: ai-research/anthropic-engineering-scaling-managed-agents-2026-06-16.md (Anthropic Engineering, “Scaling Managed Agents,” Lance Martin / Gabe Cemaj / Michael Cohen, published 2026-04-08) + raw/x-account-claudedevs-2072058428424589412.md (@ClaudeDevs, 2026-06-30 — API surface additions)

Anthropic Engineering’s architecture deep-dive on how Claude Managed Agents is built. The load-bearing move: decouple the brain (Claude + harness) from the hands (sandboxes/tools) and the session (an external append-only event log), so each can fail or be replaced independently. The motivating principle — “Harnesses encode assumptions that go stale as models improve” — is why the system is opinionated about interface shapes, not about what runs behind them. This is the how-it’s-built companion to the product/operator articles under Claude Managed Agents. (Lead author Lance Martin is rlancemartin, also behind the Managed Agents cookbooks.)

Key Takeaways

The “pet” problem → cattle. The first version coupled session + harness + sandbox in one container — “we’d adopted a pet.” A container failure lost the session, debugging was near-impossible, and untrusted code shared space with credentials. The fix virtualizes three independent interfaces.
Brain / Hands / Session. Session = a durable append-only event log stored outside the harness. Harness = a stateless loop calling Claude (nothing in it needs to survive a crash). Sandbox = isolated execution. “Each became an interface that made few assumptions about the others.”
Six primitives carry the whole system: execute(name, input) → string (universal interface to any tool/container/service), provision({resources}), wake(sessionId) (reboot a failed harness with prior state), getSession(id), emitEvent(id, event) (durable writes during the loop), and getEvents() (positional slices for context interrogation).
Crashes become tool errors. With containers stateless and cattle-like, “if a container died, the harness caught the failure as a tool-call error and passed it back to Claude.” Because the session log sits outside the harness, a dead harness is simply replaced. Eliminating upfront per-session container provisioning cut p50 time to first token (TTFT) by roughly 60% and p95 by over 90%.
Credentials never touch the harness. Two patterns: resource-bundled auth (a Git token used only at sandbox init to clone/wire remotes; Claude never sees it) and an external vault + MCP proxy (the proxy takes a session token and fetches the real credential from the vault). “The harness is never made aware of any credentials.”
Session ≠ context window. Context is durably stored in the session log; getEvents() lets the harness select positional slices, rewind before a moment, or transform events before passing them to Claude’s context window — context engineering + prompt caching without losing durability.
Many brains, many hands. The execute() interface means “the harness doesn’t know whether the sandbox is a container, a phone, or a Pokémon emulator,” and harnesses can “pass hands to one another.” Many stateless harnesses scale without proportional container provisioning — and can reach a customer’s VPC without network peering.
Modeled on operating systems. “Operating systems solved this problem by virtualizing hardware into abstractions — process, file — general enough for programs that didn’t exist yet.” Design for harnesses, sandboxes, and agent patterns not yet conceived.
API surface caught up (2026-06-30). Streaming session event deltas, per-session agent overrides, additional webhook event types, reverse pagination, and improved credential injection scoping shipped as concrete Managed Agents capabilities — see “API surface additions” below.
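
The “Session ≠ context window” takeaway above can be sketched in a few lines. This is a minimal in-memory illustration, not Anthropic's implementation: `Event`, `SessionLog`, `build_context`, and `truncate_tool_output` are hypothetical names, and the `start`/`stop` parameters on `get_events` are one assumed reading of “positional slices.”

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Event:
    kind: str      # hypothetical kinds: "user", "assistant", "tool_result"
    content: str


class SessionLog:
    """Append-only event log that lives outside the harness (in-memory stand-in)."""

    def __init__(self) -> None:
        self._events: List[Event] = []

    def emit_event(self, event: Event) -> int:
        """Append an event and return its position in the log."""
        self._events.append(event)
        return len(self._events) - 1

    def get_events(self, start: int = 0, stop: Optional[int] = None) -> List[Event]:
        """Return a positional slice as a copy; the log itself is never mutated."""
        return list(self._events[start:stop])


def build_context(
    log: SessionLog,
    start: int = 0,
    stop: Optional[int] = None,
    transform: Callable[[Event], Event] = lambda e: e,
) -> List[Event]:
    """Select a slice of the log and transform each event for the context window."""
    return [transform(e) for e in log.get_events(start, stop)]


def truncate_tool_output(event: Event, limit: int = 20) -> Event:
    """Example transform: shorten long tool results before they reach the model."""
    if event.kind == "tool_result" and len(event.content) > limit:
        return Event(event.kind, event.content[:limit] + "...")
    return event


log = SessionLog()
log.emit_event(Event("user", "List the repo files"))
log.emit_event(Event("tool_result", "x" * 500))
log.emit_event(Event("assistant", "Here are the files."))

rewound = build_context(log, stop=2)                          # only events before position 2
trimmed = build_context(log, transform=truncate_tool_output)  # shortened copy for the model
```

Here `rewound` contains only the first two events and `trimmed` carries a shortened tool result, while the log still holds all three events unmodified. What the model sees is a view over the log; the log stays the durable record.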

The six primitives

| Primitive | Job |
| --- | --- |
| `execute(name, input) → string` | Universal interface to any tool, container, MCP server, or service (a “hand”) |
| `provision({resources})` | Spin up a new container/sandbox on demand |
| `wake(sessionId)` | Reboot a failed/idle harness with its prior state |
| `getSession(id)` | Retrieve the full event log for recovery |
| `emitEvent(id, event)` | Durable write to the session during the loop |
| `getEvents()` | Fetch positional slices of the event stream (rewind / reread / transform) |
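
To make the table concrete, here is a rough in-memory sketch of the six primitives plus one harness step that turns a dead sandbox into a tool error. Everything here is an assumption for illustration: the `Platform` class, the `SandboxDied` exception, the `call_tool` helper, the snake_case method names, and the simplified `provision(name, hand)` signature are hypothetical and are not taken from the post or the platform docs.

```python
from typing import Callable, Dict, List, Optional


class SandboxDied(Exception):
    """Raised when a hand's container is gone (hypothetical error type)."""


class Platform:
    """In-memory stand-in for the six primitives; not Anthropic's implementation."""

    def __init__(self) -> None:
        self._sessions: Dict[str, List[dict]] = {}
        self._hands: Dict[str, Callable[[str], str]] = {}

    def provision(self, name: str, hand: Callable[[str], str]) -> None:
        """provision: make a hand available under a name, on demand."""
        self._hands[name] = hand

    def execute(self, name: str, tool_input: str) -> str:
        """execute(name, input) -> string: call whatever sits behind the name."""
        return self._hands[name](tool_input)

    def emit_event(self, session_id: str, event: dict) -> None:
        """emitEvent: append-only write to the session log."""
        self._sessions.setdefault(session_id, []).append(event)

    def get_session(self, session_id: str) -> List[dict]:
        """getSession: the full event log, returned as a copy."""
        return list(self._sessions.get(session_id, []))

    def get_events(
        self, session_id: str, start: int = 0, stop: Optional[int] = None
    ) -> List[dict]:
        """getEvents: a positional slice of the log."""
        return self._sessions.get(session_id, [])[start:stop]

    def wake(self, session_id: str) -> List[dict]:
        """wake: hand a fresh harness the prior state of the session."""
        return self.get_session(session_id)


def call_tool(platform: Platform, session_id: str, name: str, tool_input: str) -> str:
    """One harness step: a dead sandbox becomes a tool error recorded in the log."""
    try:
        result = platform.execute(name, tool_input)
    except SandboxDied as exc:
        result = f"tool_error: {exc}"
    platform.emit_event(session_id, {"tool": name, "result": result})
    return result


def dead_shell(command: str) -> str:
    """A hand whose container has already exited."""
    raise SandboxDied("container exited")


platform = Platform()
platform.provision("echo", lambda text: text.upper())
platform.provision("shell", dead_shell)

ok = call_tool(platform, "s1", "echo", "hello")
failed = call_tool(platform, "s1", "shell", "ls")
```

The harness never inspects what sits behind a name. The failed call comes back as an ordinary string result, so the model can decide whether to retry or ask for a new sandbox.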

API surface additions (2026-06-30)

[X signal — @ClaudeDevs, 2026-06-30] New Managed Agents capabilities, announced the same day as the Sonnet 5 launch: streaming session event deltas, per-session agent overrides, additional webhook event types, reverse pagination, and improved credential injection scoping. An example implementation (roadtrip_planner, built with the claude-api skill) is referenced in the official cookbook (anthropics/claude-cookbooks) demonstrating these capabilities.

These map onto the primitives above as concrete product surface: streaming deltas and reverse pagination both extend getEvents(); additional webhook event types extend the emitEvent() durability model into outbound notification; improved credential injection scoping extends the vault/proxy pattern in “Credentials never touch the harness.” Per-session agent overrides is a new capability not directly named in the original six primitives. ^[inferred — the announcement itself does not describe these features in terms of the six primitives; this mapping is this wiki’s synthesis connecting the 2026-06-30 product announcement to the 2026-04-08 architecture post]
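
As an illustration of what reverse pagination over an event log could look like, here is a generic cursor-style sketch. The announcement does not describe the request or response shape, so `get_events_reverse`, `limit`, and `before` are hypothetical names chosen for this example only.

```python
from typing import List, Optional, Tuple


def get_events_reverse(
    events: List[str], limit: int, before: Optional[int] = None
) -> Tuple[List[str], Optional[int]]:
    """Return up to `limit` events ending just before position `before`, newest first.

    The second value is the cursor for the next (older) page, or None once the
    start of the log has been reached.
    """
    end = len(events) if before is None else before
    start = max(0, end - limit)
    page = events[start:end][::-1]
    next_cursor = start if start > 0 else None
    return page, next_cursor


events = [f"event-{i}" for i in range(5)]
page1, cursor1 = get_events_reverse(events, limit=2)
page2, cursor2 = get_events_reverse(events, limit=2, before=cursor1)
page3, cursor3 = get_events_reverse(events, limit=2, before=cursor2)
```

Reading newest-first lets a client show the latest activity in a long session without walking the whole log from position zero.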

Why it matters

It’s the canonical first-party statement of why agent infrastructure decouples the session from the harness — the “infrastructure was the wall” thesis of the Agent Platform team made concrete.
The wake() + external-session-log pattern is Anthropic’s managed answer to the same crash-recovery problem you’d otherwise solve yourself with durable execution — compare the Temporal durable agentic loop.
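
The recovery pattern can be sketched with a simulated crash. This is a toy model under stated assumptions: `SESSION_STORE`, `HarnessCrash`, `run_harness`, and the `crash_after` parameter are hypothetical, and a plain dictionary stands in for the external session log.

```python
from typing import Dict, List

SESSION_STORE: Dict[str, List[str]] = {}   # stand-in for the external session log


class HarnessCrash(Exception):
    """Simulated harness failure (hypothetical)."""


def run_harness(session_id: str, steps: List[str], crash_after: int = -1) -> None:
    """Stateless loop: all progress lives in SESSION_STORE, none in the harness."""
    log = SESSION_STORE.setdefault(session_id, [])
    # Resume from the first step that has no durable event yet.
    for index in range(len(log), len(steps)):
        if index == crash_after:
            raise HarnessCrash(f"harness died before step {index}")
        log.append(f"done:{steps[index]}")   # durable write after each step


def wake(session_id: str, steps: List[str]) -> None:
    """Start a replacement harness; it resumes from the log, not from memory."""
    run_harness(session_id, steps)


steps = ["clone", "test", "report"]
try:
    run_harness("s1", steps, crash_after=2)
except HarnessCrash:
    wake("s1", steps)
```

After the crash and wake, the log holds one entry per step and nothing is repeated. In this sketch a step that finished but crashed before its write would run again on wake, so real steps would need to be safe to retry.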

Try It

Audit your own harness against the three interfaces: is your session log outside the harness? Can a crashed harness be replaced without losing state? Do credentials live outside the sandbox the model controls?
Adopt the crash model: treat sandboxes as cattle — catch container death as a tool-call error and hand it back to Claude rather than nursing the instance.
Adopt the credential discipline: never let the harness or model see tokens — bundle at sandbox init, or proxy through a vault keyed by session.
Or don’t build it: use Managed Agents directly (platform.claude.com/docs/en/managed-agents/overview) and inherit the decoupling.
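
The vault-plus-proxy discipline from the list above can be sketched as follows. All names are hypothetical (`Vault`, `ToolProxy`, `fake_git_host`, and the placeholder secret), and the in-memory vault is only a stand-in; the point is that the caller holds a session token and never the credential itself.

```python
import secrets
from typing import Callable, Dict


class Vault:
    """Holds real credentials, keyed by session token (in-memory stand-in)."""

    def __init__(self) -> None:
        self._by_token: Dict[str, Dict[str, str]] = {}

    def open_session(self, credentials: Dict[str, str]) -> str:
        """Store credentials and return an opaque session token."""
        token = secrets.token_hex(16)
        self._by_token[token] = dict(credentials)
        return token

    def lookup(self, session_token: str, service: str) -> str:
        return self._by_token[session_token][service]


class ToolProxy:
    """Sits between harness and service; only the proxy ever reads the vault."""

    def __init__(
        self, vault: Vault, services: Dict[str, Callable[[str, str], str]]
    ) -> None:
        self._vault = vault
        self._services = services

    def call(self, session_token: str, service: str, request: str) -> str:
        credential = self._vault.lookup(session_token, service)
        # The credential goes to the service and is never returned to the caller.
        return self._services[service](credential, request)


def fake_git_host(credential: str, request: str) -> str:
    """Hypothetical service: checks the credential, answers without echoing it."""
    if credential != "not-a-real-token":
        return "401 unauthorized"
    return f"ok: {request}"


vault = Vault()
token = vault.open_session({"git": "not-a-real-token"})
proxy = ToolProxy(vault, {"git": fake_git_host})

# The harness holds only `token`; the response contains no credential.
response = proxy.call(token, "git", "list branches")
```

If the session token leaks, it can be revoked in the vault without rotating the underlying credential, which is one practical reason to key the vault by session.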

Related

Claude Managed Agents — the product/operator overview this post is the architecture of.
Managed Agents — Self-Hosted Sandboxes + MCP Tunnels — the “hands run on customer infra” + credential-isolation pattern this decoupling makes possible.
Managed Agents Production (Jess Ann + Lance Martin) — the Agent / Environment / Session / Events mental model; this post is the deeper “why the session is its own interface.”
Agent Platform Team — the “infrastructure was the wall” thesis this architecture answers.
How We Contain Claude — the runtime-sandbox/containment slice that complements this session + credential slice.
Claude Agent SDK — How the Agent Loop Works — the “stateless loop calling Claude” that this architecture virtualizes as the harness.
Temporal — Durable Agentic Loop — the roll-your-own durable-execution counterpart to the external-session-log + wake() pattern.
Running the Agentic Loop — In-Process, Durable, or Hosted — the cross-topic synthesis placing this hosted runtime against the in-process SDK loop and the durable Temporal loop.

Open Questions

No pricing, rate limits, or availability regions are disclosed in the post.
The primitive signatures (execute/provision/wake/…) are described conceptually in the engineering post; exact SDK/API names in the platform docs may differ. ^[inferred] The 2026-06-30 API additions (streaming deltas, per-session overrides, webhook events, reverse pagination, credential scoping) are named product capabilities, not primitive signatures — still no direct confirmation of exact method names for the six primitives themselves.
Rollout status of the 2026-06-30 API additions (GA vs. gradual rollout) is not stated in the announcement.
