Context Management in Claude Code (Anthropic Tutorial)

Source: raw/Context_Management_in_Claude_Code.md — Anthropic-produced YouTube tutorial eW3oTyfeWZ0 (~2 min).

Short Anthropic-authored explainer covering the primitives of context management in Claude Code: how the context window fills up, how /compact / /clear / /context differ, where CLAUDE.md fits, and the four levers (specificity, MCP discipline, skills, subagents) that reduce context pressure. Companion to the longer Anthropic’s Official Best Practices for Claude Code doc — same primitives, distilled to a 2-minute walkthrough.

Key Takeaways

Context window = Claude’s working memory. Every prompt, file read, tool call, and tool result is appended to the window. Finite space → optimization is “extremely important.”
Automatic compaction kicks in near the limit. Summarizes important details and removes unnecessary tool-call results. Caveat: “could potentially lose details in your previous conversation.”
Three commands, three intents.
- /compact — manual compaction; preserves a memory of prior work. Use when continuing the same feature but running low on space.
- /clear — wipes everything, starts from scratch. Use when switching to a new feature so prior conversation doesn’t bias new work.
- /context — inspect current context size + category breakdown + graphic. Diagnostic, no state change.
CLAUDE.md is the cross-session memory layer. Anything Claude should remember in other sessions belongs there. Avoids re-discovering things from scratch on every new session.
Specificity paradox. The irony of writing a shorter prompt: it actually takes up more context long-run, because Claude has to look around the codebase and do its own thinking to compensate. A clear sentence or two upfront is cheaper than ambiguity.
MCP server discipline. MCP servers load all of their tools into context by default. If a server is unrelated to the current project, turn it off. Reduces baseline context load.
Skills vs MCP for context. Skills work similarly to MCP servers but don’t put the entire thing into context — explicit endorsement of skills as a lower-context-cost alternative to MCP when both could deliver the same functionality.
Subagents run in parallel with a separate context window. For tasks where you only need the answer not the journey (e.g., “where are the authentication endpoints?”), a subagent does the work and returns just the summary to the main agent. Main context is preserved.

Summary heuristic (from the closing line)

“Use /compact to summarize long sessions and /clear to start fresh. To use your context window effectively, be specific with what you want.”

Why It Matters

Primitive-level Anthropic guidance. Short videos like this are the canonical specification for what each command does — Reddit chatter (“the 4 context tools”) is community paraphrasing of these primitives. Cite this video when teaching the commands.
The MCP-vs-Skills cost framing is novel. Most context comparisons focus on MCP-vs-MCP. The “skills are lower-context than MCP for the same task” framing is the operational call-to-action — prefer skills when a skill can do the job.
Sub-agent context isolation is the leverage primitive. For exploration questions (“where is X?”), spawning a subagent and returning a summary is the cheapest way to preserve main-thread context. Pattern works across Code, Cowork, and any agent surface with subagent support.

Try It

Run /context mid-session on your most active Claude Code project. Note which category dominates — usually file reads or MCP tool definitions.
If MCP definitions dominate, audit your MCP servers and turn off the ones unrelated to the current project. Compare /context before/after.
Add a one-line note in CLAUDE.md next time you re-explain something Claude should have remembered. Over a week, the file becomes self-documenting cross-session memory.
Next time you’d /compact after a feature is done, try /clear instead and see if the next feature lands cleaner.

Anthropic’s Official Best Practices for Claude Code — the long-form primary source on context management; covers 8 techniques where this video covers the core 4.
Claude Code What’s New 2026 W21 — Reddit chatter on “4 context tools” was paraphrasing these primitives.
Claude Code Subagents — separate context window per subagent, the architectural unlock referenced here.
Claude Code Skills Ecosystem — skills as lower-context alternative to MCP.
Essential MCP Servers — context cost of MCP servers, what to turn on/off.
Claude Code CLI Reference — /compact, /clear, /context slash commands.
Claude Code Memory Architecture Comparison — CLAUDE.md and persistent memory choices.

Community reproduction — 161-turn refactor → 52 turns via batching plugin (2026-05-18)

[Reddit signal — r/ClaudeCode 2026-05-18] reddit-1th04si (ChampionshipNo2815, 44 score / 75 comments): A Max-plan user traced a single rename refactor across a few files. Claude Code ran 161 turns to finish it — every read, every grep, every edit is its own API call, and each one re-ingests everything before it as input tokens. By turn 60, you’re paying context cost on the entire session history. Same task re-run with a Claude Code plugin that collapses search + read into one call and stacks all edits into one round-trip finished in 52 turns. No model change, no plan change — just fewer round-trips per task. Reproduction of the “specificity paradox” framing above: tool-call-amplifies-context isn’t an abstract concern, it dominates real-world session-limit consumption. Concrete falsifiable claim (161 → 52, ~68% reduction) on a small refactor; the plugin name itself isn’t called out in the post but the pattern (search+read fusion + edit batching) is the operational lever.

Cross-tool note — building a clean context window on the filesystem (Nate B Jones, 2026-05-30)

[YouTube — "How I AI / My Weekly Codex Experiments," rqVzTX8w_w0] Nate B Jones’s weekly-experiments take adds a filesystem-as-context-assembly technique he reports works in Codex but not Claude Code or Cowork (he tried the same workflow in both): tell the agent to scan your file system and copy the relevant files — described in natural language (“this is what it’s about, roughly when I made it”), not by filename — into a clean working folder, then open a fresh chat pointed only at that folder plus the task (detailed instructions dropped in as a file). The clean, scoped working set is what unlocks 30-50K-word document work, plus complex spreadsheet/coding work, across a folder structure. He roots it in Codex’s repo/sandbox heritage — to Codex, text files and code files are the same thing in a folder.

Paired prompting shift (applies to both Claude and Codex): prompting in 2026 has moved from prompt-engineering → “point at files + tell it what good looks like (an eval, or a few standard-defining questions)” → now define the shape of the task collaboratively first (a messy human-and-agent back-and-forth) and then execute agentically — he reports 5.5 doesn’t lose the thread when you flip from “let’s define this” to “now go do it.” A useful contrast to the Claude-Code-native levers above: same goal (a clean, deliberately-assembled context window) reached by having the agent stage files rather than by /compact / /clear discipline. ^[inferred — single-operator anecdote; the “doesn’t work in CC/Cowork” claim is his test, not a benchmarked result]

The “dumb zone” — where context degrades before the limit (Cole Medin, 2026-06)

[YouTube — "How to Build Effective Claude Code Agents in 2026," Nate Herk × Cole Medin] Operators report Claude Code’s attention degrades well before the 1M-token limit — a “dumb zone” where the needle-in-a-haystack / lost-in-the-middle effect amplifies and the model “acts dumb.” Cole Medin’s subjective per-model thresholds: Opus 4.8 ~250k tokens, Opus 4.7 ~200k, Sonnet 4.6 ~100–125k. Past those points results visibly degrade — which is also why running ~20 always-on MCP servers (each stuffing ~20k tokens of definitions) hurts: you enter the dumb zone before doing any work. The takeaway reinforces the levers above (/compact, /clear, MCP discipline, subagents) but adds a where does it start heuristic the article previously lacked: keep working context comfortably under the per-model threshold, not merely under the hard limit. ^[inferred — subjective operator estimates; two operators in the source converge near ~250k for Opus, but no benchmark backs the specific numbers]

Open Questions

The video doesn’t quantify the “MCP-vs-Skills” context cost differential — would be useful to have Anthropic publish a token-cost table comparing a 10-tool MCP server vs an equivalent skill bundle.
How does /compact interact with custom instructions in CLAUDE.md — does compaction preserve or summarize them?
Which Claude Code plugin collapses search+read+edit into batched round-trips? The reddit-1th04si post doesn’t name it. A strong candidate is Token Optimizer (alexgreensh), whose Read-Cache dedup + Delta Mode + AST Structure Maps compress exactly these re-read round-trips — though it is not confirmed to be the post’s referent. ^[inferred]

Jonathon's AI Wiki

Explorer

Context Management in Claude Code (Anthropic Tutorial)

Key Takeaways

Summary heuristic (from the closing line)

Why It Matters

Try It

Community reproduction — 161-turn refactor → 52 turns via batching plugin (2026-05-18)

Cross-tool note — building a clean context window on the filesystem (Nate B Jones, 2026-05-30)

The “dumb zone” — where context degrades before the limit (Cole Medin, 2026-06)

Open Questions

Graph View

Table of Contents

Backlinks

Jonathon's AI Wiki

Explorer

Context Management in Claude Code (Anthropic Tutorial)

Key Takeaways

Summary heuristic (from the closing line)

Why It Matters

Try It

Related

Community reproduction — 161-turn refactor → 52 turns via batching plugin (2026-05-18)

Cross-tool note — building a clean context window on the filesystem (Nate B Jones, 2026-05-30)

The “dumb zone” — where context degrades before the limit (Cole Medin, 2026-06)

Open Questions

Graph View

Table of Contents

Backlinks