Source: raw/Context_Management_in_Claude_Code.md — Anthropic-produced YouTube tutorial eW3oTyfeWZ0 (~2 min).

Short Anthropic-authored explainer covering the primitives of context management in Claude Code: how the context window fills up, how /compact, /clear, and /context differ, where CLAUDE.md fits, and the four levers (specificity, MCP discipline, skills, subagents) that reduce context pressure. Companion to the longer doc, Anthropic’s Official Best Practices for Claude Code: same primitives, distilled to a 2-minute walkthrough.

Key Takeaways

  • Context window = Claude’s working memory. Every prompt, file read, tool call, and tool result is appended to the window. Finite space → optimization is “extremely important.”
  • Automatic compaction kicks in near the limit. Summarizes important details and removes unnecessary tool-call results. Caveat: “could potentially lose details in your previous conversation.”
  • Three commands, three intents.
    • /compact — manual compaction; preserves a memory of prior work. Use when continuing the same feature but running low on space.
    • /clear — wipes everything, starts from scratch. Use when switching to a new feature so prior conversation doesn’t bias new work.
    • /context — inspect current context size + category breakdown + graphic. Diagnostic, no state change.
  • CLAUDE.md is the cross-session memory layer. Anything Claude should remember in other sessions belongs there. Avoids re-discovering things from scratch on every new session.
  • Specificity paradox. A shorter prompt can end up consuming more context over the session, because Claude has to look around the codebase and do its own thinking to fill in what the prompt left out. A clear sentence or two upfront is cheaper than ambiguity.
  • MCP server discipline. MCP servers load all of their tools into context by default. If a server is unrelated to the current project, turn it off. Reduces baseline context load.
  • Skills vs MCP for context. Skills work similarly to MCP servers but don’t load their entire contents into context. The video explicitly endorses skills as the lower-context-cost alternative to MCP when both could deliver the same functionality.
  • Subagents run in parallel, each with a separate context window. For tasks where you only need the answer, not the journey (e.g., “where are the authentication endpoints?”), a subagent does the work and returns just the summary to the main agent. Main context is preserved.
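
The MCP and skills bullets above can be sketched as a toy model of baseline context load. This is only an illustration under stated assumptions: the capability names and token counts are invented, and the lazy-loading behavior attributed to skills here (a short summary upfront, the full body only when used) is an assumption consistent with, but not spelled out in, the video.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capability:
    """One tool or skill. All token counts are hypothetical, not measured."""
    name: str
    full_tokens: int      # assumed size of the complete definition or instructions
    summary_tokens: int   # assumed size of a one-line description


def mcp_style_load(capabilities):
    """Per the video: an MCP server puts all of its tools into context by default."""
    return sum(c.full_tokens for c in capabilities)


def skill_style_load(capabilities, used_names):
    """Assumed model: summaries are always present, full bodies only when used."""
    summaries = sum(c.summary_tokens for c in capabilities)
    bodies = sum(c.full_tokens for c in capabilities if c.name in used_names)
    return summaries + bodies


# Hypothetical ten-capability server; only one capability is used this session.
capabilities = [Capability(f"tool_{i}", full_tokens=600, summary_tokens=40) for i in range(10)]

mcp_cost = mcp_style_load(capabilities)                   # 10 * 600
skill_cost = skill_style_load(capabilities, {"tool_3"})   # 10 * 40 + 600
print(f"MCP-style baseline: {mcp_cost} tokens; skill-style: {skill_cost} tokens")
```

Under these made-up numbers the gap comes entirely from capabilities that are never called, which is also why turning off unrelated MCP servers helps: it removes definitions that would otherwise sit in context unused.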

Summary heuristic (from the closing line)

“Use /compact to summarize long sessions and /clear to start fresh. To use your context window effectively, be specific with what you want.”

Why It Matters

  • Primitive-level Anthropic guidance. Short first-party videos like this are the closest thing to a canonical reference for what each command does; Reddit chatter (“the 4 context tools”) is community paraphrasing of these primitives. Cite this video when teaching the commands.
  • The MCP-vs-Skills cost framing is novel. Most context comparisons focus on MCP-vs-MCP. The “skills are lower-context than MCP for the same task” framing is the operational call-to-action — prefer skills when a skill can do the job.
  • Subagent context isolation is the leverage primitive. For exploration questions (“where is X?”), spawning a subagent and returning only a summary is the cheapest way to preserve main-thread context. The pattern works across Claude Code, Cowork, and any agent surface with subagent support.
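
The isolation idea in the last bullet can be sketched with a toy context window. This is a minimal model, not how Claude Code is implemented: the `ContextWindow` class, the four-characters-per-token estimate, and the sample trace are all hypothetical.

```python
def estimate_tokens(text):
    """Crude assumption: roughly four characters per token."""
    return max(1, len(text) // 4)


class ContextWindow:
    """Toy stand-in for a context window: an append-only list of entries."""

    def __init__(self):
        self.entries = []

    def append(self, text):
        self.entries.append(text)

    def tokens(self):
        return sum(estimate_tokens(e) for e in self.entries)


def explore_inline(main, tool_results, answer):
    """The main agent searches itself, so every tool result lands in its window."""
    for result in tool_results:
        main.append(result)
    main.append(answer)


def explore_via_subagent(main, tool_results, answer):
    """The subagent absorbs the tool results; only the summary reaches the main window."""
    sub = ContextWindow()
    for result in tool_results:
        sub.append(result)
    main.append(answer)
    return sub


# Hypothetical exploration trace for "where are the authentication endpoints?"
tool_results = ["grep output " * 200, "file contents " * 500, "more file contents " * 500]
answer = "Auth endpoints live in a hypothetical routes/auth module."

inline_main = ContextWindow()
explore_inline(inline_main, tool_results, answer)

delegated_main = ContextWindow()
sub = explore_via_subagent(delegated_main, tool_results, answer)

print(inline_main.tokens(), delegated_main.tokens(), sub.tokens())
```

The total work is the same in both cases; what changes is where the exploration trace ends up. In the delegated case it stays in the subagent's window and is discarded with it.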

Try It

  1. Run /context mid-session on your most active Claude Code project. Note which category dominates — usually file reads or MCP tool definitions.
  2. If MCP definitions dominate, audit your MCP servers and turn off the ones unrelated to the current project. Compare /context before/after.
  3. Add a one-line note in CLAUDE.md next time you re-explain something Claude should have remembered. Over a week, the file becomes self-documenting cross-session memory.
  4. Next time you’d /compact after a feature is done, try /clear instead and see if the next feature lands cleaner.
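
Step 3 can be made a one-liner with a small helper. The sketch below is an assumption-laden convenience, not part of Claude Code: the function name `append_memory_note` and the dated-bullet layout are hypothetical, and the demo writes to a temporary directory rather than a real project.

```python
import tempfile
from datetime import date
from pathlib import Path


def append_memory_note(claude_md, note, today=None):
    """Append one dated bullet to a CLAUDE.md file, creating the file if needed.

    The "- YYYY-MM-DD: note" layout is an illustrative convention, not a required format.
    """
    stamp = (today or date.today()).isoformat()
    line = f"- {stamp}: {note.strip()}\n"
    existing = claude_md.read_text(encoding="utf-8") if claude_md.exists() else ""
    if existing and not existing.endswith("\n"):
        existing += "\n"  # keep the new bullet on its own line
    claude_md.write_text(existing + line, encoding="utf-8")
    return line


# Demo against a throwaway directory so no real project file is touched.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "CLAUDE.md"
    path.write_text("# Project notes", encoding="utf-8")  # no trailing newline on purpose
    append_memory_note(path, "Run tests with the project's own test script.", date(2026, 5, 18))
    demo_contents = path.read_text(encoding="utf-8")

print(demo_contents)
```

Keeping each note to a single line matters here: CLAUDE.md is itself loaded into context, so the file should stay short enough that it does not become its own source of context pressure.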

Community reproduction — 161-turn refactor → 52 turns via batching plugin (2026-05-18)

[Reddit signal — r/ClaudeCode 2026-05-18] reddit-1th04si (ChampionshipNo2815, 44 score / 75 comments): A Max-plan user traced a single rename refactor across a few files. Claude Code took 161 turns to finish it, because every read, every grep, and every edit is its own API call, and each call re-ingests everything before it as input tokens. By turn 60, each call pays the context cost of the entire session history. The same task, re-run with a Claude Code plugin that collapses search + read into one call and stacks all edits into one round-trip, finished in 52 turns. No model change, no plan change, just fewer round-trips per task.

This corroborates the “specificity paradox” framing above: tool calls amplifying context is not an abstract concern, and in this report it dominated session-limit consumption. The claim is concrete and falsifiable (161 → 52 turns, ~68% fewer) on a small refactor. The post does not name the plugin, but the pattern (search+read fusion plus edit batching) is the operational lever.
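
The arithmetic behind "each call re-ingests everything before it" can be sketched as follows. Only the turn counts (161 and 52) come from the post; the base and added token figures are hypothetical, and the model assumes the same total material is spread evenly across turns, with no prompt caching or compaction.

```python
def cumulative_input_tokens(turns, base_tokens, total_added_tokens):
    """Sum the context size sent as input on every turn.

    Assumptions: the session adds total_added_tokens of new material overall,
    spread evenly across turns, and each turn resends the whole history so far.
    """
    per_turn = total_added_tokens / turns
    context = float(base_tokens)
    total = 0.0
    for _ in range(turns):
        total += context      # this turn's input is everything accumulated so far
        context += per_turn   # the turn's own output and tool results join the history
    return round(total)


BASE = 10_000    # hypothetical system prompt + tool definitions
ADDED = 80_000   # hypothetical total tokens of reads, greps, and edits

unbatched = cumulative_input_tokens(161, BASE, ADDED)
batched = cumulative_input_tokens(52, BASE, ADDED)
print(f"161 turns: {unbatched:,} input tokens; 52 turns: {batched:,}")
print(f"turn reduction: {1 - 52 / 161:.0%}")
```

In this model the same amount of work costs fewer input tokens simply because the history is resent fewer times. Caching and compaction, which the sketch ignores, would change the absolute numbers but not the direction of the effect.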

Open Questions

  • The video doesn’t quantify the “MCP-vs-Skills” context cost differential — would be useful to have Anthropic publish a token-cost table comparing a 10-tool MCP server vs an equivalent skill bundle.
  • How does /compact interact with custom instructions in CLAUDE.md — does compaction preserve or summarize them?
  • Which Claude Code plugin collapses search+read+edit into batched round-trips? The reddit-1th04si post doesn’t name it. Worth surfacing if a specific plugin matches this description (candidate: a token-saver / context-saver style plugin).