Source: ai-research/anthropic-best-practices-claude-code-2026-05-17.md — Anthropic’s official “Best practices for Claude Code” doc at code.claude.com/docs/en/best-practices. Primary source for everything in this article. Plus refresh: raw/x-elias-al-cal-rueb-code-with-claude-best-practices.md — X-signal anchor (Cal Rueb at Code with Claude SF, May 22, 2025 — the talk that produced the published doc).

The canonical Anthropic guide on operating Claude Code effectively — the engineering-team-distilled patterns underneath the Karpathy CLAUDE.md techniques this vault tracks. Anchored by one constraint: Claude’s context window fills up fast, and performance degrades as it fills. Every pattern in the doc is downstream of that single observation. Captures the full inventory of context-management tools (/clear, /compact, /compact <instructions>, /rewind directional summarization, CLAUDE.md compaction directives, /btw) that the Reddit-mediated Week 21 summary surfaced from inside the broader best-practices framing.

Key Takeaways

The single constraint everything else follows from

  • Context window = the most important resource to manage. A single debugging session or codebase exploration consumes tens of thousands of tokens.
  • LLM performance degrades as context fills — Claude starts “forgetting” earlier instructions and making more mistakes near the limit.
  • Anthropic frames this as the root cause behind most failure patterns; every other best practice in the doc is a context-management lever.

The single highest-leverage move

“Include tests, screenshots, or expected outputs so Claude can check itself. This is the single highest-leverage thing you can do.

Verification options: test suite, linter, Bash command that checks output, screenshot comparison via Claude in Chrome, or pasted expected output. Without it, the user becomes the only feedback loop and every mistake demands attention.

The 4-phase workflow (explore → plan → code → commit)

PhaseModeWhat happens
ExplorePlan modeClaude reads files + answers questions, no changes made
PlanPlan mode (continued)Detailed implementation plan; Ctrl+G opens plan in editor for direct edits
ImplementDefault modeClaude codes against the plan; verifies as it goes
CommitDefault modeDescriptive message + PR

Skip the plan when: scope is clear, fix is small (typo, log line, rename). Plan when: uncertain about approach, change touches multiple files, unfamiliar code. If you can describe the diff in one sentence, skip the plan.

CLAUDE.md is loaded every session

Inclusion/exclusion table from the official doc:

✅ Include❌ Exclude
Bash commands Claude can’t guessAnything Claude can figure out by reading code
Code style rules that differ from defaultsStandard language conventions Claude already knows
Testing instructions + preferred test runnersDetailed API documentation (link instead)
Repository etiquette (branch naming, PR conventions)Information that changes frequently
Project-specific architectural decisionsLong explanations or tutorials
Developer environment quirks (required env vars)File-by-file descriptions of the codebase
Common gotchas + non-obvious behaviorsSelf-evident practices like “write clean code”

Tuning rules (Anthropic’s own, verbatim):

  • “If Claude keeps doing something you don’t want despite having a rule against it, the file is probably too long and the rule is getting lost.”
  • “If Claude asks you questions that are answered in CLAUDE.md, the phrasing might be ambiguous.”
  • “Treat CLAUDE.md like code: review it when things go wrong, prune it regularly.”

Imports via @path/to/import syntax; placement options span home folder, project root, child directories (loaded on-demand), parent directories (monorepo pattern).

The complete context-tool inventory (8 tools, not 2)

Reddit 1tfjja8 surfaced 4 tools between /clear and /compact. The official doc actually documents 8 distinct context-management surfaces:

ToolWhat it doesWhen to reach for it
/clearReset context entirelyBetween unrelated tasks. Admission you waited too long if reached mid-flow.
Auto compactionTriggers near context limit; Claude summarizes preserving code patterns + file states + key decisionsDefault safety net.
/compact <instructions>Direct the summary explicitly (/compact Focus on the API changes)When you know what the next step needs and the auto-summary might miss it.
/rewindSummarize from hereCondenses messages forward from checkpoint; keeps earlier context intactKeep early architectural decisions; compress messy debugging after them.
/rewindSummarize up to hereCondenses messages before checkpoint; keeps recent ones in fullDrop the setup noise; keep recent precise working state.
CLAUDE.md compaction directivesLines like “When compacting, always preserve the full list of modified files and any test commands”Set invariants once; bake them into every compaction.
/btwSide questions appear in a dismissible overlay and never enter conversation historyQuick lookups (“what does this flag do?”) you don’t want polluting context.
SubagentsRun in separate context windows; return summaries to mainHeavy investigation. The single most powerful context-saving tool in the doc.

Decision rule from the doc: /clear is admission you waited too long. The earlier instrument you reach for, the cheaper your session stays.”

Communication patterns

Ask Claude questions you’d ask a senior engineer. Concrete examples from the doc:

  • How does logging work?
  • What edge cases does CustomerOnboardingFlowImpl handle?
  • Why does this code call foo() instead of bar() on line 333?

Let Claude interview you for larger features:

“I want to build [brief description]. Interview me in detail using the AskUserQuestion tool. Ask about technical implementation, UI/UX, edge cases, concerns, and tradeoffs. Don’t ask obvious questions, dig into the hard parts I might not have considered. Keep interviewing until we’ve covered everything, then write a complete spec to SPEC.md.”

Then start a fresh session to execute against the spec.

Permission modes — three ways to reduce interruptions

ModeWhat it does
Auto modeClassifier model reviews commands, blocks only what looks risky (scope escalation, unknown infrastructure, hostile-content-driven actions)
Permission allowlistsPermit specific tools (npm run lint, git commit) without per-call approval
SandboxingOS-level isolation restricting filesystem + network access

Pairs with /goal (see goal walkthrough) for uninterrupted long-running sessions — auto mode is the standard companion to /goal per the doc.

Extension points (5 distinct surfaces)

SurfaceUse when
CLI tools (gh, aws, gcloud, sentry-cli)Most context-efficient. Anthropic’s recommended default.
MCP serversExternal services without a CLI (Notion, Figma, databases)
HooksActions that must happen every time with zero exceptions — deterministic, unlike advisory CLAUDE.md
Skills (SKILL.md in .claude/skills/)Domain knowledge + reusable workflows; loaded on demand, not bloating every conversation
Subagents (.claude/agents/)Isolated tasks with their own context + scoped toolset; delegated work
PluginsBundle skills + hooks + subagents + MCP into a single installable unit

Skill frontmatter pattern with disable-model-invocation: true — for workflows with side effects (deploy, send message) that should be triggered manually only.

Parallel-session scaling

Four parallelization paths:

  • Worktrees — separate CLI sessions, isolated git checkouts
  • Desktop app — visual multi-session, each in its own worktree
  • Claude Code on the web — Anthropic-managed cloud VMs (isolated, no local resource cost)
  • Agent teams — automated coordination with shared tasks + messaging + team lead

Writer/Reviewer pattern — fresh context in the Reviewer Claude removes bias toward code it just wrote. Anthropic explicitly recommends this for high-quality reviews.

Fan-out for batch work

for file in $(cat files.txt); do
  claude -p "Migrate $file from React to Vue. Return OK or FAIL." \
    --allowedTools "Edit,Bash(git commit *)"
done

Test on a few files, then run at scale. --allowedTools matters for unattended operations — restricts what Claude can do when there’s no human in the loop.

The 5 named failure patterns (from the doc)

  1. The kitchen sink session — Context full of unrelated work. Fix: /clear between unrelated tasks.
  2. Correcting over and over — Context polluted with failed approaches. Fix: After two failed corrections, /clear + better initial prompt incorporating what you learned.
  3. The over-specified CLAUDE.md — Important rules lost in noise. Fix: Ruthlessly prune. If Claude does something correctly without the instruction, delete it.
  4. The trust-then-verify gap — Plausible-looking output that doesn’t handle edge cases. Fix: Always provide verification (tests, scripts, screenshots). If you can’t verify it, don’t ship it.
  5. The infinite exploration — Unscoped “investigate” → Claude reads hundreds of files. Fix: Scope narrowly or use subagents.

Develop your intuition (final section)

The patterns are starting points, not rules. Sometimes you should:

  • Let context accumulate (deep in one complex problem; history is valuable)
  • Skip planning (exploratory tasks)
  • Use a vague prompt (see how Claude interprets before constraining)

Anthropic frames this as intuition no guide can capture — pay attention to what works, ask why when Claude struggles.

Cross-references inside the karpathy wiki

This doc is the upstream primary source for several existing articles in this vault:

Community signals

[X signal — @iam_elias1 2026-05-18] Origin attribution: Cal Rueb at Code with Claude SF, May 22, 2025

Elias Al’s promotional X post (url, 2,777 views at capture) names the talk that produced the published doc this article is built on:

  • Speaker: Cal Rueb, Member of Technical Staff at Anthropic
  • Event: Code with Claude, San Francisco
  • Date: May 22, 2025
  • Framing claimed in the post: “everything Anthropic’s own engineers do internally that most users never figure out on their own”

The X post’s section outline of the talk matches the structure of the published doc this article summarizes:

  1. CLAUDE.md setup so Claude always knows your codebase (see above)
  2. Why planning before coding matters (see above)
  3. Context management so performance doesn’t degrade (see above)
  4. Parallel session workflow Anthropic engineers use daily (see above)
  5. Security, permissions, and staying in control (see above)

Color anecdote from the post: Cal described becoming “completely addicted” to Claude Code after a weekend install, building a note-taking app, and carrying his laptop everywhere to watch it work.

Quantitative claim — needs verification

The X post asserts: “Anthropic’s internal testing found that unguided attempts succeed roughly 33% of the time. With proper planning and structured workflows, that number climbs dramatically.” ^[inferred]

This is a two-hop citation chain (X post → talk → Anthropic internal data). The number does not appear in the published best-practices doc that this article cites. Surface it cautiously when relying on it; corroborate against the actual talk recording before treating as a primary-source baseline.

Open Questions

  • /btw documentation depth — the doc cites it but only at the summary level. The dedicated /en/interactive-mode#side-questions-with-%2Fbtw page is referenced but not pulled. Future ai-research pass candidate.
  • /en/checkpointing Restore vs. summarize semantics — the doc points to this page for the full semantics of Rewind directional summarization. Worth pulling as a separate ai-research file.
  • /en/sandboxing reference — sandboxing mode is referenced as a permission-mode option; the full sandboxing page wasn’t captured.
  • Versioning + last-updated date — Anthropic doesn’t surface a “last updated” stamp on the public doc; we don’t know how stale this snapshot is. Re-fetch quarterly + diff would catch drift.
  • Cal Rueb talk recording — the X post references a video (“Watch the full talk here”) with the talk recording. Pulling the recording (or its transcript) as a separate ai-research source would corroborate the 33% stat and capture content present in the talk but not in the written doc.

Try It

  1. Read the source once end-to-end at code.claude.com/docs/en/best-practices. Cheaper than re-discovering each pattern from scratch.
  2. Audit your CLAUDE.md against the ✅/❌ table above. Anthropic’s pruning rule: “For each line, ask: Would removing this cause Claude to make mistakes? If not, cut it.”
  3. Learn the full 8-tool context inventory — most operators only know /clear + /compact. Add /btw and the two /rewind directions to your daily moves.
  4. Set CLAUDE.md compaction directives. Add lines like “When compacting, always preserve the full list of modified files and any test commands” so compaction never strips load-bearing context.
  5. Add verification to your default workflow — tests/screenshots/expected outputs in the initial prompt. Anthropic’s framing: “the single highest-leverage thing you can do.”
  6. Use subagents for any investigation that would touch 5+ files — keeps main context clean.
  7. Try the Writer/Reviewer pattern the next time you ship something tricky. Two sessions, two worktrees.
  8. Use Claude to interview you on the next bigger feature — start with a minimal prompt + the interview meta-prompt above; let it produce a spec; start a fresh session to execute it.
  9. Skip the plan when scope is one sentence. Anthropic explicitly calls out the over-use of plan mode for trivial fixes.
  10. For batch migrations, use the fan-out template above with --allowedTools scoping. Test on 2-3 files before running at scale.