Source: multica-ai-andrej-karpathy-skills-readme-2026-05-22.md — full README.md + CLAUDE.md verbatim, fetched 2026-05-22. Repo: github.com/multica-ai/andrej-karpathy-skills. Stars: 146,000 (14,900 forks, 794 watchers). License: MIT. Language: Markdown. Author: @jiayuan_jy. Karpathy source post: x.com/karpathy/status/2015883857489522876 (~7.7M views).

A single CLAUDE.md file distilling Andrej Karpathy’s 2026 X post on LLM coding pitfalls into four behavioral principles for Claude Code: Think Before Coding, Simplicity First, Surgical Changes, and Goal-Driven Execution. At 146k stars at ingest, it is one of the highest-star single-file artifacts in the Claude Code ecosystem. It is installable as a Claude Code plugin (/plugin marketplace add forrestchang/andrej-karpathy-skills) or can be appended directly to any project’s CLAUDE.md via curl. The maintainer and content are the same as forrestchang/andrej-karpathy-skills: multica-ai is the org name (also home to the Multica coding-agent platform), and forrestchang is the maintainer’s personal handle, which hosts the plugin marketplace. The repo also ships an equivalent Cursor project rule at .cursor/rules/karpathy-guidelines.mdc so the same guidelines apply in both tools.

The four principles

Each principle pairs a one-line summary with the specific LLM failure mode it addresses:

1. Think Before Coding

Don’t assume. Don’t hide confusion. Surface tradeoffs.

  • State assumptions explicitly. If uncertain, ask.
  • If multiple interpretations exist, present them — don’t pick silently.
  • If a simpler approach exists, say so. Push back when warranted.
  • If something is unclear, stop. Name what’s confusing. Ask.

Addresses: wrong assumptions, hidden confusion, missing tradeoffs. From Karpathy: “The models make wrong assumptions on your behalf and just run along with them without checking. They don’t manage their confusion, don’t seek clarifications, don’t surface inconsistencies, don’t present tradeoffs, don’t push back when they should.”

2. Simplicity First

Minimum code that solves the problem. Nothing speculative.

  • No features beyond what was asked.
  • No abstractions for single-use code.
  • No “flexibility” or “configurability” that wasn’t requested.
  • No error handling for impossible scenarios.
  • If you write 200 lines and it could be 50, rewrite it.

Addresses: overcomplication, bloated abstractions. From Karpathy: “They really like to overcomplicate code and APIs, bloat abstractions, don’t clean up dead code… implement a bloated construction over 1000 lines when 100 would do.” The test: Would a senior engineer say this is overcomplicated?
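To make the “no abstractions for single-use code” rule concrete, here is a hypothetical illustration. It is not taken from the repo, and every name in it is invented for the example. Both versions compute the same discounted price; the first wraps a single use in a config object and a strategy class, the second is what the principle asks for.

```python
from dataclasses import dataclass

# Hypothetical example: a 10% discount that is applied in exactly one place.
DISCOUNT_RATE = 0.10


# Overcomplicated: a config object plus a strategy class for one call site.
@dataclass
class DiscountConfig:
    rate: float = DISCOUNT_RATE


class DiscountStrategy:
    def __init__(self, config: DiscountConfig) -> None:
        self.config = config

    def apply(self, price: float) -> float:
        return price * (1 - self.config.rate)


def discounted_price_bloated(price: float) -> float:
    return DiscountStrategy(DiscountConfig()).apply(price)


# Simplicity First: the same behavior as one function.
def discounted_price(price: float) -> float:
    return price * (1 - DISCOUNT_RATE)
```

The two functions return identical results; the difference is only how much code a reviewer has to read to confirm that.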

3. Surgical Changes

Touch only what you must. Clean up only your own mess.

  • Don’t “improve” adjacent code, comments, or formatting.
  • Don’t refactor things that aren’t broken.
  • Match existing style, even if you’d do it differently.
  • If you notice unrelated dead code, mention it — don’t delete it.
  • Remove imports/variables/functions that your changes made unused; leave pre-existing dead code alone.

Addresses: orthogonal edits, touching code you shouldn’t. From Karpathy: “They still sometimes change/remove comments and code they don’t sufficiently understand as side effects, even if orthogonal to the task.” The test: Every changed line should trace directly to the user’s request.

4. Goal-Driven Execution

Define success criteria. Loop until verified.

Transform imperative tasks into verifiable goals:

| Instead of… | Transform to… |
| --- | --- |
| “Add validation” | “Write tests for invalid inputs, then make them pass” |
| “Fix the bug” | “Write a test that reproduces it, then make it pass” |
| “Refactor X” | “Ensure tests pass before and after” |

For multi-step tasks, state a brief plan in 1. [Step] → verify: [check] format. Addresses: Karpathy’s load-bearing observation that “LLMs are exceptionally good at looping until they meet specific goals… Don’t tell it what to do, give it success criteria and watch it go.” Strong criteria let the agent loop independently; weak criteria (“make it work”) require constant clarification.
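The plan format and the “loop until verified” idea can be sketched in a few lines. This is a minimal illustration under stated assumptions, not code from the repo: `Step`, `format_plan`, and `run_until_verified` are hypothetical names, and the “work” here is a toy counter standing in for an agent’s edit-and-retest cycle.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Step:
    description: str           # what to do, e.g. "Add input validation"
    check_name: str            # human-readable success criterion
    check: Callable[[], bool]  # returns True once the criterion is met


def format_plan(steps: list[Step]) -> str:
    """Render steps in the '1. [Step] → verify: [check]' format."""
    return "\n".join(
        f"{i}. {s.description} → verify: {s.check_name}"
        for i, s in enumerate(steps, start=1)
    )


def run_until_verified(
    step: Step, attempt: Callable[[], None], max_attempts: int = 5
) -> bool:
    """Call attempt() until step.check() passes or attempts run out."""
    for _ in range(max_attempts):
        if step.check():
            return True
        attempt()
    return step.check()


# Toy usage: the criterion is "count >= 3"; each attempt bumps the counter.
state = {"count": 0}
step = Step("Raise count", "count >= 3", lambda: state["count"] >= 3)


def bump() -> None:
    state["count"] += 1
```

The design point is that the check is explicit and machine-evaluable. A criterion like “make it work” cannot be expressed as `check`, which is the distinction between strong and weak criteria drawn above.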

Install routes

Option A — Claude Code plugin (recommended). From inside Claude Code:

/plugin marketplace add forrestchang/andrej-karpathy-skills
/plugin install andrej-karpathy-skills@karpathy-skills

Installs the guidelines as a Claude Code plugin available across all projects.

Option B — Per-project CLAUDE.md curl.

# New project:
curl -o CLAUDE.md https://raw.githubusercontent.com/forrestchang/andrej-karpathy-skills/main/CLAUDE.md
 
# Existing project (append):
echo "" >> CLAUDE.md
curl https://raw.githubusercontent.com/forrestchang/andrej-karpathy-skills/main/CLAUDE.md >> CLAUDE.md

Option C — Cursor users. The repo ships .cursor/rules/karpathy-guidelines.mdc so the same guidelines apply when you open the project in Cursor. See CURSOR.md in the repo for setup.

Both install URLs point at forrestchang/andrej-karpathy-skills, not multica-ai/. The two repos host identical content; multica-ai is the org, forrestchang is the maintainer’s personal handle that serves the plugin marketplace.
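The append route in Option B adds the guidelines again every time it is run. A guarded version might look like the sketch below. It assumes the guidelines text has already been downloaded (for example with the curl command above), and the `marker` argument is a hypothetical choice: any line that reliably appears in the downloaded text would do.

```python
from pathlib import Path


def append_guidelines(claude_md: Path, guidelines: str, marker: str) -> bool:
    """Append guidelines to claude_md unless marker is already present.

    Returns True if the file was written, False if it was left untouched.
    """
    existing = claude_md.read_text(encoding="utf-8") if claude_md.exists() else ""
    if marker in existing:
        return False  # already installed; do not duplicate
    # Keep existing project rules first, separated by one blank line.
    prefix = existing.rstrip("\n") + "\n\n" if existing.strip() else ""
    claude_md.write_text(prefix + guidelines.rstrip("\n") + "\n", encoding="utf-8")
    return True
```

Calling it twice with the same arguments modifies the file only once, which makes it safe to keep in a project setup script.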

Key Takeaways

  • It’s a 250-line CLAUDE.md, not a framework. Total artifact: four principles, ~250 lines of markdown. The 146k stars reflect the install-effort-to-value ratio (one curl command → measurable behavior change) more than technical sophistication.
  • Each principle is anchored to a specific Karpathy quote. The README cross-walks principle ↔ Karpathy observation explicitly. That traceability is load-bearing — it’s why the artifact reads as “Karpathy’s checklist” rather than “another opinionated CLAUDE.md.” Karpathy himself has not endorsed it; it’s a community distillation.
  • Same content as forrestchang/andrej-karpathy-skills. Both URLs are referenced from the same README. multica-ai is the org that also hosts Multica (open-source coding-agent platform); forrestchang is the maintainer’s personal handle and the plugin-marketplace host. There is no fork divergence — they are the same project under two names.
  • Goal-Driven Execution is the principle with the strongest empirical support. Cited Karpathy quote: “LLMs are exceptionally good at looping until they meet specific goals… give it success criteria and watch it go.” Matches the wiki’s coverage of [[claude-ai/claude-code-goal-command-walkthrough|the /goal command pattern]] (long-running goal-converged sessions across 40-80 tasks) and the 1. [Step] → verify: [check] plan-format rule already in the karpathy/CLAUDE.md root used by this repo.
  • The “Surgical Changes” principle is a direct counter to the wiki’s existing Anti-AI Slop Guide. Both fight the same failure mode (drive-by refactoring, generic improvements, scope creep), from different angles — Anti-AI Slop is content/copy-shaped, Surgical Changes is code-edit-shaped.
  • Worked-pattern verification from Dream Labs walkthrough (2026-05-20). With the file installed, a lead-magnet rebuild produced ~100 lines vs 212 lines without; “make the button orange” changed only the button vs “turned the whole site green” in the no-file control; “fix the bug” transformed to “write a test that reproduces it, make it pass” and looped to verified. The behavior change is real, not aspirational. See Karpathy’s LLM-Wiki Techniques for the full walkthrough citation.
  • Endorsement signal (uncorroborated, but consistent with content): Boris Cherny (Claude Code creator) reportedly replied to the source Karpathy tweet that “all these points resonate” and that he’s looking to fix them in Claude Code directly; Elon Musk reportedly replied “sums up the zeitgeist.” Reported via Dream Labs walkthrough; primary-source X verification not yet pulled.
  • Tradeoff: caution over speed. README’s explicit framing — these guidelines bias toward correctness on non-trivial work. For trivial tasks (typo fixes, one-liners), use judgment. Anti-pattern: applying the full rigor to a one-character bug fix. Aligned with: the Anthropic Best Practices doc verification primitive (“verification is the single highest-leverage thing you can do”).
  • The 146k star count is real, not a star-puller glitch. The 2026-05-19 inbox-refresh stub flagged “suspiciously inflated star counts, likely puller glitch” and skip-triaged the repo. Verified at fetch on 2026-05-22: 146,000 stars (up from 137,343 on 2026-05-19, which is plausible growth), 14,900 forks, 794 watchers. Real artifact, real adoption. Correction: the original skip-triage was wrong; the puller was accurate.
  • The “AI developer” identity converges with other wiki sources. This artifact, the 13-rung ladder, Zero to Claude Code, and the 2026 AIOS pattern are independently arriving at the same operator profile — someone who thinks clearly, communicates well with AI, and turns ideas into real products. multica-ai’s CLAUDE.md is the behavioral training set for that profile; the other resources are the learning paths into it.

How this differs from existing wiki coverage

| Wiki article | Surface | What it covers | This article’s relationship |
| --- | --- | --- | --- |
| Karpathy’s LLM-Wiki Techniques for Claude Code | The LLM-wiki pattern (this vault’s foundation) | How to build + query a Karpathy-pattern second brain in Claude Code | Different concept; previously mentioned forrestchang/andrej-karpathy-skills in passing in its Recent Signals; this article is the dedicated treatment |
| Anthropic’s Best Practices for Claude Code | Anthropic’s official doc | 8 context-management techniques, verification primitive, ASCII flow + screenshot rules | Sibling: Anthropic’s primary-source rules; multica-ai’s CLAUDE.md is a community-distilled behavioral overlay that compresses 4 of those primitives into one file |
| Anti-AI Slop Guide | Content/copy-shaped | Resisting generic LLM filler in summaries | Related: same “don’t add what wasn’t asked for” thesis, applied to copy vs code |
| goal Command Walkthrough | Anthropic’s /goal primitive | Long-running goal-converged sessions, 40-80 tasks, completion conditions | Companion: this article’s principle 4 is the prompt-side of /goal; /goal is the runtime-side of principle 4 |
| Anthropic Engineers’ Four Skill Rules | Internal Anthropic blog | Skill-creation rules (prompt skills not Claude, etc.) | Different pattern; multica-ai’s CLAUDE.md is behavioral guidelines, not skill construction rules |
| The CLAUDE.md File — Anthropic Primer | Anthropic primer on CLAUDE.md mechanics | /init, three-level hierarchy, @<path> syntax | Building block: multica-ai’s CLAUDE.md is what to put in your CLAUDE.md after the primer teaches you how |

Try It

  1. Smallest install (one Claude Code session): open a project in Claude Code, run /plugin marketplace add forrestchang/andrej-karpathy-skills, then /plugin install andrej-karpathy-skills@karpathy-skills. Make a non-trivial change in your next session and watch for the four behaviors (explicit assumptions, minimal scope, surgical edits, success criteria stated before execution).
  2. Per-project install (no marketplace): run curl -o CLAUDE.md https://raw.githubusercontent.com/forrestchang/andrej-karpathy-skills/main/CLAUDE.md from inside the project root. Note that -o overwrites an existing CLAUDE.md; for a project that already has one, use the append form under Install routes. Then edit the file to add your project-specific rules below the four principles; the README explicitly recommends this customization pattern.
  3. For Cursor users: clone the repo (or just copy .cursor/rules/karpathy-guidelines.mdc) into your project’s .cursor/rules/ directory. Same content, IDE-appropriate packaging.
  4. A/B-test the behavior change. Pick a non-trivial task (e.g., “add an endpoint to this Express app”) and run it twice — once in a project without the file, once with it installed. Compare diff size, scope of changes, and whether the agent asked clarifying questions before implementation. The Dream Labs AI 2026-05-20 walkthrough provides a worked example showing 212 → ~100 lines of code on the same task.
  5. Pair with the wiki’s existing [[claude-ai/karpathy-techniques-for-claude-code|LLM-Wiki Techniques]] article. This article + the LLM-wiki techniques give you the two halves of Karpathy’s Claude Code stack: behavioral-rules-for-the-agent (this) + knowledge-architecture-for-the-agent (the wiki pattern).
  6. Capture before/after metrics. The README’s “How to Know It’s Working” section names four observable signals: fewer unnecessary diff changes, fewer overcomplication rewrites, clarifying questions before implementation, clean minimal PRs. Track these for a week to verify the install paid off.
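For the A/B test in step 4, “diff size” can be measured mechanically rather than eyeballed. The sketch below counts added plus removed lines between two versions of a file using Python’s standard difflib. The function name and the CSS strings are hypothetical, loosely modeled on the “make the button orange” example from Key Takeaways.

```python
import difflib


def changed_line_count(before: str, after: str) -> int:
    """Count removed plus added lines between two versions of a file."""
    a, b = before.splitlines(), after.splitlines()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    changed = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # Lines dropped from `before` plus lines introduced in `after`.
            changed += (i2 - i1) + (j2 - j1)
    return changed


# Hypothetical inputs: the request was "make the button orange".
original = "button { color: blue; }\nbody { color: black; }"
surgical = "button { color: orange; }\nbody { color: black; }"
sprawling = "button { color: orange; }\nbody { color: green; }"
```

Here the surgical edit scores 2 (one line removed, one added) and the sprawling edit scores 4. A raw count is only a proxy for scope, so it is best read alongside the other signals in step 6.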

Open Questions

  • Boris Cherny + Elon Musk reply screenshots. Both endorsements are reported via the Dream Labs AI 2026-05-20 walkthrough but not independently verified against the X thread. Worth pulling primary-source screenshots if the endorsement matters for any external citation.
  • Karpathy’s own stance. README explicitly says the guidelines are derived from Karpathy’s observations, not endorsed by him. Has he replied or quote-tweeted the artifact since? Unknown at ingest.
  • multica-ai/ vs forrestchang/: long-term canonical URL. Both URLs currently serve the same content. If they ever diverge (e.g., one stops being maintained), forrestchang/ becomes the de facto canonical location, because the README’s install commands are hardcoded to it. Worth re-checking on next refresh.
  • Multica platform deep-dive. Companion project at github.com/multica-ai/multica (“open-source platform for running and managing coding agents with reusable skills”). Not yet ingested into the wiki. Likely worth a separate article in agents-agentic-systems/ or claude-ai/ depending on its surface. Cross-references in the wiki currently mention “multica” only as an open-source shoulder of Open Design.
  • EXAMPLES.md content. The repo includes an EXAMPLES.md not pulled in this ingest. Could surface worked examples of the four principles in action — Tier-1 refresh candidate.
  • Star-count verification cadence. 146k at 2026-05-22, 137k at 2026-05-19. Growth rate (~3k/day) is high enough that this article’s star figure will rot fast. Mark for Tier-1 refresh in 30-90 days.
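The ~3k/day growth figure follows directly from the two recorded data points. A quick check, treating the 146,000 figure as exact even though it is a rounded display value:

```python
from datetime import date

stars_then, stars_now = 137_343, 146_000  # 2026-05-19 and 2026-05-22 counts
days = (date(2026, 5, 22) - date(2026, 5, 19)).days
per_day = (stars_now - stars_then) / days  # average stars gained per day
```

This gives 8,657 stars over 3 days, or roughly 2,886 per day, consistent with the ~3k/day estimate.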