Source: raw/x-account-anthropicai-2066969532380721386.md · ai-research/anthropic-claude-code-expertise-2026-06-16.md
Product: Claude Code
URL: https://www.anthropic.com/research/claude-code-expertise
Date: 2026-06-16
Anthropic’s Economic Research team analyzed ~400,000 interactive Claude Code sessions from ~235,000 people (October 2025–April 2026) with its privacy-preserving Clio tooling, building a framework for what work agentic coding does, who does it, and whether it succeeds. The headline: people make most planning decisions and Claude makes most execution decisions, the work is getting more valuable, and success tracks the user’s domain expertise far more than their coding background. Because Claude Code is a leading edge of agentic knowledge work, the report reads as an early preview of where agent-assisted work generally may be heading.
Key Takeaways
- Scope. ~400,000 sessions, ~235,000 people, Oct 2025–Apr 2026, across CLI / Claude.ai / Claude Code desktop. Excludes third-party IDE/SDK usage and headless `claude -p` runs. Classifiers run on Claude Sonnet 4.6. Claude Code users now average ~20 hours/week with the tool running.
- Division of labor. People make ~70% of planning decisions (what to do); Claude makes ~80% of execution decisions (how to do it): “people decide what to build, the agent decides how to build it.” A typical session is ~4 turns; each user prompt sets off ~10 Claude actions on average and ~2,400 words of output per turn.
- What the work is. ~51% of sessions directly write (25%) or repair (26%) code — ~56% counting testing and orchestrating. Operating software (deploy/configure/run/monitor) is ~17% overall; planning/exploring ~14%; data analysis and prose ~13%.
- The work shifted over seven months. Fixing broken code fell from 33% → 19% of sessions (nearly halved). Operating software grew 14% → 21% (about 1 in 5 by April). Writing + data analysis roughly doubled, ~10% → ~20%.
- Tasks grew more valuable. Estimated via a freelance-marketplace proxy (calibrated to real job postings; coarse, used for relative comparison not literal dollars), the average session’s value rose ~27% Oct→Apr — building +43%, operating +34%, fixing +32%.
- Expertise is task-specific, not a job title. An accountant who specifies exact reconciliation rules is an “expert” at that task; a senior engineer asking a first Rust question is a “novice.” Claude does more per prompt for experts: ~12 actions and ~3,200 words vs ~5 actions and ~600 words for novices (~5x the output).
- Returns to expertise are real but front-loaded. Verified success (judged success plus a hard signal — passing tests, a matching commit/PR, or explicit user confirmation): 15% for novice-rated sessions vs 28–33% for intermediate-and-up; partial success 77% vs 91–92%. Most of the gain is novice→intermediate; the intermediate→expert gap is modest. Novices abandon troubled sessions ~19% of the time vs 5–7% for everyone else.
- Occupation matters less than expertise. In code-producing sessions, every one of the ten largest occupations lands within ~7 percentage points of software engineers on success (software ~34% verified vs other professions ~29%; partial success 89% vs 88%). Management occupations rate highest on verified success, slightly above software engineers. Coding background is becoming less decisive than command of the problem domain.
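
The "verified success" rule and a few of the ratios quoted in the takeaways above can be sketched in Python. This is a minimal illustration, not the report's classifier: the function and parameter names (`is_verified_success`, `relative_change`, and the boolean flags) are hypothetical, and the only inputs are the rounded figures quoted in this note.

```python
def is_verified_success(
    *,
    judged_success: bool,
    tests_passed: bool = False,
    commit_or_pr_matched: bool = False,
    user_confirmed: bool = False,
) -> bool:
    """Judged success plus at least one hard signal (hypothetical flag names)."""
    has_hard_signal = tests_passed or commit_or_pr_matched or user_confirmed
    return judged_success and has_hard_signal


def relative_change(before: float, after: float) -> float:
    """Fractional change from `before` to `after`; -0.42 means a 42% drop."""
    return (after - before) / before


# Rounded figures quoted in the takeaways above.
expert_vs_novice_output = 3200 / 600            # words of output per prompt
fixing_share_change = relative_change(33, 19)   # share of sessions fixing code
operating_share_change = relative_change(14, 21)  # share of sessions operating software

if __name__ == "__main__":
    print(f"expert/novice output ratio: {expert_vs_novice_output:.1f}x")
    print(f"fixing share change: {fixing_share_change:+.0%}")
    print(f"operating share change: {operating_share_change:+.0%}")
    print(is_verified_success(judged_success=True, tests_passed=True))
```

On the quoted numbers, the expert-to-novice output ratio comes to about 5.3x, the fixing share falls by about 42% in relative terms (which the takeaways round to "nearly halved"), and the operating share rises by 50%. A session judged successful with no hard signal does not count as verified under this sketch.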
Try It
- Bring domain command, not just coding skill. State precisely what “done” looks like and what Claude should verify; in the report, success tracks how well you understand the problem, not whether you can code. ^[inferred]
- You don’t need to be an expert — intermediate is most of the win. A working grasp of the domain captures the bulk of the success gain; deep mastery adds only a little more. ^[inferred]
- Own the plan, delegate the execution. Keep the planning decisions (approach, acceptance criteria); let Claude make the file-level “how” decisions, where it already makes ~80% of the execution decisions. ^[inferred]
- Give the agent verifiable success signals. Wire up tests and commit/PR checkpoints so both you and the agent can confirm completion — that’s literally how the study scores “verified success.” ^[inferred]
- Non-developers can run technical work in their field. Analysts, marketers, and lawyers in the dataset succeed at near-software-engineer rates on code-producing sessions — point Claude Code at a domain task you’d previously have outsourced. ^[inferred]
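
One way to act on the "verifiable success signals" suggestion above is to wrap the acceptance checks in a small script that both you and the agent can run. The sketch below is an assumption about how that might look, not something the report prescribes: `run_check`, `run_checks`, and `is_done` are hypothetical helpers, and the placeholder commands should be swapped for your project's real test or lint commands.

```python
import subprocess
import sys
from typing import Sequence


def run_check(command: Sequence[str], timeout_s: float = 600.0) -> bool:
    """Run one acceptance check; exit code 0 counts as a pass."""
    try:
        result = subprocess.run(
            list(command), capture_output=True, text=True, timeout=timeout_s
        )
    except (OSError, subprocess.TimeoutExpired):
        # A missing executable or a hung check is treated as a failure.
        return False
    return result.returncode == 0


def run_checks(checks: dict[str, Sequence[str]]) -> dict[str, bool]:
    """Run every named check and record whether each one passed."""
    return {name: run_check(command) for name, command in checks.items()}


def is_done(results: dict[str, bool]) -> bool:
    """'Done' means at least one check exists and all of them passed."""
    return bool(results) and all(results.values())


if __name__ == "__main__":
    # Placeholder commands; replace with your own test runner invocation.
    checks = {
        "smoke": [sys.executable, "-c", "raise SystemExit(0)"],
    }
    results = run_checks(checks)
    for name, passed in results.items():
        print(f"{name}: {'pass' if passed else 'FAIL'}")
    sys.exit(0 if is_done(results) else 1)
```

Because the script exits non-zero when any check fails, it can serve as the explicit "done" criterion in a prompt and as a gate before a commit or PR.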