Source: I_Tried_Every_Popular_Claude_Skills_System_Here_is_the_Best Creator: Code4AI URL: https://www.youtube.com/watch?v=VoxL_YmHR-I Duration: ~10-12 minutes Platform: YouTube
A contrarian take after two years of working with agents and reviewing every major skill library (Garry Tan’s gstack, Affaan Mustafa’s Everything Claude Code, Matt Pocock, Addy Osmani, BMAD, GSD, OpenSpec, Superpowers). The thesis: the best skill system is the one you build yourself. Start with natural-language prompting plus native plan mode, add a skill only when the agent demonstrably fails at something, and keep each skill as short as possible. Most popular skill libraries replicate the same 5-step software development lifecycle (research → prototype → plan → build → test, plus an optional polish pass) that engineers have used for 30+ years. Skills are documentation: they rot, they need maintenance, and they bloat context. This is the minority position, a counterweight to the install-N-starter-packs pattern that dominates the wider operator community.
Key Takeaways
- The universal 5-step pattern under every popular skill library: research / discuss-spec → prototype (front end only with dummy JSON) → plan (markdown breakdown into phases + verification steps) → build (one slice at a time) → test (lint + build + Playwright/browser + human smoke test), plus an optional polish step (run a different model over the codebase to simplify / catch issues). Every library reviewed maps onto this spine.
- What each library specifically contributes (in order of complexity):
  - Addy Osmani (Google): a clean sequence of 6-7 prompts (spec → plan → build incrementally → test → review → simplify). The textbook implementation of the universal pattern.
  - Matt Pocock (mattpocock/skills): simplicity-first. Notable adds: `diagnose` (a sharper specify), `grill-me` (grounds Claude in the domain model before coding so context isn’t re-spent every session), `prototype` (front-end-first design mode), TDD-by-default, two-issues breakdown for vertical-slice work.
  - Garry Tan (gstack): opinionated, “a bit over-engineered” per the reviewer. The standout is `office-hours`, a 6-forcing-questions skill modeled on YC office hours that interrogates startup ideas before any code gets written. Worth lifting individually even if you don’t adopt gstack wholesale.
  - Affaan Mustafa (Everything Claude Code, 183k stars): the biggest. Memory, continuous learning, verification loops, sub-agent orchestration, security focus. The deep-dive reference if you want to see what a maximalist harness looks like.
  - BMAD: enterprise-level “Party Mode” (BA + PM + senior architect personas). Recommended only at enterprise scale.
  - Superpowers: lightweight. The reviewer’s stated favorite for results when forced to pick one library.
  - OpenSpec / spec-based bundles: heavy overlap with the rest; useful for teams already aligned around spec-driven development.
- Skills are just natural-language prompts in a special file. YAML front matter (name + description) is always loaded by the model so it knows what skills exist. Scripts, reference material, and assets can be bundled inside the skill folder. There’s no magic — they’re documentation Claude reads when triggered.
- Most modern agents already do most of the work. Plan mode in Claude Code, Codex, and Cursor natively shards projects into phases, generates verification steps, and tracks to-dos — replicating what spec libraries used to provide via custom skills. The reviewer’s point: you don’t need a skill for what the harness already does well.
- TDD is opt-in, not skill-resident. Every library advocates for test-driven development, but the practical recommendation is to say “include test-driven development” during planning and the agent starts writing tests alongside code. No dedicated TDD skill needed.
- Frontend-prototype-first is the load-bearing discipline. A consistent recommendation across libraries: tell the agent “we’re in prototyping mode, develop front end only, use dummy JSON for back-end data, link components for navigation, don’t connect backend logic.” Reason: stops the agent from creating complicated back-end scaffolding it has to support, which slows everything down. Passing the prototype to the agent later as the back-end spec is much more productive than trying to design both at once.
- Agents cheat at testing. Real warning: the test step often catches less than expected. Inevitably, the test phase remains a human-in-the-loop smoke test — click through the actual UI to verify the agent built what you wanted.
- Skills rot like documentation. “Skills in some ways are essentially just documentation. And we’ve all experienced the scenario where comments go out of date, documents go out of date. You’re going to have to spend as much time updating these skills and keeping them current.” — direct quote. Implication: every skill added is a maintenance liability. Be selective.
- The bespoke-skill rule. Only build a skill when (a) the agent has demonstrably messed up in the same way repeatedly, or (b) you want to encode bespoke information about your codebase / process. Write them short — a few short paragraphs is often enough. Long skill files bloat the context window and confuse the model.
- The Agentic Development Lifecycle (ADLC) emerges. Traditional software has the Software Development Lifecycle. The reviewer proposes the agentic equivalent: managing your harness + skill management + cross-developer skill organization is a new engineering discipline. Vercel’s skills.sh is highlighted as a working primitive for storing skills in private repos, updating them, and sharing across an organization.
- The bottom line. “Inevitably the best skill system and the best harness is going to be the one you develop over time for you. I think this is going to be how you really differentiate as a software developer — your agent harness, your set of skills that has been built up over time working with a particular codebase.”
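The takeaways about skills being a short prompt file with name + description front matter can be made concrete with a small Python sketch. It is only an illustration under stated assumptions: the file is assumed to be named `SKILL.md` inside its own folder, the skill `api-error-format` and its wording are hypothetical, and the helper `split_front_matter` is a toy parser that handles flat `key: value` lines and not full YAML.

```python
from pathlib import Path
import tempfile

# Hypothetical bespoke skill: a name, a description, and a few short lines.
SKILL_TEXT = """\
---
name: api-error-format
description: Use when adding or changing API error responses in this codebase.
---
Every error response is JSON with the keys "code", "message", and "details".
Never return a bare string body. Reuse the existing helper rather than
building the dict by hand.
"""


def split_front_matter(text: str) -> tuple[dict[str, str], str]:
    """Split a skill file into (front matter dict, body text).

    Toy parser: only flat `key: value` lines between the two `---` markers.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    try:
        end = lines.index("---", 1)  # closing marker of the front matter
    except ValueError:
        return {}, text
    meta = {}
    for line in lines[1:end]:
        if ":" in line:
            key, value = line.split(":", 1)
            meta[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:]).strip()
    return meta, body


def write_skill(root: Path, text: str) -> Path:
    """Write the skill to <root>/<name>/SKILL.md (assumed layout)."""
    meta, _ = split_front_matter(text)
    skill_dir = root / meta["name"]
    skill_dir.mkdir(parents=True, exist_ok=True)
    path = skill_dir / "SKILL.md"
    path.write_text(text, encoding="utf-8")
    return path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = write_skill(Path(tmp), SKILL_TEXT)
        meta, body = split_front_matter(path.read_text(encoding="utf-8"))
        print("front matter (what the model always sees):", meta)
        print("body lines (read only when triggered):", len(body.splitlines()))
```

The split mirrors the cost argument in the takeaways: the front matter is the part that is always present, so the description should say when to use the skill, and the body is the part worth keeping to a few lines.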
How the Universal Pattern Maps to Existing Wiki Coverage
The reviewer’s universal 5-step pattern matches what almost every popular skill library does. Existing wiki coverage for each library:
| Library | Wiki article | Distinguishing add |
|---|---|---|
| Everything Claude Code (Affaan Mustafa) | everything-claude-code-affaan-mustafa | Maximalist: 48 agents, 182 skills, AgentShield security, cross-harness |
| Garry Tan / gstack | garrytan-gstack | office-hours 6-forcing-questions skill |
| Matt Pocock | mattpocock-skills | caveman token-saver, grill-me, diagnose |
| GSD (Get-Shit-Done) | gsd-build-get-shit-done | 6-step phase loop, 5-artifact persistent layer |
| BMAD-METHOD | bmad-method-agentic-dev | 12+ persona Party Mode, enterprise SDLC |
| OpenSpec | openspec-spec-driven-vibe-coding | proposal.md + design.md + tasks.md primitives |
| Superpowers | superpowers-skills-framework | Closed end-to-end software dev methodology |
Code4AI’s contribution to this cluster isn’t another library — it’s the meta-thesis: all of these collapse to the same SDLC pattern, which the harness now provides natively, so skills should be additive and bespoke rather than wholesale.
The Minority Position Worth Pinning
Almost every other operator-curated article on this wiki advocates for installing a bundled skill system:
- Six Best Claude Code Skills for Business: install these 6
- Seven Claude Skills That Run My Business: keep these 7
- Nine Plugins to Build 10× Faster: stack three columns
- Garry Tan’s gstack: clone this 23-tool setup
- Everything Claude Code: install the 182-skill suite
- Anthropic engineers’ four skill rules: prompt skills not Claude; build composable; update every session
Code4AI is the counter: don’t install pre-built libraries at all; rely on native plan mode + minimal bespoke skills written for your specific failure modes. This is the position worth knowing exists before reaching for a starter pack.
Practical Decision Framework
Synthesized from the video’s argument:
| Situation | Recommendation |
|---|---|
| New to Claude Code, exploring | Native plan mode + natural-language prompting. Add nothing. |
| Hitting a specific failure mode repeatedly | Write a short bespoke skill (a few paragraphs) for that failure |
| Need domain grounding so context isn’t re-spent every session | Pocock’s grill-me pattern or a custom domain-context.md |
| Codebase-specific patterns the agent keeps missing | Bespoke skill embedded in .claude/skills/ |
| Enterprise team, need governance | BMAD’s Party Mode or skills.sh-style centralized repo |
| Want to ship a polished MVP fast solo | Stay lightweight, follow the universal 5-step pattern manually |
| Heavy security focus | Lift specific patterns from Everything Claude Code without installing the whole thing |
Try It
- Audit your installed skills. Anything you haven’t invoked in 30 days is a maintenance liability — uninstall.
- Run your next project with zero skills installed. Use only Claude Code’s native plan mode + a clear prototype-first prompt. Notice where the agent specifically fails.
- Write your first bespoke skill only for one of those specific failure points — keep it under 10 lines of body text.
- Try Pocock’s `grill-me` pattern before starting any new feature: get Claude to interview you about the domain so it doesn’t re-load context every session.
- Adopt the universal 5-step manually: research / discuss → prototype-front-end-only → markdown plan with phases → build slice-by-slice → test (lint + build + smoke) → optional polish pass with a different model. No skill bundle required.
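The audit and the 10-line guideline in this list can be approximated with a short script. This is a rough sketch, not something shown in the video: it assumes skills are stored as `<skills_dir>/<name>/SKILL.md`, and because the file system does not record when a skill was last invoked, it uses the file's modification time as a stand-in for "last used". The function names and the report fields are hypothetical.

```python
import time
from pathlib import Path
from typing import Optional

MAX_BODY_LINES = 10  # guideline from the list: keep the body under 10 lines
STALE_DAYS = 30      # guideline from the list: unused for 30 days


def body_line_count(text: str) -> int:
    """Count non-blank lines after the closing `---` of the front matter."""
    lines = text.splitlines()
    if lines and lines[0].strip() == "---":
        for i in range(1, len(lines)):
            if lines[i].strip() == "---":
                lines = lines[i + 1:]
                break
    return sum(1 for line in lines if line.strip())


def audit_skills(skills_dir: Path, now: Optional[float] = None) -> list[dict]:
    """Return one report row per skill folder found under skills_dir."""
    now = time.time() if now is None else now
    report = []
    for skill_file in sorted(skills_dir.glob("*/SKILL.md")):
        # ASSUMPTION: modification time is only a proxy for "last invoked".
        age_days = (now - skill_file.stat().st_mtime) / 86400
        n = body_line_count(skill_file.read_text(encoding="utf-8"))
        report.append({
            "skill": skill_file.parent.name,
            "body_lines": n,
            "too_long": n >= MAX_BODY_LINES,
            "stale": age_days > STALE_DAYS,
        })
    return report


if __name__ == "__main__":
    # Assumed project-level location; adjust to wherever your skills live.
    for row in audit_skills(Path(".claude/skills")):
        flags = [name for name in ("too_long", "stale") if row[name]]
        print(f"{row['skill']}: {row['body_lines']} body lines", *flags)
```

A flagged skill is a prompt to review it by hand, not an automatic deletion: a long-untouched file may still be correct, and a short one may already be out of date.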
Related
- Anthropic Engineers’ Four Skill Rules — the opposite-but-compatible position (yes build skills, but composable + always update)
- Skill Systems — Orchestrator + Child Pattern — what happens when you DO need to compose multiple skills
- Everything Claude Code — the maximalist counter to this contrarian thesis
- Garry Tan’s gstack — the opinionated 23-tool starter
- Matt Pocock’s Skills (`caveman`, `grill-me`, `diagnose`): the patterns the reviewer specifically endorses
- BMAD-METHOD: enterprise Party Mode
- GSD — 6-phase spec-driven loop
- OpenSpec — proposal/design/tasks spec primitives
- Superpowers — reviewer’s stated favorite of the libraries
- Agent Skills Overview — the formal spec of what skills actually are
- Six Best Claude Code Skills for Business — install-this-stack position
- Seven Skills That Run My Business — sister install-this-stack position
- Anthropic’s Official Best Practices for Claude Code — primary source on native plan mode + 8 context tools
Open Questions
- Vercel’s skills.sh maturity. The video name-drops skills.sh as the organizational primitive for skill management at team scale, but doesn’t show it. How well does it actually handle the rot/maintenance problem the reviewer flags?
- Empirical comparison missing. The thesis (“best is bespoke”) is asserted from experience, not benchmarked. Is there a measured comparison of bespoke-only vs starter-pack on real engineering tasks?
- When does an org cross the threshold from bespoke-skills to needing a real library? The video leaves this fuzzy — somewhere between solo dev and BMAD-level enterprise, there’s a transition point worth nailing down.