Claude Code Security-Guidance Plugin (Anthropic Official)

Source: raw/x-bookmarks-recent-digest-2026-05-31.md (first-party @ClaudeDevs announcement thread), grounded in ai-research/anthropic-security-guidance-plugin-2026-05-31.md (official Claude Code docs code.claude.com/docs/en/security-guidance + Help Net Security + SecurityWeek launch coverage, 2026-05-27).

Anthropic’s security-guidance plugin is an official, free Claude Code plugin that makes Claude review its own code changes for common vulnerabilities and fix them in the same session. It runs automatically via hooks, with no separate tool to launch and no command to remember, as a lightweight first pass that catches issues before code reaches a pull request. Unlike third-party skill and code scanners (SkillSpector, DeepSec), it ships from Anthropic’s own marketplace and was used internally before release.

Key Takeaways

First-party, free, all plans. Install from the official marketplace by running /plugin marketplace add anthropics/claude-plugins-official, then /plugin install security-guidance@claude-plugins-official, then /reload-plugins. Requires Claude Code v2.1.144+ and Python 3.8+.
Three review stages, wired to hooks — the plugin is a concrete, real-world example of the hooks system doing security work:
- On each file edit (PostToolUse on Edit/Write/NotebookEdit) — a lightweight pattern check with no model call (so it adds zero usage cost). Flags risky constructs and commonly abused libraries: eval(, new Function, os.system, child_process.exec, pickle (unsafe deserialization), dangerouslySetInnerHTML, .innerHTML =, document.write, plus scrutiny of .github/workflows/. Pattern set lives in security-patterns.yaml.
- At the end of each turn (Stop hook) — a full-diff review against a working-tree baseline captured at UserPromptSubmit, run in the background.
- On each commit or push (PostToolUse on Bash, filtered to git commit/git push) — reads surrounding context to validate vulnerabilities, run in the background.
Cost model is two-tier. Instant pattern checks run without model calls (free); the deeper model-backed reviews use the same Claude usage budget as a normal request. Deeper stages need a git repo; pattern checks run in any directory.
Vulnerability classes: injection flaws, unsafe deserialization, insecure DOM APIs — the categories most likely to slip through AI-generated changes.
Org-customizable. Add repo-specific rules in .claude/claude-security-guidance.md; per @ClaudeDevs you can “drop it in your repo or distribute via MDM,” and the plugin enforces your policies alongside the built-in checks.
Benchmark: Anthropic reports a 30–40% decrease in security-related comments on PRs opened with the plugin across its internal rollout — positioned explicitly as a first pass, not a replacement for full review.
Granular kill switches: env flags ENABLE_PATTERN_RULES=0, ENABLE_STOP_REVIEW=0, ENABLE_COMMIT_REVIEW=0, ENABLE_CODE_SECURITY_REVIEW=0 (all model reviews), or SECURITY_GUIDANCE_DISABLE=1 to turn the whole thing off without uninstalling. Log at ~/.claude/security/log.txt.
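
The interaction between the kill switches above can be sketched in a few lines of Python. This is an illustration of the semantics as the article describes them, not the plugin's own code: the function name `enabled_stages` and the stage keys are hypothetical, and treating an unset flag as "enabled" is an assumption.

```python
import os
from typing import Dict, Mapping


def enabled_stages(env: Mapping[str, str]) -> Dict[str, bool]:
    """Return which review stages would run under the given environment.

    Hypothetical helper: the flag names come from the article, but the
    defaulting rules here are assumptions, not documented behavior.
    """

    def on(name: str) -> bool:
        # Assumption: a stage is disabled only when its flag is exactly "0";
        # an unset flag leaves the stage enabled.
        return env.get(name, "1") != "0"

    # SECURITY_GUIDANCE_DISABLE=1 turns everything off at once.
    if env.get("SECURITY_GUIDANCE_DISABLE") == "1":
        return {"pattern": False, "stop_review": False, "commit_review": False}

    # ENABLE_CODE_SECURITY_REVIEW=0 switches off all model-backed reviews,
    # regardless of the per-stage flags.
    model_reviews = on("ENABLE_CODE_SECURITY_REVIEW")
    return {
        "pattern": on("ENABLE_PATTERN_RULES"),
        "stop_review": model_reviews and on("ENABLE_STOP_REVIEW"),
        "commit_review": model_reviews and on("ENABLE_COMMIT_REVIEW"),
    }


if __name__ == "__main__":
    # Show what the current shell environment would enable.
    print(enabled_stages(os.environ))
```

Passing a plain mapping instead of reading `os.environ` inside the function keeps the sketch easy to test with different flag combinations.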

Where it sits in the Claude Code security stack

Anthropic’s own docs place the plugin as the in-session layer. Updated 2026-07-24: the table grew from four stages to six when the Claude Security plugin shipped, adding an on-demand deep-scan row and a managed row (ai-research/claude-code-docs-claude-security-plugin-2026-07-24.md):

Stage	Tool	Covers
In session	Security-guidance plugin	Common vulnerabilities in code Claude writes, fixed in the same session
On demand, single pass	/security-review	One-time security pass on the current branch, when you ask
On demand, deep scan	Claude Security plugin	Multi-agent scan of a repository or diff, with independently reviewed findings and patches
On pull request	Code Review (Team/Enterprise)	Multi-agent correctness + security review with full-codebase context
Managed	Claude Security product (Enterprise)	Hosted scanning that monitors connected repositories
In CI	Your existing SAST / dependency scanners	Language-specific rules, supply-chain checks, policy enforcement the plugin does not attempt

It deliberately does not try to be your CI scanner — it is the earliest, cheapest checkpoint in the chain.

Stack note (2026-06-10): these tools are also the false-positive escape hatch for Fable 5’s model-level cyber classifier. That classifier sits above every row in this table and gates the model itself. When legitimate security work trips it, Boris Cherny’s first-party recommendation is to route the work through the built-in /security-review skill or Claude Security instead of fighting the gate (raw/x-account-bcherny-2064475977208721485.md).

Sharpened 2026-07-24 — the escape hatch is itself gated, and that’s the point. The Claude Security plugin’s own docs list “You may see ‘Fable 5’s safeguards flagged this message’ when using Fable 5” under Troubleshooting: certain model activities are blocked by the cyber classifier and automatically downgraded to Opus, and “this is expected, and the scan should still complete successfully.” So routing security work through these tools does not put it outside the classifier — the gate still fires. What changes is the failure mode: inside a purpose-built security surface it degrades to Opus and continues, rather than refusing.

Try It

Install it in any project where Claude writes code, using the three install commands listed under Key Takeaways. Confirm with /plugin that security-guidance@claude-plugins-official is enabled.
Watch it fire. Edit a file containing a flagged pattern (e.g. an .innerHTML = assignment) and observe the per-edit warning — it surfaces as a PostToolUse hook message. (Writing this very article triggered it on the documented pattern strings, confirming the per-edit check runs on prose too — a reminder it is pattern-matching, not semantic.)
Add a house rule. Drop a .claude/claude-security-guidance.md with one org-specific policy (e.g. “never construct SQL by string interpolation; require parameterized queries”) and confirm the end-of-turn review honors it.
Tune the noise. If the per-edit pattern check is too chatty for a docs-heavy or prototype repo, set ENABLE_PATTERN_RULES=0 and keep only the model-backed Stop/commit reviews.
Layer /security-review (on-demand single pass) and PR Code Review behind it: the plugin is the first net, not the only one.
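
Why the per-edit check fires on prose as well as code is easier to see with a small sketch. The Python below assumes the check is plain substring matching over the constructs listed under Key Takeaways; the real rules live in security-patterns.yaml, whose format is not described here, so the `RISKY_SUBSTRINGS` list and the `scan_text` function are hypothetical stand-ins.

```python
from typing import List, Tuple

# Substrings taken from the article's list of flagged constructs.
# Assumption: the plugin's actual rules may be richer than substrings.
RISKY_SUBSTRINGS = [
    "eval(",
    "new Function",
    "os.system",
    "child_process.exec",
    "pickle",
    "dangerouslySetInnerHTML",
    ".innerHTML =",
    "document.write",
]


def scan_text(text: str) -> List[Tuple[int, str]]:
    """Return (line number, matched substring) pairs for every hit."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for pattern in RISKY_SUBSTRINGS:
            # Pure substring test: no parsing, no model call.
            if pattern in line:
                hits.append((lineno, pattern))
    return hits


if __name__ == "__main__":
    code = "el.innerHTML = userInput;\n"
    prose = "The check flags document.write and eval( calls.\n"
    print(scan_text(code))   # [(1, '.innerHTML =')]
    print(scan_text(prose))  # [(1, 'eval('), (1, 'document.write')]
```

Because nothing is parsed, a sentence that merely mentions a pattern is flagged the same way as code that uses it, and a naive substring rule such as this one would also match an identifier like `retrieval(`. That limitation belongs to the sketch; whether the plugin's own rules share it is not stated in the sources.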

Related

Claude Security Plugin: the on-demand deep-scan layer two rows below this one in the stack table, with a multi-agent scan, independently verified findings, and reviewed patch files you apply yourself.
Claude Code Hooks Deep Dive — the plugin is a production example of SessionStart/UserPromptSubmit/PostToolUse/Stop hooks composed into a feature.
SkillSpector — NVIDIA’s third-party skill security scanner; complementary (vets skills you install vs. code Claude writes).
DeepSec — third-party vulnerability scanner; the security-guidance plugin is the Anthropic-native equivalent of the in-session layer.
Claude Code Plugins & Marketplaces — how /plugin marketplace add + /plugin install work.
Five OSS Tools That Fix Claude Code’s Blind Spots — sibling “harden your Claude Code setup” roundup.
How We Contain Claude — Anthropic’s broader defense-in-depth framing that this plugin operationalizes at the code layer.

Open Questions

Exact rule coverage of the model-backed Stop/commit stages beyond the published pattern list — the docs describe categories (injection, unsafe deserialization, insecure DOM) but not the full ruleset.
False-positive rate on documentation/test files (the per-edit check is pure pattern-match, as the meta-example above shows) — no published precision/recall numbers beyond the 30–40% PR-comment reduction.
