Jonathon's AI Wiki
Tag: claude-mythos-preview
3 items with this tag.
Jun 11, 2026
Claude Mythos Preview — Anthropic System Card (April 7 2026)
claude-mythos-preview
system-card
anthropic
rsp-3-0
frontier-model
alignment
model-welfare
project-glasswing
cybersecurity
opus-4-6
defensive-cyber
not-generally-available
evaluation-awareness
recursive-self-improvement
primary-source
May 27, 2026
Anthropic Engineering — How We Contain Claude Across Products (3-Risk × 3-Defense Frame)
anthropic-engineering
agent-containment
sandboxing
blast-radius
claude-ai-runtime
claude-code-runtime
cowork-runtime
gvisor
vm-isolation
egress-allowlist
claude-mythos-preview
prompt-injection
agent-identity
post-mortem
x-sourced
reddit-sourced
r-claudecode
May 08, 2026
Translating Claude's Thoughts Into Language — Activation-to-Text Interpretability
interpretability
mechanistic-interpretability
activations
alignment
safety-evals
anthropic-research
blackmail-test
self-introspection
claude-mythos-preview