Jonathon's AI Wiki

Tag: jailbreak

2 items with this tag.

  • Jul 03, 2026

    Stanford HAI AI Index 2026 — Chapter 3 Responsible AI Deep-Dive

    • industry-report
    • stanford-hai
    • responsible-ai
    • ai-incidents
    • ai-safety
    • transparency
    • hallucination
    • governance
    • jailbreak
  • Jul 02, 2026

    Refusal Calibration and Constitutional AI — Shaping Claude's Refusals Without Raising Jailbreak Risk

    • constitutional-ai
    • claudes-constitution
    • refusals
    • over-refusal
    • jailbreak
    • safety
    • system-prompt
    • anthropic
