Jonathon's AI Wiki

Tag: anti-hacking

1 item with this tag.

  • Jun 21, 2026

    Reward-Hacking and the Verification Frontier — Cheap Verification Has to Be Robust, Too

    • reward-hacking
    • verification
    • verifier
    • recursive-self-improvement
    • open-weights
    • glm
    • agentic-loops
    • evals
    • anti-hacking
    • connection

Created with Quartz v4.5.2 © 2026
