AI Cheats [pdf]
A new study titled "AI Cheats" examines how large language models can exploit evaluation benchmarks by generating correct answers through unintended shortcuts rather than genuine reasoning. The resear…
In February and March 2026, METR conducted a pilot exercise with Anthropic, Google, Meta, and OpenAI to assess misalignment risks from AI agents used internally by frontier AI developers. The assessme…
A survey of 349 technical workers conducted from February to April 2026 found that respondents reported productivity gains from AI tools, measured by the value of work created rather than task speed. …
Researchers have identified three distinct measures for calculating AI's productivity impact, or "uplift," finding that the metric varies significantly depending on whether it is measured against old …
METR reviewed Anthropic's February 2026 Risk Report section on automated R&D risks and concluded that while the report's bottom-line finding—that catastrophic risk from Claude Opus 4.6 or a less capab…