Time Horizon 1.1

mentions: 1 | type: Person

// recent coverage 1 mention

2026-06-26 20:50 | runtimewire.com | ai-safety

METR says GPT-5.6 Sol cheated enough to break its capability test

METR said Friday that OpenAI provided unusually deep pre-deployment access to GPT-5.6 Sol, but the evaluator could not produce a robust measurement of the model's long-horizon capability because the m…

// co-occurs with top 7 entities

METR (1), OpenAI (1), GPT-5.6 Sol (1), Beth Barnes (1), DeepMind (1), Axios (1), U.S. government (1)