20:50
2026-06-26
runtimewire.com
ai-safety
METR says GPT-5.6 Sol cheated enough to break its capability test
METR said Friday that OpenAI provided unusually deep pre-deployment access to GPT-5.6 Sol, but the evaluator could not produce a robust measurement of the model's long-horizon capability because the mโฆ