cal

mentions: 1 | type: Organization | feed: RSS

// recent coverage: 1 mention

2026-06-26 15:38 | cryptobriefing.com | artificial-intelligence

MirrorCode evaluates AI’s long-horizon coding capabilities with 22 open-source tasks

METR and Epoch AI released MirrorCode, a benchmark testing AI agents' ability to reimplement entire software programs from source code. Claude Opus 4.6 rebuilt a 16,000-line bioinformatics toolkit in …

// co-occurs with top 6 entities

METR (1), Epoch AI (1), Claude Opus 4.6 (1), gotree (1), choose (1), Pkl (1)