FrontierMath: Web Pulse coverage

FrontierMath benchmark undergoes major audit as Epoch AI flags errors in one-third of math problems :: https://wpnews.pro/news/frontiermath-benchmark-undergoes-major-audit-as-epoch-ai-flags-errors-in-one-of
[AINews] FrontierCode: Benchmarking for Code Quality over Slop :: https://wpnews.pro/news/ainews-frontiercode-benchmarking-for-code-quality-over-slop