06:12
2026-06-09
latent.space
artificial-intelligence
[AINews] FrontierCode: Benchmarking for Code Quality over Slop
Cognition introduced FrontierCode, a new benchmark that evaluates code on mergeability rather than just unit-test passing, with tasks built by open-source maintainers requiring over 40 hours each. The…