
@RE-Bench

mentions: 1 · type: Organization
19:05
2026-03-28
muratbuffalo.blogspot.com
artificial-intelligence

Measuring AI Ability to Complete Long Software Tasks

Researchers at METR introduced a new metric, the "50%-task-completion time horizon", to track AI progress, finding that this horizon, the length of a software…
