08:07
2026-06-19
tbench.ai
artificial-intelligence
Terminal-Bench Challenges: long-horizon, token-intensive, single-task benchmarks
Terminal-Bench introduces Challenges: long-horizon, token-intensive, single-task benchmarks that require agents to build entire codebases from scratch. Three initial challenges—Rust Compiler Speedup, Inf…