cd/entity/WorkBenchยท homeโ€บ entitiesโ€บ WorkBench
grep -l @workbench /news/*.json | wc -l โ†’ 1

WorkBench

mentions 1 type Organization feed RSS

// recent coverage 1 mentions

04:00
2026-06-15
arxiv.org
ai-agents

WorkBench Revisited: Workplace Agents Two Years On

A new arXiv paper revisits the WorkBench benchmark for workplace agents, finding that the best agent in June 2026, Claude Opus 4.8, completes 89% of tasks with only 2.5% unintended harmful actions, upโ€ฆ

// co-occurs with top 3 entities