17:20
2026-05-27
huggingface.co
ai-agents
ITBench-AA: Frontier Models Score Below 50% on the First Benchmark for Agentic Enterprise IT Tasks - by Artificial Analysis and IBM
Artificial Analysis and IBM Research launched ITBench-AA, the first benchmark for agentic enterprise IT tasks, revealing that all frontier AI models scored below 50% on Site Reliability Engineering ch…