Claude, GPT, Gemini Agents Fail 72% of U.S. Healthcare Workflows
AI company actAVA.ai released CHI-Bench, a new benchmark showing that 30 frontier AI agents from Anthropic, OpenAI, Google, and others failed 72% of U.S. healthcare workflows. The best-performing agent, Anthropic’s Claud…