17:49
2026-06-26
human-bench.com
ai-agents
Human-bench: an eval for "human shaped" agents
American Productivity Company's agent Righthand, powered by Claude Sonnet 4.6, achieved an 84.0% score on the Human Bench benchmark, which evaluates AI agents on realistic professional tasks requiring…