HermesBench

mentions: 1 | type: Organization | feed: RSS

// recent coverage (1 mention)

2026-05-30 23:03 | verkyyi.github.io | ai-agents

Show HN: HermesBench – workflow reliability evals for personal AI agents

HermesBench, a new benchmark for evaluating complete personal AI agent configurations rather than just models, launched with a public baseline score of 78.2 across 27 personal-agent recipes. The bench…

// co-occurs with top 3 entities

Hermes Agent (1), Codex (1), Claude (1)