[ARTICLE · art-35183] src=news.ycombinator.com ↗ pub=2026-06-20T23:26Z topic=ai-agents verified=true sentiment=· neutral

Ask HN: What are some good benchmarks for different agent harnesses?

A Hacker News user asks the community for recommendations on benchmarks to evaluate different agent harnesses, noting that Terminal Bench does not align with their experience.

1 min read · published Jun 20, 2026

2 points by Bnjoroge 9 minutes ago

"Other than Terminal Bench, which doesn't quite map to my experience, what are some other benchmarks to see how different models do in different harnesses?"

Read original on news.ycombinator.com → news.ycombinator.com/item?id=48614029

mentioned entities

Hacker News

Terminal Bench
