[ARTICLE · art-17056] src=bsky.app pub= topic=artificial-intelligence verified=true sentiment=↑ positive

Gemini 3.5 Flash beats Opus 4.8 on Bluffbench

Simon P. Couch re-ran the Bluffbench evaluation against Opus 4.8, Gemini 3.5 Flash, and GPT 5.5. Gemini 3.5 Flash outperformed Opus 4.8, which showed only a modest improvement over previous Opus models. The results position Gemini 3.5 Flash as the standout model in the latest benchmark comparison.

1 min read · Published May 29, 2026


Simon P. Couch

simonpcouch.com

Re-ran this eval against Opus 4.8, Gemini 3.5 Flash, and GPT 5.5. Opus 4.8 is a modest improvement over the previously tested Opus models, but Gemini 3.5 Flash is the real stand-out! simonpcouch.github.io/bluffbench/

2026-05-28T19:41:06.976Z
