Source: funnybench.lol

FunnyBench – Can AI Models Tell Funny Jokes?

A new benchmark called FunnyBench tests AI models' ability to tell funny jokes by asking each model to generate ten jokes and letting users vote on their humor. The live leaderboard uses a Bayesian score to rank models based on user votes, with the model revealed after voting. The benchmark aims to evaluate humor generation in AI, a challenging aspect of natural language processing.

Published Jun 20, 2026 · 1 min read

FunnyBench asks a simple question: can AI models tell funny jokes? Each model was given the same prompt, “tell me a joke”, ten times. Read a joke and decide if it’s funny or not. Your votes drive a live leaderboard. We asked each model multiple times to encourage variety, but some still repeated the same joke.

The model is revealed after you vote.

Current leaders

Funniest joke so far

Funniest model so far

No votes yet.

Live leaderboard

| # | Returned model | Provider | Bayesian score | Votes | Funny% |
|---|---|---|---|---|---|

Details

Jokes were generated through OpenRouter from its model catalog using the exact prompt “tell me a joke”. Generation used temperature 1 where supported, a 120-second timeout, provider fallback disabled, and required parameters enabled; the returned model, provider, and text were stored. Token counts and cost are stored internally but not displayed, to reduce noise.

The leaderboard uses a Bayesian score: each model starts near the overall average and moves as votes come in, which makes early rankings less jumpy than a raw funny percentage. It also shows both the model requested from OpenRouter and the returned model that actually ran, so the benchmark is explicit about what was tested.

For reasoning models, the lowest available reasoning setting was used; reasoning traces were intentionally not captured because they are not part of the joke shown to voters.

The run excluded models not primarily meant for text, OpenRouter/router/front aliases, search or custom-tool variants, floating “latest” aliases, unavailable-price models, duplicate free aliases, invalid empty or oversized outputs, and any model that failed five calls in a row. The published set keeps ten valid jokes per retained model.
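FunnyBench does not publish the formula behind its Bayesian score, so the following is only a minimal sketch of one common approach that behaves as described: each model's funny rate is shrunk toward the overall average, as if it had already received a fixed number of average votes. The `ModelVotes` type, the `bayesian_scores` function, the `prior_weight` value of 20, and the 0.5 fallback are all assumptions for illustration, not FunnyBench's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class ModelVotes:
    name: str
    funny: int  # votes saying a joke from this model was funny
    total: int  # all votes cast on this model's jokes


def bayesian_scores(models, prior_weight=20.0):
    """Shrink each model's funny rate toward the overall average.

    prior_weight is a hypothetical tuning knob: it acts like that many
    pseudo-votes cast at the overall average funny rate.
    """
    all_funny = sum(m.funny for m in models)
    all_total = sum(m.total for m in models)
    # With no votes anywhere, fall back to a neutral 0.5 (an assumption).
    overall = all_funny / all_total if all_total else 0.5
    return {
        m.name: (prior_weight * overall + m.funny) / (prior_weight + m.total)
        for m in models
    }


# Made-up vote counts, purely for illustration.
models = [
    ModelVotes("model-a", funny=2, total=2),     # 100% funny, only 2 votes
    ModelVotes("model-b", funny=90, total=100),  # 90% funny over 100 votes
    ModelVotes("model-c", funny=30, total=100),  # 30% funny over 100 votes
]
scores = bayesian_scores(models)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

With these made-up counts, a raw funny percentage would put model-a first on the strength of two votes, while the shrunk score ranks model-b above it. A model with no votes scores exactly the overall average, which matches the "starts near the overall average" behavior in the description.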
