{"slug": "funnybench-can-ai-models-tell-funny-jokes", "title": "FunnyBench – Can AI Models Tell Funny Jokes?", "summary": "A new benchmark called FunnyBench tests AI models' ability to tell funny jokes by asking each model to generate ten jokes and letting users vote on their humor. The live leaderboard uses a Bayesian score to rank models based on user votes, with the model revealed after voting. The benchmark aims to evaluate humor generation in AI, a challenging aspect of natural language processing.", "body_md": "# Can AI tell a joke?\n\nFunnyBench asks a simple question: can AI models tell funny jokes? Each model was\ngiven the same prompt (*“tell me a joke”*) ten times. Read a joke and\ndecide if it’s funny or not. Your votes drive a live leaderboard. We asked each\nmodel multiple times to encourage variety, but some still repeated the same joke.\n\nThe model is revealed after you vote.\n\n## Current leaders\n\nFunniest joke so far\n\nFunniest model so far\n\nNo votes yet.\n\n## Live leaderboard\n\n| # | Returned model | Provider | Bayesian score | Votes | Funny% |\n|---|---|---|---|---|---|\n\n## Details\n\nJokes were generated through OpenRouter from its model catalog using the exact\nprompt *“tell me a joke”*. Generation used temperature 1 where supported,\na 120 second timeout, provider fallback disabled, required parameters enabled,\nand the returned model, provider, and text were stored. Token counts and cost are\nstored internally but not displayed, to reduce noise. The\nleaderboard uses a Bayesian score: each model starts near the overall average and\nmoves as votes come in, which makes early rankings less jumpy than a raw funny\npercentage. It also shows both the model requested from OpenRouter and the returned\nmodel that actually ran, so the benchmark is explicit about what was tested. For\nreasoning models, the lowest available reasoning setting was used; reasoning\ntraces were intentionally not captured because they are not part of the joke\nshown to voters. The run excluded\nmodels not primarily meant for text, OpenRouter/router/front aliases, search or\ncustom-tool variants, floating “latest” aliases, unavailable-price models,\nduplicate free aliases, invalid empty or oversized outputs, and any model that\nfailed five calls in a row. The published set keeps ten valid jokes per retained\nmodel.", "url": "https://wpnews.pro/news/funnybench-can-ai-models-tell-funny-jokes", "canonical_source": "https://funnybench.lol", "published_at": "2026-06-20 22:43:39+00:00", "updated_at": "2026-06-20 23:09:41.563231+00:00", "lang": "en", "topics": ["large-language-models", "ai-research", "natural-language-processing"], "entities": ["FunnyBench", "OpenRouter"], "alternates": {"html": "https://wpnews.pro/news/funnybench-can-ai-models-tell-funny-jokes", "markdown": "https://wpnews.pro/news/funnybench-can-ai-models-tell-funny-jokes.md", "text": "https://wpnews.pro/news/funnybench-can-ai-models-tell-funny-jokes.txt", "jsonld": "https://wpnews.pro/news/funnybench-can-ai-models-tell-funny-jokes.jsonld"}}