{"slug": "eqbench-emotional-intelligence-benchmarks-for-llms", "title": "EQ-Bench: Emotional Intelligence Benchmarks for LLMs", "summary": "Researchers have introduced EQ-Bench, a new benchmark designed to measure emotional intelligence in large language models (LLMs) through challenging roleplay scenarios. The benchmark calculates an Elo score based on pairwise model comparisons, with an LLM judge evaluating responses across eight core emotional intelligence dimensions. The tool aims to provide a standardized method for assessing how well AI systems understand and respond to human emotions.", "body_md": "Emotional Intelligence Benchmarks for LLMs\n\n[GitHub](https://github.com/EQ-bench) | [Paper](https://arxiv.org/abs/2312.06281) | [Twitter](https://twitter.com/sam_paech) | [About](about.html)\n\n**💙EQ-Bench3**\n[🌀Spiral-Bench v1.2](spiral-bench.html)\n[✍️Longform Writing](creative_writing_longform.html)\n[🎨Creative Writing v3](creative_writing.html)\n[☢️Slop Score](slop-score.html)\n[⚖️Judgemark v4](judgemark-v4.html)\n[🎤BuzzBench](buzzbench.html)\n[🌍DiploBench](diplobench.html)\n\nA benchmark measuring emotional intelligence in challenging roleplays. [Learn more](./about.html#eq-bench-3)\n\n**Note:** Ability scores shown in the heatmap do not contribute to the Elo score. They are \"higher is higher\", not \"higher is better\".\n\nFor more details about the benchmark, see the [About](./about.html#long) section.\n\nThe Elo score shown in the leaderboard is calculated from pairwise model comparisons, where the LLM judge rates each response against eight core dimensions of emotional intelligence.\n\n*Note:* the coloured “Abilities” heatmap columns (Humanlike, Safety, Assertive, etc.) 
are **not** used in the Elo calculation; they are purely informational, giving a quick view of each model’s stylistic traits and skill profile.", "url": "https://wpnews.pro/news/eqbench-emotional-intelligence-benchmarks-for-llms", "canonical_source": "https://eqbench.com/", "published_at": "2026-05-29 22:07:16+00:00", "updated_at": "2026-05-29 22:15:38.268512+00:00", "lang": "en", "topics": ["large-language-models", "artificial-intelligence", "natural-language-processing", "ai-research", "ai-ethics"], "entities": ["EQ-Bench", "Spiral-Bench", "Longform Writing", "Creative Writing", "Slop Score", "Judgemark", "BuzzBench", "DiploBench"], "alternates": {"html": "https://wpnews.pro/news/eqbench-emotional-intelligence-benchmarks-for-llms", "markdown": "https://wpnews.pro/news/eqbench-emotional-intelligence-benchmarks-for-llms.md", "text": "https://wpnews.pro/news/eqbench-emotional-intelligence-benchmarks-for-llms.txt", "jsonld": "https://wpnews.pro/news/eqbench-emotional-intelligence-benchmarks-for-llms.jsonld"}}