{"slug": "poker-arena-multi-axis-profiling-of-strategic-reasoning-and-memory-in-llms", "title": "Poker Arena: Multi-Axis Profiling of Strategic Reasoning and Memory in LLMs", "summary": "Researchers introduced Poker Arena, a no-limit Texas Hold'em platform that evaluates LLMs across nine cognitive axes and three memory layers. Claude Opus 4.6 won the most chips but ranked fifth on mean axis score, revealing that multi-axis profiling uncovers capability structures that scalar leaderboards misrank.", "body_md": "arXiv:2606.13815v1 Announce Type: new\nAbstract: Strategic reasoning under uncertainty underpins consequential decisions in negotiation, finance, and policy, but prevailing game-play benchmarks collapse heterogeneous reasoning dimensions into a single scalar, leaving the capability structure of frontier LLMs unexamined. We introduce Poker Arena, a no-limit Texas Hold'em tournament platform that couples a three-layer memory architecture (within-hand, session, and cross-session) with a nine-axis cognitive profile decomposing strategic reasoning into interpretable dimensions such as bet-sizing calibration and positional awareness. We evaluate seven frontier models across 50 sessions of 1,000 hands and a controlled memory ablation; tournament chips and aggregate axis score order the field differently: Claude Opus 4.6 wins +$15,730 chips with 14 first-place finishes, yet ranks only fifth of seven on mean axis score, while persistent memory helps some models and hurts others. These findings show that multi-axis evaluation surfaces capability structure that scalar leaderboards systematically misrank, with cross-dimensional consistency outweighing peak performance on any single axis.", "url": "https://wpnews.pro/news/poker-arena-multi-axis-profiling-of-strategic-reasoning-and-memory-in-llms", "canonical_source": "https://arxiv.org/abs/2606.13815", "published_at": "2026-06-15 04:00:00+00:00", "updated_at": "2026-06-15 04:15:21.267537+00:00", "lang": "en", "topics": ["large-language-models", "ai-research", "ai-agents"], "entities": ["Claude Opus 4.6", "Poker Arena", "arXiv"], "alternates": {"html": "https://wpnews.pro/news/poker-arena-multi-axis-profiling-of-strategic-reasoning-and-memory-in-llms", "markdown": "https://wpnews.pro/news/poker-arena-multi-axis-profiling-of-strategic-reasoning-and-memory-in-llms.md", "text": "https://wpnews.pro/news/poker-arena-multi-axis-profiling-of-strategic-reasoning-and-memory-in-llms.txt", "jsonld": "https://wpnews.pro/news/poker-arena-multi-axis-profiling-of-strategic-reasoning-and-memory-in-llms.jsonld"}}