
Poker Arena: Multi-Axis Profiling of Strategic Reasoning and Memory in LLMs

Researchers introduced Poker Arena, a no-limit Texas Hold'em platform that evaluates LLMs across nine cognitive axes and three memory layers. Claude Opus 4.6 won the most chips but ranked fifth on mean axis score, revealing that multi-axis profiling uncovers capability structures that scalar leaderboards misrank.

1 min read · published Jun 15, 2026

arXiv:2606.13815v1 · Announce type: new

Abstract: Strategic reasoning under uncertainty underpins consequential decisions in negotiation, finance, and policy, but prevailing game-play benchmarks collapse heterogeneous reasoning dimensions into a single scalar, leaving the capability structure of frontier LLMs unexamined. We introduce Poker Arena, a no-limit Texas Hold'em tournament platform that couples a three-layer memory architecture (within-hand, session, and cross-session) with a nine-axis cognitive profile decomposing strategic reasoning into interpretable dimensions such as bet-sizing calibration and positional awareness. We evaluate seven frontier models across 50 sessions of 1,000 hands and a controlled memory ablation; tournament chips and aggregate axis score order the field differently: Claude Opus 4.6 wins +$15,730 chips with 14 first-place finishes, yet ranks only fifth of seven on mean axis score, while persistent memory helps some models and hurts others. These findings show that multi-axis evaluation surfaces capability structure that scalar leaderboards systematically misrank, with cross-dimensional consistency outweighing peak performance on any single axis.
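The gap between a chip leaderboard and a mean axis score can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the model names, the chip totals, and the axis scores are invented placeholders, not results from the paper, and only three axes are shown where the paper uses nine. It does not reproduce the paper's scoring method; it only shows how one scalar ordering and an averaged profile can disagree.

```python
from statistics import mean

# Hypothetical placeholder data, NOT taken from the paper.
# "chips" is net tournament winnings; "axes" is a per-axis score profile
# (three axes here for brevity; the paper profiles nine).
profiles = {
    "model_a": {"chips": 900, "axes": [0.9, 0.3, 0.4]},   # one strong axis, two weak
    "model_b": {"chips": 200, "axes": [0.7, 0.7, 0.7]},   # consistent across axes
    "model_c": {"chips": -400, "axes": [0.6, 0.6, 0.5]},
}


def rank_by(scores):
    """Map each name to its 1-based rank, highest score first."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {name: position + 1 for position, name in enumerate(ordered)}


# Ranking by the single scalar (chips) versus by the mean of the axis profile.
chip_rank = rank_by({m: p["chips"] for m, p in profiles.items()})
axis_rank = rank_by({m: mean(p["axes"]) for m, p in profiles.items()})

for model in profiles:
    print(model, "chip rank:", chip_rank[model], "axis rank:", axis_rank[model])
```

In this toy data the chip leader falls to last place on mean axis score because its profile is uneven, which is the kind of reordering the abstract describes.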
