
GroupToM-Bench: Benchmarking Group Theory of Mind and Nonlinear Social Emergence in MLLMs

Multimodal large language models (MLLMs) fail to infer how individual mental states interact and crystallize into group-level outcomes, according to the authors of a new benchmark called GroupToM-Bench. The benchmark, which the authors present as the first multimodal benchmark for group-level Theory of Mind (ToM), reveals a gap between current models and human baselines in processing social structures and non-linear collective dynamics.

1 min read · Published Jun 4, 2026

arXiv:2606.04184v1. Abstract: True general intelligence requires not only a model of the physical world but also a social world model: the capacity to infer how individual mental states interact and crystallize into group-level outcomes. Despite notable progress in individual-level Theory of Mind (ToM) reasoning, existing multimodal large language models fail at this broader task. Collective behavior emerges non-linearly from social tensions, conformity dynamics, and structural constraints, meaning it cannot be recovered by merely summing individual intentions. We present GroupToM-Bench, the first multimodal benchmark for group-level ToM, built around a causal chain spanning micro-level BDI states (belief, desire, intention), meso-level group tension and structural constraints, and macro-level outcome prediction and mechanistic attribution. To probe this full arc, we develop a seven-level cognitive audit framework. Experiments reveal a gap between current models and human baselines, highlighting a failure to process social structures and non-linear collective dynamics.
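
The claim that collective behavior "cannot be recovered by merely summing individual intentions" can be illustrated with a toy example. The sketch below is not taken from the paper and says nothing about how GroupToM-Bench is built; it is a minimal threshold-cascade model in the style of Granovetter's threshold model of collective behavior. The function name `cascade_size` and both example populations are hypothetical, chosen only for illustration. Each agent joins a collective action once enough others have already joined.

```python
def cascade_size(thresholds):
    """Return how many agents end up acting in a toy threshold cascade.

    Illustrative assumption: an agent acts as soon as the number of agents
    already acting is at least that agent's threshold.
    """
    acting = 0
    while True:
        # Count every agent whose threshold is met by the current crowd size.
        new_acting = sum(1 for t in thresholds if t <= acting)
        if new_acting == acting:
            # No additional agent joins, so the cascade has settled.
            return acting
        acting = new_acting


# Population A: thresholds 0, 1, 2, ..., 9. Each agent needs one more peer.
population_a = list(range(10))

# Population B: identical, except the agent with threshold 1 now needs 2 peers.
population_b = [0, 2, 2, 3, 4, 5, 6, 7, 8, 9]

size_a = cascade_size(population_a)
size_b = cascade_size(population_b)
print(size_a, size_b)
```

Under these assumptions, the two populations differ in a single agent's threshold, yet the full cascade occurs in population A while only one agent acts in population B. Averaging or summing individual dispositions would make the two groups look almost identical, which is the kind of non-linear group-level effect the abstract describes.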
