arXiv

GroupToM-Bench: Benchmarking Group Theory of Mind and Nonlinear Social Emergence in MLLMs

Title: GroupToM-Bench: Evaluating Group Theory of Mind and Nonlinear Social Emergence in Multimodal Large Language Models

Abstract: Genuine general intelligence requires more than an understanding of the physical environment; it also requires a robust model of the social world that can infer how individual mental states interact and combine into collective outcomes. Although individual-level Theory of Mind (ToM) reasoning has advanced considerably, current multimodal large language models (MLLMs) remain inadequate for this broader challenge. Collective behavior arises nonlinearly from factors such as structural constraints, conformity dynamics, and social tensions, so it cannot be reconstructed by simply aggregating individual intentions.

To address this gap, we introduce GroupToM-Bench, the first multimodal benchmark designed for group-level ToM. The benchmark is organized around a causal chain that links micro-level BDI states (belief, desire, intention) to meso-level group tensions and structural constraints, and from there to macro-level outcome prediction and mechanistic attribution. A seven-level cognitive audit framework evaluates models across this entire chain. Our experiments reveal a clear gap between contemporary models and human baselines, highlighting a critical weakness in how current systems handle social structures and nonlinear collective dynamics.


Source: arXiv
Generated at: 2026-06-04 00:00:00 UTC
