arXiv

GroupToM-Bench: Benchmarking Group Theory of Mind and Nonlinear Social Emergence in MLLMs

June 4, 2026 · Weidong Tang, Jierui Li, Yueling Hou, Zihan Mei, Can Zhang, Xinyan Wan, Zhiyuan Liang, Pengfei Zhou, Yang You, Wangbo Zhao

Title: GroupToM-Bench: Evaluating Group Theory of Mind and Nonlinear Social Emergence in Multimodal Large Language Models

Abstract: Achieving genuine general intelligence demands more than just understanding the physical environment; it necessitates a robust model of the social world capable of deducing how individual mental states intersect and coalesce into collective results. While significant strides have been made in individual-level Theory of Mind (ToM) reasoning, current multimodal large language models (MLLMs) remain inadequate for this expansive challenge. Because collective behavior arises non-linearly from factors such as structural constraints, conformity dynamics, and social tensions, it cannot be accurately reconstructed by simply aggregating individual intentions.

To address this, we introduce GroupToM-Bench, the first multimodal benchmark designed for group-level ToM. The benchmark is organized around a causal progression that connects micro-level BDI states (belief, desire, intention) to meso-level group tensions and structural constraints, and from there to macro-level outcome prediction and mechanistic attribution. We use a seven-level cognitive audit framework to examine each stage of this progression. Our experiments reveal a clear gap between current models and human baselines, showing that current systems struggle to reason about social structures and non-linear collective dynamics.
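As a rough illustration of why collective outcomes cannot be recovered by summing individual intentions, the sketch below uses a simple threshold model of conformity. It is not taken from GroupToM-Bench; the function names and the threshold rule are assumptions chosen only to make the non-linearity concrete. Each member acts once at least a given number of members are already acting.

```python
def naive_aggregate(thresholds):
    """Count members who would act on their own (threshold 0).

    This stands in for 'simply aggregating individual intentions':
    it ignores how members react to one another.
    """
    return sum(1 for t in thresholds if t == 0)


def conformity_cascade(thresholds):
    """Return how many members end up acting under a threshold rule.

    thresholds[i] is the (hypothetical) number of members that must
    already be acting before member i joins in.
    """
    acting = 0
    while True:
        # Everyone whose threshold is met by the current crowd acts.
        now_acting = sum(1 for t in thresholds if t <= acting)
        if now_acting == acting:
            return acting  # fixed point: nobody else is persuaded
        acting = now_acting


# Two groups that differ in a single member's threshold (1 vs. 2).
group_a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
group_b = [0, 2, 2, 3, 4, 5, 6, 7, 8, 9]

for name, group in (("A", group_a), ("B", group_b)):
    print(name, naive_aggregate(group), conformity_cascade(group))
```

Under these assumptions, both groups look identical to the naive aggregate (one unconditional actor each), yet the cascade reaches every member in group A and stalls at a single member in group B. A small change in one individual's disposition flips the macro-level outcome, which is the kind of non-linear emergence the abstract describes.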
