arXiv

Toward AI That Understands Self and Others: A World-Model Theory of Cognitive Diversity and Alignment

June 3, 2026 · Toru Takahashi

Abstract:

Despite having access to unprecedented volumes of information, modern societies fail to coalesce around a unified interpretation of reality. Identical events, facts, technologies, legal frameworks, or risks are frequently construed through vastly different lenses—viewed variously as symbols of liberty, threats, sources of exclusion, manifestations of injustice, burdens of responsibility, or unfulfilled potential. While contemporary discourse often attributes these divergences to clashes in values, preferences, or beliefs, this paper contends that such explicit disagreement is merely a superficial, late-stage manifestation of a deeper cognitive process.

The core argument rests on a fundamental distinction: observation does not automatically equate to inference. Within any sequence of observations, not every datum carries inferential weight, nor does every conceivable object serve as a target for estimation. A potential target becomes relevant only when a state representation is established that is sufficiently accurate—though not perfect—for predicting, evaluating, or acting upon that target.

To account for this gap between observation and inference, the paper proposes a world-model theory of cognitive diversity and alignment. It recasts recognition as the construction of approximate sufficient statistics under constraints on information availability, representational capacity, observation limits, and action possibilities. This perspective is formalized as the Multi-Phase Inference Assumption (MIA), and the mechanism underlying it is termed the Multi-Phase Inference Mechanism (MIM).

The framework utilizes "alignment maps" and "transformation loss" to examine how disparate world models interact and communicate without being forced into a monolithic representation. Consequently, alignment is redefined not as consensus, but as processability. The goal for AI system design is thus to enable heterogeneous intelligences to remain mutually comprehensible and actionable, while safeguarding their unique abilities to detect errors.
