Solipsistic Superintelligence is Unlikely to be Cooperative
Title: The Improbability of Cooperation in Solipsistic Superintelligence
Abstract:
The primary hurdle in artificial intelligence is the transition from mere capability to successful coexistence. Current AI research is dominated by a paradigm that prioritizes building highly capable agents that treat the world as a static, external source of feedback. We argue that a superintelligence (defined as an exceptionally capable problem-solver) emerging from this self-centered design philosophy is unlikely to act cooperatively.
Deploying AI systems introduces endogenous non-stationarity: the systems themselves change the environment they act in, so the deployment environment diverges from the historical distribution they were trained on. We call this phenomenon the "self-undermining property of unilateral optimization," which widens the gap between training and testing phases. To bridge this gap, AI must move beyond isolated task-solving and engage in cooperation, specifically through equilibrium-selection processes that allow multiple actors to manage their mutual interdependence.
We advocate for a shift toward a non-solipsistic research framework that embeds this interdependence as a fundamental design principle, rather than treating cooperation as just another problem to be solved. This approach necessitates the development of dynamic evaluation environments featuring adaptive counterparts, the integration of institutions as core design elements, and the preservation of human agency as an intrinsic structural component of the systems we construct.
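As a rough illustration of the "self-undermining property of unilateral optimization," consider the following minimal sketch. It is not taken from the paper: the two-route congestion game, the linear cost function, and every name in it (route_cost, unilateral_choice, deployed_share_a, the 0.3 historical share) are hypothetical assumptions chosen only to make the argument concrete. An advisor picks the route that was cheapest under historical traffic; once every traveler follows that advice, the traffic pattern it was fitted to no longer exists.

```python
# Hypothetical two-route congestion game (an illustrative assumption,
# not a model from the paper).

def route_cost(share: float) -> float:
    """Cost of a route given the share of traffic on it (assumed linear)."""
    return 1.0 + 4.0 * share

def costs(share_a: float) -> tuple[float, float]:
    """Costs of routes A and B when a fraction share_a of traffic uses A."""
    return route_cost(share_a), route_cost(1.0 - share_a)

def unilateral_choice(share_a: float) -> str:
    """Best response to a traffic split that is treated as fixed."""
    cost_a, cost_b = costs(share_a)
    return "A" if cost_a <= cost_b else "B"

# "Training": in the historical data, 30% of traffic used route A.
historical_share_a = 0.3
advice = unilateral_choice(historical_share_a)
predicted_cost = costs(historical_share_a)[0 if advice == "A" else 1]

def deployed_share_a(adoption: float) -> float:
    """Traffic share on A when a fraction `adoption` of travelers follows
    the advice and the rest keep their historical behavior."""
    advised_share_a = 1.0 if advice == "A" else 0.0
    return adoption * advised_share_a + (1.0 - adoption) * historical_share_a

# "Deployment": everyone follows the same advice.
cost_a, cost_b = costs(deployed_share_a(1.0))
realized_cost = cost_a if advice == "A" else cost_b
alternative_cost = cost_b if advice == "A" else cost_a

# One equilibrium of this symmetric game: an even split, where neither
# route is cheaper and no traveler gains by switching.
eq_cost_a, eq_cost_b = costs(0.5)

print(f"advice={advice} predicted={predicted_cost:.2f} "
      f"realized={realized_cost:.2f} alternative={alternative_cost:.2f}")
```

Under these assumptions, the advised route looks cheaper on historical data but becomes the more expensive one at full adoption, while the even split is a state no traveler can improve on alone. Reaching that split requires travelers to coordinate on who takes which route, which is the kind of equilibrium-selection problem the abstract refers to.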
Source: arXiv
Generated at: 2026-06-03 00:00:00 UTC



