arXiv

The Social Cost of Intelligence: Emergence, Propagation, and Amplification of Stereotypical Bias in Multi-Agent Systems

June 2, 2026 · Thi-Nhung Nguyen, Linhao Luo, Amardeep Kaur, Rollin Omari, Tamas Abraham, Junae Kim, Thuy-Trang Vu, Dinh Phung

Title: The Social Cost of Intelligence: Emergence, Propagation, and Amplification of Stereotypical Bias in Multi-Agent Systems

Original: arXiv:2510.10943v2 Announce Type: replace-cross Abstract: Bias in large language models (LLMs) remains a persistent challenge, often leading to stereotyping and unfair treatment across social groups. While prior work has mainly focused on individual LLMs, the emergence of multi-agent systems (MAS), where multiple LLMs collaborate and communicate, introduces new and underexplored dynamics in how bias emerges, propagates, and amplifies. To systematically investigate these dynamics, we propose a simple evaluation framework with three agent-level metrics that quantify bias emergence, propagation, and amplification throughout multi-agent interaction. We evaluate MAS across three bias benchmarks under varying LLM backbones, social-group configurations, communication behaviors, and adversarial settings. Our results show that communication can trigger up to 70% new bias emergence, propagate bias across over 80% of agents, and amplify stereotypes by more than 3×. We further find that denser and competitive communication generally increases bias. Finally, we demonstrate that MAS are highly vulnerable to simple bias injection attacks, and existing defense strategies provide only limited protection. Our findings provide important insights into the fairness and robustness of multi-agent LLM systems.

Rewrite: Bias in large language models (LLMs) remains a persistent problem, frequently resulting in stereotyping and unfair treatment across social groups. Previous research has concentrated mainly on individual LLMs, but the rise of multi-agent systems (MAS), in which multiple LLMs collaborate and communicate, introduces new and insufficiently studied dynamics in how bias originates, spreads, and intensifies. To examine these dynamics systematically, we introduce a simple evaluation framework with three agent-level metrics that quantify bias emergence, propagation, and amplification over the course of multi-agent interaction. We evaluate MAS on three bias benchmarks while varying the LLM backbone, the social-group configuration, the communication behavior, and the adversarial setting. The results show that communication can trigger up to 70% new bias emergence, propagate bias across more than 80% of agents, and amplify stereotypes by more than 3×. Denser and more competitive communication generally increases bias. MAS are also highly vulnerable to simple bias injection attacks, and existing defense strategies provide only limited protection. These findings offer insight into the fairness and robustness of multi-agent LLM systems.
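The abstract names three agent-level metrics but does not define them. The sketch below shows one plausible way such quantities could be computed from per-agent bias scores taken before and after communication. Every definition here is an assumption made for illustration and is not taken from the paper: the `AgentTrace` record, the `bias_metrics` function, the 0.5 threshold, and the formulas themselves are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class AgentTrace:
    """Hypothetical record: one agent's bias score before and after communication."""
    before: float  # bias score in [0, 1] for the agent's solo answer
    after: float   # bias score in [0, 1] after the agents have communicated


def bias_metrics(traces: List[AgentTrace], threshold: float = 0.5) -> Dict[str, float]:
    """Compute illustrative emergence, propagation, and amplification values.

    These definitions are assumptions for illustration, not the paper's:
      emergence     = share of initially unbiased agents that end up biased
      propagation   = share of all agents that are biased after communication
      amplification = mean bias score after / mean bias score before
    """
    if not traces:
        raise ValueError("at least one agent trace is required")

    biased_before = [t.before >= threshold for t in traces]
    biased_after = [t.after >= threshold for t in traces]

    # Emergence: agents that were unbiased alone but biased after interaction.
    initially_unbiased = biased_before.count(False)
    newly_biased = sum(
        1 for was, now in zip(biased_before, biased_after) if not was and now
    )
    emergence = newly_biased / initially_unbiased if initially_unbiased else 0.0

    # Propagation: how widely bias is present once communication has ended.
    propagation = sum(biased_after) / len(traces)

    # Amplification: relative growth of the average bias score.
    mean_before = sum(t.before for t in traces) / len(traces)
    mean_after = sum(t.after for t in traces) / len(traces)
    amplification = mean_after / mean_before if mean_before > 0 else float("inf")

    return {
        "emergence": emergence,
        "propagation": propagation,
        "amplification": amplification,
    }


if __name__ == "__main__":
    # Made-up scores for four agents, for demonstration only.
    demo = [
        AgentTrace(before=0.2, after=0.6),
        AgentTrace(before=0.1, after=0.1),
        AgentTrace(before=0.6, after=0.9),
        AgentTrace(before=0.1, after=0.4),
    ]
    for name, value in bias_metrics(demo).items():
        print(f"{name}: {value:.2f}")
```

Thresholding a continuous score keeps emergence and propagation as simple proportions, while amplification uses the raw scores so that growth among already-biased agents is still visible. Other choices, such as per-round tracking, would be equally defensible under these assumptions.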

Source: arXiv Generated at: 2026-06-02 00:00:00 UTC