TriAlign: Towards Universal Truth Consistency in Personalized LLM Alignment
Original: arXiv:2606.01755v1 Announce Type: new
Abstract: While personalized large language models tailor their outputs to align with individual user preferences and social identities, this customization can inadvertently create significant disparities in how universal truths are presented across different social groups. Specifically, certain demographics may systematically encounter less accurate responses when dealing with objective tasks. Current alignment strategies typically either bypass personalization entirely or concentrate predominantly on subjective preference matching, thereby neglecting the critical dimensions of fairness and consistency regarding factual accuracy.
To bridge this gap, we introduce Truth-Invariant Alignment (TIA), a new alignment problem for personalized LLMs. TIA aims to keep universal truths consistent across all social groups while preserving the benefits of personalization. To address it, we present TriAlign, the first offline multi-agent reinforcement learning (MARL) framework designed for TIA, in which each social group is modeled as an interacting agent.
TriAlign jointly optimizes three objectives: accuracy on universal truths, consistency of those truths across groups, and personalization quality. It does so through a fairness-aware objective with an explicit inconsistency penalty. Experiments on multiple benchmarks show that TriAlign achieves a better balance among these competing goals than strong baselines: it reduces cross-group disparities in universal-truth accuracy while improving both objective-task performance and the quality of personalized interactions.
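As a rough illustration of how a three-way objective with an explicit inconsistency penalty might be scored, the sketch below combines per-group truth accuracy, a cross-group gap penalty, and a personalization term. The abstract does not give the actual objective, so everything here is an assumption made for illustration: the function name `tia_objective`, the max-min gap as the penalty, the weights `lam` and `beta`, and the toy scores.

```python
from statistics import mean


def tia_objective(truth_acc, personalization, lam=1.0, beta=0.5):
    """Hypothetical fairness-sensitive score (not the paper's objective).

    truth_acc:       dict mapping group name -> accuracy on universal truths
    personalization: dict mapping group name -> personalization quality
    lam:             assumed weight on the inconsistency penalty
    beta:            assumed weight on the personalization term
    """
    groups = sorted(truth_acc)
    # Average truth accuracy and personalization quality over all groups.
    avg_acc = mean(truth_acc[g] for g in groups)
    avg_pers = mean(personalization[g] for g in groups)
    # Assumed inconsistency penalty: gap between best- and worst-served group.
    gap = max(truth_acc[g] for g in groups) - min(truth_acc[g] for g in groups)
    return avg_acc + beta * avg_pers - lam * gap


# Toy scores (made up): same average accuracy, different cross-group gaps.
personalization = {"group_a": 0.80, "group_b": 0.80}
uneven = tia_objective({"group_a": 0.90, "group_b": 0.70}, personalization)
even = tia_objective({"group_a": 0.80, "group_b": 0.80}, personalization)
```

Under these assumptions the evenly served configuration scores higher than the uneven one even though average accuracy is identical, which is the behavior an inconsistency penalty is meant to encourage.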
Source: arXiv Generated at: 2026-06-02 00:00:00 UTC




