Can Factual Opinions Be Edited (Manipulated) in Large Language Models?
Title: Is It Possible to Alter Factual Opinions Within Large Language Models?
Abstract: As Large Language Models (LLMs) are deployed across a growing number of sectors, the ability to edit their knowledge has become a critical capability, and one that carries significant risk. Existing editing work focuses largely on atomic facts and neglects the substantial risks of manipulating factual opinions, such as the recorded positions of public figures on societal issues. Altering such opinions could reshape public personas, sway electoral outcomes, and shift collective societal perspectives. To evaluate this vulnerability rigorously, we present the Factual Opinion Editing with Evidence (FOE) benchmark, which covers 261 public figures, spans 19 distinct issue categories, and comprises 2,178 comprehensive opinion records. Our analysis shows that contemporary editing techniques perform poorly on factual opinions: they typically produce only surface-level changes and fail to keep the revised stance logically consistent with the supporting evidence the model generates. To address this shortcoming, we introduce a simple yet effective approach called Self-Generated Evidence-Aligned, which aligns opinions with evidence without requiring explicit instructional cues. Together, the benchmark and method establish a baseline for understanding the emerging security challenges posed by factual opinion editing in LLMs.
Source: arXiv
Generated at: 2026-06-03 00:00:00 UTC





