Title: CORE: Conflict-Oriented Reasoning for General Multimodal Manipulation Detection
Abstract:
The proliferation of generative AI has led to a surge in highly realistic and widespread multimodal disinformation, presenting significant risks to social cohesion and public confidence. Current detection strategies often depend on specialized models tailored to specific manipulation techniques and require extensive labeled datasets, which limits their ability to generalize to novel forms of fraud. We posit that the fundamental nature of manipulated content is defined by intrinsic contradictions—specifically, semantic or physical discrepancies that exist either between different modalities or in opposition to established world knowledge. Drawing from this insight, we introduce the Conflict-Oriented REasoning (CORE) framework. This approach empowers multimodal large language models (MLLMs) with dedicated capabilities for identifying such conflicts. To facilitate this, we developed the Conflict Attribution Corpus (CAC), a dataset featuring detailed annotations of conflict sources and factors, which serves as the foundation for training conflict perception. Leveraging the CAC, CORE employs conflict-oriented representation enhancement and reasoning processes to deliver robust and adaptable detection performance. This enables the system to quickly adjust to unfamiliar manipulation styles, even with minimal examples or in zero-shot scenarios. Our comprehensive experiments indicate that CORE outperforms existing state-of-the-art models. Both the code and the dataset are accessible at https://github.com/shen8424/CORE.
Source: arXiv
Generated at: 2026-06-03 00:00:00 UTC



