arXiv

P²-DPO: Grounding Hallucination in Perceptual Processing via Calibration Direct Preference Optimization

Title: P²-DPO: Tackling Hallucination in Perceptual Processing through Calibration Direct Preference Optimization

Abstract:

Hallucination has recently become a focal point of research on Large Vision-Language Models (LVLMs). Direct Preference Optimization (DPO) mitigates hallucination by using corrected human preferences to guide learning. However, this approach has limitations: it does not specifically address the perceptual bottlenecks in attended regions, nor the model's limited visual robustness when images are degraded. Additionally, conventional preference pairs are typically vision-agnostic, and their off-policy nature limits how well they can direct model training.

To overcome these hurdles, we introduce Perceptual Processing Direct Preference Optimization (P²-DPO), a new training framework where the model constructs and learns from its own preference pairs. This approach directly targets the aforementioned visual bottlenecks while sidestepping the problems associated with vision-agnostic and off-policy data. The proposed method features two key components: (1) an on-policy strategy for constructing preference pairs that focus on enhancing perception and ensuring visual robustness, and (2) a specialized Calibration Loss designed to accurately align visual inputs with the causal generation of text.

Our experiments show that P²-DPO surpasses strong baseline models that depend on expensive human feedback, achieving this with similar training costs and data volumes. Moreover, assessments using Attention Region Fidelity (ARF) and tests under image degradation conditions confirm that P²-DPO successfully resolves perceptual bottlenecks in attended areas and enhances the model's robustness against degraded visual inputs.
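
For context, the baseline that the abstract builds on can be sketched as follows. This is a minimal sketch of the standard DPO objective for a single preference pair, not the paper's Calibration Loss or its on-policy pair construction, neither of which the abstract specifies. The function name `dpo_loss`, its argument names, and the default `beta=0.1` are illustrative assumptions; each log-probability is assumed to be the summed token log-likelihood of a response given the image and prompt.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) response pair.

    All names and the default beta are assumptions for illustration;
    this is the generic DPO objective, not the P2-DPO Calibration Loss.
    """
    # Implicit reward margin: how much more the policy prefers the chosen
    # response over the rejected one, relative to the frozen reference model.
    margin = beta * ((policy_chosen_logp - ref_chosen_logp)
                     - (policy_rejected_logp - ref_rejected_logp))
    # Loss is -log(sigmoid(margin)), computed in a numerically stable way.
    if margin > 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Example: the policy favors the chosen response more than the reference does,
# so the loss falls below log(2), the value at zero margin.
loss = dpo_loss(-1.0, -3.0, -2.0, -2.0)
```

The loss shrinks as the policy assigns relatively more probability to the preferred response than the reference model does. In an off-policy setup the two responses come from a fixed dataset, whereas the abstract describes pairs constructed by the model itself.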


Source: arXiv
Generated at: 2026-06-03 00:00:00 UTC
