TIGER: Traceable Inference with Graph-Based Evidence Routing for Mitigating Hallucinations in Multimodal Generation
Title: TIGER: Graph-Based Evidence Routing for Traceable Inference to Reduce Hallucinations in Multimodal Generation
Abstract:
This study investigates fact-level repair within multimodal generation, addressing the issue where fluent outputs may include specific facts lacking support from the input data. Current inference-time repair strategies typically generate feedback by jointly conditioning on both the input and the existing output. However, this approach presents two primary drawbacks: first, hallucinated assertions in the output can skew the model's interpretation of the input, and second, unstructured feedback cannot be effectively ranked or scheduled at the individual fact level.
To address these challenges, we introduce TIGER, an inference-time framework that restructures feedback for localized repair. TIGER operates by independently extracting an observation graph from the input and a claim graph from the current output. It then calculates a graph-conditioned risk score for each claim, based on the degree of support or conflict. The model proceeds to repair selected high-risk claims without updating the backbone parameters.
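The extract, score, and select steps described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: graphs are reduced to sets of (subject, relation, object) triples, the risk values 0.0, 0.5, and 1.0 are placeholder scores for supported, unsupported, and conflicting claims, and the names `Triple`, `risk_score`, `select_for_repair`, `threshold`, and `budget` are hypothetical. The actual graph extraction and the repair call to the backbone model are omitted.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One edge of a graph, written as (subject, relation, object)."""
    subject: str
    relation: str
    obj: str


def risk_score(claim: Triple, observations: set) -> float:
    """Placeholder risk: 0.0 if supported, 1.0 if contradicted, 0.5 if unsupported."""
    if claim in observations:
        return 0.0  # the observation graph contains the same edge
    for obs in observations:
        # Same subject and relation but a different object counts as a conflict.
        if obs.subject == claim.subject and obs.relation == claim.relation:
            return 1.0
    return 0.5  # nothing in the observation graph speaks to this claim


def select_for_repair(claims, observations, threshold=0.4, budget=2):
    """Rank claims by risk and keep at most `budget` claims at or above `threshold`."""
    scored = [(risk_score(c, observations), c) for c in claims]
    risky = [(r, c) for r, c in scored if r >= threshold]
    risky.sort(key=lambda rc: rc[0], reverse=True)
    return risky[:budget]


# Observation graph (from the input) and claim graph (from the current output),
# built independently so the output cannot bias how the input is read.
observations = {
    Triple("dog", "color", "brown"),
    Triple("dog", "on", "sofa"),
}
claims = [
    Triple("dog", "color", "black"),   # conflicts with the observed color
    Triple("dog", "on", "sofa"),       # supported
    Triple("dog", "wears", "collar"),  # unsupported
]

selected = select_for_repair(claims, observations)
for risk, claim in selected:
    print(f"{risk:.1f}  {claim.subject} {claim.relation} {claim.obj}")
```

Because each claim carries its own score, the repair step can be ranked and budgeted per fact, which is the property that unstructured feedback lacks.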
Our convergence analysis demonstrates that, under mild assumptions, the expected total risk decreases geometrically to a defined asymptotic bound. Empirical evaluations across four cross-modal pathways (image-to-text, image-plus-text-to-text, audio-to-text, and video-to-text) indicate that TIGER reduces unsupported content while maintaining task quality. These improvements are consistent across various backbones. Furthermore, a case study using CrisisFACTS suggests that this repair mechanism can enhance grounding in multi-source environments.
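For readers unfamiliar with this kind of guarantee, the following shows one generic form that a "geometric decrease to an asymptotic bound" can take. It is an illustrative assumption, not the paper's stated theorem: the symbols $R_t$ (total risk after repair round $t$), $\rho \in (0,1)$ (a per-round contraction factor), and $\varepsilon \ge 0$ (a per-round error term, for example from imperfect repairs) are introduced here only for the example.

```latex
% Assumed one-step contraction with additive error:
\mathbb{E}[R_{t+1}] \le \rho\, \mathbb{E}[R_t] + \varepsilon,
\qquad 0 < \rho < 1,\ \varepsilon \ge 0.

% Unrolling the recursion over t rounds:
\mathbb{E}[R_t]
  \le \rho^{t}\, \mathbb{E}[R_0] + \varepsilon \sum_{k=0}^{t-1} \rho^{k}
  =   \rho^{t}\, \mathbb{E}[R_0] + \varepsilon\, \frac{1 - \rho^{t}}{1 - \rho}
  \le \rho^{t}\, \mathbb{E}[R_0] + \frac{\varepsilon}{1 - \rho}.
```

Under these assumptions the first term vanishes geometrically in $t$, and the expected risk settles at or below the floor $\varepsilon / (1 - \rho)$.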
Source: arXiv Generated at: 2026-06-02 00:00:00 UTC




