arXiv

Grounded Decoding: Retrieval-Anchored Probability Fusion for Faithful RAG

Original: arXiv:2606.00432v1. Announce Type: new. Abstract: As retrieval-augmented generation (RAG) systems scale, it becomes increasingly challenging to ensure faithful grounding in external evidence. Large language models may still prioritize parametric knowledge over retrieved information when conflicts arise. We propose a novel training-free decoding framework, Grounded Decoding, designed to improve factual consistency in RAG without modifying model parameters. Unlike standard approaches that rely on a single conditional distribution, our method constructs two matched-prompt distributions at every generation step: (1) a full RAG distribution conditioned on the query, retrieved documents, and generated prefix, and (2) a retrieval-only distribution conditioned solely on retrieved evidence and the same prefix. The final next-token distribution is derived as the unique solution to a KL-barycenter objective over the probability simplex, yielding a normalized geometric fusion of the two distributions. This formulation naturally recovers standard RAG when the grounding weight is zero and smoothly shifts probability mass toward retrieved evidence as grounding strength increases. We further introduce a conflict-aware adaptive weighting scheme that dynamically adjusts grounding based on distributional disagreement and retriever confidence. Experiments on ALCE, Natural Questions, and FActScore demonstrate consistent improvements in factual accuracy and citation quality over standard RAG and competitive decoding-time baselines, while maintaining fluency. Our results indicate that probability-level fusion provides a strong and efficient alternative to logit-level intervention methods for faithful RAG decoding.

Rewritten:

As retrieval-augmented generation (RAG) systems scale, keeping their outputs faithfully grounded in external evidence becomes harder. When retrieved information conflicts with what the model already knows, large language models may still favor their parametric knowledge over the retrieved context. To address this, we introduce Grounded Decoding, a training-free decoding framework that improves the factual consistency of RAG systems without modifying model parameters.

In contrast to standard approaches that rely on a single conditional distribution, our method constructs two matched-prompt distributions at every generation step. The first is a full RAG distribution conditioned on the query, the retrieved documents, and the generated prefix. The second is a retrieval-only distribution conditioned solely on the retrieved evidence and the same prefix. The final next-token distribution is the unique solution to a KL-barycenter objective over the probability simplex, which yields a normalized geometric fusion of the two distributions.
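The abstract does not give the fusion formula, so the following is only a minimal sketch under one assumed form. If the fused distribution q minimizes (1 - lam) * KL(q || p_rag) + lam * KL(q || p_ret) over the simplex, the minimizer is q proportional to p_rag^(1 - lam) * p_ret^lam. The function name `geometric_fusion`, the parameter `lam`, the smoothing constant `eps`, and the toy three-token distributions are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def geometric_fusion(p_rag, p_ret, lam, eps=1e-12):
    """Normalized geometric fusion of two next-token distributions.

    Assumed form (not stated in the abstract): q ∝ p_rag^(1 - lam) * p_ret^lam,
    where lam in [0, 1] is the grounding weight.
    """
    p_rag = np.asarray(p_rag, dtype=np.float64)
    p_ret = np.asarray(p_ret, dtype=np.float64)

    # Combine in log space; eps guards against log(0) for zero-probability tokens.
    log_q = (1.0 - lam) * np.log(p_rag + eps) + lam * np.log(p_ret + eps)

    # Subtract the max before exponentiating for numerical stability.
    log_q -= log_q.max()
    q = np.exp(log_q)

    # Renormalize so the result lies on the probability simplex.
    return q / q.sum()


# Toy example: the full RAG distribution prefers token 0,
# while the retrieval-only distribution prefers token 1.
p_rag = np.array([0.70, 0.20, 0.10])
p_ret = np.array([0.10, 0.80, 0.10])

fused = geometric_fusion(p_rag, p_ret, lam=0.5)
print(fused)
```

Under this assumed parameterization, `lam = 0` returns the full RAG distribution and `lam = 1` returns the retrieval-only distribution, both up to the small `eps` smoothing. Intermediate values move probability mass toward tokens that the retrieved evidence supports.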

This formulation recovers standard RAG when the grounding weight is zero. As the grounding strength increases, probability mass shifts smoothly toward the retrieved evidence. We also introduce a conflict-aware adaptive weighting scheme that dynamically adjusts the grounding weight according to the disagreement between the two distributions and the confidence of the retriever.
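The abstract names the two inputs to the adaptive weight (distributional disagreement and retriever confidence) but not how they are combined. The sketch below is therefore purely hypothetical: it measures disagreement with the Jensen-Shannon divergence, normalized to [0, 1], and multiplies it by a retriever confidence score assumed to lie in [0, 1]. The names `js_divergence`, `adaptive_weight`, `retriever_conf`, and `lam_max` are assumptions introduced for illustration.

```python
import numpy as np


def js_divergence(p, q, eps=1e-12):
    """Jensen-Shannon divergence (natural log) between two distributions."""
    # Adding eps creates new arrays, so the caller's inputs are not modified.
    p = np.asarray(p, dtype=np.float64) + eps
    q = np.asarray(q, dtype=np.float64) + eps
    p /= p.sum()
    q /= q.sum()
    m = 0.5 * (p + q)
    kl_pm = np.sum(p * np.log(p / m))
    kl_qm = np.sum(q * np.log(q / m))
    return 0.5 * (kl_pm + kl_qm)


def adaptive_weight(p_rag, p_ret, retriever_conf, lam_max=1.0):
    """Hypothetical conflict-aware grounding weight in [0, lam_max].

    Assumed rule (not from the paper): weight = lam_max * disagreement * confidence.
    """
    # JS divergence is bounded by log(2), so this ratio lies in [0, 1].
    disagreement = js_divergence(p_rag, p_ret) / np.log(2.0)
    disagreement = float(np.clip(disagreement, 0.0, 1.0))
    conf = float(np.clip(retriever_conf, 0.0, 1.0))
    return lam_max * disagreement * conf


# Toy example: the two distributions disagree on the most likely token.
p_rag = np.array([0.70, 0.20, 0.10])
p_ret = np.array([0.10, 0.80, 0.10])

print(adaptive_weight(p_rag, p_ret, retriever_conf=0.9))
print(adaptive_weight(p_rag, p_ret, retriever_conf=0.3))
```

With this rule the weight is near zero when the two distributions agree, so decoding stays close to standard RAG. It grows only when there is a conflict and the retriever is confident. Other monotone combinations would fit the abstract's description equally well.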

Experiments on ALCE, Natural Questions, and FActScore show consistent improvements in factual accuracy and citation quality over standard RAG and competitive decoding-time baselines, while maintaining fluency. These results indicate that probability-level fusion is a strong and efficient alternative to logit-level intervention methods for faithful RAG decoding.


Source: arXiv Generated at: 2026-06-02 00:00:00 UTC
