arXiv

Cascading Hallucination in Agentic RAG: The CHARM Framework for Detection and Mitigation

June 4, 2026 · Saroj Mishra

Title: Combating Cascading Hallucinations in Agentic RAG: The CHARM Framework for Detection and Mitigation

Abstract:

Multi-step agentic retrieval-augmented generation (RAG) pipelines have proven effective for complex reasoning, but they remain susceptible to a failure that current hallucination detection tools often overlook: cascading hallucination. It occurs when early errors are not only propagated but also amplified through successive reasoning stages, ultimately yielding outputs that are both highly confident and factually wrong. To address this vulnerability, we define cascading hallucination as a distinct failure mode of agentic RAG systems and establish a taxonomy of four cascade patterns. We then introduce CHARM (Cascading Hallucination Aware Resolution and Mitigation), an architectural framework designed to detect and halt error propagation during multi-step reasoning. CHARM operates alongside existing agentic RAG pipelines, without requiring structural changes to them, and uses four components: stage-level fact verification, cross-stage consistency tracking, confidence propagation monitoring, and cascade resolution triggering. We evaluated CHARM across LangChain agentic pipeline configurations using HotpotQA, MuSiQue, 2WikiMultiHopQA, and a bespoke adversarial dataset. The framework achieved a cascade detection rate of 89.4% with a false positive rate of 5.3%, at an average latency overhead of 215 ms ± 18 ms per stage. CHARM reduced error propagation by 82.1%, compared with only 18.5% for output-level detectors. Component ablation studies confirm that each detection module is needed for comprehensive cascade coverage. Combined with human-in-the-loop oversight frameworks, CHARM offers a robust reliability and governance stack suited to the production deployment of agentic AI.
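
To make the four components named in the abstract more concrete, the following is a minimal illustrative sketch of a monitor that sits beside a pipeline and inspects each stage's output. It is not the authors' implementation: `StageResult`, `CascadeMonitor`, the `verify` callback, the `max_confidence_gain` threshold, and the rule that a stage should not be much more confident than the weakest stage before it are all assumptions introduced here for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class StageResult:
    """Hypothetical record of what one pipeline stage produced."""
    name: str
    claims: Dict[str, str]   # claim key -> asserted value
    confidence: float        # stage's self-reported confidence in [0, 1]


@dataclass
class CascadeMonitor:
    """Illustrative sidecar monitor; every name here is an assumption."""
    verify: Callable[[str, str], bool]   # hypothetical external fact checker
    max_confidence_gain: float = 0.2     # assumed threshold, not from the paper
    claims_seen: Dict[str, str] = field(default_factory=dict)
    min_upstream_confidence: float = 1.0
    issues: List[str] = field(default_factory=list)

    def observe(self, stage: StageResult) -> bool:
        """Return True if the pipeline may continue, False to trigger resolution."""
        found: List[str] = []
        for key, value in stage.claims.items():
            # 1. Stage-level fact verification.
            if not self.verify(key, value):
                found.append(f"{stage.name}: claim {key}={value!r} failed verification")
            # 2. Cross-stage consistency: compare with the first value seen for this key.
            earlier = self.claims_seen.setdefault(key, value)
            if earlier != value:
                found.append(f"{stage.name}: {key}={value!r} contradicts earlier {earlier!r}")
        # 3. Confidence propagation: flag a stage that is far more confident
        #    than the least confident stage upstream of it.
        if stage.confidence - self.min_upstream_confidence > self.max_confidence_gain:
            found.append(f"{stage.name}: confidence {stage.confidence:.2f} outruns upstream stages")
        self.min_upstream_confidence = min(self.min_upstream_confidence, stage.confidence)
        # 4. Cascade resolution trigger: any issue halts the pipeline.
        self.issues.extend(found)
        return not found


# Toy usage with a made-up knowledge base; unknown keys are treated as unrefuted.
facts = {"author": "Ada"}
monitor = CascadeMonitor(verify=lambda k, v: facts.get(k, v) == v)
first_ok = monitor.observe(StageResult("retrieve", {"author": "Ada"}, 0.5))
second_ok = monitor.observe(StageResult("reason", {"author": "Bob"}, 0.9))
```

In this toy run the first stage passes and the second is halted, because its claim fails verification, contradicts the earlier stage, and arrives with sharply higher confidence. A caller could respond to a `False` return by re-running retrieval or escalating to a human reviewer; which action is appropriate is left open here.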
