REFLEX: Self-Refining Explainable Fact-Checking via Verdict-Anchored Style Control
Abstract:
The surge of misinformation on social media platforms necessitates automated fact-checking tools capable of delivering precise verdicts accompanied by trustworthy explanations. Current approaches leveraging large language models (LLMs) often fail to account for the deceptive stylistic elements present in LLM-generated rationales, leading to unfaithful justifications that can distort human assessment. Furthermore, these methods depend heavily on external knowledge bases, a dependency that introduces hallucinations and significant latency, thereby compromising the reliability and responsiveness required for real-time applications.
To overcome these limitations, we introduce REason-guided Fact-checking with Latent EXplanations (REFLEX), a self-refining framework that explicitly governs reasoning style through verdict anchoring. REFLEX constructs steering vectors from self-disagreement veracity signals, which are generated by comparing the backbone model with its fine-tuned counterpart. This process naturally separates factual content from stylistic noise.
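As a rough illustration of the idea of building a steering vector from self-disagreement, the sketch below uses a generic difference-of-means construction. It is an assumption made for illustration, not the procedure used in REFLEX: the abstract does not specify how activations are collected or combined. The function names (`disagreement_indices`, `steering_vector`, `apply_steering`), the scaling factor `alpha`, and the toy activation values are all hypothetical.

```python
from typing import List, Sequence

Vector = List[float]


def mean_vector(rows: Sequence[Sequence[float]]) -> Vector:
    """Element-wise mean of equally sized vectors."""
    if not rows:
        raise ValueError("need at least one vector")
    dim = len(rows[0])
    return [sum(row[i] for row in rows) / len(rows) for i in range(dim)]


def disagreement_indices(backbone_verdicts: Sequence[str],
                         finetuned_verdicts: Sequence[str]) -> List[int]:
    """Indices of samples where the two models give different verdicts."""
    pairs = zip(backbone_verdicts, finetuned_verdicts)
    return [i for i, (b, f) in enumerate(pairs) if b != f]


def steering_vector(backbone_acts: Sequence[Sequence[float]],
                    finetuned_acts: Sequence[Sequence[float]],
                    backbone_verdicts: Sequence[str],
                    finetuned_verdicts: Sequence[str]) -> Vector:
    """Assumed construction: mean activation difference (fine-tuned minus
    backbone), taken only over samples where the verdicts disagree."""
    idx = disagreement_indices(backbone_verdicts, finetuned_verdicts)
    if not idx:
        raise ValueError("the two models never disagree")
    diffs = [
        [f - b for b, f in zip(backbone_acts[i], finetuned_acts[i])]
        for i in idx
    ]
    return mean_vector(diffs)


def apply_steering(hidden: Sequence[float], vector: Sequence[float],
                   alpha: float = 1.0) -> Vector:
    """Shift a hidden state along the steering direction by alpha."""
    return [h + alpha * v for h, v in zip(hidden, vector)]


# Toy data: three claims, two-dimensional "activations" (hypothetical values).
backbone_verdicts = ["true", "false", "true"]
finetuned_verdicts = ["true", "true", "false"]
backbone_acts = [[0.0, 0.0], [1.0, 2.0], [3.0, 0.0]]
finetuned_acts = [[9.0, 9.0], [2.0, 4.0], [5.0, 2.0]]

v = steering_vector(backbone_acts, finetuned_acts,
                    backbone_verdicts, finetuned_verdicts)
steered = apply_steering([0.0, 1.0], v, alpha=2.0)
print(v, steered)
```

In this toy setup only the second and third claims contribute, since the first claim receives the same verdict from both models. In a real system the activations would come from a chosen layer of each model, and the choice of layer and of `alpha` would need to be tuned.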
Empirical evaluations on real-world datasets indicate that REFLEX attains state-of-the-art results across LLaMA-series models using only 465 self-refined samples. Additionally, owing to its strong transferability, REFLEX secures a performance improvement of up to 7.54% on in-the-wild data. Our findings further confirm that the proposed method effectively reduces faithfulness hallucinations, steering the model toward superior verdict accuracy compared to prior explainable fact-checking techniques.
Source: arXiv
Generated at: 2026-06-03 00:00:00 UTC





