Score $\times$ Decoder: A Unified View of Unsupervised Inference-Time Scaling for Hallucination Mitigation
Abstract
Even when the correct answer is encoded within their parameters, large language models are prone to hallucination. Although inference-time scaling can retrieve this latent knowledge, state-of-the-art techniques typically rely on supervised components, such as reward models or trained verifiers. This study investigates the capabilities of unsupervised approaches, asking which intrinsic signal most accurately identifies correct outputs and how these signals should be decoded using only a base language model.
We propose a framework that crosses four scoring mechanisms (perplexity, contrastive likelihood, power-distribution likelihood, and self-verification) with three decoding strategies (optimization, sampling, and consensus). We evaluate every combination in this "score × decoder" grid on the MATH500 benchmark, using both the base and instruction-tuned versions of the Qwen3-1.7B model.
Our results indicate that self-verification, in which the model is prompted to evaluate its own response, performs robustly across most scenarios, particularly when enhanced by a training-free virtual-thinking prefix. However, no single score is consistently best: each score's effectiveness depends on both the decoder it is paired with and the capabilities of the underlying model. Consequently, in the absence of external supervision, the scoring metric and the decoding method must be chosen jointly.
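To make the "score × decoder" grid concrete, here is a minimal Python sketch on toy data. The abstract does not define the scores or decoders, so everything below is an assumption made for illustration: the candidate list, the helper names, the reading of perplexity as mean token log-probability, the reading of the power-distribution score as the log-likelihood scaled by an exponent `alpha`, and the three decoder definitions. It is not the paper's implementation.

```python
import math
import random
from collections import defaultdict

# Hypothetical candidates: (final_answer, per-token log-probabilities).
# In practice these would come from sampling a language model.
candidates = [
    ("42", [-0.1, -0.2, -0.1]),
    ("42", [-0.3, -0.4, -0.2]),
    ("17", [-0.05, -0.1, -0.05]),
]

def neg_log_perplexity(token_logprobs):
    """Assumed perplexity score: mean token log-prob (higher = lower perplexity)."""
    return sum(token_logprobs) / len(token_logprobs)

def power_score(token_logprobs, alpha=2.0):
    """Assumed power-distribution score: log of the sequence likelihood raised to alpha."""
    return alpha * sum(token_logprobs)

def decode_optimize(cands, score_fn):
    """Optimization (assumed): answer of the single highest-scoring candidate."""
    return max(cands, key=lambda c: score_fn(c[1]))[0]

def decode_sample(cands, score_fn, rng):
    """Sampling (assumed): draw one candidate with probability proportional to exp(score)."""
    scores = [score_fn(lp) for _, lp in cands]
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    return rng.choices(cands, weights=weights, k=1)[0][0]

def decode_consensus(cands, score_fn):
    """Consensus (assumed): score-weighted vote over final answers."""
    scores = [score_fn(lp) for _, lp in cands]
    top = max(scores)
    votes = defaultdict(float)
    for (answer, _), s in zip(cands, scores):
        votes[answer] += math.exp(s - top)
    return max(votes, key=votes.get)

if __name__ == "__main__":
    rng = random.Random(0)
    for name, score_fn in [("perplexity", neg_log_perplexity), ("power", power_score)]:
        print(name, "optimize ->", decode_optimize(candidates, score_fn))
        print(name, "sample   ->", decode_sample(candidates, score_fn, rng))
        print(name, "consensus->", decode_consensus(candidates, score_fn))
```

On these toy numbers, the assumed perplexity score selects "17" under optimization but "42" under consensus, while the assumed power score selects "17" under both. This is only a small illustration of why a score and a decoder may need to be chosen together; it says nothing about the paper's actual results.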
Source: arXiv Generated at: 2026-06-02 00:00:00 UTC





