Bridging the Knowledge-Prediction Gap in LLMs on Multiple-Choice Questions
Title: Closing the Divide Between Internal Knowledge and Predictive Output in LLMs for Multiple-Choice Tasks
Although large language models (LLMs) perform strongly across a wide range of tasks, their reliability is often undermined by inconsistent behavior that does not reflect their internal knowledge. A prominent example is multiple-choice questions (MCQs): LLMs frequently answer incorrectly even when the correct answer is encoded in their hidden representations. This discrepancy points to a misalignment between what the model knows and what it outputs. To address this knowledge-prediction gap in MCQs, we conduct a three-phase analysis of hidden representations. First, we quantify the size of the gap across models and datasets. Second, we offer a geometric perspective by identifying separate knowledge and prediction subspaces within the residual stream. Third, we develop KAPPA, a lightweight inference-time intervention that aligns these two subspaces in the residual stream to narrow the gap. Our findings provide a geometric and interpretable framework for understanding the knowledge-prediction gap in LLMs. Moreover, KAPPA reduces this gap across a variety of MCQ benchmarks and model architectures, and it also proves effective in free-form settings.
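As a rough illustration of the first phase, one plausible way to quantify a knowledge-prediction gap is to compare a probe trained on hidden representations with the model's own answers, and count the questions where the probe is right but the model is wrong. The sketch below is only an assumption about how such a measurement could look; the abstract does not specify the metric, and the function names (`fit_linear_probe`, `probe_predict`, `knowledge_prediction_gap`) and the least-squares probe are hypothetical choices, not the paper's method or KAPPA itself.

```python
import numpy as np


def fit_linear_probe(hidden, labels, n_classes):
    """Fit a least-squares linear probe from hidden states to one-hot answers.

    hidden: array of shape (n_questions, d), e.g. residual-stream activations.
    labels: integer index of the correct option for each question.
    Assumption: a plain least-squares probe stands in for whatever probe
    the authors actually use.
    """
    hidden = np.asarray(hidden, dtype=float)
    targets = np.eye(n_classes)[np.asarray(labels)]
    weights = np.linalg.lstsq(hidden, targets, rcond=None)[0]
    return weights


def probe_predict(hidden, weights):
    """Return the option index the probe scores highest for each question."""
    return np.argmax(np.asarray(hidden, dtype=float) @ weights, axis=1)


def knowledge_prediction_gap(probe_pred, model_pred, labels):
    """Compare probe answers ("knowledge") with model answers ("prediction").

    The gap here is the fraction of questions where the probe is correct
    while the model's output is wrong (an illustrative definition).
    """
    probe_pred = np.asarray(probe_pred)
    model_pred = np.asarray(model_pred)
    labels = np.asarray(labels)
    probe_correct = probe_pred == labels
    model_correct = model_pred == labels
    return {
        "probe_acc": float(probe_correct.mean()),
        "model_acc": float(model_correct.mean()),
        "gap": float(np.mean(probe_correct & ~model_correct)),
    }
```

In practice the probe would be fit on a training split and evaluated on held-out questions, so that probe accuracy reflects information present in the representations and not memorization by the probe.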
Source: arXiv
Generated at: 2026-06-02 00:00:00 UTC





