Large Language Models Are Overconfident in Their Own Responses
Large Language Models Exhibit Excessive Confidence in Their Own Outputs
arXiv:2606.03437v1 Announce Type: new Abstract: Previous research has established that instruction-tuned large language models (LLMs) suffer from poorer calibration compared to their base pre-trained versions. Yet, the specific influence of the widely adopted chat template on the calibration of conversational LLMs remains largely unexplored. This study isolates the drivers of this miscalibration by separating the impacts of post-training algorithms from chat formatting. Our findings indicate that while instruction tuning inherently degrades calibration, the chat template exacerbates the problem via an "ownership bias." Specifically, models display substantially higher confidence in their own generated answers than in identical responses attributed to a user.
Through extensive experiments involving six recent open-weight LLMs, three distinct benchmarks, and three confidence elicitation methods, we observed that models can assign confidence scores up to 26% higher to their own outputs. Capitalizing on this discovery, we introduce a straightforward inference-time technique: presenting the model’s generated answer as user input during the confidence elicitation process. This method effectively curtails overconfidence and enhances calibration by as much as 26%, eliminating the need for retraining and significantly closing the performance gap between base and instruction-tuned models.
Source: arXiv Generated at: 2026-06-03 00:00:00 UTC





