arXiv

Capability Self-Assessment: Teaching LLMs to Know Their Limits

June 2, 2026 · Haoyan Yang, Reza Shirkavand, Yukai Jin, Jiawei Zhou, Shangqian Gao, Heng Huang

arXiv:2606.00251v1 Announce Type: new

Abstract:

A reliable intelligent system must be able to recognize its own limitations and decide whether to tackle a problem directly or delegate it. However, we find that contemporary large language models systematically lack this skill: across model families and scales, they overestimate their own competence and attempt queries beyond their capabilities. We call this ability Capability Self-Assessment (CSA) and frame it as a policy-learning problem whose goal is to improve self-assessment without compromising the model’s existing capabilities. We show that reinforcement learning teaches CSA effectively, significantly outperforming supervised fine-tuning while preserving the model’s original capabilities. In contrast, supervised fine-tuning severely degrades the very capabilities the model is meant to assess. Furthermore, the learned self-assessment behavior generalizes strongly out of distribution, indicating that CSA is a transferable trait among models. Finally, CSA has practical utility: it improves local-cloud decision-making at inference time and serves as a useful signal for selecting targeted data during training.
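
The local-cloud use case mentioned above can be pictured with a minimal sketch. This is not the authors' implementation: the abstract does not specify how the self-assessment is exposed or consumed, so the function names (`route_query`, `toy_self_assess`), the numeric confidence score, and the `threshold` parameter are all hypothetical, and the stub models stand in for real local and cloud LLM calls.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class RoutedAnswer:
    answer: str
    handled_by: str  # "local" or "cloud"


def route_query(
    query: str,
    self_assess: Callable[[str], float],
    local_answer: Callable[[str], str],
    cloud_answer: Callable[[str], str],
    threshold: float = 0.5,
) -> RoutedAnswer:
    """Answer locally if the local model judges itself capable, else delegate.

    `self_assess` is a hypothetical hook returning the local model's estimated
    probability (0.0 to 1.0) that it can solve `query`; the threshold value is
    an assumption, not a number from the paper.
    """
    confidence = self_assess(query)
    if confidence >= threshold:
        return RoutedAnswer(local_answer(query), "local")
    return RoutedAnswer(cloud_answer(query), "cloud")


# Toy stand-ins (assumptions for illustration only): the "local model" claims
# competence on short queries and defers on longer ones.
def toy_self_assess(query: str) -> float:
    return 0.9 if len(query.split()) <= 5 else 0.2


def toy_local(query: str) -> str:
    return f"[local] {query}"


def toy_cloud(query: str) -> str:
    return f"[cloud] {query}"


short = route_query("What is 2 + 2?", toy_self_assess, toy_local, toy_cloud)
long_ = route_query(
    "Prove that every bounded monotone sequence of real numbers converges",
    toy_self_assess,
    toy_local,
    toy_cloud,
)
print(short.handled_by, long_.handled_by)
```

In this sketch the quality of routing depends entirely on how well calibrated `self_assess` is: a model that overestimates its competence, as the abstract reports for current LLMs, would keep too many queries local.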

Source: arXiv Generated at: 2026-06-02 00:00:00 UTC