arXiv

Self-Evaluation Is Already There: Eliciting Latent Judge Calibration in Base LLMs with Minimal Data


Abstract: As large language models (LLMs) are increasingly assessed by other models, a critical question emerges: can a model forecast how a judge will rate its own output? Our research indicates that this capability is largely innate and requires no specialized training. Prompted with only a few in-context examples, base models predict external judges’ multi-attribute quality scores for open-ended responses significantly better than chance across three distinct benchmarks.

To harness this latent potential, we propose Self-Evaluation Elicitation (SEE). This approach unlocks the model's inherent skills through a concise two-stage process. First, a calibration-coupled reinforcement learning phase enhances both the quality of the model’s answers and its ability to predict the judge’s feedback. Second, a masked distillation phase refines the predictive accuracy of the self-evaluation without altering the underlying answers.

SEE achieves these results using only 160 unique examples, a dataset approximately 31 times smaller than that required by standard reinforcement learning baselines. Despite this minimal data requirement, the method improves calibration on held-out data across all three benchmarks while maintaining high answer quality. Furthermore, the elicited self-evaluation is tightly integrated into the model’s own token distribution and remains robust across judges the model was never trained to predict. This stability indicates that the model has developed a generalizable concept of quality rather than memorizing the preferences of a specific judge. Taken together, these findings suggest that aligning self-evaluation with judges should be viewed as a challenge of elicitation rather than of knowledge acquisition.
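The abstract does not give the reward or loss used by SEE, so the following is only an illustrative sketch of the two ideas it names, not the authors' formulation. It assumes a reward that adds judge-rated quality to a penalty on self-prediction error, and a distillation loss averaged only over self-evaluation tokens so that answer tokens are left untouched. The function names, the `weight` and `scale` parameters, and the absolute-error penalty are all hypothetical.

```python
from typing import Sequence


def calibration_coupled_reward(
    judge_scores: Sequence[float],
    predicted_scores: Sequence[float],
    weight: float = 1.0,   # hypothetical trade-off between quality and calibration
    scale: float = 10.0,   # hypothetical maximum score per attribute
) -> float:
    """Assumed reward: normalized mean judge score minus weighted mean absolute
    error between the judge's scores and the model's own predicted scores."""
    if not judge_scores or len(judge_scores) != len(predicted_scores):
        raise ValueError("score lists must be non-empty and of equal length")
    n = len(judge_scores)
    quality = sum(judge_scores) / (n * scale)
    error = sum(abs(j - p) for j, p in zip(judge_scores, predicted_scores)) / (n * scale)
    return quality - weight * error


def masked_mean_loss(
    token_losses: Sequence[float],
    is_self_eval_token: Sequence[bool],
) -> float:
    """Assumed masked objective: average per-token loss over self-evaluation
    tokens only, so answer tokens contribute nothing to the update."""
    if len(token_losses) != len(is_self_eval_token):
        raise ValueError("losses and mask must have equal length")
    selected = [loss for loss, keep in zip(token_losses, is_self_eval_token) if keep]
    if not selected:
        return 0.0
    return sum(selected) / len(selected)
```

Under these assumptions, a response whose predicted scores match the judge exactly receives a reward equal to its normalized quality, and any prediction error lowers it. The mask in the second function is what would keep the second stage from changing the answers themselves.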


Source: arXiv Generated at: 2026-06-04 00:00:00 UTC
