arXiv

Improving Semantic Uncertainty Quantification in LVLMs with Semantic Gaussian Processes

Title: Enhancing Semantic Uncertainty Quantification in Large Vision-Language Models via Semantic Gaussian Processes

Abstract: Large Vision-Language Models (LVLMs) frequently generate outputs that appear credible yet lack reliability, underscoring the critical need for robust uncertainty estimation. Current approaches to assessing semantic uncertainty typically depend on external models to cluster various sampled responses and evaluate their semantic coherence. Nevertheless, such clustering techniques are often unstable; they are highly susceptible to slight changes in wording and may erroneously merge or divide answers that are semantically alike, resulting in inaccurate uncertainty metrics. To address these limitations, we introduce Semantic Gaussian Process Uncertainty (SGPU), a Bayesian methodology that measures semantic uncertainty by examining the geometric arrangement of answer embeddings, thereby circumventing the pitfalls of rigid clustering. SGPU projects generated responses into a dense semantic space, calculates the Gram matrix of their embeddings, and distills their semantic structure through the eigenspectrum. This spectral data is subsequently input into a Gaussian Process Classifier, which is trained to associate patterns of semantic consistency with predictive uncertainty. The framework is versatile, functioning effectively in both black-box and white-box scenarios. Our evaluation across six LLMs and LVLMs on eight datasets covering visual question answering (VQA), image classification, and textual QA demonstrates that SGPU consistently sets new standards for calibration (measured by Expected Calibration Error, ECE) and discrimination (measured by AUROC and AUARC). Furthermore, we demonstrate that SGPU generalizes across different models and modalities, suggesting that its spectral representation effectively captures universal patterns of semantic uncertainty.
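The spectral step described in the abstract (embed the sampled answers, form the Gram matrix, read off its eigenspectrum) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `gram_eigenspectrum`, the cosine (L2-normalised) Gram matrix, the trace normalisation, and the toy embeddings are all assumptions made for illustration, and the abstract does not specify which embedding model or kernel SGPU uses.

```python
import numpy as np


def gram_eigenspectrum(embeddings):
    """Return the descending, trace-normalised eigenvalues of the Gram matrix.

    `embeddings` is an (n_answers, dim) array with non-zero rows.
    Normalising rows to unit length is an assumption of this sketch; it makes
    each Gram entry a cosine similarity between two sampled answers.
    """
    E = np.asarray(embeddings, dtype=float)
    E = E / np.linalg.norm(E, axis=1, keepdims=True)
    gram = E @ E.T                                 # (n_answers, n_answers), symmetric PSD
    eigvals = np.linalg.eigvalsh(gram)[::-1]       # eigvalsh is ascending; reverse it
    eigvals = np.clip(eigvals, 0.0, None)          # remove tiny negative round-off
    return eigvals / eigvals.sum()                 # spectrum sums to 1


# Hypothetical toy embeddings: four identical answers vs. four unrelated answers.
consistent = np.array([[1.0, 0.0, 0.0, 0.0]] * 4)
scattered = np.eye(4)

spec_consistent = gram_eigenspectrum(consistent)   # mass concentrated in one eigenvalue
spec_scattered = gram_eigenspectrum(scattered)     # mass spread evenly
```

In this toy case, semantically consistent answers concentrate the spectrum in a single eigenvalue, while unrelated answers spread it uniformly. Per the abstract, a feature vector of this kind is what would be passed to a Gaussian Process Classifier; scikit-learn's `GaussianProcessClassifier` is one off-the-shelf option, though the abstract does not say which implementation the authors use.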


Source: arXiv
Generated at: 2026-06-04 00:00:00 UTC
