arXiv

Accurate Large-sample Uncertainty Quantification using Stochastic Gradient Markov Chain Monte Carlo

June 2, 2026 · Yu Wang, Jie Ding, Jonathan H. Huggins

Title: Precise Uncertainty Quantification for Large-Scale Data via Stochastic Gradient Markov Chain Monte Carlo

Abstract:

Calibrating stochastic gradient Langevin dynamics (SGLD) and stochastic gradient descent (SGD) for uncertainty quantification and approximate sampling is difficult, especially in practical regimes with large batch sizes or model misspecification. Existing theory, which often relies on continuous-time limits or stringent statistical assumptions, tends to give quantitatively unreliable guidance in these regimes. To overcome these limitations, we introduce novel discrete-time approximations for SG(L)D, both with and without momentum. These approximations enable precise predictions of key quantities, including the stationary covariance, the covariance of iterate averages, and the integrated autocorrelation time. Furthermore, we establish non-asymptotic, quantitative error bounds showing that these predictions are accurate enough for practical tuning and uncertainty quantification. Our numerical experiments confirm that this theoretical framework offers better tuning guidance across various models and data-generating distributions where previous methods falter, including cases that use a $\beta$-divergence instead of the log-loss to obtain statistically robust inferences.
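To make the quantities named in the abstract concrete, the following is a minimal sketch of the standard SGLD update on a toy problem, not the discrete-time approximation proposed in the paper. It assumes a Gaussian mean model with known unit variance and a flat prior, so the exact posterior variance is 1/n. The function names, step size, batch size, and the simple truncated autocorrelation estimator are all illustrative assumptions.

```python
import numpy as np


def sgld_gaussian_mean(data, step_size, batch_size, n_iters, rng):
    """Standard SGLD for the mean of unit-variance Gaussian data (flat prior).

    Hypothetical helper for illustration only; not taken from the paper.
    """
    n = len(data)
    theta = float(data.mean())  # start at the posterior mode to avoid burn-in
    samples = np.empty(n_iters)
    for t in range(n_iters):
        batch = rng.choice(data, size=batch_size, replace=False)
        # Minibatch gradient of the negative log posterior, rescaled to n points.
        grad = n * (theta - batch.mean())
        noise = np.sqrt(step_size) * rng.standard_normal()
        theta = theta - 0.5 * step_size * grad + noise
        samples[t] = theta
    return samples


def integrated_autocorr_time(x, max_lag=500):
    """Estimate 1 + 2 * sum of autocorrelations, stopping at the first
    non-positive lag (a simple truncation rule, assumed here for brevity)."""
    centered = x - x.mean()
    var = np.mean(centered * centered)
    tau = 1.0
    for lag in range(1, max_lag + 1):
        rho = np.mean(centered[:-lag] * centered[lag:]) / var
        if rho <= 0.0:
            break
        tau += 2.0 * rho
    return tau


rng = np.random.default_rng(0)
data = rng.normal(loc=0.0, scale=1.0, size=1000)
samples = sgld_gaussian_mean(
    data, step_size=1e-4, batch_size=100, n_iters=20000, rng=rng
)

stationary_var = samples.var()
tau = integrated_autocorr_time(samples)

print("empirical stationary variance:  ", stationary_var)
print("exact posterior variance (1/n): ", 1.0 / len(data))
print("integrated autocorrelation time:", tau)
```

In this toy setting the empirical variance typically comes out above 1/n, because both the finite step size and the minibatch gradient noise inflate it. That gap is an instance of the calibration problem the abstract describes.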

Source: arXiv
Generated at: 2026-06-02 00:00:00 UTC