Scalable Uncertainty Quantification for Extreme Weather Forecasting via Empirical Neural Tangent Kernels
Title: Empirical Neural Tangent Kernels Enable Scalable Uncertainty Quantification for Extreme Weather Prediction
Abstract:
While deep learning models for weather forecasting now rival numerical prediction systems in accuracy and operate with significantly greater speed, they typically yield deterministic outputs lacking uncertainty estimates. This absence of probabilistic context represents a critical deficiency for high-stakes decision-making during extreme weather events. To address this, we introduce Neural Tangent Kernel-based uncertainty quantification (NTK-UQ), which leverages last-layer empirical features. Our theoretical analysis reveals that the quality of uncertainty quantification is inherently tied to model architecture through two primary mechanisms.
First, we identify a "variance collapse" mechanism that delineates conditions under which uncertainty quantification fails. This occurs when the eigenvalue truncation rank nears the effective rank of the feature space, causing the Gaussian Process (GP) correction term to absorb nearly all prior variance. Consequently, the model loses the ability to distinguish between routine conditions and tropical cyclones. Architectures characterized by concentrated spectra, such as spectral operators, necessitate aggressive truncation (specifically $k \leq 10$), whereas attention-based models can accommodate full-rank computations.
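The collapse described above can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a linear kernel on last-layer features (unit Gaussian prior on the weights), a hypothetical noise level `noise`, and synthetic features with an effective rank of 5. All function and variable names are illustrative. Under these assumptions the posterior variance is the prior variance $\|\phi\|^2$ minus a correction summed over the top $k$ eigenpairs of the feature Gram matrix.

```python
import numpy as np

def truncated_posterior_variance(train_feats, test_feats, k, noise=1e-3):
    """Return (posterior variance, prior variance) per test row, using only the top-k eigenpairs."""
    # d x d Gram matrix of the (assumed) last-layer features.
    gram = train_feats.T @ train_feats
    eigvals, eigvecs = np.linalg.eigh(gram)          # eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:k]              # indices of the k largest eigenvalues
    lam = np.clip(eigvals[top], 0.0, None)           # guard against tiny negative round-off
    basis = eigvecs[:, top]                          # shape (d, k)
    prior = np.sum(test_feats ** 2, axis=1)          # prior variance ||phi||^2
    proj = test_feats @ basis                        # coordinates in the top-k eigenbasis
    correction = np.sum(proj ** 2 * (lam / (lam + noise)), axis=1)
    return prior - correction, prior

def remaining_variance_fraction(train_feats, test_feats, k, noise=1e-3):
    """Mean share of prior variance that survives the GP correction."""
    post, prior = truncated_posterior_variance(train_feats, test_feats, k, noise)
    return float(np.mean(post / prior))

# Synthetic features: 32 dimensions, but only 5 independent directions.
rng = np.random.default_rng(0)
mixing = rng.normal(size=(5, 32))
train_feats = rng.normal(size=(200, 5)) @ mixing
test_feats = rng.normal(size=(50, 5)) @ mixing

for k in (2, 5):
    print(k, remaining_variance_fraction(train_feats, test_feats, k))
```

In this toy setting, raising `k` from 2 to the effective rank of 5 drives the remaining variance toward zero for every test input, so the estimate no longer separates one input from another.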
Second, the effectiveness of decomposition techniques is influenced by the non-Gaussian, heavy-tailed nature of extreme weather phenomena. Independent Component Analysis (ICA) outperforms Singular Value Decomposition (SVD) by utilizing higher-order statistics, such as kurtosis and negentropy, to isolate features associated with heavy-tailed extreme events. In contrast, SVD is limited to capturing only second-order variance. We propose a data-driven selection rule that determines whether to employ ICA or SVD based on the concentration ratio of the feature eigenspectrum. This rule accurately prescribes the superior decomposition method for all four architectures evaluated in our study.
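A selection rule of this kind might be sketched as follows. The abstract gives neither the exact definition of the concentration ratio, nor the threshold, nor which method is chosen on which side of it. The sketch therefore assumes the ratio is the share of total feature variance carried by the top `top_m` eigenvalues, and it leaves the threshold and both outcomes as caller-supplied, hypothetical parameters.

```python
import numpy as np

def concentration_ratio(feats, top_m=1):
    """Share of total variance in the top_m eigenvalues of the feature covariance (assumed definition)."""
    centered = feats - feats.mean(axis=0, keepdims=True)
    sing = np.linalg.svd(centered, compute_uv=False)  # singular values, descending
    eig = sing ** 2                                   # proportional to covariance eigenvalues
    return float(eig[:top_m].sum() / eig.sum())

def choose_decomposition(ratio, threshold, if_concentrated, if_diffuse):
    """Return one of two caller-supplied labels; the mapping itself is a placeholder."""
    return if_concentrated if ratio >= threshold else if_diffuse

# Toy comparison: near-isotropic features versus features with one dominant direction.
rng = np.random.default_rng(0)
diffuse = rng.normal(size=(500, 16))
scales = np.ones(16)
scales[0] = 100.0
concentrated = diffuse * scales

print(concentration_ratio(diffuse, top_m=4))
print(concentration_ratio(concentrated, top_m=1))
```

Passing the two outcomes explicitly keeps the sketch from asserting a direction for the rule that the abstract does not state.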
In comparative testing against split conformal prediction, the standard post-hoc baseline, NTK-UQ generates prediction intervals that are 31–37% sharper at 90% coverage. Uniquely, NTK-UQ produces adaptive intervals that scale with the severity of the extreme event, a capability that conformal prediction cannot offer by design. The proposed framework requires no model retraining; estimating uncertainty at inference time takes only a single matrix-vector product per sample.
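The contrast between the two interval types can be sketched briefly. The split conformal part below is the textbook procedure with absolute-residual scores. The adaptive part assumes Gaussian intervals built from a per-sample predictive standard deviation, which is one plausible reading of the abstract rather than the paper's confirmed construction. The function names are illustrative.

```python
import math
import numpy as np

def split_conformal_halfwidth(cal_preds, cal_targets, alpha=0.1):
    """Half-width from split conformal prediction with absolute-residual scores."""
    scores = np.sort(np.abs(np.asarray(cal_targets) - np.asarray(cal_preds)))
    n = scores.size
    rank = math.ceil((n + 1) * (1 - alpha))   # finite-sample corrected quantile rank
    if rank > n:
        return float("inf")                   # too few calibration points for this alpha
    return float(scores[rank - 1])

def gaussian_halfwidth(pred_std, z=1.6449):
    """Per-sample half-width for a two-sided 90% Gaussian interval (z is about 1.645)."""
    return z * np.asarray(pred_std, dtype=float)

# Calibration residuals 1..99 give a single half-width shared by every test point.
cal_preds = np.zeros(99)
cal_targets = np.arange(1, 100, dtype=float)
constant_width = split_conformal_halfwidth(cal_preds, cal_targets, alpha=0.1)

# Hypothetical predictive standard deviations for a calm case and an extreme case.
adaptive_widths = gaussian_halfwidth([0.5, 4.0])
print(constant_width, adaptive_widths)
```

The conformal half-width is one number applied to all inputs, whereas the Gaussian half-width grows with each sample's predictive standard deviation.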
Source: arXiv
Generated at: 2026-06-03 00:00:00 UTC



