Non-vacuous Generalization Bounds for Deep Neural Networks without any modification to the trained models
Title: Non-trivial Generalization Bounds for Deep Neural Networks Without Altering Trained Models
Abstract: Understanding and verifying the performance of contemporary deep neural networks remains a significant obstacle to building reliable machine learning systems. To address this, we propose a novel class of data-dependent generalization bounds that apply directly to trained networks, without any modification to the trained model. Specifically, we introduce a bound that is exactly computable and non-vacuous for every network tested, including large models such as those with 600 million parameters on ImageNet. This study is the first to demonstrate that non-trivial generalization guarantees are attainable for large-scale, unmodified deep networks.
Our methodology shows that generalization performance is driven by the interplay between the trained model and the geometric structure of the data distribution. We decompose the generalization error into two interpretable factors: a distributional complexity term, which reflects how data density is spread across a partition of the input space, and local model-behavior terms, which describe the network’s behavior within each region of that partition. This joint dependence clarifies where generalization gaps arise and why. Empirical evidence suggests that certain components of our bound strongly predict actual test errors. Furthermore, the bound becomes tighter when the chosen partition aligns with the intrinsic geometry of the data, underscoring data-dependent local regularity as a crucial factor in generalization.
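As a rough illustration of the partition-based view described above, the sketch below splits a sample into cells and rewrites the average loss as a sum of (cell mass × local mean loss). This is only the elementary identity behind such a decomposition, not the bound proposed in the paper; the 1-D inputs, the nearest-center partition, the synthetic 0-1 losses, and the names `assign_cell` and `decompose_loss` are all assumptions made for the example.

```python
import random


def assign_cell(x, centers):
    """Return the index of the nearest center (1-D inputs for simplicity)."""
    return min(range(len(centers)), key=lambda k: abs(x - centers[k]))


def decompose_loss(xs, losses, centers):
    """Split the mean loss into per-cell (mass, local mean loss) pairs."""
    n = len(xs)
    sums = [0.0] * len(centers)
    counts = [0] * len(centers)
    for x, loss in zip(xs, losses):
        k = assign_cell(x, centers)
        sums[k] += loss
        counts[k] += 1
    # Distributional part: fraction of the sample falling in each cell.
    masses = [c / n for c in counts]
    # Local model-behavior part: mean loss inside each cell (0 if empty).
    local = [s / c if c else 0.0 for s, c in zip(sums, counts)]
    return masses, local


random.seed(0)
xs = [random.uniform(-1.0, 1.0) for _ in range(1000)]
# Hypothetical 0-1 losses: errors are made more likely far from the origin.
losses = [1.0 if random.random() < 0.3 * abs(x) else 0.0 for x in xs]
# Hypothetical partition: four cells defined by nearest center.
centers = [-0.75, -0.25, 0.25, 0.75]

masses, local = decompose_loss(xs, losses, centers)
overall = sum(losses) / len(losses)
recombined = sum(m * l for m, l in zip(masses, local))

for k, (m, l) in enumerate(zip(masses, local)):
    print(f"cell {k}: mass={m:.3f}, local mean loss={l:.3f}")
print(f"overall={overall:.4f}, recombined={recombined:.4f}")
```

The per-cell printout shows which regions contribute most to the overall error, which is the sense in which a partition can localize where a model performs poorly. A partition that groups points with similar loss behavior gives more informative local terms.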
Source: arXiv
Generated at: 2026-06-02 00:00:00 UTC





