arXiv

Flatness and Generalization: Learning Multi-Index Models with Homogeneous Neural Networks

Abstract

A prevailing heuristic for explaining how first-order gradient methods generalize in non-convex neural networks is the principle that "flat interpolators generalize effectively" (Hochreiter and Schmidhuber, 1994; Keskar et al., 2017). In this context, flatness is typically quantified by the trace of the Hessian of the empirical loss. However, Dinh et al. (2017) demonstrated that by exploiting network symmetries, one can alter the flatness of a model without affecting either the empirical or the population loss. Consequently, any interpolator can be made either sharper or flatter, which renders the heuristic vacuous as stated.
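The symmetry argument can be illustrated numerically. The following is a minimal sketch, not the paper's construction: it assumes a two-layer ReLU network f(x) = sum_j a_j * relu(<w_j, x>) trained with squared loss, and all names (`predict`, `hessian_trace`, `rescale`) are hypothetical. For ReLU, the diagonal second derivatives of f vanish away from the kinks, so the trace of the loss Hessian reduces to the average squared norm of the parameter gradient of f and does not depend on the labels. Rescaling (a_j, w_j) to (a_j / c, c * w_j) with c > 0 leaves every prediction unchanged but changes this trace.

```python
import random

def relu(z):
    return z if z > 0.0 else 0.0

def predict(a, W, x):
    # Two-layer homogeneous network: f(x) = sum_j a_j * relu(<w_j, x>).
    return sum(aj * relu(sum(wk * xk for wk, xk in zip(wj, x)))
               for aj, wj in zip(a, W))

def hessian_trace(a, W, X):
    # Trace of the Hessian of the squared loss (1/2n) * sum_i (f(x_i) - y_i)^2.
    # Assumption: no pre-activation is exactly zero, so the diagonal of the
    # Hessian of f is zero and the trace equals
    # (1/n) * sum_i ||grad_theta f(x_i)||^2, independent of the labels.
    total = 0.0
    for x in X:
        sq_norm_x = sum(xk * xk for xk in x)
        for aj, wj in zip(a, W):
            pre = sum(wk * xk for wk, xk in zip(wj, x))
            if pre > 0.0:
                total += pre * pre            # (df/da_j)^2
                total += aj * aj * sq_norm_x  # ||df/dw_j||^2
    return total / len(X)

def rescale(a, W, c):
    # Symmetry of positively homogeneous units: (a_j, w_j) -> (a_j / c, c * w_j).
    return [aj / c for aj in a], [[c * wk for wk in wj] for wj in W]

random.seed(0)
d, m, n = 3, 4, 5  # input dimension, hidden width, sample count (illustrative)
X = [[random.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
a = [random.gauss(0.0, 1.0) for _ in range(m)]
W = [[random.gauss(0.0, 1.0) for _ in range(d)] for _ in range(m)]

a_scaled, W_scaled = rescale(a, W, 10.0)

max_pred_gap = max(abs(predict(a, W, x) - predict(a_scaled, W_scaled, x)) for x in X)
trace_before = hessian_trace(a, W, X)
trace_after = hessian_trace(a_scaled, W_scaled, X)

print("largest change in predictions:", max_pred_gap)
print("Hessian trace before / after rescaling:", trace_before, trace_after)
```

Because the predictions are identical, the empirical and population losses are identical too, while the reported trace differs. This is why a statement about flatness has to single out particular representatives, such as the flattest ones, to be meaningful.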

In this study, we investigate the learning of an unknown multi-index model using two-layer non-convex homogeneous neural networks. We demonstrate that a connection between flatness and generalization persists despite the presence of these symmetries. This relationship specifically concerns the "flattest" interpolators: those that achieve, up to order, the minimum flatness value among all possible interpolators.

First, we identify a natural class of interpolators that fail to generalize, and we show that their flatness cannot be brought close to the theoretical minimum even when symmetries are exploited. Second, we prove that for data generated by a sum of single-index models, any flattest interpolator achieves a small population loss, provided that both the approximation error and the label noise are low. Thus, the flattest interpolators consistently generalize. This finding establishes a direct link between flatness and generalization that holds for a broad spectrum of activation functions and realistic data distributions.


Source: arXiv Generated at: 2026-06-04 00:00:00 UTC
