arXiv

What Makes a Strong Model? A Unified Spectral Analysis of Knowledge Transfer over High-dimensional Linear Regression

June 2, 2026 · Wendao Wu, Fangqing Zhang, Haihan Zhang, Cong Fang

Abstract: Knowledge Transfer (KT) is pervasive in contemporary machine learning, spanning classical model compression techniques such as Knowledge Distillation (KD) and the recently observed Weak-to-Strong (W2S) generalization. Prior work has studied these settings in isolation, but a comprehensive theoretical framework explaining why KT succeeds across such varied contexts has been missing. This study presents a unified spectral analysis of Stochastic Gradient Descent (SGD) dynamics in high-dimensional linear regression that explains why KT is effective across these seemingly unrelated settings. We characterize KT efficiency through two primary mechanisms: Spectral Horizon Expansion in KD, which enables the recovery of high-frequency signals that are otherwise statistically unreachable, and Spectral Denoising in W2S, where the student model acts as a filter that suppresses optimization noise. Taken together, the framework shows that the success of knowledge transfer is driven by the interplay between implicit regularization and the differing rates at which spectral components are learned.
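
To illustrate what "differing rates of spectral learning" can look like, here is a minimal sketch of one-pass SGD on a synthetic linear regression problem. It is not the paper's setup. The diagonal covariance, the power-law spectrum `i**-2`, the all-ones target `w_star`, the step size, the noise level, and the function name `sgd_spectral_errors` are all assumptions chosen for illustration. Because the covariance is diagonal, each coordinate is an eigendirection, so the per-coordinate squared error shows how far SGD has progressed along each part of the spectrum.

```python
import numpy as np

def sgd_spectral_errors(d=50, steps=2000, lr=0.1, noise_std=0.1, seed=0):
    """One-pass SGD on linear regression with a diagonal covariance.

    Returns the eigenvalues and the squared error of the iterate along
    each eigendirection. All defaults are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    # Assumed power-law spectrum: eigenvalue of direction i is i^-2.
    eigvals = np.arange(1, d + 1, dtype=float) ** -2.0
    w_star = np.ones(d)   # assumed ground-truth parameter
    w = np.zeros(d)       # SGD starts from zero
    for _ in range(steps):
        # Fresh sample x ~ N(0, diag(eigvals)) with a noisy label.
        x = rng.standard_normal(d) * np.sqrt(eigvals)
        y = x @ w_star + noise_std * rng.standard_normal()
        # Gradient step on the squared loss 0.5 * (x @ w - y)**2.
        w -= lr * (x @ w - y) * x
    return eigvals, (w - w_star) ** 2

eigvals, err = sgd_spectral_errors()
print("largest-eigenvalue direction, squared error: ", err[0])
print("smallest-eigenvalue direction, squared error:", err[-1])
```

Under these assumptions, the largest-eigenvalue direction is fit almost immediately, while the smallest-eigenvalue direction has barely moved from its initial value after the same number of samples. This gap in per-direction learning speed is the kind of effect a spectral analysis of SGD tracks.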

Source: arXiv