Breaking the Scale Barrier: One-Shot Knowledge Transfer via Frequency Transform
Title: Shattering Scale Limitations: Single-Step Knowledge Transfer Through Frequency Transformation
Abstract:
Fine-tuning large pre-trained networks has become the standard way to adapt to downstream applications, but the knowledge embedded in these models is tied to their specific, monolithic architectures. This rigidity prevents flexible reuse of a model across architectures of different sizes. Existing approaches to this bottleneck either select a subset of parameters, which overlooks the interdependent nature of the knowledge, or use generative models to predict parameters, which requires impractical access to large collections of trained models.
This study posits that the low-frequency components of model weights carry foundational, task-independent knowledge, which we term the "learngene." We support this hypothesis by showing that downstream models and tasks can efficiently inherit this component. Building on this finding, we introduce FRONT (FRequency dOmain kNowledge Transfer), a framework that uses the Discrete Cosine Transform (DCT) to extract these low-frequency learngenes. The extracted knowledge can initialize models of any scale through simple truncation or padding, with no training required. To improve performance further, we offer an optional, low-cost refinement stage that adds a spectral regularizer to enhance the transferability of the learngene. Comprehensive experiments show that FRONT achieves state-of-the-art results, accelerating convergence in vision tasks by up to $15\times$ and reducing average training FLOPs by 40.5% in language tasks. The code is available at https://github.com/LUcy0505/FRONT.
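The extract-then-resize idea described in the abstract can be illustrated with a minimal sketch. This is not the FRONT implementation: the function names `extract_low_freq` and `init_from_low_freq`, the use of a 2-D type-II DCT with orthonormal scaling on a single weight matrix, and the choice of the top-left coefficient block as "low frequency" are all assumptions made here for illustration. The sketch also does not rescale amplitudes when the target size differs from the source size, which a real method may need to handle.

```python
import numpy as np
from scipy.fft import dctn, idctn


def extract_low_freq(weight, keep):
    """Return the top-left (lowest-frequency) block of the 2-D DCT of `weight`.

    `keep` is a hypothetical (rows, cols) budget for retained coefficients.
    """
    coeffs = dctn(weight, type=2, norm="ortho")
    keep_rows, keep_cols = keep
    return coeffs[:keep_rows, :keep_cols].copy()


def init_from_low_freq(low_freq, target_shape):
    """Truncate or zero-pad the coefficients to `target_shape`, then invert.

    Padding fills the missing high frequencies with zeros; truncation drops
    coefficients that do not fit in the target matrix.
    """
    coeffs = np.zeros(target_shape, dtype=low_freq.dtype)
    rows = min(low_freq.shape[0], target_shape[0])
    cols = min(low_freq.shape[1], target_shape[1])
    coeffs[:rows, :cols] = low_freq[:rows, :cols]
    return idctn(coeffs, type=2, norm="ortho")


# Illustrative usage with random data standing in for a pre-trained weight.
rng = np.random.default_rng(0)
source_weight = rng.standard_normal((64, 64))
learngene = extract_low_freq(source_weight, keep=(16, 16))
small_init = init_from_low_freq(learngene, (8, 8))      # truncation
large_init = init_from_low_freq(learngene, (128, 128))  # zero-padding
```

Keeping all coefficients and inverting at the original shape reproduces the source matrix exactly (up to floating-point error), so the information loss in this sketch comes only from the coefficients that are dropped.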
Source: arXiv
Generated at: 2026-06-04 00:00:00 UTC





