LoopFM: Learning frOm HistOrical RePresentations of Foundation Model for Recommendation
Title: LoopFM: Leveraging Historical Representations from Foundation Models for Recommendation
Abstract: Traditional knowledge distillation (KD) methods typically transfer a single scalar prediction from a massive foundation model (FM) to smaller, specialized vertical models (VMs). This approach often suffers from a diminishing transfer ratio, meaning the VM captures a shrinking fraction of the FM’s performance gains, because a single scalar value fails to convey the rich intermediate knowledge embedded within larger FMs. To overcome this limitation, we introduce LoopFM (Learning frOm HistOrical RePresentations of FM), a novel framework that establishes a high-bandwidth transfer channel. LoopFM achieves this by structuring FM intermediate embeddings as input features—such as user history sequences—for downstream VMs. Crucially, this method eliminates the need for real-time FM inference during serving and avoids architectural coupling between the FM and VM. We present a theoretical analysis of LoopFM, including a gain decomposition and an examination of the transfer ratio. Empirical evaluations on three public benchmarks reveal that LoopFM yields significant AUC improvements, such as gains exceeding 6% on the TaobaoAd dataset, while offering complementary knowledge transfer capabilities alongside KD. In industrial-scale deployments involving billions of examples and trillion-parameter FMs, LoopFM effectively doubles the knowledge transfer ratio compared to KD alone. This implementation resulted in a +0.5% conversion improvement in the first half of its initial launch period, followed by +1.03% and +1.22% conversion gains from two separate subsequent launches in the latter half.
Source: arXiv
Generated at: 2026-06-04 00:00:00 UTC
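The abstract describes feeding logged FM intermediate embeddings to a VM as a user history sequence, so that no FM inference is needed at serving time. The sketch below illustrates that idea only and is not the authors' implementation. The names `FMHistoryStore`, `history_feature`, `EMB_DIM`, and `MAX_LEN`, the in-memory dictionary, and the zero-padding scheme are all assumptions introduced for illustration.

```python
from collections import defaultdict

import numpy as np

EMB_DIM = 4  # hypothetical width of the FM intermediate embedding
MAX_LEN = 3  # hypothetical number of history steps fed to the VM


class FMHistoryStore:
    """Hypothetical log of FM embeddings, keyed by user (assumed design)."""

    def __init__(self):
        # user_id -> list of (timestamp, embedding) pairs written offline by the FM
        self._log = defaultdict(list)

    def append(self, user_id, timestamp, embedding):
        """Record one FM embedding produced for a past event of this user."""
        emb = np.asarray(embedding, dtype=np.float32)
        if emb.shape != (EMB_DIM,):
            raise ValueError(f"expected shape ({EMB_DIM},), got {emb.shape}")
        self._log[user_id].append((timestamp, emb))

    def history_feature(self, user_id, before_ts, max_len=MAX_LEN):
        """Build the VM input: the latest embeddings logged strictly before before_ts.

        Returns a zero-padded (max_len, EMB_DIM) array and a boolean mask
        marking which rows hold real embeddings.
        """
        past = sorted(
            (item for item in self._log.get(user_id, []) if item[0] < before_ts),
            key=lambda item: item[0],  # sort on timestamp only
        )[-max_len:]
        seq = np.zeros((max_len, EMB_DIM), dtype=np.float32)
        mask = np.zeros(max_len, dtype=bool)
        for i, (_, emb) in enumerate(past):
            seq[i] = emb
            mask[i] = True
        return seq, mask


# Usage: three logged events; a request at t=25 may only see the first two.
store = FMHistoryStore()
store.append("u1", 10, [1, 0, 0, 0])
store.append("u1", 20, [0, 1, 0, 0])
store.append("u1", 30, [0, 0, 1, 0])
seq, mask = store.history_feature("u1", before_ts=25)
```

In this sketch the VM consumes `seq` and `mask` like any other sequence feature. The FM only writes to the log ahead of time, which mirrors the decoupling the abstract describes. Filtering on `before_ts` keeps embeddings from later events out of the feature.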
