RAFT: Data Refinement and Adaptive Distillation for Domain Fine-Tuning with Alleviated Forgetting
Title: RAFT: Mitigating Catastrophic Forgetting in Domain Fine-Tuning via Data Refinement and Adaptive Distillation
Abstract:
Supervised fine-tuning (SFT) tailored to specific domains typically enhances performance within that niche but often compromises the model’s broader, general-purpose capabilities. We analyze this decline by identifying two critical deficiencies inherent in domain-specific SFT. The first is the supervision-compatibility gap, which arises because domain-specific targets frequently exhibit distinct stylistic and reasoning patterns that diverge from the natural responses generated by the pre-trained model. The second is the trajectory-preservation gap, where teacher-forced SFT focuses solely on optimizing fixed target tokens, neglecting to constrain how the model behaves when generating its own prefixes. Consequently, the model fails to retain its original behavioral traits.
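The trajectory-preservation gap can be made concrete by writing out the standard teacher-forced objective. The notation below is assumed for illustration and is not taken from the paper: \pi_\theta is the model being tuned, \mathcal{D} the domain dataset, x a prompt, and y^{\ast} its fixed target response.

```latex
\mathcal{L}_{\mathrm{SFT}}(\theta)
  = -\,\mathbb{E}_{(x,\,y^{\ast}) \sim \mathcal{D}}
    \sum_{t=1}^{|y^{\ast}|}
    \log \pi_{\theta}\!\left(y^{\ast}_{t} \,\middle|\, x,\; y^{\ast}_{<t}\right)
```

Every conditioning prefix y^{\ast}_{<t} comes from the dataset, so the objective contains no term on \pi_\theta(\cdot \mid x, y_{<t}) for prefixes y_{<t} sampled from the model itself. Behavior on those self-generated prefixes is therefore left unconstrained.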
To resolve these issues, we introduce RAFT (Data Refinement and Adaptive Distillation for Domain Fine-Tuning with Alleviated Forgetting), a two-stage framework designed to address both gaps. In the initial stage, RAFT generates supervision that is compatible with the model by employing answer fusion, semantic filtering, and self-conditioned rewriting. In the second stage, it implements Answer-Conditioned On-Policy Distillation. During this process, the original instruction-tuned model acts as a teacher, providing soft targets for trajectories generated by the student model, while the fused answer serves as contextual conditioning. To further stabilize the balance between domain-specific and general capabilities, we incorporate top-K temperature distillation and adaptive loss balancing based on Exponential Moving Average (EMA).
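The two stabilizers named above, top-K temperature distillation and EMA-based loss balancing, might look roughly like the following sketch for a single token position. The abstract does not give the exact formulas, so everything here is an assumption: the function and class names are hypothetical, the KL direction (teacher to student), the renormalization over the teacher's top-K tokens, and the ratio-based weighting rule are illustrative choices. The teacher logits are assumed to come from the original instruction-tuned model scoring a student-generated prefix with the fused answer in its context; that scoring step is not shown.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits at a given temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def topk_temperature_kl(teacher_logits, student_logits, k=2, temperature=2.0):
    """KL(teacher || student) restricted to the teacher's top-k tokens.

    Assumption: both distributions are renormalized over the teacher's
    top-k support after temperature scaling. The paper may differ.
    """
    top = sorted(range(len(teacher_logits)),
                 key=lambda i: teacher_logits[i], reverse=True)[:k]
    p = softmax([teacher_logits[i] for i in top], temperature)
    q = softmax([student_logits[i] for i in top], temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


class EmaLossBalancer:
    """Hypothetical balancer: scales the distillation term so its running
    magnitude tracks the running magnitude of the domain loss."""

    def __init__(self, decay=0.9, eps=1e-8):
        self.decay = decay
        self.eps = eps
        self.ema_domain = None
        self.ema_distill = None

    def combine(self, domain_loss, distill_loss):
        if self.ema_domain is None:
            # First step: initialize both averages with the observed values.
            self.ema_domain, self.ema_distill = domain_loss, distill_loss
        else:
            d = self.decay
            self.ema_domain = d * self.ema_domain + (1 - d) * domain_loss
            self.ema_distill = d * self.ema_distill + (1 - d) * distill_loss
        weight = self.ema_domain / (self.ema_distill + self.eps)
        return domain_loss + weight * distill_loss, weight


if __name__ == "__main__":
    kl = topk_temperature_kl([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
    total, weight = EmaLossBalancer().combine(domain_loss=2.0, distill_loss=kl)
    print(f"kl={kl:.4f} weight={weight:.4f} total={total:.4f}")
```

In a real training loop these quantities would be tensors averaged over the token positions of student-sampled trajectories, and the weight would typically be treated as a constant (no gradient) at each step.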
Experimental evaluations across five domains using three instruction-tuned backbone models demonstrate RAFT’s efficacy. It achieves an average domain accuracy improvement of 23.2% compared to standard SFT. Furthermore, it partially restores general capabilities degraded by SFT, yielding relative improvements of 18.2% on MS-Bench and 10.2% on IFEval. These findings indicate that integrating data refinement with trajectory-level preservation offers a robust strategy for domain fine-tuning that minimizes catastrophic forgetting.
Source: arXiv
Generated at: 2026-06-02 00:00:00 UTC




