Fast Unlearning at Scale via Margin Self-Correction
Title: Scalable Fast Unlearning Through Margin Self-Correction
Abstract: Model unlearning aims to modify a trained language model so that it behaves as though it never encountered specific training instances, while preserving performance on other tasks and avoiding expensive full retraining. Current methods typically fine-tune a pretrained model under a fixed compute budget, then select the best checkpoint by evaluating multiple saved checkpoints on downstream validation data. This process introduces two inefficiencies that hinder scalability: training continues past the optimal balance between forgetting and retaining information, and checkpoint selection demands additional storage and multiple evaluation cycles.
To overcome these hurdles, we propose MArgin Self-Correction (MASC), a streamlined unlearning technique with an online stopping mechanism that eliminates the need for downstream evaluation. Given a text sequence to be forgotten, MASC dynamically narrows the logit gap between the original next token and the most probable alternative tokens. Unlearning stops once this gap is small on average across a substantial majority of token positions in the forget sequences.
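The stopping idea described above can be illustrated with a minimal sketch. This is one possible reading, not the paper's implementation: the abstract does not give the exact margin definition, threshold, or majority fraction, so the function names `token_margins` and `should_stop` and the parameters `tau` and `min_fraction` are hypothetical. Here the margin at each position is taken to be the logit of the original next token minus the largest logit among all other tokens, and unlearning is assumed to stop when enough positions have a margin at or below `tau`.

```python
from typing import List, Sequence


def token_margins(logits: Sequence[Sequence[float]],
                  targets: Sequence[int]) -> List[float]:
    """Per-position logit gap (assumed definition, not taken from the paper).

    logits[t] holds the vocabulary logits at position t; targets[t] is the
    index of the original next token in the forget sequence.
    """
    margins = []
    for row, tgt in zip(logits, targets):
        # Largest logit among all tokens other than the original one.
        best_other = max(v for i, v in enumerate(row) if i != tgt)
        margins.append(row[tgt] - best_other)
    return margins


def should_stop(margins: Sequence[float],
                tau: float = 0.0,
                min_fraction: float = 0.9) -> bool:
    """Hypothetical online stopping rule.

    Returns True when at least `min_fraction` of the positions have a
    margin at or below `tau`. Both values are illustrative assumptions.
    """
    if not margins:
        return False
    closed = sum(1 for m in margins if m <= tau)
    return closed / len(margins) >= min_fraction


# Toy example: three positions over a three-token vocabulary.
logits = [[2.0, 1.0, 0.5], [0.0, 0.5, 0.25], [1.0, 3.0, 0.0]]
targets = [0, 1, 2]
margins = token_margins(logits, targets)
print(margins)
print(should_stop(margins, tau=0.5, min_fraction=0.6))
```

Because a check of this kind uses only the logits already computed on the forget sequences during training, it would need no saved checkpoints or downstream evaluation, which is the efficiency argument the abstract makes.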
Experiments on the TOFU, MUSE News, and MUSE Books benchmarks show that MASC achieves a forget-retain trade-off comparable to existing baselines at significantly lower computational cost. Our analysis further shows that increasing model size (i.e., the number of parameters) improves the trade-off for both MASC and SimNPO: forget metrics remain stable while the utility retained by the model improves.
Source: arXiv
Generated at: 2026-06-03 00:00:00 UTC



