ZeroUnlearn: Few-Shot Knowledge Unlearning in Large Language Models
Title: ZeroUnlearn: Enabling Few-Shot Knowledge Unlearning in Large Language Models
Abstract
Large language models trained on vast web corpora inevitably retain sensitive data, defined here as inputs capable of triggering harmful outputs, which raises significant privacy and safety concerns. Current machine unlearning techniques largely depend on retraining or aggressive fine-tuning, approaches that are either computationally burdensome or liable to degrade related knowledge and overall model performance. To address these challenges, we reframe machine unlearning as a precise knowledge re-mapping task that uses model editing. We introduce ZeroUnlearn, a few-shot unlearning framework that neutralizes sensitive inputs by mapping them to a neutral target state and eliminating their original representations. Using a multiplicative parameter update with a closed-form solution, ZeroUnlearn enforces representational orthogonality, enabling efficient and targeted unlearning. We also adapt ZeroUnlearn into a gradient-based variant to support multi-sample unlearning. Experimental results indicate that our method surpasses existing baselines while maintaining general model utility. The code is available at https://github.com/XMUDeepLIT/ZeroUnlearn.
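The abstract does not give the update formula, so the following is only a rough illustration of what a closed-form multiplicative update with an orthogonality property could look like, not the authors' method. It assumes a single linear layer with weight matrix `W`, a key vector `k` representing the sensitive input, and a neutral target output `v_neutral`. The function names `project_out` and `remap_to_neutral` are hypothetical and do not come from the ZeroUnlearn repository.

```python
import numpy as np

def project_out(W, k):
    """Assumed multiplicative update W' = W (I - k k^T / k^T k).

    Right-multiplying by the projector removes the component of any
    input along k, so W' k = 0 while inputs orthogonal to k are unchanged.
    """
    P = np.eye(k.size) - np.outer(k, k) / (k @ k)
    return W @ P

def remap_to_neutral(W, k, v_neutral):
    """Erase the layer's original response to k, then add a rank-one
    term that sends k to the neutral target v_neutral (an assumption)."""
    return project_out(W, k) + np.outer(v_neutral, k) / (k @ k)

# Toy example with random data (illustrative only).
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 6))        # hypothetical layer weights
k = rng.normal(size=6)             # hypothetical sensitive-input key
v_neutral = np.zeros(4)            # hypothetical neutral target state

W_new = remap_to_neutral(W, k, v_neutral)

# An input with its k-component removed is orthogonal to k.
q = rng.normal(size=6)
q_orth = q - (q @ k) / (k @ k) * k

print(np.allclose(W_new @ k, v_neutral))         # sensitive key is re-mapped
print(np.allclose(W_new @ q_orth, W @ q_orth))   # orthogonal input is preserved
```

In this sketch the update needs no gradient steps, which is one plausible reading of "closed-form"; how the paper selects layers, keys, and the neutral target is not stated in the abstract.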
Source: arXiv






