Multi-Objective Reference-Aligned Machine Unlearning
Abstract:
Machine unlearning seeks to eliminate the impact of designated training data points without compromising the overall performance of the model. Current single-objective methods, including gradient ascent and random relabeling, frequently lead to catastrophic forgetting. This occurs because their optimization goals are unbounded and conflicting, causing the model to drift away from its initial pre-trained knowledge.
To address this, we introduce Reference-Aligned UnLearning (RAUL), a multi-objective framework that simultaneously manages forgetting and retention. Instead of maximizing an unbounded loss, RAUL employs a bounded Kullback-Leibler (KL) alignment. This approach directs the predictions on forgotten samples toward a reference distribution that represents unseen data. This reference can be implemented as a uniform distribution or derived empirically from a held-out reference set. By bounding the forgetting objective, RAUL minimizes gradient conflicts with the retention goal.
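As a rough illustration of the bounded alignment idea, the sketch below computes a KL term between the model's predictions on forget samples and a reference distribution, using NumPy. The abstract does not state the KL direction or the exact loss, so the choice of KL(model || reference), the function names, and the `eps` smoothing are assumptions made for illustration only. With a uniform reference over K classes, this term equals log K minus the prediction entropy, so it lies between 0 and log K.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the last axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kl_to_reference(forget_logits, reference, eps=1e-12):
    # Mean KL(model || reference) over a batch of forget samples.
    # ASSUMPTION: the KL direction is chosen for illustration; the
    # abstract does not specify it.
    p = softmax(forget_logits)
    ref = np.broadcast_to(reference, p.shape)
    per_sample = np.sum(p * (np.log(p + eps) - np.log(ref + eps)), axis=-1)
    return float(per_sample.mean())

def uniform_reference(num_classes):
    # Reference option 1: uniform distribution over classes.
    return np.full(num_classes, 1.0 / num_classes)

def empirical_reference(heldout_logits):
    # Reference option 2: average predicted distribution on a held-out set
    # (hypothetical helper; one plausible way to derive it empirically).
    return softmax(heldout_logits).mean(axis=0)

# A confident prediction is far from the uniform reference (near log K),
# while a flat prediction already matches it (KL of 0).
K = 4
confident = np.array([[100.0, 0.0, 0.0, 0.0]])
flat = np.zeros((1, K))
kl_confident = kl_to_reference(confident, uniform_reference(K))
kl_flat = kl_to_reference(flat, uniform_reference(K))
```

Minimizing such a term pulls forget-set predictions toward the reference and stops having any effect once they match it, in contrast to gradient ascent, which can keep increasing the loss without limit.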
We solve the resulting multi-objective optimization (MOO) problem with Jacobian descent, a technique that combines the per-objective gradients into a single update direction that conflicts with none of them. Our results indicate that RAUL comes closer to the performance of full retraining than existing methods do.
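To make the notion of a non-conflicting direction concrete, here is a minimal NumPy sketch for the two-objective case (forgetting and retention). The abstract does not say which aggregation rule RAUL uses inside Jacobian descent, so the rule below is a stand-in: a pairwise projection in the style of PCGrad, where each gradient that conflicts with the other has the conflicting component removed before averaging. The function names are hypothetical.

```python
import numpy as np

def project_if_conflicting(g, other):
    # If g has a negative inner product with `other`, remove the component
    # of g along `other`; otherwise return g unchanged.
    dot = float(g @ other)
    if dot >= 0.0:
        return g
    return g - (dot / float(other @ other)) * other

def aggregate_two(g_forget, g_retain):
    # Stand-in aggregator (NOT necessarily the paper's): average the two
    # projected gradients so the result has a non-negative inner product
    # with both original gradients.
    p_forget = project_if_conflicting(g_forget, g_retain)
    p_retain = project_if_conflicting(g_retain, g_forget)
    return 0.5 * (p_forget + p_retain)

# Toy gradients that conflict: their plain average points against g_forget.
g_forget = np.array([1.0, 0.0])
g_retain = np.array([-2.0, 1.0])
naive = 0.5 * (g_forget + g_retain)          # [-0.5, 0.5]
direction = aggregate_two(g_forget, g_retain)  # [0.1, 0.7]
```

In this toy case the plain average has a negative inner product with the forgetting gradient, whereas the aggregated direction has a non-negative inner product with both, so a small step along its negative does not increase either objective to first order.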
Source: arXiv
Generated at: 2026-06-02 00:00:00 UTC





