arXiv

Exact Unlearning in Reinforcement Learning

Title: Achieving Exact Unlearning in Reinforcement Learning

Abstract:

This study addresses the challenge of exact unlearning within reinforcement learning (RL). The primary objective is to develop an efficient mechanism that allows for the complete removal of any individual user’s data upon request. Specifically, the output generated by an online learner after the unlearning process must be indistinguishable from the model that would have resulted if the deleted user had never engaged with the system.

We show that for any $\rho > 0$, one can design an RL algorithm that is $\rho$-TV-stable and admits an exact unlearning procedure. The expected computational cost of this unlearning procedure is only a $\rho \sqrt{\ln T}$ fraction of the cost of retraining the model from scratch.
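To give a feel for the size of the $\rho \sqrt{\ln T}$ factor, the following minimal sketch evaluates it numerically. It assumes the natural logarithm and drops any constant hidden in the abstract's statement; the function name and the cap at 1 (unlearning never costs more than a full retrain) are illustrative assumptions, not part of the paper.

```python
import math


def unlearning_cost_fraction(rho: float, num_episodes: int) -> float:
    """Evaluate rho * sqrt(ln T), the expected unlearning cost relative to retraining.

    Assumptions (not from the paper): constants are ignored, and the value
    is capped at 1.0 because unlearning can always fall back to one retrain.
    """
    if rho <= 0:
        raise ValueError("rho must be positive")
    if num_episodes < 1:
        raise ValueError("num_episodes must be at least 1")
    fraction = rho * math.sqrt(math.log(num_episodes))
    return min(1.0, fraction)
```

For example, with $\rho = 0.01$ and $T = 10^6$ episodes the factor is a few percent of the retraining cost, while a large $\rho$ pushes it to the cap.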

To achieve this, we construct a $\rho$-TV-stable RL algorithm tailored for tabular Markov decision processes (MDPs). This algorithm attains a regret bound of $\mathcal{O}(H^2 \sqrt{SAT} + H^3 S^2 A + H^{2.5} S^2 A / \rho)$. In this formulation, $S$ represents the number of states, $A$ the number of actions, $H$ the episode horizon, and $T$ the total number of episodes. Furthermore, we derive a lower bound of $\Omega(H \sqrt{SAT} + SAH / \rho)$ for $\rho$-TV-stable RL algorithms. This finding confirms that our proposed algorithm is nearly minimax optimal.
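The two bounds can be compared numerically with a short sketch. It evaluates only the expressions inside the $\mathcal{O}(\cdot)$ and $\Omega(\cdot)$ above, with all hidden constants (and any logarithmic factors) set to 1, which is an assumption made purely for illustration; the function names are hypothetical.

```python
import math


def regret_upper_bound(S: int, A: int, H: int, T: int, rho: float) -> float:
    """H^2 sqrt(SAT) + H^3 S^2 A + H^2.5 S^2 A / rho, with constants assumed to be 1."""
    return H**2 * math.sqrt(S * A * T) + H**3 * S**2 * A + H**2.5 * S**2 * A / rho


def regret_lower_bound(S: int, A: int, H: int, T: int, rho: float) -> float:
    """H sqrt(SAT) + S A H / rho, with constants assumed to be 1."""
    return H * math.sqrt(S * A * T) + S * A * H / rho


def bound_gap(S: int, A: int, H: int, T: int, rho: float) -> float:
    """Ratio of the upper-bound expression to the lower-bound expression."""
    return regret_upper_bound(S, A, H, T, rho) / regret_lower_bound(S, A, H, T, rho)
```

Term by term, the leading $\sqrt{SAT}$ terms differ by a factor of $H$ and the $1/\rho$ terms by a factor of $H^{1.5} S$, so under these assumptions the ratio depends on $H$ and $S$ but does not grow with $T$ once the $\sqrt{SAT}$ term dominates.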


Source: arXiv Generated at: 2026-06-04 00:00:00 UTC
