Title: Position: Deployed Reinforcement Learning Should Be Continual
Abstract:
Reinforcement learning (RL) is attracting growing interest and is increasingly integrated into practical applications. However, most of these deployments follow a "train-then-fix" model, in which the trained agent remains static while it interacts with the environment and resumes learning only when performance deteriorates to a critical level and retraining becomes necessary. In this position paper, we contend that deploying an agent that is not perfectly optimal, as long as it still receives evaluative reward signals, is inherently a continual RL problem. We identify four distinct sources of non-stationarity that emerge after deployment, which underscore the need for ongoing learning and explain why the best-performing deployed agents must keep adapting. By examining real-world cases where continual RL has succeeded, we outline the benefits of this approach and propose concrete steps for the community to move away from the prevailing train-then-fix framework.
Source: arXiv
Generated at: 2026-06-04 00:00:00 UTC





