arXiv

Task diversity produces systematic transfer but inhibits continual reinforcement learning

June 2, 2026 · Purab Seth, Neil Shah, Kunal Jha, Samuel J. Gershman, Max Kleiman-Weiner, Wilka Carvalho

Title: Systematic Transfer via Task Diversity Comes at the Cost of Continual Reinforcement Learning

Abstract: The objective of continual reinforcement learning is to develop agents capable of refining their performance on current tasks while simultaneously adapting to evolving task distributions. While training on a broad spectrum of diverse tasks has been shown to foster zero-shot generalization, prior research typically assesses this capability only after training is complete, using frozen model weights. It remains uncertain whether such diversity enhances an agent’s capacity for ongoing learning amidst distribution shifts. To address this, we present Banyan, a GPU-accelerated domain for continual RL that allows task diversity to be manipulated along three independent dimensions: the navigational map layouts, the interactive objects, and the hierarchical dependencies among sub-goals. Our findings indicate that, for individual distribution shifts, higher diversity across any of these axes enables agents to start training on new tasks at performance levels comparable to those achieved on previous tasks, even when the optimal policy structure changes. Nevertheless, as the frequency of these shifts rises, this localized transfer fails to support sustained continual learning; tasks with longer horizons reach performance plateaus, and agents tend to forget earlier task distributions following subsequent training. Banyan serves as a benchmark to investigate the conditions under which controlled task diversity generates transferable knowledge, the durability of such transfer, and its limitations in achieving true continual learning.
