arXiv

SWE-MiniSandbox: Container-Free Reinforcement Learning for Building Software Engineering Agents

June 2, 2026 · Danlong Yuan, Wei Wu, Enhan Zhao, Zhengren Wang, Xueliang Zhao, Huishuai Zhang, Dongyan Zhao

Abstract:

While reinforcement learning (RL) has emerged as a central paradigm for training software engineering (SWE) agents, current methodologies predominantly depend on per-task containers to ensure isolation. However, at scale, this approach introduces significant challenges, including heavy storage demands from pre-built container images, delayed environment initialization, and the necessity for container-management privileges. To address these limitations, we introduce SWE-MiniSandbox, a streamlined, container-free solution designed to facilitate the scalable RL training of SWE agents while maintaining strict isolation.

Rather than launching an isolated container for each instance, SWE-MiniSandbox runs every task within a dedicated workspace supported by kernel-level mechanisms, thereby significantly cutting down system overhead. The method employs lightweight environment pre-caching strategies, which obviate the need for large container images. Consequently, our approach reduces disk consumption to roughly 5% of the space required by conventional container-based pipelines and decreases environment preparation time to approximately 25% of that of the container baseline. Empirical evaluations indicate that SWE-MiniSandbox delivers performance on par with standard container-based workflows. By eliminating the reliance on resource-intensive container infrastructure, SWE-MiniSandbox provides a practical and accessible framework for scaling RL-driven SWE agents, especially within research settings characterized by limited resources.

Source: arXiv