Robust Shielding for Safe Reinforcement Learning
Title: Ensuring Safety in Reinforcement Learning Through Robust Shielding
Shielding stands out as a potent method for formally securing reinforcement learning agents within Markov decision processes (MDPs). Yet, current shielding methodologies generally rely on the assumption that the transition dynamics relevant to safety are already known, an assumption that rarely holds true in real-world applications. To overcome this hurdle, we present a new shielding framework tailored for robust MDPs (RMDPs), which are characterized by sets of transition probabilities rather than single values.
In our approach, safety is defined by the ability to satisfy a linear temporal logic (LTL) formula with a specified probability threshold, even when facing the worst-case transition probabilities inherent to the RMDP. We demonstrate that this framework is both sound and optimal for RMDPs: it ensures that every policy permitted by the shield is safe, and simultaneously, that every safe policy for an RMDP is allowed by the shield.
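To make the worst-case safety notion concrete, here is a minimal sketch under simplifying assumptions that are not taken from the paper: the RMDP is an interval MDP (each transition probability lies in a lower/upper bound), the LTL formula is replaced by the simple property "avoid unsafe states for a fixed number of steps", and the shield is a per-state action filter. All names (`worst_case_expectation`, `robust_safety_values`, `permitted_actions`) and the toy model are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Interval = Tuple[float, float]
# Hypothetical encoding: rmdp[state][action][successor] = (lower, upper)
RMDP = Dict[str, Dict[str, Dict[str, Interval]]]


def worst_case_expectation(intervals: Dict[str, Interval],
                           values: Dict[str, float]) -> float:
    """Minimise sum_i p_i * values[i] over distributions inside the intervals.

    Start every successor at its lower bound, then hand the remaining
    probability mass to the lowest-valued successors first.
    """
    probs = {s: lo for s, (lo, _) in intervals.items()}
    budget = 1.0 - sum(probs.values())
    for s in sorted(intervals, key=lambda t: values[t]):
        lo, hi = intervals[s]
        extra = max(0.0, min(hi - lo, budget))
        probs[s] += extra
        budget -= extra
    return sum(probs[s] * values[s] for s in intervals)


def robust_safety_values(rmdp: RMDP, unsafe: Set[str],
                         horizon: int) -> Dict[str, Dict[str, float]]:
    """Robust value iteration for the property 'stay safe for `horizon` steps'.

    Returns q[s][a]: the worst-case probability of avoiding unsafe states
    when taking action a in s and acting as safely as possible afterwards.
    """
    values = {s: 0.0 if s in unsafe else 1.0 for s in rmdp}
    q: Dict[str, Dict[str, float]] = {}
    for _ in range(horizon):
        q = {s: {a: worst_case_expectation(succ, values)
                 for a, succ in actions.items()}
             for s, actions in rmdp.items() if s not in unsafe}
        values = {s: 0.0 if s in unsafe else max(q[s].values()) for s in rmdp}
    return q


def permitted_actions(q: Dict[str, Dict[str, float]],
                      threshold: float) -> Dict[str, List[str]]:
    """Keep only actions whose worst-case safety probability meets the threshold."""
    return {s: [a for a, v in qa.items() if v >= threshold]
            for s, qa in q.items()}


# Toy interval MDP (illustrative numbers only).
toy = {
    "s0": {"fast": {"s1": (0.6, 0.9), "crash": (0.1, 0.4)},
           "slow": {"s1": (0.9, 1.0), "crash": (0.0, 0.1)}},
    "s1": {"go": {"goal": (0.95, 1.0), "crash": (0.0, 0.05)}},
    "goal": {"stay": {"goal": (1.0, 1.0)}},
    "crash": {},
}
q = robust_safety_values(toy, unsafe={"crash"}, horizon=3)
allowed = permitted_actions(q, threshold=0.8)
print(allowed["s0"])  # only "slow" survives: worst case about 0.855 vs 0.57 for "fast"
```

In this toy model the shield blocks "fast" in `s0` because an adversarial choice of transition probabilities within the intervals can push its safety probability below the threshold. A full treatment of LTL would typically work on a product of the RMDP with an automaton for the formula, which this sketch omits.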
By integrating our method with established sampling techniques for learning MDP transition probabilities under probably approximately correct (PAC) guarantees, we can construct shields for MDPs. These shields offer high-confidence safety guarantees while imposing minimal restrictions on agent behavior. Experimental results indicate that shields applied to learned RMDPs successfully ensure safety in unknown MDP environments, while the expected return improves significantly as the volume of samples grows.
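As an illustration of how sampled transitions can yield an RMDP with a PAC-style guarantee, the sketch below builds probability intervals from successor counts using Hoeffding's inequality with a union bound over successors. This is one standard choice and an assumption here; the abstract does not say which estimator the authors use, and the function name `hoeffding_intervals` is hypothetical.

```python
import math
from typing import Dict, Tuple


def hoeffding_intervals(counts: Dict[str, int],
                        delta: float) -> Dict[str, Tuple[float, float]]:
    """Intervals around empirical successor frequencies for one state-action pair.

    With probability at least 1 - delta, every true transition probability
    lies inside its interval (Hoeffding bound plus a union bound over the
    successors listed in `counts`).
    """
    n = sum(counts.values())
    if n == 0:
        # No samples yet: nothing is known about any successor.
        return {s: (0.0, 1.0) for s in counts}
    per_successor_delta = delta / len(counts)
    eps = math.sqrt(math.log(2.0 / per_successor_delta) / (2.0 * n))
    return {s: (max(0.0, c / n - eps), min(1.0, c / n + eps))
            for s, c in counts.items()}


# Same empirical frequencies, different sample sizes (illustrative numbers).
few = hoeffding_intervals({"s1": 90, "crash": 10}, delta=0.05)
many = hoeffding_intervals({"s1": 9000, "crash": 1000}, delta=0.05)
print(few["crash"], many["crash"])
```

With more samples the intervals tighten around the empirical frequencies, so a shield built on them has to guard against fewer adversarial dynamics. That is consistent with the reported trend of higher expected return as the sample count grows.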
Source: arXiv Generated at: 2026-06-02 00:00:00 UTC




