Global News Digest

arXiv

Robust Shielding for Safe Reinforcement Learning

Title: Ensuring Safety in Reinforcement Learning Through Robust Shielding

Shielding is a prominent method for formally guaranteeing the safety of reinforcement learning agents in Markov decision processes (MDPs). However, existing shielding methods generally assume that the safety-relevant transition dynamics are known, an assumption that rarely holds in real-world applications. To address this, we present a shielding framework for robust MDPs (RMDPs), which specify sets of transition probabilities rather than single values.

In our approach, safety means satisfying a linear temporal logic (LTL) formula with probability at least a specified threshold, even under the worst-case transition probabilities of the RMDP. We show that this framework is both sound and optimal for RMDPs: every policy permitted by the shield is safe, and every safe policy for the RMDP is permitted by the shield.
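To make the worst-case idea concrete, the sketch below handles one special case: each transition probability is only known to lie in an interval. The abstract does not say which uncertainty sets the paper uses, so the interval model, the function name `worst_case_expectation`, and its parameters are illustrative assumptions rather than the authors' method. Given, for each successor state, the probability of satisfying the specification from that state, it computes the lowest achievable expected value over all distributions consistent with the intervals.

```python
def worst_case_expectation(values, lower, upper):
    """Minimise sum(p[i] * values[i]) over distributions p with
    lower[i] <= p[i] <= upper[i] and sum(p) == 1.

    Hypothetical helper: assumes interval uncertainty sets, which the
    abstract does not confirm.
    """
    if sum(lower) > 1.0 + 1e-12 or sum(upper) < 1.0 - 1e-12:
        raise ValueError("intervals admit no probability distribution")

    # Start from the lower bounds, then hand out the remaining mass.
    p = list(lower)
    budget = 1.0 - sum(lower)

    # An adversary pushes mass towards the successors with the lowest value.
    for i in sorted(range(len(values)), key=lambda i: values[i]):
        add = min(upper[i] - p[i], budget)
        p[i] += add
        budget -= add

    return sum(pi * vi for pi, vi in zip(p, values))


# Two successors: a safe one (value 1.0) and an unsafe one (value 0.0).
worst = worst_case_expectation(
    values=[1.0, 0.0],
    lower=[0.6, 0.2],
    upper=[0.8, 0.4],
)
```

Under these assumptions, an action would count as acceptable in this one-step view only if `worst` stays at or above the chosen threshold. The shield described in the abstract reasons about whole policies and LTL formulas, which this one-step sketch does not capture.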

By combining our method with established sampling techniques that learn MDP transition probabilities with probably approximately correct (PAC) guarantees, we can construct shields for MDPs. These shields offer high-confidence safety guarantees while imposing minimal restrictions on agent behavior. Experimental results indicate that shields computed on learned RMDPs ensure safety in unknown MDP environments, and that the expected return improves significantly as the number of samples grows.
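One standard way to turn samples into a set of transition probabilities with a PAC-style guarantee is a Hoeffding confidence interval around the empirical frequency. The abstract does not name the sampling technique used, so this choice, along with the function name `hoeffding_interval` and its parameters, is an assumption for illustration only.

```python
import math


def hoeffding_interval(count, total, delta):
    """Return (low, high) such that the true transition probability lies in
    the interval with probability at least 1 - delta, by Hoeffding's
    inequality.

    count: number of sampled transitions that reached the successor state
    total: number of samples drawn for the state-action pair
    delta: allowed failure probability for this single estimate
    """
    if total <= 0:
        raise ValueError("need at least one sample")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie strictly between 0 and 1")

    p_hat = count / total
    # Solve 2 * exp(-2 * total * eps**2) = delta for eps.
    eps = math.sqrt(math.log(2.0 / delta) / (2.0 * total))
    return max(0.0, p_hat - eps), min(1.0, p_hat + eps)


few = hoeffding_interval(count=50, total=100, delta=0.05)
many = hoeffding_interval(count=5000, total=10000, delta=0.05)
```

The interval for `many` is narrower than the one for `few`. That matches the reported trend: more samples give a tighter learned RMDP, so a shield built on it restricts the agent less. For all intervals to hold simultaneously, `delta` would need to be divided among the individual estimates (a union bound).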


Source: arXiv
Generated at: 2026-06-02 00:00:00 UTC
