arXiv

Dual Advantage Fields

Abstract: Offline goal-conditioned reinforcement learning requires both long-horizon reachability estimates and the ability to compare actions locally. Dual goal representations produce value fields that capture global reachability, but they do not explicitly indicate which action is best at a given state. To address this, we introduce Dual Advantage Fields (DAF), a policy-extraction method that converts a bilinear dual value model into a local advantage signal. Under the bilinear dual parameterization, the goal embedding is the gradient of the value field with respect to the state representation. DAF uses an action-effect model to predict the discounted feature displacement caused by an action, then scores each action by how well this displacement aligns with the goal direction. In realizable settings, this score equals the goal-conditioned Bellman advantage, which yields a standard local policy-improvement guarantee. Evaluations on OGBench locomotion, manipulation, and puzzle tasks show that DAF improves aggregate RLiable metrics and performs especially well in environments where the locally optimal action diverges from the direct path toward the final goal.
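The scoring idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes the bilinear value takes the form V(s, g) = <phi(s), psi(g)>, assumes "alignment" is measured by a plain inner product, and uses hand-written toy vectors in place of learned networks. The names `bilinear_value`, `advantage_scores`, and `greedy_action` are hypothetical.

```python
# Minimal sketch, assuming V(s, g) = <phi(s), psi(g)> and an inner-product
# alignment score. All vectors below are toy values, not learned features.

def dot(u, v):
    """Inner product of two equal-length vectors given as lists."""
    return sum(a * b for a, b in zip(u, v))

def bilinear_value(phi_s, psi_g):
    """Assumed bilinear dual value: V(s, g) = <phi(s), psi(g)>."""
    return dot(phi_s, psi_g)

def numeric_gradient(phi_s, psi_g, eps=1e-6):
    """Finite-difference gradient of V with respect to the state features."""
    base = bilinear_value(phi_s, psi_g)
    grad = []
    for i in range(len(phi_s)):
        bumped = list(phi_s)
        bumped[i] += eps
        grad.append((bilinear_value(bumped, psi_g) - base) / eps)
    return grad

def advantage_scores(displacements, psi_g):
    """Score each action by how well its predicted discounted feature
    displacement aligns with the goal embedding (assumed: inner product)."""
    return {action: dot(delta, psi_g) for action, delta in displacements.items()}

def greedy_action(displacements, psi_g):
    """Pick the action with the highest alignment score."""
    scores = advantage_scores(displacements, psi_g)
    return max(scores, key=scores.get)

# Toy example: 2-D features, goal direction along the first axis.
phi_s = [0.2, -0.1]
psi_g = [1.0, 0.0]

# Stand-in for an action-effect model's output: one displacement per action.
displacements = {
    "left": [-0.5, 0.0],
    "right": [0.5, 0.0],
    "up": [0.0, 0.9],
}

grad = numeric_gradient(phi_s, psi_g)
scores = advantage_scores(displacements, psi_g)
best = greedy_action(displacements, psi_g)
```

In this toy setup the finite-difference gradient of the value with respect to the state features recovers `psi_g`, which is the sense in which the goal embedding acts as a gradient under a bilinear model. The action "up" has the largest displacement but scores zero because it is orthogonal to the goal direction, so the greedy choice is "right".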


Source: arXiv
Generated at: 2026-06-04 00:00:00 UTC
