arXiv

Unifying Model-Free Efficiency and Model-Based Representations via Latent Dynamics

Title: Bridging Model-Free Efficiency and Model-Based Representations Through Latent Dynamics

Abstract: This paper introduces Unified Latent Dynamics (ULD), a new reinforcement learning framework designed to merge the computational efficiency of model-free techniques with the robust representational capabilities of model-based methods, while avoiding the costs associated with explicit planning. By mapping state-action pairs into a latent space in which the true value function is approximately linear, ULD operates effectively with a single set of hyperparameters across a wide array of domains, from continuous control tasks with low-dimensional and pixel-based inputs to complex, high-dimensional Atari environments. Theoretically, we demonstrate that, provided certain mild conditions are met, the fixed point reached by our embedding-based temporal-difference updates coincides with that of a corresponding linear model-based value expansion. Furthermore, we establish explicit error bounds that link the fidelity of the embedding to the quality of value approximation. In implementation, ULD uses synchronized updates for its encoder, value, and policy networks, incorporates auxiliary losses to refine short-horizon predictive dynamics, and applies reward-scale normalization to maintain learning stability, particularly in scenarios with sparse rewards. Our evaluation across 80 environments, spanning Gym locomotion, DeepMind Control (both proprioceptive and visual modalities), and Atari, shows that ULD either matches or outperforms specialized model-free and general model-based baselines. The approach achieves cross-domain proficiency with minimal hyperparameter tuning and a significantly smaller parameter footprint. These findings suggest that value-aligned latent representations, on their own, are sufficient to provide the adaptability and sample efficiency typically associated with full model-based planning systems.
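
To make the central idea concrete, here is a minimal sketch of temporal-difference learning for a value that is linear in an embedding of state-action pairs. It is not the paper's implementation: the two-entry chain, the hand-written one-hot embedding `EMBED`, the function `td0_linear`, and the values of `gamma` and `alpha` are all illustrative assumptions, and the abstract's learned encoder, auxiliary losses, and reward normalization are left out.

```python
# Illustrative sketch only: semi-gradient TD(0) with Q(s, a) = w . z(s, a).
# The embedding is fixed and hand-written here; ULD learns it (not shown).

def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def td0_linear(transitions, embed, dim, gamma=0.5, alpha=0.1, sweeps=2000):
    """Fit weights w so that w . embed[sa] approximates the value of sa.

    transitions: list of (sa, reward, next_sa) tuples under a fixed policy.
    embed: dict mapping each state-action key to its latent vector.
    """
    w = [0.0] * dim
    for _ in range(sweeps):
        for sa, reward, next_sa in transitions:
            z, z_next = embed[sa], embed[next_sa]
            # TD error: bootstrapped target minus current estimate.
            td_error = reward + gamma * dot(w, z_next) - dot(w, z)
            # Semi-gradient step: the gradient of w . z with respect to w is z.
            w = [wi + alpha * td_error * zi for wi, zi in zip(w, z)]
    return w


# Hypothetical two-step chain: ("s0", "go") leads to ("s1", "stay"),
# which loops on itself and pays reward 1 each step.
EMBED = {("s0", "go"): [1.0, 0.0], ("s1", "stay"): [0.0, 1.0]}
TRANSITIONS = [
    (("s0", "go"), 0.0, ("s1", "stay")),
    (("s1", "stay"), 1.0, ("s1", "stay")),
]

w = td0_linear(TRANSITIONS, EMBED, dim=2)
values = {sa: dot(w, z) for sa, z in EMBED.items()}
print(values)
```

With a discount of 0.5, the exact values for this chain are 1/(1 - 0.5) = 2 for the looping pair and 0.5 * 2 = 1 for the pair leading into it, and the linear TD fixed point recovers both because the one-hot embedding can represent them exactly. With a less expressive embedding the fixed point would only approximate the true values, which is the gap the abstract's error bounds are said to quantify.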


Source: arXiv Generated at: 2026-06-04 00:00:00 UTC
