Well-Posed KL-Regularized Control via Wasserstein and Kalman-Wasserstein KL Divergences
Title: Well-Formulated KL-Regularized Control Through Wasserstein and Kalman-Wasserstein Divergences
Abstract:
While Kullback-Leibler (KL) divergence regularization is a staple in reinforcement learning, it suffers from critical limitations: it diverges to infinity when probability supports do not overlap and tends to degenerate in environments with low noise. To address these issues, we employ a unified information-geometric framework to construct KL analogs. This is achieved by substituting the Fisher-Rao geometry inherent in the dynamical formulation of the KL with transport-based geometries, allowing us to derive closed-form expressions for various common distribution families.
For elliptic distributions, we show that these new divergences remain finite even when the covariances are degenerate and equal. Furthermore, they provide a geometric explanation for the regularization heuristics typically employed in Kalman ensemble methods. We validate the practical value of these divergences within the context of KL-regularized optimal control. Specifically, in the analytically tractable scenario of linear time-invariant systems driven by Gaussian process noise, the classical KL formulation simplifies to a quadratic control penalty that becomes singular as process noise approaches zero. Our proposed variants eliminate this singularity, ensuring the resulting optimization problems are well-posed. Testing on both a double integrator and a cart-pole system demonstrates that the controls generated by these methods maintain nontrivial feedback mechanisms and deliver superior closed-loop performance.
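The singularity of the classical KL penalty can be illustrated with a minimal sketch, under assumptions that are not taken from the paper: a discretized double integrator with step `dt`, isotropic process noise `sigma^2 I`, and a one-step comparison between the controlled transition N(Ax + Bu, sigma^2 I) and the uncontrolled transition N(Ax, sigma^2 I). For two Gaussians with the same covariance, the KL divergence is the standard closed form ||Bu||^2 / (2 sigma^2). The squared 2-Wasserstein distance between the same pair, ||Bu||^2, is shown only as a familiar transport-based quantity for contrast; it is not the Wasserstein KL or Kalman-Wasserstein KL divergence constructed in the paper. The function names are hypothetical.

```python
def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]


def kl_control_penalty(B, u, sigma):
    """KL( N(Ax + Bu, sigma^2 I) || N(Ax, sigma^2 I) ) = ||Bu||^2 / (2 sigma^2).

    Assumes isotropic noise; the state x cancels because both means share Ax.
    """
    if sigma <= 0.0:
        raise ValueError("KL is infinite for sigma = 0 whenever Bu != 0")
    Bu = matvec(B, u)
    return sum(c * c for c in Bu) / (2.0 * sigma ** 2)


def w2_squared(B, u):
    """Squared 2-Wasserstein distance between the same two Gaussians.

    With equal covariances this reduces to the squared mean shift ||Bu||^2,
    which does not depend on sigma (shown for contrast only).
    """
    Bu = matvec(B, u)
    return sum(c * c for c in Bu)


# Assumed discretized double integrator input matrix (position, velocity).
dt = 0.1
B = [[0.5 * dt * dt], [dt]]
u = [1.0]

for sigma in (1.0, 0.1, 0.01):
    print(sigma, kl_control_penalty(B, u, sigma), w2_squared(B, u))
```

In this sketch the KL penalty on the same control input scales as 1/sigma^2, so it grows without bound as the process noise shrinks, while the transport-based quantity stays fixed.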
Source: arXiv
Generated at: 2026-06-02 00:00:00 UTC





