Chebyshev Policies and the Mountain Car Problem: Reinforcement Learning for Low-Dimensional Control Tasks
Title: Optimizing Low-Dimensional Control with Chebyshev Policies: Insights from the Mountain Car Problem
Abstract: This study derives an analytical solution to the Mountain Car problem, a standard benchmark in reinforcement learning (RL), and with it an optimal control strategy, resolving a discrepancy that has persisted for 36 years. The solution yields two unexpected findings: the optimal control law is fundamentally simple, yet contemporary RL agents deviate significantly from it. Guided by this analysis, we propose Chebyshev policies as a universal, dense class of RL policies grounded in first principles. Designed as drop-in replacements for neural networks, these policies cut regret by a factor of 4.18 while using 277 times fewer parameters. This compactness improves sample efficiency, interpretability, and real-time performance. We further validate Chebyshev policies on additional RL tasks, including a real-world nonlinear motion control testbed, where they consistently outperform neural networks when paired with PPO, ARS, and REINFORCE. Our findings highlight Chebyshev policies as a robust, lightweight alternative or complement to neural networks for low-dimensional control applications.
Source: arXiv
Generated at: 2026-06-02 00:00:00 UTC
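
The abstract does not spell out how a Chebyshev policy is parameterized. As a rough illustration only, the sketch below assumes one plausible form: a deterministic policy whose action is a tensor-product Chebyshev series in the normalized state, with the series coefficients as the only trainable parameters. The function names (`normalize`, `chebyshev_policy`), the degree, the clipping to [-1, 1], and the state bounds (taken from the common continuous Mountain Car environment) are assumptions, not details from the paper.

```python
import numpy as np
from numpy.polynomial import chebyshev as C

# Assumed state bounds of the common continuous Mountain Car environment.
POS_MIN, POS_MAX = -1.2, 0.6
VEL_MIN, VEL_MAX = -0.07, 0.07


def normalize(x, lo, hi):
    """Map x from [lo, hi] onto [-1, 1], the natural Chebyshev domain."""
    return 2.0 * (x - lo) / (hi - lo) - 1.0


def chebyshev_policy(state, coeffs):
    """Hypothetical policy: u = sum_ij coeffs[i, j] * T_i(x) * T_j(y).

    state  -- (position, velocity)
    coeffs -- 2-D array of series coefficients (the trainable parameters)
    """
    pos, vel = state
    x = normalize(pos, POS_MIN, POS_MAX)
    y = normalize(vel, VEL_MIN, VEL_MAX)
    u = C.chebval2d(x, y, coeffs)        # evaluate the 2-D Chebyshev series
    return float(np.clip(u, -1.0, 1.0))  # assumed action range [-1, 1]


# A degree-2 series in each variable has only 3 * 3 = 9 parameters.
coeffs = np.zeros((3, 3))
coeffs[0, 1] = 5.0  # hand-picked for illustration: push along the velocity
action = chebyshev_policy((-0.5, 0.03), coeffs)
```

Because the action is linear in `coeffs`, an array like this could in principle be handed to a policy-search method such as ARS in place of network weights. How the paper actually selects the degree and trains the coefficients is not stated in this section.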





