Towards Simple and Provable Parameter-Free Adaptive Gradient Methods
Title: Simple and Provable Parameter-Free Adaptive Gradient Methods
Abstract: Optimizers such as Adam and AdaGrad, which adjust learning rates dynamically during optimization, have substantially advanced the training of deep models. In practice, however, these methods still depend on manual tuning of learning rates, which is inefficient and difficult. To mitigate this, recent studies have focused on "parameter-free" algorithms designed to work effectively without such tuning. However, existing parameter-free variants of Adam and AdaGrad tend to be complex, to lack formal convergence proofs, or both. This study introduces AdaGrad++ and Adam++, simple parameter-free versions of the original algorithms with rigorous convergence guarantees. We show that AdaGrad++ attains convergence rates comparable to those of standard AdaGrad in convex optimization, without requiring predefined assumptions about the learning rate. Likewise, Adam++ achieves convergence rates comparable to those of Adam without imposing any conditions on the learning rates. Empirical evaluations across multiple deep learning tasks confirm the competitive performance of Adam++.
Source: arXiv
Generated at: 2026-06-02 00:00:00 UTC





