arXiv

In-Expectation Convergence of Stochastic Gradient Methods under Heavy-Tailed Noise

June 2, 2026 · Zijian Liu

Title: In-Expectation Convergence of Stochastic Gradient Methods under Heavy-Tailed Noise

Abstract: Stochastic gradient methods have traditionally been believed not to converge when the noise in stochastic gradients has only a finite $p$-th moment for $p\in\left(1,2\right)$, a condition known as the heavy-tailed noise assumption. Recent work, however, has shown that standard Stochastic Gradient Descent ($\textsf{SGD}$), with no modification to its update rule, does converge in expectation for convex problems over bounded domains, which suggests that classical stochastic gradient methods are more capable than previously thought. Building on this progress, this paper gives a comprehensive study of stochastic optimization under heavy-tailed noise. We establish new in-expectation convergence guarantees for Stochastic Mirror Descent ($\textsf{SMD}$) and Accelerated Stochastic Mirror Descent ($\textsf{ASMD}$) in convex optimization, and for $\textsf{SGD}$ and Stochastic Gradient Descent with Momentum ($\textsf{SGDM}$) in nonconvex optimization. These results hold without any algorithmic change and without the restrictive conditions, such as bounded domains, required in prior work. The analysis also introduces a new framework for studying heavy-tailed stochastic optimization, which offers another way to understand first-order stochastic gradient algorithms.
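
For concreteness, the heavy-tailed noise assumption is commonly formalized along the following lines. This is a sketch in notation assumed here, not taken from the abstract: $f$ is the objective, $g(x)$ is the stochastic gradient at $x$, $\sigma$ is the noise level, and $\eta_t$ is the step size. The paper's exact statement may differ.

$$\mathbb{E}\left[g(x)\right]=\nabla f(x),\qquad \mathbb{E}\left[\left\|g(x)-\nabla f(x)\right\|^{p}\right]\le\sigma^{p}\quad\text{for some } p\in\left(1,2\right).$$

Under this condition the noise can have infinite variance, which is why analyses that rely on a bounded second moment do not apply directly. In the unconstrained case, the unmodified $\textsf{SGD}$ update is $x_{t+1}=x_{t}-\eta_{t}\,g(x_{t})$, with no added modification such as gradient clipping.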

Source: arXiv