Local MixVR: Breaking the Communication-Sample Dependence in Distributed Learning
Abstract
Communication overhead is a major obstacle to scalable distributed learning. Existing methods, including Local SGD, Minibatch SGD, and their accelerated variants, aim to use data points efficiently, yet their communication-round complexity still grows with the total sample size $N$. This study presents Local MixVR, a distributed framework that combines local updates with variance reduction to suppress local noise. We show that Local MixVR is the first distributed method whose communication complexity does not depend on $N$; it depends only on the number of workers $M$. In the typical regime where $M < O\left(N^{1/4}\right)$, Local MixVR outperforms the leading Minibatch Accelerated SGD baseline. This result closes a long-standing gap in distributed optimization and sets a new standard for communication-efficient training.
Source: arXiv
Generated at: 2026-06-02 00:00:00 UTC





