ParaBlock: Communication-Computation Parallel Block Coordinate Federated Learning for Large Language Models
Title: ParaBlock: Parallelizing Communication and Computation in Block Coordinate Federated Learning for Large Language Models
Abstract:
Federated learning (FL) is widely recognized as a robust paradigm for privacy-preserving model training. In recent years, federated block coordinate descent has emerged as a preferred strategy for training large-scale models, because each client updates only a subset (block) of the model locally rather than the entire architecture. However, when applied to large language models (LLMs), even a single block often contains a vast number of parameters. Transmitting blocks of this size creates significant communication bottlenecks, especially for clients with limited resources. To mitigate this latency when training or fine-tuning LLMs, we introduce ParaBlock, a framework that runs communication and computation concurrently in two threads, thereby improving communication efficiency. Our theoretical analysis shows that ParaBlock attains a convergence rate comparable to that of standard federated block coordinate descent methods. Extensive experiments on general instruction-following and mathematical reasoning tasks confirm that ParaBlock preserves strong performance while delivering substantial gains in communication efficiency.
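The idea of overlapping communication with computation can be illustrated with a minimal client-side sketch. This is not the ParaBlock algorithm itself, whose details are not given in the abstract; it only shows one thread training blocks while a second thread uploads blocks that are already finished. The names `run_client`, `train_block`, and `fake_upload` are hypothetical, and the "training" step is a placeholder arithmetic update.

```python
import queue
import threading


def train_block(params):
    """Placeholder for local training on one block (hypothetical update rule)."""
    return [p - 0.5 for p in params]


def run_client(blocks, upload):
    """Train blocks in order while a second thread uploads finished blocks."""
    outbox = queue.Queue()

    def communicator():
        # Communication thread: sends each finished block as it arrives.
        while True:
            item = outbox.get()
            if item is None:  # sentinel: no more blocks to send
                break
            block_id, params = item
            upload(block_id, params)

    comm = threading.Thread(target=communicator)
    comm.start()

    # Computation thread (here, the main thread): hands off each trained
    # block and moves on to the next one without waiting for the upload.
    for block_id, params in blocks.items():
        outbox.put((block_id, train_block(params)))

    outbox.put(None)
    comm.join()


received = {}


def fake_upload(block_id, params):
    """Stand-in for a network send to the server (assumption for the demo)."""
    received[block_id] = params


run_client({0: [1.0, 2.0], 1: [3.0]}, fake_upload)
```

In this sketch the upload of block 0 can proceed while block 1 is being trained. In CPython, such overlap is most effective when the communication step is I/O-bound, since blocking network calls release the interpreter lock.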
Source: arXiv Generated at: 2026-06-03 00:00:00 UTC



