Global News Digest

arXiv

Faster Synchronous On-Policy RL via Straggler-Aware Group Sizing

Title: Enhancing Efficiency in Synchronous On-Policy RL Through Straggler-Adaptive Group Sizing

Abstract:

While synchronous reinforcement learning algorithms like Group Relative Policy Optimization (GRPO) offer reliable and reproducible on-policy training, they remain highly susceptible to stragglers. In these systems, a single long rollout can bottleneck the entire group, delaying both reward calculation and parameter updates. This synchronization delay grows with group size, creating a tension between the advantages of larger groups and the rising wall-clock cost of synchronization stalls.
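To make the bottleneck concrete, here is a minimal sketch, assuming a synchronous step finishes only when the slowest rollout in the group finishes. The function names and the rollout durations are illustrative assumptions, not measurements or code from the paper.

```python
def sync_step_time(rollout_times):
    """Wall-clock time of one synchronous step: the slowest rollout gates the group."""
    return max(rollout_times)


def idle_fraction(rollout_times):
    """Fraction of total worker time spent waiting for the slowest rollout."""
    step = sync_step_time(rollout_times)
    busy = sum(rollout_times)
    return 1.0 - busy / (step * len(rollout_times))


# Hypothetical rollout durations in seconds; one long rollout stalls the other three.
times = [10.0, 10.0, 10.0, 40.0]
print(sync_step_time(times))  # 40.0
print(idle_fraction(times))   # 0.5625
```

In this toy case the group waits 40 seconds for work that averages 17.5 seconds per rollout, so more than half of the worker time is idle.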

To address this, we introduce Straggler-Aware Group Control (SAGC), a mechanism that dynamically adjusts the training group size in real time based on observed rollout performance. SAGC treats group-size selection as an online constrained optimization problem, aiming to preserve the advantages of larger groups while keeping the long-term frequency of stragglers under control.
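The abstract does not give SAGC's update rule. As a rough illustration of what online constrained group-size selection can look like, the sketch below uses a generic primal-dual (Lagrangian) controller: it prefers larger groups, but a multiplier grows whenever the observed straggler frequency exceeds a target. The class name, the utility function, the candidate sizes, and all parameters are assumptions made for this example and should not be read as the paper's method.

```python
class GroupSizeController:
    """Toy primal-dual controller (illustrative assumption, not the paper's algorithm)."""

    def __init__(self, sizes, target_rate, step=0.5):
        self.sizes = sorted(sizes)      # candidate group sizes (assumed)
        self.target_rate = target_rate  # allowed long-run straggler frequency (assumed)
        self.step = step                # dual step size (assumed)
        self.lam = 0.0                  # multiplier on the straggler constraint
        self.est = {g: 0.0 for g in self.sizes}   # running straggler rate per size
        self.counts = {g: 0 for g in self.sizes}  # observations per size

    def choose(self):
        # Assumed utility: normalised group size minus the penalised straggler estimate.
        max_g = self.sizes[-1]
        return max(self.sizes, key=lambda g: g / max_g - self.lam * self.est[g])

    def update(self, g, straggled):
        # Running mean of straggler events observed at group size g.
        self.counts[g] += 1
        self.est[g] += (float(straggled) - self.est[g]) / self.counts[g]
        # Dual ascent: raise lam when stragglers exceed the target, lower it otherwise.
        self.lam = max(0.0, self.lam + self.step * (float(straggled) - self.target_rate))


ctrl = GroupSizeController(sizes=[4, 8, 16], target_rate=0.1)
print(ctrl.choose())      # 16: with no penalty yet, the largest group wins
ctrl.update(16, True)     # a straggler is observed at size 16
ctrl.update(16, True)     # and again
print(ctrl.choose())      # 8: the penalty now outweighs the larger group
```

The multiplier is what ties the decision to the long-term straggler frequency rather than to any single slow step: it decays again during straggler-free stretches, which lets the controller return to larger groups.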

Our experiments demonstrate that SAGC consistently lowers the incidence of stragglers and improves wall-clock efficiency across both GRPO and DAPO training frameworks, whether applied to vanilla or robustly engineered baselines. These improvements come with competitive or superior training rewards. The benefits also extend to final model performance: SAGC matches or exceeds the strongest static group-size baselines on downstream reasoning benchmarks, and frequently generates shorter outputs without explicit length penalties. These findings establish dynamic group control as a viable strategy for improving the efficiency and resilience of synchronous on-policy reinforcement learning.


Source: arXiv
Generated at: 2026-06-02 00:00:00 UTC

Related Articles

Schroders Renewable Unit Targets AI Assets as Power Demand Soars
Bloomberg

Schroders’ renewable unit targets AI infrastructure, pivoting to meet soaring energy demand from artificial intelligence...

State Street's Paglia on SBI Group Partnership, ETFs
Bloomberg

State Street's Paglia discusses the SBI Group partnership and ETFs.

Nvidia Boss Says Workers Should Be Paid ‘as Much as Possible’
Bloomberg

Nvidia CEO Jensen Huang advocates for paying workers “as much as possible,” emphasizing maximum compensation. This stanc...

TSE Talking With Regulator For Easing ETF Listing Rules
Bloomberg

The Tokyo Stock Exchange is discussing with regulators to ease ETF listing rules. This aims to simplify market access an...

S&P DJI CEO on Japan Markets, Mega IPOs
Bloomberg

S&P DJI CEO discusses Japan's financial markets and major IPOs.