Global News Digest

arXiv

ARCA: Adapter-Residual Credit Assignment When Token Signals Degenerate

Title: ARCA: Mitigating Token Signal Degeneration via Adapter-Residual Credit Assignment

Abstract:

In reinforcement learning for language models, token-level credit assignment is typically modeled under the assumption that the entire policy is trainable. However, contemporary LLM-RL workflows frequently use parameter-efficient fine-tuning, most commonly Low-Rank Adaptation (LoRA). We contend that this disconnect between theory and practice obscures a critical structural vulnerability. Because LoRA constrains the policy to a low-rank vicinity of the reference model, standard intrinsic credit signals (such as surprisal, entropy reduction, and policy divergence) often degenerate after within-trajectory normalization. The degeneration manifests as either a near-uniform weight distribution or excessive concentration on a small number of task-agnostic tokens.

To address this, we formalize the phenomenon and propose measuring it directly with concentration diagnostics, including the weight Gini coefficient and the effective-token ratio. Building on this analysis, we introduce Adapter-Residual Credit Assignment (ARCA), a lightweight framework that determines token salience from the adapter’s own hidden-state residual, defined as $\|h^{\text{adapted}}_t - h^{\text{base}}_t\|_2$. Unlike conventional methods that rely on output-distribution uncertainty, ARCA identifies where the adapter actively modifies the model’s internal states. This approach eliminates the need for learned reward models, value heads, or tree construction. In a comprehensive GRPO sweep on the MATH dataset using Qwen3-1.7B, ARCA showed the anticipated non-degenerate credit distribution in the middle regime while remaining competitive with rank-matched baseline methods under equivalent rollout budgets.
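The quantities named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, the sum-to-one normalization is an assumed scheme, and the effective-token ratio is assumed here to be the participation ratio $(\sum_t w_t)^2 / \sum_t w_t^2$ divided by the trajectory length, since the abstract does not give its definition. Only the residual norm $\|h^{\text{adapted}}_t - h^{\text{base}}_t\|_2$ and the standard Gini coefficient are taken as given.

```python
import math
from typing import List, Sequence


def residual_norms(h_adapted: Sequence[Sequence[float]],
                   h_base: Sequence[Sequence[float]]) -> List[float]:
    """Per-token L2 norm of (adapted - base) hidden states."""
    return [
        math.sqrt(sum((a - b) ** 2 for a, b in zip(ha, hb)))
        for ha, hb in zip(h_adapted, h_base)
    ]


def normalize(scores: Sequence[float]) -> List[float]:
    """Within-trajectory normalization to weights summing to 1 (assumed scheme)."""
    total = sum(scores)
    if total <= 0.0:
        # Fall back to uniform weights when every score is zero.
        return [1.0 / len(scores)] * len(scores)
    return [s / total for s in scores]


def gini(weights: Sequence[float]) -> float:
    """Gini coefficient of non-negative weights: 0 means uniform."""
    w = sorted(weights)
    n = len(w)
    total = sum(w)
    if n == 0 or total == 0.0:
        return 0.0
    # Standard formula on ascending values, with i running from 1 to n.
    return sum((2 * i - n - 1) * x for i, x in enumerate(w, start=1)) / (n * total)


def effective_token_ratio(weights: Sequence[float]) -> float:
    """Assumed definition: participation ratio divided by trajectory length."""
    total = sum(weights)
    sq = sum(x * x for x in weights)
    if sq == 0.0:
        return 0.0
    return (total * total) / sq / len(weights)


# Toy trajectory of 4 tokens with 2-dimensional hidden states (made-up values):
# the adapter changes only the first token's state.
h_base = [[0.0, 0.0]] * 4
h_adapted = [[3.0, 4.0], [0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
weights = normalize(residual_norms(h_adapted, h_base))
print(gini(weights))                   # 0.75: all credit on one of four tokens
print(effective_token_ratio(weights))  # 0.25: one effective token out of four
```

Under these assumed definitions, the two degenerate cases described above sit at opposite ends of the diagnostics: uniform weights give a Gini of 0 and an effective-token ratio of 1, while credit collapsed onto a single token pushes the Gini toward its maximum and the ratio toward 1/T.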


Source: arXiv
Generated at: 2026-06-02 00:00:00 UTC
