arXiv

Experience-Driven Dynamic Exits for LLMs with Reinforcement Learning

Title: Dynamic Exits for LLMs via Reinforcement Learning: An Experience-Based Approach

Abstract: Large Language Models are currently constrained by the sluggish pace of autoregressive inference. Although self-speculative decoding offers a pathway to acceleration, its performance is often limited by rigid parameters, such as fixed speculation lengths and predetermined exit layers. To address these inefficiencies, we formulate the optimization problem as a Markov Decision Process and introduce LEDE, a framework grounded in offline reinforcement learning. LEDE employs a learned policy to dynamically determine the most effective exit layer and speculation length at each generation step, guided by the local context of the sequence. This approach effectively balances the trade-off between computational expenditure and the quality of the drafted tokens. Extensive testing on Llama-2 and Llama-3 architectures demonstrates that LEDE delivers speedups ranging from $2.0\times$ to $2.7\times$ compared to standard autoregressive decoding, while also outperforming static speculative baselines by an additional 17%.
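The trade-off described in the abstract, between the cost of drafting and the number of drafted tokens that survive verification, can be illustrated with a small sketch. This is not LEDE's learned policy. It is a static grid search over (exit layer, speculation length) pairs, using the standard expected-acceptance formula for speculative decoding. It assumes each drafted token is accepted independently with a fixed probability. The function names, the cost model (one unit per full forward pass, with drafting cost proportional to the exit depth), and the per-layer acceptance rates are all hypothetical and chosen only for illustration.

```python
from itertools import product


def expected_tokens(accept_rate: float, spec_len: int) -> float:
    """Expected tokens emitted per draft-then-verify round.

    Assumes each drafted token is accepted independently with probability
    accept_rate (an illustrative simplification, not a claim about LEDE).
    """
    if accept_rate >= 1.0:
        # Every draft is accepted, plus one token from the verification pass.
        return float(spec_len + 1)
    return (1.0 - accept_rate ** (spec_len + 1)) / (1.0 - accept_rate)


def round_cost(exit_layer: int, num_layers: int, spec_len: int) -> float:
    """Cost of one round, in units of a full forward pass.

    Hypothetical cost model: each draft step costs exit_layer / num_layers,
    and verification costs one full pass.
    """
    return spec_len * exit_layer / num_layers + 1.0


def best_action(accept_by_layer: dict[int, float], num_layers: int, max_spec_len: int):
    """Return (exit_layer, spec_len, estimated_speedup) with the highest speedup."""
    best = None
    for exit_layer, spec_len in product(sorted(accept_by_layer), range(1, max_spec_len + 1)):
        tokens = expected_tokens(accept_by_layer[exit_layer], spec_len)
        speedup = tokens / round_cost(exit_layer, num_layers, spec_len)
        if best is None or speedup > best[2]:
            best = (exit_layer, spec_len, speedup)
    return best


if __name__ == "__main__":
    # Made-up acceptance rates: deeper exits draft better tokens but cost more.
    accept_by_layer = {8: 0.55, 16: 0.75, 24: 0.88}
    print(best_action(accept_by_layer, num_layers=32, max_spec_len=8))
```

In this sketch the acceptance rates are fixed constants. A context-dependent policy of the kind the abstract describes would instead re-estimate them, or choose the action directly, at every generation step.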


Source: arXiv. Generated at: 2026-06-03 00:00:00 UTC
