R2-Router: A New Paradigm for LLM Routing with Reasoning
Title: R2-Router: Establishing a New Framework for LLM Routing Through Reasoning
Abstract:
The rapid proliferation of Large Language Models (LLMs) with differing capabilities and prices has given rise to LLM routing: a router learns to predict each model's quality and cost for a given query, then selects the model that offers the highest quality at the lowest cost. Existing routers, however, implicitly assume that each LLM has a single, fixed quality and cost for a given query. This overlooks a critical factor: the quality an LLM achieves on a query varies with its output length. As a result, existing routers rule out powerful models whenever their predicted cost exceeds the budget, and miss the cases where those models could still deliver high-quality answers at lower cost if their output length were constrained.
To address this limitation, we present R2-Router, which treats the output length budget as a controllable variable. It jointly selects the best LLM and the best length budget, and enforces that budget through length-constrained instructions. This lets R2-Router discover cases where a powerful LLM with a constrained output length outperforms a weaker LLM at comparable cost, an option that was hidden from previous routers. Alongside the router, we introduce R2-Bench, the first dataset designed to capture LLM behavior across a range of output length budgets. Experiments show that R2-Router attains state-of-the-art performance at 4 to 5 times lower cost than existing routers. This work points toward "routing as reasoning," in which routers evolve from passive selectors into deliberate reasoners that decide both which LLM to use and at what cost budget. The source code is openly available at https://github.com/UCF-ML-Research/R2-Router.
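The difference between a fixed-profile router and joint selection over (LLM, length budget) pairs can be illustrated with a small sketch. This is not the R2-Router implementation: the model names, predicted quality scores, per-token prices, cost approximation (length budget times output price, input tokens ignored), and prompt wording are all made-up assumptions chosen only to show the selection logic.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Candidate:
    """One routing option: a model paired with an output length budget."""
    model: str                # hypothetical model name
    length_budget: int        # maximum output tokens requested
    predicted_quality: float  # made-up predictor output in [0, 1]
    price_per_token: int      # made-up output price in abstract cost units

    @property
    def cost(self) -> int:
        # Assumption: cost is approximated by length budget * output price.
        return self.length_budget * self.price_per_token


def route(candidates: List[Candidate], cost_budget: int) -> Optional[Candidate]:
    """Return the affordable candidate with the highest predicted quality.

    Ties on quality are broken in favor of the cheaper candidate.
    Returns None when no candidate fits the cost budget.
    """
    affordable = [c for c in candidates if c.cost <= cost_budget]
    if not affordable:
        return None
    return max(affordable, key=lambda c: (c.predicted_quality, -c.cost))


def length_constrained_prompt(query: str, length_budget: int) -> str:
    """Append a length instruction to the query (hypothetical wording)."""
    return f"{query}\n\nAnswer in at most {length_budget} tokens."


COST_BUDGET = 2500

# Fixed-profile view: one (quality, cost) point per model.
FIXED_PROFILE = [
    Candidate("strong-llm", 800, 0.90, 10),  # cost 8000, over budget
    Candidate("weak-llm", 800, 0.60, 1),     # cost 800
]

# Joint view: the same models, each at several length budgets.
JOINT = FIXED_PROFILE + [
    Candidate("strong-llm", 200, 0.75, 10),  # cost 2000
    Candidate("strong-llm", 100, 0.55, 10),  # cost 1000
    Candidate("weak-llm", 200, 0.45, 1),     # cost 200
]

baseline = route(FIXED_PROFILE, COST_BUDGET)
joint = route(JOINT, COST_BUDGET)

print(baseline.model, baseline.length_budget)  # weak-llm 800
print(joint.model, joint.length_budget)        # strong-llm 200
print(length_constrained_prompt("Explain LLM routing.", joint.length_budget))
```

With these illustrative numbers, the fixed-profile view discards the strong model because its only known cost exceeds the budget, while the joint view finds that the same model at a 200-token budget is both affordable and predicted to beat the weak model.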
Source: arXiv Generated at: 2026-06-02 00:00:00 UTC





