Global News Digest

arXiv

ThinkSwitch: Context Distillation with LoRA and Weight Interpolation for Specific-Purpose Reasoning Tasks

Title: ThinkSwitch: Leveraging LoRA and Weight Interpolation for Context Distillation in Specialized Reasoning

Abstract:

While large language models frequently enhance their performance on complex problems by allocating additional inference-time compute to generate a reasoning trace prior to delivering a final answer, this approach introduces significant drawbacks, including increased latency, higher token expenses, and greater deployment complexity. To address these challenges, we propose ThinkSwitch, a resource-efficient co-training method for paired instruct and thinking checkpoints.

Starting from compatible Qwen3-4B instruct and thinking checkpoints, the procedure iterates: the thinking checkpoint generates answers, and its reasoning traces are then stripped away. The resulting prompt-answer pairs, which no longer contain a reasoning trace, are distilled into the instruct checkpoint using QLoRA. Concurrently, a new thinking checkpoint is reconstructed via spherical weight interpolation. The process requires no manual labeling: the only human input is the set of task prompts, and the model generates the corresponding labels itself.
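The abstract gives no implementation details, so the following is only a minimal sketch of two steps it names: stripping the reasoning trace and spherical weight interpolation. It assumes traces are wrapped in `<think>...</think>` tags, as Qwen3 thinking models emit them, and it treats a weight tensor as a flat list of floats. The helper names (`strip_reasoning`, `make_pair`, `slerp`), the choice of interpolation endpoints, and the factor `t` are illustrative assumptions, not the authors' code.

```python
import math
import re

# Assumption: the reasoning trace is delimited by <think>...</think> tags.
THINK_BLOCK = re.compile(r"<think>.*?</think>", re.DOTALL)


def strip_reasoning(completion: str) -> str:
    """Drop the reasoning trace and keep only the final answer text."""
    return THINK_BLOCK.sub("", completion).strip()


def make_pair(prompt: str, completion: str) -> dict:
    """Build one answer-only training example (hypothetical record layout)."""
    return {"prompt": prompt, "answer": strip_reasoning(completion)}


def slerp(w_a, w_b, t, eps=1e-8):
    """Spherical linear interpolation between two flat weight vectors.

    t = 0 returns w_a and t = 1 returns w_b. Falls back to linear
    interpolation when the angle between the vectors is degenerate.
    """
    linear = [(1 - t) * a + t * b for a, b in zip(w_a, w_b)]
    norm_a = math.sqrt(sum(a * a for a in w_a))
    norm_b = math.sqrt(sum(b * b for b in w_b))
    if norm_a < eps or norm_b < eps:
        return linear
    # Angle between the two vectors, clamped against rounding error.
    cos_omega = sum(a * b for a, b in zip(w_a, w_b)) / (norm_a * norm_b)
    omega = math.acos(max(-1.0, min(1.0, cos_omega)))
    sin_omega = math.sin(omega)
    if sin_omega < eps:
        return linear
    coef_a = math.sin((1 - t) * omega) / sin_omega
    coef_b = math.sin(t * omega) / sin_omega
    return [coef_a * a + coef_b * b for a, b in zip(w_a, w_b)]


if __name__ == "__main__":
    raw = "<think>2 + 2 is 4.</think>\n\nThe answer is 4."
    print(make_pair("What is 2 + 2?", raw))
    print(slerp([1.0, 0.0], [0.0, 1.0], 0.5))
```

In a real run the interpolation would be applied per parameter tensor across two checkpoints. Which two checkpoints are interpolated, and with what `t`, is not stated in the abstract.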

Experimental results on a 30-question subset of AIME 2026 demonstrate that ThinkSwitch raises the instruct checkpoint’s score from 10/30 to 20/30, and the thinking checkpoint’s from 14/30 to 22/30. Similarly, on a 30-question segment of PubMedQA, the instruct checkpoint improved from 13/30 to 18/30, while the thinking checkpoint rose from 18/30 to 25/30. The entire experiment was conducted using 15 training prompts per domain at a total cost of $2.86 on a single cloud-based RTX 3070. Although these findings stem from a small-scale study, they suggest that targeted distillation cycles can effectively embed the advantages of explicit reasoning into model weights, all while maintaining a distinct thinking mode.
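To make the reported scores easier to compare, the short sketch below converts them into accuracies and gains in percentage points. The score pairs are copied from the abstract; the `accuracy_gain` helper and the dictionary layout are illustrative assumptions.

```python
# (before, after) correct answers out of 30, copied from the abstract.
N_QUESTIONS = 30
SCORES = {
    ("AIME 2026", "instruct"): (10, 20),
    ("AIME 2026", "thinking"): (14, 22),
    ("PubMedQA", "instruct"): (13, 18),
    ("PubMedQA", "thinking"): (18, 25),
}


def accuracy_gain(before, after, n=N_QUESTIONS):
    """Return (accuracy before %, accuracy after %, gain in percentage points)."""
    return 100 * before / n, 100 * after / n, 100 * (after - before) / n


if __name__ == "__main__":
    for (benchmark, checkpoint), (before, after) in SCORES.items():
        acc_before, acc_after, gain = accuracy_gain(before, after)
        print(f"{benchmark:9s} {checkpoint:8s} "
              f"{acc_before:5.1f}% -> {acc_after:5.1f}% (+{gain:.1f} points)")
```

With 30 questions per benchmark, each question is worth about 3.3 percentage points. The AIME instruct result, for example, corresponds to a move from 33.3% to 66.7%.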


Source: arXiv Generated at: 2026-06-02 00:00:00 UTC
