arXiv

Hint Tuning: Less Data Makes Better Reasoners

Abstract:

Large reasoning models attain high accuracy through extended chain-of-thought reasoning, but they often produce 5 to 8 times more tokens than necessary, applying verbose reasoning uniformly regardless of task complexity. To address this, we introduce Hint Tuning, a method that needs only minimal data to teach models to calibrate their reasoning depth. Our central premise is that the corresponding instruction-following (instruct) model acts as an optimal difficulty probe. By evaluating the instruct model’s performance under varying levels of guidance, we automatically generate training data covering three distinct states: No-Hint (direct answers), Sparse-Hint (minimal reasoning prefixes), and Full-Hint (complete reasoning). This strategy turns the subjective problem of labeling difficulty into an objective consistency check between the reasoning and instruct models. Using just 1,000 self-annotated samples, Hint Tuning reduces token usage by 24–66% (averaging 31.5%) across mainstream reasoning models at scales from 4B to 32B, including Qwen3-Thinking and DeepSeek-R1-Distill, without compromising accuracy on five major benchmarks. In contrast to approaches that depend on extensive distillation datasets or costly reinforcement learning, we achieve superior efficiency simply by aligning with the capabilities of the instruct model. The code and data are available at https://github.com/redai-infra/hint-tuning.
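The three-state labeling described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the actual procedure, so everything here is an assumption made for illustration: the function name `assign_hint_state`, the `instruct_answer` callable (a stand-in for querying the instruct model with a question and an optional hint), the exact-match consistency check, and the `sparse_fraction` parameter that sets how much of the reasoning trace counts as a "minimal prefix".

```python
from typing import Callable


def assign_hint_state(
    question: str,
    reasoning_trace: str,
    reasoning_answer: str,
    instruct_answer: Callable[[str, str], str],
    sparse_fraction: float = 0.2,
) -> str:
    """Label a sample as no_hint, sparse_hint, or full_hint.

    Hypothetical sketch: the instruct model is probed with increasing
    guidance, and the first level at which its answer agrees with the
    reasoning model's answer determines the label.
    """
    # Level 1: no guidance. Agreement suggests the question is easy
    # enough for a direct answer.
    if instruct_answer(question, "") == reasoning_answer:
        return "no_hint"

    # Level 2: a short prefix of the reasoning trace (assumed here to be
    # the first `sparse_fraction` of its whitespace-separated words).
    words = reasoning_trace.split()
    prefix_len = max(1, int(len(words) * sparse_fraction))
    sparse_prefix = " ".join(words[:prefix_len])
    if instruct_answer(question, sparse_prefix) == reasoning_answer:
        return "sparse_hint"

    # Level 3: the instruct model still disagrees, so the sample keeps
    # the complete reasoning.
    return "full_hint"
```

In practice `instruct_answer` would wrap a call to the instruct model and an answer extractor; exact string equality is the simplest possible consistency check and would likely need to be replaced by a task-appropriate answer comparison.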


Source: arXiv Generated at: 2026-06-04 00:00:00 UTC
