Global News Digest

arXiv

Harness-1: Reinforcement Learning for Search Agents with State-Externalizing Harnesses

Abstract:

Search agents are typically trained as policies over ever-growing transcripts: the model must choose search strategies while also tracking what it has observed, identifying relevant evidence, monitoring open constraints, and verifying claims. We argue that this burdens the policy with routine state management, forcing reinforcement learning to optimize both semantic search decisions and recoverable bookkeeping that the environment could handle more reliably. To address this, we present Harness-1, a 20-billion-parameter search agent (a retrieval subagent) trained with reinforcement learning inside a stateful search harness. The harness maintains environment-side working memory: a candidate pool, a curated set tagged with importance levels, compact evidence links, verification logs, compressed and deduplicated observations, and budget-aware context rendering. The policy remains responsible for semantic decisions: what to search for, which documents to keep or discard, what to verify, and when to stop. Across eight retrieval benchmarks spanning web, finance, patents, and multi-hop question answering, Harness-1 achieves an average curated recall of 0.730, surpassing the leading open search subagent by 11.4 points and remaining competitive with significantly larger frontier models. The gains are most pronounced on held-out transfer benchmarks, suggesting that reinforcement learning over explicit search state yields retrieval behaviors that generalize beyond the training domains. Our code is publicly available at https://github.com/pat-jj/harness-1.
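To make the division of labor in the abstract concrete, here is a minimal illustrative sketch of an environment-side harness. It is not the Harness-1 implementation: the class `SearchHarness`, its method names, the importance tags ("high", "medium", "low"), and the character-based budget are all hypothetical. The `curated_recall` function assumes the metric is the fraction of relevant documents that end up in the curated set, which the abstract does not define.

```python
from dataclasses import dataclass, field


@dataclass
class SearchHarness:
    """Hypothetical environment-side working memory (not the paper's API)."""
    candidates: dict = field(default_factory=dict)        # doc_id -> compressed text
    curated: dict = field(default_factory=dict)           # doc_id -> importance tag
    verification_log: list = field(default_factory=list)  # (claim, supported) pairs

    def observe(self, results, max_chars=80):
        """Add search results to the pool; skip ids already seen (deduplication)."""
        new_ids = []
        for doc_id, text in results.items():
            if doc_id not in self.candidates:
                self.candidates[doc_id] = text[:max_chars]  # crude compression
                new_ids.append(doc_id)
        return new_ids

    def keep(self, doc_id, importance):
        """Policy decision: retain a document with an importance tag."""
        if doc_id not in self.candidates:
            raise KeyError(f"unknown document: {doc_id}")
        self.curated[doc_id] = importance

    def discard(self, doc_id):
        """Policy decision: drop a document from the curated set."""
        self.curated.pop(doc_id, None)

    def record_verification(self, claim, supported):
        """Log the outcome of a verification the policy asked for."""
        self.verification_log.append((claim, supported))

    def render(self, budget_chars):
        """Render curated items, most important first, within a character budget."""
        rank = {"high": 0, "medium": 1, "low": 2}
        ordered = sorted(self.curated.items(),
                         key=lambda kv: rank.get(kv[1], len(rank)))
        lines, used = [], 0
        for doc_id, tag in ordered:
            line = f"[{tag}] {doc_id}: {self.candidates[doc_id]}"
            if used + len(line) + 1 > budget_chars:
                break  # stop once the next line would exceed the budget
            lines.append(line)
            used += len(line) + 1
        return "\n".join(lines)


def curated_recall(curated_ids, relevant_ids):
    """Assumed metric: share of relevant documents present in the curated set."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    return len(relevant & set(curated_ids)) / len(relevant)
```

In this sketch the policy only issues semantic actions (`keep`, `discard`, `record_verification`), while deduplication, truncation, and budgeted rendering happen in the harness, so the policy never has to reconstruct that state from a transcript.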


Source: arXiv
Generated at: 2026-06-02 00:00:00 UTC

Related Articles

Schroders Renewable Unit Targets AI Assets as Power Demand Soars
Bloomberg

Schroders’ renewable unit targets AI infrastructure, pivoting to meet soaring energy demand from artificial intelligence...

State Street's Paglia on SBI Group Partnership, ETFs
Bloomberg

State Street's Paglia discusses the SBI Group partnership and ETFs.

Nvidia Boss Says Workers Should Be Paid ‘as Much as Possible’
Bloomberg

Nvidia CEO Jensen Huang advocates for paying workers “as much as possible,” emphasizing maximum compensation. This stanc...

TSE Talking With Regulator For Easing ETF Listing Rules
Bloomberg

The Tokyo Stock Exchange is in talks with regulators to ease ETF listing rules. This aims to simplify market access an...

S&P DJI CEO on Japan Markets, Mega IPOs
Bloomberg

S&P DJI CEO discusses Japan's financial markets and major IPOs.