arXiv

Critic-R: Improving Agentic Search using Instruction-tuned Retrievers with Natural Language Introspective Feedback

June 2, 2026 · Md Zarif Ul Alam, Alireza Salemi, Hamed Zamani

Abstract:

Agentic search systems answer complex questions by interacting iteratively with retrieval models. Despite recent progress, optimizing these retrievers remains difficult: it typically requires costly co-training or gold-standard annotations, which limits practical deployment. To address this, we introduce Critic-R, a framework that establishes an explicit feedback loop between the reasoning agent and the retrieval model during both training and inference.

At the core of Critic-R is a critic model that assesses the agent’s introspective reasoning trace after it has processed retrieved evidence. This evaluation determines whether the current context adequately supports the subsequent reasoning step. The framework employs two distinct but complementary mechanisms:

Critic-R-Zero: An inference-time mechanism that implements a query refinement loop, iteratively rewriting queries and retrieval instructions.
Critic-Embed: An optimization strategy for retrieval models that utilizes successful and failed refinement trajectories as automatic supervision, eliminating the need for manual relevance annotations.
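
The two mechanisms above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify prompts, models, stopping rules, or how trajectories become training signal. Every name here (`Attempt`, `refine_loop`, `to_training_triples`, the `toy_*` stubs, `max_rounds`) is hypothetical, and the pairing rule in `to_training_triples` (passages from an accepted attempt as positives, passages from rejected attempts as negatives) is only one plausible reading of "successful and failed refinement trajectories as automatic supervision".

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Attempt:
    """One retrieval round plus the critic's verdict on it (hypothetical record)."""
    query: str
    instruction: str
    docs: List[str]
    sufficient: bool
    feedback: str


def refine_loop(
    question: str,
    retrieve: Callable[[str, str], List[str]],
    introspect: Callable[[str, List[str]], str],
    critic: Callable[[str], Tuple[bool, str]],
    rewrite: Callable[[str, str, str], Tuple[str, str]],
    max_rounds: int = 3,  # assumed budget; the abstract gives no stopping rule
) -> List[Attempt]:
    """Inference-time loop: retrieve, let the agent introspect, ask the critic,
    and rewrite the query and retrieval instruction until the critic accepts."""
    query = question
    instruction = "Retrieve passages that help answer the question."
    attempts: List[Attempt] = []
    for _ in range(max_rounds):
        docs = retrieve(query, instruction)
        trace = introspect(question, docs)   # agent's reasoning trace over the evidence
        ok, feedback = critic(trace)         # does the context support the next step?
        attempts.append(Attempt(query, instruction, docs, ok, feedback))
        if ok:
            break
        query, instruction = rewrite(query, instruction, feedback)
    return attempts


def to_training_triples(attempts: List[Attempt]):
    """Assumed supervision rule: pair each accepted attempt with each rejected one,
    yielding (query, instruction, positive_docs, negative_docs)."""
    good = [a for a in attempts if a.sufficient]
    bad = [a for a in attempts if not a.sufficient]
    return [(g.query, g.instruction, g.docs, b.docs) for g in good for b in bad]


# Toy stand-ins for the retriever, agent, critic, and rewriter (illustration only).
def toy_retrieve(query: str, instruction: str) -> List[str]:
    if "capital" in query.lower():
        return ["Canberra is the capital of Australia."]
    return ["Sydney is the largest city in Australia."]


def toy_introspect(question: str, docs: List[str]) -> str:
    if any("capital" in d for d in docs):
        return "SUPPORTED"
    return "The passages do not mention a capital."


def toy_critic(trace: str) -> Tuple[bool, str]:
    if trace == "SUPPORTED":
        return True, ""
    return False, "Ask explicitly for the capital."


def toy_rewrite(query: str, instruction: str, feedback: str) -> Tuple[str, str]:
    return query + " capital", instruction + " " + feedback


attempts = refine_loop(
    "Which city governs Australia?",
    toy_retrieve, toy_introspect, toy_critic, toy_rewrite,
)
triples = to_training_triples(attempts)
```

With these stubs the first round is rejected and the rewritten query is accepted, so `attempts` holds one failed and one successful round and `triples` holds a single example. In a real system the stubs would be replaced by an instruction-tuned retriever and LLM calls, and the triples would feed whatever retriever training objective is chosen.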

We evaluate Critic-R on several benchmark datasets, including HotpotQA, 2WikiMultihopQA, MuSiQue, and Bamboogle. The results show substantial improvements in both retrieval quality and downstream answer accuracy.
