arXiv

KBQA-R1: Reinforcing Large Language Models for Knowledge Base Question Answering

June 3, 2026 · Xin Sun, Zhongqi Chen, Xing Zheng, Qiang Liu, Shu Wu, Bowen Song, Zilei Wang, Weiqiang Wang, Liang Wang


Abstract: Knowledge Base Question Answering (KBQA) requires models to connect natural language questions with rigid knowledge graph structures by producing executable logical forms. Although Large Language Models (LLMs) have advanced the field, existing methods tend to fail in one of two ways: they either generate hallucinated queries without verifying schema validity, or rely on inflexible, template-driven reasoning that mimics synthetic traces rather than demonstrating genuine understanding of the environment. To overcome these problems, we introduce KBQA-R1, a framework that shifts the approach from textual imitation to interaction optimization through Reinforcement Learning. By modeling KBQA as a multi-turn decision process, our system learns to traverse the knowledge base via a sequence of actions, using Group Relative Policy Optimization (GRPO) to refine its strategies based on concrete execution feedback instead of static supervision. Additionally, we propose Referenced Rejection Sampling (RRS), a data synthesis technique designed to mitigate cold-start issues by ensuring that reasoning traces are strictly aligned with ground-truth action sequences. Comprehensive evaluations on the WebQSP, GrailQA, and GraphQuestions datasets show that KBQA-R1 attains state-of-the-art results, anchoring LLM reasoning in verifiable execution.
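As a rough illustration of the group-relative idea behind GRPO mentioned in the abstract, the sketch below normalizes each rollout's reward against the other rollouts sampled for the same question. It is not the authors' implementation: the function name `group_relative_advantages`, the `eps` parameter, the use of the population standard deviation, and the example reward values are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each rollout's reward against its own group.

    Assumption: population standard deviation; some implementations
    use the sample standard deviation instead.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # eps avoids division by zero when every rollout gets the same reward.
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical execution rewards for four rollouts of the same question
# (1.0 = the executed query returned the correct answer, 0.0 = it did not).
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
```

Rollouts that beat the group average receive positive advantages and the rest receive negative ones, so no separate learned value model is needed to produce a baseline. When all rollouts in a group earn the same reward, every advantage is zero and that group contributes no learning signal.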

Source: arXiv Generated at: 2026-06-03 00:00:00 UTC