arXiv

KBQA-R1: Reinforcing Large Language Models for Knowledge Base Question Answering

Abstract: Knowledge Base Question Answering (KBQA) requires models to map natural language questions onto rigid knowledge graph schemas by producing executable logical forms. Although Large Language Models (LLMs) have advanced this area, existing methods tend to fail in one of two ways: they either generate hallucinated queries without verifying that the schema elements they reference are valid, or they rely on inflexible, template-driven reasoning that mimics synthetic traces without genuinely understanding the environment. To address both failure modes, we introduce KBQA-R1, a framework that shifts the paradigm from text imitation to interaction optimization through Reinforcement Learning. By modeling KBQA as a multi-turn decision process, our model learns to navigate the knowledge base through a sequence of actions, using Group Relative Policy Optimization (GRPO) to refine its strategy based on concrete execution feedback rather than static supervision. We also propose Referenced Rejection Sampling (RRS), a data synthesis technique that mitigates cold-start issues by strictly aligning reasoning traces with ground-truth action sequences. Extensive experiments on WebQSP, GrailQA, and GraphQuestions show that KBQA-R1 achieves state-of-the-art results, anchoring LLM reasoning in verifiable execution.
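
As a rough illustration of the "group relative" idea in GRPO mentioned in the abstract, the sketch below normalizes the reward of each rollout against the other rollouts sampled for the same question. This is a minimal sketch of the generic advantage computation only, not the KBQA-R1 implementation: the function name, the `eps` constant, the use of the population standard deviation, and the example reward values are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Score each rollout relative to its own sampling group.

    rewards: one scalar reward per rollout of the same question
             (hypothetical values, e.g. 1.0 if the executed answer is correct).
    eps:     small constant (an assumption) to avoid division by zero
             when every rollout in the group receives the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; a sample std is an equally plausible choice
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical execution rewards for four rollouts of one question.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
```

Rollouts that beat the group average receive a positive advantage and the rest a negative one, so no separate learned value model is needed to provide a baseline. When every rollout in a group earns the same reward, all advantages are zero and that group contributes no learning signal.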


Source: arXiv Generated at: 2026-06-03 00:00:00 UTC
