arXiv

CacheRAG: A Semantic Caching System for Retrieval-Augmented Generation in Knowledge Graph Question Answering

Title: CacheRAG: Implementing Semantic Caching for Retrieval-Augmented Generation in Knowledge Graph Question Answering

Abstract:

The convergence of Large Language Models (LLMs) with Retrieval-Augmented Generation (RAG) has propelled significant progress in Knowledge Graph Question Answering (KGQA). Nevertheless, current LLM-driven KGQA frameworks operate as stateless planners: they formulate each retrieval strategy in isolation and fail to leverage historical query patterns. This limitation is comparable to a database system that optimizes every query from the ground up without a plan cache. The architectural weakness results in schema hallucinations and restricts the scope of retrieval. To address these issues, we introduce CacheRAG, a cache-augmented architecture for LLM-based KGQA that turns stateless planners into continual learners.

In contrast to conventional database plan caching, which prioritizes frequency optimization, CacheRAG establishes three innovative design principles specifically adapted for LLM environments:

  1. Schema-agnostic User Interface: We employ a two-stage semantic parsing framework based on Intermediate Semantic Representation (ISR). This allows non-expert users to interact exclusively via natural language. Simultaneously, a Backend Adapter provides the LLM with local schema context, ensuring the safe compilation of executable physical queries.
  2. Diversity-optimized Cache Retrieval: By utilizing a two-layer hierarchical index (Domain $\rightarrow$ Aspect) alongside Maximal Marginal Relevance (MMR), the system maximizes structural variety among cached examples. This approach effectively reduces reasoning homogeneity.
  3. Bounded Heuristic Expansion: The implementation of deterministic subgraph operators for depth and breadth, combined with strict complexity guarantees, substantially improves retrieval recall while preventing the risk of unbounded API execution.
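To make the second principle concrete, the following is a minimal sketch of Maximal Marginal Relevance selection over cached examples. It is not the paper's implementation: the abstract does not specify the embedding model, the similarity measure, or the trade-off weight, so cosine similarity, the `lam` parameter, and the names `cosine` and `mmr_select` are all assumptions made for illustration. The candidates are assumed to come from a single Domain/Aspect bucket of the hierarchical index, which the sketch omits.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def mmr_select(query_vec, candidates, k, lam=0.5):
    """Pick up to k cached examples by Maximal Marginal Relevance.

    candidates: list of (key, vector) pairs (hypothetical cache entries).
    lam: assumed trade-off weight; 1.0 ranks by relevance only,
         lower values penalise similarity to already-selected entries.
    """
    selected = []
    remaining = list(candidates)
    while remaining and len(selected) < k:

        def score(item):
            _, vec = item
            relevance = cosine(query_vec, vec)
            # Redundancy is the highest similarity to anything already chosen.
            redundancy = max(
                (cosine(vec, chosen_vec) for _, chosen_vec in selected),
                default=0.0,
            )
            return lam * relevance - (1.0 - lam) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return [key for key, _ in selected]
```

With `lam=1.0` the function reduces to plain top-k by relevance, so two near-duplicate cached plans can both be returned. Lowering `lam` lets a less similar but structurally different entry displace the near-duplicate, which is the behaviour the abstract describes as reducing reasoning homogeneity.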

Comprehensive experiments across various benchmarks indicate that CacheRAG surpasses state-of-the-art baselines, achieving a 13.2% increase in accuracy and a 17.5% improvement in truthfulness on the CRAG dataset.


Source: arXiv Generated at: 2026-06-02 00:00:00 UTC
