arXiv

CacheRAG: A Semantic Caching System for Retrieval-Augmented Generation in Knowledge Graph Question Answering

June 2, 2026 · Yushi Sun, Lei Chen · Original Source

Title: CacheRAG: Implementing Semantic Caching for Retrieval-Augmented Generation in Knowledge Graph Question Answering

Abstract:

The convergence of Large Language Models (LLMs) with Retrieval-Augmented Generation (RAG) has propelled significant progress in Knowledge Graph Question Answering (KGQA). Nevertheless, current LLM-driven KGQA frameworks operate as stateless planners. They formulate retrieval strategies in isolation, failing to leverage historical query patterns—a limitation comparable to a database system that attempts to optimize every query from the ground up without utilizing a plan cache. This inherent architectural weakness results in schema hallucinations and restricts the scope of retrieval. To address these issues, we introduce CacheRAG, a cache-augmented architecture designed for LLM-based KGQA that evolves stateless planners into continual learners.

In contrast to conventional database plan caching, which prioritizes frequency optimization, CacheRAG establishes three innovative design principles specifically adapted for LLM environments:

Schema-agnostic User Interface: We employ a two-stage semantic parsing framework based on Intermediate Semantic Representation (ISR). This allows non-expert users to interact exclusively via natural language. Simultaneously, a Backend Adapter provides the LLM with local schema context, ensuring the safe compilation of executable physical queries.
Diversity-optimized Cache Retrieval: By utilizing a two-layer hierarchical index (Domain $\rightarrow$ Aspect) alongside Maximal Marginal Relevance (MMR), the system maximizes structural variety among cached examples. This approach effectively reduces reasoning homogeneity.
Bounded Heuristic Expansion: The implementation of deterministic subgraph operators for depth and breadth, combined with strict complexity guarantees, substantially improves retrieval recall while preventing the risk of unbounded API execution.

Comprehensive experiments across various benchmarks indicate that CacheRAG surpasses state-of-the-art baselines, achieving a 13.2% increase in accuracy and a 17.5% improvement in truthfulness on the CRAG dataset.

Source: arXiv Generated at: 2026-06-02 00:00:00 UTC