Learning to Retrieve: Dual-Level Long-Term Memory for Text-to-SQL Agents
Abstract:
Interactive text-to-SQL agents operate on databases through a multi-turn process of schema exploration, query execution, feedback analysis, and decision adjustment. Long-term memory lets these agents reuse previous experience, but current retrieval mechanisms have significant limitations. Static approaches rely on fixed similarity heuristics that are not optimized for downstream utility, while dynamic methods typically learn from sparse final outcomes and operate at a single decision horizon. A single horizon is inadequate because memory relevance shifts across interaction phases: insights useful for initial planning often differ from those needed for local, state-dependent execution.
To address this, we introduce MERIT, a dynamic, multi-horizon memory retrieval framework. MERIT employs a dual-level structure: episode-level memory serves as a guide for global strategy, while turn-level memory supports local decision-making. Both tiers utilize learned retrieval policies refined through reinforcement learning. To overcome the challenge of limited intermediate supervision during the training of turn-level retrieval, MERIT incorporates a lightweight Process Reward Model that generates dense proxy rewards for selecting local memories.
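The dual-level idea above can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the MERIT implementation, which the abstract does not specify: the names `DualLevelMemory`, `RetrievalPolicy`, `features`, and `proxy_turn_reward` are hypothetical, the retrieval policy is a simple softmax over a linear scorer trained with a REINFORCE-style update, and the process reward model is replaced by a trivial placeholder.

```python
import math
import random
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


def features(query: str, memory_text: str) -> List[float]:
    """Toy features (assumption): Jaccard token overlap plus a constant bias."""
    q, m = set(query.lower().split()), set(memory_text.lower().split())
    union = q | m
    jaccard = len(q & m) / len(union) if union else 0.0
    return [jaccard, 1.0]


@dataclass
class RetrievalPolicy:
    """Softmax policy over stored memories with a linear scorer (illustrative)."""
    weights: List[float] = field(default_factory=lambda: [0.0, 0.0])
    lr: float = 0.5

    def probs(self, query: str, memories: List[str]) -> List[float]:
        scores = [sum(w * f for w, f in zip(self.weights, features(query, m)))
                  for m in memories]
        top = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        return [e / total for e in exps]

    def sample(self, query: str, memories: List[str], rng: random.Random) -> int:
        p = self.probs(query, memories)
        return rng.choices(range(len(memories)), weights=p, k=1)[0]

    def update(self, query: str, memories: List[str], chosen: int, reward: float) -> None:
        """REINFORCE step: w += lr * reward * grad log pi(chosen)."""
        p = self.probs(query, memories)
        feats = [features(query, m) for m in memories]
        for k in range(len(self.weights)):
            expected = sum(pi * f[k] for pi, f in zip(p, feats))
            self.weights[k] += self.lr * reward * (feats[chosen][k] - expected)


class DualLevelMemory:
    """Two memory stores, each with its own learned retrieval policy."""

    def __init__(self) -> None:
        self.store: Dict[str, List[str]] = {"episode": [], "turn": []}
        self.policy: Dict[str, RetrievalPolicy] = {
            "episode": RetrievalPolicy(),
            "turn": RetrievalPolicy(),
        }

    def add(self, level: str, text: str) -> None:
        self.store[level].append(text)

    def retrieve(self, level: str, query: str, rng: random.Random) -> Tuple[int, str]:
        idx = self.policy[level].sample(query, self.store[level], rng)
        return idx, self.store[level][idx]

    def reinforce(self, level: str, query: str, idx: int, reward: float) -> None:
        self.policy[level].update(query, self.store[level], idx, reward)


def proxy_turn_reward(query_executed_ok: bool) -> float:
    """Placeholder for a process reward model: a dense per-turn signal."""
    return 1.0 if query_executed_ok else -1.0
```

In this sketch, episode-level retrieval would be called once at the start of a task and reinforced with the final outcome, while turn-level retrieval would be called at each turn with the current state and reinforced with the dense proxy reward. Keeping a separate policy per level is one simple way to let the two horizons learn different notions of relevance.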
On the BIRD-Interact dataset, MERIT outperforms no-memory, static-retrieval, and dynamic-retrieval baselines, achieving higher success rates with fewer average interaction turns. Transfer tests on Spider2-Snow further indicate positive cross-benchmark performance without benchmark-specific tuning. These findings underscore the effectiveness of multi-horizon retrieval for experience reuse in interactive text-to-SQL agents.
Source: arXiv
Generated at: 2026-06-02 00:00:00 UTC





