arXiv

Online Skill Learning for Web Agents via State-Grounded Dynamic Retrieval

June 4, 2026 · Jiaxi Li, Ke Deng, Yun Wang, Jingyuan Huang, Yucheng Shi, Qiaoyu Tan, Jin Lu, Ninghao Liu

Title: Enhancing Web Agent Performance Through State-Grounded Dynamic Retrieval for Online Skill Acquisition

Abstract:

To improve multi-step web automation across similar tasks, language agents increasingly rely on reusable skills. A growing line of research focuses on online skill learning, in which agents generate new skills from past task trajectories and apply them in real time to subsequent tasks. However, current approaches operate predominantly at the task level: they retrieve a fixed set of skills based on the initial instruction and keep that set unchanged throughout execution. This rigid strategy is ill-suited to web environments, where the best next action depends not only on the overall goal but also on the current webpage state. Because web pages frequently evolve into situations that the initially retrieved skills did not anticipate, static retrieval often proves insufficient.

To bridge this gap, we introduce State-Grounded Dynamic Retrieval (SGDR), an online skill learning framework designed to facilitate stepwise skill reuse for web agents. SGDR integrates three core components: a sliding-window extraction mechanism that converts completed trajectories into reusable sub-procedures callable at intermediate states; a dual text-code representation system that links skill retrieval directly to executable actions; and a state-grounded dynamic retrieval mechanism that aligns skills with both the task objective and the live webpage state.
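The three components can be made concrete with a small sketch. The code below is an illustrative toy under stated assumptions, not the authors' implementation: the `Skill` class, the function names, the token-overlap scoring, the `state_weight` parameter, and the example trajectory are all hypothetical, and the released repository should be consulted for the actual method. It shows a sliding window turning a finished trajectory into sub-procedures, each stored as a text description paired with action code, and a retriever that re-ranks skills at every step against both the task and the current page state.

```python
import re
from dataclasses import dataclass


@dataclass
class Skill:
    description: str  # text side, matched against the task and page state
    code: str         # code side, the action sequence to execute


def tokenize(text):
    """Lowercase set of alphanumeric tokens (a stand-in for a real encoder)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def extract_skills(task, trajectory, window=2):
    """Slide a fixed-size window over (state, action) steps.

    Each window becomes one skill whose description records the task and the
    state at which the window starts, so it can be invoked mid-task.
    """
    skills = []
    for start in range(len(trajectory) - window + 1):
        steps = trajectory[start:start + window]
        first_state = steps[0][0]
        actions = "\n".join(action for _, action in steps)
        skills.append(Skill(f"{task}: {first_state}", actions))
    return skills


def score(skill, task, state, state_weight=0.5):
    """Blend task overlap and live-state overlap into one relevance score."""
    words = tokenize(skill.description)
    if not words:
        return 0.0
    task_overlap = len(words & tokenize(task)) / len(words)
    state_overlap = len(words & tokenize(state)) / len(words)
    return (1 - state_weight) * task_overlap + state_weight * state_overlap


def retrieve(skills, task, state, k=1):
    """Re-rank the skill library against the current state and keep the top k."""
    ranked = sorted(skills, key=lambda s: score(s, task, state), reverse=True)
    return ranked[:k]


# Hypothetical completed trajectory: (page state summary, action code).
past_task = "buy a phone case"
past_trajectory = [
    ("shop home page with search box", "click('search_box')"),
    ("search box focused", "type('search_box', query)"),
    ("search results list of products", "click('first_result')"),
    ("product page with add to cart button", "click('add_to_cart')"),
]
library = extract_skills(past_task, past_trajectory, window=2)

# The task stays fixed while the page state changes from step to step.
new_task = "buy a usb cable"
for state in ("shop home page with search box", "search results list of products"):
    best = retrieve(library, new_task, state)[0]
    print(state, "->", best.code.splitlines()[0])
```

In this toy setup every skill shares the same task overlap, so a task-level retriever could not tell them apart; only the state term changes the ranking as the page changes. That is the gap the stepwise design is meant to close.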

We evaluated SGDR on the WebArena benchmark across five distinct domains. The results demonstrate that SGDR consistently surpasses strong baseline methods, achieving average success rates of 37.5% using GPT-4.1 and 24.3% with Qwen3-4B. These figures represent relative improvements of 10.6% and 10.0%, respectively, over the most effective baseline. The implementation code is publicly accessible at https://github.com/plusnli/skill-dynamic-retrieval.
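Because the gains are reported as relative improvements rather than percentage points, it can help to see what they imply about the strongest baseline. The snippet below is a back-of-the-envelope check that assumes the relative improvement is defined as (new - baseline) / baseline; the function name is hypothetical, and the resulting baseline figures are approximate values inferred from rounded numbers, not results quoted from the paper.

```python
def implied_baseline(success_rate, relative_gain):
    """Invert new = baseline * (1 + gain) to recover the baseline rate."""
    return success_rate / (1.0 + relative_gain)


# Reported SGDR success rates (percent) and relative gains from the abstract.
gpt41_baseline = implied_baseline(37.5, 0.106)
qwen_baseline = implied_baseline(24.3, 0.100)

print(f"GPT-4.1 implied baseline: {gpt41_baseline:.1f}%")
print(f"Qwen3-4B implied baseline: {qwen_baseline:.1f}%")
```

Under that assumption, the absolute gaps are a few percentage points in each setting, which is the scale to keep in mind when reading the relative figures.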
