arXiv

Learning While Acting: A Skill-Enhanced Test-Time Co-Evolution Framework for Online Lifelong Learning Agents

June 4, 2026 · Bo Mao, Jie Zhou, Yutao Yang, Xin Li, Xian Wei, Qin Chen, Xingjiao Wu, Liang He

Title: Learning While Acting: A Skill-Enhanced Test-Time Co-Evolution Framework for Online Lifelong Learning Agents

Abstract:

For Large Language Model (LLM) agents navigating dynamic, interactive settings, lifelong learning is indispensable. Nevertheless, current lifelong learning agents designed for long-horizon tasks generally rely on retrieving discrete skills or past experiences using static parameters during the inference phase. This approach hinders their ability to continuously internalize feedback received at test time, a capability characteristic of human learners. To address this limitation, we introduce Skill-enhanced Test-Time Co-Evolution (LifeSkill), a two-stage reinforcement learning framework tailored for Online Lifelong Learning Agents.

Our approach features Verifier-Guided Skill Learning, a mechanism designed to overcome the scarcity of direct supervision in skill extraction. By rewarding candidate skills based on the average success rate of verifiers across multiple skill-conditioned policy rollouts, this method incentivizes the model to produce skills that are genuinely effective for task resolution, rather than simply appearing plausible in textual form. Additionally, we propose Online Skill Internalization, a process that enhances the policy model during test-time interactions by converting skill-conditioned trajectories into reward signals. This strategy allows the agent to embed reasoning capabilities directly into its parameters, thereby circumventing the issue of context bloat associated with experience retrieval. Evaluations on LifelongAgentBench demonstrate that LifeSkill achieves a 7-point absolute improvement in average performance compared to existing lifelong agent baselines.
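
The verifier-based reward described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `skill_reward`, the `rollout` and `verifiers` callables, and the default of four rollouts are all hypothetical, and the sketch assumes the reward is the verifier pass rate averaged uniformly over rollouts.

```python
from typing import Callable, List


def skill_reward(
    skill: str,
    rollout: Callable[[str], str],
    verifiers: List[Callable[[str], bool]],
    num_rollouts: int = 4,
) -> float:
    """Score a candidate skill by the average verifier pass rate.

    Assumption: `rollout(skill)` runs the policy conditioned on the skill
    and returns a trajectory, and each verifier maps a trajectory to
    True (success) or False (failure). Both are hypothetical stand-ins.
    """
    if num_rollouts <= 0:
        raise ValueError("num_rollouts must be positive")
    if not verifiers:
        raise ValueError("at least one verifier is required")

    total = 0.0
    for _ in range(num_rollouts):
        trajectory = rollout(skill)
        # Fraction of verifiers that accept this trajectory.
        passed = sum(1 for verify in verifiers if verify(trajectory))
        total += passed / len(verifiers)

    # Average over rollouts, so the reward lies in [0, 1].
    return total / num_rollouts


# Toy stand-ins for a policy and a verifier (illustrative only).
def toy_rollout(skill: str) -> str:
    return "done" if "check" in skill else "fail"


def toy_verifier(trajectory: str) -> bool:
    return trajectory == "done"
```

Under these assumptions, a skill that only reads well but does not help the policy complete the task earns a low reward, because the score depends on rollout outcomes rather than on the skill text itself.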
