arXiv

SkillRevise: Improving LLM-Authored Agent Skills via Trace-Conditioned Skill Revision

June 2, 2026 · Yuxuan Liu, Zhaochen Su, Lingyun Xie, Yuhao Zhang, Qing Zong, Jiahe Guo, Zhongwei Xie, Yiyan Ji, Yauwai Yim, Hongyu Luo, Xiyu Ren, Ruan Chenyu, Haoran Li, Yangqiu Song

Title: SkillRevise: Improving LLM-Authored Agent Skills via Trace-Conditioned Skill Revision

Abstract: Agent skills function as procedural tools that enable Large Language Model (LLM) agents to carry out workflows, enforce constraints, and handle failures. Current self-evolving approaches improve skills by leveraging accumulated trajectories, but they struggle in cold-start scenarios where only a single, flawed initial skill is available. As a result, skill creation typically relies on either expert authoring or single-shot LLM generation. The former is expensive and often misaligned with the actual execution patterns of LLM agents, whereas the latter, though syntactically correct, frequently lacks behavioral robustness. To close this gap, we introduce SkillRevise, a framework grounded in execution data that iteratively improves initial skills. SkillRevise identifies defects using execution evidence, retrieves relevant repair strategies from a general memory repository, and applies edits anchored in that evidence. It then re-runs each candidate skill, assesses its empirical utility, and keeps the most effective version. Experiments across five LLMs and three benchmarks show that SkillRevise significantly surpasses one-shot baselines, raising the base agent’s success rate on SkillsBench from 36.05% to 61.63% (a gain of 25.58 percentage points). The refined skills also transfer well across different models, indicating that they capture generalized procedural knowledge rather than model-specific quirks.
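
The revise-and-keep-best loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `Trace` shape, the `run`, `diagnose`, `retrieve`, and `edit` callables, the `rounds` parameter, and the use of success rate as the utility measure are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Trace:
    """Execution evidence from one run of a skill (hypothetical shape)."""
    task_id: str
    success: bool
    log: str


def success_rate(traces: Sequence[Trace]) -> float:
    """Empirical utility, assumed here to be the fraction of successful runs."""
    return sum(t.success for t in traces) / len(traces) if traces else 0.0


def revise_skill(
    initial_skill: str,
    run: Callable[[str], List[Trace]],                   # executes a skill, returns traces
    diagnose: Callable[[str, List[Trace]], List[str]],   # defects found in the traces
    retrieve: Callable[[List[str]], List[str]],          # repair strategies from memory
    edit: Callable[[str, List[str], List[str]], str],    # produces a revised skill
    rounds: int = 3,                                     # assumed revision budget
) -> str:
    """Iteratively revise a skill and keep the best-scoring version."""
    best_skill = initial_skill
    best_traces = run(best_skill)
    best_score = success_rate(best_traces)

    for _ in range(rounds):
        defects = diagnose(best_skill, best_traces)
        if not defects:
            break  # nothing left to repair according to the traces
        strategies = retrieve(defects)
        candidate = edit(best_skill, defects, strategies)

        # Re-run the candidate and keep it only if it scores strictly higher.
        candidate_traces = run(candidate)
        candidate_score = success_rate(candidate_traces)
        if candidate_score > best_score:
            best_skill, best_traces, best_score = (
                candidate, candidate_traces, candidate_score
            )

    return best_skill
```

In this sketch a candidate replaces the current skill only when its measured success rate strictly improves, so a revision that does not help is discarded and the previous best version is retained.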
