arXiv

"Do Not Mention This to the User": Detecting and Understanding Malicious Agent Skills

June 2, 2026 · Yi Liu, Zhihao Chen, Yanjun Zhang, Gelei Deng, Yuekang Li, Jianting Ning, Leo Yu Zhang · Original Source

Title: "Do Not Mention This to the User": Detecting and Understanding Malicious Agent Skills

Abstract:

Large language model (LLM) coding agents are increasingly dependent on third-party extensions known as "skills." These skills combine helper scripts with natural language instructions and operate with full user privileges. While community registries have developed to facilitate the distribution of these tools, their security implications have largely gone unexamined, primarily because of a lack of labeled threat data. This study provides a comprehensive security assessment of 98,380 skills sourced from two prominent registries. By employing a hybrid approach of static pattern matching and dynamic behavioral verification, we isolated 157 skills demonstrating confirmed malicious activity. These threats comprised 632 distinct vulnerabilities distributed across 13 different attack methods.

Our findings indicate that these threats are intentional rather than accidental. On average, each malicious skill contained 4.03 vulnerabilities that spanned multiple stages of an attack. We identified two primary attack vectors, which exhibited a statistically significant negative correlation: credential theft facilitated by remote code execution, and the manipulation of agents through adversarial instructions hidden within documentation. Notably, more than half of the confirmed incidents stemmed from a single threat actor who deployed templated brand impersonation on a large scale.

Additionally, we observed a direct relationship between the sophistication of an attack and the effort invested in concealing it. Highly advanced skills consistently utilized undocumented capabilities alongside the exploitation of trust mechanisms native to the platform. After we initiated responsible disclosure procedures, registry administrators removed all 157 reported skills, achieving a 100% removal rate. To support ongoing research into the security of LLM agent ecosystems, we have made both our detection pipeline and dataset publicly accessible.

Source: arXiv Generated at: 2026-06-02 00:00:00 UTC