Global News Digest

arXiv

When Safe Skills Collide: Measuring Compositional Risk in Agent Skill Ecosystems

Abstract:

Large Language Model (LLM) agents are increasingly dependent on skills contributed by the community to broaden their operational capabilities. This study addresses a fundamental safety challenge in agentic AI systems: the potential for individually safe skills to combine into unsafe installed skill sets. To investigate this, we introduce SkillReact, a framework for measuring compositional security that comprises three distinct elements: a deterministic static-composition benchmark, an action-based exploitability harness, and a two-rater LLM-assisted pipeline for human adjudication.

Our analysis covers 1,520 skills from ClawHub. Of these, 651 passed individual inspection, yielding 211,575 distinct pairs. The benchmark identified 22.25% of these pairs as structural candidates for risk. We interpret this raw rate as the ceiling for a recall-oriented scanner and calibrate it against human judgment. In a pattern-stratified audit, approximately one in five flagged pair-pattern hits was confirmed as a genuine compositional risk, resulting in a population-weighted validity of 18.2%, our primary finding. This suggests that roughly 14,000 genuine risk memberships exist within a single registry. Notably, per-skill scanning fails to detect these risks by design, because each skill in a flagged pair is safe in isolation.
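
The pair count quoted above can be checked with a few lines of arithmetic. The sketch below is only a sanity check on the abstract's figures, not the authors' code, and the variable names are illustrative. It does not try to reproduce the figure of roughly 14,000, which counts pair-pattern memberships rather than pairs and depends on audit data the abstract does not give.

```python
from math import comb

# Figures quoted in the abstract.
passed_inspection = 651      # skills that passed individual inspection
flag_rate = 0.2225           # share of pairs flagged as structural candidates

# Unordered pairs of distinct skills: C(651, 2).
distinct_pairs = comb(passed_inspection, 2)

# Approximate number of flagged pairs implied by the raw 22.25% rate.
flagged_pairs = round(distinct_pairs * flag_rate)

print(distinct_pairs)   # 211575
print(flagged_pairs)    # about 47,075 flagged pairs
```

The first result matches the 211,575 pairs reported, which confirms that the pairs are unordered combinations of the 651 individually safe skills.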

Subsequently, an action-based harness evaluated when these candidates translate into model-issued tool calls. The results indicated that realization is gated by the host model’s disposition. On an anchor-conditioned dropper subset, Haiku-4-5 issued the dropper-stage tool call in all 39 direct-prompt trials (comprising 36 full download-then-execute chains and 3 download-only instances). In contrast, Opus-4-7 halted at the download stage, and Sonnet-4-6 refused the request entirely. A control experiment, which kept the request fixed while varying only the installed skills, revealed that compliance was highest when no skills were installed. This demonstrates that while composition determines which capabilities are reachable, the host model ultimately decides whether to utilize them. These findings underscore the necessity of install-time compositional checks and capability isolation as essential complements to traditional per-skill scanning.
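
To make the idea of an install-time compositional check concrete, here is a minimal sketch of a pairwise capability check. It is not the SkillReact benchmark: the skill names, the capability labels, and the single "dropper" pattern are all hypothetical, chosen only to mirror the download-then-execute chain described above.

```python
from itertools import combinations

# Hypothetical installed skills mapped to hypothetical capability labels.
INSTALLED_SKILLS = {
    "fetcher": {"network_download"},
    "runner": {"shell_execute"},
    "notes": {"file_read"},
}

# Hypothetical risk pattern: capabilities that are only risky in combination.
DROPPER_PATTERN = frozenset({"network_download", "shell_execute"})


def flag_pairs(skills, pattern):
    """Return skill pairs whose combined capabilities cover the pattern
    while neither skill covers it on its own."""
    flagged = []
    for a, b in combinations(sorted(skills), 2):
        caps_a, caps_b = skills[a], skills[b]
        covered_alone = pattern <= caps_a or pattern <= caps_b
        covered_together = pattern <= (caps_a | caps_b)
        if covered_together and not covered_alone:
            flagged.append((a, b))
    return flagged


print(flag_pairs(INSTALLED_SKILLS, DROPPER_PATTERN))
# [('fetcher', 'runner')]
```

A check of this kind only reports which capability combinations are reachable. As the harness results above indicate, whether the chain is actually executed still depends on the host model.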


Source: arXiv
Generated at: 2026-06-02 00:00:00 UTC
