arXiv

Description-Code Inconsistency in Real-world MCP Servers: Measurement, Detection, and Security Implications

June 4, 2026 · Yutao Shi, Xiaohan Zhang, Xiangjing Zhang, Xihua Shen, Hui Ouyang, Huming Qiu, Mi Zhang, Min Yang

Abstract:

The Model Context Protocol (MCP) has established itself as a vital standard, enabling Large Language Models (LLMs) to leverage external tools. Within this framework, LLMs depend on natural language descriptions supplied by MCP servers to identify and execute specific functions. This process rests on the implicit assumption that these descriptions accurately mirror the underlying code implementations; however, this alignment is not rigorously verified in practical deployments. Consequently, MCP systems are vulnerable to a phenomenon known as Description-Code Inconsistency (DCI), wherein a tool’s stated capabilities and security parameters diverge from its actual code behavior.
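To make the notion concrete, the following is a minimal, hypothetical illustration of DCI, not an example taken from the measured servers. The tool name `read_note`, its description, and the `access.log` file are all assumptions made up for this sketch. The description promises a read-only operation, while the code also writes to disk.

```python
import os

# Hypothetical tool metadata: this description is all the LLM sees
# when deciding whether and how to call the tool.
TOOL = {
    "name": "read_note",
    "description": "Read a note from disk and return its text. Read-only.",
}


def read_note(path: str, log_dir: str) -> str:
    """Implementation behind TOOL; it does more than the description states."""
    with open(path, encoding="utf-8") as f:
        text = f.read()
    # Undeclared side effect: an access log entry is written,
    # which contradicts the "Read-only" claim in the description.
    with open(os.path.join(log_dir, "access.log"), "a", encoding="utf-8") as log:
        log.write(path + "\n")
    return text
```

An LLM that trusts the description would treat `read_note` as safe to call freely, although every call modifies the file system.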

This study offers a thorough examination of DCI within live MCP server environments. We provide a formal definition of the issue and introduce a comprehensive taxonomy that categorizes inconsistencies into functional discrepancies and undeclared side effects. Leveraging this taxonomy, we engineered DCIChecker, an automated framework that integrates structure-aware static analysis with a Direct-Reverse-Arbitration prompting technique to cross-verify tool descriptions against their corresponding code.
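As a rough sketch of what cross-checking a description against code could look like, the snippet below walks a tool's syntax tree with Python's standard `ast` module and flags side-effecting calls that the description never mentions. This is a simplified stand-in, not DCIChecker itself: the `SIDE_EFFECT_CALLS` table, the keyword matching, and the function names are assumptions for illustration, and the Direct-Reverse-Arbitration prompting step is not reproduced here.

```python
import ast

# Hypothetical table: call name -> (side-effect label, description keywords
# that would count as declaring it). A real checker would need far more.
SIDE_EFFECT_CALLS = {
    "os.remove": ("file deletion", ("delete", "remove")),
    "subprocess.run": ("command execution", ("execute", "command", "shell")),
    "requests.post": ("network upload", ("upload", "send", "post")),
}


def call_name(node: ast.Call) -> str:
    """Return a dotted name such as 'os.remove' for a call node, or ''."""
    func = node.func
    parts = []
    while isinstance(func, ast.Attribute):
        parts.append(func.attr)
        func = func.value
    if isinstance(func, ast.Name):
        parts.append(func.id)
        return ".".join(reversed(parts))
    return ""


def undeclared_side_effects(description: str, source: str) -> list[str]:
    """List side effects found in the code but not hinted at in the description."""
    desc = description.lower()
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        entry = SIDE_EFFECT_CALLS.get(call_name(node))
        if entry is None:
            continue
        label, keywords = entry
        if not any(keyword in desc for keyword in keywords):
            findings.append(label)
    return findings
```

Keyword matching is deliberately naive; it only shows where a semantic comparison, such as the prompting technique the paper describes, would plug in.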

We evaluated DCIChecker on a dataset of 19,200 description-code pairs sourced from 2,214 real-world MCP servers. Our findings indicate that DCI is prevalent, affecting 9.93% of the analyzed pairs. Furthermore, we show that DCI creates a significant defense blind spot, enabling a spectrum of risks ranging from operational failures to covert malicious activities. To address these challenges, we outline mitigation strategies aimed at enforcing semantic consistency and bolstering the reliability of the emerging agentic ecosystem.
