Capability and Robustness Cannot Both Be Free: An Information-Theoretic Bound for Vision-Language-Action Models
Title: The Inevitable Trade-off: An Information-Theoretic Limit on Capability and Robustness in Vision-Language-Action Models
Abstract:
Vision-Language-Action (VLA) models perform well on clean inputs yet remain highly vulnerable to small adversarial perturbations. For instance, a PGD attack with a perturbation budget of $16/255$ drops the success rate of OpenVLA-7B on the LIBERO benchmark from $95\%$ to below $5\%$. Whether this trade-off is subject to a fundamental limit has long been an open question; we establish that it is. For any VLA policy, the sum of its capability, defined as $I(\Astar;\Api)$, and its robustness, quantified as $I(\Api;\Atildepi)-I(\Api;\delta)$, is bounded above by $H(\Astar)+I(X;\Xtilde)$, the sum of the task entropy and the adversarial channel capacity. The derivation relies on two applications of the Data Processing Inequality.
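Written as a display, the bound stated above reads as follows. The two lines after it are only a sketch of why a bound of this shape is plausible, under assumptions that this abstract does not state: that $\Astar$ is discrete, and that the clean and perturbed actions form a Markov chain $\Api - X - \Xtilde - \Atildepi$. The paper's own proof may proceed differently.
\[
\underbrace{I(\Astar;\Api)}_{\text{capability}} + \underbrace{I(\Api;\Atildepi) - I(\Api;\delta)}_{\text{robustness}} \;\le\; H(\Astar) + I(X;\Xtilde)
\]
\begin{align*}
I(\Astar;\Api) &\le H(\Astar), \\
I(\Api;\Atildepi) - I(\Api;\delta) &\le I(\Api;\Atildepi) \le I(X;\Atildepi) \le I(X;\Xtilde).
\end{align*}
The first line holds because mutual information with a discrete variable cannot exceed that variable's entropy. The second line drops the non-negative term $I(\Api;\delta)$ and then applies the Data Processing Inequality twice, once along $\Api - X - \Atildepi$ and once along $X - \Xtilde - \Atildepi$. Adding the two lines gives the displayed bound.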
While the pixel-level bound is only a loose ceiling, with a gap of approximately $10^3$ nats, an encoder-specific corollary tightens it by more than an order of magnitude. In this tighter regime, realized capability already accounts for $5\%$ to $9\%$ of the total information budget. We empirically validate Theorem~\ref{thm:main}, observing zero violations across 308 test cells: 252 closed-form Gaussian-VLA configurations, 48 OpenVLA-7B cells on LIBERO under PGD attacks (4 suites $\times$ 4 $\eps$ values $\times$ 3 seeds), 4 Square-Attack instances, and 4 multi-step scenarios ($T=10$).
Furthermore, a complementary measurability inequality, $\Rob_{\text{disc}} \le \Cap_{\text{disc}}$, holds across 144 cross-architecture cells spanning OpenVLA, OpenVLA-OFT (continuous $L_1$ regression), and SmolVLA (flow matching). The framework also yields three label-free diagnostic tools: a pre-flight encoder ceiling, a defense-forensics probe that distinguishes input-side from language-model interventions, and a head-agnostic robustness ratio that allows consistent comparison across discrete-token, $L_1$-regression, and flow-matching policies. Together, these results provide a single axis along which defenses and architectures can be compared.
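To make the flavor of a closed-form Gaussian check concrete, the following is a minimal sketch of a scalar linear-Gaussian toy model. It is an illustrative assumption, not the paper's Gaussian-VLA configuration: the model, the function names (`toy_cell`, `count_violations`), and the parameter grid are all hypothetical. It checks only the robustness side of the bound, $I(\Api;\Atildepi)-I(\Api;\delta) \le I(X;\Xtilde)$; the capability term is omitted because $H(\Astar)$ would be a differential entropy in a continuous toy model.

```python
import math


def gaussian_mi(rho_sq):
    """Mutual information in nats of two jointly Gaussian scalars,
    given their squared correlation coefficient."""
    return -0.5 * math.log(1.0 - rho_sq)


def toy_cell(var_x, var_delta, w, var_noise):
    """Hypothetical toy cell (not the paper's setup).

    X ~ N(0, var_x), delta ~ N(0, var_delta), independent of X.
    Perturbed input:  X_tilde = X + delta.
    Clean action:     A       = w * X       + n1.
    Perturbed action: A_tilde = w * X_tilde + n2.
    n1, n2 ~ N(0, var_noise), independent of everything else.
    Returns (robustness, channel) in nats.
    """
    var_a = w**2 * var_x + var_noise
    var_a_tilde = w**2 * (var_x + var_delta) + var_noise
    cov_actions = w**2 * var_x  # only the shared X contributes
    rho_sq_actions = cov_actions**2 / (var_a * var_a_tilde)
    rho_sq_inputs = var_x / (var_x + var_delta)

    # The clean action A does not depend on delta here, so I(A; delta) = 0
    # and the robustness term reduces to I(A; A_tilde).
    robustness = gaussian_mi(rho_sq_actions)
    channel = gaussian_mi(rho_sq_inputs)  # I(X; X_tilde)
    return robustness, channel


def count_violations(tol=1e-12):
    """Count grid cells where robustness exceeds the channel term."""
    violations = 0
    for var_delta in (0.01, 0.1, 1.0):
        for w in (0.5, 1.0, 2.0):
            for var_noise in (0.01, 0.1, 1.0):
                rob, chan = toy_cell(1.0, var_delta, w, var_noise)
                if rob > chan + tol:
                    violations += 1
    return violations


if __name__ == "__main__":
    print("violations:", count_violations())
```

In this toy model the inequality follows directly from the Data Processing Inequality, so the count is zero by construction. The sketch illustrates how such a cell can be evaluated in closed form, not an independent confirmation of the theorem.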
Source: arXiv Generated at: 2026-06-02 00:00:00 UTC





