Global News Digest

arXiv

Detection vs. Execution: Single-Bucket Probes Miss Half the Mamba-2 State Sink

Abstract:

Mechanistic interpretability frequently assumes that a probe capable of identifying a specific representational signature also isolates the circuit responsible for the corresponding computation. We demonstrate that this assumption breaks down systematically within the Mamba-2 architecture. By investigating the "state sink" (disproportionate Delta-gate activation on boundary tokens, analogous to the attention sink), we show that single-bucket probes capture only a small execution layer while overlooking a significantly larger detection layer that shares the same representational signature.

In Mamba-2, the state sink splits into two distinct functional groups of heads. BOS-specialist heads, which constitute approximately 5% of the heads in the 2.7B model, are causally responsible for both BOS-context and newline-target predictions across various model scales and datasets. In contrast, dual heads, which account for 27–35% of the heads and are identified through multi-class aggregation of the same probe, exhibit stronger representational similarity between BOS and newline tokens but demonstrate substantially weaker causal influence when subjected to ablation. This finding underscores that representational similarity does not equate to functional equivalence.

This distinction has critical implications for downstream performance: ablating BOS-specialist heads causes RULER NIAH retrieval accuracy to plummet from 1.00 to 0.00 at a context length of 1024 in both Mamba-1 (2.8B) and Mamba-2 (2.7B). Conversely, ablating size-matched complementary heads leaves baseline performance intact. A random channel-bucketing control helps rule out substrate granularity as the sole factor, pointing instead to Mamba-2’s head-shared Delta projection as a key element. Ultimately, while probe-derived specialty can identify execution circuits, the same probe at coarse granularity also recovers detection circuits; distinguishing between the two requires class-conditional ablation rather than class-conditional cosine similarity.
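The contrast between class-conditional cosine similarity and class-conditional ablation can be illustrated with a toy sketch. This is not the paper's code: the two-head setup, the activation values, the readout in `toy_loss`, and all function names are hypothetical and chosen only to show how a head can look similar across BOS and newline tokens while having no causal effect.

```python
import numpy as np

def class_conditional_cosine(acts, labels, class_a, class_b):
    """Per-head cosine similarity between the mean activations of two token classes.

    acts: (tokens, heads, dim) array; labels: (tokens,) array of class ids.
    """
    mean_a = acts[labels == class_a].mean(axis=0)  # (heads, dim)
    mean_b = acts[labels == class_b].mean(axis=0)
    num = (mean_a * mean_b).sum(axis=-1)
    den = np.linalg.norm(mean_a, axis=-1) * np.linalg.norm(mean_b, axis=-1)
    return num / den

def class_conditional_ablation(loss_fn, acts, labels, target_class):
    """Per-head increase in mean loss on target-class tokens when that head is zeroed."""
    mask = labels == target_class
    base = loss_fn(acts)[mask].mean()
    effects = np.empty(acts.shape[1])
    for h in range(acts.shape[1]):
        ablated = acts.copy()
        ablated[:, h, :] = 0.0  # zero-ablate head h only
        effects[h] = loss_fn(ablated)[mask].mean() - base
    return effects

def toy_loss(acts):
    """Hypothetical readout: per-token loss depends on head 0 only."""
    return (1.0 - acts[:, 0, :].sum(axis=-1)) ** 2

BOS, NEWLINE = 0, 1
labels = np.array([BOS, BOS, NEWLINE, NEWLINE])
acts = np.zeros((4, 2, 2))
acts[labels == BOS, 0] = [1.0, 0.0]      # head 0: distinct directions per class
acts[labels == NEWLINE, 0] = [0.0, 1.0]
acts[:, 1] = [1.0, 1.0]                  # head 1: identical across classes

cos = class_conditional_cosine(acts, labels, BOS, NEWLINE)
effect_bos = class_conditional_ablation(toy_loss, acts, labels, BOS)
print("cosine per head:", cos)
print("ablation effect on BOS tokens per head:", effect_bos)
```

In this toy setup, head 1 has the higher BOS/newline similarity yet ablating it leaves the loss unchanged, whereas head 0 has low similarity but its ablation raises the loss. Only the ablation measurement separates the two.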


Source: arXiv
Generated at: 2026-06-02 00:00:00 UTC
