arXiv

Step-Level Sparse Autoencoder for Reasoning Process Interpretation

Title: Disentangling Reasoning Steps with Sparse Autoencoders

Abstract: Although Large Language Models (LLMs) have demonstrated robust complex reasoning abilities via Chain-of-Thought (CoT) strategies, their internal reasoning patterns remain difficult to decipher. While Sparse Autoencoders (SAEs) have become a potent instrument for interpretability, current methods primarily function at the token level. This creates a granularity mismatch when attempting to capture crucial step-level data, such as semantic transitions and reasoning direction. To address this, we introduce the Step-level Sparse Autoencoder (SSAE), an analytical framework designed to separate various facets of an LLM’s reasoning steps into distinct sparse features. By precisely regulating the sparsity of step features relative to their context, we establish an information bottleneck during step reconstruction. This mechanism isolates incremental information from background noise, distributing it across several sparsely activated dimensions. Our experiments, conducted across various base models and reasoning tasks, validate the utility of the extracted features. Through linear probing, we successfully predict both surface-level metrics, including generation length and the distribution of the first token, and more complex attributes, such as the logicality and correctness of the step. These findings suggest that LLMs possess at least a partial awareness of these properties during the generation process, thereby laying the groundwork for their self-verification capabilities. The code is accessible at https://github.com/Miaow-Lab/SSAE.
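Illustrative note on linear probing: the abstract reports that step properties such as correctness can be predicted from the extracted features by linear probing, that is, by fitting a linear classifier on frozen feature vectors. The following is a minimal sketch of that general technique, not the authors' code. The data is synthetic, and every name in it (`make_synthetic_steps`, `train_linear_probe`, `probe_accuracy`, the 16-dimensional feature size, the choice of feature 0 as the informative dimension) is a hypothetical stand-in for real SSAE step features and labels.

```python
import math
import random


def train_linear_probe(features, labels, lr=0.5, epochs=200):
    """Fit a logistic-regression probe on frozen features via batch gradient descent."""
    dim = len(features[0])
    weights = [0.0] * dim
    bias = 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w = [0.0] * dim
        grad_b = 0.0
        for x, y in zip(features, labels):
            logit = bias + sum(w * xi for w, xi in zip(weights, x))
            pred = 1.0 / (1.0 + math.exp(-logit))
            err = pred - y  # gradient of the log loss with respect to the logit
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def probe_accuracy(weights, bias, features, labels):
    """Fraction of examples whose predicted class (logit > 0) matches the label."""
    correct = 0
    for x, y in zip(features, labels):
        logit = bias + sum(w * xi for w, xi in zip(weights, x))
        correct += int((logit > 0) == (y == 1))
    return correct / len(labels)


def make_synthetic_steps(n, dim, rng):
    """Hypothetical stand-in for step features: sparse, non-negative vectors.

    Assumption for illustration only: feature 0 is active exactly on steps
    labelled 1 ("correct"); three other random dimensions carry noise.
    """
    features, labels = [], []
    for _ in range(n):
        y = rng.randint(0, 1)
        x = [0.0] * dim
        for j in rng.sample(range(1, dim), 3):
            x[j] = rng.random()
        if y == 1:
            x[0] = 1.0 + rng.random()
        features.append(x)
        labels.append(y)
    return features, labels


rng = random.Random(0)
train_x, train_y = make_synthetic_steps(200, 16, rng)
test_x, test_y = make_synthetic_steps(100, 16, rng)
weights, bias = train_linear_probe(train_x, train_y)
print("held-out probe accuracy:", probe_accuracy(weights, bias, test_x, test_y))
```

The probe only reads the features and never updates whatever produced them. That is why high probe accuracy is usually taken as evidence that the property is linearly encoded in the representation rather than learned by the probe itself.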


Source: arXiv
Generated at: 2026-06-02 00:00:00 UTC
