arXiv

Reasoning Structure of Large Language Models

June 3, 2026 · Frédéric Berdoz, Luca A. Lanzendörfer, Fabian Farestam, Roger Wattenhofer · Original Source

Title: The Reasoning Architecture of Large Language Models

Abstract

Standard evaluations of Large Reasoning Models (LRMs) typically rely on indicators such as the number of tokens generated or the accuracy of final answers. Yet identical scores on these metrics can mask significant differences in the underlying reasoning frameworks. To overcome this blind spot, we present a scalable benchmark based on logic puzzles alongside a methodology that transforms unstructured reasoning traces into verifiable graphs of claims and their dependencies. This approach converts the reasoning process into a structured, quantifiable entity, enabling the statistical analysis of its topology. Utilizing this framework, we propose a metric for reasoning efficiency that measures the density of the model’s logical progression. Our examination of open-source reasoning models reveals that these structural measurements distinguish behavioral patterns that are otherwise conflated by traditional token counts and accuracy rates, offering a robust instrument for identifying failure modes and assessing how reasoning capabilities evolve alongside increasing puzzle complexity.
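
To make the idea of a claim-dependency graph concrete, the following is a minimal sketch and not the authors' implementation. The `Claim` schema, the well-formedness check, and the `illustrative_density` ratio are all assumptions introduced here for illustration; the abstract does not define the actual extraction pipeline or the actual efficiency metric. The sketch treats each extracted claim as a node that lists the earlier claims it relies on, then reports the fraction of claims that the final answer actually depends on.

```python
from dataclasses import dataclass, field


@dataclass
class Claim:
    """One statement extracted from a reasoning trace (hypothetical schema)."""
    cid: int
    text: str
    depends_on: list[int] = field(default_factory=list)


def is_well_formed(claims: list[Claim]) -> bool:
    """Check that every dependency points to an earlier claim (so the graph is acyclic)."""
    seen: set[int] = set()
    for claim in claims:
        if any(dep not in seen for dep in claim.depends_on):
            return False
        seen.add(claim.cid)
    return True


def useful_claims(claims: list[Claim], final_id: int) -> set[int]:
    """Return the final claim plus every claim it transitively depends on."""
    by_id = {c.cid: c for c in claims}
    stack, reached = [final_id], set()
    while stack:
        cid = stack.pop()
        if cid in reached:
            continue
        reached.add(cid)
        stack.extend(by_id[cid].depends_on)
    return reached


def illustrative_density(claims: list[Claim], final_id: int) -> float:
    """Assumed stand-in metric: share of claims that contribute to the final answer."""
    return len(useful_claims(claims, final_id)) / len(claims)


# Toy trace for a house-assignment puzzle; claim 3 is a dead end.
trace = [
    Claim(1, "A lives in house 1"),
    Claim(2, "B does not live in house 1", depends_on=[1]),
    Claim(3, "C owns a red car"),
    Claim(4, "B lives in house 2", depends_on=[2]),
]

print(is_well_formed(trace))
print(illustrative_density(trace, final_id=4))
```

In this toy trace, three of the four claims feed the final answer, so the assumed ratio is 0.75. Two traces with the same token count and the same final answer could still differ on a structural measure of this kind, which is the sort of distinction the abstract describes.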
