Language Models Compare Quantities Using Number-specific and Unit-specific Heuristics
Title: Language Models Rely on Numerical and Unit-Specific Shortcuts to Compare Quantities
Abstract: Language models (LMs) must integrate numerals with symbolic unit scales to handle quantities involving measurement units, such as 1.2 m or 110 cm. This study investigates how LMs perform comparisons of such quantities across various unit systems within controlled environments. Our findings indicate that accuracy declines near the comparison boundary, a region where minor value fluctuations dictate the correct outcome. These mistakes are not random but systematic; linear surrogate models can accurately predict LM preferences based on cues related to numerical differences and unit-scale differences. Furthermore, causal interventions applied to subspaces aligned with these specific variables alter the model’s output. The evidence implies that LMs do not first convert both expressions into an exact, shared-scale representation. Instead, they rely on a collection of heuristics targeting numerals and units to make comparisons.
Source: arXiv
Generated at: 2026-06-03 00:00:00 UTC
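
To make the abstract's contrast concrete, the sketch below sets an exact shared-scale comparison beside a linear surrogate over two cues, one for the numeral difference and one for the unit-scale difference. This is an illustration under stated assumptions, not the paper's code: the unit table, the log-ratio form of the cues, the weights, and all function names are hypothetical.

```python
import math

# Hypothetical unit table (metres per unit); illustrative only, not from the paper.
UNIT_TO_METRES = {"mm": 1e-3, "cm": 1e-2, "m": 1.0, "km": 1e3}


def exact_greater(v1, u1, v2, u2):
    """Reference comparison: convert both quantities to metres, then compare."""
    return v1 * UNIT_TO_METRES[u1] > v2 * UNIT_TO_METRES[u2]


def cues(v1, u1, v2, u2):
    """Two assumed cues: log10 ratio of the numerals and log10 ratio of the unit scales."""
    numeral_cue = math.log10(v1 / v2)
    unit_cue = math.log10(UNIT_TO_METRES[u1] / UNIT_TO_METRES[u2])
    return numeral_cue, unit_cue


def surrogate_greater(v1, u1, v2, u2, w_num=1.0, w_unit=1.0):
    """Linear surrogate: predict 'first is larger' when the weighted cue sum is positive.

    With w_num == w_unit the sum equals log10 of the true ratio, so the rule is exact.
    Unequal weights (an assumption standing in for a heuristic) distort the boundary.
    """
    numeral_cue, unit_cue = cues(v1, u1, v2, u2)
    return w_num * numeral_cue + w_unit * unit_cue > 0


# 1.2 m vs 110 cm sits close to the comparison boundary (1.2 m vs 1.1 m).
print(exact_greater(1.2, "m", 110, "cm"))                  # True
print(surrogate_greater(1.2, "m", 110, "cm"))              # True (equal weights)
print(surrogate_greater(1.2, "m", 110, "cm", w_unit=0.9))  # False (unit cue underweighted)
print(surrogate_greater(5, "m", 110, "cm", w_unit=0.9))    # True (far from the boundary)
```

Because log(v1 * s1) - log(v2 * s2) splits into a numeral term plus a unit term, equal weights recover the exact comparison. In this toy setup, a slightly underweighted unit cue produces errors only near the boundary, which matches the kind of systematic pattern the abstract describes.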





