Invariant Gradient Alignment for Robust Reasoning Distillation
Title: Enhancing Robust Reasoning Distillation through Invariant Gradient Alignment
Abstract: Large language models (LLMs) are prone to shortcut learning: they systematically struggle with out-of-distribution (OOD) inputs whose surface semantics differ from the training data, even when the underlying logical structure is unchanged. This limitation hampers knowledge distillation pipelines that aim to transfer chain-of-thought reasoning to smaller student models. To address it, we propose Invariant Gradient Alignment (IGA), a novel training framework that aligns gradient updates across examples that are semantically varied yet logically isomorphic. IGA rests on three components: (1) Logical Isomer Sets, groups of problems that share the same logical structure across disparate semantic domains such as mathematics, law, medicine, and science; (2) a differentiable Continuous Gradient Conflict Mask that down-weights parameter dimensions with high cross-domain gradient variance while preserving invariant directions; and (3) a truncated Singular Value Decomposition (SVD) projection that maps the masked gradient back onto the LoRA low-rank manifold, preserving parameter efficiency. Theoretical analysis shows that IGA admits tighter OOD generalization bounds than Empirical Risk Minimization (ERM), with the improvement scaling with the number of isomer domains, and that it converges at the standard Stochastic Gradient Descent (SGD) rate under mild regularity conditions. Empirically, IGA outperforms eight baseline methods on four benchmarks, with accuracy gains of up to 14.3 percentage points over ERM-SFT. It also attains a Logical Consistency Score of 0.031, compared with 0.142 for the baseline, a roughly 4.6-fold reduction (a lower score indicates greater representational invariance).
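To make the masking idea concrete, the following is a minimal sketch of a soft per-coordinate gradient conflict mask. The abstract does not give the mask's formula, so everything here is an assumption for illustration: the function names, the exponential form, the variance-to-squared-mean conflict measure, and the `temperature` parameter are hypothetical and are not taken from the paper. The truncated SVD projection onto the LoRA manifold is omitted.

```python
import math


def soft_conflict_mask(domain_grads, temperature=1.0, eps=1e-12):
    """Return a smooth mask in (0, 1] for each gradient coordinate.

    domain_grads: list of per-domain gradient vectors (equal-length lists
    of floats), one per isomer domain. Assumed form, not the paper's.
    """
    n_domains = len(domain_grads)
    dim = len(domain_grads[0])
    mask = []
    for j in range(dim):
        column = [g[j] for g in domain_grads]
        mean = sum(column) / n_domains
        var = sum((x - mean) ** 2 for x in column) / n_domains
        # Conflict is 0 when all domains agree and grows as they disagree.
        conflict = var / (mean * mean + eps)
        # exp is smooth, so the mask stays differentiable in the gradients.
        mask.append(math.exp(-conflict / temperature))
    return mask


def masked_mean_gradient(domain_grads, temperature=1.0):
    """Average the domain gradients, then damp high-conflict coordinates."""
    mask = soft_conflict_mask(domain_grads, temperature)
    n_domains = len(domain_grads)
    mean_grad = [
        sum(g[j] for g in domain_grads) / n_domains for j in range(len(mask))
    ]
    return [m * g for m, g in zip(mask, mean_grad)]


# Three hypothetical domains: they agree on coordinate 0, conflict on 1.
grads = [[1.0, 1.0], [1.0, -1.0], [1.0, 0.5]]
update = masked_mean_gradient(grads)
```

In this toy case the first coordinate, on which all domains agree, passes through unchanged, while the second, where the domains disagree in sign and magnitude, is suppressed almost entirely. A hard threshold would have the same qualitative effect but would not be differentiable, which is the property the abstract emphasizes.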
Source: arXiv
Generated at: 2026-06-04 00:00:00 UTC






