Adaptive Causal Alignment for High-Confidence Adversarial Training
Abstract:
Inverse adversarial training typically relies on high-confidence predictions to stabilize the learning of robust models. However, our analysis exposes a fundamental paradox: such confidence often arises not from a true understanding of intrinsic object semantics, but from overfitting to non-causal background correlations. We identify visual context as a dual-natured signal that can act as either a necessary supportive prior or a spurious confounder. Consequently, existing approaches that suppress context indiscriminately are flawed: they discard the supportive prior along with the confounder, causing significant feature loss.
To address this challenge, we introduce High-Confidence Causally Aligned Training (HICAT), a unified framework designed to establish Semantic Equilibrium. HICAT operates through a "Measure-Debias-Align" pipeline. It incorporates a Learnable Background-Bias Estimator (LBBE) to adaptively assess the utility of contextual information. Based on this assessment, an Adaptive Debiasing mechanism executes precise logit rectification. Furthermore, to enforce strict feature disentanglement, we introduce a geometrically grounded Foreground Logit Orthogonal Enhancement (FLOE) loss. Comprehensive experiments conducted on CIFAR-10, CIFAR-100, and ImageNet-1K show that HICAT consistently outperforms matched baselines across various architectures, including CNNs and ViTs, while notably narrowing the robust generalization gap.
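The abstract gives no formulas for the "Measure-Debias-Align" pipeline, so the following is only a minimal sketch of how such a pipeline could look, not the authors' implementation. Everything in it is an assumption made for illustration: the scalar `bias_score` stands in for the output of an LBBE-like estimator, the sigmoid-gated subtraction stands in for logit rectification, and the squared cosine similarity stands in for an orthogonality-style loss such as FLOE. All function names are hypothetical.

```python
import math


def sigmoid(x):
    """Logistic function mapping a real score to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def debias_logits(logits, bg_logits, bias_score):
    """Assumed 'Measure' + 'Debias' steps.

    bias_score is a hypothetical estimator output: larger values mean the
    background is judged more spurious, so more of it is subtracted.
    """
    alpha = sigmoid(bias_score)  # gate in (0, 1)
    return [z - alpha * b for z, b in zip(logits, bg_logits)]


def orthogonality_penalty(fg_logits, bg_logits):
    """Assumed 'Align' step: squared cosine similarity.

    The penalty is 0 when the two logit vectors are orthogonal and
    approaches 1 when they are parallel.
    """
    return cosine(fg_logits, bg_logits) ** 2


if __name__ == "__main__":
    full = [2.0, 1.0]        # logits from the full image (toy values)
    background = [1.0, 0.0]  # logits attributed to background (toy values)
    rectified = debias_logits(full, background, bias_score=0.0)
    print(rectified)
    print(orthogonality_penalty(rectified, background))
```

In this sketch a strongly negative `bias_score` drives the gate toward zero and leaves the logits nearly untouched, which loosely mirrors the idea that context can be a supportive prior rather than something to suppress unconditionally.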
Source: arXiv
Generated at: 2026-06-03 00:00:00 UTC




