SORA: Free Second-Order Attacks in Fast Adversarial Training
Title: SORA: Enabling Free Second-Order Attacks in Rapid Adversarial Training
Abstract:
While Adversarial Training (AT) stands as a premier defense mechanism against adversarial examples, efficient single-step variants are frequently plagued by Catastrophic Overfitting (CO). This phenomenon is characterized by a sharp decline in robustness against multi-step attacks, even when the model maintains high performance against single-step perturbations. This paper addresses this specific vulnerability through two primary contributions. First, we define Epsilon Overfitting (EO), a conceptual framework suggesting that rigid perturbation magnitudes and directions intensify CO. Our analysis demonstrates that incorporating variability into perturbations substantially enhances robust generalization across various architectures and datasets. Second, we introduce PertAlign (Perturbation Alignment), a theoretically supported metric with negligible computational overhead. PertAlign forecasts the onset of CO by evaluating gradient alignment throughout different attack stages. Building on these findings, we present SORA, an adaptive step-size AT approach that dynamically modulates perturbations according to the geometry of the loss surface. SORA reliably averts CO, attaining state-of-the-art levels of both robustness and clean accuracy. Notably, it achieves these results across diverse datasets and architectures using a static set of hyperparameters, a crucial feature for practical implementation in fast AT. Comprehensive experiments on a wide range of datasets and models indicate that SORA either matches or exceeds the robustness of existing methods, while also offering improved clean accuracy and greater efficiency. The source code can be accessed at https://github.com/SecondOrderAT/SORA.
Source: arXiv
Generated at: 2026-06-02 00:00:00 UTC




