Vision Transformer Finetuning Benefits from Non-Smooth Components
Title: Non-Smooth Elements Enhance Vision Transformer Fine-Tuning
Abstract:
While the smoothness of transformer architectures has been thoroughly investigated with respect to adversarial robustness, training stability, and generalization, its role in transfer learning remains largely unexplored. This study examines the capacity of vision transformer components to adjust their outputs in response to input variations, a property we define as \emph{plasticity}. Quantified as an average rate of change, plasticity measures sensitivity to input perturbations: higher plasticity corresponds to lower smoothness. Through theoretical analysis and extensive experiments, comprising over $1{,}000$ fine-tuning iterations on large-scale vision transformers, we demonstrate that this framework offers principled guidance for selecting which components to prioritize during adaptation. A key insight for practitioners is that the high plasticity observed in attention mechanisms and feedforward layers consistently correlates with better fine-tuning outcomes. These results challenge the conventional wisdom that smoothness is inherently beneficial and offer a fresh viewpoint on the functional characteristics of transformers. The associated code can be accessed at https://github.com/ambroiseodt/vit-plasticity.
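To make the notion of plasticity as an "average rate of change" concrete, the following is a minimal sketch under stated assumptions, not the paper's definition or the code in the linked repository. It assumes plasticity can be estimated as the mean ratio of output distance to input distance over pairs of inputs. The names `estimate_plasticity`, `l2_norm`, and `scale` are hypothetical, and the toy linear maps stand in for real transformer components.

```python
import math
import random


def l2_norm(v):
    """Euclidean norm of a flat list of floats."""
    return math.sqrt(sum(a * a for a in v))


def estimate_plasticity(f, inputs, eps=1e-12):
    """Mean of ||f(x) - f(y)|| / ||x - y|| over all distinct input pairs.

    Assumption: this pairwise estimator is only an illustration of an
    "average rate of change"; the paper may define plasticity differently.
    """
    outputs = [f(x) for x in inputs]
    ratios = []
    for i in range(len(inputs)):
        for j in range(i + 1, len(inputs)):
            dx = l2_norm([a - b for a, b in zip(inputs[i], inputs[j])])
            if dx < eps:
                continue  # skip (near-)identical inputs to avoid dividing by zero
            dy = l2_norm([a - b for a, b in zip(outputs[i], outputs[j])])
            ratios.append(dy / dx)
    if not ratios:
        raise ValueError("need at least two distinct inputs")
    return sum(ratios) / len(ratios)


def scale(c):
    """Hypothetical toy component: multiplies every coordinate by c."""
    return lambda x: [c * a for a in x]


rng = random.Random(0)
points = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(8)]

smooth = estimate_plasticity(scale(0.5), points)   # low rate of change
plastic = estimate_plasticity(scale(3.0), points)  # high rate of change
print(f"smooth component: {smooth:.3f}, plastic component: {plastic:.3f}")
```

For a map that scales its input by a constant, every pairwise ratio equals that constant, so the estimate recovers it. In this toy setting the component with the larger scale factor is the more plastic, hence less smooth, one.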
Source: arXiv
Generated at: 2026-06-04 00:00:00 UTC





