EvoPrompt: Guided Prompt Evolution for Vision-Language Models Adaptation
Abstract:
Adapting large-scale vision-language models (VLMs) to downstream tasks often faces the hurdle of scarce labeled data. Although parameter-efficient prompt learning presents a viable solution, it frequently leads to the catastrophic forgetting of pre-trained knowledge. Addressing this issue, our study is driven by the premise that controlling the evolutionary trajectory of prompts is crucial for enabling adaptation without knowledge loss. Consequently, we introduce EvoPrompt, a novel framework that explicitly directs prompt trajectories to facilitate fine-tuning that retains knowledge. Our method utilizes a Modality-Shared Prompt Projector (MPP) to derive hierarchical prompts from a single, unified embedding space. A key aspect of our approach is an evolutionary training strategy that separates low-rank updates into directional and magnitude components. This separation ensures that early-learned semantic directions remain intact while only their magnitudes are adjusted, allowing prompts to evolve without erasing foundational knowledge. Furthermore, Feature Geometric Regularization (FGR) stabilizes this process by imposing feature decorrelation, thereby averting representation collapse. Comprehensive experiments confirm that EvoPrompt delivers state-of-the-art results in few-shot learning scenarios, while simultaneously maintaining the robust zero-shot capabilities inherent in pre-trained VLMs.
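Illustrative sketch (not the authors' implementation): the abstract does not state the update rule or the regularizer, so the code below only shows one plausible reading of the two ideas it names. It assumes a low-rank update of the form B·A, splits each row into a unit direction and a scalar magnitude, rescales only the magnitudes, and checks that the directions are unchanged. It also assumes that "feature decorrelation" can be measured as the sum of squared off-diagonal correlations. All function names, matrix shapes, and values are hypothetical.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def split_direction_magnitude(delta):
    """Split each row into a unit-norm direction and a scalar magnitude."""
    magnitudes = [math.sqrt(sum(v * v for v in row)) for row in delta]
    directions = [[v / m for v in row] if m > 0 else list(row)
                  for row, m in zip(delta, magnitudes)]
    return directions, magnitudes

def recompose(directions, magnitudes):
    """Rebuild the update from (frozen) directions and (trainable) magnitudes."""
    return [[m * v for v in row] for row, m in zip(directions, magnitudes)]

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def decorrelation_penalty(features):
    """Sum of squared off-diagonal entries of the feature correlation matrix.

    `features` is a list of samples, each a list of feature values.
    This is an assumed stand-in for a decorrelation regularizer.
    """
    n, d = len(features), len(features[0])
    means = [sum(f[j] for f in features) / n for j in range(d)]
    centered = [[f[j] - means[j] for j in range(d)] for f in features]
    # Fall back to 1.0 for constant features to avoid division by zero.
    stds = [math.sqrt(sum(c[j] ** 2 for c in centered) / n) or 1.0 for j in range(d)]
    penalty = 0.0
    for i in range(d):
        for j in range(d):
            if i != j:
                cov = sum(c[i] * c[j] for c in centered) / n
                penalty += (cov / (stds[i] * stds[j])) ** 2
    return penalty

# Hypothetical rank-2 update: B is 3x2, A is 2x4.
B = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
A = [[1.0, 2.0, 0.0, 1.0], [0.0, 1.0, 1.0, 0.0]]
delta = matmul(B, A)

# Early stage: learn the full update, then record its directions.
directions, magnitudes = split_direction_magnitude(delta)

# Later stage: directions stay frozen, only magnitudes are rescaled.
new_magnitudes = [0.5 * magnitudes[0], 2.0 * magnitudes[1], 1.5 * magnitudes[2]]
evolved = recompose(directions, new_magnitudes)

# Each evolved row still points along its original direction.
similarities = [cosine(e, d) for e, d in zip(evolved, delta)]

# Perfectly correlated features are penalized; uncorrelated ones are not.
correlated = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
uncorrelated = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
penalty_high = decorrelation_penalty(correlated)
penalty_low = decorrelation_penalty(uncorrelated)
```

In this toy setting every entry of `similarities` stays at 1 up to floating-point error, which is the property the abstract describes as keeping early-learned semantic directions intact while adjusting only their magnitudes. The penalty is larger for the correlated features than for the uncorrelated ones, which is the behavior a decorrelation term is meant to discourage.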
Source: arXiv
Generated at: 2026-06-04 00:00:00 UTC




