Beyond Low-Rank: Low-Rank Sparse Prompting via Spiking Neural Network and Prompt Factorization
Title: LoRSP: Enhancing Visual Prompting with Spiking Neural Networks and Prompt Factorization
Original: arXiv:2606.01945v1 Announce Type: new Abstract: Visual Prompting (VP) has emerged as an efficient paradigm for adapting large-scale pre-trained vision models to downstream tasks by incorporating learnable prompts at the input level. However, existing VP methods typically employ dense pixel-level prompts, which often suffer from redundant perturbations, limited generalization, and energy inefficiency. To overcome these limitations, we propose to integrate brain-inspired spiking learning into visual prompt learning. Spiking neurons process information inexpensively by converting input data into discrete spike trains and returning sparse outputs. Inspired by this, we propose \textbf{Lo}w-\textbf{R}ank visual \textbf{S}pike \textbf{P}rompting (LoRSP), a novel framework that learns dynamic low-rank sparse visual prompts naturally via a spiking neuron learning mechanism. The core idea of LoRSP is to exploit the brain-inspired sparse firing mechanism of spiking neurons to generate a pixel-level sparse prompt for each instance. Specifically, we first construct a series of prompt factors via low-rank factorization to capture distinct prompt subspaces. These prompt factors are then fed into a Spiking Neural Network (SNN) architecture, which performs the integrate-and-fire process to emit spikes. As a result, LoRSP generates a \emph{sparse} visual prompt while maintaining the low-rank constraint. This design enables instance-specific selective prompting, leading to more compact and robust adaptation across diverse downstream tasks. Extensive experiments on five heterogeneous vision backbones and multiple benchmarks demonstrate that LoRSP achieves competitive performance while requiring fewer tunable parameters than existing VP methods.
Rewrite: Visual Prompting (VP) has gained traction as a cost-effective strategy for adapting large-scale pre-trained vision models to downstream tasks by introducing learnable prompts directly at the input level. However, conventional VP approaches generally rely on dense, pixel-level prompts, which are prone to redundant perturbations, limited generalization, and energy inefficiency. To address these limitations, this study brings brain-inspired spiking learning to visual prompt learning. Because spiking neurons process information cheaply by converting input data into discrete spike trains and returning sparse outputs, we introduce \textbf{Lo}w-\textbf{R}ank visual \textbf{S}pike \textbf{P}rompting (LoRSP), a framework that uses a spiking neuron learning mechanism to learn dynamic, low-rank, sparse visual prompts. At its core, LoRSP exploits the sparse firing of spiking neurons to produce a sparse, pixel-level prompt for each instance. Specifically, the method first constructs a series of prompt factors through low-rank factorization to capture distinct prompt subspaces. These factors are then fed into a Spiking Neural Network (SNN) architecture, which performs the integrate-and-fire process to emit spikes. As a result, LoRSP produces \emph{sparse} visual prompts while maintaining the low-rank constraint. This design enables selective, instance-specific prompting, leading to more compact and robust adaptation across diverse downstream tasks. Extensive experiments on five heterogeneous vision backbones and multiple benchmarks show that LoRSP achieves competitive performance while requiring fewer tunable parameters than existing VP methods.
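Illustration: The pipeline described above (low-rank prompt factors, an integrate-and-fire step, and a sparse prompt of bounded rank) can be sketched in a few lines of NumPy. This is a minimal sketch under stated assumptions, not the authors' implementation. The function names, the leaky integrate-and-fire hyperparameters (timesteps, threshold, decay), the factor shapes, and the choice to apply spiking to each factor before multiplying them are all hypothetical. The factors are random rather than learned, and instance-specific conditioning is omitted.

```python
import numpy as np


def lif_firing_rate(current, timesteps=4, threshold=1.0, decay=0.5):
    """Run leaky integrate-and-fire neurons on a constant input current.

    Returns the fraction of timesteps at which each neuron fired (0 to 1).
    All hyperparameters are illustrative assumptions, not values from the paper.
    """
    current = np.asarray(current, dtype=float)
    membrane = np.zeros_like(current)
    spike_count = np.zeros_like(current)
    for _ in range(timesteps):
        membrane = decay * membrane + current            # leak, then integrate
        spikes = (membrane >= threshold).astype(float)   # fire at threshold
        membrane = membrane * (1.0 - spikes)             # hard reset after a spike
        spike_count += spikes
    return spike_count / timesteps


def sparse_low_rank_prompt(factor_a, factor_b, **lif_kwargs):
    """Build a prompt of rank <= r from spiking versions of two factors.

    factor_a: (channels, height, r), factor_b: (channels, r, width).
    Spiking the factors (an assumed design) keeps the product's rank <= r.
    """
    spikes_a = lif_firing_rate(factor_a, **lif_kwargs)
    spikes_b = lif_firing_rate(factor_b, **lif_kwargs)
    return spikes_a @ spikes_b  # batched matmul -> (channels, height, width)


rng = np.random.default_rng(0)
rank, height, width = 4, 32, 32
factor_a = rng.normal(size=(3, height, rank))   # stand-in for learned factors
factor_b = rng.normal(size=(3, rank, width))

prompt = sparse_low_rank_prompt(factor_a, factor_b)
image = rng.uniform(size=(3, height, width))    # stand-in for an input image
prompted_image = image + prompt                 # input-level prompting

print("fraction of zero prompt pixels:", np.mean(prompt == 0))
print("rank of channel 0:", np.linalg.matrix_rank(prompt[0]))
```

In this sketch, neurons whose input never drives the membrane potential to the threshold stay silent, so the corresponding factor entries are exactly zero and many prompt pixels vanish. Training such a model would additionally need a way to pass gradients through the non-differentiable threshold (for example a surrogate gradient), which is not shown here.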
Source: arXiv Generated at: 2026-06-02 00:00:00 UTC





