Improved Belief-Attention in Vision Task
Title: Enhanced Belief-Attention Mechanism for Visual Recognition
Abstract: The recently introduced Belief-Attention \cite{Guoqiang25BeliefAttention} aims to improve Transformer performance by projecting the softmax-weighted sum of value ($V$) vectors onto the original $V$ vectors and using the resulting perpendicular component as a residual signal. In this work, we present an ablation study showing that the projected component also carries significant information about token correlations and should therefore not be discarded. Building on this insight, we propose an extension to Belief-Attention that uses both the perpendicular and the projected components. Specifically, the projected component is passed through an activation function and a linear mapping before being combined with the target token. Conceptually, this block acts as a two-layer feedforward network (FFN) embedded within the new attention mechanism. Furthermore, whereas standard attention captures token correlations through the inner-product matrix $QK^T$, we introduce an additional inner-product matrix, $ZZ^T$, to capture more complex relationships. We call the resulting module Belief2-Attention. Theoretical analysis confirms that Belief2-Attention is more expressive than standard Attention. Finally, we validate Belief2-Attention on vision tasks, including image classification and segmentation.
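The decomposition described in the abstract can be illustrated with a small sketch. It is only a reading of the abstract under stated assumptions, not the authors' implementation: we assume the projection is taken per token, onto that token's own value vector, and that attention uses the usual $1/\sqrt{d}$ scaling. The helper names (`attention_output`, `split_components`) are hypothetical. The activation, the linear mapping, and the $ZZ^T$ term are omitted because the abstract does not define them.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attention_output(Q, K, V):
    """Standard scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = len(Q[0])
    out = []
    for q in Q:
        weights = softmax([dot(q, k) / math.sqrt(d) for k in K])
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out


def split_components(o, v, eps=1e-12):
    """Split o into parts parallel and perpendicular to v.

    Assumption: the projection is per token, onto the token's own value
    vector v; the abstract does not spell out this detail.
    """
    scale = dot(o, v) / (dot(v, v) + eps)
    projected = [scale * x for x in v]
    perpendicular = [a - b for a, b in zip(o, projected)]
    return projected, perpendicular


# Toy inputs: 3 tokens, dimension 2 (illustrative values only).
Q = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
K = [[1.0, 0.5], [0.2, 1.0], [0.3, 0.3]]
V = [[1.0, 2.0], [3.0, -1.0], [0.5, 0.5]]

O = attention_output(Q, K, V)
components = [split_components(o, v) for o, v in zip(O, V)]
```

Under this reading, the perpendicular part is the residual signal attributed to the original Belief-Attention, and the projected part is the component the proposed extension would additionally process.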
Source: arXiv
Generated at: 2026-06-02 00:00:00 UTC




