arXiv

Evi-Steer: Learning to Steer Biomedical Vision-Language Models through Efficient and Generalizable Evidential Tuning

Abstract: Achieving precise multimodal comprehension of biomedical imagery through parameter-efficient adaptation of vision-language foundation models is essential. However, current methodologies are predominantly deterministic, frequently faltering when faced with domain shifts or ambiguous alignments between images and text. This vulnerability is especially pronounced in clinical environments, where models must maintain robustness amid limited data availability and shifting domains. To address this, we introduce Evi-Steer, an evidential cross-modal framework designed for low-dimensional steering of BiomedCLIP. This approach facilitates uncertainty-aware, parameter-efficient fine-tuning by modifying merely 0.11% of the model’s total parameters.

Our method applies lightweight, low-dimensional token updates in both the text and vision encoders while simultaneously quantifying epistemic uncertainty. These uncertainty estimates gate the residual updates, so the model adapts conservatively when evidence is insufficient. Additionally, we propose a cross-modal confidence fusion mechanism grounded in Dempster-Shafer theory. It conditions visual adaptation on textual confidence, suppressing conflicting or uncertain updates across modalities.
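The abstract does not state the formulas behind the uncertainty estimate, the gate, or the fusion step. As a rough illustration only, the sketch below assumes the standard subjective-logic formulation used in evidential deep learning (Dirichlet evidence mapped to belief and uncertainty masses) and the reduced form of Dempster's combination rule for two sources. The function names, the `1 - uncertainty` gate, and the example values are hypothetical and are not taken from the Evi-Steer code.

```python
from typing import List, Tuple

# An opinion is (belief mass per class, uncertainty mass); the masses sum to 1.
Opinion = Tuple[List[float], float]


def opinion_from_evidence(evidence: List[float]) -> Opinion:
    """Map non-negative per-class evidence to a subjective-logic opinion.

    With K classes, alpha_k = e_k + 1 and S = sum(alpha_k), so that
    b_k = e_k / S and u = K / S. Zero evidence gives u = 1.
    """
    if any(e < 0 for e in evidence):
        raise ValueError("evidence must be non-negative")
    num_classes = len(evidence)
    strength = sum(evidence) + num_classes
    return [e / strength for e in evidence], num_classes / strength


def dempster_combine(first: Opinion, second: Opinion) -> Opinion:
    """Reduced Dempster's rule for two opinions over the same classes."""
    (b1, u1), (b2, u2) = first, second
    # Conflict: mass the two sources assign to different classes.
    conflict = sum(
        x * y for i, x in enumerate(b1) for j, y in enumerate(b2) if i != j
    )
    scale = 1.0 / (1.0 - conflict)
    belief = [scale * (x * y + x * u2 + y * u1) for x, y in zip(b1, b2)]
    return belief, scale * u1 * u2


def gated_residual(
    hidden: List[float], update: List[float], uncertainty: float
) -> List[float]:
    """Hypothetical gate (an assumption): shrink the update as uncertainty grows."""
    gate = 1.0 - uncertainty
    return [h + gate * d for h, d in zip(hidden, update)]


# Illustrative values only: a confident text opinion and a weaker image opinion.
text_opinion = opinion_from_evidence([9.0, 0.0, 0.0])
image_opinion = opinion_from_evidence([2.0, 1.0, 1.0])
fused_belief, fused_uncertainty = dempster_combine(text_opinion, image_opinion)
steered = gated_residual([1.0, 2.0], [0.5, -0.5], fused_uncertainty)
```

Under these assumptions, fusing with a source that has no evidence leaves the other opinion unchanged, and a fully uncertain gate leaves the hidden state untouched, which matches the conservative behavior the abstract describes.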

We evaluated Evi-Steer on 15 biomedical imaging datasets, covering eight organs and eight imaging modalities, in both few-shot learning and domain generalization settings. The results show that Evi-Steer consistently surpasses state-of-the-art techniques in both settings, offering a practical and resilient strategy for integrating vision-language models into real-world clinical workflows. The source code is available at https://github.com/HealthX-Lab/Evi-Steer.


Source: arXiv Generated at: 2026-06-02 00:00:00 UTC
