Learning to Trim: End-to-End Causal Graph Pruning with Dynamic Anatomical Feature Banks for Medical VQA
Abstract: Medical Visual Question Answering (MedVQA) systems often generalize poorly because they rely on dataset-specific biases, such as recurring anatomical configurations or question-type regularities, rather than on genuine diagnostic evidence. Existing causal methods largely rely on static modifications or post-hoc corrections. This study introduces Learnable Causal Trimming (LCT), a framework that embeds causal pruning directly into end-to-end optimization. Central to this approach is the Dynamic Anatomical Feature Bank (DAFB), which uses a momentum-based update to maintain global prototypes of common anatomical and linguistic patterns, thereby approximating dataset-level regularities. We also develop a differentiable trimming module that estimates the dependency between each instance representation and the global feature bank, softly suppressing features that correlate highly with the global prototypes while emphasizing instance-specific evidence. Because the trimming is learned, the model can adaptively prioritize causal signals over spurious correlations. Empirical evaluations on the VQA-RAD, SLAKE, SLAKE-CP, and PathVQA benchmarks show that LCT consistently outperforms existing debiasing strategies in robustness and generalization.
Source: arXiv
Generated at: 2026-06-02 00:00:00 UTC
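
The two mechanisms named in the abstract, a momentum-updated prototype bank and a soft trimming gate, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract gives no formulas, so the function names (`update_bank`, `soft_trim`), the nearest-prototype assignment, the cosine-similarity dependency measure, the sigmoid gate, and the `momentum`, `threshold`, and `temperature` values are all assumptions chosen for illustration.

```python
import numpy as np

EPS = 1e-8  # guards against division by zero when normalising


def _unit(x):
    """Scale each row of x to unit L2 norm."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + EPS)


def update_bank(bank, features, momentum=0.99):
    """Momentum (EMA) update of a prototype bank (hypothetical rule).

    bank:     (K, D) current prototypes
    features: (N, D) batch features
    Each feature is assigned to its most similar prototype (assumed),
    and that prototype moves toward the mean of its assigned features.
    """
    assign = np.argmax(_unit(features) @ _unit(bank).T, axis=1)
    new_bank = bank.copy()
    for k in range(bank.shape[0]):
        members = features[assign == k]
        if len(members) > 0:
            new_bank[k] = momentum * bank[k] + (1.0 - momentum) * members.mean(axis=0)
    return new_bank


def soft_trim(features, bank, threshold=0.5, temperature=0.1):
    """Softly suppress features that resemble the global prototypes.

    The gate is a sigmoid of (threshold - max cosine similarity), so it
    is smooth (hence differentiable in an autodiff framework) and lies
    in (0, 1). High similarity to any prototype gives a gate near 0.
    """
    sim = (_unit(features) @ _unit(bank).T).max(axis=1)          # (N,)
    gate = 1.0 / (1.0 + np.exp(-(threshold - sim) / temperature))
    return features * gate[:, None], gate


if __name__ == "__main__":
    bank = np.array([[1.0, 0.0, 0.0]])
    tokens = np.array([[1.0, 0.0, 0.0],    # matches the prototype
                       [0.0, 1.0, 0.0]])   # instance-specific direction
    trimmed, gate = soft_trim(tokens, bank)
    print(gate)  # first gate is close to 0, second is close to 1
    print(update_bank(bank, np.array([[2.0, 0.0, 0.0]]), momentum=0.9))
```

In this sketch the gate is applied per feature vector; whether the paper trims at the token, region, or channel level is not stated in the abstract. In an end-to-end setting the same operations would be written in an autodiff framework so that gradients flow through the gate.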





