Enhancing MedSAM with a Lightweight Box Predictor for Medical Image Segmentation
Title: Boosting MedSAM Efficiency via a Compact Box Predictor for Medical Image Segmentation
Abstract:
Semantic segmentation of medical images remains challenging, largely because data are limited and imaging modalities differ substantially from one another. Foundation models such as the Segment Anything Model (SAM) are promising, but they typically require specialized adaptation to work well on medical imagery. Moreover, although point prompts are the most intuitive form of user interaction, a single point often lacks the spatial context needed for accurate segmentation, especially for structures with irregular shapes or low contrast.
To address these limitations, this study introduces an improved segmentation framework that adds a lightweight Box Predictor module to the MedSAM architecture. Using localized image embedding features, the module estimates a bounding box from a single user click. The predicted box supplies the spatial guidance that a point prompt lacks, reducing prompt ambiguity while adding only 1.6 million parameters and minimal inference overhead. Training proceeds in two stages: the Box Predictor is first trained in isolation and is then integrated with MedSAM.
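The click-to-box step described above can be illustrated with a minimal geometric sketch. Everything specific here is an assumption rather than the authors' implementation: the 1024x1024 input and 64x64 embedding grid follow the SAM ViT-B encoder that MedSAM builds on, the window radius is arbitrary, and the box parameterization (pixel distances from the click to each box edge) is a hypothetical stand-in for whatever the Box Predictor actually regresses.

```python
# Assumed geometry (not stated in the abstract): a 1024x1024 input image and a
# 64x64 embedding grid, as in the SAM ViT-B image encoder used by MedSAM.
IMAGE_SIZE = 1024
GRID_SIZE = 64
STRIDE = IMAGE_SIZE // GRID_SIZE  # 16 pixels per embedding cell


def click_to_cell(x, y):
    """Map a pixel click to the (col, row) of the embedding cell containing it."""
    col = min(max(int(x) // STRIDE, 0), GRID_SIZE - 1)
    row = min(max(int(y) // STRIDE, 0), GRID_SIZE - 1)
    return col, row


def local_window(col, row, radius=4):
    """Inclusive bounds (col0, row0, col1, row1) of a window around the cell,
    clipped to the grid. The radius is a hypothetical choice."""
    return (
        max(col - radius, 0),
        max(row - radius, 0),
        min(col + radius, GRID_SIZE - 1),
        min(row + radius, GRID_SIZE - 1),
    )


def decode_box(x, y, offsets):
    """Convert hypothetical predictor outputs (left, top, right, bottom pixel
    distances from the click) into an xyxy box clamped to the image."""
    left, top, right, bottom = offsets

    def clamp(v):
        return min(max(v, 0), IMAGE_SIZE - 1)

    return (clamp(x - left), clamp(y - top), clamp(x + right), clamp(y + bottom))


# Example: a click near the top edge of the image.
click = (520, 40)
cell = click_to_cell(*click)
window = local_window(*cell)
box = decode_box(*click, offsets=(100, 60, 80, 120))
```

In a pipeline of this shape, the embedding features inside `window` would be the input to the predictor, and the resulting `box` would be passed to MedSAM as its box prompt in place of the raw click.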
We assessed the generalization of the proposed method through comprehensive evaluations on four datasets: FLARE22, BRISC, BUSI, and LungSegDB. These datasets cover a range of imaging modalities, including CT, MRI, and ultrasound. The results demonstrate enhanced accuracy and robustness across diverse anatomical structures and imaging domains. Specifically, the method achieved Dice scores of 0.93 on FLARE22, 0.88 on BRISC, 0.89 on BUSI, and 0.98 on LungSegDB. The source code is available at https://github.com/Amirhosseinmovahedi/MedSAM-BoxPredictor.
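For readers unfamiliar with the metric, the Dice score between a predicted mask A and a reference mask B is 2|A ∩ B| / (|A| + |B|), ranging from 0 (no overlap) to 1 (identical masks). The following is a minimal standard-library sketch of that formula for binary masks stored as nested lists. It is illustrative only and is not the evaluation code from the repository; the handling of two empty masks is a convention assumed here.

```python
def dice_score(pred, target):
    """Dice coefficient between two binary masks given as nested lists of 0/1."""
    intersection = 0
    pred_sum = 0
    target_sum = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            intersection += p * t  # counts pixels set in both masks
            pred_sum += p
            target_sum += t
    if pred_sum + target_sum == 0:
        # Both masks empty: treated as perfect agreement (an assumed convention).
        return 1.0
    return 2.0 * intersection / (pred_sum + target_sum)


# Toy 2x3 masks: each has 3 foreground pixels, 2 of which coincide.
pred = [[0, 1, 1],
        [0, 1, 0]]
target = [[0, 1, 0],
          [0, 1, 1]]
score = dice_score(pred, target)  # 2 * 2 / (3 + 3)
```

Per-dataset figures such as those reported above are typically obtained by averaging a score of this kind over all test cases, although the abstract does not state the exact aggregation used.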
Source: arXiv
Generated at: 2026-06-04 00:00:00 UTC





