arXiv

XSSR: Cross-Domain Self-Supervised Representative Selection for Efficient Annotation in Medical Image Segmentation

June 4, 2026 · Byunghyun Ko, Aleksei Anisimov, Kobe Ke, Suhas Bharthepude, Jeongkyu Lee

Original: arXiv:2606.04301v1 · Announce Type: new

Abstract: Acquiring labeled medical image data is resource-intensive, a challenge further exacerbated in cross-domain scenarios where source and target datasets differ in imaging equipment, population, or clinical site. This study introduces XSSR (Cross-Domain Self-Supervised Representative Selection), a framework designed to minimize annotation effort in the target domain while maintaining robust segmentation performance. XSSR comprises three stages: first, a Masked Autoencoder (MAE) is trained on unlabeled source data to establish a shared embedding space without requiring target labels; second, a greedy selection algorithm scores unlabeled target samples based on a composite density, novelty, and diversity criterion; and third, a U-Net segmentation model is trained exclusively on the selected subset. The novelty-diversity trade-off parameter, alpha, is automatically calibrated by minimizing embedding-space coverage, eliminating manual tuning. We evaluate XSSR on three public benchmarks: Chest X-ray, RIGA+ retinal fundus imaging, and multi-site Prostate MRI, each under a fixed 5% annotation budget. XSSR achieves 99.3% of full-data performance on Chest X-ray using only 22 labeled samples, surpasses random selection by up to 2.5 Dice points on Prostate MRI, and consistently outperforms the CoreSet baseline by 0.4 to 1.2 Dice points across all datasets. Ablation studies indicate that diversity is the most influential scoring component, and per-site analysis shows that performance correlates with scanner similarity to the source domain.
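
The results above are reported in Dice points. For reference, the Dice coefficient between a predicted mask P and a ground-truth mask G is the standard overlap measure shown here. Reading one "Dice point" as 0.01 on the 0 to 1 scale (one percentage point) is an assumption, since the abstract does not state the scale.

```latex
\mathrm{Dice}(P, G) = \frac{2\,\lvert P \cap G \rvert}{\lvert P \rvert + \lvert G \rvert},
\qquad 0 \le \mathrm{Dice}(P, G) \le 1
```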

Rewrite:

Abstract: Annotating medical images is costly, and the problem is compounded in cross-domain settings, where differences in imaging hardware, patient population, or clinical site separate the source and target datasets. To address this, we propose XSSR (Cross-Domain Self-Supervised Representative Selection), a framework that reduces labeling effort in the target domain while maintaining robust segmentation performance. The XSSR pipeline has three stages. First, a Masked Autoencoder (MAE) is trained on unlabeled source images to establish a shared embedding space, with no target labels required. Second, a greedy algorithm scores unlabeled target samples with a composite criterion that combines density, novelty, and diversity. Third, a U-Net segmentation model is trained only on the selected subset. The parameter alpha, which balances novelty against diversity, is calibrated automatically by minimizing embedding-space coverage, which removes the need for manual tuning. We evaluated XSSR on three public benchmarks (Chest X-ray, RIGA+ retinal fundus imaging, and multi-site Prostate MRI), each under a fixed 5% annotation budget. On Chest X-ray, XSSR reached 99.3% of full-data performance with only 22 labeled samples. On Prostate MRI, it improved on random selection by up to 2.5 Dice points, and it outperformed the CoreSet baseline by 0.4 to 1.2 Dice points across all datasets. Ablation studies indicate that diversity is the most influential scoring component, and per-site analysis shows that performance correlates with the similarity between the target scanner and the source domain.
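
The selection stage described above can be illustrated with a minimal sketch. The abstract does not give the scoring formulas, so everything here is an assumption made for illustration: density is taken as the inverse mean distance to the k nearest target neighbours, novelty as the distance to the closest source embedding, diversity as the distance to the closest already-selected sample, the score as `density + alpha * novelty + (1 - alpha) * diversity`, and "coverage" as the k-center coverage radius. The function names (`greedy_select`, `calibrate_alpha`, `coverage_radius`) and the alpha grid are hypothetical and are not taken from the paper.

```python
import math
import random


def nearest_distance(point, others):
    """Euclidean distance from `point` to its closest neighbour in `others`."""
    return min(math.dist(point, other) for other in others)


def normalise(values):
    """Rescale a list of numbers to the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo + 1e-8) for v in values]


def coverage_radius(target, selected_idx):
    """Largest distance from any target embedding to its nearest selected one."""
    chosen = [target[i] for i in selected_idx]
    return max(nearest_distance(t, chosen) for t in target)


def greedy_select(source, target, budget, alpha, k=5):
    """Greedily pick `budget` target indices using an ASSUMED composite score.

    Requires at least two target embeddings.
    """
    n = len(target)
    # Density (assumed): inverse mean distance to the k nearest target neighbours.
    density = []
    for i, t in enumerate(target):
        dists = sorted(math.dist(t, u) for j, u in enumerate(target) if j != i)[:k]
        density.append(1.0 / (1e-8 + sum(dists) / len(dists)))
    density = normalise(density)
    # Novelty (assumed): distance to the closest source embedding.
    novelty = normalise([nearest_distance(t, source) for t in target])

    selected = []
    gap = [math.inf] * n  # distance from each target sample to the selected set
    for _ in range(min(budget, n)):
        # Diversity (assumed): distance to the closest already-selected sample.
        diversity = normalise(gap) if selected else [0.0] * n
        best, best_score = None, -math.inf
        for i in range(n):
            if i in selected:
                continue
            score = density[i] + alpha * novelty[i] + (1 - alpha) * diversity[i]
            if score > best_score:
                best, best_score = i, score
        selected.append(best)
        for i in range(n):
            gap[i] = min(gap[i], math.dist(target[i], target[best]))
    return selected


def calibrate_alpha(source, target, budget, grid=(0.0, 0.25, 0.5, 0.75, 1.0)):
    """Pick the alpha on a hypothetical grid that gives the smallest coverage radius."""
    best_alpha, best_idx, best_radius = None, [], math.inf
    for alpha in grid:
        idx = greedy_select(source, target, budget, alpha)
        radius = coverage_radius(target, idx)
        if radius < best_radius:
            best_alpha, best_idx, best_radius = alpha, idx, radius
    return best_alpha, best_idx


if __name__ == "__main__":
    rng = random.Random(0)
    # Random vectors stand in for MAE embeddings of source and target images.
    source = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(50)]
    target = [[rng.gauss(0.5, 1.0) for _ in range(8)] for _ in range(100)]
    budget = max(1, round(0.05 * len(target)))  # 5% annotation budget
    alpha, picked = calibrate_alpha(source, target, budget)
    print("alpha:", alpha, "selected indices:", picked)
```

In an actual pipeline the random vectors would be replaced by encoder outputs from the MAE trained on source data, and the selected indices would identify the target images to annotate before training the U-Net.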
