arXiv

XSSR: Cross-Domain Self-Supervised Representative Selection for Efficient Annotation in Medical Image Segmentation

Original: arXiv:2606.04301v1 (Announce Type: new)

Abstract: Acquiring labeled medical image data is resource-intensive, a challenge further exacerbated in cross-domain scenarios where source and target datasets differ in imaging equipment, population, or clinical site. This study introduces XSSR (Cross-Domain Self-Supervised Representative Selection), a framework designed to minimize annotation effort in the target domain while maintaining robust segmentation performance. XSSR comprises three stages: first, a Masked Autoencoder (MAE) is trained on unlabeled source data to establish a shared embedding space without requiring target labels; second, a greedy selection algorithm scores unlabeled target samples based on a composite density, novelty, and diversity criterion; and third, a U-Net segmentation model is trained exclusively on the selected subset. The novelty-diversity trade-off parameter, alpha, is automatically calibrated by minimizing embedding-space coverage, eliminating manual tuning. We evaluate XSSR on three public benchmarks: Chest X-ray, RIGA+ retinal fundus imaging, and multi-site Prostate MRI, each under a fixed 5% annotation budget. XSSR achieves 99.3% of full-data performance on Chest X-ray using only 22 labeled samples, surpasses random selection by up to 2.5 Dice points on Prostate MRI, and consistently outperforms the CoreSet baseline by 0.4 to 1.2 Dice points across all datasets. Ablation studies indicate that diversity is the most influential scoring component, and per-site analysis shows that performance correlates with scanner similarity to the source domain.

Rewrite:

Abstract: Annotating medical images is costly, and the cost is harder to justify in cross-domain settings, where source and target datasets differ in imaging hardware, patient population, or clinical site. To address this, we propose XSSR (Cross-Domain Self-Supervised Representative Selection), a framework that reduces labeling effort in the target domain while maintaining segmentation performance. The XSSR pipeline has three stages. First, a Masked Autoencoder (MAE) is trained on unlabeled source images to build a shared embedding space, with no target labels required. Second, a greedy algorithm scores unlabeled target samples with a composite criterion that combines density, novelty, and diversity. Third, a U-Net segmentation model is trained only on the selected subset. The parameter alpha, which balances novelty against diversity, is calibrated automatically by minimizing embedding-space coverage, so it needs no manual tuning. We evaluate XSSR on three public benchmarks (Chest X-ray, RIGA+ retinal fundus imaging, and multi-site Prostate MRI), each under a fixed 5% annotation budget. XSSR reaches 99.3% of full-data performance on Chest X-ray with only 22 labeled samples, improves on random selection by up to 2.5 Dice points on Prostate MRI, and consistently outperforms the CoreSet baseline by 0.4 to 1.2 Dice points across all datasets. Ablation studies indicate that diversity is the most influential scoring component, and per-site analysis shows that performance correlates with scanner similarity to the source domain.
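The second stage and the alpha calibration can be illustrated with a small sketch. The abstract does not give the scoring formulas, so everything below is an assumption made for illustration: density is taken as closeness to the k nearest target neighbours, novelty as distance to the nearest source embedding, diversity as distance to the nearest already-selected sample, and the composite score as density + alpha * novelty + (1 - alpha) * diversity. "Minimizing embedding-space coverage" is read here as minimizing the coverage radius over a small grid of alpha values, which is only one possible interpretation. The function names (greedy_select, coverage_radius, calibrate_alpha) are hypothetical, and random vectors stand in for MAE embeddings.

```python
import numpy as np

def pairwise_dist(a, b):
    """Euclidean distances between rows of a (n, d) and rows of b (m, d)."""
    sq = (a ** 2).sum(1)[:, None] + (b ** 2).sum(1)[None, :] - 2.0 * a @ b.T
    return np.sqrt(np.maximum(sq, 0.0))

def minmax(x):
    """Rescale to [0, 1]; a constant vector maps to all zeros."""
    span = x.max() - x.min()
    return (x - x.min()) / span if span > 0 else np.zeros_like(x)

def greedy_select(target, source, budget, alpha=0.5, k=5):
    """Pick `budget` target indices greedily (assumed scoring, see lead-in)."""
    n = len(target)
    if n < 2 or not 1 <= budget <= n:
        raise ValueError("need at least 2 target samples and 1 <= budget <= n")
    d_tt = pairwise_dist(target, target)
    d_ts = pairwise_dist(target, source)
    k = min(k, n - 1)
    # Density (assumed): small mean distance to the k nearest target
    # neighbours gives a high score. Column 0 after sorting is the sample itself.
    knn = np.sort(d_tt, axis=1)[:, 1:k + 1]
    density = minmax(-knn.mean(axis=1))
    # Novelty (assumed): distance to the closest source embedding.
    novelty = minmax(d_ts.min(axis=1))
    selected = []
    available = np.ones(n, dtype=bool)
    min_to_sel = np.full(n, np.inf)
    for _ in range(budget):
        # Diversity (assumed): distance to the nearest already-selected sample.
        diversity = minmax(min_to_sel) if selected else np.zeros(n)
        score = density + alpha * novelty + (1.0 - alpha) * diversity
        score[~available] = -np.inf  # never pick the same sample twice
        pick = int(np.argmax(score))
        selected.append(pick)
        available[pick] = False
        min_to_sel = np.minimum(min_to_sel, d_tt[:, pick])
    return selected

def coverage_radius(target, selected):
    """Largest distance from any target sample to its nearest selected sample."""
    return float(pairwise_dist(target, target[selected]).min(axis=1).max())

def calibrate_alpha(target, source, budget, grid=(0.0, 0.25, 0.5, 0.75, 1.0)):
    """Grid search: keep the alpha whose selection has the smallest radius."""
    radii = [coverage_radius(target, greedy_select(target, source, budget, a))
             for a in grid]
    return grid[int(np.argmin(radii))]

# Usage with synthetic stand-ins for MAE embeddings.
rng = np.random.default_rng(0)
source_emb = rng.normal(size=(40, 8))
target_emb = rng.normal(loc=0.5, size=(60, 8))
budget = max(1, round(0.05 * len(target_emb)))  # 5% annotation budget
best_alpha = calibrate_alpha(target_emb, source_emb, budget)
chosen = greedy_select(target_emb, source_emb, budget, alpha=best_alpha)
```

In a real pipeline, the indices in chosen would identify the target images sent for annotation and then used to train the segmentation model. The min-max rescaling is a design choice of this sketch so that the three terms are on a comparable scale; the paper may combine them differently.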


Source: arXiv
Generated at: 2026-06-04 00:00:00 UTC
