arXiv

CoralBay: A Self-Supervised CT Foundation Model

June 3, 2026 · Ioannis Gatopoulos, Nicolas Känzig, Sebastian Otálora, Fei Tang

Abstract:

While self-supervised learning has successfully driven large-scale pre-training on 2D natural images to generate versatile visual representations that generalize well across various tasks, medical imaging presents unique challenges. Computed Tomography (CT) scans, for example, are inherently three-dimensional and differ fundamentally from natural images in both structural composition and semantic content. Volumetric data captures critical attributes such as spatial continuity, organ anatomy, and intensity-based tissue characteristics (e.g., Hounsfield Units), which models pre-trained on 2D images fail to adequately represent.

To address this limitation, we present CoralBay, a self-distillation framework that builds upon DINO by incorporating a hierarchical 3D Swin backbone. This approach applies self-distillation to concatenated multi-scale features, facilitating data-efficient self-supervised learning that yields rich spatial representations. These representations effectively encode both fine-grained local structures and global semantics. Consequently, CoralBay demonstrates robust and consistent transferability to a broad spectrum of downstream radiological tasks across diverse anatomical regions.
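As a rough illustration of the idea described above, the following sketch pools the feature map of each backbone stage, concatenates the pooled vectors into one multi-scale descriptor, and computes a DINO-style self-distillation loss between student and teacher outputs. It is a minimal sketch under stated assumptions, not the CoralBay implementation: the function names, the global-average pooling step, the temperatures, and the momentum value are all illustrative choices, and the backbone and projection head are replaced by toy lists of numbers.

```python
import math

def global_avg_pool(tokens):
    """Average one stage's token vectors (list of equal-length lists) into one vector."""
    n = len(tokens)
    return [sum(t[c] for t in tokens) / n for c in range(len(tokens[0]))]

def concat_multiscale(stages):
    """Pool every stage and concatenate the results into a single descriptor.

    Assumption: pooling before concatenation; the paper may combine scales differently.
    """
    descriptor = []
    for tokens in stages:
        descriptor.extend(global_avg_pool(tokens))
    return descriptor

def softmax(logits, temperature):
    """Numerically stable softmax with temperature scaling."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    z = sum(exps)
    return [e / z for e in exps]

def log_softmax(logits, temperature):
    """Numerically stable log-softmax with temperature scaling."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    lse = m + math.log(sum(math.exp(x - m) for x in scaled))
    return [x - lse for x in scaled]

def dino_loss(student_logits, teacher_logits, center,
              t_student=0.1, t_teacher=0.04):
    """Cross-entropy between the centered, sharpened teacher distribution
    and the student distribution (temperatures are illustrative)."""
    p_teacher = softmax([t - c for t, c in zip(teacher_logits, center)], t_teacher)
    log_p_student = log_softmax(student_logits, t_student)
    return -sum(pt * lps for pt, lps in zip(p_teacher, log_p_student))

def ema_update(teacher_weights, student_weights, momentum=0.996):
    """Teacher weights follow the student as an exponential moving average."""
    return [momentum * t + (1.0 - momentum) * s
            for t, s in zip(teacher_weights, student_weights)]

# Toy example: two stages with different token counts and channel widths.
stages = [
    [[1.0, 3.0], [3.0, 5.0]],   # fine scale: 2 tokens, 2 channels
    [[0.0, 2.0, 4.0, 6.0]],     # coarse scale: 1 token, 4 channels
]
descriptor = concat_multiscale(stages)  # 2 + 4 = 6 values
```

In a real training loop the descriptor would pass through a projection head to produce the logits, the student and teacher would see different augmented crops of the same CT volume, and only the student would receive gradients.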

Furthermore, we enhance the open-source eva framework by launching a public, reproducible 3D radiology leaderboard. This initiative consolidates multiple datasets to create a standardized benchmark for evaluating methods in volumetric representation learning.
