Spatial Transcriptomics-Guided Alignment Enhances Molecular Profiling in Pathology Foundation Model
Abstract:
Modern precision oncology depends heavily on comprehensive molecular profiling, but its widespread adoption is hindered by high cost, limited sample availability, and long turnaround times. Pathology foundation models (PFMs) show promise in predicting molecular phenotypes from standard hematoxylin and eosin (H&E) whole-slide images (WSIs), yet existing models are predominantly built on vision-centric self-supervised learning or vision-language alignment. These approaches often fall short because they lack spatially resolved molecular supervision, which is necessary to bridge the gap between subtle morphological changes and underlying genomic mutations. Spatial transcriptomics (ST) addresses this gap by quantifying the transcriptome within intact tissue sections, thereby preserving the exact spatial relationship between histological features and molecular profiles.
In this work, we introduce the Spatial Transcriptomics-guided Alignment framework for Molecular Profiling (STAMP), a method that equips PFMs with inherent molecular understanding. To support this approach, we developed HumanST-1k, a comprehensive human ST dataset covering multiple sequencing platforms and anatomical organs. The dataset comprises 1.8 million H&E patches paired with their corresponding transcriptomic profiles, forming a corpus that connects histological structures with their molecular states. To mitigate the technical noise present in raw transcriptomic data, STAMP employs a pathway-informed alignment strategy: transcriptomic measurements are aggregated into biologically functional pathways, which are then incorporated into PFMs through parameter-efficient fine-tuning. This alignment expands the representation space of PFMs, enabling them to detect sub-visual molecular signatures. The clinical value of these enhanced representations was confirmed via a multi-tier evaluation framework.
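As a rough illustration of the pathway aggregation step described in the abstract, the sketch below collapses per-spot gene counts into per-pathway scores. The abstract does not specify how STAMP performs this aggregation, so everything here is an assumption made for illustration: the function name `pathway_scores`, the library-size normalization with a `log1p` transform, the scale factor of 10,000, and the use of a simple mean over member genes.

```python
import numpy as np


def pathway_scores(expr, gene_names, pathways):
    """Aggregate gene-level counts into pathway-level scores (illustrative only).

    expr       : (n_spots, n_genes) array of raw counts
    gene_names : list of gene symbols, one per column of expr
    pathways   : dict mapping pathway name -> list of member gene symbols
    Returns (names, scores), where scores has shape (n_spots, n_pathways).
    """
    expr = np.asarray(expr, dtype=float)
    gene_index = {g: i for i, g in enumerate(gene_names)}

    # Assumed preprocessing: scale each spot to 10,000 counts, then log1p.
    totals = expr.sum(axis=1, keepdims=True)
    totals[totals == 0] = 1.0  # avoid division by zero for empty spots
    log_norm = np.log1p(expr / totals * 1e4)

    names = sorted(pathways)
    scores = np.zeros((expr.shape[0], len(names)))
    for j, name in enumerate(names):
        # Keep only member genes that were actually measured.
        cols = [gene_index[g] for g in pathways[name] if g in gene_index]
        if cols:
            # Assumed aggregation: mean over measured member genes.
            scores[:, j] = log_norm[:, cols].mean(axis=1)
    return names, scores
```

Averaging over many genes in a pathway tends to smooth out dropout and other gene-level noise, which is the motivation the abstract gives for aligning to pathways rather than to individual genes. The resulting pathway vector for each spot could then serve as the molecular target paired with that spot's H&E patch.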
Source: arXiv
Generated at: 2026-06-03 00:00:00 UTC



