arXiv

CRISP -- Clustering-Based Redundancy-Reduced Instance Sampling for Pathology Case Representation and Retrieval

Title: CRISP: A Clustering-Driven Method for Reducing Redundancy in Instance Sampling for Pathological Case Representation and Retrieval

Abstract:

Digital pathology repositories hold a growing number of whole-slide images (WSIs) per case. Together, these slides capture spatially distinct tumor regions and expose the morphological heterogeneity of the disease. Despite this richness, prevailing methods typically depend on a single slide chosen by a pathologist, so the evidence contained in the remaining WSIs is lost. To date, no automated system capable of comprehensive multi-WSI case processing has been introduced. In this study, we propose an unsupervised framework for case-level analysis that synthesizes information from every available slide within a case. Instead of limiting analysis to one designated slide, our approach builds case-level representations by selectively extracting informative patches from across all of the case's WSIs.

We introduce CRISP (Clustering-Based Redundancy-Reduced Instance Sampling for Pathology), a two-stage methodology. First, it minimizes redundancy within individual WSIs, and then it employs clustering-based sampling to identify a compact yet representative collection of patches for the entire case. This resulting patch set effectively encapsulates case-level heterogeneity without the computational burden of processing gigapixel images in their entirety, while simultaneously functioning as a direct retrieval index.
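
The two-stage idea described above can be illustrated with a minimal sketch. The abstract does not specify the redundancy criterion, the clustering algorithm, or the patch encoder, so everything here is an assumption made for illustration: patches are assumed to be already embedded as vectors, stage one uses a greedy cosine-similarity filter, and stage two uses a plain k-means that keeps the patch nearest each centroid. The names `reduce_redundancy`, `kmeans`, `sample_case`, `sim_threshold`, and `k` are hypothetical and are not taken from CRISP.

```python
import numpy as np

def l2_normalize(x):
    """Scale each row to unit length so dot products equal cosine similarity."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.clip(norms, 1e-12, None)

def reduce_redundancy(embeddings, sim_threshold=0.95):
    """Stage 1 (assumed rule): keep a patch only if its cosine similarity
    to every already-kept patch of the same slide is below the threshold."""
    unit = l2_normalize(embeddings)
    kept = []
    for i in range(len(unit)):
        if not kept or np.max(unit[kept] @ unit[i]) < sim_threshold:
            kept.append(i)
    return np.array(kept, dtype=int)

def kmeans(x, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means; a stand-in for whatever clustering is really used."""
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(n_iter):
        dists = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = x[labels == j]
            if len(members):  # leave an empty cluster's center unchanged
                centers[j] = members.mean(axis=0)
    return centers

def sample_case(slides, k=4, sim_threshold=0.95):
    """Stage 2 (assumed rule): pool the filtered patches of all slides,
    cluster them, and keep the patch closest to each cluster center.

    slides: list of (n_patches, dim) float arrays, one array per WSI.
    Returns the (slide_index, patch_index) origins and their embeddings."""
    pool, origin = [], []
    for s, emb in enumerate(slides):
        keep = reduce_redundancy(emb, sim_threshold)
        pool.append(emb[keep])
        origin.extend((s, int(p)) for p in keep)
    pool = np.vstack(pool)
    centers = kmeans(pool, min(k, len(pool)))
    dists = np.linalg.norm(pool[:, None, :] - centers[None, :, :], axis=2)
    chosen = sorted({int(i) for i in dists.argmin(axis=0)})
    return [origin[i] for i in chosen], pool[chosen]

# Toy case: slide 0 contains a duplicated patch, slide 1 contains two identical ones.
slides = [
    np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]),
    np.array([[-1.0, 0.0], [-1.0, 0.0]]),
]
picked, picked_embeddings = sample_case(slides, k=3)
print(picked)
```

In this toy case the duplicates are dropped within each slide before pooling, so the case is summarized by three patches drawn from both slides. A real pipeline would tune the threshold and the number of clusters per dataset.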

We evaluated CRISP on two breast cancer datasets from Mayo Clinic, focusing on diagnosis and treatment planning. For patient and case search and retrieval, CRISP consistently performs on par with or better than the current standard practice, which combines model-based and pathologist-driven slide selection. By automating case processing and removing the subjectivity of WSI selection, CRISP has the potential to unlock clinically significant information that is distributed across multiple WSIs and is currently ignored.
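
As a rough illustration of how a compact patch set could serve as a retrieval index for case search, the sketch below ranks stored cases against a query case. The abstract does not describe the scoring rule, so the one used here is an assumption: each patch is matched to its most similar patch in the other case by cosine similarity, and the two directions are averaged. The names `case_similarity`, `retrieve`, and `top_k`, along with the toy case identifiers, are hypothetical.

```python
import numpy as np

def _unit(x):
    """Scale each row to unit length so dot products equal cosine similarity."""
    x = np.asarray(x, dtype=float)
    return x / np.clip(np.linalg.norm(x, axis=1, keepdims=True), 1e-12, None)

def case_similarity(query, candidate):
    """Assumed scoring rule: symmetric mean of best-match cosine similarities
    between two sets of patch embeddings."""
    sims = _unit(query) @ _unit(candidate).T
    return 0.5 * (sims.max(axis=1).mean() + sims.max(axis=0).mean())

def retrieve(query, index, top_k=3):
    """Rank the cases in `index` (case_id -> patch embeddings) against a query case."""
    scored = [(case_id, float(case_similarity(query, patches)))
              for case_id, patches in index.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]

# Toy index with two stored cases and a one-patch query.
index = {
    "case_a": [[1.0, 0.0], [0.0, 1.0]],
    "case_b": [[-1.0, 0.0]],
}
ranking = retrieve([[1.0, 0.0]], index, top_k=2)
print(ranking)
```

The symmetric form penalizes a stored case that contains patches with no counterpart in the query, which is one possible design choice and not necessarily the one used in the paper.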


Source: arXiv Generated at: 2026-06-02 00:00:00 UTC
