arXiv

Cryo-Bench: Benchmarking Foundation Models for Cryosphere Applications

June 3, 2026 · Saurabh Kaushik, Lalit Maurya, Beth Tellman, Valerio Marsocci

Abstract:

While Geo-Foundation Models (GFMs) have demonstrated significant potential in generating reliable maps even from sparse labeling across various Earth observation domains, their application to the cryosphere has been hindered by a scarcity of appropriate evaluation datasets. To bridge this gap, we present Cryo-Bench, a comprehensive benchmark designed to assess GFM performance across critical cryospheric elements. This benchmark encompasses debris-covered glaciers, glacial lakes, sea ice, and calving fronts, drawing data from diverse sensors and wide geographic areas.

We conducted a comparative analysis of 14 GFMs against UNet and ViT baselines to identify their respective strengths, weaknesses, and optimal deployment strategies. Our results indicate that when employing a frozen encoder, UNet secures the highest mean Intersection over Union (mIoU) of 66.38, with TerraMind following closely at 64.02 across the five datasets comprising Cryo-Bench.
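
For readers unfamiliar with the metric, the sketch below shows one common way to compute mIoU from predicted and reference label maps. The function name `mean_iou`, the flat-list input, and the choice to average only over classes that occur in either map are assumptions made for illustration; the abstract does not state the exact averaging protocol used in Cryo-Bench.

```python
def mean_iou(pred, target, num_classes):
    """Mean Intersection over Union, in percent, for two flat label sequences.

    Assumption: classes absent from both pred and target are skipped
    rather than counted as 0 or 1. The benchmark's own rule may differ.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    ious = []
    for c in range(num_classes):
        # Pixels where both maps agree on class c.
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels where at least one map says class c.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union:
            ious.append(inter / union)

    return 100.0 * sum(ious) / len(ious) if ious else float("nan")


# Toy example with two classes (e.g. 0 = background, 1 = glacial lake).
# Class 0: IoU = 1/2, class 1: IoU = 2/3, so mIoU is about 58.33.
print(round(mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2), 2))
```

A benchmark-level score such as the 66.38 reported above would then be an aggregate of per-dataset values, though how the five datasets are weighted is not specified here.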

In few-shot scenarios utilizing only 10% of the input data, foundation models such as DOFA and TerraMind surpassed the standard UNet. Specifically, DOFA, TerraMind, and UNet achieved mIoU scores of 59.53, 56.62, and 56.60, respectively. Conversely, fully fine-tuning GFMs yielded inconsistent results across different models and datasets. However, adjusting the learning rate in conjunction with fine-tuning significantly boosted GFM efficacy. For instance, evaluations on two key datasets, GLID and CaFFe, revealed an average relative performance gain of 12.77%.
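
To make "relative performance gain" concrete, here is a minimal sketch of the usual definition, applied to the few-shot mIoU scores quoted above. The helper name `relative_gain` is hypothetical, and the per-dataset GLID and CaFFe scores behind the 12.77% average are not given in the abstract, so they are not reproduced here.

```python
def relative_gain(baseline, improved):
    """Relative gain of `improved` over `baseline`, in percent."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return 100.0 * (improved - baseline) / baseline


# Few-shot mIoU scores taken from the abstract (10% of the data).
unet, terramind, dofa = 56.60, 56.62, 59.53

# DOFA over UNet: (59.53 - 56.60) / 56.60, roughly 5.18%.
print(round(relative_gain(unet, dofa), 2))

# TerraMind over UNet: a margin of 0.02 mIoU points, roughly 0.04%.
print(round(relative_gain(unet, terramind), 2))
```

An average relative gain across datasets, as reported for GLID and CaFFe, would presumably be the mean of such per-dataset values.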

Notably, despite limited representation of cryospheric data in their pretraining corpora, GFMs demonstrated robust domain adaptation capabilities, delivering meaningful outcomes across various tasks. Based on these insights, we recommend fine-tuning the encoder accompanied by hyperparameter optimization for achieving peak performance. Alternatively, for users requiring rapid results without extensive experimentation, utilizing frozen encoders is advised.

GitHub: https://github.com/Sk-2103/Cryo-Bench