arXiv

Beyond Visual Fidelity: Benchmarking Super-Resolution Models for Large-Scale Remote Sensing Imagery via Downstream Task Integration

Title: Prioritizing Task Utility Over Visual Fidelity: A Downstream-Integrated Benchmark for Large-Scale Remote Sensing Super-Resolution

Abstract:

While super-resolution (SR) methods have significantly advanced the reconstruction of high-resolution imagery from low-resolution sources, current evaluation standards often fall short of capturing their real-world value. Although higher resolution offers visual clarity and aids in monitoring, existing SR research and benchmarks predominantly rely on fidelity metrics like PSNR and SSIM. This approach overlooks the primary purpose of super-resolved images: to enhance downstream applications such as change detection, biomass estimation, and land cover classification. To address this disconnect, we present GeoSR-Bench, a novel benchmark dataset designed to evaluate SR models through the lens of downstream task integration rather than mere visual fidelity.
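For readers unfamiliar with the fidelity metrics mentioned above, the following is a minimal sketch of PSNR in its standard form, 10 · log10(MAX² / MSE). The function name `psnr`, the flat-sequence image representation, and the default peak value of 255 are illustrative assumptions and are not taken from GeoSR-Bench.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Images are given as flat, non-empty sequences of pixel values.
    max_value is the assumed peak pixel value (255 for 8-bit data).
    """
    if not reference or len(reference) != len(estimate):
        raise ValueError("images must be non-empty and of equal size")
    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

The score depends only on pixel-wise error against a reference image, so it carries no direct information about how a downstream model behaves on the super-resolved output.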

GeoSR-Bench features high-quality, temporally aligned, and spatially co-located image pairs derived from approximately 36,000 diverse locations. The dataset covers a wide range of land cover types and resolutions, extending from 500 m down to 0.6 m. To our knowledge, this is the first SR benchmark that explicitly links the resolution improvements achieved by SR models to their effectiveness in Earth monitoring tasks, including infrastructure mapping, biophysical variable estimation, and land cover segmentation.

We used GeoSR-Bench to assess the perceptual quality and downstream performance of various SR architectures, including GANs, transformers, neural operators, and diffusion-based models. Our experimental framework comprised 270 distinct settings (2 × 9 × 3 × 5): two cross-platform SR tasks, nine SR models, three downstream task models, and five downstream tasks per SR task. The findings reveal a critical insight: improvements in traditional SR metrics do not necessarily translate to better task performance; in some cases, the correlation is even negative. This suggests that conventional fidelity metrics offer limited utility for selecting models intended for downstream applications. Consequently, these results underscore the need to incorporate downstream task objectives into both the development and evaluation of SR models.
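The setting count and the idea of a negative metric-to-task correlation can be sketched as follows. This is only an illustration: the abstract gives counts, not names, so every label below is a placeholder; reusing the same five downstream-task labels for both SR tasks is a simplification; Spearman rank correlation is an assumed choice of statistic, not necessarily the one used in the paper; and the toy scores are made up purely to exercise the code.

```python
from itertools import product

# Placeholder labels: only the counts come from the abstract.
sr_tasks = [f"sr_task_{i}" for i in range(2)]
sr_models = [f"sr_model_{i}" for i in range(9)]
task_models = [f"task_model_{i}" for i in range(3)]
downstream_tasks = [f"downstream_task_{i}" for i in range(5)]

# One evaluation setting per combination of the four factors.
settings = list(product(sr_tasks, sr_models, task_models, downstream_tasks))
print(len(settings))  # 2 * 9 * 3 * 5 = 270


def ranks(values):
    """Rank values from 1 to n (smallest first), assuming no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman(x, y):
    """Spearman rank correlation for tie-free samples of equal length."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1.0 - 6.0 * d2 / (n * (n ** 2 - 1))


# Made-up scores for four hypothetical SR models (not results from the paper).
toy_psnr = [24.0, 26.0, 28.0, 30.0]
toy_task_score = [0.70, 0.66, 0.68, 0.61]
print(spearman(toy_psnr, toy_task_score))  # negative: higher PSNR, lower task score
```

A rank-based statistic is used here because it asks only whether models ordered by a fidelity metric keep the same order on the downstream task, which is the question that matters for model selection.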


Source: arXiv Generated at: 2026-06-02 00:00:00 UTC
