arXiv

From Local Training to Large-Scale Mapping: A Comparative Assessment of Machine Learning and Deep Learning for Transferable Satellite-Derived Bathymetry

June 3, 2026 · Hsiao-Jou Hsu, Joachim Moortgat

Title: Scaling Up Satellite Bathymetry: A Comparative Study of Machine Learning and Deep Learning Approaches for Transferable Applications

Abstract:

Satellite-derived bathymetry (SDB) from multispectral imagery is cost-efficient, but its accuracy often degrades when a model is applied across geographic regions, particularly in optically complex coastal zones. This study assesses machine learning and deep learning techniques for transferable SDB at depths of 0 to 20 meters using Sentinel-2 data. We benchmarked a Random Forest model against four convolutional neural networks (CNNs): ResNet-50, ResNet-101, EfficientNet-B4, and ConvNeXt-Large. The models were trained on data from Pratas Island and selected areas of the Great Barrier Reef, then tested on spatially independent areas both within and across these regions.
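Holding out spatially independent areas can be illustrated with a block-based split. The following is a minimal sketch, not the authors' procedure: the function name `block_split`, the square-tile layout, the block size, and the test fraction are all assumptions chosen for illustration.

```python
import numpy as np

def block_split(rows, cols, block_size, test_fraction=0.25, seed=0):
    """Assign samples to train/test by contiguous spatial block.

    rows, cols: integer pixel coordinates of each sample.
    block_size: side length of a square block in pixels (illustrative value).
    Returns a boolean mask that is True for test samples.
    """
    rows = np.asarray(rows)
    cols = np.asarray(cols)
    # All samples inside the same block_size x block_size tile share one id.
    block_row = rows // block_size
    block_col = cols // block_size
    n_block_cols = block_col.max() + 1
    block_id = block_row * n_block_cols + block_col

    # Hold out whole blocks so train and test never share a tile.
    unique_blocks = np.unique(block_id)
    rng = np.random.default_rng(seed)
    n_test = max(1, int(round(test_fraction * unique_blocks.size)))
    test_blocks = rng.choice(unique_blocks, size=n_test, replace=False)
    return np.isin(block_id, test_blocks)

# Toy 8 x 8 grid split into four 4 x 4 blocks; one block is held out.
rr, cc = np.meshgrid(np.arange(8), np.arange(8), indexing="ij")
is_test = block_split(rr.ravel(), cc.ravel(), block_size=4)
```

Unlike a per-pixel random split, this keeps neighboring, highly correlated pixels on the same side of the train/test boundary.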

Our analysis identifies preserving spatial continuity during training as the most critical design factor: training on contiguous reef blocks instead of random image patches significantly improved model performance. Additionally, we implemented a Smooth Weight Function (SWF)-weighted RMSE loss to prioritize accuracy in near-surface waters. Under these optimized conditions, intra-regional RMSE values ranged from 1.15 to 1.92 meters across the 0–20 meter depth range, dropping as low as 0.26 meters for depths between 2.99 and 3.78 meters. While deep learning models exhibited slightly higher RMSE values (2.46–2.98 meters), they demonstrated greater robustness.
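A depth-weighted RMSE of the kind described above can be sketched as follows. The abstract does not give the form of the Smooth Weight Function, so `smooth_depth_weight` below is a hypothetical logistic weighting, and its parameters (`boost`, `midpoint`, `scale`) are illustrative assumptions rather than the authors' values.

```python
import numpy as np

def smooth_depth_weight(depth, boost=2.0, midpoint=5.0, scale=1.0):
    """HYPOTHETICAL smooth weight: larger for shallow depths, tending to 1 when deep.

    This logistic form is an assumption for illustration; it is not the SWF
    defined in the paper.
    """
    depth = np.asarray(depth, dtype=float)
    return 1.0 + boost / (1.0 + np.exp((depth - midpoint) / scale))

def weighted_rmse(pred, true, weights):
    """Square root of the weight-normalized mean squared error."""
    pred = np.asarray(pred, dtype=float)
    true = np.asarray(true, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return float(np.sqrt(np.sum(weights * (pred - true) ** 2) / np.sum(weights)))

# The same 1 m error costs more at 1 m depth than at 15 m depth.
true_depth = np.array([1.0, 15.0])
w = smooth_depth_weight(true_depth)
loss_shallow_error = weighted_rmse([2.0, 15.0], true_depth, w)
loss_deep_error = weighted_rmse([1.0, 16.0], true_depth, w)
```

With uniform weights the function reduces to the ordinary RMSE, so the weighting only changes how errors at different depths are traded off.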

Furthermore, we tested our proposed networks on the public MagicBathyNet aerial-RGB benchmark (0–16 meters), where they achieved RMSEs of 0.19–0.22 meters. This performance surpassed both a U-Net baseline and a task-specific transformer architecture, despite using substantially fewer parameters. To address environmental variability, we also leveraged multi-temporal repeat imagery. Training on such imagery increased the diversity of the training data, while median-aggregating predictions across multiple passes at inference time mitigated noise caused by fluctuating sun angles, atmospheric conditions, water properties, and tides. To facilitate scalable transfer to new locations, we have released the optimized architectures and pretrained weights.
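Median aggregation over repeat passes can be sketched as a per-pixel median over a stack of co-registered depth predictions. This is a minimal sketch under stated assumptions: the array layout `(n_passes, height, width)`, the use of NaN for masked pixels (for example, cloud), and the function name `median_composite` are illustrative and not taken from the released code.

```python
import numpy as np

def median_composite(depth_stack):
    """Per-pixel median over passes, ignoring NaN (masked) predictions.

    depth_stack: array of shape (n_passes, height, width).
    """
    depth_stack = np.asarray(depth_stack, dtype=float)
    return np.nanmedian(depth_stack, axis=0)

# Three passes over a 1 x 2 scene: one outlier and one masked pixel.
passes = np.array([
    [[4.0, 10.0]],
    [[4.2, np.nan]],   # second pixel masked in this pass
    [[9.0, 10.4]],     # first pixel is an outlier in this pass
])
composite = median_composite(passes)
```

The median is preferred over the mean here because a single bad pass (the 9.0 m outlier above) does not shift the composite.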
