LaVIDE: Language-Prompted Satellite Change Detection via Map-Image Alignment
Title: LaVIDE: Leveraging Language-Prompted Map-Image Alignment for Satellite Change Detection
Abstract:
When prior imagery is unavailable, pairing a reference map with a current satellite image still enables timely monitoring of the Earth's surface. However, the semantic gap between high-level map categories and low-level image details hinders the extraction of the homogeneous features needed for robust temporal association in change detection. To address this, we introduce LaVIDE (Language-VIsion Discriminator for dEtecting changes), a novel framework that uses language as a bridge between map semantics and image details. This approach differs from traditional methods, which typically either compare pixel-level visual similarity or propagate segmentation errors into the change result.
Our method incorporates two key innovations: restricted prompt learning, which generates context-aware textual prompts to align map semantics with image content, and object-aware embedding enhancement, which integrates object-level attributes, such as shape and boundaries, into map representations. Together, these components facilitate robust cross-modal alignment within a unified language-vision feature space.
We evaluated LaVIDE across four benchmarks: DynamicEarthNet, HRSCD, BANDON, and SECOND. The results show that LaVIDE significantly outperforms state-of-the-art techniques, delivering improvements of $18.4\%$ in Intersection over Union (IoU) for multi-class change detection and $5.2\%$ for single-class tasks. By enhancing the accuracy of map-image change detection, our framework offers a practical solution for rapidly updating maps with minimal human intervention. This advancement holds significant potential for applications in urban planning, disaster assessment, and ecological conservation.
The code and datasets are accessible at: https://github.com/ShuGuoJ/LAVIDE.git.
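For readers unfamiliar with the metric quoted in the abstract, the following is a minimal sketch of Intersection over Union (IoU) for change masks. It is not taken from the LaVIDE repository: the function names `change_iou` and `mean_iou`, the toy arrays, and the convention that two empty masks score 1.0 are all assumptions made for illustration, and the individual benchmarks may define their own variants of the metric.

```python
import numpy as np


def change_iou(pred, ref):
    """IoU between two binary change masks (nonzero = changed pixel)."""
    pred = np.asarray(pred, dtype=bool)
    ref = np.asarray(ref, dtype=bool)
    intersection = np.logical_and(pred, ref).sum()
    union = np.logical_or(pred, ref).sum()
    # Assumed convention: two empty masks agree perfectly.
    if union == 0:
        return 1.0
    return float(intersection) / float(union)


def mean_iou(pred_labels, ref_labels, num_classes):
    """Mean per-class IoU for integer label maps.

    Classes absent from both maps are skipped so they do not
    inflate or deflate the average (an illustrative choice).
    """
    pred_labels = np.asarray(pred_labels)
    ref_labels = np.asarray(ref_labels)
    scores = []
    for c in range(num_classes):
        pred_c = pred_labels == c
        ref_c = ref_labels == c
        if not pred_c.any() and not ref_c.any():
            continue
        scores.append(change_iou(pred_c, ref_c))
    return float(np.mean(scores)) if scores else 0.0


# Toy single-class example: 2 pixels overlap, 4 pixels in the union.
pred_mask = [[1, 1, 0], [0, 1, 0]]
ref_mask = [[1, 0, 0], [0, 1, 1]]
single_class_iou = change_iou(pred_mask, ref_mask)

# Toy multi-class example with three hypothetical categories.
pred_map = [[0, 1, 1], [2, 2, 0]]
ref_map = [[0, 1, 2], [2, 2, 0]]
multi_class_iou = mean_iou(pred_map, ref_map, num_classes=3)
```

In the toy single-class case the masks share 2 changed pixels out of 4 in their union, so the IoU is 0.5. The multi-class variant simply averages the per-class scores, which is one common way to report a single figure for multi-class change detection.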
Source: arXiv
Generated at: 2026-06-04 00:00:00 UTC






