Plan2Map: A Multimodal Benchmark for Document-Grounded Geospatial Boundary Reconstruction from Planning Records
Abstract
Planning documents impose restrictions on specific geographic areas, but the source materials typically provide only indirect spatial cues rather than machine-readable boundaries. To address this, we present Plan2Map, a multimodal benchmark of 208 cases for reconstructing geospatial boundaries from UK planning records. Systems must produce a valid geospatial boundary from the source planning document alone, drawing on its notice text, schedules, map plates, map labels, and boundary annotations. Reference GeoJSON files are withheld and used only for scoring.
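As a rough illustration of what a "valid" boundary could mean at the structural level, the sketch below checks a GeoJSON Polygon against the linear-ring rules of RFC 7946 (each ring closed, at least four positions, longitude/latitude in range). The function name and the example coordinates are hypothetical, and the benchmark's actual validity criteria are not stated here and may well be stricter (for example, rejecting self-intersections).

```python
import json


def is_valid_polygon_geometry(geometry: dict) -> bool:
    """Structural check of a GeoJSON Polygon (hypothetical helper, not the benchmark's validator)."""
    if not isinstance(geometry, dict) or geometry.get("type") != "Polygon":
        return False
    rings = geometry.get("coordinates")
    if not isinstance(rings, list) or not rings:
        return False
    for ring in rings:
        # RFC 7946: a linear ring has four or more positions and is closed.
        if not isinstance(ring, list) or len(ring) < 4:
            return False
        if ring[0] != ring[-1]:
            return False
        for position in ring:
            if not isinstance(position, list) or len(position) < 2:
                return False
            lon, lat = position[0], position[1]
            if not all(isinstance(v, (int, float)) for v in (lon, lat)):
                return False
            # GeoJSON positions are ordered longitude, latitude (WGS 84).
            if not (-180 <= lon <= 180 and -90 <= lat <= 90):
                return False
    return True


# Illustrative square with made-up coordinates, not a benchmark case.
example = json.loads(
    '{"type": "Polygon", "coordinates": [[[-0.10, 51.50], [-0.09, 51.50],'
    ' [-0.09, 51.51], [-0.10, 51.51], [-0.10, 51.50]]]}'
)
print(is_valid_polygon_geometry(example))
```

A check of this kind only screens out malformed output; it says nothing about whether the polygon is in the right place, which is what the IoU-based scoring measures.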
We introduce GeoPlanAgent, a document-grounded system that integrates geospatial tools. It decomposes the task into six stages: evidence extraction, localization, map registration, boundary segmentation, projection, and verification. On Plan2Map, GeoPlanAgent substantially outperforms baselines that prompt a Vision-Language Model (VLM) to output GeoJSON directly, achieving a mean Intersection over Union (IoU) of 0.736 and a median IoU of 0.904, with 67.8% of its predictions reaching an IoU of 0.8 or higher.
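To make the reported statistics concrete, here is a minimal sketch of how IoU and the three summary figures (mean, median, share of cases at or above 0.8) can be computed. It assumes each boundary has been rasterized to a set of grid cells, which is a simplification; the benchmark presumably scores polygon geometries, and the helper names and toy rectangles below are purely illustrative.

```python
from statistics import mean, median


def rect_cells(x0: int, y0: int, x1: int, y1: int) -> set:
    """Grid cells covered by the half-open rectangle [x0, x1) x [y0, y1)."""
    return {(x, y) for x in range(x0, x1) for y in range(y0, y1)}


def iou(pred: set, ref: set) -> float:
    """Intersection over Union of two cell sets; 0.0 if both are empty."""
    union = pred | ref
    if not union:
        return 0.0
    return len(pred & ref) / len(union)


def summarize(scores: list, threshold: float = 0.8) -> dict:
    """Mean IoU, median IoU, and the share of cases at or above the threshold."""
    return {
        "mean_iou": mean(scores),
        "median_iou": median(scores),
        "share_at_threshold": sum(s >= threshold for s in scores) / len(scores),
    }


# Toy reference boundary and three hypothetical predictions.
reference = rect_cells(0, 0, 10, 10)
predictions = [
    rect_cells(0, 0, 10, 10),  # exact match
    rect_cells(0, 0, 10, 9),   # slightly too small
    rect_cells(5, 0, 15, 10),  # shifted, e.g. by a registration error
]
scores = [iou(p, reference) for p in predictions]
summary = summarize(scores)
print(scores)
print(summary)
```

In this toy example a single badly shifted prediction pulls the mean well below the median. A gap of the same kind between the reported mean (0.736) and median (0.904) would be consistent with a minority of low-IoU cases, although that reading is an inference from the summary figures alone.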
Diagnostic analysis shows that direct VLM predictions remain inconsistent, whereas the primary sources of error for our system lie in localization and map registration. We also find that supervised boundary segmentation markedly improves the quality of pixel-level masks. Plan2Map provides a testbed for advancing multimodal geospatial reconstruction from public planning records.
Project page: https://odeb1.github.io/Plan2Map_Project_Page/.
Source: arXiv Generated at: 2026-06-03 00:00:00 UTC



