arXiv

ActMVS: Active Scene Reconstruction with Monocular Multi-View Stereo

June 2, 2026 · Guo Pu, Yixuan Han, Zhouhui Lian

Abstract:

Active scene reconstruction lets robots and UAVs plan their own trajectories and map their surroundings, removing the need for costly manual data collection. Unlike passive techniques, active reconstruction must build high-confidence occupancy maps in real time to keep navigation collision-free. Traditional methods rely on depth sensors to update these maps, which adds significant cost and weight to the platform; our goal is to advance spatial intelligence with a vision-only, monocular approach. However, existing monocular reconstruction techniques are limited to offline processing and cannot produce globally consistent dense depth at the frame rates that robotic or UAV navigation requires. To address this limitation, we present ActMVS, the first framework for monocular active reconstruction. By constructing a view factor graph to guide Multi-View Stereo depth prediction and applying global depth optimization, our system produces high-quality, globally consistent dense depth maps online. This allows monocular robots and UAVs to maintain reliable occupancy maps for safe trajectory planning throughout reconstruction. Evaluations on the Replica dataset show that our approach performs competitively with RGB-D methods. Code and data are available at https://github.com/TrickyGo/ActMVS.
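
To make the link between dense depth maps and occupancy maps concrete, here is a minimal sketch of a generic log-odds occupancy update driven by a single depth map. It is not the ActMVS implementation, which the abstract does not describe at this level. The function names, the log-odds constants, the sparse dictionary grid, and the translation-only camera pose are all assumptions chosen for illustration.

```python
import math

# Illustrative log-odds increments and clamping bounds (assumed values).
L_FREE, L_OCC = -0.4, 0.85
L_MIN, L_MAX = -2.0, 3.5


def voxel_key(p, voxel):
    """Map a 3D point to the integer index of the voxel that contains it."""
    return tuple(math.floor(c / voxel) for c in p)


def integrate_depth(grid, depth, fx, fy, cx, cy, cam_pos, voxel=0.1):
    """Fuse one depth map into a sparse log-odds grid (dict: voxel index -> log-odds).

    Assumes a pinhole camera whose axes are aligned with the world frame
    (translation-only pose); a full system would also apply a rotation.
    """
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z is None or z <= 0.0:
                continue  # skip pixels without a valid depth
            # Back-project pixel (u, v) with depth z to a 3D end point.
            end = (cam_pos[0] + (u - cx) * z / fx,
                   cam_pos[1] + (v - cy) * z / fy,
                   cam_pos[2] + z)
            hit = voxel_key(end, voxel)
            # Sample the ray at roughly half-voxel spacing; every voxel
            # crossed before the hit voxel is observed as free.
            n = max(1, int(math.dist(cam_pos, end) / (0.5 * voxel)))
            free = set()
            for i in range(n):
                t = i / n
                p = tuple(c + t * (e - c) for c, e in zip(cam_pos, end))
                k = voxel_key(p, voxel)
                if k != hit:
                    free.add(k)
            for k in free:
                grid[k] = max(L_MIN, grid.get(k, 0.0) + L_FREE)
            grid[hit] = min(L_MAX, grid.get(hit, 0.0) + L_OCC)
    return grid


def is_occupied(grid, key, threshold=0.0):
    """Voxels never observed are absent from the grid and read as log-odds 0."""
    return grid.get(key, 0.0) > threshold


# Usage: a 1x1 depth map looking down +z from (0.05, 0.05, 0.05).
grid = integrate_depth({}, [[1.0]], fx=1.0, fy=1.0, cx=0.0, cy=0.0,
                       cam_pos=(0.05, 0.05, 0.05))
```

In this sketch, repeated observations push a voxel's log-odds toward the clamping bounds, so a confident map depends on depth that stays consistent across views. A planner would typically also treat unobserved voxels as unsafe rather than free.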
