GARDEN: Gravity-Aligned Reconstruction of Disentangled ENvironments from RGB images
Abstract:
Transforming multi-view RGB data into 3D environments suitable for simulation presents significant hurdles, as current reconstruction methods typically generate monolithic scene models that lack explicit physical structure. These conventional approaches often leave scenes defined only up to an arbitrary global rotation and fail to separate rigid foreground objects from background geometry, thereby impeding stable physical interactions. While existing methods attempt to restore interactivity by substituting reconstructed items with pre-existing CAD assets, this strategy necessitates a time-consuming retrieval-and-replacement phase and compromises the geometric accuracy specific to the scene.
To address these limitations, we introduce GARDEN, a framework that operates solely on RGB input. GARDEN redefines the reconstruction process as a physically grounded scene factorization, yielding a structured hybrid scene representation. The core innovation leverages gravity as a universal physical prior. Our method first aligns the reconstruction within a unified Gravity-View frame to eliminate gauge ambiguity. Subsequently, it extracts object-centric rigid meshes with precise 6-DoF positioning and eliminates redundant object geometry from the background via conditional 3D point classification. This approach results in a representation that merges explicit rigid bodies with a decoupled background, facilitating direct physics simulation while maintaining visual fidelity. Evaluations across both simulated and real-world multi-view datasets demonstrate that GARDEN surpasses retrieval-based baselines in terms of object placement reliability, disentanglement quality, and the efficiency of rendering-simulation workflows.
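The gravity-alignment step can be illustrated with a small sketch. The abstract does not say how GARDEN estimates gravity or builds its Gravity-View frame, so the code below is only an assumption-laden illustration: it takes a gravity direction that has already been estimated in the reconstruction's arbitrary frame and computes the rotation (via Rodrigues' formula) that maps it onto a canonical "down" axis, here assumed to be -Z. The names `rotation_aligning` and `align_points` are hypothetical and do not come from the paper.

```python
import math

def _normalize(v):
    n = math.sqrt(sum(c * c for c in v))
    if n == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [c / n for c in v]

def _cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def rotation_aligning(src, dst):
    """Return a 3x3 rotation matrix (nested lists) that rotates src onto dst."""
    a, b = _normalize(src), _normalize(dst)
    v = _cross(a, b)                      # rotation axis scaled by sin(angle)
    c = sum(x * y for x, y in zip(a, b))  # cos(angle)
    s2 = sum(x * x for x in v)            # sin(angle)^2
    eye = [[1.0 if r == k else 0.0 for k in range(3)] for r in range(3)]
    if s2 < 1e-12:
        if c > 0.0:
            return eye                    # already aligned
        # Antiparallel case: rotate 180 degrees about any axis perpendicular to a.
        i = min(range(3), key=lambda k: abs(a[k]))
        e = [0.0, 0.0, 0.0]
        e[i] = 1.0
        u = _normalize(_cross(a, e))
        return [[2.0 * u[r] * u[k] - eye[r][k] for k in range(3)] for r in range(3)]
    # Rodrigues' formula with the unnormalized axis v: R = I + K + K^2 * (1 - c) / s2
    K = [[0.0, -v[2], v[1]],
         [v[2], 0.0, -v[0]],
         [-v[1], v[0], 0.0]]
    K2 = [[sum(K[r][m] * K[m][k] for m in range(3)) for k in range(3)] for r in range(3)]
    f = (1.0 - c) / s2
    return [[eye[r][k] + K[r][k] + f * K2[r][k] for k in range(3)] for r in range(3)]

def align_points(points, gravity, down=(0.0, 0.0, -1.0)):
    """Rotate every 3D point so the estimated gravity direction maps onto `down`."""
    R = rotation_aligning(gravity, down)
    return [[sum(R[r][k] * p[k] for k in range(3)) for r in range(3)] for p in points]
```

Aligning gravity fixes only two of the three rotational degrees of freedom; the remaining rotation about the vertical axis would have to be resolved by some other cue, which this sketch leaves untouched.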
Source: arXiv
Generated at: 2026-06-03 00:00:00 UTC





