Picasso: Holistic Scene Reconstruction with Physics-Constrained Sampling
Abstract:
Geometrically precise scene reconstructions that align with sensor data may still be physically invalid, particularly when occlusions and measurement noise are present. For example, minor inaccuracies in estimating object poses and shapes can lead to implausible configurations—such as unstable equilibria or object interpenetration—when these estimates are transferred to a simulator. This limitation hinders the ability to predict a scene’s dynamic behavior using a digital twin, a critical capability for simulation-based planning and the control of contact-rich behaviors.
In this work, we argue that estimating object poses and shapes necessitates a holistic reasoning approach over the entire scene, rather than analyzing individual objects in isolation. This approach must account for physical plausibility and object interactions. To address this, we introduce Picasso, a physics-constrained reconstruction pipeline that generates multi-object scene reconstructions by integrating geometry, non-penetration constraints, and physical principles. Picasso employs a rapid rejection sampling technique that evaluates multi-object interactions, utilizing an inferred object contact graph to direct the sampling process.
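The idea of contact-graph-guided rejection sampling can be illustrated with a toy sketch. This is not the authors' implementation, and the abstract does not specify how the sampler works. The sketch assumes 2D discs resting on a ground line at y = 0, Gaussian proposals around noisy per-object estimates, and a breadth-first traversal of the contact graph starting from the supporting objects. All names (`bfs_order`, `gap`, `sample_scene`, `contact_tol`, and so on) are hypothetical.

```python
import math
import random
from collections import deque

def bfs_order(contact_graph, roots):
    """Visit objects outward from the supporting roots along contact edges."""
    order, seen, queue = [], set(roots), deque(roots)
    while queue:
        node = queue.popleft()
        order.append(node)
        for nbr in contact_graph.get(node, []):
            if nbr not in seen:
                seen.add(nbr)
                queue.append(nbr)
    return order

def gap(c1, r1, c2, r2):
    """Signed surface distance between two discs (negative means penetration)."""
    return math.dist(c1, c2) - (r1 + r2)

def sample_scene(estimates, radii, contact_graph, roots,
                 sigma=0.05, contact_tol=0.03, max_tries=20000, seed=0):
    """Toy rejection sampler: place objects one by one in contact-graph order."""
    rng = random.Random(seed)
    placed = {}
    for name in bfs_order(contact_graph, roots):
        ex, ey = estimates[name]
        r = radii[name]
        for _ in range(max_tries):
            cand = (rng.gauss(ex, sigma), rng.gauss(ey, sigma))
            ground_gap = cand[1] - r
            # Reject candidates that sink into the ground.
            if ground_gap < 0.0:
                continue
            # Reject root objects that float above the ground.
            if name in roots and ground_gap > contact_tol:
                continue
            # Reject interpenetration with any already-placed object.
            if any(gap(cand, r, placed[o], radii[o]) < 0.0 for o in placed):
                continue
            # Reject candidates that lose contact with placed graph neighbours.
            nbrs = [o for o in contact_graph.get(name, []) if o in placed]
            if any(gap(cand, r, placed[o], radii[o]) > contact_tol for o in nbrs):
                continue
            placed[name] = cand
            break
        else:
            raise RuntimeError(f"no valid sample found for {name!r}")
    return placed

# Noisy estimates: "a" dips into the ground and "b" overlaps "a".
radii = {"a": 0.5, "b": 0.3}
estimates = {"a": (0.0, 0.48), "b": (0.0, 1.25)}
contact_graph = {"a": ["b"], "b": ["a"]}
scene = sample_scene(estimates, radii, contact_graph, roots=["a"])
```

Traversing the graph from the supporting objects means each new object is only checked against neighbours that already have a physically valid pose, so the sampler does not have to draw a joint sample for the whole scene at once. A real pipeline would need 3D shapes, orientations, and a stability check, all of which this sketch omits.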
Additionally, we present the Picasso dataset, comprising 10 real-world scenes rich in contacts, complete with ground truth annotations. We also open-source a metric designed to quantify physical plausibility as part of our benchmark. Through extensive evaluations on both the new Picasso dataset and the YCB-V dataset, we demonstrate that Picasso significantly surpasses current state-of-the-art methods. The resulting reconstructions are not only more aligned with human intuition but also physically plausible.
Source: arXiv






