PhyScene3D: Physically Consistent Interactive 3D Tabletop Scene Generation
Abstract
Generating 3D tabletop environments that obey physical laws remains a critical yet largely unexplored problem for interactive, general-purpose robot learning. The difficulty stems primarily from dense object hierarchies and irregular object affordances. Here, an interactive scene is defined as a collision-free, physically valid environment that can be imported directly into physics simulators. Current techniques, ranging from decoupled symbolic solvers to end-to-end regression models, frequently suffer from error propagation or overfit to noisy labels that contain widespread physical inconsistencies.
To overcome these shortcomings, we present PhyScene3D, a novel framework that reimagines scene generation as a Human-Mimetic Constructive Process. Central to this approach is the Cognitive Topological Reasoning Chain (CTRC), which breaks down scene synthesis into a sequential, anchor-conditioned workflow. This method utilizes a 3D AABB-based placement strategy, thereby enforcing a robust structural inductive bias. Furthermore, to mitigate the effects of imperfect supervision and ensure physical feasibility, we propose Physics-Aware Denoising Alignment (PADA). This component combines a differentiable Signed Distance Field (SDF) with Test-Time Optimization (TTO) to map generated scenes onto a physics-compliant manifold without sacrificing semantic fidelity. Our experimental results indicate that PhyScene3D surpasses current state-of-the-art methods in both semantic accuracy and physical validity. Notably, it achieves a 40% decrease in the scene-wise collision rate when compared to human-annotated training data.
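To make the collision-free requirement and the idea of test-time correction concrete, the following is a minimal sketch, not the PADA procedure itself. It assumes every object is an axis-aligned box given by a center and half-extents, uses AABB penetration depth as a stand-in for a learned or differentiable SDF, and resolves overlaps with a simple iterative push in the table plane. The names Box, has_collision, and resolve_collisions are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass
class Box:
    """Axis-aligned bounding box (illustrative type, not from the paper)."""
    center: list  # [x, y, z]
    half: list    # half-extents along x, y, z


def overlaps(a, b):
    """Per-axis overlap; the boxes intersect only if all three values are > 0."""
    return [a.half[k] + b.half[k] - abs(a.center[k] - b.center[k]) for k in range(3)]


def penetration_depth(a, b):
    """Smallest per-axis overlap, clamped at zero for non-intersecting boxes."""
    return max(min(overlaps(a, b)), 0.0)


def has_collision(boxes):
    """True if any pair of boxes in the scene interpenetrates."""
    return any(penetration_depth(a, b) > 0.0 for a, b in combinations(boxes, 2))


def resolve_collisions(boxes, steps=100, margin=1e-3):
    """Push intersecting pairs apart along x or y until no overlap remains.

    Assumption: objects rest on the table, so only in-plane moves are allowed
    and z is left untouched. Convergence is not guaranteed for crowded scenes;
    the loop simply stops after `steps` sweeps.
    """
    boxes = [Box(list(b.center), list(b.half)) for b in boxes]  # do not mutate input
    for _ in range(steps):
        moved = False
        for a, b in combinations(boxes, 2):
            if penetration_depth(a, b) <= 0.0:
                continue
            ov = overlaps(a, b)
            k = 0 if ov[0] <= ov[1] else 1  # in-plane axis with the smaller overlap
            sign = 1.0 if a.center[k] >= b.center[k] else -1.0
            push = 0.5 * (ov[k] + margin) * sign  # split the correction evenly
            a.center[k] += push
            b.center[k] -= push
            moved = True
        if not moved:
            break
    return boxes


scene = [
    Box([0.00, 0.0, 0.05], [0.05, 0.05, 0.05]),
    Box([0.05, 0.0, 0.05], [0.05, 0.05, 0.05]),  # overlaps the first box along x
    Box([0.50, 0.5, 0.05], [0.05, 0.05, 0.05]),  # far away, should not move
]
fixed = resolve_collisions(scene)
print(has_collision(scene), has_collision(fixed))
```

Under these assumptions, a scene-wise collision rate could be estimated as the fraction of scenes for which has_collision returns True; the exact metric definition used in the paper is not given in this abstract. Moving each pair by the smallest separating translation is one simple way to keep the corrected layout close to the generated one.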
Source: arXiv
Generated at: 2026-06-02 00:00:00 UTC





