Dream.exe: Can Video Generation Models Dream Executable Robot Manipulation?
Abstract:
While video generation models have achieved remarkable success in creating visually striking content, their outputs are currently restricted to the digital realm. This limitation raises a critical inquiry: how accurately do these models represent the physical world when their creations are translated into real-world actions? To address this, we propose robotic manipulation as a tangible metric for evaluating this capability. The core hypothesis is that if a model has genuinely internalized the laws of physics, the movements it generates should be translatable into functional robot behaviors.
To test this, we introduce Dream.exe, an evaluation framework that bridges the gap between video generation and physical execution. The system operates via a video-to-execution pipeline: starting with a scene image and a specific task description, Dream.exe generates a manipulation video, converts the depicted motions into robot trajectories, and runs these trajectories in a physics simulator. This process provides a grounding signal—specifically, execution success—that purely visual quality metrics fail to capture.
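The video-to-execution flow described above can be sketched as three composed stages. This is a minimal illustration only: every name here (`Task`, `run_pipeline`, the three stage callables, and the toy stand-ins) is a hypothetical placeholder rather than the Dream.exe API, and the stand-in stages return dummy values instead of calling a real video model, motion extractor, or physics simulator.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

# Hypothetical stand-in types; the real framework's data formats are not
# specified in the abstract.
Frame = List[List[float]]               # placeholder for an image
Waypoint = Tuple[float, float, float]   # placeholder for an end-effector position


@dataclass
class Task:
    scene_image: Frame
    description: str


@dataclass
class Result:
    num_frames: int
    num_waypoints: int
    success: bool


def run_pipeline(
    task: Task,
    generate_video: Callable[[Frame, str], Sequence[Frame]],
    extract_trajectory: Callable[[Sequence[Frame]], Sequence[Waypoint]],
    execute_in_sim: Callable[[Sequence[Waypoint]], bool],
) -> Result:
    """Chain the three stages: video generation, trajectory extraction, execution."""
    video = generate_video(task.scene_image, task.description)
    trajectory = extract_trajectory(video)
    success = execute_in_sim(trajectory)
    return Result(len(video), len(trajectory), success)


# Toy stand-ins so the sketch runs end to end (assumptions, not real components).
def fake_generate(image: Frame, text: str) -> Sequence[Frame]:
    return [image for _ in range(4)]          # four identical "frames"


def fake_extract(video: Sequence[Frame]) -> Sequence[Waypoint]:
    return [(0.1 * i, 0.0, 0.2) for i in range(len(video))]


def fake_execute(trajectory: Sequence[Waypoint]) -> bool:
    # Pretend the task succeeds if the final waypoint reaches x of about 0.3.
    return len(trajectory) > 0 and trajectory[-1][0] >= 0.3 - 1e-9


result = run_pipeline(
    Task(scene_image=[[0.0]], description="push the block forward"),
    fake_generate,
    fake_extract,
    fake_execute,
)
print(result)
```

Passing the stages in as callables keeps the sketch independent of any particular generator or simulator, which mirrors the idea of evaluating several video models through one shared execution path.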
We used this pipeline to assess eight models, including leading closed-source generators, open-source alternatives, and models specialized for robotics. Our benchmark comprises 101 manually curated manipulation tasks spanning three levels of physical complexity, and each model is scored on visual quality, trajectory fidelity, and execution success.
The results reveal that several models can achieve measurable execution success, indicating that the generative priors learned from massive internet-scale datasets already contain significant physical knowledge. However, we found that high visual quality is a poor indicator of executability. This discrepancy highlights a crucial gap in model capabilities that standard visual evaluations overlook. The Dream.exe framework will be made available as open-source software at https://github.com/showlab/Dream.exe.
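One way to check whether visual quality tracks executability is a rank correlation between per-model visual scores and execution success rates. The sketch below implements Spearman's rank correlation using only the standard library. It is an illustrative assumption about how such a comparison could be made, not the analysis used in the paper, and the two input lists are made-up placeholder values, not reported results.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if std_x == 0 or std_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (std_x * std_y)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    return pearson(average_ranks(x), average_ranks(y))


# Placeholder numbers for illustration only; NOT results from the paper.
visual_quality = [0.91, 0.85, 0.78, 0.74, 0.60]
execution_success = [0.10, 0.32, 0.05, 0.41, 0.22]
rho = spearman(visual_quality, execution_success)
print(f"Spearman rho = {rho:.2f}")
```

A coefficient near zero or below would be consistent with visual quality being a weak predictor of execution success, while a value near one would indicate the two metrics rank models similarly.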
Source: arXiv Generated at: 2026-06-04 00:00:00 UTC






