StressDream: Steering Video World Models for Robust Policy Evaluation and Improvement
Title: StressDream: Directing Video World Models for Resilient Policy Assessment and Enhancement
Abstract:
Video world models (WMs) have emerged as a promising tool for evaluating and improving policies by generating realistic future observations conditioned on an ego-robot’s actions. Although WMs can model distributions over possible futures, standard policy evaluation and improvement strategies often depend on nominal imaginations. This reliance can cause critical, high-impact consequences of robot actions to be overlooked unless an impractically large number of samples is generated. To facilitate robust policy evaluation and improvement through WM simulations, we introduce StressDream. This method guides imaginations toward high-impact but plausible outcomes, which are defined at inference time, by optimizing the initial noise inputs of diffusion-based WMs.
Optimizing high-dimensional noise presents significant challenges, as the process must account for subtle, scene-specific target events within generated videos while preventing the noise from drifting into out-of-distribution (OOD) territory, which results in unrealistic simulations. We tackle these issues using two complementary objectives: a semantic objective, which leverages a Vision-Language Model to provide informative gradients by analyzing the generated video content, and a plausibility objective, which ensures the optimized noise remains within realistic bounds. Utilizing state-of-the-art video world models for autonomous driving and robotic manipulation, we demonstrate that StressDream successfully directs simulations toward high-impact, plausible outcomes specified via text at inference time, such as task failures. This capability enables robust policy evaluation and improvement by pinpointing actions whose plausible futures involve undesirable results. Video demonstrations are accessible at https://junwon.me/StressDream/.
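The interplay of the two objectives can be illustrated with a toy sketch. It is not the StressDream implementation, and the abstract does not give the actual form of either objective. As assumptions made only for illustration, a linear function `score` stands in for the VLM-based semantic objective, and a penalty that keeps the mean squared value of the noise near 1 (its expected value under a standard Gaussian) stands in for the plausibility objective. All names (`steer_noise`, `score`, `mean_sq`, `lam`) are hypothetical.

```python
import math
import random


def score(z, w):
    """Toy stand-in for a semantic score of the generated video (assumption)."""
    return sum(zi * wi for zi, wi in zip(z, w))


def mean_sq(z):
    """Mean squared value of the noise; close to 1 for standard Gaussian noise."""
    return sum(v * v for v in z) / len(z)


def steer_noise(z0, w, lam, lr=0.05, steps=500):
    """Gradient ascent on J(z) = score(z) - lam * (mean_sq(z) - 1)^2."""
    d = len(z0)
    z = list(z0)
    for _ in range(steps):
        r = mean_sq(z)
        # Gradient of the linear score is w.
        # Gradient of (r - 1)^2 with r = ||z||^2 / d is 4 * (r - 1) * z / d.
        z = [
            v + lr * (wi - lam * 4.0 * (r - 1.0) * v / d)
            for v, wi in zip(z, w)
        ]
    return z


random.seed(0)
d = 64
w = [1.0 / math.sqrt(d)] * d                   # hypothetical "target event" direction
z0 = [random.gauss(0.0, 1.0) for _ in range(d)]  # initial diffusion noise

z_free = steer_noise(z0, w, lam=0.0)    # semantic objective only
z_reg = steer_noise(z0, w, lam=64.0)    # semantic + plausibility objectives

print("initial score:", score(z0, w), "mean_sq:", mean_sq(z0))
print("unregularized score:", score(z_free, w), "mean_sq:", mean_sq(z_free))
print("regularized score:", score(z_reg, w), "mean_sq:", mean_sq(z_reg))
```

In this toy setting, optimizing the score alone lets the noise grow far beyond the scale of Gaussian samples, which is the analogue of drifting out of distribution. Adding the penalty still raises the score while keeping the noise statistics close to those of the prior. In a real diffusion-based WM the gradient would instead have to flow through the video generator and the scoring model.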
Source: arXiv
Generated at: 2026-06-02 00:00:00 UTC




