Stable Velocity: A Variance Perspective on Flow Matching
Abstract: Although flow matching offers an elegant approach, its dependence on single-sample conditional velocities results in training targets with high variance. This instability hampers optimization and delays convergence. We address this by explicitly modeling the variance, revealing two distinct phases: a high-variance zone near the prior distribution, which poses significant optimization challenges, and a low-variance zone near the data distribution, where conditional and marginal velocities align closely. Building on these findings, we introduce Stable Velocity, a comprehensive framework designed to enhance both the training and sampling processes. During training, we present Stable Velocity Matching (StableVM), an unbiased objective that reduces variance, alongside Variance-Aware Representation Alignment (VA-REPA). The latter dynamically reinforces auxiliary supervision specifically within the low-variance regime. For inference, we demonstrate that the dynamics in the low-variance phase allow for closed-form simplifications, facilitating Stable Velocity Sampling (StableVS). This method provides acceleration without requiring finetuning. Our extensive evaluations on ImageNet at $256\times256$ resolution, as well as on large pretrained text-to-image and text-to-video models such as SD3.5, Flux, Qwen-Image, and Wan2.2, confirm consistent gains in training efficiency. Furthermore, we achieve more than $2\times$ faster sampling speeds in the low-variance regime while maintaining sample quality. The source code is accessible at https://github.com/linYDTHU/StableVelocity.
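The two-regime claim can be illustrated with a toy calculation. The sketch below is not the paper's StableVM objective or its code. It assumes a linear interpolation path x_t = (1 - t) * eps + t * x1, with t = 0 at a standard Gaussian prior and t = 1 at the data, and a one-dimensional dataset of two equally likely points. Under those assumptions the conditional velocity is (x1 - x_t) / (1 - t), and its variance given x_t can be computed exactly from the posterior over data points. The function name `velocity_variance` and all parameters are hypothetical.

```python
import numpy as np


def velocity_variance(data, t, n_samples=20000, seed=0):
    """Average variance of the conditional velocity target given x_t.

    Assumptions (illustrative only): 1-D data points drawn uniformly,
    standard Gaussian prior, linear path x_t = (1 - t) * eps + t * x1,
    and 0 <= t < 1.
    """
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)

    # Draw (eps, x1) pairs and form the interpolated point x_t.
    x1 = rng.choice(data, size=n_samples)
    eps = rng.standard_normal(n_samples)
    xt = (1.0 - t) * eps + t * x1

    # Posterior over data points: p(x_i | x_t) is proportional to
    # N(x_t; t * x_i, (1 - t)^2) under a uniform prior on the data.
    logw = -0.5 * ((xt[:, None] - t * data[None, :]) / (1.0 - t)) ** 2
    logw -= logw.max(axis=1, keepdims=True)  # numerical stability
    w = np.exp(logw)
    w /= w.sum(axis=1, keepdims=True)

    # Conditional velocity for every candidate data point.
    v = (data[None, :] - xt[:, None]) / (1.0 - t)

    # Posterior mean is the marginal velocity; the spread around it is
    # the variance of the single-sample training target.
    mean_v = (w * v).sum(axis=1)
    var_v = (w * (v - mean_v[:, None]) ** 2).sum(axis=1)
    return float(var_v.mean())
```

Calling `velocity_variance([-2.0, 2.0], t=0.05)` and `velocity_variance([-2.0, 2.0], t=0.95)` should give a large value near the prior and a value close to zero near the data. In this toy setting, the posterior over data points collapses onto a single point as t approaches 1, so the conditional and marginal velocities nearly coincide there.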
Source: arXiv
Generated at: 2026-06-02 00:00:00 UTC





