AAD-1: Asymmetric Adversarial Distillation for One-Step Autoregressive Video Generation
Abstract:
This paper introduces AAD-1, a framework that uses Asymmetric Adversarial Distillation to enable one-step autoregressive image-to-video generation. Current state-of-the-art approaches also rely on adversarial distillation, but they often suffer from motion collapse and training instability, which frequently produce static video outputs. AAD-1 addresses both problems with two changes: an asymmetric generator-discriminator architecture and a phased training strategy.
Central to our approach is an architectural change that breaks the symmetry between the generator and the discriminator. The generator keeps its causal structure so that autoregressive sampling remains robust, while the discriminator operates bidirectionally over the entire spatiotemporal context. This design lets the discriminator produce a single, holistic realism score for the complete video sequence, so it can identify global temporal failures and long-range drift, which are common causes of motion collapse in autoregressive models.
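The asymmetry described above can be illustrated with attention masks. The following is a minimal sketch, not the paper's implementation: it assumes both networks attend over flattened video tokens and that the generator's causality is block-causal at the frame level. The function names and the `tokens_per_frame` parameter are hypothetical.

```python
def block_causal_mask(num_frames, tokens_per_frame):
    """Generator-side mask (assumed block-causal by frame).

    mask[q][k] is True when query token q may attend to key token k,
    i.e. when k belongs to the same frame as q or to an earlier frame.
    """
    n = num_frames * tokens_per_frame
    return [
        [(k // tokens_per_frame) <= (q // tokens_per_frame) for k in range(n)]
        for q in range(n)
    ]


def bidirectional_mask(num_frames, tokens_per_frame):
    """Discriminator-side mask: every token sees the whole clip."""
    n = num_frames * tokens_per_frame
    return [[True] * n for _ in range(n)]


def visible_fraction(mask):
    """Fraction of query/key pairs that are allowed to interact."""
    total = sum(len(row) for row in mask)
    return sum(sum(row) for row in mask) / total


if __name__ == "__main__":
    gen = block_causal_mask(num_frames=2, tokens_per_frame=2)
    disc = bidirectional_mask(num_frames=2, tokens_per_frame=2)
    print(visible_fraction(gen), visible_fraction(disc))
```

Under these assumptions, tokens in the first frame never see later frames on the generator side, whereas the discriminator can compare any frame with any other before pooling to one score for the sequence.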
To further stabilize training, we propose a phased training strategy. It first uses distribution matching to bootstrap a stable one-step generator; this warm-up phase brings the student’s distribution closer to the teacher’s before adversarial distillation begins. Extensive evaluations on VBench confirm that AAD-1 delivers state-of-the-art performance in one-step autoregressive video generation.
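The phased strategy can be pictured as a step-dependent choice of loss weights. The sketch below is an illustration under stated assumptions, not the authors' schedule: `PhaseSchedule`, `warmup_steps`, and `dm_weight_after_warmup` are hypothetical names, and the abstract does not say whether distribution matching stays active once adversarial distillation starts, so that weight is left configurable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhaseSchedule:
    # Hypothetical length of the distribution-matching warm-up phase.
    warmup_steps: int
    # Assumption: the abstract does not state whether distribution matching
    # continues after warm-up, so its later weight is a free parameter.
    dm_weight_after_warmup: float = 0.0

    def loss_weights(self, step):
        """Return the loss weights used at a given training step."""
        if step < self.warmup_steps:
            # Phase 1: bootstrap the one-step generator without a discriminator.
            return {"distribution_matching": 1.0, "adversarial": 0.0}
        # Phase 2: adversarial distillation takes over.
        return {
            "distribution_matching": self.dm_weight_after_warmup,
            "adversarial": 1.0,
        }


if __name__ == "__main__":
    schedule = PhaseSchedule(warmup_steps=1000)
    for step in (0, 999, 1000):
        print(step, schedule.loss_weights(step))
```

A training loop would multiply each loss term by the corresponding weight, so the discriminator only begins to influence the student after the warm-up has moved it closer to the teacher.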
Source: arXiv
Generated at: 2026-06-03 00:00:00 UTC




