arXiv

Consistent Diffusion Language Models

June 2, 2026 · Hasan Amin, Yuan Gao, Yaser Souri, Subhojit Som, Ming Yin, Rajiv Khanna, Xia Song

Abstract

While diffusion language models (DLMs) offer a compelling alternative to autoregressive approaches by enabling parallel, sublinear-time generation, their practical utility has been hindered by the need for hundreds of refinement steps to produce high-quality samples. In continuous spaces, accelerating diffusion is commonly achieved through consistency training along the probability-flow ordinary differential equation (ODE). However, adapting this method directly to discrete diffusion is problematic because no equivalent sample-space ODE exists.

To address this challenge, we propose that the exact posterior bridge is the appropriate discrete counterpart of the probability-flow ODE. This bridge is the closed-form conditional distribution connecting any two noise levels, and it applies to various corruption schemes, including uniform and masked diffusion. Building on this insight, we present Multi-Path Discrete Consistency (MPDC), a novel framework that trains a denoiser to remain invariant in expectation across these stochastic paths. We instantiate this principle as the Consistent Diffusion Language Model (CDLM), a single-stage training approach that requires no pre-trained teacher model.
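To make the idea of a closed-form bridge between two noise levels concrete, the sketch below samples from the standard posterior of absorbing-state (masked) diffusion, q(x_s | x_t, x_0) with s less noisy than t. It is an illustration under stated assumptions, not the paper's implementation: the function name `masked_bridge_sample`, the `MASK` token id, and the schedule values `alpha_s` and `alpha_t` (the probability that a token is still unmasked at each noise level) are all hypothetical, and the abstract does not specify how CDLM parameterizes its bridge.

```python
import numpy as np

MASK = -1  # hypothetical id for the mask token (an assumption, not from the paper)


def masked_bridge_sample(x0, xt, alpha_s, alpha_t, rng):
    """Sample x_s ~ q(x_s | x_t, x_0) for masked (absorbing) diffusion.

    alpha_u is the probability that a token is still unmasked at noise
    level u. Level s is less noisy than level t, so alpha_s >= alpha_t.
    """
    x0 = np.asarray(x0)
    xt = np.asarray(xt)
    if not (0.0 <= alpha_t <= alpha_s <= 1.0):
        raise ValueError("need 0 <= alpha_t <= alpha_s <= 1")
    if alpha_t >= 1.0:
        # Nothing can be masked at level t, so x_s equals x_t.
        return xt.copy()

    # A token that is masked at level t was already unmasked at level s
    # with probability (alpha_s - alpha_t) / (1 - alpha_t).
    p_reveal = (alpha_s - alpha_t) / (1.0 - alpha_t)
    reveal = rng.random(xt.shape) < p_reveal

    xs = xt.copy()
    hit = (xt == MASK) & reveal
    xs[hit] = x0[hit]  # revealed positions take their clean value
    return xs          # tokens unmasked at level t are left unchanged
```

Because the bridge is exact, pushing a sample from level t through it reproduces the forward marginal at level s: if a fraction 1 - alpha_t of tokens is masked at t, a fraction 1 - alpha_s remains masked afterwards. Each call draws one stochastic path between the two levels, which is the kind of path a multi-path consistency objective would average over.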

Theoretical analysis shows that our CDLM objective unifies several existing methods, recovering masked diffusion, continuous consistency models, and both progressive and discrete distillation as either analytic limits or empirical approximations. In empirical evaluations, CDLM achieves new state-of-the-art results in both conditional and unconditional text generation. It consistently surpasses robust base discrete diffusion models and frequently outperforms multi-stage distilled baselines across various sampling budgets, with the most significant improvements observed in few-step regimes. These findings establish CDLM as a scalable and principled foundation for the next wave of fast, high-fidelity discrete generative modeling.
