arXiv

DAPD: Dependency-Aware Parallel Decoding via Attention for Diffusion LLMs

June 2, 2026 · Bumjun Kim, Dongjae Jeon, Moongyu Jeon, Albert No

Title: DAPD: Leveraging Attention for Dependency-Aware Parallel Decoding in Diffusion LLMs

Abstract:

Parallel decoding in Diffusion Large Language Models (dLLMs) is difficult because each denoising step yields only token-wise marginal distributions, whereas unmasking several tokens at once requires accounting for the dependencies among them. To address this, we introduce Dependency-Aware Parallel Decoding (DAPD), a simple, training-free method that uses self-attention to build a conditional dependency graph over the masked tokens. At each iteration, an edge in this graph marks a strong interaction between two tokens, and the absence of an edge indicates weak dependence. Parallel decoding is then cast as finding an independent set on this graph, and the selected tokens are unmasked in parallel. This avoids updating strongly coupled tokens in the same step and requires no auxiliary models or retraining. Experiments with LLaDA and Dream show that DAPD improves the trade-off between accuracy and the number of decoding steps relative to existing methods. It also enables more widely distributed parallel updates, making better use of the any-order generation capability of dLLMs. Project resources are available at https://ai-isl.github.io/dapd.
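The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how attention is turned into edges or how the independent set is chosen, so the fixed `threshold`, the max-symmetrization of attention scores, and the confidence-ordered greedy selection below are all assumptions, and the function names are hypothetical.

```python
from typing import Dict, List, Sequence, Set


def build_dependency_graph(
    attn: Sequence[Sequence[float]],
    masked: Sequence[int],
    threshold: float,
) -> Dict[int, Set[int]]:
    """Link two masked positions when their attention score exceeds `threshold`.

    Assumption: the score for a pair is the larger of the two directed
    attention weights, and `threshold` is a free hyperparameter.
    """
    graph: Dict[int, Set[int]] = {i: set() for i in masked}
    for idx, i in enumerate(masked):
        for j in masked[idx + 1:]:
            score = max(attn[i][j], attn[j][i])
            if score > threshold:
                graph[i].add(j)
                graph[j].add(i)
    return graph


def greedy_independent_set(
    graph: Dict[int, Set[int]],
    confidence: Sequence[float],
) -> List[int]:
    """Pick positions by decreasing confidence, skipping neighbors of chosen ones.

    Assumption: a greedy heuristic; it returns a maximal independent set,
    not necessarily a maximum one.
    """
    selected: List[int] = []
    blocked: Set[int] = set()
    for i in sorted(graph, key=lambda p: -confidence[p]):
        if i in blocked:
            continue
        selected.append(i)
        blocked.update(graph[i])  # neighbors must wait for a later step
    return sorted(selected)


# Toy example: four masked positions with made-up attention weights.
attn = [
    [0.0, 0.6, 0.1, 0.0],
    [0.5, 0.0, 0.1, 0.1],
    [0.1, 0.1, 0.0, 0.7],
    [0.0, 0.1, 0.4, 0.0],
]
confidence = [0.9, 0.8, 0.4, 0.7]  # hypothetical per-position confidences
graph = build_dependency_graph(attn, masked=[0, 1, 2, 3], threshold=0.3)
to_unmask = greedy_independent_set(graph, confidence)
print(to_unmask)  # [0, 3]
```

In this toy case positions 0 and 1 are strongly coupled, as are 2 and 3, so only one token from each pair is unmasked in the current step and the other is deferred to a later iteration.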

Source: arXiv Generated at: 2026-06-02 00:00:00 UTC