Global Sketch-Based Watermarking for Diffusion Language Models
Title: Implementing Global Sketch-Based Watermarking in Diffusion Language Models
Abstract: Watermarking for language models has been studied extensively in the autoregressive setting, where tokens are generated sequentially. Existing approaches predominantly rely on local-context strategies, which modify the probability distribution of the next token based on its preceding context. Diffusion language models, in contrast, sample jointly across multiple unresolved positions, which makes additive statistics of the complete sequence tractable to control during generation. This paper introduces a watermarking scheme for masked diffusion language models that controls a global, vector-valued sketch of the text. Unlike context-dependent approaches, the sketch-based formulation decouples detection from the specific local contexts encountered during generation. The resulting statistic is insensitive to token order, and the watermarking rule does not simply appear as a bias toward specific tokens. We analyze the method's robustness, soundness, and distortion.
Source: arXiv
Generated at: 2026-06-04 00:00:00 UTC
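
To make the notion of an order-insensitive, additive sketch concrete, the following is a minimal illustration and not the paper's construction, which the abstract does not specify. It assumes a hypothetical keyed map `token_vector` that sends each token id to a pseudo-random vector in {-1, +1}^d, and a hypothetical `sketch` that sums those vectors over the sequence. Because addition is commutative, any permutation of the tokens yields the same sketch, which is the property the abstract attributes to the detection statistic.

```python
import hashlib


def token_vector(key: bytes, token_id: int, dim: int = 16) -> list[int]:
    """Hypothetical keyed map: token id -> vector in {-1, +1}^dim.

    Assumption for illustration only. Uses the low `dim` bits of a
    SHA-256 digest, so dim must not exceed 256.
    """
    if not 0 < dim <= 256:
        raise ValueError("dim must be in 1..256")
    digest = hashlib.sha256(key + token_id.to_bytes(4, "big")).digest()
    bits = int.from_bytes(digest, "big")
    return [1 if (bits >> i) & 1 else -1 for i in range(dim)]


def sketch(key: bytes, tokens: list[int], dim: int = 16) -> list[int]:
    """Global sketch: coordinate-wise sum of per-token vectors.

    The sum does not depend on the order in which tokens appear.
    """
    total = [0] * dim
    for t in tokens:
        for i, v in enumerate(token_vector(key, t, dim)):
            total[i] += v
    return total


if __name__ == "__main__":
    key = b"demo-key"  # hypothetical secret key
    tokens = [5, 17, 42, 17, 3]
    # Reversing the sequence leaves the sketch unchanged.
    print(sketch(key, tokens) == sketch(key, tokens[::-1]))
```

In this toy setting a detector would only need the key and the final text to recompute the sketch; how the generation process steers the sketch, and which test is applied to it, are left to the paper itself.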




