arXiv

STaR-Quant: State-Time Consistent Post-Training Quantization for Diffusion Large Language Models

Abstract:

Diffusion large language models (DLLMs) have recently emerged as a compelling alternative to traditional autoregressive LLMs. They generate text through iterative masked denoising over bidirectional context rather than strictly left-to-right decoding. However, their large model sizes and the computational cost of iterative denoising create significant memory and processing bottlenecks, which motivates post-training quantization for efficient deployment.

This study highlights two primary obstacles in quantizing DLLMs to low bit-widths: state-dependent activation disparity and temporal error accumulation. Within every denoising step, masked and unmasked tokens display distinct activation distributions. Across steps, quantization errors can compound throughout the iterative decoding phase.
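The state-dependent disparity can be illustrated with a toy experiment. The following is a minimal sketch, not the paper's method: it assumes synthetic Gaussian activations in which masked tokens have a much narrower range than unmasked tokens (the spreads 0.1 and 2.0 are invented for illustration), and it uses plain symmetric 4-bit quantization. The helper names `fake_quant` and `mse` are hypothetical.

```python
import random

def fake_quant(values, bits=4):
    """Symmetric uniform quantize-dequantize using one scale for all values."""
    qmax = 2 ** (bits - 1) - 1
    scale = max(abs(v) for v in values) / qmax
    return [max(-qmax - 1, min(qmax, round(v / scale))) * scale for v in values]

def mse(reference, approx):
    """Mean squared error between two equally long sequences."""
    return sum((r - a) ** 2 for r, a in zip(reference, approx)) / len(reference)

random.seed(0)
# Assumption: synthetic activations, narrow for masked tokens, wide for unmasked.
masked = [random.gauss(0.0, 0.1) for _ in range(1000)]
unmasked = [random.gauss(0.0, 2.0) for _ in range(1000)]

# One scale shared by both token states: the wide unmasked range sets the
# step size, so the narrow masked activations lose most of their resolution.
shared = fake_quant(masked + unmasked)
shared_masked_err = mse(masked, shared[:len(masked)])

# A separate scale for the masked state only.
separate_masked_err = mse(masked, fake_quant(masked))

print(f"masked-token MSE, shared scale:   {shared_masked_err:.6f}")
print(f"masked-token MSE, separate scale: {separate_masked_err:.6f}")
```

Under these assumptions, the masked tokens incur a much larger error when they share a quantization scale with the unmasked tokens, which is the kind of mismatch a state-aware scheme is meant to avoid.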

To overcome these hurdles, we introduce STaR-Quant, a post-training quantization (PTQ) framework designed to maintain consistency across state and time for DLLMs. STaR-Quant features State-Guided Activation Transformation (SGAT), which utilizes a unified static weight-side transformation to direct masked and unmasked tokens into separate activation transformation spaces. Additionally, it incorporates Temporal Attention Compensation (TAC), a mechanism that rectifies quantized attention representations through a lightweight block-diagonal affine mapping.
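To make the phrase "block-diagonal affine mapping" concrete, the sketch below applies such a map to a small vector and compares its parameter count with a dense affine map. It is an illustration of the general construction only: the abstract does not specify how TAC parameterizes, fits, or places its mapping, so the block size, the values, and the function names `block_diag_affine` and `param_count` are all assumptions.

```python
def block_diag_affine(x, blocks, bias):
    """Apply y = blockdiag(blocks) @ x + bias, one small square block at a time."""
    if sum(len(block) for block in blocks) != len(x) or len(bias) != len(x):
        raise ValueError("block sizes and bias must match the input length")
    y, start = [], 0
    for block in blocks:
        size = len(block)
        chunk = x[start:start + size]
        # Each block only mixes the coordinates inside its own chunk.
        for row in block:
            y.append(sum(w * v for w, v in zip(row, chunk)))
        start += size
    return [yi + bi for yi, bi in zip(y, bias)]

def param_count(dim, block_size):
    """Parameters of a block-diagonal affine map versus a dense affine map."""
    block_diag = (dim // block_size) * block_size ** 2 + dim
    dense = dim * dim + dim
    return block_diag, dense

# Hypothetical 4-dimensional quantized attention output and two 2x2 blocks.
x_q = [1.0, 2.0, 3.0, 4.0]
blocks = [[[1.0, 0.5], [0.0, 1.0]], [[2.0, 0.0], [0.0, 0.5]]]
bias = [0.5, 0.0, 0.0, -1.0]

y = block_diag_affine(x_q, blocks, bias)
print(y)
print(param_count(4096, 64))
```

The parameter comparison shows why such a map can be called lightweight: for a hypothetical hidden size of 4096 with 64-wide blocks, the block-diagonal form needs roughly 1/63 of the parameters of a dense affine correction.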

Experimental results on several representative DLLMs show that STaR-Quant consistently improves low-bit weight-activation quantization performance over strong PTQ baselines. The framework also delivers substantial efficiency gains, with up to a 3.14× reduction in memory usage and a 1.69× speedup relative to FP16 deployment.


Source: arXiv Generated at: 2026-06-04 00:00:00 UTC
