arXiv

HyperVQ: Enabling Hyperprior Entropy Modeling for VQ-Based Generative Image Compression

Title: HyperVQ: Facilitating Hyperprior Entropy Modeling in VQ-Based Generative Image Compression

Abstract:

While generative image compression based on Vector Quantization (VQ) has delivered exceptional perceptual fidelity, current VQ codecs are hindered by two core constraints. First, these systems typically depend on static frequency distributions rather than efficient, content-adaptive entropy modeling, resulting in suboptimal coding efficiency. Second, the fundamental tension between discrete indices and continuous priors obstructs true end-to-end joint Rate-Distortion (RD) optimization.

To address these challenges, we introduce HyperVQ, a framework that builds a high-performance hyperprior entropy model for VQ-based codecs. The central idea of HyperVQ is to move probability modeling entirely into the continuous embedding space. Rather than predicting probabilities for discrete symbols directly, HyperVQ predicts a high-dimensional multivariate Gaussian distribution over the continuous latents. By treating discrete codebook entries as fixed "anchors" within this space, the method converts the continuous Gaussian density into categorical index probabilities through relative distance calculations. This yields a spatially adaptive entropy model and keeps the cross-entropy rate objective fully differentiable. Consequently, the network can optimize the RD trade-off directly during training.
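The density-to-index conversion described above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a diagonal covariance and assumes the index probabilities are obtained by evaluating the Gaussian log-density at each codebook anchor and normalizing with a softmax. The function names `index_probabilities` and `rate_bits` are hypothetical.

```python
import numpy as np


def index_probabilities(mu, sigma, codebook):
    """Turn a Gaussian over the latent into probabilities over codebook indices.

    Assumptions (not taken from the paper): diagonal covariance, and
    normalization of the anchor densities with a softmax.
    mu, sigma: arrays of shape (D,); codebook: array of shape (K, D).
    """
    # Scaled offset of every anchor from the predicted mean, shape (K, D).
    z = (codebook - mu) / sigma
    # Gaussian log-density at each anchor, dropping the terms that are
    # identical for all anchors (they cancel in the normalization).
    log_density = -0.5 * np.sum(z ** 2, axis=1)
    # Numerically stable softmax over the K anchors.
    log_density = log_density - log_density.max()
    probs = np.exp(log_density)
    return probs / probs.sum()


def rate_bits(probs, index):
    """Ideal code length, in bits, of the chosen index under probs."""
    return -np.log2(probs[index])


if __name__ == "__main__":
    codebook = np.array([[0.0, 0.0], [1.0, 1.0], [4.0, 4.0]])
    probs = index_probabilities(np.zeros(2), np.ones(2), codebook)
    print(probs, rate_bits(probs, 0))
```

Every step here is a smooth function of `mu` and `sigma`, so the same computation written in an autodiff framework would let a rate loss backpropagate into the network that predicts them, which is the property the abstract relies on.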

To ensure practical applicability, we design the lightweight H Block and the Probability Estimation Engine (PEE) to support highly parallel, millisecond-level inference. Our experiments show that HyperVQ functions as a universal module compatible with various VQ architectures, including single-scale, large-codebook, and residual VQ (RVQ) models. It achieves an average bitrate reduction of 18.5%, which is 7.28 times greater than the savings provided by traditional Huffman coding. This work establishes a robust, RD-controllable base for the next generation of generative image compression technologies.


Source: arXiv Generated at: 2026-06-02 00:00:00 UTC
