Global News Digest

arXiv

BitsMoE: Efficient Spectral Energy-Guided Bit Allocation for MoE LLM Quantization

Title: BitsMoE: Spectral Energy-Guided Bit Allocation for Efficient MoE LLM Quantization

Abstract:

While Mixture-of-Experts (MoE) large language models mitigate per-token computational costs via sparse expert activation, their practical deployment is hindered by significant memory demands, as all expert weights must remain stored in memory. Current compression techniques for MoE architectures face limitations in the ultra-low-bit range: pruning permanently eliminates model capacity, whereas coarse-grained quantization cannot effectively assign bits based on the varying importance of experts and weight directions. To address this, we introduce BitsMoE, a framework for MoE LLM quantization that utilizes spectral energy guidance for bit allocation.

BitsMoE applies Singular Value Decomposition (SVD) to factor each MoE layer into a shared basis and expert-specific spectral factors. The shared basis, which captures structure common to all experts, is kept unquantized, while the expert-specific factors serve as the units of fine-grained quantization. To assign bit-widths to these units, BitsMoE formulates spectrum-wise mixed-precision quantization through an activation-aware reconstruction surrogate, then solves an integer linear program that minimizes the estimated reconstruction loss under a fixed bit budget.
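The budgeted bit-allocation step can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not give the surrogate or the exact program, so the names `allocate_bits`, `est_loss`, `sizes`, and `budget_bits` are hypothetical and the loss values are made up for illustration. The sketch assumes each unit gets exactly one bit-width and that a single constraint caps the total number of bits. Under those assumptions the integer program is a multiple-choice knapsack, which the code solves exactly by dynamic programming rather than with a general ILP solver.

```python
from typing import Dict, List, Tuple


def allocate_bits(
    est_loss: List[Dict[int, float]],
    sizes: List[int],
    budget_bits: int,
) -> Tuple[float, List[int]]:
    """Choose one bit-width per unit to minimize total estimated loss.

    est_loss[i][b] is a (hypothetical) surrogate loss for unit i at b bits,
    sizes[i] is the number of parameters in unit i, and the total
    sum(sizes[i] * bits[i]) must not exceed budget_bits.
    """
    inf = float("inf")
    # best[w] = minimal loss over the units seen so far using exactly w bits
    best = [inf] * (budget_bits + 1)
    best[0] = 0.0
    choices: List[List[int]] = []  # choices[i][w] = bit-width of unit i at state w

    for losses, n in zip(est_loss, sizes):
        new = [inf] * (budget_bits + 1)
        pick = [0] * (budget_bits + 1)
        for w, base in enumerate(best):
            if base == inf:
                continue  # state w is unreachable
            for b, loss in losses.items():
                w2 = w + n * b
                if w2 <= budget_bits and base + loss < new[w2]:
                    new[w2] = base + loss
                    pick[w2] = b
        best = new
        choices.append(pick)

    total = min(best)
    if total == inf:
        raise ValueError("budget too small for any assignment")

    # Walk back from the best final state to recover the chosen bit-widths.
    w = best.index(total)
    bits: List[int] = []
    for pick, n in zip(reversed(choices), reversed(sizes)):
        b = pick[w]
        bits.append(b)
        w -= n * b
    bits.reverse()
    return total, bits


# Illustrative inputs only: three units, candidate widths of 2, 3, or 4 bits.
est_loss = [
    {2: 8.0, 3: 3.0, 4: 1.0},  # unit 0 is sensitive to low precision
    {2: 2.0, 3: 1.0, 4: 0.5},  # unit 1 tolerates 2 bits well
    {2: 5.0, 3: 2.0, 4: 0.5},
]
sizes = [4, 4, 2]
total, bits = allocate_bits(est_loss, sizes, budget_bits=30)  # average of 3 bits
print(bits, total)  # [4, 2, 3] 5.0
```

In this toy case the uniform 3-bit assignment costs 6.0, while moving bits from the tolerant unit to the sensitive one reaches 5.0 within the same budget. The table size grows with the budget, so a realistic model would need a coarser budget unit or an actual ILP solver.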

Evaluations on several MoE LLMs show that BitsMoE substantially reduces accuracy loss in ultra-low-bit settings. For 2-bit quantization of Qwen3-30B-A3B-Base, BitsMoE completes quantization 12.3 times faster than GPTQ, improves average accuracy by 27.83 percentage points, and decodes 1.76 times faster. The source code and model are available at https://github.com/zjiayu064/BitsMoE.


Source: arXiv
Generated at: 2026-06-02 00:00:00 UTC
