arXiv

Learning Fine-grained Parameter Sharing via Sparse Tensor Decomposition

Title: Achieving Fine-Grained Parameter Sharing Through Sparse Tensor Decomposition

Abstract:

While large neural networks deliver state-of-the-art results across numerous applications, their massive scale creates significant barriers to deployment on devices with limited resources. Although various compression techniques exist, cross-layer parameter sharing has seen limited exploration within transformer architectures. To address this gap, we propose Fine-grained Parameter Sharing (FiPS), a comprehensive framework designed to compress Multi-Layer Perceptrons (MLPs) in transformers. FiPS integrates low-rank factorization, sparsity, and cross-block parameter sharing into a single, unified optimization process.

The method works by concatenating MLP weight matrices from a selected group of transformer blocks and decomposing them into two components: a shared basis and sparse, layer-specific projection matrices. Both components are initialized using Singular Value Decomposition (SVD) and are jointly refined through the minimization of block-wise reconstruction error.
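The decomposition described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the concatenation axis, and the use of magnitude pruning to obtain sparsity are all hypothetical choices made for the example. The joint refinement step that minimizes block-wise reconstruction error is omitted, and only the SVD-based initialization is shown.

```python
import numpy as np


def shared_basis_init(block_weights, rank, sparsity):
    """Hypothetical SVD-based initialization of a shared basis and sparse projections.

    Assumption: every block weight has shape (d_model, d_hidden) and the
    matrices are concatenated along the hidden axis.
    """
    stacked = np.concatenate(block_weights, axis=1)      # (d_model, sum of d_hidden)
    u, s, vt = np.linalg.svd(stacked, full_matrices=False)

    basis = u[:, :rank] * s[:rank]                       # shared across blocks
    proj = vt[:rank, :].copy()                           # block-specific columns

    # Assumption: sparsity comes from global magnitude pruning of the projections.
    n_prune = int(sparsity * proj.size)
    if n_prune > 0:
        cutoff = np.partition(np.abs(proj).ravel(), n_prune - 1)[n_prune - 1]
        proj[np.abs(proj) <= cutoff] = 0.0

    # Split the projection back into one sparse matrix per block.
    widths = [w.shape[1] for w in block_weights]
    per_block = np.split(proj, np.cumsum(widths)[:-1], axis=1)
    return basis, per_block


def blockwise_error(block_weights, basis, per_block):
    """Frobenius reconstruction error of each block, the quantity a refinement step would minimize."""
    return [float(np.linalg.norm(w - basis @ p))
            for w, p in zip(block_weights, per_block)]


# Toy usage with random weights standing in for two MLP matrices.
rng = np.random.default_rng(0)
blocks = [rng.standard_normal((8, 16)) for _ in range(2)]
basis, per_block = shared_basis_init(blocks, rank=8, sparsity=0.5)
errors = blockwise_error(blocks, basis, per_block)
```

In this sketch, storage is saved because the basis is kept once for the whole group while each block keeps only the nonzero entries of its projection.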

Our experiments demonstrate that FiPS can reduce the size of Vision Transformers (ViTs) by as much as 33% with a top-1 accuracy drop of less than 1% on ImageNet-1k. When fine-tuning is applied, compression ratios reach up to 57%. For Large Language Models (LLMs), FiPS achieves compression rates of up to 20%, surpassing existing SVD-based techniques in both perplexity and downstream benchmark performance at equivalent compression levels. Furthermore, when paired with Quantization-Aware Training (QAT), a 3-bit FiPS implementation on the Gemma-2-2B model yields lower perplexity than 2-bit QAT alone, while maintaining an identical 8x compression ratio. These findings confirm that fine-grained parameter sharing is a viable and efficient strategy for compressing transformer MLPs.
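The comparison between 3-bit FiPS and 2-bit QAT at the same 8x ratio can be made concrete with back-of-envelope arithmetic. The sketch below assumes a 16-bit baseline (implied by 2-bit quantization giving 8x) and ignores any overhead for storing sparse indices or quantization scales. The helper name is hypothetical, and the abstract does not state which parts of the model the ratio is measured over.

```python
def required_param_fraction(baseline_bits, quant_bits, target_ratio):
    """Fraction of the original parameter count that may be kept to hit target_ratio.

    Overall ratio = baseline_bits / (quant_bits * fraction_kept).
    """
    return baseline_bits / (quant_bits * target_ratio)


BASELINE_BITS = 16  # assumption: 16-bit weights before compression

# 2-bit quantization alone reaches 8x while keeping every parameter.
frac_2bit = required_param_fraction(BASELINE_BITS, 2, 8)

# At 3 bits, the same 8x ratio requires keeping only about two thirds of the parameters.
frac_3bit = required_param_fraction(BASELINE_BITS, 3, 8)
```

Under these assumptions, parameter sharing and sparsity would need to remove roughly one third of the stored weights for a 3-bit model to match the footprint of a 2-bit one.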


Source: arXiv Generated at: 2026-06-02 00:00:00 UTC
