When Is 0.1% Enough? Analyzing the Combined Effects of Dimensionality Reduction and Quantization on Text Embedding Compression
Abstract:
State-of-the-art text embedding models typically produce high-dimensional vectors with real-valued components, which makes them costly to store and to process. Various approaches reduce these costs through either dimensionality reduction or quantization, but the effect of applying both techniques together remains underexplored. This study provides a comprehensive analysis of compressing text embeddings by combining dimensionality reduction with quantization. Our evaluation spans four MTEB task families and four pre-trained embedding models. The findings show that combining the two methods achieves much stronger compression than either technique alone. Notably, in some scenarios, embeddings can be compressed to just 0.1% of their original size with negligible loss in performance. The results also indicate that the most effective compression strategy depends on the task.
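To make the 0.1% figure concrete, the following is a minimal sketch of one way the two techniques can be stacked, not the method evaluated in the study. It assumes a hypothetical 1024-dimensional float32 embedding, uses plain truncation to the first components as a stand-in for dimensionality reduction, and uses 1-bit sign quantization. The function names `compress`, `hamming_similarity`, and `compression_ratio` are illustrative.

```python
def compress(vec, keep_dims):
    """Keep the first keep_dims components, then store only each sign bit.

    Truncation is an assumed stand-in for dimensionality reduction;
    the result is packed into a single Python int, one bit per component.
    """
    bits = 0
    for i, x in enumerate(vec[:keep_dims]):
        if x > 0:
            bits |= 1 << i
    return bits


def hamming_similarity(a, b, keep_dims):
    """Fraction of sign bits on which two compressed vectors agree."""
    differing = bin(a ^ b).count("1")
    return 1.0 - differing / keep_dims


def compression_ratio(orig_dims, orig_bits, new_dims, new_bits):
    """Compressed size divided by original size, both measured in bits."""
    return (new_dims * new_bits) / (orig_dims * orig_bits)


# Assumed setting: 1024 float32 components reduced to 32 one-bit components.
ratio = compression_ratio(orig_dims=1024, orig_bits=32, new_dims=32, new_bits=1)
print(f"compressed size is {ratio:.4%} of the original")
```

Under these assumed numbers, the original vector occupies 1024 × 32 = 32,768 bits and the compressed one 32 bits, a ratio of about 0.098%. Neither step reaches that alone: truncation to 32 float32 components gives 3.125%, and 1-bit quantization of all 1024 components gives the same 3.125%.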
Source: arXiv
Generated at: 2026-06-02 00:00:00 UTC





