MIPIC: Matryoshka Representation Learning via Self-Distilled Intra-Relational and Progressive Information Chaining

Original: arXiv:2604.24374v2

Abstract: While representation learning is a cornerstone of Natural Language Processing (NLP), building embeddings that remain effective under varying computational constraints is still difficult. Matryoshka Representation Learning (MRL) addresses this with nested embeddings that allow flexible inference. Learning such structures, however, requires deliberate coordination of how information is distributed across both embedding dimensionality and model depth. To this end, we introduce MIPIC (Matryoshka Representation Learning via Self-Distilled Intra-Relational Alignment and Progressive Information Chaining), a comprehensive training framework for producing Matryoshka representations that are both structurally coherent and semantically dense.
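As a rough illustration of what "nested embeddings" means at inference time, the sketch below keeps only the leading coordinates of a full embedding and re-normalizes them. It shows the general Matryoshka usage pattern, not the MIPIC training procedure; the helper name `truncate_and_normalize`, the widths 64/256/768, and the random stand-in embeddings are all assumptions made for the example.

```python
import numpy as np


def truncate_and_normalize(embeddings: np.ndarray, dim: int) -> np.ndarray:
    """Keep the first `dim` coordinates of each row, then L2-normalize the rows."""
    if not 1 <= dim <= embeddings.shape[1]:
        raise ValueError("dim must be between 1 and the full embedding width")
    prefix = embeddings[:, :dim]
    norms = np.linalg.norm(prefix, axis=1, keepdims=True)
    # Guard against division by zero for an all-zero prefix.
    return prefix / np.clip(norms, 1e-12, None)


# Random vectors stand in for the output of a trained encoder (assumption).
rng = np.random.default_rng(0)
full = rng.normal(size=(4, 768))

# One nested view per capacity level; smaller views are prefixes of larger ones.
nested = {d: truncate_and_normalize(full, d) for d in (64, 256, 768)}

# Cosine similarity at the smallest capacity is a dot product of unit rows.
similarity_64 = nested[64] @ nested[64].T
```

A model trained with a Matryoshka-style objective is meant to keep such prefixes useful on their own; with an ordinary encoder, truncation like this would typically lose much more quality.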

MIPIC ensures structural consistency across dimensions through Self-Distilled Intra-Relational Alignment (SIA). This mechanism aligns the token-level geometric and attention-based relationships between complete and truncated representations, using top-k CKA self-distillation. In tandem, the framework consolidates semantics across depth via Progressive Information Chaining (PIC), a scaffolded alignment strategy that progressively transfers established task semantics from deeper layers to earlier ones. Comprehensive evaluations on STS, NLI, and classification benchmarks, covering models from TinyBERT to BGEM3 and Qwen3, show that MIPIC produces Matryoshka representations that are highly competitive across all capacity levels. Notably, the method yields substantial performance gains under extreme low-dimensional constraints.
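To make the CKA ingredient of SIA more concrete, here is a minimal sketch of standard linear Centered Kernel Alignment computed between a full token matrix and its truncated prefix. It is not the authors' implementation: the abstract does not specify the top-k selection or the attention-based term, so both are omitted, and the `1 - CKA` loss, the function name `linear_cka`, and the random token matrix are assumptions for illustration only.

```python
import numpy as np


def linear_cka(x: np.ndarray, y: np.ndarray) -> float:
    """Linear CKA between two (n_tokens, dim) matrices; the dims may differ."""
    # Center each feature column so the score compares covariance structure.
    x = x - x.mean(axis=0, keepdims=True)
    y = y - y.mean(axis=0, keepdims=True)
    cross = np.linalg.norm(y.T @ x, ord="fro") ** 2
    norm_x = np.linalg.norm(x.T @ x, ord="fro")
    norm_y = np.linalg.norm(y.T @ y, ord="fro")
    return float(cross / (norm_x * norm_y))


# Random token representations stand in for encoder hidden states (assumption).
rng = np.random.default_rng(0)
tokens_full = rng.normal(size=(32, 768))   # 32 tokens, full width
tokens_small = tokens_full[:, :64]         # truncated (nested) view

score = linear_cka(tokens_full, tokens_small)

# Hypothetical alignment penalty: zero when the two relational structures match.
alignment_loss = 1.0 - score
```

Because linear CKA compares token-by-token similarity structure rather than raw coordinates, it can relate representations of different widths, which is what makes it a natural fit for comparing full and truncated embeddings.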


Source: arXiv Generated at: 2026-06-03 00:00:00 UTC
