arXiv

DSL-Topic: Improving Topic Modeling by Distilling Soft Labels from Language Models

June 4, 2026 · Raymond Li, Amirhossein Abaskohi, Chuyuan Li, Gabriel Murray, Giuseppe Carenini

Title: Enhancing Topic Modeling Through Language Model Distillation of Soft Labels

Abstract: Conventional neural topic models generally rely on optimizing the reconstruction of Bag-of-Words (BoW) representations, a process that frequently neglects contextual nuances and faces challenges related to data sparsity. To address these limitations, this study presents a new training framework for topic models known as Distilling Soft Labels (DSL) from Language Models (LMs). By projecting next-token probabilities—conditioned on a specific prompt—onto a predefined vocabulary, the method generates contextually rich reconstruction signals. The topic models are then trained to reconstruct these soft labels using hidden states derived from the LM. This approach yields superior topics that better reflect the corpus’s thematic architecture. Comprehensive experiments reveal that DSL significantly enhances both topic coherence and assignment accuracy compared to current baseline methods. Furthermore, we propose a retrieval-based evaluation metric, which indicates that our method substantially surpasses existing techniques in locating semantically related documents, thereby underscoring its value for applications focused on retrieval.
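The soft-label idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the projection or the loss is defined, so everything here is an assumption for illustration: the helper names (`project_to_vocab`, `soft_label_loss`), the token-to-word mapping, the renormalization step, and the use of cross-entropy as the reconstruction objective are all hypothetical, not the authors' implementation.

```python
import math

def project_to_vocab(lm_probs, token_to_word, vocab):
    """Sum LM next-token probability onto topic-vocabulary words, then renormalize.

    Assumption: tokens that map to no vocabulary word are simply dropped.
    """
    index = {w: i for i, w in enumerate(vocab)}
    soft = [0.0] * len(vocab)
    for token_id, p in enumerate(lm_probs):
        word = token_to_word.get(token_id)
        if word in index:
            soft[index[word]] += p
    total = sum(soft)
    if total == 0.0:
        raise ValueError("no probability mass falls on the vocabulary")
    return [s / total for s in soft]

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]

def soft_label_loss(soft_labels, recon_logits):
    """Cross-entropy between the soft labels and the reconstructed word distribution."""
    recon = softmax(recon_logits)
    return -sum(q * math.log(p) for q, p in zip(soft_labels, recon) if q > 0.0)

# Toy data (hypothetical): a 4-token LM vocabulary and a 3-word topic vocabulary.
lm_probs = [0.5, 0.2, 0.2, 0.1]
token_to_word = {0: "goal", 1: "match", 2: "the", 3: "bank"}
vocab = ["goal", "match", "bank"]

soft_labels = project_to_vocab(lm_probs, token_to_word, vocab)
# recon_logits stands in for the topic model's decoder output for one document.
loss = soft_label_loss(soft_labels, [2.0, 1.0, 0.0])
```

In this sketch the stop word "the" falls outside the topic vocabulary, so its probability is discarded before renormalizing. In a real training loop the reconstruction logits would come from a topic model fed with LM hidden states, as the abstract describes.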

Source: arXiv · Generated at: 2026-06-04 00:00:00 UTC

Bloomberg

Dimon and SpaceX Executives to Pitch IPO to Clients

June 4, 2026

JPMorgan Chase CEO Jamie Dimon and SpaceX executives are pitching IPO details to clients.

Financial Times

Europe is finally flexing its innovation muscles

June 4, 2026

The EU’s new tech sovereignty package signals a positive shift from defensive regulation to proactive innovation, markin...

Bloomberg

Apollo’s Zelter Expects High-Grade Debt Sales to Top US Treasuries

June 4, 2026

Apollo’s Zelter expects high-grade debt sales to surpass US Treasuries. He anticipates investment-grade debt outperformi...

Bloomberg

EU Insurance Watchdog Warns on Loan Risks

June 4, 2026

EIOPA warns insurers to closely monitor loan risks, though initial reports lack specific details on the nature or scope ...

Bloomberg

Glazer Family Members Said to Study Manchester United Stake Sale

June 4, 2026

Reports indicate the Glazer family is evaluating a potential sale of their Manchester United stake, with family members ...

Bloomberg

Ares' Blair Jacobson: Disconnect Over Private Credit Headlines

June 4, 2026

Ares’ Blair Jacobson argues that private credit headlines misrepresent reality, highlighting a disconnect between media ...
