arXiv

Balancing Knowledge Distillation for Imbalance Learning with Bilevel Optimization

Title: Optimizing Knowledge Distillation for Imbalanced Learning via Bilevel Strategies

Abstract: Knowledge distillation transfers knowledge from a high-capacity teacher model to a more compact student by training the student on a weighted combination of a hard loss (against the ground-truth labels) and a soft loss (against the teacher’s outputs). On imbalanced datasets, however, keeping the weighting between these two losses fixed can destabilize training. Recent work reweights the two components for long-tailed distributions, but most existing approaches do not adjust the weights at the individual sample level and overlook how the student’s behavior changes throughout training.
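To make the fixed-weight objective concrete, here is a minimal sketch assuming the common temperature-scaled formulation of distillation (cross-entropy for the hard loss, KL divergence between softened teacher and student distributions for the soft loss). The abstract does not state the exact loss, so the names `alpha` and `temperature`, their default values, and the example logits are assumptions for illustration only.

```python
import math

def softmax(logits, temperature=1.0):
    # Numerically stable softmax over temperature-scaled logits.
    m = max(logits)
    exps = [math.exp((z - m) / temperature) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def hard_loss(student_logits, label):
    # Cross-entropy between the student's prediction and the true label.
    return -math.log(softmax(student_logits)[label])

def soft_loss(student_logits, teacher_logits, temperature=2.0):
    # temperature^2 * KL(teacher || student) on softened distributions.
    p_t = softmax(teacher_logits, temperature)
    p_s = softmax(student_logits, temperature)
    kl = sum(pt * math.log(pt / ps) for pt, ps in zip(p_t, p_s))
    return temperature ** 2 * kl

def kd_loss(student_logits, teacher_logits, label, alpha=0.5, temperature=2.0):
    # One static alpha is shared by every sample (assumed baseline form).
    hard = hard_loss(student_logits, label)
    soft = soft_loss(student_logits, teacher_logits, temperature)
    return (1.0 - alpha) * hard + alpha * soft

# Hypothetical logits for a 3-class problem.
student = [1.0, 0.2, -0.5]
teacher = [2.0, 0.1, -1.0]
loss = kd_loss(student, teacher, label=0)
```

Because `alpha` is the same for head-class and tail-class samples alike, this form cannot favor the ground-truth signal on some samples and the teacher signal on others.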

To overcome these limitations, we introduce BiKD, a bilevel optimization framework that dynamically balances the hard and soft losses on a per-sample basis. A weight generation network produces adaptive weights for each sample, guided by a small, balanced validation set. The student model is then trained on the resulting weighted combination of hard and soft losses, enabling it to optimize both terms effectively. Additionally, we present a multi-step stochastic gradient descent (SGD) strategy to improve both the accuracy and the efficiency of optimizing the weight generation network. Experiments on long-tailed CIFAR-10 and CIFAR-100 show that our method outperforms contemporary balanced distillation techniques across various imbalance factors.
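The per-sample weighting and its validation-guided update can be sketched as follows. This is a toy illustration under strong assumptions, not the BiKD implementation: the student is a linear softmax classifier, the per-sample weights `a` (hard) and `b` (soft) are free parameters rather than outputs of a weight generation network, and the outer gradient uses a single look-ahead SGD step instead of the multi-step strategy. All function names, data, and hyperparameters are hypothetical.

```python
import math

def softmax(z, t=1.0):
    m = max(z)
    e = [math.exp((v - m) / t) for v in z]
    s = sum(e)
    return [v / s for v in e]

def logits(W, x):
    # Linear student: one weight row per class.
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def hard_grad(W, x, y):
    # Gradient of cross-entropy w.r.t. W: outer(p - onehot(y), x).
    p = softmax(logits(W, x))
    return [[(p[k] - (1.0 if k == y else 0.0)) * xi for xi in x]
            for k in range(len(W))]

def soft_grad(W, x, teacher_logits, t=2.0):
    # Gradient of t^2 * KL(teacher || student) w.r.t. W: t * outer(p_s - p_t, x).
    p_s = softmax(logits(W, x), t)
    p_t = softmax(teacher_logits, t)
    return [[t * (p_s[k] - p_t[k]) * xi for xi in x] for k in range(len(W))]

def axpy(acc, g, scale):
    # Returns acc + scale * g, elementwise on nested lists.
    return [[u + scale * v for u, v in zip(ra, rg)] for ra, rg in zip(acc, g)]

def dot(g1, g2):
    return sum(u * v for r1, r2 in zip(g1, g2) for u, v in zip(r1, r2))

def lookahead(W, batch, a, b, lr):
    # Inner level: one SGD step on (1/n) * sum_i [a_i * hard_i + b_i * soft_i].
    n = len(batch)
    W_new = W
    for (x, y, t_logits), a_i, b_i in zip(batch, a, b):
        W_new = axpy(W_new, hard_grad(W, x, y), -lr * a_i / n)
        W_new = axpy(W_new, soft_grad(W, x, t_logits), -lr * b_i / n)
    return W_new

def val_loss(W, val):
    # Outer level: plain cross-entropy on the balanced validation set.
    return sum(-math.log(softmax(logits(W, x))[y]) for x, y in val) / len(val)

def hypergrad(W, batch, val, a, b, lr):
    # Chain rule through the look-ahead step:
    # d val_loss / d a_i = -(lr / n) * <grad val_loss(W_new), grad hard_i(W)>.
    n = len(batch)
    W_new = lookahead(W, batch, a, b, lr)
    g_val = [[0.0] * len(W[0]) for _ in W]
    for x, y in val:
        g_val = axpy(g_val, hard_grad(W_new, x, y), 1.0 / len(val))
    grad_a = [-lr / n * dot(g_val, hard_grad(W, x, y)) for x, y, _ in batch]
    grad_b = [-lr / n * dot(g_val, soft_grad(W, x, tl)) for x, _, tl in batch]
    return grad_a, grad_b

# Toy data: 3 classes, 2 features; the training batch is skewed toward class 0.
W = [[0.1, -0.2], [0.0, 0.3], [-0.1, 0.1]]
batch = [([1.0, 0.5], 0, [2.0, 0.1, -1.0]),
         ([-0.3, 1.2], 0, [1.5, 0.2, 0.0]),
         ([0.8, -1.0], 2, [-0.5, 0.0, 1.8])]
val = [([1.0, 0.0], 0), ([0.0, 1.0], 1), ([-1.0, -1.0], 2)]
a = [0.5, 0.5, 0.5]
b = [0.5, 0.5, 0.5]
grad_a, grad_b = hypergrad(W, batch, val, a, b, lr=0.1)
```

A negative entry in `grad_a` or `grad_b` means that raising that sample's weight would lower the balanced validation loss after the look-ahead step, so a gradient step on the weights would increase it. In a setup with a weight generation network, these per-sample gradients would instead be backpropagated into the network's parameters.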


Source: arXiv
Generated at: 2026-06-02 00:00:00 UTC
