DerMAE: Improving skin lesion classification through conditioned latent diffusion and MAE distillation
Title: DerMAE: Enhancing Skin Lesion Classification via Conditioned Latent Diffusion and MAE Distillation
Abstract: Deep learning models for skin lesion classification often suffer from class imbalance: malignant cases are scarce, which skews decision boundaries during training. To mitigate this, we use class-conditioned diffusion models to generate synthetic dermatological images and use them for self-supervised Masked Autoencoder (MAE) pretraining, so that large Vision Transformer (ViT) models learn robust, domain-relevant features. Because clinical settings call for lightweight models, we then apply knowledge distillation to transfer these representations to a compact ViT student suited to mobile deployment. Our experiments show that combining MAE pretraining on synthetic data with distillation improves classification accuracy and enables efficient on-device inference, supporting practical clinical use.
Source: arXiv
Generated at: 2026-06-02 00:00:00 UTC
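
The abstract does not say which distillation objective is used to train the compact ViT student. As an illustration only, the sketch below assumes a common soft-target formulation: a temperature-scaled KL term between teacher and student predictions, blended with cross-entropy on the true label. The function names and the `temperature` and `alpha` values are hypothetical and are not taken from the paper.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """Assumed objective: alpha * T^2 * KL(teacher || student) + (1 - alpha) * CE.

    This is a generic soft-target loss, not necessarily the one used in DerMAE.
    """
    soft_teacher = softmax(teacher_logits, temperature)
    soft_student = softmax(student_logits, temperature)
    # T^2 keeps the soft-target gradient scale comparable across temperatures.
    soft_term = (temperature ** 2) * kl_divergence(soft_teacher, soft_student)
    # Standard cross-entropy of the student on the ground-truth class.
    hard_term = -math.log(softmax(student_logits)[label] + 1e-12)
    return alpha * soft_term + (1 - alpha) * hard_term

# Toy two-class example (index 0 = malignant, 1 = benign); logits are made up.
teacher = [2.0, 0.5]
loss_agree = distillation_loss([2.0, 0.5], teacher, label=0)
loss_disagree = distillation_loss([0.5, 2.0], teacher, label=0)
```

In this toy setup, a student that reproduces the teacher's logits pays only the cross-entropy term, while a student that disagrees with the teacher is penalized by both terms.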





