Self-Soupervision: Cooking Model Soups without Labels
Title: Self-Soupervision: Preparing Model Soups Without Labels
Abstract: Model soups represent a peculiar yet highly effective method of combining parameters. This process involves taking a base model (the stock), fine-tuning it into several distinct variants (the ingredients), and then blending their parameters back into a single entity (the soup) to enhance predictive performance. Although existing soup techniques rely on supervised learning and optimize for identical losses on labeled datasets, our approach, termed Self-Soupervision, extends the concept of soups to the realm of self-supervised learning (SSL). By utilizing Self-Souping, we can customize ingredients using new data sources, such as unlabeled data from transfer tasks or data shifts designed to improve robustness. Our experiments demonstrate that applying Self-Souping to corrupted test data, followed by fine-tuning on uncorrupted training data, increases robustness by 3.5% on ImageNet-C and 7% on LAION-C. Furthermore, Self-Soupervision enables the use of a wide array of SSL algorithms to generate the diverse ingredients necessary for more resilient models. For the first time, we demonstrate that these ingredients can vary not only in their SSL hyperparameters but also, surprisingly, in their underlying SSL algorithms. We present soups composed of ingredients from MAE, MoCoV3, MMCR, and LeJEPA, which achieve higher accuracy than any individual SSL component alone.
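The stock/ingredients/soup process described in the abstract can be illustrated with a minimal sketch. It assumes uniform element-wise averaging, which is one common way to blend parameters; the abstract does not state which blending rule Self-Soupervision uses. The `Params` format, the `uniform_soup` helper, and the toy values are hypothetical and chosen only for illustration.

```python
from typing import Dict, List

# Hypothetical flat parameter format: parameter name -> list of float values.
Params = Dict[str, List[float]]


def uniform_soup(ingredients: List[Params]) -> Params:
    """Blend ingredients by averaging each parameter element-wise.

    Assumes every ingredient was fine-tuned from the same stock, so all
    ingredients share parameter names and shapes.
    """
    if not ingredients:
        raise ValueError("need at least one ingredient")
    soup: Params = {}
    for name in ingredients[0]:
        tensors = [ingredient[name] for ingredient in ingredients]
        # zip(*tensors) groups the i-th value of this parameter across ingredients.
        soup[name] = [sum(values) / len(values) for values in zip(*tensors)]
    return soup


# Two toy ingredients standing in for variants fine-tuned from one stock.
ingredient_a = {"w": [1.0, 2.0], "b": [0.0]}
ingredient_b = {"w": [3.0, 4.0], "b": [1.0]}
soup = uniform_soup([ingredient_a, ingredient_b])
```

In this sketch, ingredients produced with different SSL hyperparameters or different SSL algorithms would be blended the same way, provided they share one architecture and one starting stock.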
Source: arXiv
Generated at: 2026-06-03 00:00:00 UTC



