IdEst: Assessing Self-Supervised Learning Representations via Intrinsic Dimension
Title: IdEst: Evaluating Self-Supervised Learning Representations Through the Lens of Intrinsic Dimension
Abstract:
While self-supervised learning (SSL) has become a dominant approach for extracting meaningful features from unlabeled data, its standard evaluation method—linear probing—suffers from high computational demands, sensitivity to hyperparameters, and a lack of insight into the underlying geometry of the representation space. Drawing on the established link between neural network generalization and intrinsic dimension (ID), we introduce IdEst, a novel technique for quantifying the ID of SSL representations using the Minimum Spanning Tree dimension estimator ($\mathrm{dim}_\mathrm{MST}$). Our experiments, conducted across a wide range of datasets, model architectures, and pretraining objectives, reveal that IdEst exhibits a strong correlation with downstream performance in linear probing tasks. Additionally, we show that IdEst facilitates efficient hyperparameter tuning, offering a substantial reduction in computational overhead compared to supervised baselines. These findings position intrinsic dimensionality as a robust, geometrically grounded metric for assessing SSL representations, serving as a valuable complement to traditional supervised probing methods.
Source: arXiv
Generated at: 2026-06-03 00:00:00 UTC
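
To make the idea of a minimum-spanning-tree dimension estimate concrete, here is a minimal sketch. It is not the authors' implementation, and the abstract does not specify how IdEst computes $\mathrm{dim}_\mathrm{MST}$. The sketch assumes the standard scaling relation for Euclidean MSTs: for $n$ points sampled from a $d$-dimensional set, the sum of MST edge lengths raised to a power $\alpha$ grows roughly like $n^{(d-\alpha)/d}$. Fitting a slope $m$ to $\log(\text{length})$ versus $\log n$ then gives $d \approx \alpha / (1 - m)$. The function names (`mst_length`, `estimate_mst_dimension`), the choice $\alpha = 1$, the subsample sizes, and the synthetic stand-in for SSL features are all illustrative assumptions.

```python
import numpy as np


def mst_length(points: np.ndarray, alpha: float = 1.0) -> float:
    """Sum of (edge length ** alpha) over the Euclidean MST (Prim's algorithm)."""
    n = points.shape[0]
    in_tree = np.zeros(n, dtype=bool)
    in_tree[0] = True
    # best[j] = distance from point j to its nearest point already in the tree
    best = np.linalg.norm(points - points[0], axis=1)
    best[0] = np.inf
    total = 0.0
    for _ in range(n - 1):
        j = int(np.argmin(best))        # closest point still outside the tree
        total += best[j] ** alpha
        in_tree[j] = True
        dist_to_j = np.linalg.norm(points - points[j], axis=1)
        best = np.minimum(best, dist_to_j)
        best[in_tree] = np.inf          # tree members are never re-selected
    return float(total)


def estimate_mst_dimension(points, sample_sizes, alpha=1.0, n_trials=5, seed=0):
    """Fit log(MST length) against log(n) and convert the slope to a dimension.

    Assumes length ~ n ** ((d - alpha) / d), so d = alpha / (1 - slope).
    """
    rng = np.random.default_rng(seed)
    log_n, log_len = [], []
    for n in sample_sizes:
        lengths = []
        for _ in range(n_trials):
            # random subsample of n representations, without replacement
            idx = rng.choice(points.shape[0], size=n, replace=False)
            lengths.append(mst_length(points[idx], alpha))
        log_n.append(np.log(n))
        log_len.append(np.log(np.mean(lengths)))
    slope, _intercept = np.polyfit(log_n, log_len, 1)
    return alpha / (1.0 - slope)


# Synthetic stand-ins for representation matrices (rows = samples):
# a 2-D sheet and a 1-D segment, each linearly embedded in 16 dimensions.
rng = np.random.default_rng(0)
ambient_dim = 16
plane = rng.uniform(size=(1000, 2)) @ rng.normal(size=(2, ambient_dim))
line = rng.uniform(size=(1000, 1)) @ rng.normal(size=(1, ambient_dim))

sizes = [100, 200, 400, 800]
id_plane = estimate_mst_dimension(plane, sizes)
id_line = estimate_mst_dimension(line, sizes)
print(f"estimated ID, plane: {id_plane:.2f}")
print(f"estimated ID, line:  {id_line:.2f}")
```

In this toy setting the estimate depends on the intrinsic structure of the data rather than on the 16-dimensional ambient space, so the sheet should score higher than the segment. With real SSL features one would pass the encoder's output matrix in place of `plane`. Finite-sample bias and the choice of subsample sizes can shift the estimate noticeably, so the numbers are best read comparatively.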