What Cosine Similarity of Label Representations Can and Cannot Tell Us
Title: The Limitations and Specific Utility of Cosine Similarity in Label Representations
Abstract
Cosine similarity is a standard measure of how alike two vector representations in a neural network are, but it is not necessarily informative about the probabilities a model assigns. We show that for softmax classifiers, which include both autoregressive language models and image classifiers, the cosine similarity between label representations (called "unembeddings" here) carries no information about the model's output probabilities. Specifically, we prove that for any softmax classifier and any two of its unembeddings, one can construct an alternative model that assigns identical probabilities to every label on every input, yet in which the cosine similarity between those two unembeddings is either 1 or -1.
For sigmoid classifiers, which can assign several labels to a single input, the situation is different: the complete set of pairwise cosine similarities between unembeddings determines which label combinations are possible. For softmax classifiers that output a ranked list of labels from most to least probable, determining which predictions are possible instead requires the pairwise cosine similarities among all differences of unembeddings. We therefore argue that interpreting the cosine similarity of unembeddings in isolation, without regard to the classifier that produced them, is misleading.
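The softmax claim can be checked numerically with a small sketch. It assumes a bias-free softmax classifier whose logit for label i is the dot product of unembedding u_i with a hidden state h, and it uses one possible construction (adding the same vector c to every unembedding), which is not necessarily the one used in the paper's proof. The shift adds the same constant c·h to every logit, so the softmax output is unchanged, while c can be chosen so that two selected unembeddings become exactly parallel or antiparallel. All names and numbers below (U, h, c_neg, c_pos) are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    return dot(a, b) / math.sqrt(dot(a, a) * dot(b, b))


def softmax_probs(unembeddings, h):
    # Logit for each label is <u_i, h>; subtract the max for numerical stability.
    logits = [dot(u, h) for u in unembeddings]
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def shift(unembeddings, c):
    # Add the same vector c to every unembedding.
    return [[x + y for x, y in zip(u, c)] for u in unembeddings]


# Hypothetical toy model: three labels, three-dimensional hidden state.
U = [[1.0, 0.0, 2.0], [0.0, 1.0, -1.0], [0.5, -0.5, 0.3]]
h = [0.7, -1.2, 0.4]
u1, u2 = U[0], U[1]

# c = -(u1 + u2) / 2 maps u1 and u2 to +(u1 - u2)/2 and -(u1 - u2)/2: cosine -1.
c_neg = [-(a + b) / 2 for a, b in zip(u1, u2)]
# c = u1 - 2*u2 maps u1 and u2 to 2*(u1 - u2) and (u1 - u2): cosine +1.
c_pos = [a - 2 * b for a, b in zip(u1, u2)]

U_neg = shift(U, c_neg)
U_pos = shift(U, c_pos)

p = softmax_probs(U, h)
p_neg = softmax_probs(U_neg, h)
p_pos = softmax_probs(U_pos, h)

print("original cosine:", cosine(U[0], U[1]))
print("shifted cosines:", cosine(U_neg[0], U_neg[1]), cosine(U_pos[0], U_pos[1]))
print("max probability change:",
      max(abs(a - b) for a, b in zip(p, p_neg)),
      max(abs(a - b) for a, b in zip(p, p_pos)))
```

In this sketch the differences u_i - u_j are untouched by the shift, which is consistent with the abstract's point that, for softmax classifiers, the informative quantities are the differences of unembeddings rather than the unembeddings themselves.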
Source: arXiv
Generated at: 2026-06-02 00:00:00 UTC





