A Shared Valence Axis Across Modern LLMs and Human EEG: The Saturation Regularity
Abstract:
Large language models (LLMs) have become powerful representation learners, and their internal features increasingly align with human cognitive processes. This study asks whether contemporary LLMs can serve as a reference frame for interpreting neural representations in the human brain, focusing on emotional valence in EEG data. First, we construct a one-dimensional valence direction, the V-axis, from modern LLMs using only nine emotion-evoking sentences. We validate this axis through zero-shot transfer to sentiment benchmarks and through cross-model consistency across fourteen LLMs. Second, we show that this LLM-derived direction corresponds to human neural activity. In a public EEG dataset of 123 subjects viewing affective videos, a single linear projection of EEG features tracks the V-axis position of each stimulus. Furthermore, 36 EEG emotion classifiers, trained without any exposure to the V-axis, spontaneously recover the same direction in their internal representations. This suggests that the same underlying valence structure arises in both language models and human electrophysiology.
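The two steps described above, deriving a valence direction from a handful of sentence embeddings and reading out its position from EEG features with one linear map, can be illustrated with a minimal sketch. Everything here is an assumption for illustration only: the abstract does not state how the V-axis is computed, so the sketch uses a difference of class means; the 5/4 split of the nine sentences, the synthetic embeddings, the synthetic EEG features, and the function names `valence_axis`, `fit_linear_projection`, and `apply_linear_projection` are all hypothetical.

```python
import numpy as np

def valence_axis(pos_emb, neg_emb):
    """Unit vector pointing from the mean negative to the mean positive embedding.

    Assumption: a difference-of-means direction; the source does not specify
    the actual construction.
    """
    direction = pos_emb.mean(axis=0) - neg_emb.mean(axis=0)
    return direction / np.linalg.norm(direction)

def fit_linear_projection(features, targets):
    """Least-squares weights (including a bias term) mapping features to a scalar."""
    X = np.hstack([features, np.ones((features.shape[0], 1))])
    w, *_ = np.linalg.lstsq(X, targets, rcond=None)
    return w

def apply_linear_projection(features, w):
    """Apply weights returned by fit_linear_projection."""
    X = np.hstack([features, np.ones((features.shape[0], 1))])
    return X @ w

# Synthetic stand-ins for LLM sentence embeddings (hypothetical data).
rng = np.random.default_rng(0)
dim = 16
true_dir = np.zeros(dim)
true_dir[0] = 1.0
pos = 0.1 * rng.normal(size=(5, dim)) + true_dir   # 5 "positive" sentences
neg = 0.1 * rng.normal(size=(4, dim)) - true_dir   # 4 "negative" sentences
v_axis = valence_axis(pos, neg)

# V-axis position of each stimulus: projection of its embedding onto the axis.
stim_emb = rng.normal(size=(30, dim))
v_pos = stim_emb @ v_axis

# Synthetic EEG features that linearly encode v_pos plus a little noise.
mix = rng.normal(size=8)
eeg = np.outer(v_pos, mix) + 0.05 * rng.normal(size=(30, 8))

# One linear projection of the EEG features, fitted to track v_pos.
w = fit_linear_projection(eeg, v_pos)
pred = apply_linear_projection(eeg, w)
corr = np.corrcoef(pred, v_pos)[0, 1]
```

In this toy setting the correlation is computed in-sample on data built to be linearly decodable, so it says nothing about real EEG; a held-out or cross-subject evaluation would be needed for any actual claim.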
However, this convergence does not translate into an effective training signal. We evaluated twenty-five distinct alignment strategies, encompassing knowledge distillation, representational similarity, contrastive learning, and topographic losses. None of these methods enhanced decoding performance; in fact, sixteen of them significantly degraded accuracy. We formalize this phenomenon as the "saturation regularity": once task labels alone guide a brain-decoding network toward the target direction, additional supervision primarily distorts an already saturated basin, while the load-bearing within-class residual receives insufficient useful gradient. This regularity also highlights where future improvements should focus: the residual subspace that remains inaccessible to supervision. Guided by this insight, we employ an ensemble approach leveraging residual diversity rather than supervising the basin. This strategy improved balanced accuracy by 10.5% over the previous best result on the FACED dataset, with the same effect replicated on the SEED-V dataset.
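The idea of gaining accuracy from diversity among models, rather than from extra supervision, can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not say how the ensemble is built, so this assumes plain soft voting (averaging class probabilities), and the three member models, their probabilities, and the helper names `balanced_accuracy` and `soft_vote` are hypothetical.

```python
import numpy as np

def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall."""
    classes = np.unique(y_true)
    recalls = [np.mean(y_pred[y_true == c] == c) for c in classes]
    return float(np.mean(recalls))

def soft_vote(prob_list):
    """Average class probabilities across members and take the argmax."""
    return np.mean(np.stack(prob_list, axis=0), axis=0).argmax(axis=1)

y_true = np.array([0, 0, 1, 1])

# Three hypothetical members, each wrong on a different sample.
member_a = np.array([[0.4, 0.6], [0.9, 0.1], [0.1, 0.9], [0.1, 0.9]])
member_b = np.array([[0.9, 0.1], [0.4, 0.6], [0.1, 0.9], [0.1, 0.9]])
member_c = np.array([[0.9, 0.1], [0.9, 0.1], [0.6, 0.4], [0.1, 0.9]])

member_scores = [
    balanced_accuracy(y_true, p.argmax(axis=1))
    for p in (member_a, member_b, member_c)
]
ensemble_score = balanced_accuracy(y_true, soft_vote([member_a, member_b, member_c]))
```

Because the members err on different samples, averaging their probabilities corrects each individual mistake in this toy case; members that make the same errors would gain nothing from voting.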
Source: arXiv
Generated at: 2026-06-02 00:00:00 UTC




