arXiv

Escaping the BLEU Trap: A Signal-Grounded Framework with Decoupled Semantic Guidance for EEG-to-Text Decoding

June 2, 2026 · Yuchen Wang, Haonan Wang, Yu Guo, Honglong Yang, Xiaomeng Li

Abstract:

Translating non-invasive EEG signals into natural language represents a highly promising but difficult endeavor. Despite recent advancements, existing state-of-the-art models are hindered by three critical limitations: Semantic Bias, which causes generated outputs to devolve into generic linguistic patterns; Signal Neglect, where models over-rely on Large Language Model (LLM) priors to produce fluent text that lacks grounding in actual neural data; and the "BLEU Trap," a phenomenon where high-frequency function words artificially boost n-gram scores, thereby obscuring the absence of genuine semantic accuracy.
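The BLEU Trap described above can be illustrated with a small sketch. This is not the authors' evaluation code: the two sentences are invented, `FUNCTION_WORDS` is a hypothetical, deliberately tiny stop list, and `clipped_precision` is a plain reimplementation of the clipped n-gram precision that BLEU is built on. It shows how a generic output that shares only function words with the reference can still score well on unigram overlap.

```python
from collections import Counter


def clipped_precision(hypothesis, reference, n=1):
    """Clipped n-gram precision, the building block of BLEU."""
    hyp = hypothesis.lower().split()
    ref = reference.lower().split()
    hyp_ngrams = Counter(tuple(hyp[i:i + n]) for i in range(len(hyp) - n + 1))
    ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
    total = sum(hyp_ngrams.values())
    if total == 0:
        return 0.0
    # Each hypothesis n-gram is credited at most as often as it occurs in the reference.
    matched = sum(min(count, ref_ngrams[gram]) for gram, count in hyp_ngrams.items())
    return matched / total


# Hypothetical stop list, chosen only to cover the example sentences below.
FUNCTION_WORDS = {"the", "was", "a", "of", "in", "his"}


def content_only(sentence):
    """Drop function words so that only content words are compared."""
    return " ".join(w for w in sentence.lower().split() if w not in FUNCTION_WORDS)


# Invented sentences, for illustration only.
reference = "the film was a moving portrait of the artist in his final years"
generic = "the man was a member of the family in the city"

# 6 of the 11 hypothesis unigrams match the reference, all of them function words.
print(round(clipped_precision(generic, reference), 3))  # 0.545
# With function words removed, no content word matches.
print(clipped_precision(content_only(generic), content_only(reference)))  # 0.0
```

The generic sentence shares no content word with the reference, yet more than half of its unigrams match. This is the kind of inflation that an n-gram score alone cannot distinguish from genuine semantic accuracy.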

To address these issues, we introduce SemKey, a novel multi-stage architecture that moves away from standard end-to-end approaches. SemKey enforces signal-grounded generation by leveraging four distinct semantic objectives: sentiment, topic, length, and surprisal. These semantic anchors are extracted directly from EEG embeddings and integrated with an Active Retrieval Decoding mechanism. This design forces the LLM to base its token generation on neural signals rather than defaulting to linguistic priors. Additionally, we dismantle the BLEU Trap by implementing a rigorous evaluation protocol that utilizes distribution-based and retrieval metrics, such as Fréchet Distance. Our extensive experiments show that SemKey significantly reduces hallucinations on noisy inputs and achieves state-of-the-art performance under these robust evaluation frameworks. The code will be made available at https://github.com/xmed-lab/SemKey upon acceptance.
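The abstract does not state which form of the Fréchet Distance is used or which embedding space it is computed in. For orientation only, a commonly used closed form assumes that embeddings of the reference texts and of the generated texts are each summarized by a Gaussian, with means $\mu_r, \mu_g$ and covariances $\Sigma_r, \Sigma_g$ (these symbols are assumptions, not notation from the paper):

```latex
d_F^2\big(\mathcal{N}(\mu_r, \Sigma_r),\, \mathcal{N}(\mu_g, \Sigma_g)\big)
  = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\big(\Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2}\big)
```

Because this quantity compares whole distributions of embeddings rather than counting shared n-grams, matching function words in individual sentences does not lower it directly.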
