Global News Digest

arXiv

LLMs Need Encoders for Semantic IDs Too

arXiv:2606.00324v1
Announcement Type: Cross

Abstract:

In multimodal large language models (LLMs), dedicated encoders are essential for connecting non-textual modalities (such as vision encoders for images or depth models for audio codec tokens), since raw token embeddings fail to capture modality-specific structure. This paper posits that Semantic IDs (SIDs), the hierarchical codes used in generative recommendation systems, are another such modality: the meaning of a token at a given SID level depends on the prefix that precedes it. However, current approaches typically add SID tokens directly to the vocabulary and rely on training alone to learn these context-dependent meanings from scratch.

To address this, we introduce PrefixMem, a lightweight SID encoder that utilizes prefix n-gram memory tables. This architecture delivers structured, prefix-conditioned representations to the LLM at SID token positions. Similar to vision encoders in multimodal architectures, PrefixMem can undergo independent pre-training before being integrated into any LLM for joint fine-tuning.
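The abstract does not spell out how the prefix n-gram memory tables are built, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a three-level SID with a small codebook, one exact lookup table per n-gram order and level, and summation of the looked-up rows. The class name `PrefixNgramMemory`, its parameters, and the random stand-in weights are all hypothetical.

```python
import numpy as np


class PrefixNgramMemory:
    """Toy prefix-conditioned SID encoder; a hypothetical sketch, not the paper's code."""

    def __init__(self, num_levels=3, codebook_size=32, max_n=3, dim=8, seed=0):
        rng = np.random.default_rng(seed)
        self.codebook_size = codebook_size
        self.max_n = max_n
        self.dim = dim
        # tables[n - 1][level] holds one row per possible n-gram of codes.
        # Random values stand in for weights that would normally be learned.
        self.tables = [
            rng.normal(scale=0.02, size=(num_levels, codebook_size**n, dim))
            for n in range(1, max_n + 1)
        ]

    def _row(self, ngram):
        # Mixed-radix index: every distinct n-gram maps to a distinct row.
        idx = 0
        for code in ngram:
            idx = idx * self.codebook_size + code
        return idx

    def encode(self, sid):
        """Return an array of shape (len(sid), dim), one vector per SID level."""
        out = np.zeros((len(sid), self.dim))
        for level in range(len(sid)):
            # Use every n-gram order that fits inside the prefix ending here.
            for n in range(1, min(self.max_n, level + 1) + 1):
                ngram = sid[level + 1 - n : level + 1]
                out[level] += self.tables[n - 1][level, self._row(ngram)]
        return out


memory = PrefixNgramMemory()
a = memory.encode([5, 17, 3])
b = memory.encode([9, 17, 3])
# Both SIDs end in code 3, but the prefixes differ, so the vectors differ.
same_token_differs = not np.allclose(a[2], b[2])
```

In this sketch the final-level code 3 receives a different vector under each prefix, which is the prefix-conditioning the abstract describes. How such vectors would be combined with the LLM's own token embeddings at SID positions is not stated in the abstract and is left out here.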

Evaluated on large-scale Pinterest data across several LLM families, PrefixMem improves deepest-level SID accuracy by up to 46% (relative) and full-SID retrieval recall by up to 22% (relative) at matched training compute. The encoder’s advantages are most pronounced in challenging cases where greedy decoding falls short, with relative accuracy improvements of up to 77%. These findings confirm that, like other non-language modalities, SID tokens benefit significantly from a dedicated encoder.


Source: arXiv
Generated at: 2026-06-02 00:00:00 UTC

Related Articles

Schroders Renewable Unit Targets AI Assets as Power Demand Soars
Bloomberg

Schroders’ renewable unit targets AI infrastructure, pivoting to meet soaring energy demand from artificial intelligence...

State Street's Paglia on SBI Group Partnership, ETFs
Bloomberg

State Street's Paglia discusses the SBI Group partnership and ETFs.

Nvidia Boss Says Workers Should Be Paid ‘as Much as Possible’
Bloomberg

Nvidia CEO Jensen Huang advocates for paying workers “as much as possible,” emphasizing maximum compensation. This stanc...

TSE Talking With Regulator For Easing ETF Listing Rules
Bloomberg

The Tokyo Stock Exchange is discussing with regulators to ease ETF listing rules. This aims to simplify market access an...

S&P DJI CEO on Japan Markets, Mega IPOs
Bloomberg

S&P DJI CEO discusses Japan's financial markets and major IPOs.