arXiv

Beyond Retrieval: Learning Compact User Representations for Scalable LLM Personalization

June 4, 2026 · Heng Cao, Fan Zhang, Jian Yao, Yujie Zheng, Changlin Zhao, Lu Hao, Yuxuan Wei, Wangze Ni, Huaiyu Fu, Yuqian Sun, Xuyan Mo

Title: Advancing Scalable LLM Personalization Through Compact User Representations

Original: arXiv:2606.04547v1 (announce type: cross). Abstract: Personalizing large language models requires adapting model behavior to individual users while preserving robustness and deployment-scale efficiency. Existing approaches typically personalize LLMs either at the input level, by retrieving user histories or constructing profile prompts, or at the parameter level, by maintaining user-specific parameter-efficient modules. The former makes personalization sensitive to retrieval quality and prompt design, whereas the latter incurs storage and maintenance costs that grow with the user population. To address these limitations, we propose TAP-PER (Temporal Attentive Prefix for PERsonalization), a prefix-based framework that encodes user preferences as learnable representations, eliminating explicit prompt construction and replacing heavy per-user adapters with lightweight user-state prefix embeddings. Inspired by personalized recommendation systems, TAP-PER decomposes user modeling into user-state and query-conditioned components, and incorporates temporal signals to capture the evolving nature of user interests. Experiments on six LaMP tasks show that TAP-PER consistently outperforms prompt-based and model-based baselines across classification, rating, and generation settings. Moreover, TAP-PER uses 130x fewer per-user parameters than OPPU and roughly half the total parameter footprint of PER-PCS at the 1,000-user scale, demonstrating that scalable LLM personalization can be achieved without explicit prompt construction or heavy per-user adapters.

Rewritten: Personalizing large language models (LLMs) means adapting model behavior to individual users while remaining robust and efficient at deployment scale. Current methods generally fall into two categories: input-level personalization, which retrieves user histories or constructs profile prompts, and parameter-level personalization, which maintains a parameter-efficient module for each user. The input-level approach is sensitive to retrieval quality and prompt design, while the parameter-level approach incurs storage and maintenance costs that grow with the user population. To address these limitations, we introduce TAP-PER (Temporal Attentive Prefix for PERsonalization), a prefix-based framework that encodes user preferences as learnable representations. This removes the need for explicit prompt construction and replaces heavy per-user adapters with lightweight user-state prefix embeddings. Drawing inspiration from personalized recommendation systems, TAP-PER splits user modeling into user-state and query-conditioned components and incorporates temporal signals to capture how user interests evolve. Experiments on six LaMP tasks show that TAP-PER consistently outperforms both prompt-based and model-based baselines across classification, rating, and generation settings. Furthermore, TAP-PER uses 130x fewer per-user parameters than OPPU and roughly half the total parameter footprint of PER-PCS at the 1,000-user scale. These results indicate that scalable LLM personalization can be achieved without explicit prompt construction or heavy per-user adapters.
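
Illustration: the decomposition described in the abstract can be made concrete with a minimal sketch. This is not the authors' implementation. The abstract does not specify the architecture, so every detail here is an assumption: the function names (`query_conditioned_vector`, `build_prefix`), the scaled dot-product attention, the linear recency penalty `decay * age`, and the toy 4-dimensional embeddings are all hypothetical. The sketch shows one plausible way a fixed per-user state vector and a query-dependent, recency-weighted summary of the user's history could be combined into a two-vector prefix.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def query_conditioned_vector(query, history, ages, decay=0.1):
    """Attend over history embeddings, penalizing older items.

    Assumption: score = scaled dot product minus decay * age.
    The paper's actual temporal mechanism is not given in the abstract.
    """
    scale = math.sqrt(len(query))
    scores = [dot(query, h) / scale - decay * age
              for h, age in zip(history, ages)]
    weights = softmax(scores)
    dim = len(query)
    pooled = [sum(w * h[i] for w, h in zip(weights, history))
              for i in range(dim)]
    return pooled, weights


def build_prefix(user_state, query, history, ages, decay=0.1):
    """Return a two-vector prefix: [user-state, query-conditioned]."""
    pooled, weights = query_conditioned_vector(query, history, ages, decay)
    return [user_state, pooled], weights


# Toy data (hypothetical): one learned vector per user, three history items.
user_state = [0.5, -0.2, 0.1, 0.0]
history = [
    [1.0, 0.0, 0.0, 0.0],  # matches the query, but old
    [1.0, 0.0, 0.0, 0.0],  # matches the query, recent
    [0.0, 1.0, 0.0, 0.0],  # does not match the query, recent
]
ages = [30.0, 1.0, 1.0]    # e.g. days since each interaction
query = [1.0, 0.0, 0.0, 0.0]

prefix, weights = build_prefix(user_state, query, history, ages)
print("attention weights:", [round(w, 3) for w in weights])
print("prefix vectors:", len(prefix))
```

In this toy setup the recent, query-matching item receives the largest attention weight and the old item the smallest, even though the old item matches the query equally well. Under these assumptions, the only per-user stored state is the single `user_state` vector, which is the sense in which a prefix can be far lighter than a per-user adapter.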

Source: arXiv · Generated at: 2026-06-04 00:00:00 UTC