LiSeCo: Linear Semantic Control for Language Generation
Abstract:
As Large Language Models (LLMs) become increasingly integral to high-stakes applications, there is a growing demand for language generation techniques that offer both computational efficiency and reliable performance assurances. To meet this challenge, we leverage a prevailing perspective that concept semantics within LLMs are linearly encoded in their latent space. Specifically, we posit that natural language generation corresponds to a trajectory traversing this continuous semantic manifold, manifested through the model’s hidden activations.
This conceptual framework enables the application of control theory to text generation within the latent space. Building on this, we introduce Linear Semantic Control (LiSeCo), a lightweight, gradient-free intervention method that dynamically steers generation trajectories away from regions of the latent space associated with unwanted meanings. LiSeCo intervenes online, acting directly on the embedding-space activations of the token currently being generated.
Crucially, LiSeCo does more than guide activations toward a target area. Using classical control-theoretic techniques, it intervenes with context-dependent precision and guarantees that activations are confined to a pre-defined region of the embedding space that corresponds to acceptable semantics. The intervention is computed in closed form from an optimal controller formulation, which keeps its impact on generation latency minimal. Controlling activations in embedding space in this way allows fine-grained manipulation of the attributes of the generated sequence. Our experiments demonstrate that LiSeCo is effective on several steering tasks, including toxicity, sentiment, and language (English/Spanish), while preserving the overall quality of the text.
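To make the idea of a closed-form, minimum-effort intervention concrete, the following is a minimal sketch and not the paper's exact formulation. It assumes the unwanted region is a half-space defined by a hypothetical linear probe with weights w and bias b, so that an activation x is acceptable when w.x + b <= threshold. Under that assumption, the smallest-norm shift that makes x acceptable is the standard Euclidean projection onto the half-space. The names steer, dot, w, b, and threshold are illustrative and do not come from the abstract.

```python
from typing import List, Sequence


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def steer(x: Sequence[float], w: Sequence[float], b: float,
          threshold: float) -> List[float]:
    """Return x shifted by the minimum-norm correction so that
    w.x + b <= threshold (assumed linear-probe constraint).

    Activations that already satisfy the constraint are returned
    unchanged, so the correction depends on the current activation.
    """
    norm_sq = dot(w, w)
    if norm_sq == 0.0:
        raise ValueError("probe weights w must be non-zero")
    excess = dot(w, x) + b - threshold
    if excess <= 0.0:
        return list(x)  # already inside the allowed region
    scale = excess / norm_sq
    # Move against the probe direction just far enough to reach the boundary.
    return [xi - scale * wi for xi, wi in zip(x, w)]


if __name__ == "__main__":
    probe_w = [3.0, 4.0]  # hypothetical probe direction
    steered = steer([1.0, 2.0], probe_w, b=0.0, threshold=0.0)
    print(steered, dot(probe_w, steered))
```

In this sketch the correction is a single dot product and a scaled vector subtraction per activation, which illustrates why a closed-form controller of this kind can add little latency. No gradients or iterative optimization are involved.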
Source: arXiv
Generated at: 2026-06-04 00:00:00 UTC


