arXiv

Decomposing how prompting steers behavior

Title: Unpacking the Mechanisms of Prompt-Driven Behavioral Control

Abstract:

While prompting effectively directs the behavior of large language models (LLMs) and vision-language models (VLMs) without requiring weight adjustments, the precise manner in which instructional changes alter internal representations to generate these behaviors remains poorly understood. To address this, we propose a nested geometric decomposition framework that conceptualizes prompting as a transformation of the representational geometry associated with the content following the prompt.

For each pair of prompts, we align the representations of identical stimuli under different instructions using a hierarchy of increasingly complex stimulus-invariant maps: translation, rigid transformation with uniform scaling, sequential axis scaling, affine transformation, and nonlinear transformation. We then evaluate the causal impact of each map on held-out stimuli: we replace a single layer’s hidden state computed under prompt A with its mapped counterpart and measure how much of prompt B’s representational geometry and behavior is recovered.
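As a rough illustration of the nested alignment described above, the sketch below fits three of the tiers (translation, rigid transformation with uniform scaling, and affine) on synthetic data and scores each on held-out stimuli. Everything here is an assumption made for illustration: the arrays `A` and `B` stand in for one layer's hidden states under prompts A and B, the function names are hypothetical, and the score (share of the prompt-induced change explained) is one plausible choice rather than the paper's stated metric. The axis-scaling and nonlinear tiers are omitted.

```python
import numpy as np

def fit_translation(A, B):
    # Tier 1: B ≈ A + t, with t the difference of the means.
    t = B.mean(axis=0) - A.mean(axis=0)
    return lambda X: X + t

def fit_similarity(A, B):
    # Tier 2: B ≈ s * (A - mu_A) @ R + mu_B with R orthogonal.
    # Solved as orthogonal Procrustes; R may include a reflection.
    mu_a, mu_b = A.mean(axis=0), B.mean(axis=0)
    A0, B0 = A - mu_a, B - mu_b
    U, S, Vt = np.linalg.svd(A0.T @ B0)
    R = U @ Vt
    s = S.sum() / (A0 ** 2).sum()
    return lambda X: s * (X - mu_a) @ R + mu_b

def fit_affine(A, B):
    # Tier 3: B ≈ A @ W + b, fitted by ordinary least squares.
    A1 = np.hstack([A, np.ones((len(A), 1))])
    W, *_ = np.linalg.lstsq(A1, B, rcond=None)
    return lambda X: np.hstack([X, np.ones((len(X), 1))]) @ W

def explained_change(pred, A, B):
    # Share of the squared A-to-B change that the mapped states account for.
    return 1.0 - ((pred - B) ** 2).sum() / ((A - B) ** 2).sum()

# Synthetic stand-ins (assumption): B is a noisy affine image of A.
rng = np.random.default_rng(0)
n, d = 400, 8
A = rng.normal(size=(n, d))
mixing = np.eye(d) + 0.5 * rng.normal(size=(d, d))
B = A @ mixing + 2.0 * rng.normal(size=d) + 0.05 * rng.normal(size=(n, d))

# Fit each map on training stimuli, evaluate on held-out stimuli.
train, test = slice(0, 300), slice(300, n)
tiers = {
    "translation": fit_translation,
    "similarity": fit_similarity,
    "affine": fit_affine,
}
scores = {}
for name, fit in tiers.items():
    f = fit(A[train], B[train])
    scores[name] = explained_change(f(A[test]), A[test], B[test])
    print(f"{name:12s} held-out share of change explained: {scores[name]:.3f}")
```

In a real experiment, the mapped held-out states would then be patched back into the model at the chosen layer to measure behavioral recovery; that step depends on the model and is not sketched here.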

Our analysis, conducted across three LLMs, three VLMs, and six datasets covering text and images (encompassing style, emotion, scene content, and numerical data), demonstrates that prompts consistently reshape internal representations to align with the structure of the instructed task. Cross-validated variance decomposition indicates that a substantial portion of the activation changes induced by prompts is explained by shape-preserving maps, particularly translation and rigid transformations with uniform scaling. Furthermore, tier profiles across layers reveal routing strategies that vary by model and task.

Notably, while translation and rigid tiers enhance behavioral agreement, the affine transformation tier is the first to nearly restore the target prompt’s task geometry, resulting in corresponding improvements in behavior. This finding implies that cross-dimensional linear mixing serves as a primary mechanism through which prompts reorganize representations to fit instructed task structures. Ultimately, our framework breaks down prompt-induced representational shifts into interpretable geometric components, elucidating how models route task-relevant information to execute prompt-driven actions.
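One way to see why the affine tier can be the first to restore task geometry: translations, rotations, and uniform scaling leave the pattern of pairwise distances between stimuli unchanged (up to a global scale), so they cannot move that pattern toward the target. The sketch below checks this on synthetic data. It assumes, for illustration only, that "geometry" is summarized by the Pearson correlation between pairwise-distance vectors; the names `pairwise_dists` and `geometry_match` are hypothetical, and the paper may use a different measure.

```python
import numpy as np

def pairwise_dists(X):
    # Upper-triangular vector of Euclidean distances between all stimulus pairs.
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))
    return D[np.triu_indices(len(X), k=1)]

def geometry_match(X, Y):
    # Pearson correlation between the two distance patterns.
    return np.corrcoef(pairwise_dists(X), pairwise_dists(Y))[0, 1]

# Synthetic stand-ins (assumption): B stretches some axes of A and mixes dimensions.
rng = np.random.default_rng(1)
n, d = 120, 6
A = rng.normal(size=(n, d))
M = np.diag([3.0, 0.2, 1.0, 1.0, 0.2, 3.0]) + 0.3 * rng.normal(size=(d, d))
B = A @ M + rng.normal(size=d) + 0.05 * rng.normal(size=(n, d))

# An arbitrary shape-preserving map: rotation/reflection, uniform scale, shift.
Q, _ = np.linalg.qr(rng.normal(size=(d, d)))
A_sim = 2.5 * A @ Q + rng.normal(size=d)

# An affine map fitted on the first half and applied to every stimulus.
train, test = np.arange(0, n // 2), np.arange(n // 2, n)
A1 = np.hstack([A, np.ones((n, 1))])
W, *_ = np.linalg.lstsq(A1[train], B[train], rcond=None)
A_aff = A1 @ W

base = geometry_match(A[test], B[test])      # unmapped prompt-A states
sim = geometry_match(A_sim[test], B[test])   # after a shape-preserving map
aff = geometry_match(A_aff[test], B[test])   # after the fitted affine map
print(f"unmapped: {base:.3f}  shape-preserving: {sim:.3f}  affine: {aff:.3f}")
```

Under this measure the shape-preserving map scores the same as the unmapped states, while the affine map, which can stretch and mix dimensions, moves the held-out distance pattern close to the target.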


Source: arXiv Generated at: 2026-06-03 00:00:00 UTC
