MidSteer: Optimal Affine Framework for Steering Generative Models
Abstract: The manipulation of intermediate representations has proven to be a potent technique for governing generative models, especially within contexts requiring post-deployment alignment and safety protocols. Yet, despite its widespread practical utility, the field has long lacked a robust theoretical underpinning. This study addresses this deficiency by rigorously formalizing the theory behind concept steering. We begin by connecting steering mechanisms to affine concept erasure, demonstrating that conventional methods for eliminating undesirable behaviors represent a specific instance of LEACE, a closed-form technique for affine erasure. Subsequently, we develop a structured theoretical foundation for concept switching, termed LEACE-Switch, and delineate the specific assumptions that allow it to serve as an optimal affine solution. Leveraging these insights, we propose MidSteer (Minimal Disturbance concept Steering), a more versatile affine framework for concept manipulation. MidSteer loosens the restrictive assumptions of prior models, facilitating directed transformations that minimize disturbance. Our evaluations show that MidSteer yields competitive results across diverse tasks, modalities, and architectures, encompassing both vision diffusion models and large language models.
Source: arXiv
Generated at: 2026-06-02 00:00:00 UTC
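The abstract describes LEACE as a closed-form technique for affine erasure and treats conventional removal-style steering as a special case of it. As a rough illustration of what "closed-form affine erasure" can look like, the sketch below follows the commonly stated LEACE form r(x) = x - W^+ P W (x - mu), where W is a whitening matrix for the features and P is the orthogonal projector onto the column space of W times the feature-concept cross-covariance. It is not MidSteer or LEACE-Switch, neither of which is specified in the abstract. The function names `fit_affine_eraser` and `erase`, the tolerance, and the synthetic data are assumptions made for this example only.

```python
import numpy as np


def fit_affine_eraser(X, Z, tol=1e-10):
    """Fit a LEACE-style affine eraser (illustrative sketch, not the paper's code).

    X: (n, d) array of representations.
    Z: (n, k) array of concept labels.
    Returns (mu, A) such that the erased features are x - A @ (x - mu).
    """
    n = X.shape[0]
    mu = X.mean(axis=0)
    Xc = X - mu
    Zc = Z - Z.mean(axis=0)

    sigma_xx = Xc.T @ Xc / n  # feature covariance, shape (d, d)
    sigma_xz = Xc.T @ Zc / n  # feature-concept cross-covariance, shape (d, k)

    # Whitening matrix W = sigma_xx^(-1/2) and its pseudo-inverse sigma_xx^(1/2),
    # both restricted to directions with non-negligible variance.
    evals, evecs = np.linalg.eigh(sigma_xx)
    keep = evals > tol * evals.max()
    V, lam = evecs[:, keep], evals[keep]
    W = (V / np.sqrt(lam)) @ V.T
    W_pinv = (V * np.sqrt(lam)) @ V.T

    # Orthogonal projector onto the column space of W @ sigma_xz.
    U, s, _ = np.linalg.svd(W @ sigma_xz, full_matrices=False)
    U = U[:, s > tol * max(s.max(), 1.0)]
    P = U @ U.T

    A = W_pinv @ P @ W
    return mu, A


def erase(X, mu, A):
    """Apply the affine map r(x) = x - A (x - mu) to each row of X."""
    return X - (X - mu) @ A.T


# Synthetic usage example (assumed data): a binary concept driven by feature 0.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 8))
Z = (X[:, :1] + 0.1 * rng.normal(size=(500, 1)) > 0).astype(float)

mu, A = fit_affine_eraser(X, Z)
X_erased = erase(X, mu, A)

# Empirical cross-covariance between erased features and the concept.
cross_cov = (X_erased - X_erased.mean(axis=0)).T @ (Z - Z.mean(axis=0)) / len(X)
```

In this sketch the erased features keep their original mean, and `cross_cov` is zero up to floating-point error on the data used for fitting, which is the condition under which no linear predictor can recover the concept from the edited representations.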





