Letting Tutor Personas Speak Up for LLMs: Learning Steering Vectors from Dialogue via Preference Optimization
Title: Empowering LLMs with Tutor Personas: Deriving Steering Vectors from Dialogue via Preference Optimization
Abstract
As large language models (LLMs) emerge as a dominant form of generative artificial intelligence (AI), their integration into educational tutoring has gained significant traction. However, existing research on LLM-based tutoring often focuses on training a monolithic tutor policy, thereby failing to account for the rich diversity of pedagogical approaches found in practice. In authentic tutor-student exchanges, instructional intent is executed through adaptive strategies; tutors dynamically adjust scaffolding levels, directiveness, feedback mechanisms, and affective support to meet individual learner needs. These nuanced variations significantly influence both dialogue dynamics and student engagement.
This study investigates how tutor personas, extracted from human tutor-student dialogues, can be used to steer LLM behavior without explicit prompt engineering. We learn a steering vector through preference optimization: a direction in activation space that shifts model outputs toward distinct tutor personas. Our findings indicate that this vector captures tutor-specific variability across dialogue contexts. It improves semantic alignment with actual tutor utterances and raises performance in preference-based evaluations, while maintaining high lexical similarity to the original data. Furthermore, an examination of the learned scaling coefficients uncovers an interpretable structure among tutors, reflecting consistent behavioral distinctions. These outcomes suggest that activation steering provides a robust and transparent method for managing tutor-specific variation in LLMs by leveraging signals sourced directly from human conversation data.
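To make the idea of activation steering concrete, the sketch below shows only the inference-time step: a hidden activation is shifted along a steering direction, scaled by a per-tutor coefficient. It assumes one direction shared by all tutors with a scalar scale per tutor, which is one reading of the abstract rather than a confirmed detail of the method. The names `steer`, `shared_direction`, and `tutor_scale` and all numeric values are hypothetical, and the preference-optimization training that would produce the vector is not shown.

```python
from typing import Dict, List

Vector = List[float]


def steer(hidden: Vector, direction: Vector, alpha: float) -> Vector:
    """Return hidden + alpha * direction, computed element-wise."""
    if len(hidden) != len(direction):
        raise ValueError("hidden state and steering direction must have the same size")
    return [h + alpha * d for h, d in zip(hidden, direction)]


# Hypothetical toy values for illustration only; real activations are
# high-dimensional and the direction and scales would be learned.
shared_direction: Vector = [1.0, 0.0, -1.0]
tutor_scale: Dict[str, float] = {"tutor_a": 2.0, "tutor_b": -0.5}
hidden_state: Vector = [0.5, 0.5, 0.5]

# Apply the same direction with each tutor's own scaling coefficient.
steered = {
    name: steer(hidden_state, shared_direction, alpha)
    for name, alpha in tutor_scale.items()
}

for name, vec in steered.items():
    print(name, vec)
```

Under this assumption, the sign and magnitude of each tutor's coefficient determine how far, and in which direction, the activation moves along the shared axis, which is why the coefficients themselves could be compared across tutors.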
Source: arXiv Generated at: 2026-06-03 00:00:00 UTC





