ChatSOP: An SOP-Guided MCTS Planning Framework for Controllable LLM Dialogue Agents
Title: ChatSOP: A Controllable LLM Dialogue Framework Using SOP-Guided MCTS Planning
Abstract: While Large Language Model (LLM)-powered dialogue agents demonstrate exceptional capabilities across a wide array of tasks, their lack of controllability persists as a significant hurdle. This limitation frequently results in unfocused exchanges or the failure to complete specific objectives, despite the agents’ improved ability to comprehend user intent and generate human-like responses. To mitigate these issues, we propose integrating Standard Operating Procedures (SOPs) to strictly regulate the flow of conversation. In this work, we present ChatSOP, a new planning framework that leverages SOP-guided Monte Carlo Tree Search (MCTS) to significantly boost the controllability of LLM-driven dialogue systems.
To support this approach, we have developed a dataset featuring SOP-annotated dialogues across multiple scenarios. This data was generated via a semi-automated role-playing system powered by GPT-4o and underwent rigorous manual quality assurance. Furthermore, we introduce an innovative methodology that combines Chain of Thought reasoning with supervised fine-tuning to predict SOPs, while employing SOP-guided MCTS to determine the optimal actions during interactions. Our experimental findings confirm the efficacy of this approach, revealing a 27.95% increase in action accuracy relative to baseline models based on GPT-3.5, along with substantial performance improvements for open-source models. Both the dataset and the associated code are made publicly available.
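To make the idea of SOP-guided MCTS concrete, here is a minimal sketch, not the authors' implementation. It assumes an SOP can be written as a graph of permitted action transitions. The search expands only actions the SOP allows after the current one. The `SOP` table, the action names, and the `prefers_refund` reward are all hypothetical. In a real system the reward would presumably come from an LLM-based or learned evaluator rather than a hand-written rule.

```python
import math
import random

# Hypothetical SOP: each action maps to the actions allowed to follow it.
SOP = {
    "greet": ["ask_issue", "ask_order_id"],
    "ask_issue": ["ask_order_id"],
    "ask_order_id": ["lookup_order"],
    "lookup_order": ["offer_refund", "close"],
    "offer_refund": ["close"],
    "close": [],
}


class Node:
    def __init__(self, action, parent=None):
        self.action = action
        self.parent = parent
        self.children = []
        self.untried = list(SOP[action])  # only SOP-permitted successors
        self.visits = 0
        self.value = 0.0


def ucb(child, parent_visits, c=1.4):
    # Standard UCB1: average reward plus an exploration bonus.
    exploit = child.value / child.visits
    explore = c * math.sqrt(math.log(parent_visits) / child.visits)
    return exploit + explore


def rollout(path, reward_fn, rng, max_depth=10):
    # Random playout that never leaves the SOP graph.
    path = list(path)
    while SOP[path[-1]] and len(path) < max_depth:
        path.append(rng.choice(SOP[path[-1]]))
    return reward_fn(path)


def plan_next_action(current_action, reward_fn, iterations=200, seed=0):
    rng = random.Random(seed)
    root = Node(current_action)
    for _ in range(iterations):
        node, path = root, [root.action]
        # Selection: descend through fully expanded nodes by UCB1.
        while not node.untried and node.children:
            parent = node
            node = max(parent.children, key=lambda ch: ucb(ch, parent.visits))
            path.append(node.action)
        # Expansion: add one SOP-permitted child that has not been tried.
        if node.untried:
            child = Node(node.untried.pop(), parent=node)
            node.children.append(child)
            node = child
            path.append(node.action)
        # Simulation and backpropagation.
        reward = rollout(path, reward_fn, rng)
        while node is not None:
            node.visits += 1
            node.value += reward
            node = node.parent
    if not root.children:
        return None  # the SOP permits no further action
    return max(root.children, key=lambda ch: ch.visits).action


def prefers_refund(path):
    # Hypothetical reward: 1.0 if the dialogue plan includes a refund offer.
    return 1.0 if "offer_refund" in path else 0.0


print(plan_next_action("lookup_order", prefers_refund))
```

Because candidate actions are drawn only from the SOP graph, the planner cannot propose a step the procedure forbids. The reward function only ranks the permitted continuations.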
Source: arXiv
Generated at: 2026-06-04 00:00:00 UTC




