arXiv

Multi$^2$: Hierarchical Multi-Agent Decision-Making with LLM-Based Agents in Interactive Environments

Title: Multi$^2$: Enhancing Hierarchical Multi-Agent Decision-Making in Interactive Settings via LLM-Based Agents

Abstract: A primary objective in large language model (LLM) research is the development of agentic systems capable of planning, executing actions, and adapting through continuous engagement with dynamic environments. Although contemporary LLM-based agents demonstrate remarkable contextual reasoning capabilities, their decision-making over extended horizons remains unstable and frequently suffers from "objective drift," a phenomenon where goals and plans diverge during prolonged interactions. To address this, we present Multi$^2$, a hierarchical multi-agent framework that structurally decomposes agent behavior into distinct, complementary roles. In this architecture, a high-level agent (System 1) generates context-aware sub-goals and is trained via supervised fine-tuning (SFT), while a low-level agent (System 2) performs atomic actions and is trained with offline-to-online reinforcement learning (RL) within interactive settings. This architectural separation facilitates stable long-horizon control, reduces the risk of objective drift, and supports efficient adaptation. Empirical results across various interactive environments show that Multi$^2$ consistently surpasses strong agentic baselines, exhibiting superior coordination and robustness in multi-turn interactions. Furthermore, to address a significant gap in the training and evaluation of hierarchical decision-making for LLM-based agents, we introduce and make publicly available three new hierarchical benchmark datasets.
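
The role split described in the abstract can be illustrated with a minimal control loop. This is only a sketch under stated assumptions, not the authors' implementation: the names `run_episode`, `make_counter_env`, `toy_high`, and `toy_low` are hypothetical, the two policies are plain functions standing in for the SFT-trained and RL-trained LLM agents, and the choice to re-plan only after a fixed number of atomic actions is an assumption the abstract does not confirm.

```python
from typing import Callable, List, Tuple

# Hypothetical interfaces for illustration; none of these names come from the paper.
HighLevelPolicy = Callable[[str, str], str]  # (task, observation) -> sub-goal
LowLevelPolicy = Callable[[str, str], str]   # (sub-goal, observation) -> atomic action
EnvStep = Callable[[str], Tuple[str, bool]]  # action -> (observation, done)


def run_episode(
    task: str,
    high: HighLevelPolicy,
    low: LowLevelPolicy,
    env_step: EnvStep,
    first_obs: str,
    max_subgoals: int = 5,
    actions_per_subgoal: int = 2,
) -> List[Tuple[str, str]]:
    """Run one episode and return the (sub-goal, action) pairs that were executed."""
    trace: List[Tuple[str, str]] = []
    obs = first_obs
    for _ in range(max_subgoals):
        # The high-level agent sees the task and the latest observation.
        sub_goal = high(task, obs)
        for _ in range(actions_per_subgoal):
            # The low-level agent sees only the current sub-goal and observation.
            action = low(sub_goal, obs)
            obs, done = env_step(action)
            trace.append((sub_goal, action))
            if done:
                return trace
    return trace


def make_counter_env(target: int) -> Tuple[EnvStep, str]:
    """Toy environment: the action 'inc' raises a counter; done when it hits target."""
    state = {"count": 0}

    def env_step(action: str) -> Tuple[str, bool]:
        if action == "inc":
            state["count"] += 1
        return f"count={state['count']}", state["count"] >= target

    return env_step, "count=0"


def toy_high(task: str, obs: str) -> str:
    # Stand-in for the high-level agent: always proposes the same sub-goal.
    return "raise counter"


def toy_low(sub_goal: str, obs: str) -> str:
    # Stand-in for the low-level agent: maps the sub-goal to one atomic action.
    return "inc" if sub_goal == "raise counter" else "noop"


env_step, first_obs = make_counter_env(target=3)
trace = run_episode("reach 3", toy_high, toy_low, env_step, first_obs)
print(trace)
```

In this toy setup the low-level policy never sees the full task, only the current sub-goal, which is one simple way to keep the long-horizon objective in a single place.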


Source: arXiv Generated at: 2026-06-03 00:00:00 UTC
