Automata-Conditioned Cooperative Multi-Agent Reinforcement Learning

June 3, 2026 · Beyazit Yalcinkaya, Marcell Vazquez-Chanlatte, Ameesh Shah, Hanna Krasowski, Sanjit A. Seshia

Original: arXiv:2511.02304v2

Abstract: We study the problem of learning multi-task, multi-agent policies for cooperative, temporal objectives under the centralized training, decentralized execution paradigm. Specifying agent-specific tasks as automata allows a team-level goal to be decomposed into simpler sub-tasks. However, current methods are sample-inefficient and restricted to the single-task setting, so policies must be retrained for every new task. To address these limitations, we introduce Automata-Conditioned Cooperative Multi-Agent Reinforcement Learning (ACC-MARL), a framework for learning task-conditioned, decentralized team policies. We outline the obstacles to implementing ACC-MARL, offer solutions, and provide a proof of optimality for our approach. We further show that the value functions learned during training can be used to assign tasks optimally at test time. Our experiments show emergent multi-step coordination among agents, including pressing a button to unlock a door, holding the door open, and short-circuiting tasks.
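As a rough illustration of the test-time task assignment mentioned in the abstract, the sketch below picks the agent-to-task matching with the highest total estimated value. It is a minimal sketch under stated assumptions, not the paper's procedure: the function `assign_tasks`, the table `values`, and the choice to score a matching by summing per-agent values are all hypothetical, since the abstract does not specify how the learned value functions are combined.

```python
from itertools import permutations


def assign_tasks(values):
    """Return the one-to-one agent-to-task matching with the highest total value.

    values[i][j] is a hypothetical value estimate for agent i conditioned on
    task j (assumed to come from a learned value function). Scoring a matching
    by the sum of per-agent values is an assumption made for illustration.
    """
    n = len(values)
    best_score, best_perm = float("-inf"), None
    # Brute force over all matchings; fine for small teams, O(n!) in general.
    for perm in permutations(range(n)):
        score = sum(values[i][perm[i]] for i in range(n))
        if score > best_score:
            best_score, best_perm = score, perm
    return list(best_perm), best_score


# Made-up value estimates for 3 agents and 3 tasks (illustrative only).
values = [
    [0.90, 0.80, 0.10],
    [0.85, 0.20, 0.30],
    [0.40, 0.50, 0.60],
]
assignment, score = assign_tasks(values)
print(assignment, round(score, 2))  # assignment[i] is the task index given to agent i
```

In this toy table, giving agent 0 its individually best task (task 0) is not part of the best team matching, which is why the search is over joint assignments rather than per-agent greedy choices.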

Source: arXiv