arXiv

MACCA: Offline Multi-agent Reinforcement Learning with Causal Credit Assignment

June 2, 2026 · Ziyan Wang, Yali Du, Yudi Zhang, Meng Fang, Biwei Huang

Abstract:

Offline multi-agent reinforcement learning (MARL) is essential in settings where real-time interaction with the environment is infeasible or hazardous. Although independent learning strategies scale well and are flexible, they struggle to attribute credit accurately to individual agents in the offline setting, because agents cannot interact with the environment. To address this credit assignment problem in offline MARL, we introduce a framework called Multi-Agent Causal Credit Assignment (MACCA). MACCA models the generative process as a Dynamic Bayesian Network that captures the dependencies among environmental variables, states, actions, and rewards. By fitting this model to offline data, MACCA determines each agent's contribution by analyzing the causal relationships underlying its individual reward, which makes the credit assignment both interpretable and accurate. The modular design also allows MACCA to be combined with a wide range of existing offline MARL methods. Theoretically, we show that, in the offline-dataset setting, both the underlying causal structure and the function that generates agents' individual rewards are identifiable, which supports the correctness of our modeling. Empirically, MACCA outperforms current state-of-the-art methods and also improves the performance of other offline MARL methods when integrated with them.
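
The idea of recovering per-agent credit from offline data can be illustrated with a deliberately simplified toy. The sketch below is not MACCA's Dynamic Bayesian Network estimation; it swaps in a plain linear least-squares fit with thresholded binary masks. It assumes (none of this is stated in the abstract) that only a shared team reward is logged, that the team reward is the sum of per-agent rewards, and that each agent's reward is a sparse linear function of that agent's own state-action features. The function name `fit_agent_credit` and all parameters are hypothetical.

```python
import numpy as np


def fit_agent_credit(features, team_reward, threshold=1e-6):
    """Toy credit decomposition (illustrative assumption, not MACCA itself).

    features:    list of (T, d_i) arrays, one block of state-action
                 features per agent, taken from an offline dataset.
    team_reward: (T,) array of logged team rewards.
    Returns per-agent binary masks (which features influence that agent's
    reward) and per-agent reward estimates for every logged step.
    """
    # Stack all agents' features and fit the team reward in one regression.
    joint = np.concatenate(features, axis=1)
    weights, *_ = np.linalg.lstsq(joint, team_reward, rcond=None)

    # Split the fitted weights back into one block per agent.
    boundaries = np.cumsum([f.shape[1] for f in features])[:-1]
    per_agent_weights = np.split(weights, boundaries)

    # A weight above the threshold is read as an edge "feature -> reward".
    masks = [np.abs(w) > threshold for w in per_agent_weights]

    # Each agent's credit is the reward explained by its own masked features.
    credits = [f @ (w * m) for f, w, m in zip(features, per_agent_weights, masks)]
    return masks, credits


# Synthetic, noise-free offline data with known sparse reward structure.
rng = np.random.default_rng(0)
T = 200
feats = [rng.normal(size=(T, 3)) for _ in range(2)]
true_w = [np.array([1.0, 0.0, -2.0]), np.array([0.0, 0.5, 0.0])]
individual = [f @ w for f, w in zip(feats, true_w)]
team = individual[0] + individual[1]

masks, credits = fit_agent_credit(feats, team)
```

In this toy, the recovered masks play the role of the causal links to individual rewards, and the per-agent estimates could be handed to any independent offline learner in place of the shared team reward. Real data would be noisy and nonlinear, so the threshold and the linear model are stand-ins only.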
