arXiv

Enhancing the MADDPG Algorithm for Multi-Agent Learning via Action Inference and Importance Sampling

June 4, 2026 · Marc Walden, Jason Liu, Shaashwath Sivakumar, Ryan Liu, Hamza Khan

Title: Optimizing MADDPG for Multi-Agent Systems Through Action Prediction and Geometric Importance Sampling

Abstract:

This study explores improvements to Multi-Agent Deep Reinforcement Learning by introducing two key enhancements to the Multi-Agent Deep Deterministic Policy Gradient (MADDPG) framework. First, we present a new Action Inference mechanism that lets each agent predict the actions its counterparts intend to take, which significantly improves the precision and robustness of each agent’s policy. Second, we apply importance sampling to the replay buffer, drawing experiences according to a geometric distribution. This approach prioritizes recent and high-value experiences, addressing the non-stationarity characteristic of multi-agent settings.
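To make the replay-buffer idea concrete, the following is a minimal sketch of recency-biased sampling, not the authors' implementation. It assumes that the "age" of a stored transition (0 for the newest) is drawn from a geometric distribution truncated to the current buffer size, and that each draw carries an importance weight relative to uniform replay. The class name `GeometricReplayBuffer`, the parameter `p`, and the weighting scheme are all hypothetical choices made for illustration; the abstract also mentions prioritizing high-value experiences, which this sketch does not attempt to model.

```python
import math
import random
from collections import deque


class GeometricReplayBuffer:
    """Hypothetical replay buffer that favors recent transitions.

    Assumption: ages are drawn from a geometric law with success
    probability p, truncated to the number of stored transitions.
    """

    def __init__(self, capacity, p=0.01, seed=None):
        if not 0.0 < p < 1.0:
            raise ValueError("p must lie strictly between 0 and 1")
        self.buffer = deque(maxlen=capacity)  # oldest entries are evicted first
        self.p = p
        self.rng = random.Random(seed)

    def add(self, transition):
        self.buffer.append(transition)

    def _normalizer(self):
        # Total mass of the untruncated geometric law on ages 0..n-1.
        return 1.0 - (1.0 - self.p) ** len(self.buffer)

    def age_probability(self, age):
        """Probability of drawing a transition of the given age (0 = newest)."""
        return self.p * (1.0 - self.p) ** age / self._normalizer()

    def _sample_age(self):
        """Inverse-CDF draw of an age in [0, len(buffer) - 1]."""
        u = self.rng.random()  # uniform in [0, 1)
        age = int(math.log(1.0 - u * self._normalizer()) / math.log(1.0 - self.p))
        return min(age, len(self.buffer) - 1)  # guard against float rounding

    def sample(self, batch_size):
        """Return (transition, importance_weight) pairs, sampled with replacement.

        Assumption: the weight is uniform_probability / sampling_probability,
        which corrects the recency bias when averaging a loss over the batch.
        """
        n = len(self.buffer)
        if n == 0:
            raise ValueError("cannot sample from an empty buffer")
        batch = []
        for _ in range(batch_size):
            age = self._sample_age()
            transition = self.buffer[n - 1 - age]  # newest item sits at index n - 1
            weight = (1.0 / n) / self.age_probability(age)
            batch.append((transition, weight))
        return batch
```

A larger `p` concentrates sampling on the newest transitions, which tracks the other agents' changing policies more closely at the cost of reusing fewer old experiences. Indexing into the middle of a `deque` is linear in its length, so a list-backed ring buffer would be the natural replacement for large buffers.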

To validate these modifications, we conducted experiments on the discrete-action Predator-Prey scenario using the PettingZoo library, a versatile Python platform for multi-agent reinforcement learning benchmarks. The findings demonstrate that Action Inference substantially enhances both learning stability and cooperative dynamics among agents. Furthermore, the application of geometric distribution-based importance sampling yields marked gains in exploration efficiency compared to the standard MADDPG algorithm.

The source code for this project is accessible at: https://github.com/shaashwathsivakumar/MARL_Proj

Source: arXiv