arXiv

Network Distributed Multi-Agent Reinforcement Learning for Consensus Control of Quadcopters

June 2, 2026 · Youssef Mahran, Zeyad Gamal, Aamir Ahmad, Ayman El-Badawy

Title: Achieving Quadcopter Consensus via Network-Embedded Distributed Multi-Agent Reinforcement Learning

Abstract:

This study introduces the Network Distributed Multi-Agent Reinforcement Learning (ND-MARL) framework, designed specifically for consensus control in quadcopter swarms. Unlike traditional MARL approaches that depend on either centralized planning or entirely decentralized execution, ND-MARL integrates the swarm’s communication graph directly into the decision-making process. Operating under a 2-Neighbor communication topology, each agent gathers data from just two neighbors and selects its actions via a distributed policy.
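To make the 2-Neighbor idea concrete, the following is a minimal sketch of how one agent's local input could be assembled. It assumes a ring graph (agent i hears agents i-1 and i+1) and an observation made of neighbor positions relative to the agent. Both choices, and the names `ring_neighbors` and `local_observation`, are illustrative assumptions; the abstract does not specify the graph layout or the observation contents.

```python
from typing import List, Tuple

Vec = Tuple[float, float, float]


def ring_neighbors(i: int, n: int) -> Tuple[int, int]:
    """Assumed 2-Neighbor topology: agent i hears agents i-1 and i+1 on a ring."""
    return ((i - 1) % n, (i + 1) % n)


def local_observation(i: int, positions: List[Vec]) -> List[float]:
    """Neighbor positions relative to agent i (hypothetical observation layout)."""
    n = len(positions)
    obs: List[float] = []
    for j in ring_neighbors(i, n):
        # Relative offset from agent i to neighbor j, one value per axis.
        obs.extend(pj - pi for pi, pj in zip(positions[i], positions[j]))
    return obs


# Three agents hovering at 1 m altitude (made-up coordinates).
positions = [(0.0, 0.0, 1.0), (2.0, 0.0, 1.0), (0.0, 2.0, 1.0)]
print(ring_neighbors(0, 3))
print(local_observation(0, positions))
```

Under these assumptions the observation always has six entries, regardless of how many agents are in the swarm, which is what allows a single policy network to be reused across team sizes.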

The framework employs a hierarchical structure where a high-level distributed consensus planner, trained using Multi-Agent Soft Actor-Critic (MASAC), generates reference target positions. These targets are then tracked by a low-level quadcopter controller. Comparative analysis with a centralized MARL controller reveals that ND-MARL ensures seamless consensus trajectories and effective planner-tracker integration.

A key advantage of this learned controller is its zero-shot scalability. Policies developed for a three-agent system can be directly deployed to swarms comprising up to 250 agents, maintaining the same 2-Neighbor communication topology without the need for retraining or fine-tuning. While the system achieves consistent convergence across varying team sizes, larger groups exhibit an increased steady-state spread resulting from sparse information propagation. These results position ND-MARL as a robust and stable solution for distributed, communication-aware consensus control in quadcopter networks.
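The scalability claim can be illustrated with a toy rollout in which one fixed per-agent rule is applied unchanged to 3 and to 250 agents. This is a sketch under strong assumptions, not the paper's method: `stand_in_policy` is a hypothetical linear averaging rule standing in for the learned MASAC planner, the graph is assumed to be a ring, quadcopter dynamics and the low-level tracker are omitted, and the gain, step count, and initial positions are arbitrary.

```python
import math
import random
from typing import List


def stand_in_policy(obs: List[List[float]], gain: float = 0.3) -> List[float]:
    """Hypothetical placeholder for the learned planner: step toward the neighbor average."""
    dims = len(obs[0])
    return [gain * sum(rel[d] for rel in obs) / len(obs) for d in range(dims)]


def step(positions: List[List[float]]) -> List[List[float]]:
    """Apply the same local rule to every agent, using only its two ring neighbors."""
    n = len(positions)
    new_positions = []
    for i, p in enumerate(positions):
        obs = [[q - x for x, q in zip(p, positions[j])]
               for j in ((i - 1) % n, (i + 1) % n)]
        delta = stand_in_policy(obs)
        new_positions.append([x + dx for x, dx in zip(p, delta)])
    return new_positions


def spread(positions: List[List[float]]) -> float:
    """Largest distance of any agent from the team centroid."""
    n, dims = len(positions), len(positions[0])
    centroid = [sum(p[d] for p in positions) / n for d in range(dims)]
    return max(math.dist(p, centroid) for p in positions)


def rollout(n: int, steps: int = 200, seed: int = 0):
    """Return (initial spread, final spread) for n agents with random start positions."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-5.0, 5.0) for _ in range(3)] for _ in range(n)]
    start = spread(pos)
    for _ in range(steps):
        pos = step(pos)
    return start, spread(pos)


for n in (3, 250):
    start, end = rollout(n)
    print(f"n={n}: spread {start:.3f} -> {end:.3f}")
```

In this toy, both teams contract toward a common point without any change to the rule, but after the same number of steps the 250-agent ring retains a visibly larger spread because information crosses the ring only one hop per step. That mirrors the qualitative trend described above; it does not reproduce the paper's reported steady-state values.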
