Scaling Multi Agent Reinforcement Learning for Underwater Acoustic Tracking via Autonomous Vehicles
Title: Enhancing Multi-Agent Reinforcement Learning for Underwater Acoustic Tracking with Autonomous Vehicles
Abstract: Autonomous vehicles (AVs) offer an economical way to carry out scientific missions such as underwater tracking. Reinforcement Learning (RL) has proven effective for controlling these vehicles, but scaling it to fleets, which are needed to track multiple targets or fast-moving ones, remains a significant hurdle. Multi-Agent Reinforcement Learning (MARL) is well known for its poor sample efficiency. High-fidelity simulators such as Gazebo’s LRAUV can run single-robot simulations up to 100 times faster than real time, but they do not provide comparable speedups in multi-vehicle settings, which makes MARL training impractical. High-fidelity simulation nevertheless remains indispensable for testing complex policies and bridging the sim-to-real gap.
To overcome these constraints, we developed a GPU-accelerated environment that delivers a speedup of up to 30,000x relative to Gazebo while preserving dynamic fidelity. It enables fast, end-to-end GPU-based training, and the resulting policies can be transferred to Gazebo for evaluation. We also propose a Transformer-based architecture, TransfMAPPO, which learns policies that are invariant to both the number of targets and the fleet size. This invariance supports curriculum learning, in which larger fleets are trained on increasingly complex scenarios. After large-scale GPU training, we ran extensive evaluations in Gazebo, which show that our approach keeps tracking errors under 5 meters even with multiple fast-moving targets.
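To make the invariance to target count and fleet size concrete, the following is a minimal sketch of attention pooling over a variable-length set of entity features. It is not the TransfMAPPO architecture, whose details are not given here. It only illustrates why an attention-based encoder can accept any number of agents or targets and return a fixed-size output that does not depend on their order. The function `attention_pool`, the feature dimensions, and the random weights are all assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attention_pool(entities, query, w_k, w_v):
    """Pool a variable-length set of entity features into one fixed-size vector.

    entities: (n, d_in) array, one row per observed agent or target.
    query:    (d_k,) query vector (learned in practice, random here).
    w_k, w_v: (d_in, d_k) and (d_in, d_v) projection matrices.
    """
    keys = entities @ w_k                              # (n, d_k)
    values = entities @ w_v                            # (n, d_v)
    scores = keys @ query / np.sqrt(keys.shape[-1])    # (n,)
    weights = softmax(scores)                          # (n,), sums to 1
    return weights @ values                            # (d_v,)


# Hypothetical dimensions and random parameters, for illustration only.
rng = np.random.default_rng(0)
d_in, d_k, d_v = 4, 8, 6
w_k = rng.normal(size=(d_in, d_k))
w_v = rng.normal(size=(d_in, d_v))
query = rng.normal(size=d_k)

two_targets = rng.normal(size=(2, d_in))
five_targets = rng.normal(size=(5, d_in))

# The same parameters handle 2 or 5 entities and give the same output size.
out_two = attention_pool(two_targets, query, w_k, w_v)
out_five = attention_pool(five_targets, query, w_k, w_v)

# Reordering the entities leaves the pooled output unchanged.
out_shuffled = attention_pool(five_targets[::-1], query, w_k, w_v)
```

Because the parameter shapes depend only on the per-entity feature size and not on the number of entities, a policy built on this kind of layer can in principle be trained on a small fleet and reused on a larger one, which is the property a curriculum over fleet sizes relies on.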
Source: arXiv
Generated at: 2026-06-03 00:00:00 UTC



