Scaling Datasets for Multi-Sensor, Multi-Agent, and Multi-Domain Learning in Autonomous Systems
Title: Expanding Dataset Scale for Multi-Sensor, Multi-Agent, and Multi-Domain Autonomous System Learning
Abstract: Existing datasets are insufficient for the large-scale learning required by autonomous systems that involve multiple agents, sensors, or domains, particularly where coordination and diversity are critical. To address this gap, we introduce a modular dataset-generation pipeline that produces terabyte-scale data with ground-truth labels for ground-based, aerial, and infrastructure-centric systems. Built on the AVstack framework and the CARLA simulator, the pipeline supports both single- and multi-agent setups with adaptable sensor configurations, facilitating controlled experiments under difficult conditions. Representative studies on perception and sensor fusion demonstrate how the synthesized data can support application-specific training and collaborative autonomous operations.
Source: arXiv
Generated at: 2026-06-04 00:00:00 UTC




