arXiv

Transformer-Based Autonomous Driving Models and Deployment-Oriented Compression: A Survey

June 4, 2026 · Juan Zhong, Yuhang Shi, Zukang Xu, Xi Chen

Abstract: Transformer-based architectures have become a dominant paradigm in autonomous driving, largely because they model long-range spatial dependencies, multi-agent interactions, and multimodal context well across perception, prediction, and planning tasks. However, deploying these models in real-world vehicles is hindered by the latency, memory consumption, and energy demands of high-capacity attention mechanisms. This survey reviews prominent Transformer-based autonomous driving models, categorizing them by functional role, sensing configuration, and architectural structure. We evaluate these models through the lens of deployment feasibility, examining how efficiency constraints influence design decisions in practice. We also examine compression and acceleration techniques pertinent to Transformer-based driving systems, including quantization, pruning, knowledge distillation, low-rank approximation, and efficient attention mechanisms, and discuss their respective advantages, drawbacks, and suitability for specific tasks. Rather than treating compression as a post-processing add-on, we emphasize its role as a system-level design element that directly affects deployability, robustness, and safety. The survey concludes by outlining open challenges and research directions toward standardized, safety-aware, and hardware-conscious evaluation frameworks for efficient autonomous driving systems.

Source: arXiv · Generated at: 2026-06-04 00:00:00 UTC