DeepIPCv3: Event-Aware Multi-Modal Sensor Fusion for Sudden Pedestrian Crossing Avoidance
Title: DeepIPCv3: Leveraging Event-Aware Multi-Modal Sensor Fusion to Prevent Sudden Pedestrian Incidents
Abstract:
End-to-end autonomous driving architectures currently depend heavily on frame-based sensors, which suffer from perception latency and motion blur in highly dynamic situations such as unexpected pedestrian crossings. To address this safety gap, we present DeepIPCv3, a multi-modal navigation framework that combines the dense 3D spatial geometry of LiDAR point clouds with the microsecond-resolution, asynchronous event streams generated by Dynamic Vision Sensors (DVS). We employ a Transformer-based cross-modal attention mechanism to dynamically align these disparate data types, enabling the network to prioritize rapid dynamic updates while maintaining comprehensive structural awareness of the scene. Subsequently, a hybrid policy network, which integrates heuristic trajectory tracking with direct neural predictions, maps the fused latent representations into safe local waypoints and actionable control commands.
Given the substantial physical dangers of live-testing such sudden crossing scenarios, we conducted rigorous offline evaluations using a bespoke multi-modal dataset gathered under both bright noon and difficult evening lighting conditions. Comprehensive ablation and comparative analyses show that DeepIPCv3 delivers state-of-the-art predictive performance. By mitigating motion blur and exposure failures, the integration of LiDAR and DVS data yields the lowest trajectory and control command errors, facilitating highly reactive and mathematically bounded evasive actions regardless of ambient light levels. To foster further research, the code for this framework will be publicly available at https://github.com/oskarnatan/DeepIPCv3.
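As an illustration of the cross-modal attention idea described in the abstract, the following is a minimal sketch of single-head scaled dot-product cross-attention in plain Python. It is not the DeepIPCv3 implementation: the choice of DVS tokens as queries and LiDAR tokens as keys and values, the names `cross_attention`, `dvs_tokens`, and `lidar_tokens`, and the toy two-dimensional features are all assumptions made for illustration, and the learned projection matrices of a real Transformer layer are omitted.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Single-head scaled dot-product cross-attention (no learned projections).

    queries: list of d-dimensional vectors (assumed here: DVS event tokens)
    keys, values: lists of vectors (assumed here: LiDAR tokens)
    Returns one fused vector per query, plus the attention weights.
    """
    d = len(queries[0])
    scale = 1.0 / math.sqrt(d)
    fused, weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by 1/sqrt(d).
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        w = softmax(scores)
        # Weighted sum of the value vectors, one output channel at a time.
        out = [sum(wj * v[c] for wj, v in zip(w, values))
               for c in range(len(values[0]))]
        fused.append(out)
        weights.append(w)
    return fused, weights


# Toy data (hypothetical): two event tokens, three LiDAR tokens.
dvs_tokens = [[1.0, 0.0], [0.0, 1.0]]
lidar_tokens = [[4.0, 0.0], [0.0, 4.0], [0.0, 0.0]]
fused, weights = cross_attention(dvs_tokens, lidar_tokens, lidar_tokens)
```

In this toy setup each event token places most of its attention weight on the LiDAR token that points in the same feature direction, which is the sense in which attention can "align" the two modalities. A real model would apply learned query, key, and value projections and multiple heads before this step.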
Source: arXiv Generated at: 2026-06-02 00:00:00 UTC




