arXiv

Co-Fusion4D: Spatio-temporal Collaborative Fusion for Robust 3D Object Detection

Abstract: Accurate perception and dependable decision-making in autonomous driving rely heavily on 3D object detection. However, bird's-eye-view (BEV) detectors frequently lose spatiotemporal consistency because object motion and ego-motion introduce cross-frame inconsistencies, which leave BEV features misaligned over time. To address this, we introduce Co-Fusion4D, a framework designed to maintain cross-frame spatiotemporal consistency while curbing temporal feature drift.

Co-Fusion4D utilizes a current-frame-centric approach, establishing the present frame as the primary information source. It selectively integrates historical data only after it has undergone spatiotemporal filtering and alignment. This dominant-complementary mechanism reduces cumulative alignment errors, blocks the propagation of noisy features, and leverages trustworthy temporal cues to generate a more stable BEV representation. Furthermore, the framework incorporates a Dual Attention Fusion (DAF) module to bolster spatiotemporal feature interaction. DAF combines intra-frame spatial attention with inter-frame temporal attention to dynamically align and merge multi-frame features. This process highlights regions consistent with motion while filtering out spurious correlations.
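The dominant-complementary idea and the two attention stages can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract gives no architectural details, so the function names, the flattened `(N, C)` BEV layout, the cosine-similarity gate, and the `sim_threshold` parameter are all assumptions chosen for illustration. History frames are assumed to be already warped into the current ego frame.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def spatial_attention(feat):
    """Intra-frame self-attention over BEV cells.

    feat: (N, C) array, N = H*W flattened BEV cells (assumed layout).
    """
    scores = feat @ feat.T / np.sqrt(feat.shape[1])  # (N, N)
    return softmax(scores, axis=-1) @ feat

def temporal_attention(current, history):
    """Inter-frame attention, computed independently per BEV cell.

    current: (N, C) queries; history: (T, N, C) keys/values,
    assumed to be aligned to the current ego frame beforehand.
    Returns the temporal context (N, C) and the weights (N, T).
    """
    scores = np.einsum("nc,tnc->nt", current, history)
    weights = softmax(scores / np.sqrt(current.shape[1]), axis=-1)
    context = np.einsum("nt,tnc->nc", weights, history)
    return context, weights

def fuse_current_centric(current, history, sim_threshold=0.5):
    """Hypothetical dominant-complementary fusion.

    The current frame is always kept; temporal context is added only
    in cells where it agrees with the current frame (cosine similarity
    above sim_threshold, an illustrative filtering rule).
    """
    cur = spatial_attention(current)
    context, _ = temporal_attention(cur, history)
    norms = np.linalg.norm(cur, axis=-1) * np.linalg.norm(context, axis=-1)
    cos = (cur * context).sum(axis=-1) / (norms + 1e-8)
    gate = (cos > sim_threshold).astype(cur.dtype)[:, None]  # (N, 1)
    return cur + gate * context
```

In this sketch, history that contradicts the current frame is gated out entirely, so the output falls back to the current-frame features. That is one simple way to keep noisy temporal features from propagating; a learned, soft gate would be a natural alternative.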

By moving away from standard uniform fusion, this architecture improves both the temporal stability and the discriminative power of BEV representations. Evaluations on the nuScenes benchmark show that Co-Fusion4D attains state-of-the-art results, with a mean Average Precision (mAP) of 74.9% and a nuScenes Detection Score (NDS) of 75.6%. These results are obtained without test-time augmentation or external datasets.
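For readers unfamiliar with NDS, the sketch below shows how the score combines mAP with the five mean true-positive errors (translation, scale, orientation, velocity, attribute). It follows the public nuScenes detection benchmark definition, not anything stated in the abstract. The function name and the input values in the usage note are hypothetical and are not the paper's per-metric results.

```python
def nds(mean_ap, tp_errors):
    """nuScenes Detection Score.

    mean_ap: mAP as a fraction in [0, 1].
    tp_errors: the five mean true-positive errors
               (mATE, mASE, mAOE, mAVE, mAAE).
    Each error is clipped to 1 and converted to a score (1 - error);
    mAP carries weight 5 and each TP score weight 1, normalized by 10.
    """
    if len(tp_errors) != 5:
        raise ValueError("expected exactly five TP error terms")
    tp_scores = [1.0 - min(1.0, err) for err in tp_errors]
    return (5.0 * mean_ap + sum(tp_scores)) / 10.0
```

For example, with a made-up mAP of 0.5 and all five errors at 0.5, `nds(0.5, [0.5] * 5)` gives 0.5. Because mAP carries half of the total weight, NDS can exceed mAP only when the average true-positive score is higher than mAP itself.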


Source: arXiv Generated at: 2026-06-02 00:00:00 UTC
