Learning Association via Track-Detection Matching for Multi-Object Tracking
Title: Enhancing Multi-Object Tracking Through Track-Detection Matching for Association Learning
Abstract: The primary goal of multi-object tracking is to preserve object identities across sequential video frames by effectively linking detections. Current literature is largely divided into two categories: tracking-by-detection strategies, which offer computational efficiency but depend on manually engineered association heuristics, and end-to-end frameworks that learn these associations from data but incur significant computational costs. To address these trade-offs, we introduce Track-Detection Link Prediction (TDLP), a novel tracking-by-detection approach. TDLP executes per-frame association through link prediction between existing tracks and new detections, essentially forecasting the most likely continuation for each track in every frame. While the architecture is optimized for geometric inputs like bounding boxes, it remains flexible enough to integrate supplementary signals such as appearance and pose. By learning association patterns directly from data, TDLP eliminates the need for handcrafted rules, offering a modular solution that is more computationally efficient than end-to-end trackers. Our extensive evaluations across various benchmarks reveal that TDLP consistently outperforms state-of-the-art methods, surpassing both traditional tracking-by-detection and end-to-end models. Furthermore, we conduct a comprehensive analysis contrasting link prediction with metric learning-based association, demonstrating the superior effectiveness of link prediction, especially when dealing with heterogeneous features like detection bounding boxes. The source code for this work is accessible at https://github.com/Robotmurlock/TDLP.
Source: arXiv
Generated at: 2026-06-04 00:00:00 UTC
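
The abstract describes per-frame association as link prediction between existing tracks and new detections. As a rough illustration only, the sketch below shows how a matrix of track-detection link probabilities could be turned into assignments. It is not the TDLP implementation: the learned link predictor is replaced by hard-coded scores, and the greedy matching rule, the `associate` function, and the `threshold` parameter are all assumptions introduced here, since the abstract does not specify how predicted links are resolved into matches.

```python
from typing import Dict, List, Tuple


def associate(
    link_probs: List[List[float]],
    threshold: float = 0.5,
) -> Tuple[Dict[int, int], List[int], List[int]]:
    """Greedily match tracks (rows) to detections (columns).

    link_probs[t][d] is an assumed probability that detection d continues
    track t. In a link-prediction tracker these values would come from a
    learned model, which is not modelled here.
    """
    n_tracks = len(link_probs)
    # With no tracks the matrix carries no column count, so no detections
    # are reported in that case.
    n_dets = len(link_probs[0]) if n_tracks else 0

    # Candidate links at or above the threshold, most confident first.
    candidates = sorted(
        (
            (p, t, d)
            for t, row in enumerate(link_probs)
            for d, p in enumerate(row)
            if p >= threshold
        ),
        key=lambda c: -c[0],
    )

    matches: Dict[int, int] = {}
    used_dets = set()
    for _, t, d in candidates:
        # Each track and each detection may take part in at most one link.
        if t in matches or d in used_dets:
            continue
        matches[t] = d
        used_dets.add(d)

    unmatched_tracks = [t for t in range(n_tracks) if t not in matches]
    unmatched_dets = [d for d in range(n_dets) if d not in used_dets]
    return matches, unmatched_tracks, unmatched_dets


# Hypothetical link probabilities for 3 tracks and 3 detections in one frame.
probs = [
    [0.90, 0.05, 0.10],
    [0.20, 0.80, 0.30],
    [0.10, 0.15, 0.20],
]
matches, lost_tracks, new_dets = associate(probs)
print(matches, lost_tracks, new_dets)
```

In this toy frame, tracks 0 and 1 are continued by detections 0 and 1, while track 2 finds no link above the threshold and detection 2 is left over as a candidate for a new track. An optimal assignment solver could be substituted for the greedy loop without changing the interface.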






