Understanding Identity Continuity in Thermal Video through Scene-Level Consistency
Title: Enhancing Identity Continuity in Thermal Video via Scene-Level Consistency
Thermal pedestrian multiple object tracking (MOT) remains difficult, largely because weak appearance features and frequent detection failures fragment trajectories. This study investigates whether lightweight post-processing can restore identity continuity without resource-intensive re-identification models or intricate online association mechanisms. Building on a YOLOv8 and SORT baseline, we introduce a modular identity-repair backend that combines online short-gap remapping with offline tracklet relinking, using temporal, spatial, motion, and border-based cues to mend broken tracks.
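To make the relinking idea concrete, the following is a minimal sketch of offline tracklet relinking that applies the four cue types named above. It is an illustration under stated assumptions, not the authors' implementation: the `Tracklet` fields, the `relink` and `near_border` helpers, the constant-velocity prediction, and all threshold values (`max_gap`, `max_dist`, `border_margin`) are hypothetical.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class Tracklet:
    """Hypothetical summary of one track fragment (not from the paper)."""
    track_id: int
    start_frame: int
    end_frame: int
    start_xy: tuple      # box center at the first frame
    end_xy: tuple        # box center at the last frame
    end_velocity: tuple  # pixels per frame at the last frame


def near_border(xy, width, height, margin):
    """Border cue: True if a point lies within `margin` pixels of the image edge."""
    x, y = xy
    return x < margin or y < margin or x > width - margin or y > height - margin


def relink(tracklets, width, height, max_gap=30, max_dist=40.0, border_margin=20.0):
    """Return {later_id: root_id} for fragments judged to be the same person.

    Thresholds are illustrative assumptions. A link is accepted only if every
    cue agrees, which keeps the merge conservative.
    """
    ordered = sorted(tracklets, key=lambda t: t.start_frame)
    mapping = {}
    used_tails = set()  # each fragment's tail may be extended at most once
    for later in ordered:
        best, best_dist = None, max_dist
        for earlier in ordered:
            if earlier.track_id == later.track_id or earlier.track_id in used_tails:
                continue
            # Temporal cue: the later fragment must start shortly after the earlier ends.
            gap = later.start_frame - earlier.end_frame
            if gap <= 0 or gap > max_gap:
                continue
            # Border cue: a track that ended at the image edge probably left the scene.
            if near_border(earlier.end_xy, width, height, border_margin):
                continue
            # Motion cue: extrapolate the earlier track across the gap at constant velocity.
            pred_x = earlier.end_xy[0] + earlier.end_velocity[0] * gap
            pred_y = earlier.end_xy[1] + earlier.end_velocity[1] * gap
            # Spatial cue: the later track must start close to the predicted position.
            dist = hypot(later.start_xy[0] - pred_x, later.start_xy[1] - pred_y)
            if dist < best_dist:
                best, best_dist = earlier, dist
        if best is not None:
            # Resolve chains so every fragment maps to the earliest identity.
            mapping[later.track_id] = mapping.get(best.track_id, best.track_id)
            used_tails.add(best.track_id)
    return mapping


if __name__ == "__main__":
    fragments = [
        Tracklet(1, 0, 10, (80, 100), (100, 100), (2, 0)),
        Tracklet(2, 15, 30, (110, 100), (140, 100), (2, 0)),
        Tracklet(3, 12, 20, (300, 300), (310, 300), (1, 0)),
    ]
    print(relink(fragments, width=640, height=512))
```

In this sketch, a fragment is merged only when the time gap is short, the earlier track did not end at the image border, and the later track begins near the extrapolated position. Rejecting any link that fails a single cue trades recall for precision.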
Our evaluation on the official PBVS Thermal Pedestrian MOT benchmark, combined with controlled ablation studies on a fixed validation split, shows that the largest improvements in identity metrics come from conservative relinking strategies. This approach raises the IDF1 score from 82.25 to 84.93, a gain of 2.68 points, without compromising MOTA. The analysis also shows that many heuristic thresholds remain robust across a wide range of operating conditions.
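For readers less familiar with the two metrics, the sketch below computes them from their standard definitions: IDF1 = 2·IDTP / (2·IDTP + IDFP + IDFN) and MOTA = 1 − (FN + FP + IDSW) / GT. The function names and the example counts are illustrative assumptions, not values from the paper.

```python
def idf1(idtp: int, idfp: int, idfn: int) -> float:
    """Identity F1: harmonic mean of identity precision and identity recall."""
    denom = 2 * idtp + idfp + idfn
    return 2 * idtp / denom if denom else 0.0


def mota(fn: int, fp: int, idsw: int, num_gt: int) -> float:
    """Multiple Object Tracking Accuracy over `num_gt` ground-truth boxes."""
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    return 1.0 - (fn + fp + idsw) / num_gt


if __name__ == "__main__":
    # Made-up counts, chosen only to exercise the formulas.
    print(f"IDF1 = {100 * idf1(idtp=80, idfp=10, idfn=30):.2f}")
    print(f"MOTA = {100 * mota(fn=10, fp=5, idsw=5, num_gt=100):.2f}")
```

Identity switches enter MOTA as only one of three error terms, whereas IDF1 scores how consistently each ground-truth trajectory keeps a single predicted identity. This is why repairing fragmented tracks can raise IDF1 while leaving MOTA largely unchanged.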
These findings indicate that in thermal imagery, where information is inherently limited, robust identity recovery is achieved more effectively through high-precision trajectory relinking than by increasing the complexity of the tracker itself. This work therefore offers a controlled perspective on identity recovery in thermal video, highlighting that scene-level spatio-temporal consistency, rather than local frame-to-frame association alone, is the dominant factor in maintaining identity continuity.
Source: arXiv
Generated at: 2026-06-02 00:00:00 UTC




