Title: Zero-Shot Multi-Animal Tracking in the Wild
Abstract:
Accurately tracking multiple animals is essential for understanding ecological dynamics and behavioral patterns. The task is difficult because animals vary widely in appearance, movement, and habitat. Conventional methods typically demand substantial manual effort, including scenario-specific fine-tuning and hand-engineered heuristics. To address these challenges, we investigate the potential of vision foundation models for zero-shot multi-animal tracking. We extend the SAM2MOT framework, which combines Grounding DINO with the Segment Anything Model 2 (SAM 2), with three adaptations tailored to the visual and behavioral traits of animals. The approach requires no retraining and no dataset-specific hyperparameter tuning. We also assessed the newer SAM 3 model and found practical constraints that limit its effectiveness for multi-animal tracking in the wild. Our method sets a new state of the art on Chimp-Act, Bird Flock Tracking, AnimalTrack, and a subset of GMOT-40, demonstrating strong generalization across a wide range of species and habitats. The source code is available at https://github.com/ecker-lab/SAM2-Animal-Tracking.
Source: arXiv
Generated at: 2026-06-02 00:00:00 UTC





