Video-Based Animal Re-Identification (VARe-ID) from Multiview Spatio-Temporal Track Clustering
This work presents a modular software pipeline and end-to-end workflow for video-based animal re-identification, which assigns consistent individual IDs by clustering multiview spatio-temporal tracks with minimal human intervention. Starting from raw video, the system detects and tracks animals, scores and selects informative left/right views, computes appearance embeddings, clusters the resulting annotations by viewpoint, and then links clusters across time and viewpoints using spatio-temporal continuity. Automated consistency checks resolve remaining ambiguities. Preliminary experiments demonstrate near-perfect identification accuracy with very limited manual verification. The workflow is designed to generalize across species; trained models currently support Grévy's and plains zebras, with plans to expand to a broader range of species.
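To make the clustering-and-linking stages concrete, the sketch below clusters appearance embeddings separately for the left and right viewpoints and then merges clusters whose annotations came from the same tracker-assigned track. It is an illustrative approximation only: the `Annotation` layout, the use of scikit-learn agglomerative clustering, the cosine distance threshold, and the track-sharing merge rule are assumptions made for this example, not the pipeline's actual API or algorithm.

```python
import numpy as np
from dataclasses import dataclass
from sklearn.cluster import AgglomerativeClustering


@dataclass
class Annotation:
    """One selected detection from a track (layout is assumed, not the pipeline's schema)."""
    track_id: int          # tracker-assigned track the detection came from
    frame: int             # frame index within the video
    viewpoint: str         # "left" or "right"
    embedding: np.ndarray  # appearance embedding for this detection


def cluster_by_viewpoint(annotations, distance_threshold=0.6):
    """Cluster annotations separately per viewpoint with cosine-distance agglomerative clustering."""
    clusters = {"left": [], "right": []}
    for view in clusters:
        subset = [a for a in annotations if a.viewpoint == view]
        if len(subset) < 2:
            clusters[view] = [[a] for a in subset]
            continue
        X = np.stack([a.embedding / np.linalg.norm(a.embedding) for a in subset])
        labels = AgglomerativeClustering(
            n_clusters=None,
            metric="cosine",
            linkage="average",
            distance_threshold=distance_threshold,  # illustrative value
        ).fit_predict(X)
        grouped = {}
        for ann, lab in zip(subset, labels):
            grouped.setdefault(lab, []).append(ann)
        clusters[view] = list(grouped.values())
    return clusters


def link_viewpoint_clusters(clusters):
    """Merge left/right clusters that share a physical track, i.e. the same
    spatio-temporally continuous track contributed annotations of both views."""
    identities = []
    used_right = set()
    for left in clusters["left"]:
        left_tracks = {a.track_id for a in left}
        merged = list(left)
        for j, right in enumerate(clusters["right"]):
            if j not in used_right and left_tracks & {a.track_id for a in right}:
                merged.extend(right)
                used_right.add(j)
        identities.append(merged)
    # Right-view clusters never linked to a left-view cluster become their own identities.
    identities.extend(list(r) for j, r in enumerate(clusters["right"]) if j not in used_right)
    return identities  # each entry holds all annotations of one individual
```

Clustering each viewpoint separately reflects that left- and right-side embeddings of the same animal may not match; the spatio-temporal link (a single continuous track containing both views) is what ties them back to one identity in this sketch, while the pipeline's automated consistency checks handle the ambiguities such a simple rule would miss.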
Acknowledgements
- National Science Foundation (NSF)-funded AI Institute for Intelligent Cyberinfrastructure with Computational Learning in the Environment (ICICLE), Award OAC 2112606.
- The Imageomics Institute (A New Frontier of Biological Information Powered by Knowledge-Guided Machine Learning) is funded by the US National Science Foundation's Harnessing the Data Revolution (HDR) program under Award OAC 2118240.
- Support from Rensselaer Polytechnic Institute (RPI).
- Support from the Finnish Cultural Foundation.
- Resources from the Ohio Supercomputer Center made it possible to train and test the algorithmic components.
Citation
Ankit K. Upadhyay, Ekaterina Nepovinnykh, S. M. Rayeed, Aidan Westphal, Lawrence Miao, Julian Bain, Jaeseok Kang, Tuomas Eerola, Heikki Kälviäinen, and Charles V. Stewart. "Animal Re-Identification via Multiview Spatio-Temporal Track Clustering." CV4Animals Workshop, CVPR 2025. Rensselaer Polytechnic Institute, LUT University, Brno University of Technology.