Overcoming Fine-Grained Visual Challenges in Animal Re-Identification via Semantic Feature Alignment
Abstract
Identifying individual animals across space and time is vital for effective wildlife monitoring and biodiversity conservation. While existing computer vision methods have shown promise in re-identifying animals, their capability in Animal Re-Identification (Animal ReID) remains restricted by inherent visual variations, specifically high intra-identity and low inter-identity variation. High intra-identity variation denotes the large visual diversity within the same individual caused by pose or form changes and occlusions, while low inter-identity variation denotes the subtle visual differences between distinct individuals arising from their fine-grained appearances. To address these challenges, we propose the CLIP-based Animal RE-identification (CARE) framework, which leverages image-conditioned textual description generation and individual-level semantic feature alignment to mitigate the negative impact of visual variations in Animal ReID. Crucially, we have packaged CARE into a stand-alone toolkit and piloted it with stakeholders, facilitating real-world wildlife monitoring for biodiversity conservation. Extensive experiments on benchmark and in-the-wild datasets further demonstrate that CARE consistently outperforms state-of-the-art methods, validating its effectiveness in Animal ReID. The code is available at https://anonymous.4open.science/r/CARE-WACV.