One-Shot Fine-Grained Re-Identification of Paint Marked Honey Bees using Vision Foundation Models
Abstract
Accurate re-identification (ReID) of individual insects is crucial for quantitative studies of pollinator behavior, with key applications in biological research and ecological monitoring. This work leverages vision foundation models to enable fine-grained ReID of honey bees from video, using a single paint marking per bee and a single reference track per identity. Such marking avoids the disruption caused by gluing tags or painting color codes. We present a new, challenging dataset of 9495 images and 45 identities, obtained at outdoor bee feeders with significant pose and illumination changes. Our one-shot capable approach first pre-processes video footage to extract pose-normalized bee crops and removes the background using a segmentation foundation model (e.g., SAM2). It then uses a self-supervised visual foundation model (e.g., DINOv3) to compute image and patch embeddings, coupled with contrastive metric learning and track information, to generate robust ID embeddings for ReID. Compared to existing methods, our approach significantly reduces training data requirements. Our extensive studies show how design choices at each step of the pipeline affect performance, offering practical insights for future animal re-identification.
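As a rough illustration of the one-shot matching step only, the following minimal sketch assumes that per-frame ID embeddings have already been computed and that each known identity has exactly one reference track. The mean-pooling of frames, the cosine-similarity matching, and the names `track_embedding` and `identify` are illustrative assumptions, not the method described in this work; the toy 3-dimensional vectors stand in for real embeddings.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def track_embedding(frame_embeddings):
    """Assumed aggregation: average the per-frame embeddings of one track,
    then L2-normalize the result."""
    n = len(frame_embeddings)
    dim = len(frame_embeddings[0])
    mean = [sum(e[i] for e in frame_embeddings) / n for i in range(dim)]
    return l2_normalize(mean)


def identify(query_track, gallery):
    """Return (identity, cosine similarity) of the gallery entry closest
    to the query track. `gallery` maps identity -> unit-length embedding
    built from that identity's single reference track."""
    q = track_embedding(query_track)
    scores = {
        bee_id: sum(a * b for a, b in zip(q, ref))
        for bee_id, ref in gallery.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best]


# Toy data: one reference track per identity (the one-shot setting).
gallery = {
    "bee_A": track_embedding([[1.0, 0.1, 0.0], [0.9, 0.0, 0.1]]),
    "bee_B": track_embedding([[0.0, 1.0, 0.2], [0.1, 0.9, 0.0]]),
}
query = [[0.8, 0.2, 0.0], [1.0, 0.0, 0.1]]
best_id, score = identify(query, gallery)
print(best_id, round(score, 3))
```

In this toy setup the query track is assigned to the identity whose reference embedding has the highest cosine similarity; adding a new identity only requires adding one reference track to `gallery`, with no retraining.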