MANTA: Physics-Informed Generalized Underwater Object Tracking
Abstract
Underwater object tracking is challenging because wavelength-dependent attenuation and scattering severely distort object appearance across depths and water conditions. Existing trackers, typically trained on terrestrial data, fail to generalize to these physics-driven degradations. We present MANTA, a physics-informed framework that addresses both representation learning and tracker design for underwater scenarios. We propose a dual-positive contrastive learning strategy that couples temporal consistency with Beer–Lambert augmentations, yielding generalizable features that are robust to temporal variation and underwater distortions. We further introduce a multi-stage tracking pipeline in which a motion-based primary tracker is augmented with a physics-informed secondary association algorithm that integrates geometric consistency and appearance similarity for efficient re-identification under occlusion, disappearance, and drift. To complement standard IoU metrics, we propose Center–Scale Consistency (CSC) and Geometric Alignment Score (GAS) to assess geometric fidelity. Experiments on four underwater benchmarks (WebUOT-1M, UOT32, UTB180, UWCOT220) show that MANTA achieves state-of-the-art performance, improving Success AUC by up to 6%, while maintaining stable long-term tracking and efficient runtime.
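
As a rough illustration of what a Beer–Lambert augmentation might look like, the sketch below attenuates each color channel by a transmission factor t_c = exp(-beta_c * d), where d is the optical path length in metres. This is a minimal sketch under stated assumptions, not the augmentation used in MANTA: the function name `beer_lambert_augment`, the coefficient values in `BETA_RGB`, and the omission of any backscatter term are all hypothetical choices made for illustration.

```python
import math

# Hypothetical per-channel attenuation coefficients (1/m) for R, G, B.
# Illustrative values only: red light is absorbed fastest in water.
BETA_RGB = (0.60, 0.10, 0.05)


def beer_lambert_augment(pixels, depth_m, beta=BETA_RGB):
    """Attenuate RGB pixels (floats in [0, 1]) over a path of depth_m metres.

    Applies the Beer-Lambert law per channel: I_c(d) = I_c(0) * exp(-beta_c * d).
    Scattering and backscatter are deliberately not modelled here.
    """
    if depth_m < 0:
        raise ValueError("depth_m must be non-negative")
    # Per-channel transmission t_c = exp(-beta_c * d)
    transmission = tuple(math.exp(-b * depth_m) for b in beta)
    return [tuple(v * t for v, t in zip(px, transmission)) for px in pixels]


if __name__ == "__main__":
    white = [(1.0, 1.0, 1.0)]
    # Deeper paths shift a white pixel toward blue-green and darken it.
    for d in (0.0, 2.0, 10.0):
        print(d, beer_lambert_augment(white, d))
```

In a contrastive setup, a view produced this way with a randomly sampled depth could serve as one positive for a frame, alongside a temporally adjacent frame as the other; how the two positives are actually combined in the loss is not specified in this abstract.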