PointSt3R: Point Tracking through 3D Ground Correspondence
Rhodri Guerrier · Adam Harley · Dima Damen
Abstract
Recent advances in foundational 3D reconstruction models, such as DUSt3R and MASt3R, have shown great potential for 2D and 3D correspondence in static scenes. In this paper we propose to adapt them for the task of point tracking through 3D-grounded correspondence. We first demonstrate that these models are competitive point trackers when focusing on the static points present in current point tracking benchmarks ($+34.3\%$ on EgoPoints static vs. CoTracker2). As these models are trained exclusively on static correspondence data, we propose to combine the reconstruction loss with training for dynamic correspondence, fine-tuning MASt3R using a relatively small amount of dynamic synthetic data. With this, we achieve competitive 2D point tracking results on a number of datasets (e.g. 71.0 $\delta_{avg}$ on TAP-Vid-DAVIS compared to 75.7 for CoTracker2) without any temporal knowledge. Furthermore, we show that 3D tracking can actually be improved on TAP-Vid-3D PStudio ($+1.9\%$ when compared against CoTracker3+ZoeDepth).