Correcting and Quantifying Systematic Errors in 3D Box Annotations for Autonomous Driving
Abstract
Accurate annotations are critical for providing ground truth for supervised learning and for evaluating the performance of autonomous vehicle systems. These vehicles are typically equipped with active sensors, such as LiDAR, which scan the environment in predefined patterns. 3D box annotation based on data from such sensors is challenging in dynamic scenarios, where objects are observed at different timestamps and hence at different positions. Without proper handling of this phenomenon, systematic errors are prone to being introduced into the annotations. Our work is the first to describe why such annotation errors occur, and we illustrate them using examples from widely used, publicly available datasets. Through our novel estimation method, we correct the annotations so that they follow physically feasible trajectories and achieve spatial and temporal consistency with the sensor data. We are the first to define metrics for this problem, and we evaluate our method on the Argoverse 2 and MAN TruckScenes datasets, as well as on our proprietary dataset. Our approach robustly improves the quality of the ground truth by more than 17% on these datasets. Finally, we quantify the annotation errors in these datasets and find that the original annotations are misplaced by up to 2.5 m, with highly dynamic objects being the most affected. Our code is provided in the supplementary material and will be published upon acceptance.
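To convey the scale of the effect, consider an illustrative back-of-the-envelope calculation (our sketch, with assumed values, not a result from the paper): if an object's motion is ignored between the time the sensor actually observes it and the annotation's reference timestamp, the box is displaced by the distance the object travels in that interval,

\[
\Delta x = v \,\Delta t, \qquad \text{e.g.}\; v = 25\,\mathrm{m/s},\; \Delta t = 0.1\,\mathrm{s} \;\Rightarrow\; \Delta x = 2.5\,\mathrm{m},
\]

where the assumed offset \(\Delta t\) is on the order of a single 10 Hz LiDAR sweep. This matches the magnitude of misplacement reported above and illustrates why highly dynamic objects are the most affected.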