CAST: Evaluating Multi-Object Trackers with Context-Aware Switch and Transfer Scores
Abstract
Multi-object tracking (MOT) has been a subject of intensive research for decades. Multiple standard datasets and benchmarks have been established, and several evaluation metrics have been proposed, such as MOTA, IDF1, and HOTA. These metrics have become the de facto standard for comparing and ranking trackers on standardized datasets to measure progress. In this paper, we focus on MOTA and HOTA, and present a study of cases where the behavior of these metrics may not be desirable. In addition, we demonstrate how they might not be ideal when used as a tool to inspect a tracker's failure cases. We point out that these issues are related to the sizes of the context windows in which they measure association quality: MOTA is too nearsighted, while HOTA can be too holistic depending on the task setting. We rethink the familiar notion of identity switches (IDSw) proposed in MOTA, and propose a generalized version of it by introducing a context window when evaluating the ID assignment choice for each detection. We show that the proposed metric, named CAST, mitigates the limitations of HOTA and MOTA, and demonstrate its usefulness when diagnosing model failures. Our code and toolkit will be made available to the community to advance both the development and application of MOT.
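As a rough, illustrative sketch of the context-window idea described above (not the paper's exact CAST formulation), the snippet below penalizes a detection whose predicted ID disagrees with the ID that dominates a local window of frames around it; the function name and the majority-vote rule are assumptions made purely for illustration. A window half-width of 1 roughly mirrors the previous-frame check behind classical IDSw, while a very large window approaches a global, HOTA-like view of association.

```python
from collections import Counter

def windowed_id_penalties(pred_ids, window):
    """Hypothetical sketch: count detections whose predicted ID disagrees
    with the ID that dominates a local context window around them.

    pred_ids : predicted track IDs matched to one ground-truth object,
               ordered by frame (None = missed detection).
    window   : half-width of the context window in frames.
    """
    penalties = 0
    for t, pid in enumerate(pred_ids):
        if pid is None:
            continue  # missed detections are handled by the detection term
        lo, hi = max(0, t - window), min(len(pred_ids), t + window + 1)
        neighbours = [p for p in pred_ids[lo:hi] if p is not None]
        dominant, _ = Counter(neighbours).most_common(1)[0]
        if pid != dominant:
            penalties += 1
    return penalties

# Toy example: a single one-frame ID flicker inside an otherwise stable track.
print(windowed_id_penalties([7, 7, 7, 3, 7, 7, 7], window=2))  # -> 1
```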