Cosine Similarity is Almost All You Need (for Prototypical-Part Models)
Luke Moffett · Frank Willard · Maximillian Machado · Emmanuel Mokel · Jon Donnelly · Zhicheng Guo · Adam Costarino · Julia Yang · Giyoung Kim · Alina Barnett · Cynthia Rudin
Abstract
Prototypical-part networks are a popular interpretable alternative to black-box deep learning models for computer vision because of their faithful, prototype-based self-explanations. However, in practice they have proven difficult to train, because they are highly sensitive to hyperparameter tuning, and difficult to comprehend, because they contain a large number of prototypes. We show that replacing $\ell_2$ distance with an angular (cosine) prototype similarity in the original ProtoPNet greatly improves robustness to hyperparameter selection and is sufficient to produce accuracy and sparsity competitive with the state of the art on many backbones and datasets. We also show that cosine similarity leads to superior accuracy for five different ProtoPNet architectures (ProtoPNet, TesNet, Deformable ProtoPNet, ProtoTree, and ST-ProtoPNet). Finally, we demonstrate that ProtoPNet with cosine similarity produces better semantics than $\ell_2$: prototypes from cosine models score better on prototype quality metrics and are perceived as more similar by a 3:2 margin in a user study.
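To make the core substitution concrete, here is a minimal PyTorch sketch (not the authors' code; function names and shapes are illustrative) contrasting the original ProtoPNet $\ell_2$-based log activation with the angular (cosine) similarity studied in the paper. Both compare every learned prototype against every spatial patch of a backbone feature map.

```python
import torch
import torch.nn.functional as F

def l2_similarity(features, prototypes, eps=1e-4):
    """ProtoPNet-style activation: small squared L2 distance -> large similarity.

    features:   (B, D, H, W) feature map from the backbone
    prototypes: (P, D, 1, 1) learned prototype vectors
    returns:    (B, P, H, W) similarity scores
    """
    # squared L2 distance between each prototype and each spatial patch,
    # expanded as ||f||^2 - 2 f.p + ||p||^2 via 1x1 convolutions
    f2 = F.conv2d(features ** 2, torch.ones_like(prototypes))      # (B, P, H, W)
    p2 = (prototypes ** 2).sum(dim=(1, 2, 3)).view(1, -1, 1, 1)    # (1, P, 1, 1)
    fp = F.conv2d(features, prototypes)                            # (B, P, H, W)
    dist = torch.clamp(f2 - 2 * fp + p2, min=0)
    return torch.log((dist + 1) / (dist + eps))                    # log activation

def cosine_similarity(features, prototypes):
    """Angular similarity: normalize along channels, then take dot products."""
    f = F.normalize(features, dim=1)   # unit-normalize each patch vector
    p = F.normalize(prototypes, dim=1) # unit-normalize each prototype
    return F.conv2d(f, p)              # (B, P, H, W), values in [-1, 1]
```

In this sketch the change is purely in the prototype layer's activation; the backbone, prototype shapes, and downstream classification head are left untouched, which is what makes the swap a drop-in modification.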