SilverLining: Data-First Mitigation of Spatial and Spectral Shortcuts Without Introducing New Confounders
Abstract
Deep neural networks exploit shortcuts, that is, spurious correlations such as laterality markers (spatial) or scanner-specific noise (spectral), which severely compromise generalization in medical imaging. Recent work addresses individual shortcut types by modifying the model architecture or the loss function; preprocessing the data itself offers a more model-agnostic and visually interpretable alternative. Furthermore, many healthcare applications face multiple concurrent shortcuts, both spatial and spectral, which existing methods struggle to handle. We present SilverLining, an attention-based preprocessing framework that simultaneously identifies and mitigates spatial and spectral shortcuts without introducing new spurious correlations. Our key insight is that naive removal of shortcut features can itself create new shortcuts: models learn to exploit the removal patterns as spurious correlations. We address this with a novel confounder-free correction strategy that keeps preprocessing patterns consistent across all classes in both the spatial and frequency domains, so that the correction itself carries no class information. In our experiments, SilverLining achieves 0.87 AUC on controlled vision tasks and 0.94 AUC on a counter-shortcut medical imaging evaluation in which the shortcuts are reversed, and it improves cross-institutional chest X-ray classification from 0.72 to 0.77 AUC. Our data-centric approach reduces multiple types of data shortcuts without architectural modifications and yields preprocessed datasets that improve model robustness.
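To make the idea of class-consistent correction concrete, the following is a minimal NumPy sketch and not the SilverLining implementation. It assumes that per-image shortcut masks are already available (in the paper these would come from the attention-based identification step, which is not reproduced here). The function names `union_mask` and `apply_uniform_correction`, the zero fill value, and the hard frequency mask are illustrative assumptions. The point it shows is that one spatial mask and one frequency mask are applied to every image regardless of its label, so the pattern of removal cannot be used to tell classes apart.

```python
import numpy as np

def union_mask(per_image_masks):
    """Merge per-image shortcut masks into one class-agnostic mask (logical OR).

    Hypothetical helper: any pixel flagged in at least one image is masked in all.
    """
    return np.any(np.asarray(per_image_masks, dtype=bool), axis=0)

def apply_uniform_correction(images, spatial_mask, freq_mask, fill_value=0.0):
    """Apply the SAME spatial and frequency masks to every image.

    images:       (N, H, W) float array
    spatial_mask: (H, W) bool, True where pixels are overwritten with fill_value
    freq_mask:    (H, W) bool, True where unshifted 2D FFT coefficients are zeroed
    """
    out = np.array(images, dtype=float, copy=True)

    # Spectral step: zero the same frequency bins in every image.
    spectrum = np.fft.fft2(out, axes=(-2, -1))
    spectrum[:, freq_mask] = 0.0
    # Taking the real part discards any imaginary residue that appears
    # when freq_mask is not conjugate-symmetric.
    out = np.fft.ifft2(spectrum, axes=(-2, -1)).real

    # Spatial step: overwrite the same region in every image, so the masked
    # area is identical across classes.
    out[:, spatial_mask] = fill_value
    return out

# Toy usage: class 1 carries a bright corner marker (a spatial shortcut).
rng = np.random.default_rng(0)
images = rng.random((4, 8, 8))
labels = np.array([0, 0, 1, 1])
images[labels == 1, :2, :2] = 5.0

per_image = np.zeros((4, 8, 8), dtype=bool)
per_image[labels == 1, :2, :2] = True      # marker found only in class 1
mask = union_mask(per_image)               # but masked in ALL images

no_freq = np.zeros((8, 8), dtype=bool)     # leave the spectrum untouched here
corrected = apply_uniform_correction(images, mask, no_freq)
```

By contrast, masking the corner only in the images where the marker was detected would leave a blank patch exclusively in class 1, which is exactly the kind of new confounder the abstract warns about.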