Network-agnostic distortion-robust projections for wide-angle image understanding
Abstract
Due to their increased field of view, wide-angle lenses are increasingly used in applications such as VR, security, and autonomous driving. Existing models typically either ignore wide-angle distortions or ``undistort'' images to a perspective projection, often resulting in severe stretching. More recent distortion-aware architectures address these issues, yet they impose substantial computational burdens and limit the use of powerful pre-trained vision backbones. In this work, we revisit the undistortion strategy by exploring alternative projection functions beyond the conventional perspective model. Specifically, we investigate square-to-disc mapping functions, most notably the elliptical grid map (EGM) projection, which minimizes stretching. We show how the EGM projection can be combined with known lens distortion curves to achieve distortion invariance directly in image space. This network-agnostic approach integrates seamlessly with existing deep learning architectures, allowing fine-tuning of models pre-trained on large perspective datasets, while adapting to both seen and unseen wide-angle lenses without re-training each time the lens changes at evaluation. We perform experiments on the semantic segmentation task, comparing methods on zero-shot adaptation to unseen wide-angle lenses. Our extensive experiments show that using the EGM projection with existing segmentation models significantly outperforms baselines when trained on bounded distortion levels and tested across both seen and out-of-distribution distortions. Furthermore, the EGM projection improves performance on real-world datasets, highlighting the robustness and practicality of our approach in real-world applications.
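To make the square-to-disc idea concrete, the sketch below implements the standard elliptical grid mapping (Fong's analytical square-to-disc formulas) together with its closed-form inverse. This is an illustrative, stand-alone implementation of the generic EGM mapping, not the paper's own code, and it omits the lens-distortion-curve composition the paper describes; the function names are our own.

```python
import math

def egm_square_to_disc(x: float, y: float) -> tuple[float, float]:
    # Elliptical grid mapping: (x, y) in the square [-1, 1]^2 -> unit disc.
    u = x * math.sqrt(1.0 - 0.5 * y * y)
    v = y * math.sqrt(1.0 - 0.5 * x * x)
    return u, v

def egm_disc_to_square(u: float, v: float) -> tuple[float, float]:
    # Closed-form inverse: unit disc -> [-1, 1]^2.
    # max(0.0, ...) guards against tiny negative arguments from rounding.
    t = u * u - v * v
    s = 2.0 * math.sqrt(2.0)
    x = 0.5 * (math.sqrt(max(0.0, 2.0 + t + s * u))
               - math.sqrt(max(0.0, 2.0 + t - s * u)))
    y = 0.5 * (math.sqrt(max(0.0, 2.0 - t + s * v))
               - math.sqrt(max(0.0, 2.0 - t - s * v)))
    return x, y
```

Note that the square's corners land exactly on the unit circle (e.g. (1, 1) maps to (1/sqrt(2), 1/sqrt(2))), which is why the mapping avoids the severe corner stretching of a naive perspective undistortion.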