ScoliGaitX: A Deep Multi-Modal Fusion Network for Scoliosis Assessment via Gait Video Analysis
Abstract
Scoliosis, a structural deformity of the spine, is difficult to diagnose, especially in its early stages. The most common type, Adolescent Idiopathic Scoliosis (AIS), typically appears during the rapid growth period between ages 10 and 15 and accounts for about 80–85% of all scoliosis cases. Diagnosis currently relies mainly on repeated X-rays and clinical evaluations, which are costly and expose patients to frequent radiation. To address these challenges, we propose ScoliGaitX, a novel non-invasive system for assessing scoliosis through gait video analysis. Our approach uses a multi-stream deep learning model trained specifically for scoliosis classification on the Scoliosis1K dataset. We integrate three gait modalities to identify gait abnormalities associated with scoliosis: (i) silhouettes to capture appearance, (ii) optical flow to capture motion, and (iii) GEI-subtracted sequences to highlight deviations from normal gait. A key component of our proposal is the Align Gate Fuse (AGF) module, which learns relationships between modalities by adaptively weighting each one through a lightweight global gating mechanism. Experimental results show that ScoliGaitX outperforms existing methods, achieving an accuracy of 89.05%, which is 7.05 percentage points higher than the previous best approach (ScolNet-MT), while maintaining high specificity. These results highlight the promise of our method for early, radiation-free scoliosis assessment.
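To make the two less familiar ingredients concrete, the following is a minimal NumPy sketch, not the ScoliGaitX implementation. It rests on two assumptions that the abstract does not confirm: that a GEI-subtracted sequence is each silhouette frame minus the sequence's own Gait Energy Image (the temporal mean of the silhouettes), and that global gating means a softmax over one scalar weight per modality, computed from the concatenated pooled features. The function names, tensor shapes, and gate parameters `w` and `b` are hypothetical.

```python
import numpy as np


def gei_subtracted(silhouettes):
    """Subtract the Gait Energy Image (temporal mean) from every frame.

    silhouettes: array of shape (T, H, W) with values in [0, 1].
    Assumption: the GEI is computed from the same sequence.
    """
    gei = silhouettes.mean(axis=0, keepdims=True)  # shape (1, H, W)
    return silhouettes - gei


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()


def global_gated_fusion(features, w, b):
    """Fuse per-modality feature vectors with one scalar gate per modality.

    features: (M, C) array, one pooled feature vector per modality.
    w, b: hypothetical gate parameters of shape (M * C, M) and (M,).
    Returns the fused (C,) vector and the (M,) gate weights.
    """
    logits = features.reshape(-1) @ w + b          # one logit per modality
    gates = softmax(logits)                        # weights sum to 1
    fused = (gates[:, None] * features).sum(axis=0)
    return fused, gates


# Toy usage with random data standing in for real gait features.
rng = np.random.default_rng(0)
sil = (rng.random((30, 64, 44)) > 0.5).astype(np.float32)
diff = gei_subtracted(sil)

feats = rng.standard_normal((3, 128))  # silhouette, flow, GEI-subtracted
w = rng.standard_normal((3 * 128, 3)) * 0.01
b = np.zeros(3)
fused, gates = global_gated_fusion(feats, w, b)
```

Because the gate yields only one weight per modality rather than one per channel or spatial position, it adds few parameters, which is one plausible reading of "lightweight" here.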