A framework for real-time Surgical Phase Recognition with application to Robot-Assisted Partial Nephrectomy
Abstract
Surgical practice has increasingly integrated advanced technologies to improve procedural outcomes, efficiency, and safety in modern operating rooms. Within this evolving landscape, Automated Surgical Phase Recognition (SPR) leverages Artificial Intelligence to temporally segment surgical workflows into key events, thereby supporting both real-time decision-making and off-line analysis. Despite the potential of SPR, previous research has focused on short, linear surgeries, giving limited attention to the development, assessment, and deployment of real-time systems for complex surgical workflows. This work addresses these gaps by targeting the highly complex and nonlinear workflow of Robot-Assisted Partial Nephrectomy (RAPN). We develop a real-time SPR system trained on 143 annotated RAPN surgical videos covering 15 distinct phases. The system incorporates a trainable canonical calibration error estimator combined with Viterbi decoding for more reliable outcomes. Additionally, we introduce a novel assessment framework designed to simultaneously evaluate off-line, real-time, and averaged SPR performance, the latter synthesizing historical phase predictions over time. To facilitate practical deployment, we implement the SPR pipeline as an end-to-end application on the NVIDIA Holoscan platform, specifically tailored for real-time inference scenarios. The system was successfully tested during three live RAPN procedures on human patients performed in a collaborating hospital, achieving an average inference latency of 16.65 ms and an accuracy of 68.2%. Results show that integrating Viterbi decoding improves performance in this complex surgical scenario, while canonical calibration, despite yielding only marginal gains in overall performance, enhances classification reliability. We show the feasibility of deploying a real-time SPR pipeline for RAPN, which holds promise for optimizing operating room (OR) planning.
The application will be available upon acceptance at https://github.com/nvidia-holoscan/holohub
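
As an illustration of the Viterbi decoding step mentioned in the abstract, the following is a minimal sketch of how per-frame phase probabilities might be smoothed into a consistent phase sequence. It is not the authors' implementation: the function name `viterbi_decode`, the arguments `frame_probs`, `transition`, and `prior`, and the toy numbers in the usage note are all assumptions made for illustration. The sketch shows the standard off-line form of the algorithm; a real-time variant would need to update the running scores frame by frame.

```python
import numpy as np


def viterbi_decode(frame_probs, transition, prior):
    """Most likely phase sequence under a first-order Markov model.

    frame_probs: (T, K) per-frame phase probabilities (hypothetical classifier output)
    transition:  (K, K) matrix, transition[i, j] = P(phase j at t | phase i at t-1)
    prior:       (K,) probabilities of the phase at the first frame
    """
    eps = 1e-12  # avoids log(0) for forbidden transitions
    log_emit = np.log(np.asarray(frame_probs, dtype=float) + eps)
    log_trans = np.log(np.asarray(transition, dtype=float) + eps)
    num_frames, num_phases = log_emit.shape

    # score[j]: best log-score of any path ending in phase j at the current frame
    score = np.log(np.asarray(prior, dtype=float) + eps) + log_emit[0]
    backptr = np.zeros((num_frames, num_phases), dtype=int)

    for t in range(1, num_frames):
        # cand[i, j]: score of being in phase i at t-1 and moving to phase j at t
        cand = score[:, None] + log_trans
        backptr[t] = np.argmax(cand, axis=0)
        score = cand[backptr[t], np.arange(num_phases)] + log_emit[t]

    # Backtrack from the best final phase to recover the full sequence
    path = np.empty(num_frames, dtype=int)
    path[-1] = int(np.argmax(score))
    for t in range(num_frames - 1, 0, -1):
        path[t - 1] = backptr[t, path[t]]
    return path
```

For example, with two phases, a transition matrix that strongly favors staying in the current phase, and a single frame whose classifier output briefly favors the other phase, the decoded sequence would be expected to stay in the first phase rather than follow the frame-wise argmax.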