Track: Oral Session 8A: Biomedical, Healthcare, and Medicine

Tue 10 March 13:30 - 13:42 PDT

Cycle-consistent Multi-graph Matching for Self-supervised Annotation of C. Elegans

Sebastian Stricker ⋅ Christoph Karg ⋅ Lisa Hutschenreiter ⋅ Bogdan Savchynskyy ⋅ Dagmar Kainmueller

In this work we present a novel approach for unsupervised multi-graph matching, which applies to problems for which a Gaussian distribution of keypoint features can be assumed. We leverage cycle consistency as a loss for self-supervised learning, and determine Gaussian parameters through Bayesian Optimization, yielding a highly efficient approach that scales to large datasets. Our fully unsupervised approach enables us to reach the accuracy of state-of-the-art supervised methodology for the biomedical use case of semantic cell annotation in 3D microscopy images of the worm C. elegans. To this end, our approach yields the first unsupervised atlas of C. elegans, i.e., a model of the joint distribution of all of its cell nuclei, without the need for any ground truth cell annotation. This advancement enables highly efficient semantic annotation of cells in large microscopy datasets, overcoming a current key bottleneck. Beyond C. elegans, our approach offers fully unsupervised construction of cell-level atlases for any model organism with a stereotyped body plan down to the level of unique semantic cell labels, and thus bears the potential to catalyze respective biomedical studies in a range of further species.
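The cycle-consistency idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes each pairwise matching is a full permutation stored as an index list (keypoint `i` in graph A maps to `p_ab[i]` in graph B); the function names and the mismatch-fraction score are hypothetical illustrations, not the authors' loss.

```python
def compose(p_ab, p_bc):
    """Chain two matchings: keypoint i in A -> p_ab[i] in B -> p_bc[p_ab[i]] in C."""
    return [p_bc[j] for j in p_ab]


def cycle_inconsistency(p_ab, p_bc, p_ac):
    """Fraction of keypoints for which going A -> B -> C disagrees with A -> C."""
    composed = compose(p_ab, p_bc)
    mismatches = sum(1 for via_b, direct in zip(composed, p_ac) if via_b != direct)
    return mismatches / len(p_ac)


# Toy example with three keypoints per graph (assumed data).
p_ab = [1, 2, 0]
p_bc = [2, 0, 1]
consistent_ac = compose(p_ab, p_bc)   # agrees with the A -> B -> C path by construction
inconsistent_ac = [1, 0, 2]           # swaps the first two keypoints

score_consistent = cycle_inconsistency(p_ab, p_bc, consistent_ac)
score_inconsistent = cycle_inconsistency(p_ab, p_bc, inconsistent_ac)
```

A score of zero means the three matchings agree around the cycle, which is what makes the quantity usable as a training signal without ground-truth labels.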

Tue 10 March 13:42 - 13:54 PDT

Automated Suturing Skill Assessment in Robot-assisted Surgery from Endoscopic Videos using Clinically-guided Evaluation Criteria

Atharva Deo ⋅ Ujjwal Pasupulety ⋅ Nicholas Matsumoto ⋅ Jay Moran ⋅ Cherine Yang ⋅ Jeanine Kim ⋅ Rafal Kocielnik ⋅ Aurash Naser-Tavakolian ⋅ Andrew Hung

Surgery continues to be perceived as an art, where proficiency is primarily achieved through years of experience. Artificial Intelligence research has yielded insight into the performance of expert surgeons and its association with patient outcomes. Clinician expertise has led to the development of systematic assessments for fundamental skills (e.g., End-to-End Assessment of Suturing Expertise [EASE]) that contribute to positive outcomes. However, evaluating these skills requires manual expert review of endoscopic videos and is prone to inconsistencies between human raters. In this work, we present AutoEASE, the first end-to-end pipeline to automatically assess suturing performance from raw endoscopic video data using EASE rubrics. Our system uses a Mixture of Experts (MoE) comprising multiscale vision transformers and 3D convolutional neural networks, trained on Robot-assisted Radical Prostatectomy videos with over 13,000 data points. For a given stitch clip, the MoE pipeline first determines each phase (needle handling, driving, withdrawal) of a continuous stitch and then predicts a binary score (fail / ideal) for seven sub-skills based on rubrics defined in EASE. AutoEASE achieves 0.98 AUC in detecting each phase. For EASE score prediction, the complete end-to-end pipeline attains $\geq$ 0.77 AUC in sub-skills associated with needle handling and driving. The promising performance of AutoEASE at the individual stitch level demonstrates the feasibility of developing more sophisticated assessment and reporting tools for complete surgical procedures objectively and at scale.
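For readers unfamiliar with the AUC metric reported above, the following is a minimal sketch of how it can be computed for a binary score such as fail / ideal. The labels and scores are made-up toy values, and the pairwise formulation is one standard way to compute AUC, not necessarily the evaluation code used for AutoEASE.

```python
def auc(labels, scores):
    """Probability that a random positive is scored above a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data (assumed): 1 = "ideal", 0 = "fail".
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = auc(toy_labels, toy_scores)
```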

Tue 10 March 13:54 - 14:06 PDT

Deep Image Decomposition for Medical Imaging Anonymization and Curation

Yael Elkin ⋅ Gal Arie ⋅ Tammy Raviv Raviv

Medical scans often include patient identifiers and clinical annotations that must be removed prior to data sharing or use in downstream model training. With machine learning now central to clinical imaging analysis, reliable removal of such non-imaging artifacts is essential for preserving patient privacy, reducing bias, and improving data quality. However, this crucial curation step is frequently overlooked or addressed heuristically. We present a deep learning framework that automatically detects and removes overlaid text, markers, and other non-imaging elements from clinical scans while restoring the underlying image content. The model comprises two components: a detection module that localizes non-imaging regions, and a dual-generator architecture for unsupervised image decomposition, where one generator reconstructs the imaging content and the other produces the non-imaging components. Unlike conventional inpainting, our method bypasses explicit segmentation by leveraging explainable AI (XAI) maps from the detection module to guide artifact masking and restoration. We demonstrate robust curation performance on three datasets, one MRI and two ultrasound, from both public and private sources. Results show high visual quality (Turing-test validated) and strong quantitative scores (SSIM, PSNR, FID). Importantly, training downstream classification and segmentation models with scans curated by our method substantially improves results compared to models trained on data containing overlaid annotations. In fact, our performance on various metrics (e.g., accuracy, F1 score, IoU, and Dice) is comparable to that obtained with clean, marker-free training data. Code is included with the submission. Our private dataset will be released upon acceptance.
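Of the quantitative scores listed above, PSNR is the simplest to state. The sketch below shows the standard definition for 8-bit grayscale images stored as nested lists; the tiny 2x2 images are assumed toy data and the function name is hypothetical.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized grayscale images."""
    squared_errors = [
        (r - t) ** 2
        for ref_row, res_row in zip(reference, restored)
        for r, t in zip(ref_row, res_row)
    ]
    mse = sum(squared_errors) / len(squared_errors)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy 2x2 images (assumed): the restored image is off by one gray level everywhere.
clean = [[10, 20], [30, 40]]
restored = [[11, 21], [31, 41]]
toy_psnr = psnr(clean, restored)
```

Higher values indicate a restoration closer to the reference; an exact match gives an infinite PSNR.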

Tue 10 March 14:06 - 14:18 PDT

Intraoperative 2D/3D Registration via Spherical Similarity Learning and Differentiable Levenberg-Marquardt Optimization

Minheng Chen ⋅ Youyong Kong

Intraoperative 2D/3D registration aligns preoperative 3D volumes with real-time 2D radiographs, enabling accurate overlay of auxiliary anatomical information that is not visible in intraoperative imaging onto the surgical scene. This provides precise localization of instruments and implants, enhancing surgical accuracy and safety. A recently proposed fully differentiable similarity learning framework, which enables neural networks to approximate the geodesic distance between two poses on the SE(3) manifold, has garnered considerable attention. It greatly increases the capture range of registration and mitigates the effects of substantial disturbances. However, existing methods approximate the Riemannian manifold within Euclidean space, which portrays the manifold's local structure inaccurately and leads to lengthy convergence. To address these limitations, we explore similarity learning on non-Euclidean spherical feature spaces to improve the ability to capture and fit complex manifold features. We extract feature embeddings using a CNN-Transformer encoder, project them into spherical space, and approximate their geodesic distances with Riemannian geodesic distances in the bi-invariant SO(4) space. This enables the learning of a more expressive and geometrically consistent deep similarity metric, enhancing the network’s ability to distinguish subtle pose differences. Fully differentiable Levenberg-Marquardt optimization replaces the existing gradient descent method to accelerate the convergence of the search during the inference phase. Extensive experiments and ablation studies on real and synthetic datasets demonstrate that our approach achieves superior registration accuracy in both patient-specific and patient-agnostic scenarios.
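To give a feel for why Levenberg-Marquardt can converge faster than plain gradient descent, here is a minimal sketch on a deliberately simplified problem: recovering a single 2D rotation angle from matched points. The toy problem, the point set, and all names are assumptions for illustration; the paper's optimizer operates on full 3D poses with a learned similarity metric, which this sketch does not attempt to reproduce.

```python
import math


def rotate(theta, p):
    """Rotate the 2D point p counter-clockwise by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * p[0] - s * p[1], s * p[0] + c * p[1])


def residuals_and_jacobian(theta, src, dst):
    """Stacked residuals R(theta) * src - dst and their derivatives w.r.t. theta."""
    c, s = math.cos(theta), math.sin(theta)
    r, jac = [], []
    for (x, y), (u, v) in zip(src, dst):
        r += [c * x - s * y - u, s * x + c * y - v]
        jac += [-s * x - c * y, c * x - s * y]
    return r, jac


def levenberg_marquardt(src, dst, theta=0.0, lam=1e-3, iters=50):
    """Damped Gauss-Newton on one parameter with an accept/reject damping schedule."""
    r, jac = residuals_and_jacobian(theta, src, dst)
    cost = sum(e * e for e in r)
    for _ in range(iters):
        gradient = sum(j * e for j, e in zip(jac, r))   # J^T r
        curvature = sum(j * j for j in jac)             # J^T J
        step = -gradient / (curvature + lam)
        if abs(step) < 1e-12:
            break
        r_new, jac_new = residuals_and_jacobian(theta + step, src, dst)
        cost_new = sum(e * e for e in r_new)
        if cost_new < cost:   # accept the step and trust the quadratic model more
            theta, r, jac, cost = theta + step, r_new, jac_new, cost_new
            lam /= 10.0
        else:                 # reject the step and increase the damping
            lam *= 10.0
    return theta


# Toy data (assumed): three points rotated by 0.7 rad.
source_points = [(1.0, 0.0), (0.0, 2.0), (-1.0, 1.0)]
target_points = [rotate(0.7, p) for p in source_points]
theta_hat = levenberg_marquardt(source_points, target_points)
```

The damping term `lam` interpolates between a Gauss-Newton step (small `lam`) and a short gradient-descent-like step (large `lam`), which is the property that makes the method robust far from the solution yet fast near it.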

Tue 10 March 14:18 - 14:30 PDT

ACuRE: Accurate Continuity-Regularized SpO2 Estimation Using Liquid Time-Constant Networks

Shahzad Ahmad ⋅ DR. MISHRA ⋅ Sania Bano ⋅ Sukalpa Chanda ⋅ Yogesh Rawat

Blood oxygen saturation (SpO$_2$) is a vital measure of respiratory and circulatory health, essential for detecting hypoxemia in conditions like chronic obstructive pulmonary disease and heart failure. Current non-contact SpO$_2$ estimation methods using remote photoplethysmography (rPPG) struggle with motion artifacts, illumination variability, and limited temporal modeling, hindering their practical use. We propose ACuRE, a novel framework that integrates a two-branch 3D-ResNet-18 for AC/DC signal separation, Liquid Time-Constant (LTC) networks for continuous-time dynamics, and a physics-informed partial differential equation (PDE) loss based on mass conservation. ACuRE overcomes these challenges by isolating pulsatile (AC) and baseline (DC) signals for enhanced robustness, using LTC networks to capture nonlinear physiological dynamics, and applying PDE regularization to ensure signal continuity. This achieves a significant reduction in mean absolute error compared to baselines, with strong performance under motion and illumination stress. Evaluated across multiple datasets, ACuRE demonstrates robust accuracy and generalization, offering a scalable solution for video-based health monitoring in telemedicine and low-resource settings.
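The role of the AC/DC separation can be illustrated with the classical ratio-of-ratios principle from pulse oximetry. This is a minimal sketch, not ACuRE itself: the two synthetic channels, the use of the standard deviation as the AC amplitude, and the linear calibration constants `a` and `b` are all illustrative assumptions, and real calibration constants are device-specific and determined empirically.

```python
import math


def ac_dc(signal):
    """Split a signal into a baseline (DC, the mean) and a pulsatile amplitude (AC, the std)."""
    dc = sum(signal) / len(signal)
    ac = math.sqrt(sum((s - dc) ** 2 for s in signal) / len(signal))
    return ac, dc


def ratio_of_ratios(channel_1, channel_2):
    """R = (AC_1 / DC_1) / (AC_2 / DC_2) for two wavelength or color channels."""
    ac1, dc1 = ac_dc(channel_1)
    ac2, dc2 = ac_dc(channel_2)
    return (ac1 / dc1) / (ac2 / dc2)


def spo2_from_ratio(r, a=110.0, b=25.0):
    """Linear calibration SpO2 = a - b * R; a and b are placeholder constants."""
    return a - b * r


# Synthetic channels (assumed): four full pulse periods of 50 samples each.
pulse = [math.sin(2.0 * math.pi * t / 50.0) for t in range(200)]
channel_1 = [100.0 + 1.0 * p for p in pulse]
channel_2 = [200.0 + 4.0 * p for p in pulse]

toy_ratio = ratio_of_ratios(channel_1, channel_2)
toy_spo2 = spo2_from_ratio(toy_ratio)
```

Normalizing each channel's AC amplitude by its DC level removes the dependence on overall brightness, which is one reason separating the two components helps under illumination changes.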