Non‑Contact Blood Pressure Estimation from Face Videos via Physiology‑Aware Contrastive Learning
Abstract
Remote photoplethysmography (rPPG) has emerged as a promising foundation for camera-based blood pressure (BP) monitoring, but practical deployment remains limited by strong domain gaps across datasets, scarce and imbalanced labels, and the difficulty of preserving waveform morphology. We present a dual-branch framework that combines raw rPPG segments with handcrafted waveform features and introduces an augmentation-free contrastive pre-training strategy. The approach learns subject-invariant, domain-agnostic embeddings from unlabeled facial videos and aligns them with physiology-inspired descriptors in a shared latent space, while a distribution-aware loss mitigates the effect of label imbalance on training. This design integrates the strengths of data-driven representations and handcrafted physiological cues, producing morphology-sensitive features that generalize across acquisition domains. Experiments on multiple datasets demonstrate that the proposed method achieves competitive accuracy under both controlled and in-the-wild conditions, improves cross-dataset transfer, and maintains robust performance when labeled data are limited. Beyond accuracy, the framework emphasizes interpretability by grounding learned embeddings in physiological features, and because pre-training relies only on unlabeled videos it scales readily to larger populations without extensive manual annotation. Taken together, these results suggest that bridging representation learning with physiological modeling offers a practical and scalable path toward reliable, non-contact BP monitoring in diverse real-world environments.
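To make the two loss components referenced above concrete, the sketch below gives one plausible reading of the abstract in PyTorch: a symmetric InfoNCE term that aligns each raw-rPPG embedding with the handcrafted-descriptor embedding of the same segment (augmentation-free in the sense that positives come from matched pairs rather than synthetic transforms), and an inverse-frequency reweighted regression term as one common form of a distribution-aware loss. This is a minimal illustration under assumed forms, not the paper's exact objective; the function names and hyperparameters (temperature, bin_counts, bin_edges) are illustrative.

```python
import torch
import torch.nn.functional as F

def alignment_loss(z_video: torch.Tensor, z_phys: torch.Tensor,
                   temperature: float = 0.1) -> torch.Tensor:
    """Symmetric InfoNCE over matched (rPPG embedding, descriptor embedding)
    pairs. Row i of each tensor comes from the same segment, so the diagonal
    of the similarity matrix holds the positives; off-diagonal entries
    (other subjects in the batch) act as negatives, with no augmentation."""
    z_v = F.normalize(z_video, dim=1)            # (B, D) raw-signal branch
    z_p = F.normalize(z_phys, dim=1)             # (B, D) handcrafted-feature branch
    logits = z_v @ z_p.t() / temperature         # (B, B) scaled cosine similarities
    targets = torch.arange(z_v.size(0), device=z_v.device)
    return 0.5 * (F.cross_entropy(logits, targets)
                  + F.cross_entropy(logits.t(), targets))

def distribution_aware_mse(pred_bp: torch.Tensor, true_bp: torch.Tensor,
                           bin_counts: torch.Tensor,
                           bin_edges: torch.Tensor) -> torch.Tensor:
    """Inverse-frequency reweighted MSE (an assumed form): samples falling in
    rare BP bins receive proportionally larger weights, countering the
    effect of label imbalance on training."""
    bins = (torch.bucketize(true_bp, bin_edges) - 1).clamp(0, len(bin_counts) - 1)
    weights = 1.0 / bin_counts[bins].clamp(min=1).float()
    weights = weights / weights.mean()           # keep the overall loss scale stable
    return (weights * (pred_bp - true_bp) ** 2).mean()
```

In a batch of B matched segments, the B x B similarity matrix places positive pairs on its diagonal, so negatives arise naturally from other subjects in the batch; this is what lets the contrastive stage operate on unlabeled videos without any synthetic augmentation pipeline.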