
X-iPPGNet: A novel one stage deep learning architecture based on depthwise separable convolutions for video-based pulse rate estimation.

Abstract

Pulse rate (PR) is one of the most important markers for assessing a person’s health. With the increasing demand for long-term health monitoring, much attention is being paid to contactless PR estimation using imaging photoplethysmography (iPPG). This non-invasive technique is based on the analysis of subtle changes in skin color. Despite efforts to improve iPPG, existing algorithms remain vulnerable to less-constrained scenarios (i.e., head movements, facial expressions, and environmental conditions). In this article, we propose a novel end-to-end spatio-temporal network, namely X-iPPGNet, for instantaneous PR estimation directly from facial video recordings. Unlike most existing systems, our model learns the iPPG concept from scratch without incorporating any prior knowledge or extracting blood volume pulse signals. Inspired by the Xception network architecture, color channel decoupling is used to learn additional photoplethysmographic information and to effectively reduce the computational cost and memory requirements. Moreover, X-iPPGNet predicts the pulse rate from a short time window (2 s), which is advantageous when the pulse rate is high or fluctuates sharply. The experimental results revealed high performance under all conditions, including head motions, facial expressions, and skin tone. Our approach significantly outperforms all current state-of-the-art methods on three benchmark datasets: MMSE-HR (MAE = 4.10; RMSE = 5.32; r = 0.85), UBFC-rPPG (MAE = 4.99; RMSE = 6.26; r = 0.67), and MAHNOB-HCI (MAE = 3.17; RMSE = 3.93; r = 0.88). Copyright © 2023 Elsevier Ltd. All rights reserved.
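
To make the core building block concrete, below is a minimal sketch in PyTorch of a depthwise separable convolution applied to a short stack of video frames, followed by a regression head that outputs a single pulse-rate value. This is only an illustration of the technique the abstract names (depthwise separable convolutions over a 2 s clip), not the authors' X-iPPGNet: all layer sizes, the frame-stacking strategy, and the class names (DepthwiseSeparableConv2d, PulseRateRegressor) are assumptions made for the example.

    import torch
    import torch.nn as nn

    class DepthwiseSeparableConv2d(nn.Module):
        """Depthwise conv (one filter per channel) followed by a 1x1 pointwise conv."""
        def __init__(self, in_ch, out_ch, kernel_size=3, padding=1):
            super().__init__()
            self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size,
                                       padding=padding, groups=in_ch, bias=False)
            self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False)
            self.bn = nn.BatchNorm2d(out_ch)
            self.act = nn.ReLU(inplace=True)

        def forward(self, x):
            return self.act(self.bn(self.pointwise(self.depthwise(x))))

    class PulseRateRegressor(nn.Module):
        """Toy spatio-temporal regressor: the frames of a short clip are stacked
        along the channel axis and reduced to one pulse-rate estimate (bpm)."""
        def __init__(self, num_frames=50, base_ch=32):
            super().__init__()
            in_ch = 3 * num_frames          # RGB channels of every frame, stacked
            self.features = nn.Sequential(
                DepthwiseSeparableConv2d(in_ch, base_ch),
                nn.MaxPool2d(2),
                DepthwiseSeparableConv2d(base_ch, base_ch * 2),
                nn.AdaptiveAvgPool2d(1),
            )
            self.head = nn.Linear(base_ch * 2, 1)   # single pulse-rate output

        def forward(self, clip):                     # clip: (B, T, 3, H, W)
            b, t, c, h, w = clip.shape
            x = clip.reshape(b, t * c, h, w)         # stack frames on channel axis
            x = self.features(x).flatten(1)
            return self.head(x)

    # Usage: a batch of two 2 s clips at an assumed 25 fps (50 frames), 64x64 face crops.
    model = PulseRateRegressor(num_frames=50)
    clip = torch.randn(2, 50, 3, 64, 64)
    print(model(clip).shape)   # torch.Size([2, 1])

The separable factorization is what keeps the computational cost low: each spatial filter sees only one channel, and the cheap 1x1 pointwise convolution then mixes channels, which is the property the abstract attributes to the Xception-inspired design.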
