Telehealth has become a critical way for doctors to provide healthcare while minimising in-person contact during COVID-19. But with phone or video appointments, it’s harder for doctors to get important vital signs from a patient, such as their pulse or respiration rate, in real time. Researchers in the US, however, believe they are closing in on a solution.
A University of Washington-led team has developed a method that uses the camera on a person’s smartphone or computer to take their pulse and respiration signal from a real-time video of their face.
The system is less likely to be tripped up by different cameras, lighting conditions or facial features such as skin colour, making it a more robust way to measure physiological signals remotely.
Lead author Xin Liu, a UW doctoral student in the Paul G. Allen School of Computer Science & Engineering, explains: “Machine learning is pretty good at classifying images. If you give it a series of photos of cats and then tell it to find cats in other images, it can do it.
“But for machine learning to be helpful in remote health sensing, we need a system that can identify the region of interest in a video that holds the strongest source of physiological information (pulse, for example) and then measure that over time.
“Every person is different. So this system needs to be able to quickly adapt to each person’s unique physiological signature, and separate this from other variations, such as what they look like and what environment they are in.”
The team’s system is “privacy preserving”, running on the device rather than in the cloud. It uses machine learning to capture subtle changes in how light reflects off a person’s face, changes that are correlated with blood flow, and then converts them into both pulse and respiration rate.
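The underlying idea can be illustrated with a much simpler, classic remote-photoplethysmography recipe rather than the team’s neural network: average the green channel over a face region in each frame, then look for the dominant frequency in the normal heart-rate band. The sketch below is purely illustrative; the frame array, detrending window and pulse band are assumptions, not details from the study.

```python
import numpy as np

def estimate_pulse_bpm(face_frames: np.ndarray, fps: float) -> float:
    """face_frames: (num_frames, height, width, 3) RGB crops of the face region."""
    # 1. Per-frame mean of the green channel, which is most sensitive to blood-volume changes
    signal = face_frames[:, :, :, 1].mean(axis=(1, 2))

    # 2. Remove the slowly varying baseline (lighting drift, small head movements)
    signal = signal - np.convolve(signal, np.ones(31) / 31, mode="same")

    # 3. Find the dominant frequency within a plausible heart-rate band (0.7-4 Hz)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fps)
    power = np.abs(np.fft.rfft(signal)) ** 2
    band = (freqs >= 0.7) & (freqs <= 4.0)
    peak_freq = freqs[band][np.argmax(power[band])]

    return peak_freq * 60.0  # beats per minute

# Synthetic example: a 1.2 Hz (72 bpm) "pulse" buried in noise
fps, seconds = 30, 10
t = np.arange(fps * seconds) / fps
fake_frames = np.random.rand(len(t), 8, 8, 3) * 5
fake_frames[:, :, :, 1] += 0.5 * np.sin(2 * np.pi * 1.2 * t)[:, None, None]
print(f"Estimated pulse: {estimate_pulse_bpm(fake_frames, fps):.1f} bpm")
```

A hand-tuned pipeline like this is exactly what tends to break across cameras, lighting and skin tones, which is why the researchers turned to a learned, personalised model instead.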
The first version of this system was trained with a dataset that contained both videos of people’s faces and “ground truth” information: each person’s pulse and respiration rate measured by standard instruments in the field.
The system then used spatial and temporal information from the videos to calculate both vital signs. It outperformed similar machine learning systems on videos where subjects were moving and talking.
But while the system worked well on some datasets, it still struggled with others that contained different people, backgrounds and lighting.
This is a common problem known as “overfitting,” the team said.
The researchers improved the system by having it produce a personalised machine learning model for each individual.
Specifically, the system looks for the areas of a video frame most likely to contain physiological features correlated with changing blood flow in a face, and it does so across different contexts, such as different skin tones, lighting conditions and environments.
From there, it can focus on that area and measure the pulse and respiration rate.
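The article describes this personalisation only at a high level, but the general pattern is familiar in machine learning: start from a pretrained network and take a few gradient steps on a short calibration clip from the new person before running inference. The PyTorch sketch below is a hedged illustration of that pattern, not the team’s actual training procedure; the model architecture, clip length, step count and learning rate are all assumptions.

```python
import torch
import torch.nn as nn

def personalise(pretrained: nn.Module,
                calib_frames: torch.Tensor,   # (T, C, H, W) short clip of the new user
                calib_pulse: torch.Tensor,    # (T,) reference pulse signal for that clip
                steps: int = 5,
                lr: float = 1e-4) -> nn.Module:
    """Adapt a pretrained video-to-pulse model to one person with a few gradient steps."""
    model = pretrained  # in practice you would copy the weights before adapting
    optimiser = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()

    model.train()
    for _ in range(steps):
        optimiser.zero_grad()
        prediction = model(calib_frames).squeeze(-1)  # predicted per-frame pulse value
        loss = loss_fn(prediction, calib_pulse)
        loss.backward()
        optimiser.step()

    model.eval()
    return model

# Toy stand-in for a pretrained video-to-pulse network (purely illustrative)
toy_model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 36 * 36, 1))
frames = torch.randn(90, 3, 36, 36)   # ~3 seconds of 36x36 RGB face crops at 30 fps
pulse = torch.randn(90)               # reference pulse waveform from a contact sensor
adapted = personalise(toy_model, frames, pulse)
```

Because the adaptation happens on the user’s own device, the calibration clip never needs to leave it, which is consistent with the “privacy preserving” design the team describes.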
While this new system outperforms its predecessor when given more challenging datasets, especially for people with darker skin tones, there’s still more work to do, the team said.
“We acknowledge that there is still a trend toward inferior performance when the subject’s skin type is darker,” Liu said.
“This is in part because light reflects differently off of darker skin, resulting in a weaker signal for the camera to pick up. Our team is actively developing new methods to solve this limitation.”
The researchers are also working on a variety of collaborations with doctors to see how this system performs in the clinic.
Senior author Shwetak Patel said: “Any ability to sense pulse or respiration rate remotely provides new opportunities for remote patient care and telemedicine. This could include self-care, follow-up care or triage, especially when someone doesn’t have convenient access to a clinic.
“It’s exciting to see academic communities working on new algorithmic approaches to address this with devices that people have in their homes.”
This research was funded by the Bill & Melinda Gates Foundation, Google and the University of Washington.