Heart Rate Extraction From Neonatal Near-Infrared Spectroscopy Signals

Near-infrared spectroscopy (NIRS) intensity signals provide useful additional physiological information, of which the most prominent one is the pulsatile fluctuation by heartbeats. This allows for the extraction of heart rate (HR), one of the primary clinical indicators of health in neonates. In this study, we propose a novel algorithm, NIRS HR (NHR), for extracting HR from NIRS signals acquired from neonates admitted to the neonatal intensive care unit (NICU). After parental consent, we synchronously recorded NIRS at 100 Hz and reference HR (RHR) at 1 Hz, from ten newborn infants (gestational <inline-formula> <tex-math notation="LaTeX">$\text {age} =38 \pm 5$ </tex-math></inline-formula> weeks; 3092 ± 990 g). The NHR algorithm consists of two main parts. The first part includes four steps implemented once on the whole NIRS measurement: preprocessing; HR frequency bandwidth determination; interquartile range (IQR) computation; and segmentation. The second part includes three steps implemented on each signal segment: motion artifact detection, signal quality assessment, and HR computation. We compared the NHR algorithm with two existing algorithms. The results showed that the proposed NHR algorithm provides a significantly (<inline-formula> <tex-math notation="LaTeX">$p < 0.05$ </tex-math></inline-formula>) higher correlation (<inline-formula> <tex-math notation="LaTeX">$r$ </tex-math></inline-formula> = 99.5%) and lower Bland-Altman ratio (BAR = 3.6%) between the extracted and RHRs, compared to the existing algorithms. Extracting HR from NIRS in a clinical setting of critically ill neonates admitted to neonatal intensive care is feasible. With NIRS and HR combined in a single monitoring system, it is possible to have a perfectly time-synced integrated analysis of cerebral hemodynamics, as well as systemic hemodynamics and autonomic nervous system tone.


I. INTRODUCTION
N EAR-INFRARED spectroscopy (NIRS) is a noninvasive system for the continuous monitoring of brain hemodynamics [1], [2]. It is an ideal system to measure changes in oxygenation and perfusion of the neonatal brain [3], [4]. Although NIRS is not yet part of the standard clinical routine in most neonatal intensive care units (NICUs), it is increasingly used across various clinical settings [5]. NIRS-monitored cerebral oxygenation has been shown to be of clinical benefit in infants during (non-) cardiac surgery, as well as for the assessment of some clinical conditions such as asphyxia, hypoxic-ischemic encephalopathy, anemia, hypoglycemia, hypotension, and hypocapnia [6], [7], [8], [9], [10], [11]. Using NIRS, different types of information can be acquired, where cerebral oxygenation is the most common one used in clinical settings [3]. Another source of information is the raw NIRS-derived intensity signals, which are widely disregarded for clinical purposes. The NIRS intensity signals are contaminated by physiological artifacts such as cardiac pulsation, respiration, and blood pressure [12], which can potentially be harnessed to yield additional physiological information. The most prominent physiological information in the NIRS signals is cardiac pulsation, which allows for the extraction of heart rate (HR) from NIRS [3], [13].
Extracting HR from NIRS signals offers several potential advantages in neonates over traditional methods for HR monitoring, i.e., electrocardiography (ECG) and photoplethysmography (PPG). NIRS is a noninvasive technique that does not require sensors to be adhesively placed on the skin [3]. This could be especially advantageous for extremely preterm infants with vulnerable skin [17], preventing epidermal stripping [14]. In extremely preterm infants during the first days of life, when the skin is at its most fragile state, often no ECG sensors are placed due to the risk of skin injury, and HR (or pulse rate) is extracted from oximetry measurements [15]. The NIRS signals recorded on the forehead are less prone to motion artifacts than the PPG signals recorded in the periphery. This is because head movements are less frequent in neonates than hand movements [16]. Moreover, as reported in the previous studies [17], [18], the HR derived from NIRS might be a richer source of information on the autonomic nervous system than the HR derived from PPG and ECG under stressful conditions. Additionally, NIRS also enables continuous monitoring of cerebral blood flow and oxygenation. Measuring HR with the same system would allow for a better understanding of physiological mechanisms and interventions in newborns with concurrent analysis of cerebral oxygenation and HR [3]. Lastly, NIRS devices are portable and easy to use [10], making them convenient for monitoring HR and other physiological variables in resource-limited settings or when rapid, noninvasive monitoring is necessary.
Several studies have reported the compatibility of the HR extracted from NIRS with the standard reference HR (RHR) derived from ECG and the HR derived from PPG in healthy adults [17], [18], [19], [20]. Scholkmann et al. [21] proposed an algorithm named automatic multiscale-based peak detection (AMPD) for peak detection in noisy periodic and quasiperiodic signalsClick or tap here to enter text. Thereafter, Holper et al. [17] employed the AMPD algorithm for extracting HR from NIRS data recorded from healthy adultsClick or tap here to enter text. Moreover, in the same study, they showed the usability of NIRS-derived HR in the assessment of physical stress in healthy adults. Hakimi and Setarehdan [18] showed the outperformance of the AMPD algorithm in HR extraction from NIRS signals recorded in healthy adults with respect to the existing algorithms proposed for extracting HR from PPG [18]; furthermore, in the same study, they showed the usability of the NIRS-derived HR in the assessment of mental stress.
Extraction of HR from infant NIRS is more challenging than adult NIRS due to a lower signal-to-noise ratio, a higher level of motion artifacts, and a greater physiological HR range [22]. The latter makes that there is a need for a system with a high sampling rate. Perdue et al. [22] reported an algorithm for the extraction of HR from NIRS signals recorded from healthy infants (∼7 months old) in a controlled setting, sampled at 10 Hz. Perdue et al. [23] conducted a longitudinal study investigating the applicability of the extracted HR (EHR) in infants aged 3-12 months old with familial risk for autism spectrum disorder to observe the HR developmental trajectories to speech stimuliClick or tap here to enter text.
Although the existing studies have shown the feasibility of extracting HR from NIRS signals recorded in both infants and adults, the existing algorithms have only been validated on data recorded from healthy subjects in controlled environments. There are, however, challenges existing in analyzing data acquired in a clinical environment in comparison with a controlled environment, including lower data quality, higher contamination by artifacts due to patient movement, intermittent sensor replacement, and ventilator use, more variability of the signal source, and more restriction imposed by patient safety regulations in the hospital. Because of these challenges, the existing algorithms might not be properly accurate in the extraction of HR from clinical NIRS data; therefore, there is a lack of a study to assess the performance of the existing algorithms in extracting HR from NIRS data recorded in clinical settings, especially in neonatal patients in the NICU. The issue of noninvasive monitoring of critical functions of the body is preeminent in very tiny and vulnerable (preterm) neonates with very vulnerable skin [14]. Especially in this patient group, one wants to have as few sensors and lines as possible, and in this issue, to obtain as much critical information of body functions with minimal wiring and burden with the banana-shaped light pathways. The transmitter's receiver distance is 21.5 mm, which provides an approximate penetration depth of 11 mm (half of 21.5 mm) [25], [26]. (b) Photograph of the TOM sensor placed on a baby manikin's forehead, covered with a clinical self-adhesive elastic bandage. on the patients [24]. A neonatal NIRS system with a robust HR extraction algorithm validated in clinical settings would allow for having a complementary view of the autonomic nervous system and systemic circulation, and cerebral hemodynamics within a single device.
In this study, a novel algorithm, NIRS HR (NHR), is proposed for extracting HR from NIRS signals acquired from neonates admitted to the NICU. The algorithm is validated with respect to the RHR measured with a clinically approved patient monitor system. In addition, we compare the performance of the NHR algorithm with two state-of-art algorithms: the Perdue algorithm, introduced by Perdue et al. [22], validated on NIRS data recorded from healthy infants, and the AMPD algorithm, introduced by Scholkmann et al. [21], validated on NIRS data recorded from healthy adults. Since the proposed algorithm has some processing steps involved for signal quality assessment and motion artifact detection, we also investigate whether these processing steps would improve the HR extraction accuracy with respect to a simple version of the algorithm.

A. Data Acquisition
In this study, NIRS signals were recorded by using a cerebral oximetry system [tissue oxygenation monitor (TOM, Artinis Medical Systems B.V., Elst, The Netherlands)]. This device has a sensor with two transmitters firing at two nominal wavelengths of 760 and 850 nm and a receiver at a 21.5 mm distance from the transmitters. It provides optical densities (ODs), as well as cerebral oxygenation, commonly known as rSO2 or StO2, with a 100 Hz sampling rate. We placed the sensor on the neonate's forehead on the opposite side of which the neonate was lying. We used a clinical self-adhesive elastic bandage to keep the sensor in place on the neonate's forehead. Fig. 1 shows a schematic of the configuration of the two transmitters and the receiver providing two NIRS channels in the sensor and a photograph of the sensor placed on a baby manikin's forehead, covered by the bandage. We recorded the RHR with a Philips IntelliVue MP70 (Philips Medical Systems, Best, The Netherlands) patient monitor at a 1 Hz sampling rate.

B. Participants
All participants were admitted to the NICU of the Wilhelmina Children's Hospital, Utrecht, The Netherlands. This  study was approved by the Local Ethics Committee of the University Medical Center Utrecht, with the ethics code 21-098/C. We included ten newborn infants (gestational age = 38±5 weeks; birth weight = 3092 ± 990 g, three female) who already had an indication for NIRS monitoring, as determined by the caring physician or local neuromonitoring protocols. Parents were informed about the study protocol and gave written consent before the data acquisition. We recorded a total of 19 measurements with an average length of 1.5 h from the included neonates without interrupting clinical routines and with no specific experimental paradigm. The analyses in this study were implemented in Python using numpy, scipy, and matplotlib modules [27], [28].

A. Part I
Part I of the NHR algorithm consists of four steps that are implemented once at the beginning of the analyses on the whole NIRS measurement [see Fig. 2(a)]: preprocessing; adaptive HR frequency bandwidth (AHRFB); interquartile range (IQR); and segmentation. 1) Preprocessing: First, the ODs of the two NIRS channels are converted to concentration changes in oxygenated hemoglobin (O2Hb) and deoxygenated hemoglobin (HHb) by using the modified Beer-Lambert law [29] [OD to Hb in Fig. 2(a)]. Second, using the signal quality index (SQI) algorithm [30], the SQI signals of the two channels are computed [SQI in Fig. 2(a)], based on sliding windows of 10 s, overlapping by 50%. Third, the average of the SQI signals is computed separately, and finally, the channel with the greater average SQI is selected [channel selection in Fig. 2(a)]. In the subsequent steps of the algorithm, the O2Hb signal and the SQI signal of the selected channel are used. To have an equal sampling rate, the SQI ratings of all 10-s windows are finally interpolated to 100 Hz using cubic interpolation.
2) Adaptive HR Frequency Bandwidth (AHRFB): The HR frequency bandwidth is adaptively adjusted per measurement according to a predefined bandwidth between 1.25 and 3.5 Hz. The predefined bandwidth was selected according to the range of the RHR (75 bpm < RHR < 210). First, the O2Hb signal is low-pass filtered by using a moving average filter of 1 s and then subtracted from the original O2Hb signal to filter out low-frequency components (∼<1 Hz). Second, the fast Fourier transformation (FFT) magnitude of the filtered signal multiplied by a Hamming window with the same length as the signal is computed. Third, the FFT magnitude within the predefined bandwidth ([1.25, 3.5] Hz) is sorted in descending order. Then the average frequency of the first 50 components (∼50% of all components) is computed. Finally, the average frequency minus 0.5 Hz and plus 0.5 Hz are considered as the minimum and maximum thresholds of the AHRFB that will be referred, respectively to as f low and f high in the subsequent analyses.
3) Interquartile Range (IQR): First, the difference between the third and first percentiles, known as the IQR of the selected O2Hb signal, is computed in sliding windows of 3 s. Second, the IQR values of all 3-s windows are interpolated to 100 Hz using cubic interpolation to have an equal sampling rate. Finally, the IQR signal is normalized by dividing it by the median of the O2Hb signal. 4) Segmentation: On a sliding window approach, the resulting O2Hb, SQI, and IQR signals are segmented into windows of 50 s, overlapping by 75% (shifted every 12.5 s). It is noteworthy that the first HR value will be computed at the earliest at 50 s of measurement time. Fig. 3(a) illustrates a segment of the O2Hb signal acquired from the data recorded from one of the participants (Measurement 6).

B. Part II
Part II of the NHR algorithm consists of three steps that are implemented every 12.5 s on each 50-s segment of the O2Hb, SQI, and IQR signals, computing or updating HR [see Fig. 2(b)]: 1) motion artifact detection; 2) signal quality assessment; and (3) HR computation.
1) Motion Artifact Detection: First, the segmented (normalized) IQR signal is thresholded with respect to a predefined threshold (th IQR ) of 1% and is then converted to a binary signal. The threshold was obtained empirically. The converted IQR signal is zero when the IQR is above the threshold, meaning presumably contaminated with motion artifact. Second, the percentage of the zeros in the binary signal is calculated. If the percentage is above 80%, meaning that more than 80% of the signal in the 50-s window is contaminated with motion artifact, no HR will be computed for the considered window. Otherwise, the algorithm passes to the next step to assess the signal quality of the signal segment. Fig. 3(f) illustrates a segment of the IQR signal computed from the 50-s signal segment shown in Fig. 3(a). After comparing the IQR signal with the corresponding predefined threshold (th IQR ), the binary IQR signal is computed, which is shown in purple in Fig. 3(g). It is observed that the motion artifacts that exist in about the second half of the signal segment have been correctly captured.
2) Signal Quality Assessment: In this step, first, the segmented SQI signal is thresholded with respect to a predefined threshold (th SQI ) of 1.75 and is then converted to a binary signal. The threshold was obtained empirically. The converted SQI signal is zero when the SQI is below the threshold. Second, the percentage of the zeros in the binary signal is calculated. If the percentage is above 25%, meaning that more than 25% of the signal in the considered window has a low signal quality (lower than an SQI of 1.75), then no HR for the considered window is computed. Otherwise, the algorithm passes to compute the HR. Fig. 3(e) illustrates a segment of the SQI signal corresponding to the signal shown in Fig. 3(a). It is observed that the signal quality in the second half of the segment is lower than that in the first half. After comparing the SQI signal with the corresponding predefined threshold (th SQI ), the binary SQI signal is computed, which is shown in green in Fig. 3(g).
3) HR Computation: The segmented O2Hb signal is detrended [detrend in Fig. 2(b)] to force its mean to zero, as well as to reduce its overall linear variation (trend). The detrending is conducted by subtracting the least-squares fit of a straight line from the data; further, the detrended O2Hb signal is multiplied by the product of the binary (or thresholded) IQR and SQI signals [multiplication operator in Fig. 2 Fig. 3(b) shows a segment of the converted O2Hb signal, which corresponds to the O2Hb signal shown in Fig. 3(a). Next, the autocorrelation of the converted signal is computed and is then detrended [autocorrelation and detrend in Fig. 2 Then, the FFT of the autocorrelation signal multiplied by a Hamming window with the same length is calculated [FFT in Fig. 2(b)]. Finally, the HR in the considered window is computed by finding the frequency with the maximum magnitude within the AHRFB ([ f low , f high ]). We multiply the maximum frequency by 60 to obtain the result in beats per minute (bpm) [frequency maximum within AHRFB in Fig. 2(b)]. Fig. 3(d) illustrates the FFT magnitude of the autocorrelation computed from the converted O2Hb signal that was shown in Fig. 3(b). The dashed green and brown lines depict the RHR and EHR, respectively. To show the impact of the conversion on the HR extraction, Fig. 3(c) shows the FFT magnitude of the autocorrelation computed from the original O2Hb signal shown in Fig. 3(a), so without any motion artifact detection and signal quality assessment. It is observed that there is a huge error of approximately 23 bpm (0.38 Hz) when the original O2Hb signal is used.

IV. ALGORITHM PERFORMANCE ASSESSMENT
We assessed the performance of the NHR algorithm by calculating four quantitative measures quantifying the agreement between the EHR and RHR. The quantitative measures employed are: mean of error (ME), root-mean-square error (RMSE), Bland-Altman limits of agreement (LoA) [31], and Bland-Altman ratio (BAR). LoA was computed by calculating 1.96 times the standard deviation of the error. It represents the range wherein 95% of the differences between the EHR and RHR are falling. BAR was computed by calculating the ratio of the LoA to the mean of the pairwise HRs (RHR and EHR). It is described as an efficient way to quantify the agreement between the RHR and EHR signals [32]: the agreement is ranked as good (BAR < 10%), moderate (10% < BAR < 20%), or insufficient agreement (BAR > 20%). Furthermore, we computed Pearson's correlation to determine the linear association between the RHR and EHR, wherein the aforementioned quantitative measures fail to represent.
We assessed the significance of the correlation with a Student's t-test at a significance level α = 5%, by using the Python scipy.stats module [28]. In addition, we counted the number of segments that have successfully passed the steps of motion artifact detection and signal quality assessment to compute the percentage of included segments. The mentioned quantitative measures were computed for each measurement separately, and the average and standard deviation for each measure were also calculated. To have a fair comparison between the EHR and RHR, we have used the average of the RHR in each 50 s segment.
We compared the performance of the NHR algorithm with the performance of two existing algorithms: the "Perdue" algorithm, proposed by Perdue et al. [22], used for extracting HR from NIRS in infants; the "AMPD" algorithm, proposed by Holper et al. [17] and Scholkmann et al. [21], used for extracting HR from NIRS in adults. Furthermore, to investigate the efficacy of the processing steps involved in the NHR algorithm for adaptively setting HR frequency bandwidth, detecting motion artifacts, and assessing signal quality, we also conducted a comparison with a simple version of the NHR algorithm. The simple version was obtained by excluding steps 2 and 3 in Part I and steps 1 and 2 in Part II of the NHR algorithm (see Section III), which will be referred to as the "Spectrum" algorithm. In the Perdue algorithm, the NIRS signals recorded with 10 Hz were upsampled to 100 Hz as a preprocessing step [22]. We exempted this step as the NIRS signals here were already recorded at a 100 Hz sampling rate. In both Perdue and AMPD algorithms, we took the average of the EHR in each 50 s segment, as it is implemented for the RHR signal (see previous paragraph). This was performed to have an accurate comparison between the performance of the existing algorithms and our proposed NHR algorithm.
To quantitatively compare the performance of the algorithms, we concatenated all measurements together and then computed the mentioned quantitative measures, except the included segments' percentage. As another comparison measure between the algorithms, we defined an agreement limit representing the range in which the difference between RHR and the respective EHR is less than 20% of the mean of the pairwise HRs (RHR and EHR) [33]. Furthermore, we applied Student's t-test with a significant level α = 5% to statistically compare the performance of the algorithms against each other. In addition, for a detailed comparison, we computed the quantitative measures (ME, RMSE, LoA, BAR, and Pearson's correlation) per measurement for the Perdue, AMPD, and Spectrum algorithms.
We conducted a sensitivity analysis of the window length parameter for the NHR algorithm. We changed the window length from 10 to 200 s (10-100 in steps of 10 s, and 100-200 in steps of 25 s) and computed EHR for all measurements by each window length selected. Starting from the shortest window length, we compared the EHR signals computed by each window length with the ones computed by the next greater window length (e.g., 10 s with 20 s). We used Student's t-test (α = 5%) to find the significant cases when increasing the window length leads to a significantly different performance compared with the next shorter window length. With this analysis, we determined the optimal window length that provides less delay, as well as statistically better performance than the other ones.
V. RESULTS A. Results of the Proposed Algorithm Fig. 4 illustrates the scatter plots comparing the RHR with the EHR computed by the proposed NHR algorithm for each measurement separately. It is observed that the measurements had different HR ranges that have been properly captured by the algorithm. As an example, Measurement 6 has HRs ranging from ∼75 to ∼100 bpm, while Measurement 12 has HRs ranging from ∼160 to ∼175 bpm. From the scatter plots, it is observed that there is a significant correlation (i.e., linear association) between the RHR and EHR in all measurements ( p < 0.05). Looking at the bias (ME) in the scatter plots, we observe the bias in HR extraction to be negligible in all measurements. Looking at the LoA in the scatter plots, we observe it to be small in all measurements (exceptions: Measurements 4,5,9,12,and 19).
The quantitative measures showing the agreement and linear association between the RHR and EHR signals for each measurement are reported in Table I. The ME (or bias) is lower than 1 bpm in all measurements except for Measurement 4 Fig. 4. Scatter plots showing the association between the RHR and the EHR by the NHR algorithm per measurement. The y and x axes represent the RHR and EHR in bpm, respectively. Each dot represents the HR in a 50 s segment. The blue dashed line represents the identity line (y = x line), while the full purple and red lines represent the bias (ME) and Bland-Altman LoA, respectively. The correlation between RHR and EHR in all measurements is significant ( p < 0.05%). The plot in the bottom right corner is a sample plot to illustrate the identity, bias, and LoA lines.
(ME = 3.7 bpm). The average and standard deviation of ME computed overall measurements are 0.2 and 1.1 bpm, respectively. The RMSE is lower than 4 bpm in all measurements, except for Measurements 4 and 19, which is approximately 6 bpm, with an average and standard deviation of both approximately 2 bpm. The LoA has an average and standard deviation of approximately 3 ± 3 bpm in all measurements, with the largest ones in Measurement 4 (9.2 bpm) and Measurement 19 (9.8 bpm). Likewise, the BAR has an average and standard deviation of 2.5 ± 2.2 bpm in all measurements, with the highest ones in Measurement 4 (7.2%) and Measurement 19 (8.2%). The BAR percentage of each measurement falls within the good agreement category (BAR < 10%) [32] between RHR and EHR. The EHR signal of each measurement showed a positive and significant ( p < 0.05) correlation of more than 74% with the RHR signal. Across all measurements, the average and standard deviation of Pearson's correlation is 91.1 ± 8.2%. The included segments' percentage of each measurement is more than 95%, except for Measurements 8, 15, and 19 (88%, 82%, and 67%, respectively). Across all measurements, the average and standard deviation of the percentage of included segments is 95.2 ± 8.2%, meaning that the NHR algorithm, on average, excluded approximately 5% of the signal segments, during which no HR computation could be performed. Fig. 5 shows an example from one of the NIRS measurements (Measurement 17) of the EHR signals extracted by using our proposed NHR algorithm (in green), the Perdue algorithm [22] (in blue), and the AMPD algorithm [21] (in purple). Our derived EHR signal is a better representation of the RHR (in red) that has properly followed the trend of the RHR signal from the beginning to the end of the measurement. In this measurement, Pearson's correlation between the RHR and the EHR computed by our proposed NHR algorithm is 99.3% ( p < 0.05), while it is 87.5% ( p < 0.05) by using the Perdue algorithm, and 93.2% ( p < 0.05) by using the AMPD algorithm. Fig. 6 illustrates the scatterplots between the RHR and EHR computed by the NHR algorithm [see Fig. 6 5. Example of the EHR signals by using the NHR algorithm (in green), the Perdue algorithm [22] (in blue), and the AMPD algorithm [21] (in purple), compared with the RHR signal (in red). The x and y axes represent the time in min and the HR in bpm, respectively. In this measurement (Measurement 17), Pearson's correlation between the RHR and EHR by the NHR algorithm is 99.3% ( p < 0.05), whereas it is 87.5% ( p < 0.05) by the Perdue algorithm and 93.2% ( p < 0.05) by the AMPD algorithm. Fig. 6. Scatter plots between the RHR and the EHR by (a) NHR algorithm, (b) Perdue algorithm [22], (c) AMPD algorithm [21], and (d) Spectrum algorithm, for all the measurements concatenated together. The y and x axes in each plot represent the RHR and EHR in bpm, respectively. Each dot represents the HR in a 50 s segment. The blue dashed line represents the identity line (y = x line), while the full purple and red lines represent the bias (ME) and Bland-Altman LoA, respectively. The cyan lines represent the defined agreement limit. They are in 20% of the mean of the pairwise HRs (RHR and EHR) distance from the identity line.

B. Comparison With the Existing Algorithms
algorithms are more sparsely distributed around the identity line (y = x line) than the one for the proposed NHR algorithm in this study. Likewise, compared with the NHR algorithm (four points), the Perdue (574 points), AMPD (847 points), and Spectrum (255 points) algorithms have much more data points outside the defined agreement limit (cyan lines), indicating errors of more than 20% of the pairwise HRs mean. In addition, the Perdue, AMPD, and Spectrum algorithms are observed to have a bias in computing HR, e.g., the Perdue algorithm toward a higher value (i.e., more data points below the identity line) and the AMPD and Spectrum algorithms toward a lower value (i.e., more data points above the identity line) than the RHR. On the contrary, this is not the case for the NHR algorithm as the bias seems to be negligible. Furthermore, the Perdue, AMPD, and Spectrum algorithms are observed to be unable to adaptively capture different HR ranges, e.g., the Perdue algorithm in the RHR approximately lower than 100 bpm, and AMPD in the RHR approximately between 100 and 150 bpm. Conversely, the NHR algorithm seems to be adaptable to the different ranges of HR, i.e., leaving the dots close to the identity line in the scatterplot for the whole HR range [see Fig. 6(a)]. The quantitative measures showing the agreement and linear association between RHR and EHR computed by the four algorithms in all measurements concatenated together are reported in Table II. The ME (or bias) is higher for the Perdue (ME = −5.3 bpm) and AMPD (6.8 bpm) algorithms than that for the Spectrum algorithm (1.8 bpm) and NHR algorithm (0.2 bpm). The RMSE is approximately 17, 21, and 10.5 bpm by the Perdue, AMPD, and Spectrum algorithms, respectively, whereas it is as low as 2.4 bpm by the NHR algorithm. The LoA is higher for the Perdue (LoA = 31.5 bpm), AMPD (39.0 bpm), and Spectrum (20.3 bpm) algorithms than that for the NHR algorithm (4.6 bpm); this was also observed in the scatter plots shown in Fig. 6. Looking at the BAR percentages, we obtained a good agreement (BAR < 10%) between RHR and EHR for the NHR algorithm, whereas the agreement between RHR and EHR for the Perdue and AMPD algorithms was insufficient (BAR > 20%) and for the Spectrum algorithm was moderate (10% < BAR < 20%). Pearson's correlation between the RHR and EHR for all algorithms was significant ( p < 0.05); however, the correlation obtained by the NHR algorithm was approximately 28%, 26%, and 15.8% greater than the one obtained by the Perdue (71.8%), AMPD (73.2%), and Spectrum (83.7%) algorithms, respectively. Fig. 7 illustrates the raincloud plot of the absolute error between RHR and EHR signals obtained for the four algorithms. In this figure, we observe that the majority of the errors by the NHR algorithm are about zero, whereas the errors made by the Perdue, AMPD, and Spectrum algorithms are more sparsely distributed toward higher values, leaving an additional peak close to the greatest error. This additional peak in the raincloud plot for the Perdue algorithm is due to the huge error in computing HRs lower than approximately 100 bpm, as was observed in Fig. 6. Conversely, the additional peak in the raincloud plots for the AMPD and Spectrum algorithms is due to the error observed in computing HRs above approximately 100 bpm. The result of the statistical comparison (Student's t-test with α = 5%) between the performance of four algorithms exhibits that there is a significant difference ( p < 0.05) between the performance of the algorithms against each other. Table III summarizes the quantitative measures computed for assessing the performance of the Perdue, AMPD, and Spectrum algorithms per measurement.

C. Determining the Optimal Window Length
The optimal window length in the NHR algorithm was determined by changing the window length of the sliding window approach implemented in the segmentation step. Fig. 8 illustrates the Pearson's correlation computed between RHR and EHR with respect to the window length. It is generally observed that by selecting a greater window length, a greater correlation was achieved. Any window length greater than or equal to 30 s resulted in a correlation higher than 80%. Nevertheless, there was no statistically significant difference in the performance of the NHR algorithm when comparing window lengths greater than 50 s; therefore, we defined the shortest optimal window length to be 50 s, because it statistically performs better than shorter window lengths ( p < 0.05), but not worse than longer window lengths ( p > 0.05). Fig. 8. Pearson's correlation between the RHR and EHR when changing the window length in the sliding window approach implemented in the NHR algorithm. X and Y axes represent the window length in seconds and Pearson's correlation in percentage, respectively. From the shortest window length upward, the outcome of each window was compared statistically (Student's t-test with α = 5%) with the outcome of the next greater window length. The red and blue dots represent the significant cases ( p < 0.05) and the first insignificant case, respectively. The blue dashed line depicts the window length corresponding to the first insignificant case, indicating the shortest optimum window length.

VI. DISCUSSION
In this study, we developed a novel algorithm for extracting HR from the NIRS signals recorded from critically ill neonates admitted to the NICU. To the best of our knowledge, the proposed NHR algorithm is the first HR extraction algorithm developed on neonatal NIRS signals acquired in a clinical environment. The existing AMPD algorithm, introduced by Scholkmann et al. [21], was validated on signals recorded from healthy adults in a controlled environment [17], [18]. The existing Perdue algorithm, introduced by Perdue et al. [22], was developed on signals recorded from infants older than three months and in a controlled environment, whereas our proposed algorithm was developed on signals recorded from newborn infants admitted to NICU, i.e., in real-life clinical settings. Our results showed that the NHR algorithm provides a significantly ( p < 0.05) better performance in HR extraction from neonatal NIRS signals than the Perdue and AMPD algorithms (see Table II and Figs. 6 and 7). The NHR algorithm numerically showed a greater correlation (r = 99.5%, p < 0.05) between the EHR and RHR than the Perdue algorithm (r = 71.8%, p < 0.05), and the AMPD algorithm (r = 73.2%, p < 0.05). Furthermore, the NHR algorithm showed a higher agreement between the EHR and RHR than the Perdue and AMPD algorithms in terms of ME, RMSE, LoA, and BAR (see Table II).
Data acquisition in clinical settings has some challenges that have led us to use some novelties in the NHR algorithm by adding some processing steps in order to achieve a robust algorithm in HR extraction from neonatal NIRS. Compared with a controlled environment, the data recorded in a clinical environment contains more motion artifacts induced by patient movement, sensor replacement, and ventilator use. Hence, in this study, we have performed steps in the NHR algorithm to detect motion artifacts per segment and reform the contaminated NIRS signal (see Section III-B). Fig. 3 shows an example of a 50-s segment of the NIRS signal, which is contaminated with motion artifacts [see Fig. 3(a)]. This figure illustrates the IQR [see Fig. 3(f)], thresholded IQR [see Fig. 3(g)], the converted NIRS signal [see Fig. 3(b)], and the spectra of the converted signal [see Fig. 3(d)] and the original signal [see Fig. 3(c)], i.e., without any motion artifact detection and signal quality assessment. With the conversion, the spectrum was enhanced in the vicinity of the RHR frequency (1.66 Hz), and thereby, a more accurate HR was computed (1.65 versus 1.27 Hz). By using the proposed algorithm, we excluded only those signal segments where motion artifacts were detected in over 80% of the segment. Looking at the results reported in Table I, we are including the majority of the signal segments for the HR computation, i.e., on average, only excluding 5% of the 50-s segments, which were highly (>80%) contaminated with motion artifacts. Another challenge in clinical data is the low data quality, which is partly due to the restrictions of data acquisition in clinical settings. To overcome this challenge, we have performed three steps in the NHR algorithm to select the highest quality NIRS channel (see Section III-A), assess the NIRS signal quality per segment, and include the good quality signals in the analyses (see Section III-B). Another challenge in clinical data is the variability of the signal source, here, the variability of HR (see Fig. 4). To overcome this challenge, we have adaptively set the HR frequency bandwidth in each measurement in the NHR algorithm (see Section III-A). Our results of the comparison between the Spectrum and NHR algorithms showed the outperformance ( p < 0.05) of the NHR algorithm, and therefore, confirmed the efficacy of the processing steps involved in the NHR algorithm in the HR extraction with respect to the simple version of the NHR algorithm (i.e., Spectrum algorithm). For example, the additional processing steps not only improved the linear correlation (99.5% versus 83.7%) but also enhanced the agreement between the EHR and RHR in terms of ME (0.2 versus 1.8 bpm), RMSE (2.4 versus 10.5 bpm), LoA (4.6 versus 20.3 bpm), and BAR (3.6 versus 15.8%).
When an HR extraction algorithm is based on peak detection, such as the Perdue [22] and AMPD [17], [21] algorithms, there will be an unavoidable discretization error in HR computation. The NIRS signals recorded in this study were sampled at 100 Hz, whereas the NIRS signals recorded in the Scholkmann et al. [21] study and Perdue et al. [22] study Click or tap here to enter text.Click or tap here to enter text.were sampled at 10 Hz. The maximum error in peak detection with a 10 Hz sampling rate due to discretization error is 0.05 s, whereas it is 0.005 s with a 100 Hz sampling rate. The peak detection error yields an HR computation error depending on the HR range. For instance, if the HR is 65 bpm (= (60/interbeat interval = 0.92 s)), which is approximately the average resting state HR in male adults reported in [34], then the maximum error in HR computation due to discretization error with a 10 Hz sampling rate will be approximately 3 bpm ((60/0.92)−(60/0.92 + 0.05)), whereas it will be approximately 0.4 bpm with a 100 Hz sampling rate. Although this might not be a considerable error, it will be more pronounced in infant data. As an example, if the HR is 130 bpm, which is the average RHR in the neonates admitted in this study, the maximum error in HR computation with a 10 Hz sampling rate will be approximately 13 bpm, whereas it will be approximately 1 bpm with a 100 Hz sampling rate. This explains the importance of having a high sampling rate in the NIRS system when extracting HR signals in infants.
Another advantage of having the NIRS signal recorded with a high sampling rate (100 Hz) is that it provides the opportunity to have an accurate assessment of the signal quality, as well as motion artifacts. In this study, we have employed the SQI algorithm [30] to assess the NIRS signal quality. This algorithm quantifies the strength of the heartbeat components in the NIRS signals, representing the strength of the optode-scalp coupling. This assessment requires signals recorded with a high sampling rate, e.g., 50 Hz, as has been used for developing SQI. As an additional step, in this study, we have employed IQR to detect the occurrence of motion artifacts. A high sampling rate provides more samples in the window where the IQR is computed rather than a low sampling rate, and thereby, a more accurate representation of motion artifacts (with different frequency components) is obtained. The impact of employing these two assessments is shown in Fig. 3, which has led to a more accurate HR computation.
Although the results we obtained for the proposed NHR algorithm are promising, further improvements are possible. The NHR algorithm updates HR every 12.5 s, whereas the Perdue and AMPD algorithms could compute HR beat-bybeat. This delay in HR computation makes this algorithm ineffective in applications where short-term characteristics of HR are analyzed [35], [36]. The proposed NHR algorithm has been designed to be implemented offline on the NIRS measurements; however, an online version of the algorithm could be established with relatively few adjustments. These adjustments would be in Part I of the algorithm in steps of preprocessing and AHRFB. For instance, in the preprocessing step, when selecting the best channel, instead of computing the SQI for the whole measurement, we could compute it from the beginning of the measurement up to the current moment. This could be repeated at several-minute intervals, which depends on the variability of the signals. Moreover, instead of using SQI values from the beginning of the measurement, we could use the values in the last several minutes. Also, it is possible to store the SQI values computed in each segment within Part II of the algorithm, and then we could reuse them when selecting the best channel. Similarly, in the AHRFB step, the adaptive bandwidth could be calculated using the O2Hb signal either from the start of the measurement up until the present moment or during the last several minutes.
The NHR algorithm was validated in this study on a dataset of a total of 7834 HR samples (19 measurements), shown as dots in Figs. 4 and 6. Eliminating excessive HR monitoring sensors used in clinical settings, however, requires further validation of the algorithm on a more extensive dataset acquired from neonates in the clinical environment, especially extremely preterm infants. Looking at the public datasets, to the best of our knowledge, there is no public dataset, including raw NIRS data with a high sampling rate, as well  as RHR, which were recorded from newborn infants in a clinical environment; therefore, we were unable to validate the proposed algorithm on a larger and more extensive dataset than the one employed in this study. Furthermore, we were unable to use the public PPG datasets as the proposed algorithm with the defined processing steps cannot be applied to the PPG data, i.e., we have used methods in the proposed algorithm for the assessment of signal quality and motion artifacts that were proposed and validated on NIRS data.
Extracting HR from NIRS signals could have several advantages over traditional ECG and PPG methods. First, NIRS is a noninvasive technique that does not require electrodes or sensors to be adhesively attached to the skin. This could be particularly beneficial for fragile newborns, especially extremely preterm infants, in the NICU who have very delicate skin, especially in the first few days after birth [17]. Routinely placing and removing the adhesive sensors on the skin in these fragile newborns are known to induce discomfort, stress, and pain, which are reported to be linked with poor neurodevelopmental outcomes [14], [15], [16]. Second, in neonates, the PPG signals recorded in the periphery are more prone to motion artifacts than NIRS signals recorded on the forehead. This is because hand movements in newborn infants are more frequent than head movements [16]. So that some studies have shown that newborn infants in the supine position tend to lie with their heads turned to the right, accounting for 70%-85% of their lying time [37], [38], [39]. Moreover, it has been shown that the pulse oximeter sensor placed over the forehead has more sensitivity to arterial pulsatile changes under low perfusion conditions than the sensor placed over peripheral body locations [40]. This is due to the thin-skin layer of the forehead, which is coupled with a prominent bone structure that helps to direct light back to the photodetector [41]. Third, two existing NIRS studies of stress assessment on adults have shown that the HR (i.e., pulse rate) derived from NIRS is a richer source of information than the HR derived from PPG and ECG under physical and mental stress conditions [17], [18]. Hence, the HR signal derived from NIRS might be more beneficial in both research and clinical environments for the assessment of the autonomic nervous system; however, more studies need to be conducted to confirm the greater richness of the NIRS-derived HR. Fourth, NIRS provides continuous monitoring of cerebral blood flow and oxygenation [3]; therefore, extracting HR from NIRS could help clinicians to simultaneously measure changes in perfusion, oxygenation, and HR, and therefore, better understand the physiological mechanisms underlying different health conditions and interventions in newborns. Last, NIRS devices are generally portable and easy to use [10], which could make them a convenient option for monitoring HR and other physiological variables in the NICU. This could be particularly beneficial in resource-limited settings or in situations where rapid, noninvasive monitoring is necessary.
This study aimed to harness the richness of high sampling rate NIRS signals that are recorded in clinical settings, i.e., taking advantage of the heartbeat information in NIRS and proposing a robust algorithm for the HR extraction from clinical neonatal data. Although the proposed NHR algorithm has more delay in computing HR than the existing Perdue and AMPD algorithms, i.e., 12.5 s with respect to HR computation beat-by-beat, the NHR algorithm showed significantly ( p < 0.05) a greater accuracy in extracting HR from neonatal NIRS data. A NIRS system with the NHR algorithm provides the opportunity to have synchronized HR and cerebral oxygenation. NIRS-derived cerebral oxygenation is known as a surrogate for cerebral perfusion [42], and HR has been proposed as a considerable alternative for blood pressure in preterm infants [43]; therefore, a high sampling rate NIRS system with the NHR algorithm could be used potentially as a future standalone system for assessing cerebrovascular autoregulation in preterm infants. Furthermore, the feasibility of the NIRS cerebral oxygenation and HR have been supported by a growing body of research in optimizing monitoring in newborns with hypoxic-ischemic encephalopathy, adding to neurodevelopmental outcome prediction and also in assessing sleep stages in infants [6], [7], [44], [45], [46], [47], [48], [49], [50], [51], [52], [53], [54]; therefore, the fusion of the cerebral oxygenation and HR provided by the standalone NIRS system could be investigated in infants as a future study.

VII. CONCLUSION
In this study, we developed a novel algorithm, NHR, for extracting HR from clinical NIRS signals recorded in neonates. The results numerically showed the compatibility of the HR extracted from NIRS with the RHR, measured with ECG, in terms of Pearson's correlation (r > 0.9) and BAR (<5%). The NHR algorithm significantly outperformed two existing algorithms, i.e., Perdue and AMPD algorithms. The promising results achieved suggest that the feasibility of the NHR algorithm could be exploited in clinical applications where a combination of NIRS and HR monitoring can contribute to monitoring the vulnerable newborn brain. Naser Hakimi received the B.Sc. degree from Shahid Beheshti University, Tehran, Iran, in 2015, and the M.Sc. degree from the University of Tehran, Tehran, in 2018, both in electrical and biomedical engineering.
His main research has been on near-infrared spectroscopy (NIRS) signal analysis in adults and infants, including physiological information (e.g., heart rate and respiratory rate) extraction from NIRS, NIRS signal quality assessment, stress, and sleep assessment.
Jörn M. Horschig received the Bachelor of Science degree (cum laude) in knowledge engineering | computer science from Maastricht University, Maastricht, The Netherlands, in 2008, and the Master of Science (bene meritum) and Ph.D. degrees in cognitive neuroscience from the Donders Institute for Brain, Behavior and Cognition, Radboud University, Nijmegen, The Netherlands, in 2010 and 2014, respectively.
After a short-term postdoctoral position with the Donders Institute, he started working as a Software Developer and a Project Leader with Artinis Medical Systems B.V., Elst, The Netherlands, where he is currently a Software Manager and a Supervisor of the Science and Research Team. His research interests include the development of algorithms to extract and predict physiological parameters from near-infrared spectroscopy (NIRS) data.
Thomas Alderliesten received the Ph.D. degree in monitoring the oxygenation of the preterm brain from Utrecht University, Utrecht, The Netherlands, in 2016.
He is a Pediatrician and a fellow in neonatal and pediatric intensive care, and an expert in neonatal neuromonitoring and neonatal neuroimaging. He currently works with the Wilhelmina Children's Hospital, Utrecht, on both the neonatal and pediatric intensive care units.
Mathijs Bronkhorst received the master's degree in biomedical engineering from the University of Twente, Enschede, The Netherlands, in 2016, with a focus on clinical physics.
He was a Visiting Scholar with the University of New South Wales, Sydney, NSW, Australia. He joined Artinis Medical Systems B.V., Elst, The Netherlands, working on the development of near-infrared spectroscopy (NIRS) devices with a special interest in the translation of NIRS to the clinical setting. Among others, he led the development of a neonatal bedside neuromonitoring device (Eurostars, Neuroguard) and an orthostatic hypotension detection device (European Fund for Regional Development (EFRO), ProHealth). He is currently the Manager of Product Development with Artinis Medical Systems B.V. He became one of the leading scientists in the field of near-infrared spectroscopy (NIRS). Currently, he is a CEO with Artinis Medical Systems B.V., Elst, The Netherlands, the world's leading company in wireless and portable NIRS devices.
Jeroen Dudink is currently a Neonatologist, an Associate Professor, and a Clinical Scientist with the Wilhelmina Children's Hospital, Utrecht, The Netherlands. He is an expert in advanced neonatal neuromonitoring and neuroimaging, focusing on neonatal sleep and neonatal cerebellar development.