Respiratory Volume Monitoring: A Machine-Learning Approach to the Non-Invasive Prediction of Tidal Volume and Minute Ventilation

Continuous monitoring of ventilatory parameters such as tidal volume (TV) and minute ventilation (MV) has been shown to be effective in the prevention of respiratory compromise events in hospitalized patients. However, the non-invasive estimation of respiratory volume in non-intubated patients remains an outstanding challenge. In this work, we present a novel approach to respiratory volume monitoring (RVM) that continuously predicts TV and MV in normal subjects. Respiratory flow in 19 volunteers under spontaneous breathing was recorded using respiratory inductance plethysmography and a temperature-based wearable sensor. Temperature signals were processed to identify features such as temperature amplitude and mean value, among others. The feature datasets were then used to train and validate three machine-learning (ML) algorithms for the prediction of respiratory volume based on temperature-related features. A model based on Random-Forest regression resulted in the lowest root mean-square error and was subsequently chosen to predict ventilatory parameters on subject test data not used in the construction of the model. Our predictions achieve a bias (mean error) in TV and MV of 16.04 mL and 0.19 L/min, respectively, which compares well with performance metrics reported for commercially available RVM systems based on electrical impedance. Our results show that the combination of novel respiratory temperature sensors and machine-learning algorithms can deliver accurate and continuous estimates of TV and MV in healthy subjects.


I. INTRODUCTION
The development of respiratory monitoring systems has received continuous attention, as respiratory parameters such as the rate of breathing are routinely assessed in the physical examination of patients [1], [2]. Abnormal values in respiratory parameters are currently recognized as early signs of patient deterioration, and have long been associated with respiratory failure in non-intubated patients [3], in-hospital cardiac arrest [4], and post-anesthesia respiratory depression [5], among others.
The associate editor coordinating the review of this manuscript and approving it for publication was Essam A. Rashed.
Respiratory monitoring is a common practice in intubated patients undergoing mechanical ventilation, but remains an open challenge in non-intubated patients who breathe spontaneously [6]. Current gold-standard methods for non-intubated patients available in the clinical setting are based on capnometry and impedance pneumography systems, which provide continuous estimates of the respiratory rate (RR). However, they require the use of nasal cannulas and chest electrodes that may not be well tolerated by some patients, and that typically require a wired connection to external devices for analysis, which has hindered their widespread adoption [7]. More recently, RR has been estimated in patients by means of advanced processing of signals acquired from photoplethysmography [8], [9], video recording [10], [11] and thermal imaging [12], all of which deliver a less invasive and more comfortable experience to the patient. For a review of recent contactless respiratory monitoring systems, see [13].
The large majority of current monitoring technologies for non-intubated patients are focused on the estimation of RR. While useful, RR provides an incomplete assessment of the ventilatory condition of the patient, as it cannot offer a measure of the air available for gas transfer in the lungs. In effect, parameters such as tidal volume (TV), defined as the volume of air during one inspiration/expiration cycle, and minute ventilation (MV), defined as the total volume of air inspired/expired during one minute, are better suited to describe the ventilatory status of the patient [14]. These parameters, which are standard in patients connected to mechanical ventilation, have been shown to be reliable predictors of adverse respiratory events in non-intubated patients. For example, recent studies have shown that monitoring the evolution of MV, and not RR alone, can be effectively used to anticipate respiratory depression events in patients who are discharged from post-anesthesia care units [15] and hypoventilation in sedated patients undergoing gastrointestinal interventions [16].
The relevance of monitoring TV and MV in delivering early warnings of respiratory complications has motivated the development of respiratory volume monitors (RVMs) for non-intubated patients. Current RVM technologies rely on measuring electrical impedance changes of the thorax, which are then analyzed to estimate the MV and TV [17]. Impedance-based RVMs have been extensively validated in the clinical setting [18], but they require a calibration step using spirometry to enable accurate predictions of TV and MV [17]. Further, impedance-based respiratory monitors are subject to motion artifacts that may hinder their operation [19]. More recently, respiratory monitoring has been approached using disposable strain sensors [20] and over-clothing radio-frequency sensors [21], which have shown promising results in the estimation of respiratory volume in healthy volunteers under spontaneous breathing. However, these systems also require individual calibration to determine model parameters that are specific to the user.
In this work, we present and validate an innovative RVM for non-intubated subjects. To this end, we use a novel non-invasive wearable temperature sensor that allows for the time-continuous acquisition of respiratory signals [22]. Further, we employ state-of-the-art machine-learning (ML) algorithms in the construction of a predictive model for ventilatory parameters. The main motivation behind this work is to develop an accurate and validated RVM system for the continuous estimation of ventilatory parameters in a non-invasive way. Further, the use of ML algorithms seeks to provide accurate ventilatory estimates that are valid for a group of users, rather than for a specific subject, potentially eliminating the need for individual calibration.
This article is organized as follows. In Section II we describe the sample of volunteers, the experimental setup, and the protocols for the acquisition of respiratory signals. In addition, we describe the ML algorithms considered in this study, and state the error and performance metrics employed in the analysis of the results. In Section III we present the main results of the study, along with a performance assessment of the best model for RVM. We close this article in Section IV by discussing the main results and comparing them with other studies reported in the literature, as well as analyzing the current limitations and future extensions of this work.

II. METHODS
A schematic summarizing the proposed work and methods is included in Figure 1.

A. RESPIRATORY DATA ACQUISITION
Nineteen healthy human subjects were recruited for this study, signed an informed consent, and completed a protocol approved by the Institutional Ethics Committee of the Pontificia Universidad Católica de Chile. The inclusion criteria were: subjects between 18 and 65 years old, nonsmokers, and no record of chronic pulmonary disease or sleep apnea. The anthropometric data for the subject group is included in Table 1. A non-invasive respiratory sensor, described below, and respiratory inductance plethysmography (RIP) bands were installed on all subjects before directing them to lie down in the supine position; see Figure 2 for a schematic of the experimental setup. Subjects were then asked to breathe normally for at least 10 minutes, during which respiratory data was acquired.
VOLUME 8, 2020
Oral and nasal respiratory signals were acquired using the non-invasive temperature-based respiratory monitoring system (TRMS) described in [22]. Figure 2 shows a schematic of the instrumentation setup. The system comprises a wearable respiratory sensor and an external monitor that reports the respiratory signal and other flow parameters in real time. The respiratory sensor was installed below the nose and above the upper lip of the user, and collected information about the airflow by sensing the temperature of the respiratory airflow going in and out of the nostrils and the mouth. Signals from the sensor were transmitted to an external monitor for processing and storage. We note that no calibration was necessary before acquiring respiratory signals with the TRMS. In parallel, we measured respiratory activity using a BioRadio™ physiological signal acquisition system for RIP (Great Lakes NeuroTechnologies, Cleveland, OH, USA). To this end, two thoracic bands were installed on each subject before the data acquisition. The calibration of the BioRadio system was carried out on each subject using the spirometry function simultaneously with the RIP bands to measure 3 minutes of spontaneous breathing, after which the mouthpiece was removed. A linear relation between the spirometry and the RIP-band signals was established using multiple linear regression, which allowed for continuous volume monitoring based on the RIP signal.

B. FEATURE EXTRACTION
Respiratory cycles and RR were identified from the TRMS signal using a mean-cross algorithm [22]. In brief, a 5-second-window moving average was applied to the respiratory temperature signal to obtain a filtered signal. The time instants at which the original and filtered signals intersect were determined, and only those points where the original signal increased faster than the filtered signal were selected. Consecutive selected points defined the respiratory cycle period, from which the RR was computed as the inverse of the period. For each respiratory cycle the following eight input variables (features) were determined: nasal temperature amplitude (AC_n), nasal mean temperature (DC_n), nasal expiratory mean rate (MR_n), oral temperature amplitude (AC_o), oral mean temperature (DC_o), oral expiratory mean rate (MR_o), rise time (RT), and ambient temperature (AT). The tidal volume (TV) for each respiratory cycle and the minute ventilation (MV) were determined from the volume measurements reported by the RIP system. A total of 2252 respiratory cycles were obtained from the subject experiments. This dataset was randomly partitioned into a training dataset containing 80% of the total data (N_train = 1801) and a test dataset containing 20% of the total data (N_test = 451). A correlation analysis between the eight input variables was carried out using the training dataset to detect collinearity.

C. MACHINE-LEARNING ALGORITHMS AND TRAINING FOR THE PREDICTION OF TIDAL VOLUME AND MINUTE VENTILATION
Three supervised regression algorithms for the prediction of TV were considered: linear regression (LR), support vector regression (SVR) and random forest regression (RFR) [23]. Each model was trained by performing a k-fold cross-validation (k = 10) using the training dataset. The training performance was assessed in terms of the root mean squared error (RMSE), defined as
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\widehat{TV}_i - TV_i\right)^2}, \tag{1}$$
where $\widehat{TV}_i$ corresponds to the prediction of the model for the i-th cycle, $TV_i$ is the target output for the same cycle, and N is the total number of cycles in the training dataset.
After the validation of the 10 folds, the average and standard deviation of the obtained RMSE values were computed for each algorithm. Based on these results, we selected the best algorithm as the one whose model delivered the lowest RMSE on average. For the selected model, the tuning of hyperparameters was performed by carrying out a k-fold cross-validation (k = 10) using the training dataset. Several combinations of hyperparameters were analyzed, from which we computed the mean and standard deviation of the RMSE to select the best hyperparameter combination based on the lowest RMSE.
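The selection procedure above (10-fold cross-validation, mean and standard deviation of the RMSE per algorithm, lowest mean wins) can be sketched in plain Python. The two candidate models below, a constant-mean baseline and a single-feature least-squares line, are hypothetical stand-ins for LR, SVR and RFR, and the data are synthetic; only the bookkeeping mirrors the text.

```python
import math
import random
import statistics

def rmse(predicted, target):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target))

def k_fold_rmse(x, y, fit, k=10, seed=0):
    """Mean and standard deviation of the validation RMSE over k folds.

    `fit(x_train, y_train)` must return a function mapping one input to a prediction.
    """
    idx = list(range(len(x)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint folds of near-equal size
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        model = fit([x[i] for i in train], [y[i] for i in train])
        scores.append(rmse([model(x[i]) for i in held_out], [y[i] for i in held_out]))
    return statistics.mean(scores), statistics.stdev(scores)

def fit_mean(x_train, y_train):
    """Baseline stand-in: always predict the training mean."""
    mu = statistics.mean(y_train)
    return lambda _x: mu

def fit_line(x_train, y_train):
    """Stand-in regressor: ordinary least squares with a single feature."""
    mx, my = statistics.mean(x_train), statistics.mean(y_train)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x_train, y_train))
    sxx = sum((a - mx) ** 2 for a in x_train)
    slope = sxy / sxx
    return lambda x: my + slope * (x - mx)

# Hypothetical data: one feature and a noisy linear target (not the study data).
rng = random.Random(1)
x = [rng.uniform(0.1, 1.0) for _ in range(200)]
y = [0.1 + 0.6 * xi + rng.gauss(0.0, 0.02) for xi in x]

candidates = {"mean": fit_mean, "line": fit_line}
results = {name: k_fold_rmse(x, y, fit) for name, fit in candidates.items()}
best = min(results, key=lambda name: results[name][0])  # lowest mean RMSE
```

The same loop applies to hyperparameter tuning: each hyperparameter combination is treated as a separate candidate and ranked by its mean cross-validated RMSE.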
To further understand how the size of the training dataset affected the model predictions, we performed a learning curve analysis, where the chosen model is trained using different sizes of the training dataset. Using the chosen model for TV, we estimated the minute ventilation for the i-th respiratory cycle, $\widehat{MV}_i$, from the predicted tidal volumes of the cycles in $S_i$ (2), where $S_i$ is the set of respiratory cycles that fall within a 30-second window that ends with the i-th respiratory cycle.
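As an illustration of the windowed MV estimate, the sketch below sums the tidal volumes of the cycles in the trailing 30-second window and scales the sum to one minute. The scaling factor 60/30 and the rule that a cycle belongs to the window if its end time falls inside it are assumptions; the function and variable names are hypothetical.

```python
def minute_ventilation(end_times_s, tidal_volumes_l, window_s=30.0):
    """Per-cycle MV estimate (L/min) from the cycles ending in the trailing window."""
    scale = 60.0 / window_s  # convert a volume per window into a volume per minute
    mv = []
    for i, t_end in enumerate(end_times_s):
        total = sum(
            tv
            for t, tv in zip(end_times_s[: i + 1], tidal_volumes_l[: i + 1])
            if t > t_end - window_s  # cycle i itself is always included
        )
        mv.append(scale * total)
    return mv

# Hypothetical record: 12 breaths/min (one cycle every 5 s) at 0.5 L per breath.
end_times = [5.0 * (n + 1) for n in range(12)]
tidal_volumes = [0.5] * 12
mv = minute_ventilation(end_times, tidal_volumes)
```

Once the window is full, the estimate for this record settles at 6 L/min, which equals 12 breaths/min times 0.5 L. Early cycles underestimate MV because fewer than 30 seconds of data are available.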

D. PERFORMANCE ASSESSMENT OF TV AND MV PREDICTION AND STATISTICAL ANALYSIS
The best model for the TV estimation was assessed using the test dataset. To this end, we computed the RMSE and performed a Bland-Altman analysis on the predictions of both TV and MV. Let $\hat{X}_i$ be the model prediction for the i-th respiratory cycle and $X_i$ be the benchmark value measured by the RIP system, which was considered the gold standard in this study. We defined the cycle error between $\hat{X}_i$ and $X_i$ in (3), and then determined the absolute bias, precision and accuracy of the model estimation from (4), (5) and (6), respectively. Relative bias, precision and accuracy were also computed by redefining the error in relative terms in (7) and using (7) in evaluating (4), (5) and (6). Bland-Altman plots were generated to visually assess the agreement between the selected model and the reference for the TV and MV estimations.
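The displayed forms of (3) to (7) are not reproduced in the text above. A common set of definitions that is consistent with it is given below as an assumption, not as the authors' exact expressions. Only the reading of bias as the mean error is confirmed by the abstract; the sign convention of the error, precision as the standard deviation of the error, and accuracy as the root mean squared error are conventions assumed here.

```latex
\begin{align}
e_i &= \hat{X}_i - X_i \tag{3}\\
\mathrm{Bias} &= \frac{1}{N}\sum_{i=1}^{N} e_i \tag{4}\\
\mathrm{Precision} &= \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(e_i - \mathrm{Bias}\right)^2} \tag{5}\\
\mathrm{Accuracy} &= \sqrt{\frac{1}{N}\sum_{i=1}^{N} e_i^{2}} \tag{6}\\
e_i^{\mathrm{rel}} &= \frac{\hat{X}_i - X_i}{X_i} \tag{7}
\end{align}
```

Under these assumed definitions the three absolute metrics are linked by Accuracy^2 = Bias^2 + Precision^2, so any two of them determine the third.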

III. RESULTS
The correlation matrix for the input features is shown in Figure 3, where we observe that the absolute correlation between different features does not exceed 0.8. The highest correlation was found for the case of AC_o and MR_o, followed by the case of AC_n and MR_n with a correlation of 0.7. Despite this high correlation, we kept MR_o and MR_n as input features in the model as they are important physiological quantities that inform us about the rate of the breathing process, which is not contained in the AC_o or AC_n values. As a sensitivity analysis, for the final model we also considered a reduced set of input features selected so that the correlation between variables did not exceed 0.5. As a result, the reduced set included the input variables AC_n, AC_o, DC_n, DC_o and AT.
The RMSE values from the k-fold cross-validation analysis (k = 10) performed for the LR, SVR and RFR models are reported in Table 2. The RFR model resulted in the lowest RMSE value, and therefore was the only one considered for subsequent analysis. Hyperparameter tuning for the RFR model considered varying the sampling method (sampling data with or without replacement, bootstrap = (True, False)), the number of trees in the forest (N_est = (10, 20, 50, 100, 200)), the maximum number of levels in each decision tree (max_depth = (None, 50, 100, 200, 500)), the minimum number of samples needed to split an internal node (min_split = (2, 5, 20, 50, 100)), the minimum number of samples for the leaf nodes (min_leaf = (1, 2, 10, 20, 50)), the maximum number of features considered for splitting a node (max_feat = (auto, sqrt, log2, None)), and the complexity parameter for minimal cost-complexity pruning (ccp_α = (0.0, 1e-6, 1e-4, 1e-2, 1)). In total, we examined 25,000 combinations of hyperparameters. For each parameter set, a k-fold cross-validation analysis (k = 10) was performed using the training dataset, and the RMSE was calculated. The optimal set of hyperparameters found from this procedure is reported in Table 3, where we note that the average RMSE was lower than that obtained by the initial RFR model during the model selection step, see Table 2. Further, the optimal parameters and RMSE that resulted from training an RFR model that considers a reduced set of input features (RFR2) are also included in Table 3. Table 4 reports the performance assessment of the estimations of TV and MV based on the trained RFR model, where the bias, precision and accuracy of the prediction of both ventilatory parameters are evaluated using the test dataset. The performance metrics of the reduced RFR2 model are also included. Figures 4 and 5 show the Bland-Altman plots for the assessment of TV and MV predictions, respectively.
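The count of 25,000 hyperparameter combinations follows directly from the candidate lists given above (2 x 5 x 5 x 5 x 5 x 4 x 5). The short sketch below enumerates the full grid to check this. The dictionary keys are shorthand labels chosen for the example and are not claimed to be the argument names of any particular library.

```python
from itertools import product

# Candidate values exactly as listed in the text; keys are illustrative labels.
grid = {
    "bootstrap": [True, False],
    "n_est": [10, 20, 50, 100, 200],
    "max_depth": [None, 50, 100, 200, 500],
    "min_split": [2, 5, 20, 50, 100],
    "min_leaf": [1, 2, 10, 20, 50],
    "max_feat": ["auto", "sqrt", "log2", None],
    "ccp_alpha": [0.0, 1e-6, 1e-4, 1e-2, 1],
}

# Cartesian product of all candidate lists, one dict per combination.
combinations = [dict(zip(grid, values)) for values in product(*grid.values())]
n_combinations = len(combinations)

# With 10-fold cross-validation, every combination is fitted once per fold.
n_model_fits = n_combinations * 10
```

An exhaustive search over this grid therefore requires 250,000 model fits, which is why the grid is kept coarse in each dimension.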
Figures 6 and 7 show the scatterplots of the predicted and measured TV and MV, respectively. In both cases, Pearson's correlation coefficient was greater than 0.9, confirming the substantial agreement between predicted and measured ventilatory parameters. The time evolution of the RR, TV and MV for a representative subject along with the predictions of the RFR model are shown in Figure 8.
To understand the contribution of input features in the RFR model predictions, a Random-Forest variable importance analysis was performed [23], see Figure 9. Ambient temperature was the most relevant feature in the prediction of TV, followed by the average nasal and oral temperatures and the nasal temperature amplitude. These four features added up to 73% of the cumulative normalized importance. We also observed that features AC_o, RT and MR_o were the least influential features in the model as they accounted altogether for less than 8% of the cumulative normalized importance.
The learning curve that assesses the evolution of the training and validation errors of the RFR model as a function of the dataset size is reported in Figure 10. The training error resulted in a small and relatively constant value (RMSE = 1.6e-05 ± 1.3e-06 [L]) that did not depend on the size of the dataset. In contrast, the validation error displayed a decreasing trend as the dataset size was increased, but did not reach the training error level for the largest dataset size.
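The qualitative pattern described above (a near-zero training error that does not depend on the dataset size, and a validation error that decreases with more data without closing the gap) can be reproduced with a toy learning curve. The sketch below uses a 1-nearest-neighbour regressor as a hypothetical stand-in for a model that memorizes its training set. It is not the RFR model of this study, and the data are synthetic.

```python
import math
import random

def rmse(pred, target):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target))

def nearest_neighbour(x_train, y_train):
    """1-nearest-neighbour regressor: returns the target of the closest training point."""
    def predict(x):
        j = min(range(len(x_train)), key=lambda i: abs(x_train[i] - x))
        return y_train[j]
    return predict

rng = random.Random(0)

def sample(n):
    """Draw n synthetic (feature, target) pairs from a noisy sine curve."""
    xs = [rng.uniform(0.0, 6.0) for _ in range(n)]
    ys = [math.sin(x) + rng.gauss(0.0, 0.01) for x in xs]
    return xs, ys

x_pool, y_pool = sample(400)  # pool from which training subsets are taken
x_val, y_val = sample(200)    # fixed validation set

curve = []
for n in (10, 50, 100, 200, 400):
    model = nearest_neighbour(x_pool[:n], y_pool[:n])
    train_err = rmse([model(x) for x in x_pool[:n]], y_pool[:n])
    val_err = rmse([model(x) for x in x_val], y_val)
    curve.append((n, train_err, val_err))
```

Each entry of `curve` holds the training-set size with the corresponding training and validation RMSE, which is the information plotted in a learning curve.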

IV. DISCUSSION
In this work we study the performance of a non-invasive system for respiratory volume monitoring. The estimation of ventilatory parameters was approached using three classical machine-learning methods. After training and validation, the best model was able to deliver a continuous prediction of TV and MV in human subjects, see Figure 8.
Eight features were computed from the temperature signals obtained from the TRMS to estimate the TV and MV. An initial correlation analysis resulted in pairs (AC_o, MR_o) and (AC_n, MR_n) being the most correlated ones. This high correlation can be partially explained for subjects under spontaneous breathing in resting conditions, where the respiratory signal approaches a triangular waveform with constant frequency. In this case, the expiratory mean rate can be approximated as the signal amplitude divided by the expiratory time, which establishes a direct relation between AC and MR values. The remaining features did not exhibit high correlations.
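The relation between AC and MR can be made explicit with a short derivation. It assumes that MR is the time average of the temperature rate dT/dt over the expiratory phase of duration T_e, that the temperature rises over the full amplitude AC during that phase, and, for the last step, that the triangular wave is symmetric with breathing frequency f. The symbols T_e, t_0 and f are introduced here for illustration only.

```latex
\begin{align}
MR &= \frac{1}{T_e}\int_{t_0}^{t_0+T_e}\frac{dT}{dt}\,dt
    = \frac{T(t_0+T_e)-T(t_0)}{T_e}
    = \frac{AC}{T_e},\\
T_e &= \frac{1}{2f}
    \quad\Longrightarrow\quad
    MR = 2f\,AC .
\end{align}
```

At a constant breathing frequency MR is therefore proportional to AC, which is consistent with the high correlation observed within each pair.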
During the training process, the RFR model delivered a better performance than the LR and SVR models in the prediction of TV, see Table 2. The LR model resulted in RMSE values that were twice those of the RFR model, which suggests that the relationship between input features and TV was not linear. The hyperparameter tuning step improved the model performance as the RMSE was reduced by roughly 6%, see Table 3.
The performance assessment resulted in bias values for TV and MV of 16 mL and 0.19 L/min, respectively, see Table 4. The relation between predicted and measured variables resulted in a Pearson's coefficient of 0.92 and 0.96 for the TV and MV, respectively, see Figures 6 and 7. Table 5 presents a comparison of these performance metrics with other methods for RVM reported in the literature. Our results compare well with other technologies that predict TV in terms of bias, precision and Pearson's coefficient [20], [21], [24]. However, it is worth noting that the performance reported by other methods is evaluated on systems that include an individual calibration step and are later assessed on the same subject. In our work, the ML model is trained using the group training dataset, which is subject to user variability, and later evaluated using the group test dataset. While the use of aggregate data typically hinders the performance of the predictions, here we have shown that our ML model achieves a performance that is equivalent to, and sometimes better than, that of current state-of-the-art methods based on individual calibration.
When comparing the performance in the estimation of MV, our system also compares well to commercially available impedance-based RVMs [17], see Table 5. Interestingly, relative bias and precision values obtained in our study (Table 4) are higher than those reported in [17]. This can be explained by the higher range of TV and MV values measured in that study, which ranged between 500-1600 mL and 7-17 L/min. In contrast, the ranges of TV and MV values analyzed here were 50-1200 mL and 1-20 L/min, see Figures 4 and 5, which include a lower range of values not validated by previous studies. This result is noteworthy, as the ability to accurately predict MV in the low range is particularly important in the detection of hypoventilation and respiratory depression events in hospitalized patients [15].
During the development and analysis of the ML model, we identified key variables in the prediction of ventilatory parameters from temperature airflow signals. In particular, the average temperature and temperature amplitude were among the most influential input features in the estimation of TV, as shown in Figure 9. It is important to note that the amount of heat transfer occurring at the nasal and oral sensors is affected by the room temperature that surrounds the nasal sensor, which may explain why AT takes such relevance in the prediction of TV by the RFR model. Further, we note that input features that describe oral breathing (AC_o, MR_o) did not seem to be relevant in the prediction of ventilatory parameters. We note, however, that this result may be due to the use of a limited training dataset, as a subsequent analysis of the respiratory signals revealed that most of the volunteers were nasal breathers. Thus, input features related to oral flow should not necessarily be discarded, and data from oral breathers should be included in the training sets in future work. Further, we note that removing input features that display a high correlation with others does not result in better model predictions, as the performances of the RFR and RFR2 models are virtually identical, see Tables 3 and 4.
The learning curve analysis reported in Figure 10 shows that the validation error decreases with the dataset size, but does not reach the training error for the maximum size available in this study. This result highlights the need for larger datasets to further improve the prediction of the RFR model developed in our work. As already mentioned, not only should the number of volunteers in the training datasets be increased, but a wider variability between subjects, including variations in anatomy and respiratory physiology, is also key to improving the predictions of the RVM system presented in this work. From this perspective, our study is limited by the fact that the population recruited included more male subjects (70.6%) than female subjects (29.4%). Further, the study group was predominantly composed of young adults with a small variability in terms of age (33.2 ± 9.5 years old). Future studies should target a wider and more balanced population in terms of gender and age. In addition, future applications of our RVM system in patients with respiratory diseases will necessitate the development of new ML models that are trained with datasets that belong to that population, as the respiratory mechanics and flow dynamics in patients with pulmonary disease can be markedly different from those in normal subjects [25].

V. CONCLUSION
In conclusion, we present and validate a novel approach to the continuous estimation of ventilatory parameters in human subjects under spontaneous breathing. Our work builds upon a non-invasive temperature-based respiratory monitoring system, whose signals are processed by a machine-learning model for the prediction of TV and MV. The main advantage of the proposed solution is the ability to accurately predict ventilatory parameters for a group of subjects based on model training on a group dataset. This can potentially eliminate the need for individual calibration when the model training is previously done on a group that is representative of the subject, which is one of the premises of machine-learning techniques [26]. An important limitation of this work is the use of a restricted dataset for model training that is composed predominantly of healthy male young-adult subjects. This has the disadvantage that predictions of ventilatory parameters made by the ML model may not be as accurate for subjects who do not conform to this population. Future developments should include the use of larger datasets that provide a larger inter-subject variability that is representative of differences in gender, age, and medical condition. Finally, future versions of our RVM system should be tested under relevant clinical settings to evaluate the effectiveness of RVM in the early detection of medical conditions [27].

ACKNOWLEDGMENT
This work received funding from Fundación Copec-UC through grant 2015.R.557, and from the CORFO-Innova program through grant 17ITE2-72695 awarded to DH and AA.

CONFLICTS OF INTEREST
DH, JC and AA filed a patent application on the temperature-based respiratory monitoring system used in the experiments for acquiring respiratory temperature signals.