Machine Learning Ranks ECG as an Optimal Wearable Biosignal for Assessing Driving Stress

The demand for wearable devices that can detect anxiety and stress when driving is increasing. Recent studies have attempted to use multiple biosignals to detect driving stress. However, collecting multiple biosignals can be complex and is associated with numerous challenges. Determining the optimal biosignal for assessing driving stress can save lives. To the best of our knowledge, no study has investigated both longitudinal and transitional stress assessment using supervised and unsupervised ML techniques. Thus, this study hypothesizes that the optimal signal for assessing driving stress will consistently detect stress using supervised and unsupervised machine learning (ML) techniques. Two different approaches were used to assess driving stress: longitudinal (a combined repeated measurement of the same biosignals over three driving states) and transitional (switching from state to state such as city to highway driving). The longitudinal analysis did not involve a feature extraction phase while the transitional analysis involved a feature extraction phase. The longitudinal analysis consists of a novel interaction ensemble (INTENSE) that aggregates three unsupervised ML approaches: interaction principal component analysis, connectivity-based clustering, and K-means clustering. INTENSE was developed to uncover new knowledge by revealing the strongest correlation between the biosignal and driving stress marker. These three MLs each have their well-known and distinctive geometrical basis. Thus, the aggregation of their result would provide a more robust examination of the simultaneous non-causal associations between six biosignals: electrocardiogram (ECG), electromyogram, hand galvanic skin resistance, foot galvanic skin resistance, heart rate, respiration, and the driving stress marker. INTENSE indicates that ECG is highly correlated with the driving stress marker. 
The supervised ML algorithms confirmed that ECG is the most informative biosignal for detecting driving stress, with an overall accuracy of 75.02%.


I. INTRODUCTION
The American Psychological Association has reported that people are currently living with extremely high levels of driving stress, and this is expected to increase in the coming years [1]. Recently, researchers have found that stress plays a major role in the development and progression of cardiovascular diseases [2]. Therefore, the early detection and continuous assessment of stress can improve health and well-being, helping to prevent severe consequences that can lead to hospitalization and even death.
The associate editor coordinating the review of this manuscript and approving it for publication was Ying Xu.
Recent advances in wearable devices, biosignal processing, machine learning (ML), and app development could contribute to providing objective feedback on driving stress. Wearable devices have already been used in several health interventions, achieving promising results [3].
The most widely used and best understood biosignal in modern medicine is the electrocardiogram (ECG). Thus, ECG-based wearable devices have great potential to succeed. Note that most of the ECG wearable devices available on the market are not approved by the Food and Drug Administration. Several studies have investigated ECG signals that detect different reactions to anxiety, reporting contradictory results [4]. However, limited research has been conducted to compare the efficacy of using ECGs with other biosignals in detecting stress. In 2018, El Haouij [5] developed an algorithm to detect stress using the ECG, electromyogram (EMG), hand galvanic skin resistance (Palmar GSR), foot galvanic skin resistance (Plantar GSR), heart rate (HR), and respiration (RESP). She reported a 69% accuracy for detecting stress. It is worth noting that El Haouij investigated the importance of each biosignal; however, not all biosignals were included in her investigation. The ECG, for example, was removed. Recently, Smets et al. [6] developed an algorithm that used an ECG, skin conductance, and skin temperature to detect driving stress, eventually reaching 43% accuracy. However, the study was not focused on questioning the importance of each biosignal or ranking the biosignals based on their driving stress detection accuracy. A recent study by Rizwan et al. [7] used only ECG signals to assess driving stress; however, their analysis was carried out independently without reporting a relative comparison with other biosignals.
To the best of our knowledge, limited efforts have been made to determine the effectiveness of the different biosignals that can be collected by wearable devices, with a focus on the ECG signal as a potential single biosignal for detecting driving stress. In the current study, we will uncover hidden relationships among all biosignals and investigate the importance of each biosignal for assessing driving stress.

A. DRIVING STRESS DATASET
The Stress Recognition in Automobile Drivers (SRAD) database [8] is useful because it contains a collection of multiparameter biosignals collected simultaneously from healthy volunteers. The biosignals were collected using wearable devices while the volunteers were driving on a prescribed route including city streets and highways in and around Boston, Massachusetts, USA. Six different biosignals were collected in real time in the SRAD database: ECG, EMG, Plantar GSR, Palmar GSR, HR, and RESP. In addition, the driving stress marker time series was collected as a way to reflect the driver's level of stress. The database is publicly available and can be accessed via a website [9]. All signals were recorded over seven driving periods (or segments), as follows: first rest (R1), first drive in the city (C1), first drive on the highway (HW1), second drive in the city (C2), second drive on the highway (HW2), third drive in the city (C3), and final rest (R2). We chose the SRAD database because, to the best of our knowledge, there are no other publicly available databases that contain biosignals that were simultaneously collected to assess driving stress. Moreover, scientific evidence suggests that work-related stress, life stress, and driving environment stress impact driving outcomes [10]. Thus, analysis of the SRAD database may be valuable in identifying the optimal biosignal for driving stress assessments in a practical multimodal stress scenario.

B. EXCLUSION CRITERIA
The SRAD database contains biosignals collected from 17 drivers. The experiment was quite complex, and the duration of the driving varied between 50 and 90 minutes. All drivers started with approximately 15 minutes of rest, after which the driver drove the car out of the garage toward a congested city street. Full consistency in collecting data and labeling driving stress was difficult to achieve; unfortunately, not all signals were provided for all the drivers. Moreover, in a few recordings, the marker signals were not clear. Thus, nine recordings were excluded, as shown in Table 1. Eight recordings with complete signals and clear driving stress markers were finally included in the analysis.

1) HYPOTHESIS
It is assumed that an individual's stress level increases when driving in the city and decreases when not driving (at rest) [11]. Two approaches for assessing driving stress must be investigated: longitudinal stress (repeated measurement of the same biosignals combined over three driving states) and transitional stress (switching from state to state, such as city to highway driving). Thus, our hypotheses are as follows: 1) Based on the longitudinal stress analysis, the biosignal most highly correlated with the marker signal (regardless of whether the driver was in the city or on the highway) will be the most valuable and informative in an assessment of driving stress. 2) Based on the transitional stress analysis, the biosignal that provides the highest driving stress classification accuracy will be best used for a driving stress assessment. 3) Based on the multicomparison stress analysis, the biosignal that most separates the three driving states should be used for assessing driving stress. 4) The biosignal that satisfies the previous three analyses will be considered the optimal biosignal for assessing driving stress.

2) LONGITUDINAL STRESS ANALYSIS
The longitudinal stress analysis is the combined repeated measurement of the same biosignals over three driving states.
VOLUME 8, 2020
In this analysis, we examined all collected biosignals with and without feature extraction. Note that the biosignals are filtered (bandpass 0.5-7 Hz Butterworth filter). We propose a novel interaction ensemble called INTENSE that aims to combine multiple clustering models to produce robust results. Note that INTENSE will be used for all collected biosignals as is, without feature extraction. Many clustering models have been reported in the literature. Each method has a different set of rules for defining ''similarities'' among biosignals. We selected clustering methods that are commonly used for their simplicity and effectiveness. INTENSE aggregates results from the following three clustering models, which are based on three different geometrical perspectives:
1) The interaction principal component analysis, a multi-interaction analysis, produces multiple biosignals that interact with each other. This analysis represents a correlation-based, eigenvector-based multivariate analysis.
2) Connectivity-based clustering, a bi-interaction analysis, groups every two biosignals based on the similarity measure (pairwise distance) used.
3) K-means clustering, which is a centroid-based interaction analysis, groups all biosignals based on the similarity measure (distances between centroids) used.
4) The ensemble method aggregates the results from the three clustering methods to produce a combined recommendation about the interactivity among all biosignals in the dataset.
INTENSE was developed because each clustering method has potential shortcomings when clustering a certain dataset, depending on the initial value used, thresholds, similarity measure, etc. Using only one clustering algorithm may provide insights into the interactions between biosignals, but only from that algorithm's geometrical perspective. Combining the results from multiple clustering methods, each of which affords a unique geometrical perspective, allows us to obtain more robust findings and to reconcile the different clustering methods using a majority voting rule.
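The majority voting rule can be illustrated with a minimal Python sketch. The text does not specify how the three results are encoded, so the representation below (one set of biosignals per method, each set holding the biosignals that method associated with the driving stress marker) and the function name `majority_vote` are assumptions made for illustration only. The example sets follow the associations reported in the Results section.

```python
from collections import Counter

def majority_vote(method_results):
    """Return the biosignals named by more than half of the methods.

    method_results maps a method name to the set of biosignals that the
    method associated with the driving stress marker (assumed encoding).
    """
    n_methods = len(method_results)
    counts = Counter(
        signal for signals in method_results.values() for signal in set(signals)
    )
    return sorted(s for s, c in counts.items() if c > n_methods / 2)

# Associations with the marker as described in the Results section.
results = {
    "IPCA": {"ECG", "RESP"},
    "CBC": {"ECG"},
    "KMC": {"ECG", "EMG"},
}
winners = majority_vote(results)
print(winners)
```

With these inputs only ECG is named by more than half of the methods, which matches the conclusion that INTENSE draws.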

3) INTERACTION PRINCIPAL COMPONENT ANALYSIS
The interaction principal component analysis (IPCA) proposed in Ref. [13] was used in this study. IPCA is a correlation-based unsupervised machine learning (ML) technique. It finds linearly uncorrelated attributes within a set of observations of possibly correlated attributes (in our case, biosignals). It starts with a decorrelation process that does not require any prior information or settings about the processed biosignals, followed by Pearson's correlation process. IPCA is a valuable technique for automatically revealing hidden interactions between biosignals, without training or labeling. IPCA analysis provides a level of artificial intelligence for uncovering new behaviors between examined biosignals. The IPCA algorithm involves the following steps:
1) All recordings were combined into one matrix, $X$, with dimensions of 764,483 samples × 7 biosignals.
2) The Min-Max normalization was applied as follows: $Z = (X - X_{\min})/(X_{\max} - X_{\min})$. This was done to ensure that $Z$ had a zero mean.
3) The covariance matrix was obtained as $C = \frac{1}{f-1} Z^{T} Z$, where $f$ is the number of biosignals.
4) The eigen decomposition of $C$ was performed, and the eigenvalues $e_f$ and their corresponding eigenvectors $v_f$ were computed to satisfy the equation $Cv = ev$.
5) The eigenvalues were sorted in descending order, $e_1 \geq e_2 \geq \ldots \geq e_7$, along with their corresponding eigenvectors. Note that the eigenvector with the largest eigenvalue was called the first principal component (PC1), while the last PC, associated with the lowest eigenvalue, was called PC7.
6) Pearson's correlation was calculated between the biosignals and between the biosignals and the PCs.
7) PCs that show more than two strong correlations ($|r| > 0.5$) with variables (biosignals) were selected, and these variables (biosignals) were returned.
Seven segments (or periods) were extracted from the seven biosignals: a driving stress marker, ECG, EMG, Plantar GSR, Palmar GSR, HR, and RESP. To compare the different driving segments (e.g., driving in the city vs. driving on the highway), the Wilcoxon-Mann-Whitney test ($p_w$) was used for two independent groups. A p value of < 0.05 was considered significant. Pearson's correlation coefficient was used to calculate the correlation between the biosignals and PCs. We used Matlab 2018b software and Python 3.6.5 software to analyze the data.
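The IPCA steps can be sketched in NumPy as follows. This is an illustrative sketch rather than the authors' implementation: the function name `ipca` and the parameters `r_thresh` and `min_strong` are hypothetical. Two assumptions are made explicit in the code. First, the columns are mean-centred after Min-Max scaling, because Min-Max scaling by itself maps each column to the range [0, 1]. Second, a PC is kept when it has at least `min_strong` strong correlations, and the cut-off is left as a parameter because the wording of step 7 can be read in more than one way.

```python
import numpy as np

def ipca(X, names, r_thresh=0.5, min_strong=2):
    """Sketch of the IPCA steps for X with shape (samples, biosignals)."""
    X = np.asarray(X, dtype=float)
    f = X.shape[1]
    # Step 2: Min-Max normalization of each biosignal (column).
    Z = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))
    # Assumption: centre explicitly so that every column has zero mean.
    Zc = Z - Z.mean(axis=0)
    # Step 3: covariance between biosignals. np.cov divides by
    # (number of samples - 1); a constant scale factor does not change
    # the eigenvectors or the ordering of the eigenvalues.
    C = np.cov(Zc, rowvar=False)
    # Steps 4-5: eigen decomposition, sorted by descending eigenvalue.
    evals, evecs = np.linalg.eigh(C)
    order = np.argsort(evals)[::-1]
    evals, evecs = evals[order], evecs[:, order]
    scores = Zc @ evecs  # PC time series, PC1 in the first column
    # Step 6: Pearson correlation between each biosignal and each PC.
    R = np.corrcoef(np.hstack([Zc, scores]), rowvar=False)[:f, f:]
    # Step 7: keep PCs that are strongly correlated with several biosignals.
    groups = {}
    for k in range(f):
        strong = [names[i] for i in range(f) if abs(R[i, k]) > r_thresh]
        if len(strong) >= min_strong:
            groups[f"PC{k + 1}"] = strong
    return evals, R, groups
```

Calling `ipca(X, names)` on a samples-by-biosignals array returns the sorted eigenvalues, the biosignal-versus-PC correlation block, and a dictionary that maps each retained PC to the biosignals it groups together.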

4) CONNECTIVITY-BASED CLUSTERING
Connectivity-based clustering (CBC) is an algorithm that connects ''biosignals'' to form ''groups'' based on their distance. It is a different unsupervised ML approach that was utilized to quantify and visualize the dissimilarities among the biosignals. The Euclidean distance was used as a metric to provide hierarchical clustering, the result of which is visualized as a Dendrogram.
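A minimal sketch of connectivity-based (agglomerative) clustering over biosignals is given below. The text specifies the Euclidean distance but not the linkage criterion, so single linkage is an assumption here, and the function name `single_linkage` is hypothetical. The merge history it returns is the information a Dendrogram displays.

```python
import numpy as np

def single_linkage(X, names):
    """Agglomerative clustering of the columns (biosignals) of X.

    Returns the merge history as (members_a, members_b, distance) tuples.
    Single linkage is assumed; the Euclidean distance follows the text.
    """
    cols = np.asarray(X, dtype=float).T
    n = len(names)
    # Pairwise Euclidean distances between biosignals.
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            D[i, j] = D[j, i] = np.linalg.norm(cols[i] - cols[j])
    clusters = [[i] for i in range(n)]
    merges = []
    while len(clusters) > 1:
        best = (np.inf, 0, 1)
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: distance between the closest members.
                d = min(D[i, j] for i in clusters[a] for j in clusters[b])
                if d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append(([names[i] for i in clusters[a]],
                       [names[i] for i in clusters[b]], float(d)))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return merges
```

The first entry of the returned list identifies the two biosignals that are closest to each other, which is the first merge drawn at the bottom of the Dendrogram.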

5) K-MEANS CLUSTERING
K-means clustering (KMC) is considered to be the most widely known clustering algorithm, as its concept is simple but effective. This algorithm divides the $F$ factors into $C$ disjoint clusters. The means of the clusters are referred to as ''centroids.'' The KMC algorithm aims to choose centroids that minimize the distance between each group of factors and its centroid. The quantity being minimized is called ''inertia'' or the within-cluster sum-of-squares, and it is defined as follows: $\sum_{i=1}^{F} \min_{1 \leq j \leq C} \lVert f_i - c_j \rVert^2$, where $f_i$ refers to all values in factor $i$, and $c_j$ refers to the centroid of cluster $j$. Note that applying PCA prior to KMC clustering is highly recommended to speed up the process and avoid computational problems.
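The inertia criterion can be made concrete with a short NumPy sketch of Lloyd's algorithm, the standard iterative procedure for K-means. The function name `kmeans`, the random initialization, and the iteration limit are illustrative assumptions and not details taken from the study. Each row of `F` is one item to cluster (in this paper, one factor).

```python
import numpy as np

def kmeans(F, n_clusters, n_iter=100, seed=0):
    """Lloyd's algorithm; returns labels, centroids, and the inertia."""
    F = np.asarray(F, dtype=float)
    rng = np.random.default_rng(seed)
    # Assumed initialization: pick distinct rows of F as starting centroids.
    centroids = F[rng.choice(len(F), size=n_clusters, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance from every row to every centroid.
        d2 = ((F[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Move each centroid to the mean of its members (keep it if empty).
        new = np.array([
            F[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(n_clusters)
        ])
        if np.allclose(new, centroids):
            break
        centroids = new
    d2 = ((F[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    labels = d2.argmin(axis=1)
    # Inertia: sum over rows of the squared distance to the nearest centroid.
    inertia = float(d2.min(axis=1).sum())
    return labels, centroids, inertia
```

The returned inertia is the within-cluster sum-of-squares evaluated at the final centroids, so a lower value indicates tighter clusters for the same number of clusters.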

6) MULTICOMPARISON TEST
Here, the multiple statistical comparison test was based on the mean square error achieved by the Kruskal-Wallis test.
The multicomparison displays the mean estimates and standard errors with the corresponding state. Each driving state mean is represented by a symbol, and the interval is represented by a line extending out from the symbol. Two driving state means are significantly different if their intervals are disjoint; they are not significantly different if their intervals overlap. Note that the biosignals were not used as is, which is in contrast to the INTENSE analysis. A feature extraction was applied first, as follows: The Skewness statistic measure was used as a feature to compare all the biosignals because a previous study [14] found that Skewness is associated with morphology changes in time-series biosignals. Note that the study [14] compared different features and ranked Skewness as the optimal feature. Skewness is a measure of the symmetry (or the lack thereof) of a probability distribution, which is defined as follows: $S = \frac{1}{N} \sum_{i=1}^{N} \left[ \frac{x_i - \mu_x}{\sigma} \right]^3$, where $\mu_x$ and $\sigma$ are the empirical estimates of the mean and standard deviation of $x_i$, respectively, and $N$ is the number of samples in the biosignal.
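The Skewness feature can be computed per window with a few lines of NumPy. This sketch rests on several assumptions: the population standard deviation is used as the empirical estimate, the windows do not overlap, and a flat window is assigned a Skewness of zero. The names `skewness` and `windowed_skewness` and the sampling-rate argument `fs` are hypothetical. The one-second window follows the captions of Fig. 7 and Table 2.

```python
import numpy as np

def skewness(x):
    """Empirical skewness: mean of the cubed standardized samples."""
    x = np.asarray(x, dtype=float)
    mu, sigma = x.mean(), x.std()
    if sigma == 0:
        return 0.0  # assumption: a flat window has no asymmetry
    return float(np.mean(((x - mu) / sigma) ** 3))

def windowed_skewness(signal, fs, window_s=1.0):
    """Skewness of consecutive non-overlapping windows of window_s seconds."""
    signal = np.asarray(signal, dtype=float)
    n = int(round(fs * window_s))  # samples per window
    n_windows = len(signal) // n   # a trailing partial window is dropped
    return np.array([skewness(signal[k * n:(k + 1) * n])
                     for k in range(n_windows)])
```

Each biosignal segment then yields one Skewness value per window, and these values are what the comparison of driving states operates on.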

7) TRANSITIONAL STRESS ANALYSIS
The transitional stress analysis is defined as switching from state to state, specifically for the following five transitional states: (R1 vs. C1), (HW1 vs. C1), (R1 vs. C2), (HW2 vs. C2), and (R2 vs. C3). The transitional stress analysis relies on supervised ML algorithms. Using leave-one-out cross-validation, 17 different supervised classification techniques were tested, including KNN weighted, Tree Fine, SVM Cubic, Logistic Regression, SVM Quadratic, Quadratic Discriminant, and SVM Fine Gaussian. Skewness was applied as a feature to capture driving stress. An algorithm is needed to detect state switching in order to automatically extract biosignals based on the seven driving states. The two event-related moving averages (TERMA) algorithm [12] was used in our study to detect the main spikes in the marker signal.
First, the driving stress marker signal passes through a third-order Butterworth bandpass filter with cutoff frequencies $F_1$-$F_2$. The resulting signal is squared, and two moving averages ($MA_1$ and $MA_2$, with window sizes $W_1$ and $W_2$) are applied, followed by a threshold ($\beta$, to reject noise). This process generates the so-called ''blocks of interest''. By searching for the maximum absolute value within each block of interest, the marker location can be determined. Here, we found that the optimal setting for detecting markers was the following: $F_1$ = 0.5 Hz, $F_2$ = 7 Hz, $W_1$ = 500 samples, $W_2$ = 1000 samples, and $\beta$ = 5.
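The squaring, moving-average, and thresholding stages can be sketched as follows. This is a simplified illustration, not the published TERMA implementation: the input is assumed to be already bandpass filtered, and the exact threshold formula is not given in the text, so the form used here (the second moving average plus $\beta$ times the mean of the squared signal) is an assumption. The function names are hypothetical, and the parameter values in the usage note are toy values rather than the settings reported above.

```python
import numpy as np

def moving_average(x, width):
    """Centred moving average that keeps the length of x."""
    kernel = np.ones(width) / width
    return np.convolve(x, kernel, mode="same")

def terma_peaks(filtered, w1, w2, beta):
    """Locate spikes in an already bandpass-filtered marker signal."""
    filtered = np.asarray(filtered, dtype=float)
    squared = filtered ** 2
    ma_event = moving_average(squared, w1)  # MA1: emphasizes the spikes
    ma_cycle = moving_average(squared, w2)  # MA2: tracks the local level
    # Assumed threshold: MA2 raised by beta times the mean squared signal.
    threshold = ma_cycle + beta * squared.mean()
    inside = ma_event > threshold           # samples in blocks of interest
    # Block boundaries: +1 marks a start, -1 marks an (exclusive) end.
    edges = np.diff(np.concatenate(([0], inside.astype(int), [0])))
    starts = np.flatnonzero(edges == 1)
    ends = np.flatnonzero(edges == -1)
    # The marker location is the maximum absolute value within each block.
    return [int(s + np.argmax(np.abs(filtered[s:e])))
            for s, e in zip(starts, ends)]
```

For example, a zero signal of 2000 samples with unit spikes at samples 300, 1000, and 1700, processed with `w1=50`, `w2=200`, and `beta=2.0`, yields one block of interest around each spike and returns the three spike positions.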

9) DATA AVAILABILITY
The data that support the findings of the current study are publicly available online at https://www.physionet.org/pn3/drivedb/

III. RESULTS
The overall visual representation of biosignals for drive07 is shown in Fig. 1. It is very difficult to differentiate among the biosignals collected during the city driving, highway driving, and at rest by visual inspection alone. Even if we can identify some patterns or significant changes between the different segments or periods in one recording (or one biosignal), we may not be able to generalize the findings.
There was a need to automatically segment all biosignals based on the driving stress marker signal to statistically compare all segments. Thus, the TERMA method was used to segment the biosignals into seven periods: R1, C1, HW1, C2, HW2, C3, and R2. The TERMA method was applied to detect the switching states within the driving stress marker, as shown in Fig. 2. Based on visual inspection, the detection accuracy of the seven periods achieved on all used records was 100%. This step was important to demarcate the driving activity in all biosignals for comparative purposes.
Fig. 3 shows the significance of the seven PCs extracted from the SRAD database, after applying a principal component analysis (PCA) to all biosignals (driving stress marker, ECG, EMG, Plantar GSR, Palmar GSR, HR, and RESP). PC1 explains the most variance (40%), reflecting its relevance and importance. PC2 explains nearly 23% of the variance, while PC3 explains a little more than 17% of the variance. As expected, upon visual inspection, PC1 is the most volatile, PC2 is the second most volatile, and PC7 is the least volatile. In fact, PC7 showed no relevance.
Fig. 4 shows the heat map of the IPCA longitudinal stress analysis, which is a correlation matrix among all PCs and biosignals. The diagonal entries are all equal to 1. As can be seen, the heat map consists of four 7 × 7 blocks. The top right block is the correlation matrix for the PCs. We know that the PCs are orthogonal. As expected, this block contains zeros (numerically negligible values) in its off-diagonal entries, demonstrating that the PCs are mutually orthogonal (and therefore uncorrelated). The bottom left block is the heat map among the biosignals, which shows no correlation between them. Thus, the bottom left block is not informative.
FIGURE 2. Detection of state switching of the driving stress marker using the TERMA (two event-related moving averages) method [12]. This is an important step for the longitudinal and transitional stress analysis.
As shown in Fig. 4, the bottom right 7 × 7 block contains interesting results. This block, which is surrounded by dashed black lines, reflects the correlation between each biosignal and each PC. The first column in the dashed block shows a significant correlation between PC1 and the marker, ECG, and RESP. PC1 shows that ECG and RESP are correlated with the driving stress marker, and both biosignals move in the same direction as the driving stress marker. Given that PC1 is the PC with the highest eigenvalue, there is a hidden correlation among the ECG, RESP, and the driving stress marker. The second column in the dashed block shows that the Palmar GSR and Plantar GSR move in the same direction, and both are strongly correlated with PC2. This suggests that there is a hidden correlation between the Palmar GSR and Plantar GSR. The third column in the dashed block shows that the driving stress marker and ECG move in the same direction, and both are strongly correlated with PC3, suggesting that a hidden correlation exists between the driving stress marker and ECG. PC4 is only correlated with HR, while PC5 is correlated with EMG and HR. PC6 is correlated with Plantar GSR, while PC7 was unable to capture any correlation between biosignals. Interestingly, this analysis highlights the correlation between the driving stress marker and ECG, as captured by PC1 and PC3.
Here, we investigated the hierarchical clustering of biosignals and analyzed the Dendrogram generated as a result of the clustering process. This method builds the hierarchy from the independent biosignals by progressively merging the biosignals to generate clusters. In this case, we have six biosignals. The first step is to determine which two biosignals are close enough to each other to be merged into a cluster.
The biosignals were grouped into clusters using the longitudinal CBC analysis, based on their similarity, as shown in Fig. 5. Visual inspection of Fig. 5 shows that the changes in the driving stress marker are similar to those in ECG, EMG, Palmar GSR, and HR. Interestingly, the biosignal closest to the driving stress marker is ECG. Moreover, the Palmar GSR and Plantar GSR form an independent cluster far from the driving stress marker. These results are similar to the findings from the PCA shown in Fig. 4, hence confirming the correlation between Palmar GSR and Plantar GSR. Interestingly, the Dendrogram ranked the biosignals based on how close they were to the driving stress marker in the following order: ECG, RESP, EMG, Plantar GSR, Palmar GSR, and HR.
KMC serves as our third geometrical lens and provides a different clustering perspective. The KMC result, as shown in Fig. 6, is similar to and confirms the findings obtained using the CBC, as shown in Fig. 5. Interestingly, KMC grouped the stress marker with ECG and EMG, suggesting that the marker is correlated with ECG and EMG. Therefore, INTENSE indicates that ECG is consistently correlated with the stress marker, suggesting that ECG is an optimal biosignal for assessing driving stress.
It is necessary to examine the pairwise comparison results using a multiple comparison test for each biosignal in terms of city driving, highway driving, and at rest. The longitudinal multiple comparison can generate a graph of the mean estimates and standard errors. Fig. 7 shows the multiple comparison of the means of the extracted feature (Skewness) for each biosignal. The mean of each status (e.g., at rest) is represented by a symbol, and the interval is represented by a line extending from the symbol.
As can be seen in Fig. 7, the mean of at rest is highlighted and the comparison interval is in blue. As the comparison intervals for driving in the city and on the highway do not intersect with the intervals for the at rest mean, they are highlighted in red, except for in Fig. 7(HR). The intersection among the intervals indicates no significant difference, while the lack of intersection indicates that both means are different from the at rest mean. In other words, the means of driving in the city and on the highway are significantly different if their intervals are disjoint, and they are not significantly different if their intervals overlap. Fig. 7 shows that HR is the worst signal for differentiating among the three driving statuses. The rest of the biosignals can differentiate between being at rest and either driving in the city or on the highway. However, ECG is the only biosignal with no overlap between driving in the city and on the highway. This result suggests that an ECG is the optimal biosignal for capturing stress in different driving statuses. Fig. 7 magnifies the separability among the biosignals, suggesting a need to examine the classification ability of each biosignal. Table 2   test (R1 vs. C1) using the KNN-Weighted classifier. Plantar GSR scored the highest accuracy in detecting driving stress in the test (R2 vs. C3) using the Quadratic classifier, while RESP was the most accurate (R1 vs. C2) using the Tree-Fine classifier.
It is clear that ECGs were able to capture overall driving stress in all possible driving status tests, with an overall accuracy of 75.02%. The second highest accuracy was achieved by Palmar GSR with an overall score of 72.52%, while the lowest accuracy was scored by HR, with an overall accuracy of 63.78%.
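How a leave-one-out accuracy of this kind can be computed for a single biosignal is sketched below with a distance-weighted nearest-neighbor classifier. Several details are assumptions, because the text does not state them: the held-out unit is one feature vector (row), the neighbor weights are inverse distances, and `k=3` is an arbitrary choice. The function names are hypothetical, and the sketch is not a re-implementation of the classifiers used in the study.

```python
import numpy as np

def knn_weighted_predict(X_train, y_train, x, k=3):
    """Predict the label of x by inverse-distance-weighted voting."""
    d = np.linalg.norm(X_train - x, axis=1)
    nearest = np.argsort(d)[:k]
    weights = 1.0 / (d[nearest] + 1e-12)  # small term avoids division by zero
    classes = np.unique(y_train)
    votes = [weights[y_train[nearest] == c].sum() for c in classes]
    return classes[int(np.argmax(votes))]

def loo_accuracy(X, y, k=3):
    """Leave-one-out accuracy: each row is held out once and predicted."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    hits = 0
    for i in range(len(y)):
        keep = np.arange(len(y)) != i  # train on every row except row i
        hits += int(knn_weighted_predict(X[keep], y[keep], X[i], k) == y[i])
    return hits / len(y)
```

In this setting, `X` would hold the Skewness features of one biosignal for the two states being compared (for example, R1 and C1) and `y` the corresponding state labels. Repeating the call for each biosignal and each pair of states gives accuracies that can be averaged into an overall score.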

A. DRIVING STRESS MARKER
The driving stress marker is a continuous time-series signal that divides various driving zones, based on the assumption that driving in the city evokes high stress levels, driving on the highway evokes medium stress levels, and being in a state of rest evokes the lowest stress levels. The driving zone marker in the current study is referred to as the driving stress marker. The driving stress signal is typically used as the gold standard for assessing different driving stress level changes over time for each recording. If the quality of the driving stress marker is high, there will be eight clear outstanding spikes that determine seven driving zones (R1, C1, HW1, C2, HW2, C3, and R2), as shown in Fig. 2.

B. ECG
An ECG is a time-series signal that reflects the electrical activity of the heart [15]. Usually, three electrodes are placed on the chest to record ECG signals. In the original study, the ECG signals were captured using a modified lead II configuration to minimize motion artifacts and to attain a good record of the R peak. Previous studies investigated the correlation of different ECG features with stress by using the same dataset, such as the R peak amplitude [16], QT interval [17], and QRS complex [18]. These studies can be considered pilot studies without a clear discussion of how the features were extracted. In addition, no cross validation was used, and no relative analysis to other biosignals was reported.
FIGURE 7. Longitudinal stress analysis of the multicomparison test for all biosignals (Skewness with one second window was used as a feature). Two driving periods are significantly different if their intervals are disjoint; the two driving periods are not significantly different if their intervals overlap with a gray color. One can observe that ECG significantly differentiates among the three driving states. Note that EMG, RESP, Plantar GSR, and Palmar GSR overlap with driving in the city and highway while HR has a complete overlap with all driving states.

TABLE 2. Transitional stress analysis using supervised machine learning (ML) algorithms. It compares the classification accuracy among driving states using biosignals. Note that Skewness with one second window was used as a feature. Leave-one-out cross validation was used.
In our study, the ECG was analyzed in relation to other biosignals, and all biosignals were included in the records. The longitudinal IPCA analysis showed that the ECG is correlated with the driving stress marker and RESP. In addition, the Dendrogram showed that ECG was ranked as closest to the driving stress marker. The multiple comparison of means reflected a significant difference for all driving status tests when using ECG, without any overlap between the three driving statuses, as shown in Fig. 7. Interestingly, ECG was ranked first for detecting driving stress based on the overall accuracy (75.02%), as shown in Table 2. To our knowledge, this finding has not been reported in the literature; in fact, some features extracted from ECG signals have led to contradictory results [4] in detecting anxiety. It seems that the morphology of the ECG waveforms contains valuable information and is correlated with driving stress.

C. EMG
An EMG noninvasively measures muscle action potentials during contraction. In the literature, EMGs have been investigated for analyzing six emotional states [19], identifying intensive valence and arousal affective states [20] and detecting driving stress [8], [21]. Our study examined the effectiveness of EMG relative to other biosignals. The IPCA analysis showed that EMGs are not correlated with the driving stress marker; however, EMG was correlated with HR. The Dendrogram showed the EMG was ranked third in closeness to the driving stress marker. The multiple comparison of means reflected a significant difference for all driving status tests using EMGs. However, there was an overlap between the city and highway driving, as shown in Fig. 7. After applying different supervised classifiers, EMG was ranked fourth for detecting driving stress based on the overall accuracy (65.02%), as shown in Table 2. This finding has been confirmed previously in the literature; in fact, a previous study [5] showed that EMGs were the least informative biosignal for detecting driving stress.

D. PLANTAR GSR
Plantar GSR is the galvanic skin response or skin conductance measured at the foot. A GSR measures autonomic nervous system activity. Typically, two electrodes are placed on the skin of the foot to capture changes in the skin's glands when producing ionic sweat [22]. It is worth noting that the Plantar GSR was not considered in the original study [8], [11]. However, a recent study [5] showed that the Plantar GSR was the biosignal most correlated with driving stress. We considered the Plantar GSR in our study and examined its importance. The IPCA analysis showed that the Plantar GSR and Palmar GSR are correlated and both were ranked second based on PC2. Interestingly, the Dendrogram grouped the Plantar GSR and Palmar GSR together, confirming the results in PC2. The multiple comparison of means reflected a significant difference between at rest and driving in the city and on the highway. However, there was a complete overlap between driving in the city and on the highway, as shown in Fig. 7, suggesting that the Plantar GSR is unable to capture changes in driving stress (or stress transition). Interestingly, the Plantar GSR was ranked third for detecting driving stress based on overall accuracy (66.28%), as shown in Table 2.

E. PALMAR GSR
Palmar GSR is the skin conductance measured from the hands. Typically, two electrodes are placed on the skin of the hand to capture changes in the skin's glands when producing ionic sweat [22]. Healy and Picard [11] reported that Palmar GSR is correlated with driving stress level. However, a recent study [5] showed that Plantar GSR, rather than Palmar GSR, was the biosignal most correlated with driving stress. As mentioned above, the unsupervised ML methods revealed that Plantar GSR and Palmar GSR are correlated, as shown in Fig. 4 and Fig. 5. The multiple comparison of means presented results similar to those for the Plantar GSR (see Fig. 7), suggesting a significant difference between at rest and driving in the city and on the highway, with an overlap between driving in the city and on the highway. This result shows that Palmar GSR is not sensitive to changes in driving stress. The supervised ML methods ranked Palmar GSR second for detecting driving stress based on the overall accuracy (72.52%), as shown in Table 2.

F. RESP
RESP provides valuable information about breathing rate, which is considered an indicator of emotional states, including stress, arousal, and mental workload [23]. Typically, the sensor is integrated in a chest belt to measure inhalation and exhalation by capturing the expansion and reduction of the belt size. The RESP signal has been examined previously to assess the affective state [23]-[25]. A previous study [11] ranked the RESP signal third, while another study [5] ranked it second among the biosignals in terms of its correlation with driving stress. As mentioned above, PC1 showed that the RESP is correlated with the driving stress marker, as shown in Fig. 4. Moreover, the Dendrogram categorized the RESP near the driving stress marker but placed it in a separate group (cf. Fig. 5). The multiple comparison of means presented results similar to those for Plantar GSR, Palmar GSR, and EMG, as shown in Fig. 7. Note that ECG was able to detect the three driving zones, while RESP detected driving in the city and on the highway as one zone. The overlap between driving in the city and on the highway suggests less sensitivity for detecting driving stress. The supervised ML methods ranked RESP fourth for detecting driving stress based on the overall accuracy (65.02%), as shown in Table 2.

G. HR
The variability in HR provides valuable information about the autonomic nervous system, which is considered a marker of driving stress [26]. The HR signal has been examined previously to assess the affective state [23]-[25]. Previous studies [5], [11] ranked the HR signal fourth when compared with other biosignals on the same dataset for detecting driving stress. It is important to note that the HR is extracted from the ECG signal by detecting the R peaks.
As mentioned above, the PCA did not find a correlation between HR and the driving stress marker, as shown in Fig. 4. Moreover, the dendrogram did not place HR near the driving stress marker. In fact, it was the farthest away (cf. Fig. 5). The multiple comparison of means revealed a complete overlap among all driving statuses, as shown in Fig. 7, suggesting that the HR is the least informative biosignal for a driving stress level assessment. The supervised ML methods ranked HR sixth (bottom of the rank) for detecting driving stress based on the overall accuracy (63.78%), as shown in Table 2. Interestingly, although the HR is derived from the ECG, the ECG signals achieved a higher performance than the HR. This means that the ECG waveforms contain valuable information that is correlated with driving stress. In other words, the RR intervals alone are not sufficient to capture driving stress. It is worth mentioning that the purpose of this analysis is to rank biosignals based on their abilities to detect driving stress.
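To make the relationship between ECG and HR concrete, the following is a minimal sketch of how an HR series can be derived from R peaks. It is not the procedure used to produce the dataset's HR channel: the synthetic signal, the simple threshold-based peak picker, and all names and parameter values (`synthetic_ecg`, `detect_r_peaks`, `threshold_ratio`, `refractory_s`, the 250 Hz sampling rate) are assumptions made for illustration only.

```python
import math


def synthetic_ecg(fs=250, duration_s=10, bpm=72):
    """Toy ECG-like signal (assumption): one narrow Gaussian spike per beat
    plus a slow baseline wander. It is not a physiological model."""
    period = 60.0 / bpm
    signal = []
    for i in range(int(fs * duration_s)):
        t = i / fs
        phase = (t % period) - period / 2          # time offset from beat centre
        r_wave = math.exp(-(phase ** 2) / (2 * 0.01 ** 2))
        baseline = 0.1 * math.sin(2 * math.pi * 0.3 * t)
        signal.append(r_wave + baseline)
    return signal


def detect_r_peaks(signal, fs, threshold_ratio=0.6, refractory_s=0.25):
    """Return indices of local maxima above a fraction of the global maximum,
    keeping peaks at least `refractory_s` seconds apart."""
    threshold = threshold_ratio * max(signal)
    refractory = int(refractory_s * fs)
    peaks = []
    for i in range(1, len(signal) - 1):
        is_local_max = signal[i - 1] < signal[i] >= signal[i + 1]
        if is_local_max and signal[i] >= threshold:
            if not peaks or i - peaks[-1] >= refractory:
                peaks.append(i)
    return peaks


def heart_rate_bpm(peaks, fs):
    """Convert successive RR intervals (in seconds) to instantaneous HR in bpm."""
    rr_intervals = [(b - a) / fs for a, b in zip(peaks, peaks[1:])]
    return [60.0 / rr for rr in rr_intervals]


fs = 250
ecg = synthetic_ecg(fs=fs, duration_s=10, bpm=72)
peaks = detect_r_peaks(ecg, fs)
hr = heart_rate_bpm(peaks, fs)
mean_hr = sum(hr) / len(hr)
print(f"{len(peaks)} R peaks, mean HR = {mean_hr:.1f} bpm")
```

Note that the HR series keeps only the timing of the R peaks; everything else in the waveform is discarded, which is consistent with the observation above that the ECG carries information beyond the RR intervals.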

H. WHICH BIOSIGNAL IS MOST CORRELATED WITH DRIVING STRESS?
To answer this question objectively, we compared previous studies using the same dataset. Table 3 compares the findings from our study with those of other studies that used the same dataset. Study 1 [11] and Study 2 [5] confirmed that, in general, the GSR is the most informative biosignal for assessing driving stress levels. However, Study 2 reported Plantar GSR, while Study 1 reported Palmar GSR, as the optimal biosignal for detecting driving stress. Perhaps the reason for the two different results is that Study 1 excluded Plantar GSR from the analysis.
Another contradiction between Study 1 and Study 2 pertains to the second most informative biosignal. Study 1 recommended HR as the second best (in combination with Palmar GSR), while Study 2 recommended RESP as second best. A possible reason for this contradiction is that Study 2 used wavelet-based features, while Study 1 used basic statistical features, such as mean and standard deviation. The most interesting finding from our analysis is that ECG was found to be the most informative biosignal among all biosignals after rigorous testing using unsupervised and supervised ML methods. Note that Study 1 and Study 2 did not include ECG in their analysis, which may be why this finding has not been previously reported.
It is reasonable to ask whether the overall accuracy achieved in our analysis is meaningful. The ECG signals achieved an overall accuracy of 75.02% using only one feature extracted per second. This result outperformed that of a recent study [5], Study 2 (Table 3), which used 12 different features extracted from the Palmar GSR, Plantar GSR, and RESP, achieving 69% accuracy. It is worth mentioning that Study 2 used two-thirds of the dataset for training and one-third for testing, while in our study, a leave-one-out cross-validation was carried out. Study 2 used only five minutes from each segment, while our study considered the whole segment without removing any data points. Moreover, in the current paper, unsupervised and supervised techniques were used to rank the biosignals, while in Study 2, the findings were based on supervised learning only. In addition, in this paper, both the raw data and extracted features were used to confirm the findings, while Study 2 relied on certain features only, which may have negatively impacted the overall driving stress detection accuracy.
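The difference between the two evaluation protocols mentioned above (a single two-thirds/one-third split versus leave-one-out cross-validation) can be sketched as follows. This is an illustrative example only: the synthetic one-dimensional feature, the nearest-centroid classifier, and the function names are assumptions and do not reproduce the classifiers or features used in either study. It also assumes that one sample is left out per fold; the unit left out in the actual analysis may differ.

```python
import random


def nearest_centroid_fit(X, y):
    """Return {label: mean feature value} for one-dimensional features."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def nearest_centroid_predict(centroids, x):
    """Predict the label whose centroid is closest to x."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))


def holdout_accuracy(X, y, train_fraction=2 / 3, seed=0):
    """Single random split: train on `train_fraction`, test on the rest."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    cut = int(len(idx) * train_fraction)
    train, test = idx[:cut], idx[cut:]
    model = nearest_centroid_fit([X[i] for i in train], [y[i] for i in train])
    hits = sum(nearest_centroid_predict(model, X[i]) == y[i] for i in test)
    return hits / len(test)


def leave_one_out_accuracy(X, y):
    """Each sample is tested once by a model trained on all other samples."""
    hits = 0
    for i in range(len(X)):
        model = nearest_centroid_fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        hits += nearest_centroid_predict(model, X[i]) == y[i]
    return hits / len(X)


# Synthetic two-class data (assumption): one feature value per sample.
rng = random.Random(42)
X = [rng.gauss(0.0, 1.0) for _ in range(30)] + [rng.gauss(1.5, 1.0) for _ in range(30)]
y = ["low"] * 30 + ["high"] * 30

holdout = holdout_accuracy(X, y)
loo = leave_one_out_accuracy(X, y)
print(f"hold-out accuracy: {holdout:.2f}, leave-one-out accuracy: {loo:.2f}")
```

The hold-out estimate depends on which samples happen to fall in the test third, whereas the leave-one-out estimate uses every sample for testing exactly once, which is one reason accuracies obtained under the two protocols are not directly comparable.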
We have attempted to objectively identify the optimal biosignal for assessing driving stress. The results consistently showed, using longitudinal, transitional, and multiple-comparison analyses, that ECG is the optimal biosignal for a driving stress assessment. Note that, to our knowledge, this is the first study that has rigorously ranked multiple biosignals using different ML techniques. One must also acknowledge several factors that may have prevented previous studies from reaching the same finding, such as the exclusion of some biosignals from the analysis, the rigor and clarity of the cross-validation, the examination of different driving statuses such as (R1 vs. C1), (H1 vs. C1), (R1 vs. C2), (H2 vs. C2), and (R2 vs. C3), the examination of the time-series data itself, and the consideration of all data points in each driving zone.

V. PRACTICALITIES OF STUDY FINDINGS
The practicality of collecting ECGs and the long-term implications of doing so can only be speculated at this point. It is conceivable that this technology could help individuals to manage stress when driving, which in turn could reduce accident risk.
1) ECG Sensor Placement: Many attempts have been made in the past to monitor a driver's visual and cognitive distractions [27]. Yet most of these techniques have not become practical applications (i.e., integrated into smart cars). Perhaps this is due to the difficulty of placing the ECG sensor in the car and the questionable robustness of detecting stress during driving. We propose mounting the ECG electrodes on the steering wheel to monitor stress while driving. This approach seems to be more convenient and has a good chance of being accepted by the National Highway Traffic Safety Administration, as it will not distract the driver.
2) Improve Driving Experience: When a smart car with an integrated ECG sensor detects stress, the car could recommend playing relaxing music or changing the vehicle temperature, as suggested in a previous study [28]. Adding an indicator light on the dashboard could provide feedback to drivers, perhaps increasing the sense of control they have over their driving state. For example, if the light on the dashboard is green, indicating low stress, the driver may in turn feel calmer and more emotionally stable. In contrast, a red light would indicate that the driver is in a high-stress state, and the smart car could recommend pulling over while playing calming music.
3) Reducing Insurance Costs: Maintaining a consistently low stress level while driving could help to build a solid credit history over the long term, which consequently could lower insurance costs. Insurance companies usually offer discounts to policyholders who have not had any accidents or moving violations for a certain period of time. Similarly, decreasing stress levels over a certain amount of time may lead to further discounts.

VI. LIMITATIONS OF STUDY AND FUTURE WORK
One of the main limitations of the current study is the small sample size. The next step is to test the same methodology on a different database and see if the same results are achieved. However, the focus of the current study was not on improving the detection rate of driving stress. Rather, the focus was on ranking biosignals. Thus, different existing ML algorithms were used to rank the biosignals. One may question the difference between the best performance (75.02% with ECG) and worst performance (64.02% with HR) when it comes to detecting driving stress. At first glance, this difference does not appear to be particularly significant. However, our aim was to identify which biosignal is more informative and sensitive in detecting driving stress, for example, determining whether the ECG signal is more informative than HR, or vice versa. Even though the purpose of the present study was to rank the biosignals, the classification results outperformed those of other recently published works that focused on driving stress detection.
If the purpose of the current study was to improve the detection of driving stress using biosignals, specifically ECG, then more features would need to be extracted, or deep learning would need to be applied. Another approach that could be used to improve the detection of driving stress would be to combine multiple biosignals.
For future studies, we recommend collecting hemodynamic parameters (e.g., blood pressure levels and dilation of pupils) and conducting psychological driving stress tests during the collection of biosignals. This will improve our understanding of driving stress levels and their correlation with changes in biosignals.

VII. CONCLUSION
This paper has identified the biosignal that best determines driving stress by examining six biosignals: ECG, EMG, Plantar GSR, Palmar GSR, HR, and RESP. Without prior training or knowledge, the INTENSE method (aggregating three ML methods: IPCA, CBC, and KMC) was able to detect a complex dynamic between different biosignals and the driving stress marker. All ML methods agree that a significant correlation exists between the driving stress marker and ECG. Moreover, when supervised ML methods were used, ECG scored the highest overall accuracy for detecting stress in different driving scenarios. One interesting result is that the HR did not correlate well with the driving stress marker. These findings are timely because, more than ever, ECG-based wearable devices are being used on a large scale in different scientific fields. The results of this study can contribute to recommendations for the use of ECG signals as an informative means of measuring driving stress and anxiety. Using this newfound knowledge together with artificial intelligence, we can better use ECGs not only to monitor cardiac abnormalities but also to assess driving stress.