Reliability of EEG Measures in Driving Fatigue

Reliability investigation of measures is important in studies of brain science and neuroengineering. However, the reliability of measures has not been investigated across brain states, leaving unknown how reliable the measures are in the context of the change from the alert state to the fatigue state during driving. To fill this gap, we performed a comprehensive investigation. A two-session experiment with an interval of approximately one week was designed to evaluate the reliability of the measures at both sensor and source levels. The results showed that the average intraclass correlation coefficients (ICCs) of the measures at the sensor level were generally higher than those at the source level, except for the directed between-region measures. Single-region measures generally exhibited higher average ICCs relative to between-region measures. The exploration of brain network topology showed that nodal metrics displayed highly varying ICCs across regions and that global metrics varied in association with nodal metrics. Single-region measures displayed higher ICCs in the frontal and occipital regions, while the between-region measures exhibited higher ICCs in the area involving frontal, central, and occipital regions. This study provides an appraisal of the measures' reliability over a long interval, which is informative for measure selection in practical mental monitoring.

road environment reduces driver's vigilance, further resulting in driving fatigue [2]. To date, various psychophysiological signals have been used to assess fatigue and EEG is a relatively reliable and easily-used indicator for fatigue [3], [4]. When selecting a measure, its reliability over time is important as high reliability ensures that driving fatigue can be correctly and accurately assessed. Previous studies only investigated the reliability of single-region measures during different episodes of fatigue [5], while between-region measures have not yet been investigated. This requires a comprehensive investigation of all measures to address how reliable each measure is and compare the reliability between each of them in terms of identifying driving fatigue.

Early EEG studies utilized individual-region measures, such as entropy [ [10], [11] to assess driving fatigue. A study reported decreased sample entropy in the occipital region during driving fatigue [6]. Similar decreases during fatigue were found in central, parietal, occipital regions using entropy. Considering different frequency bands relevant to driving fatigue, previous studies using EEG spectral power reported distinct changes from alert to fatigue. Spectral power in theta and alpha bands increased during fatigue while spectral power in beta band decreased [8], [9], [10]. Increases of spectral power in theta band were found in frontal, central and occipital regions [4], [11]. Spectral power in alpha band increased in central, parietal, occipital, and temporal regions during fatigue [4], [10], [11]. Decreases of beta band during fatigue were observed in frontal, central, temporal, parietal, and occipital regions [4], [10], [11]. Although changes in delta and gamma bands during fatigue have been reported, more prominent changes were frequently reported in theta, alpha, and beta bands [12], [13].

Between-region measures have been increasingly used and widely applied to diverse neuroimaging studies, such as motor imagery performance prediction [14], schizophrenia identification [15], and fatigue identification [16], [17], [18], [19], [20], [21]. Increases of mean phase coherence in frontal and parietal regions were found in the delta and alpha bands under fatigue [17]. In another study, interhemispheric connections in alpha band showed an increase while higher connection strengths were observed for interhemispheric frontal and occipital connections relative to interhemispheric central, parietal, and temporal connections during fatigue [18]. Graph metrics have been utilized to capture the properties of brain functional connectivity during fatigue [19], [20]. In a study using ordinary coherence, total synchronization strengths (in the frequency range of 0.5-30 Hz) in frontal, central, and temporal regions, mean degree in delta and theta, and

region or between-region measures separately, neglecting the comparison between the two categories.

Thirty healthy students, 18 males and 12 females (age: 23.17 ± 2.72 years, mean ± standard deviation), were recruited from the National University of Singapore. All subjects reported normal or corrected-to-normal vision, with no history of substance addiction or mental disorders. The subjects were required to obtain a full night (>7 h) sleep before the day of the experiment. On the day of the experiment, they were required to avoid consuming caffeine or alcohol. Each subject signed a consent form and was trained to familiarize themselves with the driving equipment before the start of the experiment. The driving simulation was conducted using Logitech G27 Racing Wheel set and Carnetsoft Driving Simulator (http://cs-driving-simulator.com) software.
The subjects were instructed to drive a car following a guiding car and to brake as soon as the red taillights of the guiding car lit. Each subject completed two identical driving sessions of 90 minutes, with an interval of approximately one week. The experiment was reviewed and approved by the institutional review board of the National University of Singapore.

Brain activity was recorded as EEG using wireless EEG recording equipment with 24 dry electrodes (Cognionics, Inc., USA), with a sampling rate of 250 Hz. The impedances of all EEG channels were kept below 20 kΩ. The EEG channels were referenced to the linked mastoids. Preprocessing steps were performed to remove artifacts. Firstly, all EEG channels were rereferenced using common average reference (an alternative reference is infinity [27], [28]). The EEG channels having poor contact with the scalp were removed and then interpolated using the signals from their adjacent channels. The last 5-min portion of EEG was discarded due to the change of the simulation phase into free driving where there was no guiding car. The EEG signals were band-pass filtered at 0.5-45 Hz. The processed signals were segmented into epochs of a 2-second period. Abnormal epochs containing values with more than 5 times standard deviation from the mean probability distribution were removed using EEGLAB [29]. Based on the self-reported confirmation of fatigue after the experiment and the increased reaction time at the end of the experiment, the epochs between the 0th and 15th minute and between the 70th and 85th minute were considered as alert and fatigue samples, respectively. Four subjects with an insufficient number of alert and fatigue epochs in either session after epoch rejection were excluded from further analysis.
For the remaining subjects, the remaining epochs were decomposed into components using independent component analysis (ICA). ICA components representing artifacts were removed and the remaining components were used to reconstruct clean EEG epochs. Clean EEG epochs were then obtained in the first session (alert: 391.12 ± 51.81, fatigue: 340.35 ± 91.56) and the second session (alert: 377.15 ± 54.67, fatigue: 363.50 ± 70.93) of the experiment.
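The epoching steps described above can be illustrated with a minimal sketch for a single channel. It is not the EEGLAB procedure used in the study: the rejection rule here is a simplified amplitude criterion (any sample more than 5 standard deviations from the global mean), whereas the study applied a probability-based criterion in EEGLAB. The function names `segment_epochs`, `label_epochs`, and `reject_epochs` are hypothetical and introduced only for illustration.

```python
from statistics import mean, pstdev


def segment_epochs(signal, fs=250, epoch_sec=2.0):
    """Split a 1-D signal into non-overlapping epochs; a trailing remainder is dropped."""
    n = int(round(fs * epoch_sec))
    return [signal[i:i + n] for i in range(0, len(signal) - n + 1, n)]


def label_epochs(n_epochs, epoch_sec=2.0, alert=(0, 15), fatigue=(70, 85)):
    """Label each epoch by its start time in minutes: 'alert', 'fatigue', or None."""
    labels = []
    for k in range(n_epochs):
        start_min = k * epoch_sec / 60.0
        if alert[0] <= start_min < alert[1]:
            labels.append("alert")
        elif fatigue[0] <= start_min < fatigue[1]:
            labels.append("fatigue")
        else:
            labels.append(None)
    return labels


def reject_epochs(epochs, n_sd=5.0):
    """Keep epochs whose samples all lie within n_sd standard deviations of the global mean.

    Assumption: a simplified stand-in for the probability-based rejection in EEGLAB.
    """
    flat = [x for ep in epochs for x in ep]
    mu, sd = mean(flat), pstdev(flat)
    return [ep for ep in epochs if all(abs(x - mu) <= n_sd * sd for x in ep)]
```

With 2-second epochs, each 15-minute window contains at most 450 epochs per subject, which is consistent with the clean epoch counts reported above once rejected epochs are subtracted.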

The exact low resolution brain electromagnetic tomography (eLORETA) [30] was used in this study to transform the EEG signals at sensor level to the cortical current source densities.

The head model of eLORETA was based on the Montreal Neurological Institute average MRI brain map (MNI152) [31]. The solution space was restricted to the cortical gray matter

The single-region and between-region measures were obtained for alert and fatigue epochs. The average difference between alert and fatigue across epochs of each subject was then computed separately for the first and second sessions. See the box named 'Compute Reliability' depicted in Fig. 1

where $N$ is the number of difference values for each session, and $x_1$ and $x_2$ are the difference values for session 1 and session 2, respectively. ICC was set to zero when it was a negative value.
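The reliability computation described above can be sketched as follows. The exact ICC formula used in the study is not recoverable from this text, so the sketch assumes a two-way consistency form, ICC(3,1), for two sessions; the sign convention of the change (fatigue minus alert) is also an assumption. The function names `alert_to_fatigue_change` and `icc_consistency` are hypothetical.

```python
from statistics import mean


def alert_to_fatigue_change(alert_values, fatigue_values):
    """Per-subject change of a measure: mean over fatigue epochs minus mean over alert epochs.

    Assumption: the direction (fatigue minus alert) is chosen for illustration.
    """
    return mean(fatigue_values) - mean(alert_values)


def icc_consistency(x1, x2):
    """ICC(3,1) across two sessions, with negative values set to zero.

    x1, x2: per-subject difference values for session 1 and session 2.
    Assumption: the two-way consistency form is used here for illustration only.
    """
    n, k = len(x1), 2
    if n != len(x2) or n < 2:
        raise ValueError("x1 and x2 must have the same length (at least 2)")
    grand = (sum(x1) + sum(x2)) / (n * k)
    row_means = [(a + b) / k for a, b in zip(x1, x2)]
    col_means = [sum(x1) / n, sum(x2) / n]
    ss_total = sum((v - grand) ** 2 for v in list(x1) + list(x2))
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between sessions
    ss_err = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    denom = ms_rows + (k - 1) * ms_err
    if denom == 0:
        return 0.0  # no variance at all; reliability is undefined, report zero
    return max(0.0, (ms_rows - ms_err) / denom)
```

In this form a constant offset between sessions does not lower the ICC, since only the consistency of the subject ordering and spacing is assessed.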

To determine whether the ICC distributions of the single-region and between-region measures were different, the Kruskal-Wallis test was conducted for the sensor level measures and source level measures separately. The Wilcoxon signed-rank test was conducted for the comparison between SE and PSD and between PLI and PDC, while the Wilcoxon rank-sum test was conducted for the comparison between SE/PSD and PLI/PDC.
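As an illustration of the omnibus test, the Kruskal-Wallis statistic can be computed as in the minimal sketch below. It uses the usual chi-square approximation with tie correction and is not the software used in the study; the post-hoc Wilcoxon tests are not covered. The function names `average_ranks`, `chi2_sf`, and `kruskal_wallis` are hypothetical.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with positive integer df."""
    if x <= 0:
        return 1.0
    y = x / 2.0
    # Start from Q(1, y) or Q(1/2, y), then apply Q(a+1, y) = Q(a, y) + y^a e^-y / Gamma(a+1).
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))
    while a < df / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q


def kruskal_wallis(*groups):
    """Return (H, p) for two or more groups of values, e.g. ICCs of different measures."""
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * total - 3.0 * (n + 1)
    # Tie correction: divide H by 1 - sum(t^3 - t) / (n^3 - n).
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    correction = 1.0 - sum(t ** 3 - t for t in counts.values()) / float(n ** 3 - n)
    if correction > 0:
        h /= correction
    return h, chi2_sf(h, len(groups) - 1)
```

A call such as `kruskal_wallis(icc_se, icc_psd_theta, icc_pli_alpha)`, with each argument a list of ICC values for one measure (hypothetical variable names), returns the statistic and its approximate p-value, which can then be compared against the 0.05 threshold.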

The ICCs of the measures at sensor level and source level were depicted in Fig. 2. Kruskal-Wallis test showed significant differences (p<0.05) among the measures at both sensor level and source level. The results of the post-hoc tests between the measures were shown in Fig. 3 and Fig. 4 for the sensor level and source level measures, respectively.

At sensor level, PSD theta had the highest mean ICC and its ICCs were significantly higher than the ICCs of the other measures. PSD alpha had the second highest mean ICC and its ICCs were significantly higher than the ICCs of the other measures except for sample entropy, PSD beta, and PLI alpha, which had similar mean ICCs. While PLI alpha was significantly higher than PLI theta, PLI beta, and PDC measures, sample entropy and PSD beta were significantly higher than PLI theta and PLI beta only. The lowest mean ICCs were found for PLI theta and PLI beta.

At source level, PSD alpha had the highest mean ICC and its ICCs were significantly higher than those of the other measures. PLI measures had the lowest mean ICCs and their ICCs were

the highest ICC, followed by PDC alpha and PDC theta, and the differences were significant.

The ICCs of the global and nodal metrics computed from between-region measures were listed in Table I and Table II

TABLE I  RELIABILITY OF THE GLOBAL METRICS OF PLI AND

White boxes indicate the non-significant differences while the colored boxes refer to the significant differences.

alpha at source level had lower ICCs than the respective metrics at sensor level. The graph metrics of PLI beta showed poor ICCs while the graph metrics of PLI theta at source level generally had higher ICCs than the corresponding metrics at

The selected measures at source level were depicted in Fig. 8, Fig. 9, and Fig. 10 for single-region measure, nodal metric, and between-region measure, respectively. In Fig. 8,

In this study, we comprehensively investigated the reliability of the EEG measures for driving fatigue identification. Our study explored the reliability of measure changes, instead of measure values, to evaluate the consistency of the changes from alert to fatigue. We estimated the reliability across two sessions with a long interval in between, instead of two episodes within a session, since such estimation is closer to the practical use of fatigue detection, which requires reliable performance across days of operation. We compared the reliability of the single-region measures with the between-region measures and discussed the results in detail below.

From the single-region measures at sensor level, we observed differences in the ICCs of PSD measures relative to SE. Among the single-region measures, PSD theta (significant, p<0.05) and PSD alpha (not significant, p>0.05) had higher mean ICCs relative to SE while PSD beta had lower mean ICCs (not significant, p>0.05). A previous spectral EEG study also found that EEG activity in theta band had the highest correlation coefficients between two episodes of driving fatigue, followed by that in alpha and beta bands [5]. At source level, the ICCs of PSD alpha were significantly higher than those of PSD theta, PSD beta, and SE. In this study, higher ICCs were found at lower frequency bands. This might reflect the distinct consistencies of the single-region measures in particular frequency bands during driving fatigue.

At both sensor level and source level, single-region measures generally had higher mean ICCs than individual connections from between-region measures. This observation might indicate the difference between the regional activities and inter-regional interactions in identifying brain state changes. While the consistency of regional activities from alert to fatigue depends only on the individual regions, the consistency of inter-regional interactions relies on the changes involving any two regions. This more complex mechanism in between-region interactions might be reflected by their overall lower reliability relative to the reliability of the regional activities.

The ICCs of the measures at sensor level were generally higher than those at source level, except for PDC measures. For single-region measures, a higher percentage of regions with ICC > 0.4 was found at sensor level relative to source level, suggesting that single-region measures are more reliable at sensor level compared to those at source level. This finding is in agreement with the previous statement in a study investigating the reliability of EEG measures at both sensor and source levels [22], probably due to the volume conduction effect at the sensor level which was highly repeatable across sessions and subjects [24], [37].

At sensor level, PLI measures generally exhibited lower mean ICCs except for PLI alpha. In the previous study comparing MEG-based between-region measures [24], phase-based measures also showed relatively lower reliabilities. The low ICCs of PLI might be caused by its method of minimizing the volume conduction effect [24] and relying on subtle properties of the signals which were harder to estimate and more variable across subjects [37]. Compared to the other between-region measures, PLI alpha at sensor level displayed higher

median of inter-regional PLI in alpha band relative to that in theta and beta bands [25]. Connectivity in alpha band was also reported as relatively more dominant and reliable compared to that in the other bands for driving fatigue assessment [12]. The dominance of PLI alpha at sensor level might suggest the high consistency of the measure in identifying fatigue.
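To make concrete how PLI discounts zero-lag coupling, the minimal sketch below computes PLI from two instantaneous phase series using the standard definition, the absolute value of the mean sign of sin(Δφ). The phase extraction step (for example, band-pass filtering followed by a Hilbert transform) is assumed to have been done beforehand, and the function name `phase_lag_index` is hypothetical.

```python
import math


def phase_lag_index(phases_a, phases_b):
    """PLI between two equally long series of instantaneous phases (in radians).

    Zero-lag (or exactly anti-phase) coupling, as expected from volume conduction,
    contributes sign(sin(0)) = 0 and therefore does not raise the index.
    """
    if len(phases_a) != len(phases_b) or not phases_a:
        raise ValueError("phase series must be non-empty and of equal length")
    total = 0
    for pa, pb in zip(phases_a, phases_b):
        s = math.sin(pa - pb)
        total += (s > 0) - (s < 0)  # sign of the phase difference
    return abs(total) / len(phases_a)


# Two signals with a consistent quarter-pi lag versus two identical signals.
phase = [0.1 * i for i in range(200)]
lagged = [p - math.pi / 4 for p in phase]
pli_lagged = phase_lag_index(phase, lagged)
pli_zero_lag = phase_lag_index(phase, phase)
```

Because the index depends only on the sign of the phase difference, small phase differences near zero can flip sign with noise, which is one plausible reading of why phase-based measures are harder to estimate consistently.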

Particular regions were observed with higher ICCs compared to the other regions. In PSD theta at sensor level, higher ICCs were found in frontal and occipital regions as shown on the left panel of Fig. 6. At source level, right frontal, right temporal, right parietal and occipital regions of PSD alpha displayed higher ICCs relative to the other regions, depicted in Fig. 8. Based on the nodal metrics, regions from right frontal to right occipital showed high ICC values at sensor level for PLI alpha (nodal efficiency), shown on the right panel of Fig. 6. Frontal, temporal, and occipital regions had higher ICC values relative to the other regions for PLI theta (nodal clustering coefficient) at source level, shown in Fig. 10. Based on the results, higher ICCs were mainly found in frontal and occipital regions. In the previous studies, power increases during fatigue have been reported in occipital region [11], [39], [40], [41] and in frontal region [10], [11], [42], [39]. These regions might be more sensitive to induced fatigue, as they are involved in the pathophysiology of chronic fatigue [43] and cognitive control [44], [45] (frontal) as well as visual processes (occipital).

For the PLI alpha at sensor level (see Fig. 7), the connections having high ICCs were observed between frontal and central and between central and parietal/occipital regions. The previous study using transfer entropy also revealed connectivity changes around central and parietal regions during a transition state from high to low vigilance level [44]. These

with sustained visual attention [46], [47].

To conclude, this study presented the reliability of the proposed measure changes between alert and fatigue states.