Fatigue Detection in SSVEP-BCIs Based on Wavelet Entropy of EEG

Among the various types of brain-computer interfaces (BCIs), those based on the steady-state visually evoked potential (SSVEP) can provide a high information transfer rate (ITR). However, users may suffer serious fatigue, which can cause discomfort, health hazards and deterioration of system performance. To overcome the fatigue obstacle, the first step is to detect fatigue accurately, reliably and quickly. This paper proposes an approach based on the wavelet entropy of the measured EEG for real-time fatigue detection during SSVEP-BCI use. Specifically, wavelet analysis is first applied to the EEG, yielding approximation and detail components at different levels. The sample entropy values of these components are then calculated to generate features for classification. Experimental results identified the entropy of the lower frequency components (0-4.6875 Hz) as the most important feature. The proposed wavelet entropy improved the fatigue detection accuracy from 65.1% with the traditional entropy method to 87.7% when distinguishing subjects' mental states between alert (before task) and fatigue (after task). Furthermore, the detection accuracy based on multiple state-of-the-art conventional fatigue indices improved from 91.9% to 96.5% when the delta band amplitude was replaced with the new wavelet entropy feature.


I. INTRODUCTION
Brain-computer interfaces (BCIs) provide alternative communication channels between the human brain and external devices. Compared with other types of non-invasive BCIs, steady-state visually evoked potential (SSVEP) based BCIs have advantages such as a relatively high signal-to-noise ratio (SNR), a high information transfer rate and a low training requirement [1]-[4].
Nevertheless, SSVEP-BCIs still have several disadvantages that need improvement, such as user fatigue. To elicit SSVEP signals, users need to focus on flickering visual stimuli. As a consequence, the visual stimulation can easily induce fatigue in the users [5]. Fatigue brings health-related hazards [6]-[8] and may also degrade BCI performance [9], [10].
The associate editor coordinating the review of this manuscript and approving it for publication was Alessandra Bertoldo.
Some techniques have been applied to reduce user fatigue in SSVEP-BCIs, such as the use of high-frequency or high-duty-cycle visual stimuli. Nevertheless, there is usually a tradeoff between higher system performance and lower user fatigue [11]-[14]. To achieve a better balance between alleviating fatigue and improving system performance, it is important to detect and evaluate user fatigue.
VOLUME 9, 2021. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
Among the various methods for fatigue detection or evaluation, subjective fatigue scales are the most widely accepted. Although subjective fatigue assessment is effective [15]-[18], it is not suited to online BCI applications. Objective fatigue detection or evaluation via physiological signals such as EEG is the most popular choice for such scenarios. Some studies have proposed fatigue indices based on spectral analysis, such as the fast Fourier transform (FFT). Using the FFT generally assumes that the EEG signals are stationary. In practice, however, EEG signals are usually non-stationary owing to changes in the subjects' physiological states, so this assumption is not always fulfilled. FFT-based fatigue indices have been proposed in many studies for fatigue evaluation or detection in driving [19]-[23] or when watching 3D TV [24]-[26]. In [27], fatigue indices based on spectral analysis of EEG were investigated in an SSVEP-BCI application. However, the behavior of EEG power spectral components is inconsistent across studies, as noted in [25] and [28]. This inconsistency makes fatigue indicators based on spectral analysis of EEG hard to interpret, especially at the individual level.
Entropy calculation, on the other hand, does not require the signal to be stationary. The complexity of EEG measured by entropies has been reported to be associated with fatigue changes in driving scenarios and in 3D TV experiments [29]-[31]. Roughly speaking, when subjects become fatigued, the complexity of their EEG decreases. While entropy-based indices have been consistent across studies, their fatigue detection accuracy does not seem high enough for practical application at the individual level. To increase the detection accuracy, several methods have been used to aid or improve the entropy calculation. The study [28] applied multiscale entropy (MSE) to evaluate fatigue in an SSVEP-BCI task and achieved an accuracy of up to 97%. In [32], multiple entropy calculation methods combining EEG and EOG were applied to driving fatigue classification, resulting in an average accuracy of around 99%. The study [33] combined multiple EEG channels with various entropy calculations and reached a highest accuracy of 97.5% for driving fatigue detection. These methods provide good classification performance. However, computing the MSE is usually expensive (for instance, the calculation can take a few minutes), while the other two approaches [32], [33] require more than one EEG channel or even EOG measurement.
This paper aims to improve the performance of fatigue detection based on EEG complexity in SSVEP-BCI tasks. The complexity of the EEG is measured by the sample entropy, which was designed specifically for physiological signals such as EEG, as proposed in [31].
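For readers unfamiliar with the measure, sample entropy can be sketched as follows. This is a minimal, unoptimized illustration only: the embedding dimension m = 2 and the tolerance r = 0.2 times the signal's standard deviation are commonly used defaults assumed here, not parameters reported in this paper, and the function name is hypothetical.

```python
import math


def sample_entropy(x, m=2, r_factor=0.2):
    """Sample entropy of a 1-D sequence (O(N^2) reference sketch).

    m and r_factor are assumed defaults, not values taken from the paper.
    """
    n = len(x)
    mean = sum(x) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    r = r_factor * sd  # tolerance scaled by the signal's spread

    def count_matches(length):
        # Count template pairs (i < j) whose Chebyshev distance is <= r.
        # The same n - m templates are used for both lengths.
        count = 0
        n_templates = n - m
        for i in range(n_templates):
            for j in range(i + 1, n_templates):
                if all(abs(x[i + k] - x[j + k]) <= r for k in range(length)):
                    count += 1
        return count

    b = count_matches(m)      # matches of length m
    a = count_matches(m + 1)  # matches of length m + 1
    if a == 0 or b == 0:
        return float("inf")   # undefined for too-short or too-irregular data
    return -math.log(a / b)
```

A regular signal (for example a sine wave) yields a lower value than white noise of the same length, which is the sense in which lower entropy indicates lower complexity.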
Different frequency bands have shown different relationships with fatigue [21], [24], [25], [27], especially the lower frequency bands such as the theta and alpha bands. Instead of using the FFT, we applied a wavelet decomposition technique to deal with the non-stationary EEG signals. In this study, the EEG signals were decomposed by wavelet into approximations and details at 6 levels. The performance of fatigue detection based on these approximations and details is explored in an SSVEP-BCI experiment. The results show that the new feature can improve the performance of fatigue classification. The highest classification accuracy reached 96.5%, while only one EEG channel is required and the computational cost is remarkably lower than in the previous work [28].

II. METHOD AND MATERIAL
A. SUBJECTS
Eleven students from the University of Macau, aged 21 to 29, volunteered to participate in this experiment. The participants had no history of psychiatric or neurological disorders, no addiction to drugs or alcohol, no abuse of psychotropic medication, and normal or corrected-to-normal vision. Before the experiment, the possible consequences and privacy issues were explained to the participants and written consent was obtained from each of them. The protocol of the experiment was designed according to the Declaration of Helsinki (except for preregistration) and approved by the Research Ethic Committee (University of Macau).

B. EXPERIMENTAL PROTOCOL
An SSVEP-BCI experiment was designed to induce user fatigue. A single flickering visual stimulus was used to elicit SSVEP signals at chosen frequencies. The stimulus was presented on an LCD screen with a refresh rate of 120 Hz, located about 50 cm in front of the participant. The resolution of the LCD was 1680 × 1050 and its size was 52.6 cm × 32.4 cm. The stimulus was a white square of 120 × 120 pixels, with a side length of circa 3.7 cm, on a black background. Three seconds before the stimulus was presented, a "+" was shown in the middle of the stimulus area to alert the participant. The duty cycle of the stimulus was set to 50 percent. During the experiment, the subject was told to minimize body movement, including eye blinking. The whole process was observed to make sure that the subjects focused on the stimulus.
The experimental procedure consists of 6 independent blocks. Each block begins with a subjective fatigue evaluation session, followed by a stimulus session, and ends with another subjective fatigue evaluation session. Between adjacent blocks, the subjects rested long enough to recover from fatigue, usually several minutes according to their self-reports. The stimulus session within each block lasted 180 seconds, during which one stimulus with a fixed frequency was presented to the participant. The stimulus frequencies in the 6 blocks were 8 Hz, 10 Hz, 12 Hz, 15 Hz, 20 Hz and 30 Hz, respectively. These choices cover the range of the most popular stimulus frequencies, and the LCD's refresh rate (120 Hz) is an integer multiple of each of them. A frequency indicator attached to the screen was used to verify that the frequencies were generated correctly. Each stimulus session consists of 30 trials. Each trial consists of a 3-second stimulation interval followed by a 3-second rest interval. The subjects' states in the first and last 5 trials were regarded as "alert" and "fatigue", respectively. The procedure is depicted in Figure 1. As each of the 11 subjects performed 6 blocks of the SSVEP-BCI task, there are 66 pairs of values for every fatigue index: 66 values in the alert state and 66 values in the fatigue state. Later in this study, the comparisons between the alert and fatigue states are based on the data from these 66 blocks. With an effect size of 0.5 and α < 0.05, the total sample size should be larger than 54, as calculated with the software G*Power. The 66 pairs of data are therefore sufficient for this research.
The experiment was conducted in an office room of circa 5 m × 4 m. There was no intense noise or light on site. Items or behaviors that might distract the subjects were kept to a minimum. The subjects sat in a comfortable chair during the whole experiment so that they could minimize body movement as required.
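The relationship between the refresh rate and the stimulus frequencies can be checked with a few lines of arithmetic. The sketch below assumes a frame-based on/off stimulus, which the text does not state explicitly; the function names are hypothetical.

```python
REFRESH_HZ = 120
STIM_HZ = [8, 10, 12, 15, 20, 30]

# Trial structure from the text: 30 trials of 3 s stimulation + 3 s rest.
N_TRIALS, STIM_S, REST_S = 30, 3, 3
SESSION_S = N_TRIALS * (STIM_S + REST_S)  # 180 s per stimulus session


def frames_per_cycle(refresh_hz, stim_hz):
    """Number of screen frames in one flicker period."""
    if refresh_hz % stim_hz != 0:
        raise ValueError("stimulus frequency does not divide the refresh rate")
    return refresh_hz // stim_hz


def on_off_pattern(refresh_hz, stim_hz, duty=0.5):
    """One flicker period as a list of 1 (stimulus on) and 0 (off) frames.

    For an odd frame count (15 frames at 8 Hz) a 50% duty cycle can only
    be approximated; the on-phase is rounded half up here (an assumption).
    """
    n = frames_per_cycle(refresh_hz, stim_hz)
    n_on = int(n * duty + 0.5)
    return [1] * n_on + [0] * (n - n_on)


for f in STIM_HZ:
    print(f, "Hz:", frames_per_cycle(REFRESH_HZ, f), "frames per cycle")
```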

C. REFERENCES OF FATIGUE LEVEL
To evaluate the subjects' fatigue level, two basic evaluation approaches were applied. The first is a subjective fatigue scale [15], which consisted of seven questions. Each question was rated with one of four choices: "better than usual" (score 0), "no more than usual" (score 1), "worse than usual" (score 2) and "much worse than usual" (score 3). The total fatigue score of this scale varies between 0 and 21, with a higher score indicating a higher level of fatigue. The fatigue scores served as the subjective evaluation of the participants' fatigue in this study.
The second evaluation approach is the SNR of the SSVEP signals. In the SNR definition, n was chosen as 10, z was the amplitude spectrum of the EEG calculated by the fast Fourier transform (FFT), and f was the frequency of the stimulus. A reduction of the SNR indicates a degradation of system performance, which is related to an increased fatigue level. This is supported by [34]-[36], which show that the subject's attention level, arousal, mental state and fatigue have a remarkable influence on the amplitude and SNR of the elicited SSVEP signal. A decreased arousal level or concentration caused by fatigue can significantly worsen the signal quality [9], which is usually measured by the SNR. The study [27] explored the relationship between the fatigue level and the SNR of the SSVEP and found that a significant decrease of the SNR is associated with rising fatigue.
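The SNR expression itself is not reproduced in the text above. A form often used in the SSVEP literature, and consistent with the symbols n, z and f, is the ratio of the amplitude at the stimulus frequency to the mean amplitude of the n neighbouring frequency bins. The sketch below implements that form as an assumption; it is not necessarily the exact expression used in this study, and the function names and the 2048-point zero-padded spectrum are illustrative choices.

```python
import cmath
import math


def dft_amplitude(x, k, n_fft):
    """Amplitude of DFT bin k for signal x, implicitly zero-padded to n_fft."""
    w = -2j * math.pi * k / n_fft
    return abs(sum(v * cmath.exp(w * i) for i, v in enumerate(x)))


def ssvep_snr(x, fs, f_stim, n=10, n_fft=2048):
    """Assumed SNR form: n * z(f) / sum of z over the n adjacent bins.

    n // 2 bins are taken on each side of the bin closest to f_stim.
    """
    k0 = round(f_stim * n_fft / fs)
    half = n // 2
    target = dft_amplitude(x, k0, n_fft)
    neighbours = sum(
        dft_amplitude(x, k0 + d, n_fft)
        for d in range(-half, half + 1)
        if d != 0
    )
    return n * target / neighbours
```

With this form, a value near 1 means the stimulus frequency does not stand out from its spectral neighbourhood, and larger values indicate a clearer SSVEP response.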

D. EEG MEASUREMENT
The EEG signal was measured at the Oz site. The location of the electrode was determined according to the international 10-20 system. The reference and ground were placed at the left earlobe and the middle of the forehead, respectively. The signal was recorded through a g.tec amplifier (g.USBamp, Guger Technologies, Graz, Austria) and filtered by a 50 Hz notch filter. The sampling frequency was 600 Hz. The impedance was kept below 10 kΩ. The data were examined manually for unexpected artifacts. Specifically, if more than 30% of the data points in a segment exceeded a predefined threshold of 150 µV (most points are smaller than 50 µV), the whole segment was regarded as an artifact and removed. Isolated data points exceeding 150 µV were also identified by the threshold as artifacts and eliminated.
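The thresholding rule described above can be sketched as follows. Two details are assumptions made for illustration: the threshold is applied to the absolute amplitude, and isolated outliers are marked as missing (None) rather than interpolated or dropped. The function name is hypothetical.

```python
THRESHOLD_UV = 150.0   # amplitude threshold from the text
MAX_FRACTION = 0.30    # reject the segment if more than 30% of points exceed it


def clean_segment(segment_uv, threshold=THRESHOLD_UV, max_fraction=MAX_FRACTION):
    """Apply the two-stage artifact rule to one EEG segment (values in µV).

    Returns None if the whole segment is rejected; otherwise returns a copy
    in which isolated supra-threshold points are replaced by None.
    """
    n_bad = sum(1 for v in segment_uv if abs(v) > threshold)
    if n_bad > max_fraction * len(segment_uv):
        return None  # whole segment treated as artifact
    return [v if abs(v) <= threshold else None for v in segment_uv]
```

For example, a segment in which one of four samples exceeds 150 µV (25%) is kept with that sample marked as missing, whereas a segment with two of four samples above the threshold (50%) is rejected.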

E. SIGNAL PRE-PROCESSING
The Daubechies wavelet of order 4 (db4) was chosen to decompose the EEG signal because of its efficiency [26]. The number of decomposition levels was 6, as shown in Figure 2. The frequency bands most often associated with fatigue, such as the theta and alpha bands [8], [21], [27], lie mainly within the regions of D6 and D5 and partly in A6. The most important frequency range for SSVEP-BCIs is between 6 Hz and 40 Hz, which lies within the regions of D6, D5 and D4. Therefore, the current level of decomposition is sufficient for our application. Additional decomposition could be of interest, but it leads to a shorter data length and may cause problems in the sample entropy calculation.
Only the EEG recorded during the stimulation intervals was used, because the subjects were allowed to blink or move slightly during the resting intervals. The sample entropy values of the EEG components within the alert and fatigue states (each containing 5 trials) were calculated as fatigue indices. As shown in Figure 2, the data length for the calculation is 1800 points, since the stimulation interval within each trial lasts 3 seconds. To minimize the influence of random factors, the index values were averaged over the 5 trials within each state. As stated before, the first and last 5 trials in each block were regarded as "alert" and "fatigue", respectively. The average value of the index over the 1st to 5th trials is used as the index value in the alert state, and the average over the 26th to 30th trials as the index value in the fatigue state.
Some conventional fatigue indices based on the FFT were also calculated from the same data in the alert and fatigue states; the data length for the FFT calculation is therefore 1800. As the data length is not a power of 2, the window length of the FFT calculation was chosen as 2048, and the missing samples were filled with zeros by MATLAB automatically. The spectral indices were calculated for the delta, theta, alpha and beta bands, with the frequency range {13, 30} Hz for the beta band. A notch filter was applied to suppress the amplitude of the EEG signals at the stimulus frequencies, so that the influence of the SSVEP amplitude was minimized.
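The frequency ranges quoted for the wavelet components follow from repeatedly halving the Nyquist band. The sketch below computes these nominal band edges for a 600 Hz sampling rate and 6 levels. It also estimates how the number of coefficients shrinks at each level, assuming a padding-based discrete wavelet transform with the 8-tap db4 filter; that length formula is an assumption about toolbox behaviour, not something stated in the text. Real wavelet filters are not brick-wall filters, so the bands are approximate.

```python
def dwt_bands(fs, levels):
    """Nominal frequency bands (Hz) of approximations A_j and details D_j."""
    nyquist = fs / 2
    bands = {}
    for j in range(1, levels + 1):
        bands[f"D{j}"] = (nyquist / 2 ** j, nyquist / 2 ** (j - 1))
        bands[f"A{j}"] = (0.0, nyquist / 2 ** j)
    return bands


def coeff_lengths(n_samples, levels, filter_len=8):
    """Coefficient count per level, assuming floor((n + filter_len - 1) / 2)."""
    lengths = []
    n = n_samples
    for _ in range(levels):
        n = (n + filter_len - 1) // 2
        lengths.append(n)
    return lengths


bands = dwt_bands(fs=600, levels=6)
for name in ["A6", "D6", "D5", "D4"]:
    print(name, bands[name])
print(coeff_lengths(1800, 6))
```

Under these assumptions A6 spans 0 to 4.6875 Hz and D6 spans 4.6875 to 9.375 Hz, and a 1800-point trial is reduced to a few dozen coefficients at level 6, which illustrates why further decomposition would leave too few points for a stable entropy estimate.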
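The zero-padding step and the resulting frequency resolution can be illustrated with a short calculation. The helper names are hypothetical, and treating both band edges as inclusive is an assumption.

```python
def next_pow2(n):
    """Smallest power of two that is >= n."""
    p = 1
    while p < n:
        p *= 2
    return p


def band_bins(f_lo, f_hi, fs, n_fft):
    """Indices of the one-sided FFT bins whose centre frequency lies in [f_lo, f_hi]."""
    df = fs / n_fft  # bin spacing in Hz
    return [k for k in range(n_fft // 2 + 1) if f_lo <= k * df <= f_hi]


fs, n_samples = 600, 1800
n_fft = next_pow2(n_samples)          # 2048
beta = band_bins(13, 30, fs, n_fft)   # bins covering the beta band
print(n_fft, fs / n_fft, beta[0], beta[-1])
```

With 1800 samples padded to 2048 points, the bin spacing is 600/2048 ≈ 0.293 Hz, so the beta band covers bins 45 to 102 of the spectrum.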
A concise description of the signal processing is presented in Figure 3.

III. RESULTS
The purpose of this study is to propose a new wavelet entropy index, A6, and to evaluate the performance of fatigue detection in SSVEP-BCI tasks using A6. In this section, Subsection III.A provides evidence that the subjects' fatigue level did increase after the tasks. Subsection III.B shows the results of paired sample t-tests for the indices A1, ..., A6 and D1, ..., D6. Subsection III.C presents the performance of fatigue detection using the wavelet entropy indices alone, Subsection III.D compares A6 with conventional fatigue indices, and Subsection III.E proposes a classifier for fatigue detection that uses both A6 and conventional fatigue indices.

A. REFERENCE OF FATIGUE CHANGE
Cronbach's alpha is an indicator of the reliability of questionnaires [15], [38]. We applied it to the pre-block (non-fatigue state) scores of our 7-question scale. The alpha value is 0.753, which implies that the internal consistency of our scale is acceptable.
The SNR of the SSVEP signals is a generally accepted indicator of system performance. As system performance declines when the users' fatigue level increases, the SNR was used as a secondary fatigue reference. The SNRs in the alert and fatigue states were calculated and are presented in Figure 4(b). Specifically, the average SNR is the mean value over all 66 blocks.
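For reference, Cronbach's alpha for a k-item scale is k/(k-1) multiplied by one minus the ratio of the summed item variances to the variance of the total score. A minimal sketch is given below; the data layout (one list of item scores per respondent) and the function names are assumptions for illustration, and no data from this study are reproduced.

```python
def variance(values):
    """Unbiased sample variance."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


def cronbach_alpha(scores):
    """Cronbach's alpha; scores[i][j] is respondent i's score on item j."""
    k = len(scores[0])
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)
```

When all items rise and fall together across respondents the value approaches 1; values around 0.7 or higher are conventionally read as acceptable internal consistency.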
A paired sample t-test was applied to the fatigue scores before and after the SSVEP-BCI tasks to assess the significance of the subjective fatigue increase. The mean difference between the fatigue scores evaluated before and after the SSVEP tasks was −4.788, with a standard error of 0.681 and a significance level of p < 0.001.
A paired sample t-test was also applied to the SNR in the alert and fatigue states (for the definition of "alert" and "fatigue", see Subsection II.B). The mean difference of the SNR is 0.654, with a standard error of 0.163 and a significance level of p < 0.001.
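The reported mean differences and standard errors can be turned into t statistics directly, since t equals the mean difference divided by its standard error. The sketch below also shows how the statistic would be obtained from raw paired data. The critical value 3.46 is the tabulated two-sided 0.001 threshold for 60 degrees of freedom, used here as a slightly conservative stand-in for the 65 degrees of freedom of 66 pairs; function names are hypothetical.

```python
import math

T_CRIT_0001 = 3.46  # two-sided, alpha = 0.001, df = 60 (conservative for df = 65)


def paired_t(before, after):
    """Return (mean difference, standard error, t) for paired samples."""
    diffs = [b - a for b, a in zip(before, after)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    se = math.sqrt(var / n)
    return mean, se, mean / se


def t_from_summary(mean_diff, se):
    """t statistic from a reported mean difference and standard error."""
    return mean_diff / se


t_score = t_from_summary(-4.788, 0.681)  # subjective fatigue scores
t_snr = t_from_summary(0.654, 0.163)     # SNR, alert vs. fatigue
print(round(t_score, 2), round(t_snr, 2))
```

Both statistics exceed the critical value in magnitude, which is consistent with p < 0.001 for the two tests.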

TABLE 1. Classification accuracy when single index is applied for the distinguishing between alert and fatigue.
The normal distribution of the data was verified by PP-plot via the software SPSS.

B. PROPOSED WAVELET ENTROPY FEATURES
Over the 66 blocks, the average sample entropies of the approximations A1, A2, ..., A6 were significantly lower in the fatigue state than in the alert state.
A paired sample t-test was applied to the sample entropies calculated from the approximations A1, A2, ..., A6. The results show that the values of all of A1, A2, ..., A6 changed significantly, with p < 0.001. The normal distribution was verified by PP-plot using the software SPSS.

C. ALERT/FATIGUE CLASSIFICATION
A support vector machine (SVM) classifier was used to assess the fatigue features and to evaluate how well they distinguish the users' states between alert and fatigue. The SVM classifier was built with MATLAB and evaluated with 10-fold cross-validation; that is, in each fold 90% of the data were used for training and 10% for testing.
The classification accuracy using the features A1, A2, ..., A6 as well as D1, D2, ..., D6 is shown in Table 1. The sample entropy of the original EEG data was also tested, as was the classification accuracy when two indices were combined. Some results are presented in Figure 7.
An artificial neural network (ANN) based classifier was applied to evaluate the classification efficiency of the features D1, D2, ..., D6 plus A6. The ANN was built with MATLAB and contains 7 input channels, 1 hidden layer with 10 neurons and 2 output states, namely alert and fatigue. An ANN was chosen instead of alternative methods such as deep learning because the classifier operates on extracted features.
Figure 8 (ANN classifier). Inputs: D1, ..., D6, A6 in III.C; delta, theta, alpha, beta, (theta+alpha)/beta, theta+alpha+beta, theta/beta in III.D; A6, theta, alpha, beta, (theta+alpha)/beta, theta+alpha+beta, theta/beta in III.E.
The classification accuracy is 87.7% (standard deviation 10.9%) based on 10-fold cross-validation. The MATLAB function "crossvalind", which distributes the data randomly into different groups, was used to select the training and testing data.
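The random 10-fold partition can be illustrated with a short standard-library sketch. It is analogous in spirit to the random grouping described above but is not a reimplementation of the MATLAB function; the sample count of 132 (66 alert plus 66 fatigue values) follows from the text, while the function name and seed are illustrative.

```python
import random


def kfold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists for a random k-fold partition."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k nearly equal groups
    for i in range(k):
        test = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test


for train, test in kfold_indices(132, k=10):
    print(len(train), len(test))
```

Each sample is used for testing exactly once, and with 132 samples the test folds contain 13 or 14 samples, that is, roughly 10% of the data.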

D. COMPARISON WITH CONVENTIONAL FATIGUE INDICES
A comparison between the proposed feature and conventional fatigue indices based on FFT was conducted.
The accuracy values listed in Table 2 were calculated with the same methodology as the values in Table 1. The conventional fatigue indices were also used as inputs of an ANN with the same settings as presented in Figure 8, to explore their fatigue classification performance. The 10-fold cross-validation was applied in the same way. The inputs are delta, theta, alpha, beta, (theta+alpha)/beta, theta+alpha+beta and theta/beta. The input indices (theta+alpha)/beta, theta+alpha+beta and theta/beta were chosen because of their relatively high classification accuracy. The classification accuracy is 91.9% (standard deviation 5.91%) based on 10-fold cross-validation.

E. PROPOSED NEW FEATURE A6 AND CONVENTIONAL FATIGUE INDICES COMBINED
To investigate the performance of the proposed feature combined with the conventional indices, another ANN test was conducted, in which the inputs are A6, theta, alpha, beta, (theta+alpha)/beta, theta+alpha+beta and theta/beta. The delta index was replaced by A6 because their frequency ranges largely overlap. All other settings of the classifier were the same as for the previous classifier. The classification accuracy is 96.5% (standard deviation 5.49%) based on 10-fold cross-validation.

IV. DISCUSSION
As shown in Subsection III.A, the increased fatigue scores and the decreased SNR both indicate that the subjects' fatigue level increased over the experiments. The fatigue features proposed in this study also changed significantly, as shown in Subsection III.B. The sample entropy of the EEG approximations A1, A2, ..., A6 as well as the details D1, D2, ..., D6, decomposed by the Daubechies wavelet of order 4, decreased significantly as the users' fatigue increased over the SSVEP-BCI task. This supports the hypothesis that these indices are associated with the increase of the subjects' fatigue in the SSVEP-BCI task. The result is also consistent with studies [29]-[31], [37], in which decreased entropy values of EEG signals were found to be associated with fatigue.
Because large individual differences exist both in BCI performance and in the habits of scoring questions, it is very hard to compare the degree of fatigue across subjects. We can only conclude from these results that, generally speaking, the subjects' fatigue level increased after performing the SSVEP-BCI task. Therefore, only two states were used for fatigue level classification in this study, as a finer distinction between fatigue levels would not be reliable.
A major problem in detecting fatigue from physiological signals is likewise the large individual difference. Although in most cases it is possible to determine a threshold between the alert and fatigue states for each subject individually, a general or standard threshold for multiple subjects is difficult to find. For instance, within individuals the entropy values decreased in most cases; however, some subjects' "alert" entropy values are lower than other subjects' "fatigue" entropy values. This phenomenon can be recognized in Subsection III.B. As shown in Figures 4 and 5, the standard deviations of the indices are relatively large, while the paired sample t-tests show quite small p values. Therefore, the overall classification accuracy assessed by the SVM is not high; nevertheless, it still reached around 85% when A6 and D6 were used. Given that the classification accuracy using the original EEG signals is only around 65%, the proposed method has remarkably increased the performance of fatigue detection.
The differences among the approximations A1, A2, ..., A6 are not large, as they all share the lowest frequency interval, which is A6. In contrast, the details D1, D2, ..., D6, which do not cover the frequency range of A6, show much lower classification accuracy (see Table 1). According to these results, the most important information for distinguishing between alert and fatigue lies in the lower frequency bands, mostly in the A6 approximation, which corresponds to the frequency range 0-4.6875 Hz. Figure 6 shows that the classification accuracy increased when A6 was added to D1, D2, ..., D6 as inputs of the SVM. This suggests that the most important change of the EEG happens in the low frequency interval, such as the delta band. According to studies [39]-[41], delta band activity is associated with sleepiness. The results of the current study suggest that the subjects' feeling of fatigue is mainly, or at least partly, caused by sleepiness, which is similar to the feeling caused by sleep deprivation in studies [41] and [42]. In this study, the subjects' sleepiness increased after only 3 minutes on task. Similar findings were reported in study [43]: in a mental task requiring attention, the subjects' sleepiness level and performance fluctuated in cycles of a few minutes (3 min or slower). In future work, the existence of a similar fluctuation of sleepiness in SSVEP-BCI tasks could be explored in a newly designed experiment.
A feedforward neural network with one hidden layer can, in principle, learn any input-output relationship given enough neurons in the hidden layer. Generally, more difficult tasks require more neurons. Since the task here is to distinguish between two states based on 7 inputs, the current structure should be sufficient.
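To make the size of this network concrete, the sketch below builds a 7-10-2 feedforward network and runs one forward pass. The tanh hidden activation, the softmax output and the random untrained weights are assumptions for illustration only; the text does not specify the activation functions or training settings, and the function names are hypothetical.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Random weights in [-0.5, 0.5] and zero biases (untrained)."""
    return {
        "w": [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)],
        "b": [0.0] * n_out,
    }


def forward(x, hidden, output):
    """One forward pass: tanh hidden layer, softmax over the two states."""
    h = [
        math.tanh(sum(w * v for w, v in zip(row, x)) + b)
        for row, b in zip(hidden["w"], hidden["b"])
    ]
    z = [
        sum(w * v for w, v in zip(row, h)) + b
        for row, b in zip(output["w"], output["b"])
    ]
    m = max(z)  # subtract the maximum for numerical stability
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]  # [P(alert), P(fatigue)]


def n_parameters(*layers):
    """Total number of weights and biases."""
    return sum(len(l["w"]) * len(l["w"][0]) + len(l["b"]) for l in layers)


rng = random.Random(0)
hidden = init_layer(7, 10, rng)   # 7 features -> 10 hidden neurons
output = init_layer(10, 2, rng)   # 10 hidden neurons -> 2 states
print(forward([0.1] * 7, hidden, output), n_parameters(hidden, output))
```

Such a network has only 102 trainable parameters (7×10 + 10 weights and biases in the hidden layer, 10×2 + 2 in the output layer), which is small relative to the 132 feature vectors available.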
In Subsection III.D, the conventional FFT-based fatigue indices showed good performance with an ANN fatigue classifier, and the performance was further improved by replacing the delta band amplitude with the A6 feature. This suggests that the entropy change in the higher frequency range does not contribute as much as the conventional indices in those bands, whereas in the lower range, such as the delta band, the wavelet entropy performs better than the FFT-based conventional indices.
Different frequency bands of the EEG have been found to be associated with different kinds of brain activity; for instance, the delta band is related to sleepiness, while the theta and alpha bands are related to attention and the cortical arousal level [39], [47], [48]. The entropy value of the raw EEG signal, which covers all frequency bands, is affected by too many factors. Separating the EEG into different frequency bands can therefore benefit the analysis. The FFT and wavelet decomposition are both typical methods for frequency separation. The FFT calculation requires the signals to be stationary, which means that the state of the brain needs to be stable. While the subjects were struggling with sleepiness, however, the state of the brain varied. In this case the entropy calculation, which does not require the signals to be stationary, can provide better performance.
Another advantage of using the entropy value in the lower frequency bands as a fatigue index is the relatively low computation time. As the number of data points is reduced by the wavelet decomposition, the entropy value of A6 can be calculated within one second on a standard office desktop computer, while the index proposed in the previous work [28] required a few minutes on the same computer. This improvement could benefit the implementation of an online fatigue monitoring system in practice. In future work, this method will be tested with frontal electrodes in order to accommodate the needs of patients, who tend to rest their heads against a support.

V. CONCLUSION
This study explored the performance of fatigue indices based on the wavelet sample entropy of EEG in SSVEP-BCI tasks. Over a wide frequency range, the sample entropy values of the EEG signal showed significant differences between the alert and fatigue states, while the lowest frequency range contains the most important information for fatigue detection. Using A6 (0-4.6875 Hz) and D6 (4.6875-9.375 Hz) provides a classification accuracy of up to 85%; additional input channels from higher frequency bands add little to fatigue detection. Combining the feature A6 with conventional fatigue indices as inputs of an ANN classifier improves the fatigue detection accuracy to 96.5% in a 10-fold cross-validation test.

YUFAN PENG received the M.S. degree in electrical engineering from Dresden University of Technology, Dresden, Germany. He is currently pursuing the Ph.D. degree with the Department of Electrical and Computer Engineering, University of Macau. He is concurrently teaching at Beijing Normal University, Zhuhai. His research interests include brain-computer interface and biomedical signal processing.
CHI MAN WONG received the B.S. degree in electronics information engineering from Jinan University, Guangzhou, China, and the M.S. degree from the University of Macau, where he is currently pursuing the Ph.D. degree with the Department of Electrical and Computer Engineering. His research interests include brain-computer interface and biomedical signal processing.
FENG WAN (Senior Member, IEEE) received the Ph.D. degree in electrical and electronic engineering from The Hong Kong University of Science and Technology, Hong Kong. He is currently an Associate Professor with the Department of Electrical and Computer Engineering, Faculty of Science and Technology, and also a member of the Centre for Cognitive and Brain Sciences, University of Macau, Macau. His research interests include biomedical signal processing, brain-computer interface, neurofeedback training, computational intelligence, and intelligent control.