Decoding Multi-Class EEG Signals of Hand Movement Using Multivariate Empirical Mode Decomposition and Convolutional Neural Network

Brain-computer interface (BCI) is a technology that connects the human brain and external devices. Many studies have shown the possibility of using it to restore motor control in stroke patients. One specific challenge of such BCIs is that the classification accuracy is not high enough for multi-class movements. In this study, a novel algorithm (MECN) combining Multivariate Empirical Mode Decomposition (MEMD) and a Convolutional Neural Network (CNN) is proposed to decode EEG signals for four kinds of hand movements. First, MEMD was used to decompose the movement-related electroencephalogram (EEG) signals into multivariate intrinsic mode functions (MIMFs). Then, the optimal MIMFs fusion was selected with the sequential forward selection algorithm. Finally, the selected MIMFs were input to the CNN model to discriminate the four kinds of hand movements. The average classification accuracy of thirteen subjects over six-fold cross-validation reached 81.14% for 2s-data before the movement onset and 81.08% for 2s-data after the movement onset. The MECN method achieved a statistically significant improvement over the state-of-the-art methods. The results show that the proposed algorithm can effectively decode four kinds of hand movements based on EEG signals.


Brain-computer interface (BCI) is a technology that establishes a communication system between the human brain and external devices [1]. It has been applied in many fields, such as stroke rehabilitation [2], [3], prosthetic control [4], quadcopter control [5], speech synthesis [6], and emotion recognition [7]. Techniques aiming to reconstruct hand motor function have been extensively studied [8], [9], [10] and are expected to enable rehabilitation training for stroke patients. It has been shown that task variability can improve performance in the retention session of learned motor skills and increase the generalization of learning to new skills [11]. Hence, decoding fine hand movement intention is of great value for rehabilitation training of hand function.

It is well known that hand movements can be recognized from surface electromyographic (sEMG) signals or angle sensor data with an accuracy close to 100% in some previous studies [12], [13], [14], [15], [16], [17]. However, some severe stroke patients may not have sufficiently high levels of muscle activity and find it hard to perform the normal hand movements of daily living. In that case, the sEMG signals or angle sensor data cannot provide adequate information to decode different hand movements. Hence, EEG signals were used to classify different hand movements instead of sEMG signals and angle sensor data.

This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/

[...] signals. The remainder of this paper is organized as follows: Section II introduces the experiment and the proposed algorithm in detail. Section III presents the results of the proposed algorithm on the collected data. Section IV presents the discussion. Section V concludes this paper.

II. METHODS

A. Experimental Protocol and Data Acquisition

Sixteen subjects (1 female and 15 males: 23.1 ± 2.6 years old) participated in this experiment. They were neurologically healthy and right-handed. Written informed consent was obtained from all subjects before the experiments, and the study was approved on 26 March 2020 by the Institutional Review Board at Xi'an Jiaotong University, China, ref. 2020-620. Due to the poor quality of EEG or EMG, the data of three subjects were excluded from further analysis (1 female and 2 males).

EEG experiments were carried out in a quiet room. The subjects were seated in a comfortable chair with their right arm resting on the table in front of them, approximately 0.5 m from the screen. The experimental session of each movement task consisted of 3 blocks, as shown in Fig. 1. At the beginning of each block, a five-second window on the computer screen with the word "Start" indicated that the block was about to begin, and subjects were reminded to prepare for the coming trials. One trial included a four-second cue part with the words "Ready for Tip Pinch" (or "Multiple Tip Pinch", "Hand Close", "Hand Open"), a four-second execution part in which subjects executed the movement displayed on the screen, and a six-second rest part in which subjects were allowed to relax, blink, and swallow. Each subject performed 60 trials in a session, with a 5-min rest between blocks of 20 trials.

EEG signals were recorded from 30 scalp electrodes and EMG signals were recorded simultaneously from 4 Ag/AgCl electrodes (Neuroscan Systems, Compumedics, Charlotte, NC, USA). The EEG electrodes were placed at FP1, FP2, F7, F3, FZ, F4, F8, FT7, FC3, FCZ, FC4, FT8, T7, C3, CZ, C4, T8, TP7, CP3, CPZ, CP4, TP8, P7, P3, PZ, P4, P8, O1, OZ, and O2 according to the international 10-20 system. The reference electrode was placed on the right mastoid and the ground on AFz. The EMG electrodes were placed over the extensor digitorum, extensor carpi radialis, flexor carpi ulnaris, and palmaris longus. The reference was placed at the bone of the wrist. The cue presentation and the data acquisition ran in MATLAB 2015a and Curry 8.0, respectively, on two separate computers. For synchronization, when the participant was prompted to execute an action, MATLAB sent a trigger signal through the parallel port, and the trigger was recorded by Curry 8.0 on the other computer. Signals were sampled at 1000 Hz and filtered with a passband of 0.1-100 Hz and a notch filter at 50 Hz to remove power-line artifacts, using the EEGLAB toolbox in MATLAB. The moving time window method [21] was applied to the EMG signals to determine the starting point of the movement. According to this time point, EEG signals were segmented to obtain the effective movement-related EEG data before and after movement onset.

Fig. 1. [...] Open four movements. One session consisted of three blocks, each of which comprised one start stage lasting 5 seconds and twenty trials lasting 14 seconds each. A trial consisted of a 4-s cue, which indicated to the subject the movement they were going to perform, a 4-s execution part, in which the subject executed only the specified movement, and a 6-s rest, in which the subject was allowed to relax, blink, and swallow. There was a 5-min rest between consecutive blocks to prevent fatigue. The movement onset was finally determined from the EMG using the sliding window method.

[...] the CNN features would be concatenated after being flattened [44].
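
As an illustration of the EMG-based onset detection and EEG segmentation described in the acquisition paragraph above, the following Python sketch thresholds a moving average of the rectified EMG and then cuts 2-s EEG windows before and after the detected onset. The details of the moving time window method in [21] are not given in this section, so the baseline length, the window length, the threshold factor `k`, and the function names `detect_onset` and `segment` are assumptions made for illustration and should not be read as the authors' implementation.

```python
from statistics import mean, stdev

FS = 1000  # sampling rate in Hz, as stated in the text


def detect_onset(emg, baseline_len=1000, win=50, k=3.0):
    """Return the index of the last sample of the first moving window whose
    mean rectified EMG exceeds baseline mean + k * baseline std.

    baseline_len, win and k are illustrative assumptions, not values
    taken from the paper. Returns None if no window crosses the threshold.
    """
    rect = [abs(x) for x in emg]
    base = rect[:baseline_len]
    threshold = mean(base) + k * stdev(base)
    # Running sum over rect[start:start + win], updated incrementally.
    window_sum = sum(rect[baseline_len:baseline_len + win])
    for start in range(baseline_len, len(rect) - win + 1):
        if window_sum / win > threshold:
            return start + win - 1
        if start + win < len(rect):
            window_sum += rect[start + win] - rect[start]
    return None


def segment(eeg, onset, seconds=2, fs=FS):
    """Cut windows of `seconds` before and after `onset` from multichannel EEG.

    eeg is a list of channels, each a list of samples of equal length.
    """
    n = int(seconds * fs)
    if onset - n < 0 or onset + n > len(eeg[0]):
        raise ValueError("requested window exceeds the recording")
    before = [ch[onset - n:onset] for ch in eeg]
    after = [ch[onset:onset + n] for ch in eeg]
    return before, after
```

Because this detector reports the last sample of the first supra-threshold window, the reported onset can lag the true onset by up to `win - 1` samples; a shorter window reduces the lag at the cost of greater sensitivity to noise.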

The optimal MIMFs fusion was selected by SFS strategy [...] features [42].

The specific network structure is presented in Table I. [...] were flattened into one dimension and the flattened data were fused by a concatenation layer. In the subsequent selection of the optimal MIMFs fusion, the number of constructed CNNs would equal the number of MIMFs selected. The CNN contained 3 fully connected layers with 1024, 512, and 4 nodes. The softmax function was used as the activation function of the last fully connected layer to output the results of the 4-class movement classification, and the Rectified Linear Unit (ReLU, f(x) = max(x, 0)) was used as the activation function of the other fully connected layers. Because over-fitting can occur with a small sample size, batch normalization layers were added between the convolution layer and the activation function [49]. Additionally, the dropout probability of the first two fully connected layers was set at 0.[...]

[...] Sequential forward selection (SFS), sequential backward selection (SBS), and bidirectional search are commonly used strategies in the generation procedure [53], and SFS was adopted for the optimal MIMF combination selection in this study. The evaluation function is particularly important during the search process, and classification accuracy was chosen. The stopping criterion was as follows: when the accuracy of the MIMF combination on the validation set was the highest, the MIMFs fusion was regarded as the optimal combination and the search stopped. Otherwise, the next MIMF (from MIMF1 to MIMF7) would be added and the accuracy recalculated.

4) Performance Assessment: In order to improve the generalization ability of the model, the pre-processed data were first randomly divided into a training set, a validation set, and a test set at a ratio of 3:2:1. The training set was used to train the CNN, and the validation set was used to determine the optimal MIMFs fusion and the optimal model. The algorithm was tested on the test set, and the final accuracy was averaged over the six-fold cross-validation.
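
The sequential forward selection described above can be sketched in Python as follows. This is a generic greedy SFS under stated assumptions rather than the authors' exact procedure: at each step it adds the candidate that gives the highest validation score, and it stops at the first step where no candidate improves on the current best. The `evaluate` callback is hypothetical; in the setting of this paper it would train the multi-branch CNN on the training set and return the validation accuracy for a given MIMF subset. The `toy_accuracy` function and its numbers are arbitrary placeholders used only to make the sketch runnable, not results from this study.

```python
from typing import Callable, FrozenSet, List, Sequence, Tuple


def sequential_forward_selection(
    candidates: Sequence[str],
    evaluate: Callable[[FrozenSet[str]], float],
) -> Tuple[List[str], float]:
    """Greedy SFS: grow the subset one item at a time while the score improves."""
    selected: List[str] = []
    remaining = list(candidates)
    best_score = float("-inf")
    while remaining:
        # Score every subset obtained by adding one remaining candidate.
        scored = [(evaluate(frozenset(selected + [c])), c) for c in remaining]
        step_score, step_best = max(scored, key=lambda pair: pair[0])
        if step_score <= best_score:
            break  # no candidate improves the validation score: stop
        selected.append(step_best)
        remaining.remove(step_best)
        best_score = step_score
    return selected, best_score


# Arbitrary toy scoring function (hypothetical, for demonstration only):
# each MIMF contributes a fixed gain and every added branch costs a penalty.
TOY_GAIN = {"MIMF1": 0.40, "MIMF2": 0.10, "MIMF3": 0.03, "MIMF4": 0.01,
            "MIMF5": 0.0, "MIMF6": 0.0, "MIMF7": 0.0}


def toy_accuracy(subset: FrozenSet[str]) -> float:
    return 0.25 + sum(TOY_GAIN[m] for m in subset) - 0.02 * len(subset)


mimfs = [f"MIMF{i}" for i in range(1, 8)]
chosen, score = sequential_forward_selection(mimfs, toy_accuracy)
```

With a real `evaluate`, each call implies training one CNN configuration, so the number of evaluations (at most 7 + 6 + ... + 1 = 28 for seven MIMFs) dominates the cost of the search.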
Since the amount of data was too small for deep learning [...]

As is well known, the sensorimotor cortex area of the brain is activated during ME, so it is possible to classify hand motions by analyzing the EEG signals during ME [29], [40], [54]. However, in recent years, studies [55] have shown that the action intention-related signals before ME also contain [...]

[...] BM in Fig. 3(a) means that the accuracy was calculated from the 2s-data before movement onset, and AM in Fig. 3(b) refers to the 2s-data after movement onset. From this figure, it can be observed that when the number of convolution layers was 4 and the convolution kernel size was 1 × 10 ("4-10" on the abscissa, Fig. 3(a)), the average accuracy (75.21±9.79%) was the highest among the nine network structures using data before movement onset. The accuracy of "4-10" (77.47±8.66%) in Fig. 3(b) was only 0.8% lower than that of "4-20" (78.29±8.95%). Considering that the computation time increased as the convolution kernel became larger, the network structure was fixed as "4-10" in this study.

Fig. 3. CNN network structure selection before (a) and after (b) movement onset. The abscissa represents the network structure. The number before "-" represents the number of convolution layers and the number after "-" indicates the convolution kernel size. For instance, "4-10" represents a CNN with four convolution layers, each with a convolution kernel size of 1 × 10. BM stands for before movement onset and AM stands for after movement onset.

The decoding results of multi-class hand movements for 13 subjects are shown in Table II. The classification accuracies of different EEG segments were compared, including 2s data before the movement onset (BM-2s), 1s data before the movement onset (BM-1s), 1s data after the movement onset (AM-1s), and 2s data after the movement onset (AM-2s).
It can be seen that the accuracy of the four-class hand movements of BM-2s reached 81.14%±6.76%, and the accuracy of AM-2s reached 81.08%±7.83%, which was slightly lower than that of BM-2s, but there was no significant difference between them. The accuracies were 73.64±8.62% and 73.85±8.78% for BM-1s and AM-1s, respectively. The highest classification accuracy for a single subject exceeded 90% and the lowest was more than 65% when using 2s-long data, while with 1s-long data the subject with the best classification performance had accuracies of nearly 90% and the worst accuracies were more than 55%.

[...] the results before and after movement onset, there was little difference between these two kinds of data with the same length of time, indicating that the EEG signals prior to the movement onset already contained information that could be specifically classified. This is meaningful for the development of a low-latency prosthetic control or hand rehabilitation system in future research.

Modern neurophysiological research has shown that the movement of human limbs can enhance brain signal activity in the corresponding areas of the sensorimotor cortex, which are mainly located in the parietal region [58]. In the EEG signal acquisition stage, we collected 30-channel EEG data of the whole brain. To investigate the influence of EEG channels on movement classification, Fig. 4 [...]

The number of selected MIMFs was counted during optimal MIMFs fusion when using the 2s data before the movement onset. The number of times each subject's 7 MIMFs were selected is summarized in Table III. Since six-fold cross-validation was performed in the algorithm and one MIMF could be selected at most once in each fold, each MIMF could be selected up to 6 times in total. It can be observed from this table that MIMF1 was selected the most times, followed by MIMF2. MIMF1 and MIMF2 of each subject were selected at least 5 times in the 6-fold cross-validation, which implies that MIMF1 and MIMF2 contained a lot of useful information that facilitated feature extraction and classification. However, MIMF7 was selected 16 times, the next highest count after MIMF2. In order to explore the contribution of different MIMFs to the results, each single MIMF was input to the CNN to get the corresponding classification accuracy. As shown in Fig. 6(a), MIMF1 reached the highest accuracy (79.4±7.05%) and MIMF7 reached the lowest accuracy (31.53±1.62%). Furthermore, as shown in Table III, except for S4 and S13, MIMF7 was selected only 1 or 2 times for the other subjects. These results indicate that there may be some information in the frequency band of MIMF7 for some subjects, but the information contained in MIMF7 was not very [...] lower.
These results demonstrate that the information of hand movements is contained more in the gamma frequency band, which is consistent with the previous literature [63], [64], [65], [66], [67].

Furthermore, one limitation of the study is that the proposed algorithm was evaluated on healthy subjects, and it may not be equally applicable to stroke patients. In order to apply hand movement intention decoding to rehabilitation training, we are going to collect EEG signals from stroke patients, and transfer learning will be considered to generalize the results to patient groups in a following study.

This study proposes a novel MECN algorithm to decode multi-class EEG signals of hand movement. The results showed that the accuracies obtained by CNNs with different numbers of layers and different convolution kernels were different. For the data collected in this experiment, the CNN with 4 convolution layers and a 1 × 10 convolution kernel reached the best accuracy. The more EEG channels were used, the more useful information was contained. The longer the EEG signals, the better the classification accuracy. The best accuracy of 81.14%±6.76% was achieved for 30 channels and 2s data before movement onset, which was significantly better than that of the traditional MEMD-CSP and FBCSP methods. In addition, MIMF1 and MIMF2 were selected in a high percentage in MIMFs fusion, and they mainly covered the frequency band of 30-100 Hz. This demonstrates that the movement-related EEG information may be contained in the gamma band, and the features of this band deserve more consideration in future research.

ACKNOWLEDGMENT
The authors acknowledge the support by the HPC platform of Xi'an Jiaotong University.