A Novel Classification Framework Using the Graph Representations of Electroencephalogram for Motor Imagery Based Brain-Computer Interface

The motor imagery (MI) based brain-computer interfaces (BCIs) have been proposed as a potential physical rehabilitation technology. However, the low classification accuracy achievable with MI tasks is still a challenge when building effective BCI systems. We propose a novel MI classification model based on measurement of functional connectivity between brain regions and graph theory. Specifically, motifs describing local network structures in the brain are extracted from functional connectivity graphs. A graph embedding model called Ego-CNNs is then used to build a classifier, which can convert the graph from a structural representation to a fixed-dimensional vector for detecting critical structure in the graph. We validate our proposed method on four datasets, and the results show that our proposed method produces high classification accuracies in two-class classification tasks (92.8% for dataset 1, 93.4% for dataset 2, 96.5% for dataset 3, and 80.2% for dataset 4) and multiclass classification tasks (90.33% for dataset 1). Our proposed method achieves a mean Kappa value of 0.88 across nine participants, which is superior to other methods we compared it to. These results indicate that there


I. INTRODUCTION
Motor imagery (MI) is the process of imagining the movement of some parts of the body without sensory stimulation [1]. MI classification can be used in the control of brain-computer interfaces (BCIs) [2]. Specifically, the BCI attempts to decode the motor intention of the user without the user making any physical movement [3]. MI-based BCI systems can help patients with motor dysfunction to control external rehabilitation equipment [4], such as, but not limited to, wheelchair start and stop functions and exoskeleton rehabilitation training systems.
However, identifying neural markers of motor imagery in the brain is not an easy task and usually requires the use of specific feature extraction methods and classification algorithms [5], which must be sufficiently robust to deal with the complexity of the EEG signals [6]. Common Spatial Patterns (CSP) [7] is one of the most effective methods for extracting features for MI classification and, consequently, many extended versions of the original CSP algorithm have been developed [8]-[10].
For classifying MI tasks, traditional machine learning methods like support vector machines (SVMs), canonical correlation analysis (CCA) [11], k-means clustering [12], and Fisher's discriminant analysis (FDA) [13] may be applied to the EEG. Although these methods can be used effectively in traditional BCI systems to a certain extent, they may neglect some critical information in the raw EEG signals, which can limit their performance. In an attempt to address this, a growing number of researchers are turning to measures of brain network connectivity for classification of MI activity from the EEG [14]-[16].
The human brain is a large complex network containing many billions of neurons [17], and its function depends on the real-time dynamic interaction between many spatially distributed regions [18]. Network science, and in particular graph theory, has the potential to describe these interactions and, consequently, has become increasingly used in the fields of neuroscience and neurology. Network neuroscience [19] is a relatively new area of research, which provides researchers with a unique opportunity to evaluate, quantify, and ultimately understand the characteristic information embodied by complex brain networks when individuals perform cognitive tasks. It may also be used to construct BCI systems based on characteristics of the brain networks while users perform specific cognitive tasks. The resulting brain network characteristics can be sent to a classifier as features, potentially resulting in improved classification performance. Connectivity in the brain is measured in three ways: functional connectivity, anatomical connectivity, and effective connectivity [20].
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
Functional connectivity evaluates the statistical relationships between different brain regions and may provide a new perspective that helps to understand the neural mechanisms underlying motor execution (ME) and motor imagery (MI). A large number of methods have been proposed to measure functional connectivity in the brain via functional magnetic resonance imaging (fMRI) and EEG. For example, an algorithm based on K-means clustering of functional connectivity graphs obtained from the phase-locking value (PLV) metric has been applied to high-resolution EEG to study how brain network properties change during visual tasks [21]. Functional connectivity can also be seen as a neurophysiological biomarker to assess Alzheimer's disease in patients [22]. Ge et al. found that functional connectivity between the medial temporal lobe, the lateral parietal, and lateral temporal regions increased after motor imagery training [23]. Gonuguntla et al. analyzed the network mechanisms related to motor imagery tasks based on PLV in the alpha frequency band of the EEG [15]. Daly et al. [14] pointed out that, prior to motor execution and motor imagery, there is an increase in the level of PLV-measured functional connectivity at the mu rhythm and that this may be used successfully as the control signal for building a highly accurate BCI.
Graph theory has great advantages in studying the working characteristics of brain networks. In [16], the authors combined spectral graph theory and a quantum genetic algorithm to find an effective set of channels for motor imagery classification. Stefano Filho et al. modeled interactions among EEG electrodes during a motor imagery task and classified the signals using LDA and SVM classifiers [6]. Gao et al. explored the network connectivity differences between core brain regions during ME and MI through conditional Granger causality and in-out degrees [24]. Xu et al. explored and compared the functional connectivity between ME and MI by calculating betweenness centrality (BC), a graph-theoretic measure [25]. In [26], the authors used the spectral decomposition of a graph defined by a geometrical distribution of electrodes to achieve EEG signal dimensionality reduction. In [27], a graph-based method for EEG biometric identification was proposed, which consisted of a network estimation module and a graph analysis module.
Fig. 1. (a) The weighted connections of five nodes. (b) The corresponding adjacency matrix. In this example, the strength of connectivity is divided into three levels.
To address the low performance of EEG-based MI classification, we propose a novel classification framework for identifying MI tasks using graph representations derived from EEG signals. In this paper, we use functional connectivity analysis to convert the time series of EEG signals to graph data. We then use the resulting graph data for classifying different motor imagery states. In our method, the graph built by measuring functional connectivity is an undirected weighted graph, and we extract the distinguishing local structure through a local structure extraction matrix $E_{LS}$. To identify the structure in the graph during different motor imagery tasks, we choose a graph embedding model called Ego-CNNs [28], which is used for learning discriminative features from graphs.

A. Graph Representation
A weighted undirected graph can be defined as $G = (V, E)$, in which $V$ represents the set of nodes, with $|V| = N$, and $E$ denotes the set of edges connecting these nodes. The term $e_{ij} \in E$ denotes the edge connection between node $v_i$ and node $v_j$: if the connection between node $v_i$ and node $v_j$ exists, $e_{ij} = 1$; otherwise, $e_{ij} = 0$. The structure of a graph can also be represented by an adjacency matrix, in which each cell represents the connection between a pair of nodes. The adjacency matrix of an undirected graph is symmetric. Different node pairs may have different connectivity strengths and, in this case, $c_{ij}$ denotes the strength of connectivity between node $v_i$ and node $v_j$. Fig. 1 illustrates an example of a weighted undirected graph with five nodes and the strength of connectivity between node pairs, as well as the adjacency matrix associated with the graph. In Fig. 1(a), different sizes of edges represent the different strengths of connectivity between pairs of nodes, whereas Fig. 1(b) illustrates the corresponding adjacency matrix of the graph.
In the case of EEG, the recording electrodes can be seen as the nodes, but the methods to measure the connection strength between different nodes are diverse. In this paper, we focus on the characteristic of functional connectivity of EEG signals in the time domain. Therefore, we use mutual information to measure connection strength between nodes.
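The graph representation described in this subsection can be illustrated with a minimal sketch. The function name `build_adjacency`, the five-node edge list, and the three weight levels are hypothetical values chosen for illustration only; they are not taken from Fig. 1.

```python
def build_adjacency(n_nodes, weighted_edges):
    """Return a symmetric N x N adjacency matrix from (i, j, c_ij) triples."""
    adj = [[0.0] * n_nodes for _ in range(n_nodes)]
    for i, j, c in weighted_edges:
        if i == j:
            raise ValueError("self-loops are not used in this sketch")
        adj[i][j] = c
        adj[j][i] = c  # undirected graph: c_ij equals c_ji
    return adj


# Hypothetical five-node graph with three connectivity levels (1, 2, 3).
edges = [(0, 1, 3.0), (0, 2, 1.0), (1, 3, 2.0), (2, 4, 3.0), (3, 4, 1.0)]
A = build_adjacency(5, edges)

# The adjacency matrix of an undirected graph is symmetric.
is_symmetric = all(A[i][j] == A[j][i] for i in range(5) for j in range(5))
```

In the EEG setting, each node index would correspond to one recording electrode and each weight to the measured connection strength between two electrodes.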

B. Normalized Mutual Information
The normalized mutual information quantifies the amount of information that two signals share with each other and is based on information theory [29].
Let $\rho(x) = \Pr\{X = x\}$ and $\rho(y) = \Pr\{Y = y\}$ be the probability density functions of random variables $X$ and $Y$. The joint probability density is defined as $\rho(x, y) = \Pr\{X = x, Y = y\}$. The Shannon entropy $H(X)$ and $H(Y)$ measure the average information obtained from the observations of random variables $X$ and $Y$ [30]. They are defined as:

$$H(X) = -\sum_{x} \rho(x) \log \rho(x)$$

$$H(Y) = -\sum_{y} \rho(y) \log \rho(y)$$

The joint entropy $H(X, Y)$ is:

$$H(X, Y) = -\sum_{x} \sum_{y} \rho(x, y) \log \rho(x, y)$$

The conditional entropy of $X$ given $Y$ is defined as:

$$H(X \mid Y) = -\sum_{x} \sum_{y} \rho(x, y) \log \rho(x \mid y)$$

where $\rho(x \mid y) = \Pr\{X = x \mid Y = y\}$ is the conditional probability. The joint entropy is similar in form to the Shannon entropy. The conditional entropy can be represented by:

$$H(X \mid Y) = H(X, Y) - H(Y)$$

The mutual information $I(X; Y)$ measures the amount of shared information between $X$ and $Y$:

$$I(X; Y) = \sum_{x} \sum_{y} \rho(x, y) \log \frac{\rho(x, y)}{\rho(x)\,\rho(y)}$$

It can also be expressed as:

$$I(X; Y) = H(X) - H(X \mid Y) = H(X) + H(Y) - H(X, Y)$$

The mutual information needs to be normalized to measure connectivity between variables [31]. The normalized mutual information between two EEG electrodes may be used as a measure of the strength of functional connectivity between those electrodes. Consequently, the structure of the graph describing brain connectivity is defined as follows. The nodes are electrodes and the weighted edges are the normalized mutual information between the nodes.
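As a rough illustration of how such a connectivity weight might be estimated from two sampled signals, the sketch below discretizes each signal into equal-width bins and computes entropies from the bin counts. Several choices here are assumptions rather than details given in the text: the histogram estimator, the number of bins, and in particular the normalization $2I(X;Y)/(H(X)+H(Y))$, which is one common convention and may differ from the one used in [31].

```python
import math
from collections import Counter


def discretize(signal, n_bins):
    """Map each sample to an equal-width bin index (assumed estimator)."""
    lo, hi = min(signal), max(signal)
    if hi == lo:
        return [0] * len(signal)
    width = (hi - lo) / n_bins
    return [min(int((s - lo) / width), n_bins - 1) for s in signal]


def entropy(labels):
    """Shannon entropy (natural log) of a sequence of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def normalized_mutual_information(x, y, n_bins=8):
    """NMI with the ASSUMED normalization 2 * I(X;Y) / (H(X) + H(Y))."""
    bx, by = discretize(x, n_bins), discretize(y, n_bins)
    hx, hy = entropy(bx), entropy(by)
    hxy = entropy(list(zip(bx, by)))  # joint entropy H(X, Y)
    mi = hx + hy - hxy                # I(X;Y) = H(X) + H(Y) - H(X,Y)
    if hx + hy == 0.0:
        return 0.0                    # both signals constant: no information
    return 2.0 * mi / (hx + hy)
```

With this normalization the value lies between 0 (independent bin labels) and 1 (each signal's bin label determines the other's), which makes weights comparable across electrode pairs.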

C. Selection of Nodes and Edges
In general, when the number of electrodes is large, if we use all electrodes to build an adjacency graph, the graph is large and, as a result, contains many redundant edges and nodes. We aim to select a subset of edges and nodes which is most informative for discriminating between different motor imagery tasks. When participants perform different motor imagery tasks, the connectivity pattern of the brain looks different [6]. We choose a subset of weighted edges (calculated by measuring functional connectivity) with a large difference between different imagery tasks.
For one trial recorded during one MI task, the adjacency matrix obtained from all electrodes is defined as $A_p^q \in \mathbb{R}^{N_{all} \times N_{all}}$, which represents the $q$-th trial of the $p$-th type of motor imagery task. Here, $N_{all}$ denotes the number of all electrodes, $V_{all}$ represents the set of nodes, and $|V_{all}| = N_{all}$.
Let $A_1$ and $A_2$ denote the adjacency matrices of the two types of MI task (class 1 and class 2), and let $T_1$ and $T_2$ be the corresponding numbers of trials. We assume that $c_{ij}^{l} \in A_1$, $l = 1, 2, \ldots, T_1$, follows the distribution $P_1$, and that $c_{ij}^{r} \in A_2$, $r = 1, 2, \ldots, T_2$, follows the distribution $P_2$. The differential functional connectivity network graph is defined as the $N_{all} \times N_{all}$ matrix with entries $w_{ij}$, where $w_{ij}$ is the Wasserstein distance of $c_{ij}$ in different MI tasks. This is defined as:

$$w_{ij} = \inf_{\gamma \in \Pi[P_1, P_2]} \mathbb{E}_{(x, y) \sim \gamma}\left[\,|x - y|\,\right]$$

where $\gamma \in \Pi[P_1, P_2]$ is a joint probability distribution of $P_1$ and $P_2$. The local structure extraction matrix $E_{LS}$ is applied to a graph as:

$$A_S = E_{LS} \bullet A \tag{13}$$

Under the function of matrix $E_{LS}$, important structures of the graph, containing all nodes and all edges, can be preserved, while the redundant nodes and edges are eliminated. $A \in \mathbb{R}^{N_{all} \times N_{all}}$ denotes a generic graph containing all nodes and edges, and $A_S \in \mathbb{R}^{N_{all} \times N_{all}}$ represents the motif extracted from the graph with an important structure. In formula (13), the symbol $\bullet$ denotes the Hadamard product [32]. $A_S$ is a symmetric sparse matrix, and it can then be transformed to $A_F \in \mathbb{R}^{N_{node} \times N_{node}}$ by eliminating rows and columns with all zeros. The dimension of $A_F$ is the number of selected nodes $N_{node}$, which depends on the number of selected edges and the structure formed by the selected edges. Even if the number of selected edges ($N_{edge}$) is the same, the number of selected nodes ($N_{node}$) may be different, because different participants exhibit different patterns of connectivity.
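For one-dimensional samples such as the connectivity weights of a single edge across trials, the Wasserstein distance between the two empirical distributions equals the area between their cumulative distribution functions. The following is a minimal sketch under that assumption; the function name `wasserstein_1d` is hypothetical, and the two classes may have different numbers of trials ($T_1 \neq T_2$).

```python
def wasserstein_1d(samples_a, samples_b):
    """1-Wasserstein distance between two empirical 1-D distributions.

    Computed as the integral of |F_a(x) - F_b(x)| over x, where F_a and F_b
    are the empirical cumulative distribution functions of the two samples.
    """
    a, b = sorted(samples_a), sorted(samples_b)
    merged = sorted(a + b)
    distance = 0.0
    i = j = 0
    for k in range(len(merged) - 1):
        x = merged[k]
        # Count how many samples of each set are <= x.
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        cdf_gap = abs(i / len(a) - j / len(b))
        distance += cdf_gap * (merged[k + 1] - x)
    return distance
```

Applying this function to every electrode pair $(i, j)$, with `samples_a` holding the $T_1$ weights of class 1 and `samples_b` the $T_2$ weights of class 2, would yield the entries $w_{ij}$ of the differential connectivity graph.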
$A_F$ denotes the adjacency matrix that represents the motif with the most informative structures. The motif is extracted from the graph with all nodes and all edges by $E_{LS}$. The next step is to build a model that can classify the graph data described by $A_F$. Fig. 2 describes the process of selecting edges and nodes.
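The selection procedure of this subsection can be sketched as follows. The sketch assumes that $E_{LS}$ is a binary mask marking the $N_{edge}$ edges with the largest between-task differences $w_{ij}$, which is an interpretation of the text rather than a stated definition; the function name and the toy matrices are hypothetical.

```python
def extract_local_structure(adj, w, n_edge):
    """Build E_LS from the n_edge largest w_ij, then form A_S and A_F."""
    n = len(adj)
    # Rank each undirected edge once (upper triangle), largest difference first.
    ranked = sorted(
        ((w[i][j], i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda t: (-t[0], t[1], t[2]),
    )
    e_ls = [[0] * n for _ in range(n)]
    for _, i, j in ranked[:n_edge]:
        e_ls[i][j] = e_ls[j][i] = 1
    # Hadamard (element-wise) product: A_S = E_LS * A.
    a_s = [[e_ls[i][j] * adj[i][j] for j in range(n)] for i in range(n)]
    # Drop rows and columns that are entirely zero to obtain A_F.
    kept = [i for i in range(n) if any(a_s[i][j] != 0 for j in range(n))]
    a_f = [[a_s[i][j] for j in kept] for i in kept]
    return e_ls, a_s, a_f, kept


# Toy example with four electrodes (made-up numbers for illustration).
adj = [[0.0, 0.5, 0.2, 0.1],
       [0.5, 0.0, 0.3, 0.4],
       [0.2, 0.3, 0.0, 0.6],
       [0.1, 0.4, 0.6, 0.0]]
w = [[0.0, 0.9, 0.1, 0.0],
     [0.9, 0.0, 0.8, 0.2],
     [0.1, 0.8, 0.0, 0.05],
     [0.0, 0.2, 0.05, 0.0]]
e_ls, a_s, a_f, kept = extract_local_structure(adj, w, n_edge=2)
```

In this toy case the two selected edges share a node, so three nodes survive; two disjoint edges would leave four, which illustrates why $N_{node}$ can vary for a fixed $N_{edge}$.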

D. Ego-CNNs Classification Model
The graph embedding algorithm converts graphs from structural representations to a fixed-dimensional vector for detecting critical structures in the graph. The graph described by $A_F$ typically has a distinct structure, so we decided to use a graph embedding model to build the classifier. Ego-CNN is a novel graph embedding model [28] that employs ego-convolutions at each layer and stacks up layers using an ego-centric approach to detect precise critical structures efficiently. The Ego-CNNs are a generalization of a node embedding model called the 1-head-attention graph attention network (1-head GAT) [33]. One challenge in identifying critical structure is that it is task-specific and participant-specific. In other words, the shape and location of the critical structure may vary from person to person. For one participant, every $A_F$ describes the graph during different tasks, and has the same location but different structure.
Given a graph G = (V, E), V and E represent the set of nodes and edges, respectively. The dimensionality of feature embedding is D and the number of convolution layers is L. If a node n has K neighbors, the features of node n can be represented by the features of these neighbors. The Ego-CNN we use is developed from the Patchy-San model [34], which can detect precise critical graph structures at a local scale. In the Patchy-San model, the neighborhood of node n at the input layer is defined as the K × K adjacency matrix of the K nearest neighbors of the node.
Let $\mathrm{Nbr}(n, k)$ be the $k$-th nearest neighbor of node $n$ in the graph $G$. The graph embedding output at the $l$-th layer by $D$ filters $W^{(l,1)}, \cdots, W^{(l,D)}$ is defined as $H^{(l)} \in \mathbb{R}^{N \times D}$. For $l = 1, \cdots, L$, filter $W^{(l,d)}$ scans through the adjacency matrix of a node $n$ to generate a graph embedding as:

$$H^{(l)}_{n,d} = \sigma\left(E^{(n,l)} \circledast W^{(l,d)} + b_d\right)$$

The term $E^{(n,l)} \in \mathbb{R}^{(K+1) \times D}$ is a matrix denoting the neighborhood of node $n$ at the $l$-th layer, $b_d$ is the bias term, $\circledast$ is the Frobenius inner product defined as $X \circledast Y = \sum_{i,j} X_{i,j} Y_{i,j}$, and $\sigma$ is the activation function. In this study, we use the edge weights to determine the $K$ nearest neighbors of a node $n$.
The term $H_{n,:} \in \mathbb{R}^{K}$ defines the adjacency vector between node $n$ and its $K$ nearest neighbors.
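A single ego-convolution step can be sketched in plain Python as below. This is an illustrative reading of the layer, not the reference implementation of [28]: the neighborhood matrix is assumed to stack the previous-layer embedding of node $n$ followed by those of its $K$ strongest neighbors, missing neighbors are zero-padded, and ReLU stands in for the unspecified activation $\sigma$. All function names are hypothetical.

```python
def k_nearest_neighbors(adj, n, k):
    """Indices of up to k neighbors of node n, strongest edge weight first."""
    candidates = [(adj[n][m], m) for m in range(len(adj))
                  if m != n and adj[n][m] > 0]
    candidates.sort(key=lambda t: (-t[0], t[1]))
    return [m for _, m in candidates[:k]]


def ego_convolution(adj, h_prev, filters, biases, k):
    """One layer: h_prev is N x D_in, each filter is (k + 1) x D_in.

    Returns an N x D_out embedding, where D_out = len(filters).
    """
    d_in = len(h_prev[0])
    output = []
    for n in range(len(adj)):
        neighbors = k_nearest_neighbors(adj, n, k)
        # Neighborhood matrix: the node itself, then its neighbors.
        rows = [h_prev[n]] + [h_prev[m] for m in neighbors]
        # Zero vectors stand in for non-existing neighbors.
        rows += [[0.0] * d_in for _ in range(k + 1 - len(rows))]
        node_out = []
        for weight, bias in zip(filters, biases):
            # Frobenius inner product of the neighborhood and the filter.
            s = sum(rows[i][j] * weight[i][j]
                    for i in range(k + 1) for j in range(d_in))
            node_out.append(max(0.0, s + bias))  # ReLU (assumed activation)
        output.append(node_out)
    return output


# Toy example: three nodes, one isolated, a single all-ones filter.
adj = [[0, 2, 0], [2, 0, 0], [0, 0, 0]]
h0 = [[1.0], [2.0], [3.0]]
h1 = ego_convolution(adj, h0, filters=[[[1.0], [1.0], [1.0]]], biases=[0.0], k=2)
```

Stacking several such layers lets each node's embedding summarize a progressively larger neighborhood, which is the ego-centric idea described above.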

III. EXPERIMENTS
In this section, we describe our experiments on four publicly available MI EEG datasets that are commonly used in EEG MI classification to evaluate the effectiveness of our proposed graph representation method. We used dataset 1 to test the effectiveness and performance of our proposed method under low-density channel conditions. We then used dataset 2, dataset 3 and dataset 4 to evaluate the performance of our method under high-density channel conditions.

A. Dataset 1
Dataset 1 is taken from the BCI Competition IV Dataset IIa [35] and contains EEG signals recorded from nine healthy participants via 22 electrodes. During the recording process, the participants were instructed, with visual cues, to perform one of four motor imagery tasks: the imagination of movement of their left hand (class 1), right hand (class 2), both feet (class 3), and tongue (class 4). Every MI task contained 72 trials, so each participant performed 288 trials in total. The EEG signals were sampled at 250 Hz and band-pass filtered between 0.5 Hz and 100 Hz. The data set can be downloaded from the website: http://www.bbci.de/competition/iv/.

B. Dataset 2
Dataset 2 comes from the BCI Competition IV [36] and was provided by the Berlin BCI group. All data were recorded from seven healthy participants via an Ag/AgCl electrode cap with 59 channels. The experimental paradigm used was a standard MI paradigm without feedback. For every participant, the signal was selected from two kinds of MI tasks (left-hand MI and right-hand MI). Two runs of the experiment were performed, each containing 100 trials. The data were downsampled to 100 Hz. The data set can be downloaded from the website: http://www.bbci.de/competition/iv/.

C. Dataset 3
Dataset 3 is taken from the BCI Competition III Dataset IVa [37], which was recorded from five healthy participants via 118 electrodes. During the recording process, the participants were instructed to perform one of two motor imagery tasks: right hand and foot MI. In total, 280 MI trials were requested from each participant, and all EEG data were downsampled to 100 Hz. The data set can be downloaded from the website: http://www.bbci.de/competition/iii/.

D. Dataset 4
Dataset 4 is a popular open-access motor movement/imagery dataset, available from PhysioNet [38,39]. It consists of 64-channel EEG data recorded at a 160 Hz sampling rate from 109 volunteers. We removed the data from participants S88, S89, S92, and S100 because of damaged recordings with multiple consecutive "rest" sections. As a result, only 105 participants' data were used in this experiment, and each participant had 42 or 44 trials (the mean value is 43.6) with a balanced ratio of right and left fist motor imagery conditions. The data set can be downloaded from the website: https://physionet.org/content/eegmmidb/1.0.0/.

E. Data Processing
All EEG data used in the four datasets were band-pass filtered using a fifth-order Butterworth filter from 8 Hz to 30 Hz. Because of the differences among the paradigms in the four datasets, different time windows were used in this study: 0.5-3.5 s for Dataset 1, 0.5-4 s for Dataset 2, 0.5-3.5 s for Dataset 3, and 0-2 s for Dataset 4.
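The time windows above translate into sample ranges once the sampling rate of each dataset is taken into account. The sketch below uses the sampling rates stated in the dataset descriptions and assumes, as is not stated explicitly in the text, that each window is measured from cue onset and that a trial is stored as a list of channels starting at that onset. The names `WINDOWS`, `window_to_samples`, and `crop_trial` are hypothetical.

```python
# (sampling rate in Hz, window start in s, window end in s)
WINDOWS = {
    "Dataset 1": (250, 0.5, 3.5),
    "Dataset 2": (100, 0.5, 4.0),
    "Dataset 3": (100, 0.5, 3.5),
    "Dataset 4": (160, 0.0, 2.0),
}


def window_to_samples(fs, t_start, t_end):
    """Half-open sample range [start, stop) for a window given in seconds."""
    return round(t_start * fs), round(t_end * fs)


def crop_trial(trial, fs, t_start, t_end):
    """Crop every channel of a trial (assumed to start at cue onset)."""
    start, stop = window_to_samples(fs, t_start, t_end)
    return [channel[start:stop] for channel in trial]


# Number of samples per channel that each window yields.
window_lengths = {
    name: window_to_samples(fs, t0, t1)[1] - window_to_samples(fs, t0, t1)[0]
    for name, (fs, t0, t1) in WINDOWS.items()
}
```

The cropped segments have different lengths across datasets, so any connectivity estimate computed from them is based on a different number of samples per trial.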
In the process of selecting nodes and edges from the graph with all nodes and all edges, the key factor is $N_{edge}$ (the number of edges selected), which determines the extracted local structure and the number of selected nodes $N_{node}$. Taking into account the differences in the number of electrodes used in the different datasets, the $N_{edge}$ values were set as follows: $N_{edge} = 20$ for Dataset 1, $N_{edge} = 30$ for Dataset 2 and Dataset 4, and $N_{edge} = 40$ for Dataset 3. The weight $c_{ij}$ is rounded to an integer and used as the label of the edge between node $v_i$ and node $v_j$. The labels of the nodes are set to the order of the electrodes used for recording the EEG data.

F. Model Setting
In this study, we want to demonstrate the ability of the graph represented by $A_F$ to distinguish between different types of MI. The network architecture of our Ego-CNN implementation follows the recommendations by Tzeng and Wu [28] and remains the same for all datasets. The Ego-CNNs model contains 1 node embedding layer (Patchy-San with 128 filters and $K = 10$), 2 Ego-Convolution layers (both with 128 filters and $K = 10$), and 2 fully connected layers. Fig. 3 describes the structure of the Ego-CNNs model. Dropout (drop rate 0.2) and Batch Normalization were used in the input and Ego-Convolution layers. We apply the Adam algorithm to train the Ego-CNNs with a learning rate of 0.0001. If a node has fewer than $K$ neighbors, zero vectors are used to denote the non-existing neighbors. Since the amount of data per participant differs between datasets, we set different cross-validation parameters for each dataset: for datasets 1 and 4, we use 6-fold cross-validation, and for datasets 2 and 3, we use 10-fold cross-validation, reporting the test accuracy in each case. Fig. 4 describes the general method proposed in this paper. The graph dataset derived from the EEG dataset was divided into training, validation, and testing sets. Table I shows the details of the datasets and the test accuracies achieved with the default parameters. The details listed include the number of participants ($N_{part}$), the number of channels used in recording the EEG signals ($N_{channel}$), the number of trials for one participant ($N_{trial}$), the number of folds for cross-validation ($N_{fold}$), the number of edges selected ($N_{edge}$), the average number of nodes selected ($N_{node}$), and the average test accuracy (Test ACC) for each dataset. Note that, for dataset 1, we performed both two-class and four-class classification experiments.
The two-class classification result shown in the table is calculated from only left-hand motor imagery (72 trials) and right-hand motor imagery (72 trials). As can be seen from Table I, for the two-class classification experiments, the proposed method achieves more than 90% average test accuracy on the first three datasets. Dataset 4, however, only reached 80.2% test accuracy, which we attribute to the small number of trials per participant. For the multi-class classification task, the proposed framework also achieves 90.33% average accuracy. The results shown in Table I illustrate the effectiveness of the method proposed in this paper on different datasets (with different numbers of electrodes and different numbers of trials). Fig. 5 illustrates the 6-fold CV test accuracies and the numbers of nodes determined for every participant for dataset 1. Fig. 5(a) shows the test accuracy and the number of nodes determined in binary classification. Only participant 'S4' has "bad" performance, with a classification accuracy lower than the 70% that is often described as the threshold necessary to control practical BCIs [40]. Fig. 5(b) shows the test accuracy and the number of nodes in multiclass classification. The number of nodes is similar across all participants.
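The k-fold evaluation protocol used for these results can be sketched as follows. This is a generic illustration, not the authors' code: the shuffling seed and the function name `k_fold_indices` are hypothetical, the folds are not stratified by class, and the additional validation split mentioned in the model settings is not shown.

```python
import random


def k_fold_indices(n_trials, n_folds, seed=0):
    """Return a list of (train_indices, test_indices) pairs, one per fold."""
    indices = list(range(n_trials))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled trials round-robin into n_folds groups.
    folds = [indices[f::n_folds] for f in range(n_folds)]
    splits = []
    for f in range(n_folds):
        test = folds[f]
        train = [i for g in range(n_folds) if g != f for i in folds[g]]
        splits.append((train, test))
    return splits


# Two-class case for one participant of dataset 1: 72 + 72 trials, 6 folds.
splits = k_fold_indices(144, 6)
```

With 144 trials and 6 folds, each fold holds out 24 trials for testing and trains on the remaining 120; the reported accuracy would then be the mean over the six folds.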

A. Generic Performance of Classification
The relationship between the training loss and the number of iterations is shown in Fig. 6. After each training step, we also use the validation set to evaluate the Ego-CNNs model, and the classification accuracy curve of the validation set is overlaid in Fig. 6. As we can see, the training loss is less than 0.1 after 20 iterations, and the accuracy curve of the validation set converges to around 0.9.

B. Comparison Results
Dataset 1 and dataset 3 were tested with existing classification models. We compared our proposed method with preexisting motor imagery EEG classification methods, using the same preprocessing steps outlined above. We first made a comparison with the classic MI classification algorithm, common spatial patterns (CSP) [7]. CSP uses diagonalization of the covariance matrix to find a set of optimal spatial filters for classification. We then compared our proposed method with the state-of-the-art filter bank common spatial patterns (FBCSP) method [41], which is extended from CSP. The FBCSP algorithm addresses the problem of selecting the appropriate operational frequency band to extract optimal CSP features. We also compared several deep learning approaches with various model structures and feature embedding strategies. Specifically, a deep neural network (DNN) model, as reported in [42], is used. This model uses an adaptive method to determine the classification threshold. We performed a further comparison with the EEGNet [43] approach. EEGNet encapsulates well-known EEG feature extraction concepts for BCI to construct a uniform approach for different paradigms. Lastly, we compared our proposed model with the wavelet-CNN (W-CNN) model [44], in which the wavelet transform is introduced to generate the input images for the CNN model. Table II summarizes the overall comparison results. The last row of the table presents the p-values obtained from the paired t-test between the results of our proposed method and those of the other methods. It can be seen from Table II that our method outperforms the other existing methods on the evaluation datasets in terms of mean classification accuracy. According to the results, our proposed method yielded an average improvement of 13.21% (on dataset 1) and 14.64% (on dataset 3) in terms of mean classification accuracy.
The improvement in the classification accuracy for some participants, such as 'S2' and 'av', is substantial (around 32.6% and 38.7%). This shows that our proposed method is capable of extracting distinctive local structure features for classification and that these features improve classification accuracy substantially.
To evaluate the effectiveness of our proposed framework in the multi-class classification task, we compared our framework with machine learning methods [45]-[54] and deep learning methods [55]-[57]. Table III displays the mean Kappa value obtained with the proposed framework and with other existing methods for each participant from Dataset 1. The last columns present the average Kappa value, the standard deviation, and the t-test p-value between our proposed method and the other approaches. Although the performance differs between the nine participants, our proposed framework is superior to existing methods in general.
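For reference, Cohen's Kappa compares the observed agreement $p_o$ (the accuracy) with the agreement $p_e$ expected by chance, $\kappa = (p_o - p_e)/(1 - p_e)$. The sketch below computes it from integer class labels; the function name is hypothetical and the label vectors in any usage are made up. When the true classes are balanced across four classes, as in Dataset 1, $p_e = 0.25$.

```python
def cohen_kappa(y_true, y_pred, n_classes):
    """Cohen's Kappa from true and predicted integer labels (0..n_classes-1)."""
    n = len(y_true)
    confusion = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        confusion[t][p] += 1
    # Observed agreement: fraction of trials on the diagonal.
    p_o = sum(confusion[c][c] for c in range(n_classes)) / n
    # Chance agreement: sum over classes of (row total * column total) / n^2.
    p_e = sum(
        sum(confusion[c]) * sum(row[c] for row in confusion)
        for c in range(n_classes)
    ) / (n * n)
    if p_e == 1.0:
        raise ValueError("Kappa is undefined when chance agreement is 1")
    return (p_o - p_e) / (1.0 - p_e)
```

A Kappa of 1 indicates perfect agreement and 0 indicates chance-level performance, which makes the measure easier to compare across tasks with different numbers of classes than raw accuracy.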
The primary reason that our proposed method surpasses traditional algorithms such as CSP and FBCSP is its multiple nonlinear transformation processes, which are an advantage of deep learning methods. When compared with other deep learning methods, our proposed model has three main advantages that help to achieve superior performance. The first is that we convert the EEG time series to graph data by first measuring functional connectivity between channel pairs. Compared with classical deep learning methods that directly apply the convolutional operation to the raw EEG signal, like EEGNet, our proposed graph representation makes use of the spatial relationships and strengths of connectivity of EEG nodes, which helps the neural network to differentiate the different mental states. The second advantage is that our proposed local structure extraction method can remove the redundant nodes and edges from the graph obtained from the raw EEG signals. The local structure extracted from the whole graph can hold distinctive structural features, which aids the classification process. Finally, the third advantage of our proposed method is the graph embedding method (Ego-CNNs), which employs ego-convolutions at each layer and stacks up layers in an ego-centric way to detect critical structures efficiently.

A. Rationality of the Local Structure Extraction Matrix
The graphs identified from the raw EEG signals have a lot of edges and nodes, whose structures cannot be differentiated easily under different motor imagery tasks. To solve this problem, we designed the local structure extraction matrix $E_{LS}$. Under the action of $E_{LS}$, the redundant edges and nodes are removed and the most informative structures are preserved. Fig. 7 represents the connectivity characteristics of the whole graph and the local structure extracted by $E_{LS}$. Different colors denote different strengths of connectivity between pairs of nodes (electrodes). The value indicated by the color represents the functional connectivity between the corresponding nodes: red indicates high levels of functional connectivity, and blue indicates low levels of connectivity. We randomly selected one participant to plot the connectivity characteristics under different motor imagery tasks. Fig. 7(a) and (b) represent the connectivity relationship (the adjacency graphs produced without extracting the local structure) between all electrodes under the two kinds of motor imagery tasks from the three datasets. Fig. 7(d) and (e) represent the local structures extracted by $E_{LS}$ from Fig. 7(a) and (b). Fig. 7(c) represents the connectivity difference between different tasks. It can be seen that the graphs in Fig. 7(a) and (b), which represent different tasks, are too similar to distinguish, whereas the local structures extracted by $E_{LS}$ in Fig. 7(d) and (e) can be easily distinguished. Fig. 7 thus also shows the effectiveness of using $E_{LS}$ to extract a local structure for classification under different motor imagery tasks. At the same time, the dimensionality of the graph is reduced, which is beneficial for the construction of a classification network. Both low-density and high-density electrode montages can make use of the local structure of the graph to represent features for classification.

B. Feature Distribution
Discriminative graph features can be extracted from raw EEG data by the Ego-CNN model. To further illustrate the validity of the model, we use data visualization to illustrate the extracted features. T-distributed Stochastic Neighbor Embedding (t-SNE) [58] is used to visualize the connectivity graph features for each participant, as shown in Fig. 8 and Fig. 9. In the t-SNE scatter plot, the classification categories are represented by different colors and different shapes. Fig. 8 shows the feature distributions of two-class MI EEG features from Dataset 2. It can be seen that the points of the same category are close together and the points of different categories are clearly separated. Fig. 9 shows the feature distributions of four-class MI EEG features from Dataset 1. In the four-class classification task, some data from different categories are hard to distinguish. For example, the MI EEG feature distributions of the left-hand and foot imagery tasks from participant 'S2' partly overlap. However, in general, the distributions of four-class MI EEG features are clearly separated.

C. Comparison With Other Graph Neural Networks
We compared our model with the classical graph neural network models Graph Convolutional Networks (GCN) [59] and Graph Isomorphism Networks (GIN) [60]. The experimental results are presented in Table IV. It can be seen from Table IV that the Ego-CNNs model used in this study performs better than GCN and GIN. The traditional GCN and GIN models extract the features of graphs by aggregating the characteristics of adjacent nodes. They pay more attention to the attributes of nodes and ignore the structural characteristics of connected edges. The Ego-CNNs model employs novel ego-convolutions to learn the latent representations at each network layer, and can efficiently extract task-dependent, important structural features of connected edges. In this study, we convert the EEG signal to graph data and extract distinguishable local structures from the graph for classification tasks. In the process of extracting local structures, we preserve the important structural features of the connected edges. Therefore, for the graph data used in this study, the Ego-CNNs model is more suitable for the classification of local structures than GCN and GIN.

VI. CONCLUSION AND FUTURE WORK
In this paper, we proposed a novel method for motor imagery task classification. The method transforms the EEG data to graph data by calculating functional connectivity between pairs of electrodes. A measure of the local graph structure is then extracted by $E_{LS}$ and used by an Ego-CNNs model for classification. We tested our method on four datasets and the results showed that our proposed method can achieve more than 90% classification accuracy in the first three datasets and more than 80% in dataset 4. This high classification performance is most likely due to the following reasons:
1) The graph representation derived from the EEG signals can describe the synchronous collaboration between different regions of the brain. Functional connectivity is only related to the degree of coupling without direction or causality.
2) The local structure extraction matrix $E_{LS}$ can effectively extract important local structures under different motor imagery tasks, which can be used as features and then sent to the Ego-CNNs model for classification.
3) The Ego-CNNs model can detect critical meaningful structures in the graph and this results in good performance in task classification.
In the future, we will optimize our method for small training sets and continue to study functional connectivity during motor imagery tasks. In this paper, the local structural difference between multiple kinds of motor imagery tasks was constructed only by measuring connectivity strength (edges in the graph), and the information from the electrodes (the nodes of the graph) was not considered. Therefore, in future work, we will consider the information contained in the nodes themselves and continue to seek improvements to our proposed method.