Graph Neural Network-Based EEG Classification: A Survey

Graph neural networks (GNN) are increasingly used to classify EEG for tasks such as emotion recognition, motor imagery and neurological diseases and disorders. A wide range of methods have been proposed to design GNN-based classifiers. Therefore, there is a need for a systematic review and categorisation of these approaches. We exhaustively search the published literature on this topic and derive several categories for comparison. These categories highlight the similarities and differences among the methods. The results suggest a prevalence of spectral graph convolutional layers over spatial. Additionally, we identify standard forms of node features, with the most popular being the raw EEG signal and differential entropy. Our results summarise the emerging trends in GNN-based approaches for EEG classification. Finally, we discuss several promising research directions, such as exploring the potential of transfer learning methods and appropriate modelling of cross-frequency interactions.


I. INTRODUCTION
Electroencephalography (EEG) is a non-invasive technique used for recording electrical brain activity with a wide range of applications in cognitive neuroscience [1], clinical diagnosis [2,3], and brain-computer interfaces [4,5]. However, analysing EEG signals poses several challenges, including a low signal-to-noise ratio, nonstationarity resulting from brain dynamics, and the multivariate nature of the signals [6,7]. In this review, we focus on EEG classification tasks, such as emotion recognition, motor imagery recognition, and the classification of neurological disorders and diseases.
Traditional feature extraction methods for EEG classification, such as common spatial patterns [6], wavelet transform [8], and Hilbert-Huang transform [9], have been commonly employed. These methods aim to extract meaningful features from EEG signals [10,11], with key features such as power spectral density (PSD) [7] used to characterise brain states. However, relying on such manually defined features to train machine learning classifiers has several limitations. Subjectivity and biases in feature selection, along with time-consuming engineering and selection processes, limit scalability and generalisation [7,12]. Automated feature extraction methods are needed to overcome these limitations, improve efficiency, reduce bias, and enhance classifier adaptability to different EEG datasets.
Deep learning architectures, such as convolutional neural networks (CNN) and long short-term memory (LSTM) networks, have also been explored for EEG analysis [13,14]. However, they face challenges in effectively capturing the spatial dependencies between electrodes and handling the temporal dynamics of EEG signals [7]. Modelling the complex sequential and spatial relationships in EEG data is crucial for more accurate classification and analysis.
* Correspondence to: fei.he@coventry.ac.uk
Network neuroscience offers an alternative approach to EEG modelling by framing the signals as a graph. The brain exhibits a complex network structure, with neurons forming connections and communicating with each other [15]. Analysing EEG data as a graph enables the study of network properties, including functional connectivity, providing insights into brain function and dysfunction [12,16,17]. Graph-based analysis facilitates the examination of network features, node importance, community structure, and information flow, offering insights into brain organisation and dynamics. Such graph-theory-based features were shown to be powerful predictive features for EEG classification [12,17-22]. However, these features have the same limitations as the manually defined features based on the traditional EEG analysis methods introduced above.
Graph Neural Networks (GNNs) emerge as a powerful tool for modelling neurophysiological data [23], such as EEG, within the network neuroscience framework [7,24]. GNNs are specifically designed to operate on graph-structured data. They can effectively leverage the spatial structure within EEG data to extract features, uncover patterns and make predictions based on the complex interactions between different electrodes. Designing GNN models for EEG classification will likely improve classification performance and potentially uncover new insights in neuroscience.
Motivated by the potential of GNNs and an increasing number of recent papers proposing GNNs for various EEG classification tasks, there is an urgent need for a comprehensive review of GNN models for EEG classification. The main contributions of this paper include:
• Identifying emerging trends of GNN models tailored for EEG classification.
• Reviewing popular graph convolutional layers and their applicability to EEG data.
• Providing a unified overview of node feature and brain graph structure definitions in the context of EEG analysis.
• Examining techniques for transforming sets of node feature embeddings into a single graph embedding for graph classification tasks.
By addressing these essential aspects, this review paper will provide a comprehensive and in-depth analysis of the application of GNN models for EEG classification. The findings and insights gained from this review will serve as a resource to navigate this emerging field and identify promising future research directions.

II. OVERVIEW OF GRAPH NEURAL NETWORKS
Graphs are widely used to capture complex relationships and dependencies in various domains, such as social networks, biological networks, and knowledge graphs. The problem of graph classification, which aims to assign a label to an entire graph, has gained attention in recent years. GNNs offer a promising solution to this problem by extending the concept of convolution from Euclidean inputs to graph-structured data. GNNs have been successfully applied in a wide range of fields, such as biology [23], bioinformatics [25], network neuroscience [26], chemistry [27,28], drug design and discovery [29,30], natural language processing [31,32], recommendation systems [33,34], traffic prediction [35,36] and finance [37].
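In graph classification problems, the input is a set of graphs, each with its own set of nodes, edges, and node features. Let G = (V, E, H) denote a featured graph, where V represents the set of nodes, E represents the set of edges connecting the nodes, and H represents the V × D matrix of D-dimensional node features. In the case of EEG, the EEG channels are the nodes, and edges represent structural or functional connectivity between pairs of nodes. Each graph G is associated with a label y, indicating its class. The goal is to learn a function f(G) → y that can predict the class label y given an input graph G. A general structure of a GNN model for EEG classification is presented in Fig. 1.
Compared to other deep learning models, GNNs offer several advantages. First, GNNs were specifically designed for graph-structured inputs. This means that GNNs can adapt to irregularly structured inputs, i.e. graphs with varying numbers of nodes, whereas traditional deep learning models, such as CNNs, require fixed-size inputs. Second, GNNs can simultaneously learn information from node features and the graph structure by accepting two inputs: the node feature matrix and the graph structure. Such simultaneous integration is not directly supported by traditional deep learning methods.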
Multiple types of GNNs have been well introduced in [38,39]. In this survey, we briefly introduce the two main branches of GNNs, namely, spatial and spectral GNNs (Fig. 2). Other types of GNNs, such as attention GNNs [40], recurrent GNNs [41], and graph transformers [42], can be viewed as special cases of spatial GNNs, and thus we will not provide a detailed discussion in this survey. Both spatial and spectral GNNs aim to extend the convolution mechanism to graph data. For a detailed review of their similarities and differences, see [43]. Moreover, for a comparison of GNNs in terms of computational complexity, see [38].
Spatial GNNs aggregate information from neighbouring nodes, similar to traditional convolution applied to image data, which aggregates information from adjacent pixels. Stacking multiple spatial GNN layers aggregates information at increasing scales, with local and global patterns being captured in early and later layers, respectively. In contrast, spectral GNNs perform information aggregation in the graph frequency domain, with low-frequency and high-frequency components capturing global and local patterns, respectively. Thus, both approaches learn to capture local and global patterns within the graph, i.e. high- and low-frequency information in the spectral domain. The advantage of spectral GNNs is their connection to graph signal processing, allowing for interpretation from the perspective of graph filters. However, spectral GNNs do not scale well to large graphs since they depend on the eigendecomposition of the graph Laplacian. In contrast, spatial GNNs can be applied to large graphs since they perform only local message passing. On the other hand, spatial GNNs may be challenging to interpret and are difficult to train with many layers because of over-smoothing, where the embeddings of all nodes become similar.

A. Spatial GNNs
Spatial GNNs directly operate on the graph structure via the adjacency matrix operator. Given a set of nodes and associated features, spatial GNNs perform neighbourhood aggregation to derive node embeddings. This process is referred to as message passing. Intuitively, nodes connected by edges should have similar node embeddings, i.e. local node similarity. Message passing implements this idea by updating node embeddings with aggregated information collected from the node's neighbourhood. Formally, the node update equation in the $l$-th layer of a spatial GNN with $L$ layers is defined as follows:

$$h_i^{(l)} = \sigma\left(h_i^{(l-1)} W + \bigoplus_{j \in N(v_i)} e_{ji}\, h_j^{(l-1)} W\right) \tag{1}$$

where $h_i^{(l-1)}$ is the node embedding vector of node $v_i$ (for $l = 1$, this is the input node feature vector), $\sigma$ is the activation function, $\bigoplus$ is the aggregation function, $N(v_i)$ is the neighbourhood of node $v_i$, $W \in \mathbb{R}^{d_1 \times d_2}$ is a learnable parameter matrix projecting node embeddings from input dimension $d_1$ to hidden dimension $d_2$, and $e_{ji}$ is the edge weight ($e_{ji} = 1$ for unweighted graphs).
A single spatial GNN layer aggregates information from the 1-hop neighbourhood. Thus, to increase the receptive field of the model, L spatial GNN layers can be stacked to aggregate information from up to L-hop neighbourhoods. A disadvantage of spatial GNNs is the difficulty of training deep models with many layers. With an increasing number of layers, the node embeddings become increasingly smooth, i.e. the variance among the embeddings of all nodes decreases. This happens when the messages already contain aggregated information from the whole graph; continued passing of such saturated messages leads to over-smoothing, i.e. all node embeddings becoming essentially identical.
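The message-passing update described above can be illustrated with a minimal NumPy sketch. It assumes sum aggregation, a ReLU activation and a self term that fuses each node's own embedding with the aggregated messages; the function name `spatial_gnn_layer` and the toy graph are hypothetical and not taken from any surveyed paper.

```python
import numpy as np

def spatial_gnn_layer(H, A, W):
    """One message-passing layer with sum aggregation and ReLU (illustrative only).

    H: (N, d1) node embeddings.
    A: (N, N) weighted adjacency, where A[j, i] is the weight e_ji of edge j -> i.
    W: (d1, d2) projection matrix (learnable in a real model, fixed here).
    """
    messages = H @ W                  # transform every node embedding
    aggregated = A.T @ messages       # node i sums e_ji * message_j over its neighbours j
    return np.maximum(messages + aggregated, 0.0)  # fuse with the self term, apply ReLU

# Toy path graph 0 - 1 - 2 with scalar node features.
A = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
H0 = np.array([[1.], [2.], [3.]])
W = np.eye(1)

H1 = spatial_gnn_layer(H0, A, W)  # embeddings now hold 1-hop information
H2 = spatial_gnn_layer(H1, A, W)  # stacking a second layer reaches 2-hop neighbours
```

After the first layer, node 0 only knows about node 1; after the second layer its embedding also depends on node 2, which is the L-hop behaviour described in the text.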

B. Spectral GNNs
Spectral GNNs can also be applied to EEG classification tasks by leveraging the spectral domain analysis of graph-structured data. The EEG graph is transformed into the spectral domain using the Graph Fourier Transform (GFT) and Graph Signal Processing (GSP) techniques. For a detailed review of spectral GNN methods, please refer to [44].
The graph spectrum is defined by the eigendecomposition of the graph Laplacian matrix. The GFT is then defined as $\hat{H} = U^T H$ and its inverse as $H = U \hat{H}$, where $U$ is the orthonormal matrix of eigenvectors of the graph Laplacian $L$ and $H \in \mathbb{R}^{N \times D}$ is the matrix of node feature vectors, with $N$ and $D$ being the number of nodes and the dimensionality of node features, respectively. The graph Laplacian is defined as $L = D - A$, but often the normalised version is preferred: $L = I - D^{-1/2} A D^{-1/2}$, where $A$ and $D$ are the adjacency and degree matrices, respectively.
A spectral GNN layer is then typically defined as the convolution ($*$) of a signal $H$ defined on the graph and a spatial kernel $g$. In the spectral domain, this convolution becomes an element-wise multiplication ($\odot$):

$$H * g = U\left(U^T H \odot U^T g\right) \tag{2}$$

Generally, $U^T g$ is defined as a learnable diagonal spectral filter $G = \mathrm{diag}(g_1, \ldots, g_V)$ [44], so that the filtered signal is $U G U^T H$.
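As a rough illustration of the graph Fourier transform and spectral filtering described above, the following NumPy sketch filters node features with one coefficient per graph frequency. It assumes an undirected graph and the symmetric normalised Laplacian; the helper names `normalised_laplacian` and `spectral_filter` are hypothetical, and the filter coefficients are fixed here rather than learned.

```python
import numpy as np

def normalised_laplacian(A):
    """Symmetric normalised Laplacian I - D^{-1/2} A D^{-1/2} of an undirected graph."""
    deg = A.sum(axis=1)
    d_inv_sqrt = np.zeros_like(deg)
    d_inv_sqrt[deg > 0] = deg[deg > 0] ** -0.5   # isolated nodes keep a zero entry
    return np.eye(len(A)) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]

def spectral_filter(H, A, g):
    """Filter node features H with coefficients g, one per graph frequency."""
    _, U = np.linalg.eigh(normalised_laplacian(A))  # eigenvectors form the GFT basis
    H_hat = U.T @ H                   # graph Fourier transform
    return U @ (g[:, None] * H_hat)   # scale each frequency, then inverse transform

A = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
H = np.array([[1., 0.], [2., 1.], [3., 0.]])

identity_out = spectral_filter(H, A, np.ones(3))  # an all-pass filter returns H unchanged
```

Because the filter has one coefficient per eigenvector, its size is tied to the number of nodes, which is the reason full spectral filters do not transfer between graphs of different sizes.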
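However, the full spectral graph convolution can be computationally expensive. A popular approximation is the Chebyshev GNN (ChebConv) [45], which performs localised spectral filtering on the graph. The node embedding update equation of a ChebConv layer is defined as:

$$H^{(l)} = \sigma\left(\sum_{k=0}^{K} T_k(\tilde{L})\, H^{(l-1)} W_k\right) \tag{3}$$

where $T_k$ is the Chebyshev polynomial of order $k$, given by the recurrence $T_k(x) = 2x\,T_{k-1}(x) - T_{k-2}(x)$ with $T_0(x) = 1$ and $T_1(x) = x$; $W_k$ are learnable parameter matrices; and $\tilde{L} = 2L/\lambda_{max} - I$ is the rescaled graph Laplacian ($\lambda_{max}$ is the largest eigenvalue of $L$, often approximated as $\lambda_{max} = 2$). The $K$ parameter controls the size of the Chebyshev filter.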
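The Chebyshev approximation can be sketched in NumPy as follows. This is an illustrative sketch rather than the implementation of any surveyed model: it assumes the convention that the filter sums polynomial orders k = 0..K, omits the activation function, and uses fixed weight matrices in place of learned ones. The function name `cheb_conv` is hypothetical.

```python
import numpy as np

def cheb_conv(H, L_norm, weights, lam_max=2.0):
    """Chebyshev filtering sum_k T_k(L_tilde) H W_k for k = 0..K (no activation).

    weights: list of K + 1 matrices W_k, each of shape (d1, d2).
    """
    L_tilde = 2.0 * L_norm / lam_max - np.eye(len(L_norm))  # rescale spectrum to [-1, 1]
    T_prev = H                                   # T_0(L_tilde) H
    out = T_prev @ weights[0]
    if len(weights) > 1:
        T_curr = L_tilde @ H                     # T_1(L_tilde) H
        out = out + T_curr @ weights[1]
    for W_k in weights[2:]:
        # Chebyshev recurrence: T_k = 2 L_tilde T_{k-1} - T_{k-2}
        T_prev, T_curr = T_curr, 2.0 * L_tilde @ T_curr - T_prev
        out = out + T_curr @ W_k
    return out

# Path graph 0 - 1 - 2 and its symmetric normalised Laplacian.
A = np.array([[0., 1., 0.], [1., 0., 1.], [0., 1., 0.]])
d_inv_sqrt = A.sum(axis=1) ** -0.5
L_norm = np.eye(3) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]

H = np.array([[1., 0.], [0., 1.], [2., 2.]])
weights = [np.eye(2), 0.5 * np.eye(2), 0.25 * np.eye(2)]  # K = 2
out = cheb_conv(H, L_norm, weights)
```

The recurrence only needs repeated multiplications by the Laplacian, so no eigendecomposition is required, and each additional order extends the filter by one hop.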
However, spectral GNNs are limited to input graphs with a fixed number of nodes. This is because of the explicit use of the graph Laplacian. This is in contrast to spatial GNNs, which do not rely on explicitly materialising the adjacency matrix.
FIG. 2: Illustration of core mechanisms of spatial and spectral GNNs. A) An undirected featured graph is given as an example input graph, with node features shown as node labels and colours. B) Spatial GNNs operate in the graph domain directly, using message passing to update node embeddings. 1) Messages, i.e. transformed node features or embeddings, are sent along edges. For simplicity, we show only one direction of the flow of messages. 2) The collected messages at each node are aggregated using a permutation-invariant function and are fused with the original node embedding to form an updated node embedding. Thus, one spatial GNN layer results in node embeddings containing information about the 1-hop neighbourhood of a given node, and L layers are required for node embeddings to access the information from the L-hop neighbourhood. C) In contrast, spectral GNNs operate in the graph spectral domain. 1) Node features are treated as signals on top of a graph and are decomposed into graph frequencies given by the eigendecomposition of the graph Laplacian. Graph frequencies can be interpreted as variations of the signals. 2) The contribution of each graph frequency is weighted by the set of learnable kernels G that effectively function as graph filters. 3) Node embeddings are then obtained by aggregating the filtered graph frequencies and transforming them back to the spatial graph domain. Thus, full spectral GNNs can access information from N-hop neighbourhoods, where N is the number of nodes of a given graph. However, in practice, approximations such as Chebyshev graph convolution restrict this to the chosen hop size.

III. SURVEY RESULTS
This survey is based on a review of 63 articles. These articles were selected by title and abstract screening from a search on Google Scholar and ScienceDirect queried on November 1st, 2022. The search query for collecting the articles was defined as: ("Graph neural network" OR "Graph convolutional network") AND ("Electroencephalography" OR "EEG"). Both peer-reviewed articles and preprints were searched and utilised. All types of EEG classification tasks were included. We summarise the various types of EEG classification tasks identified in the surveyed papers in Fig. 3. The most common classification tasks are emotion recognition, epilepsy diagnosis and detection, and motor imagery. However, the type of classification task should have a relatively minor effect on the GNN architecture design. Thus, we do not analyse and discuss this in detail. Instead, we survey the various GNN-based methods for EEG classification, intending to systematically categorise the types of GNN modules and identify emerging trends in this field independent of the specific classification task.
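In the remaining portion of this paper, we report the categories of comparison we identified in the surveyed papers. These are based on the different modules of the proposed GNN-based models. Specifically, these are:
• Definition of brain graph structure
• Type of node features
• Type of graph convolutional layer
• Node feature preprocessing
• Node pooling mechanisms
• Formation of graph embedding from the set of node embeddings
The following sections will provide further details on these categories, and the paper will conclude by discussing trends and proposing plausible directions for future research.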

IV. DEFINITION OF BRAIN GRAPH STRUCTURE
The first part of the input to a GNN model is the brain graph structure inferred from the EEG data itself (Fig. 1A). We summarise the methods for defining the brain graphs in Table I. These methods can be generally categorised as learnable or pre-defined.
An alternative categorisation of the brain graph structures is the functional (FC) and the "structural" connectivity (SC). Generally, SC graphs are pre-defined, whereas FC graphs can be both pre-defined and learnable. SC in the classical sense of physical connections between brain regions is not possible to obtain using EEG signals since these are recorded at the scalp surface. Instead, we use the term to describe methods that construct brain graphs based on the physical distance between EEG electrodes. In contrast, FC refers to pairwise statistical relationships between EEG signals.
An SC graph is pre-defined such that electrodes are connected by an edge in the following way:

$$e_{ij} = \begin{cases} 1/d_{ij}, & \text{if } d_{ij} \le t \\ 0, & \text{otherwise} \end{cases} \tag{4}$$

where $e_{ij}$ is the edge weight connecting nodes $i$ and $j$, $d_{ij}$ is a measure of distance between EEG electrodes, and $t$ is a manually defined threshold controlling the graph sparsity. Such an approach offers several advantages. First, the SC graph is insensitive to any noise effects of EEG recording since it is independent of the actual signals. Second, all data samples share an identical graph structure, provided the same EEG montage was utilised during the recording. This offers explainability advantages when combined with spectral GNNs since the graph frequency components defined by the eigenvectors of the graph Laplacian are fixed. On the other hand, the SC graph is limited to short-range relationships. Thus, it might not accurately represent the underlying brain network. Some papers propose to overcome this limitation by manually inserting global [53,56-58,62] or inter-hemispheric edges [46,54,87].
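A distance-thresholded SC graph of this kind can be sketched in a few lines of NumPy. The sketch assumes Euclidean distance between electrode coordinates and inverse-distance edge weights below the threshold t; both choices, the function name `distance_graph` and the toy coordinates are illustrative assumptions rather than a prescription from the surveyed papers.

```python
import numpy as np

def distance_graph(positions, t):
    """Connect electrodes no further apart than t; weight edges by inverse distance."""
    diff = positions[:, None, :] - positions[None, :, :]
    d = np.linalg.norm(diff, axis=-1)        # pairwise Euclidean distances
    A = np.zeros_like(d)
    mask = (d <= t) & (d > 0)                # d > 0 excludes self-loops
    A[mask] = 1.0 / d[mask]
    return A

# Hypothetical 2-D coordinates for four electrodes (arbitrary units).
positions = np.array([[0., 0.], [1., 0.], [0., 2.], [5., 5.]])
A_sc = distance_graph(positions, t=2.0)
```

Here the fourth electrode is further than t from all others and ends up isolated, which illustrates why a purely local SC graph may miss long-range relationships.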
In contrast, an FC graph can be obtained from either classical FC measures (FC measure in Table I) or learnable methods (e.g. feature concatenation/distance and attention methods in Table I). We refer to all of these methods as FC because they all measure the degree of interaction between two nodes, thus falling within the traditional definition of FC. Unlike SC, the FC graph is unique for each data sample and can contain both short- and long-range edges. On the other hand, since it is derived directly from EEG signals, it might be sensitive to noise.
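As one concrete example of a classical FC measure, the sketch below builds a sample-specific graph from absolute Pearson correlation between channels. Pearson correlation is only one assumed choice among many possible FC measures, and the function name `correlation_graph`, the threshold and the synthetic signals are hypothetical.

```python
import numpy as np

def correlation_graph(X, threshold=0.0):
    """X: (channels, samples). Returns an |Pearson r| adjacency without self-loops."""
    A = np.abs(np.corrcoef(X))
    np.fill_diagonal(A, 0.0)
    A[A < threshold] = 0.0       # sparsify by dropping weak connections
    return A

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 256)
X = np.stack([
    np.sin(2 * np.pi * 10 * t),                                    # channel 0
    np.sin(2 * np.pi * 10 * t) + 0.1 * rng.standard_normal(256),   # noisy copy of channel 0
    rng.standard_normal(256),                                      # unrelated noise
])
A_fc = correlation_graph(X, threshold=0.5)
```

Unlike a distance-based graph, this adjacency changes with every input sample, and noise in the recording directly affects the edge weights.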
Learnable FC based on node feature distance or feature concatenation are generally computed as:

$$e_{ij} = \theta_1\left(|h_i - h_j|\right), \qquad e_{ij} = \theta_2\left(h_i \,\|\, h_j\right) \tag{5}$$

respectively, where $\theta_1(\cdot)$ and $\theta_2(\cdot)$ are neural networks mapping $\mathbb{R}^{d} \to \mathbb{R}$ and $\mathbb{R}^{2d} \to \mathbb{R}$, respectively; $|\cdot|$ denotes absolute value; $\|$ denotes concatenation and $h_i$ is the node feature/embedding of node $i$. We discuss the attention-based graphs together with the types of graph convolutional layers in Section VI and thus skip these methods in this section.
Special cases of brain graph definition are the shared-mask methods. These methods define a matrix of learnable parameters with the same shape as the adjacency matrix of the input graphs that acts as a mask/filter by multiplying it with the adjacency matrix. This learnable matrix is a part of the model. Thus, the same mask is applied to all input graphs. However, a shared mask limits the size of the input graphs, i.e. the number of nodes must remain fixed so that the adjacency matrix can be multiplied with the shared mask.
At this stage, it is unclear which method should be preferred for brain graph classification tasks. Some authors attempt to avoid this issue by combining multiple methods. However, we instead suggest that researchers carefully consider each of the presented methods in the context of the given classification task, as each method has its own set of strengths and weaknesses.
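Returning to the learnable FC definitions based on feature distance and feature concatenation, the following NumPy sketch computes both edge-weight matrices for a small random example. As a simplifying assumption, single linear maps (`w1`, `w2`) stand in for the neural networks θ1 and θ2, and they are randomly initialised rather than trained.

```python
import numpy as np

rng = np.random.default_rng(0)
N, d = 4, 3
H = rng.standard_normal((N, d))        # node features / embeddings

# Linear stand-ins for theta_1 (d -> 1) and theta_2 (2d -> 1); real models would learn these.
w1 = rng.standard_normal(d)
w2 = rng.standard_normal(2 * d)

# Feature distance: e_ij = theta_1(|h_i - h_j|)
abs_diff = np.abs(H[:, None, :] - H[None, :, :])    # shape (N, N, d)
E_dist = abs_diff @ w1                               # shape (N, N)

# Feature concatenation: e_ij = theta_2(h_i || h_j)
pairs = np.concatenate(
    [np.repeat(H[:, None, :], N, axis=1),            # h_i broadcast over j
     np.repeat(H[None, :, :], N, axis=0)],           # h_j broadcast over i
    axis=-1,
)                                                    # shape (N, N, 2d)
E_cat = pairs @ w2
```

The distance variant is symmetric by construction, whereas the concatenation variant depends on the order of the pair and can therefore produce directed edge weights.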

V. NODE FEATURE DEFINITIONS
The second part of the input to a GNN model is the node feature matrix (Fig. 1A). We summarise the various definitions of node features in Table II. We categorise these definitions by the domain in which they are computed, i.e. time, frequency and graph domains.
The time-domain methods are the most commonly used in the current literature. In particular, these are the differential entropy (DE) and raw signal methods. The popularity of DE is explained by the fact that many of the open EEG datasets include this feature, such as the SEED [108] emotion recognition dataset. DE describes the complexity of a continuous variable and is defined as:

$$DE(X) = -\int_{X} f(x) \log f(x)\, dx \tag{6}$$

where $X$ is a continuous random variable and $f(x)$ is its probability density function. Many papers define the node feature as the raw EEG signal. However, the raw signal can be too long for a GNN to process effectively. Thus, it is often coupled with a node feature pre-processing module and spatio-temporal GNNs (see Sections V A and VI, respectively) to either reduce the dimensionality or to extract the temporal patterns contained within the signal effectively. An alternative to the raw signal node feature is descriptive statistics, such as mean, median or standard deviation.
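In practice, differential entropy is often estimated in closed form by assuming the signal is Gaussian, in which case it reduces to 0.5 ln(2πeσ²). The sketch below uses this assumption, which is not stated in the text above; the function name `differential_entropy_gaussian` is hypothetical and the input is synthetic noise rather than EEG.

```python
import numpy as np

def differential_entropy_gaussian(x):
    """DE of a signal assumed to be Gaussian: 0.5 * ln(2 * pi * e * variance), in nats."""
    return 0.5 * np.log(2.0 * np.pi * np.e * np.var(x))

rng = np.random.default_rng(0)
x = rng.standard_normal(10_000)                      # unit-variance synthetic signal

de_unit = differential_entropy_gaussian(x)           # close to 0.5 * ln(2 * pi * e)
de_scaled = differential_entropy_gaussian(2.0 * x)   # doubling the amplitude adds ln(2)
```

Under this assumption, DE is a monotonic function of signal variance, so it yields one scalar per channel (or per channel and frequency band if the signal is band-pass filtered first).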
Frequency-domain node features are usually defined as the Fourier frequency components obtained by the Fourier transform or the power spectral density. Both of these methods attempt to quantify the strength of various frequency components within the EEG signal. An advantage of these representations is their relatively low dimensionality compared to the raw signal described previously.
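A low-dimensional frequency-domain node feature can be sketched as band power computed from a simple periodogram. The band edges below follow common EEG conventions but are an assumption here, as are the sampling rate, the function name `band_power` and the synthetic 10 Hz test signal.

```python
import numpy as np

def band_power(x, fs, band):
    """Mean periodogram power of signal x within band = (low, high) in Hz."""
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    psd = np.abs(np.fft.rfft(x)) ** 2 / (fs * len(x))   # simple periodogram estimate
    mask = (freqs >= band[0]) & (freqs < band[1])
    return psd[mask].mean()

fs = 128                                   # assumed sampling rate in Hz
t = np.arange(fs * 2) / fs                 # two seconds of signal
x = np.sin(2 * np.pi * 10 * t)             # 10 Hz oscillation (alpha range)

bands = {"theta": (4, 8), "alpha": (8, 13), "beta": (13, 30)}
features = np.array([band_power(x, fs, b) for b in bands.values()])
```

Each channel is reduced from 256 samples to three numbers, which illustrates the dimensionality advantage over the raw signal mentioned above.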
Finally, graph-theoretical features can be utilised to describe the nodes, e.g. mean node weight [65] and betweenness centrality [65,73]. A severe limitation of this method is that the graph structure needs to be defined prior to node feature extraction. Thus, this node feature type is incompatible with learnable brain graph methods.

A. Node Feature Preprocessing
An optional next step after node feature construction is a node feature pre-processing (NFP) module (Fig. 1B). We summarise the types of NFPs in Table III.
Most of the NFPs are integrated within the GNN architecture, thus allowing the model to be trained in an end-to-end manner. The exceptions are methods that utilise a pre-trained feature extraction neural network implemented as a bidirectional LSTM [76] or a CNN [64].
The surveyed NFPs are all based on a neural network. In most cases, these are variants of a CNN or a multilayer perceptron (MLP). These modules aim to (1) reduce the dimensionality of the node features and (2) enhance the node features, including potentially suppressing noise or redundant information.

VI. TYPE OF GRAPH CONVOLUTIONAL LAYER
The core part of a GNN model is formed by the graph convolutional layers (GCN) (Fig. 1C). We summarise the utilised types of GCNs in Table IV. We further categorise them based on the type of GNN as introduced in Section II, i.e. spatial or spectral. Additionally, we add the temporal category, which is not a standalone type of GCN layer but must be combined with a spatial or spectral GCN.
Interestingly, ChebConv is used in the majority of the surveyed papers (counting both ChebConv and spectral spatio-temporal GNN in Table IV). Since EEG is recorded with a limited number of electrodes (e.g. 128 in high-density montages), the size of the brain graphs is relatively small. In such cases, even a full spectral GNN would not be too computationally expensive for EEG classification. Therefore, it remains unclear why many authors opt for the ChebConv approximation of spectral GNN. We speculate that the influence of classical signal processing tools in EEG analysis might serve as a sufficient argument for using spectral GNNs for EEG classification.
On the other hand, the remaining surveyed papers experiment with a wide range of spatial GNNs. The (simplified) GCN is a popular method amongst these, which is equivalent to a 1st-order ChebConv (K = 1). A special case of spatial GNN is the graph attention network (GAT). GAT allows for adjusting the graph by reweighting the edges using an attention mechanism. Generally, the attention mechanism for computing the new softmax-normalised edge weight $e_{ij}$ is defined as follows:

$$e_{ij} = \frac{\exp\left(\sigma\left(w^T \left[W h_i \,\|\, W h_j\right]\right)\right)}{\sum_{k \in N(i)} \exp\left(\sigma\left(w^T \left[W h_i \,\|\, W h_k\right]\right)\right)} \tag{7}$$

where $w$ and $W$ are the learnable parameters of the model, $\sigma$ is an activation function, $\|$ denotes concatenation, $h$ is the node feature vector/embedding, and $N(i)$ is the set of nodes connected to node $i$. The resulting edge weights can then be passed to Equation 1.
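The attention-based reweighting of edges can be sketched as follows. The sketch follows the common GAT formulation and assumes a LeakyReLU activation, a single attention head and a graph in which every node has at least one neighbour; the function name `attention_weights` and all parameter values are illustrative, with fixed rather than learned parameters.

```python
import numpy as np

def attention_weights(H, A, W, w, slope=0.2):
    """Softmax-normalised attention over each node's neighbours (GAT-style sketch).

    Returns att, where att[i, j] is the weight node i assigns to neighbour j.
    Assumes every node has at least one neighbour in A.
    """
    Z = H @ W                                              # projected embeddings, (N, d2)
    d2 = Z.shape[1]
    # w^T [z_i || z_j] splits into a part depending on i and a part depending on j.
    scores = (Z @ w[:d2])[:, None] + (Z @ w[d2:])[None, :]
    scores = np.where(scores > 0, scores, slope * scores)  # LeakyReLU
    scores = np.where(A > 0, scores, -np.inf)              # attend to existing edges only
    scores = scores - scores.max(axis=1, keepdims=True)    # numerical stability
    expo = np.exp(scores)
    return expo / expo.sum(axis=1, keepdims=True)

# Path graph 0 - 1 - 2.
A = np.array([[0., 1., 0.], [1., 0., 1.], [0., 1., 0.]])
H = np.array([[1., 0.], [0., 1.], [1., 1.]])
W = np.eye(2)

att_uniform = attention_weights(H, A, W, w=np.zeros(4))             # zero scores: uniform
att = attention_weights(H, A, W, w=np.array([1.0, -1.0, 0.5, 2.0]))
```

With all scores equal, attention falls back to averaging over neighbours; non-zero parameters shift weight towards some neighbours while each row still sums to one.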
Next, the spatio-temporal GNNs were tested for EEG classification in several instances. A spatio-temporal block consists of one GCN layer and one 1D-CNN applied temporally. This structure allows the model to extract both spatial (i.e. graph) and temporal patterns. There are both spatial and spectral variants of spatio-temporal GNN, and there is no indication as to which one should be preferred as no comparative study exists to date.
Finally, several papers adopt multi-branch architectures. These methods utilise multiple GCN layers applied in parallel to allow the model to focus on various aspects (also called views) of the input graph. An example of such a model utilises a two-branch GNN to learn from both FC- and SC-based brain graph structures [63]. Alternatively, the individual frequency bands of EEG signals can be used to construct various graph views [85].

VII. NODE POOLING MECHANISMS
In some instances, reducing the number of nodes in the graph might be desirable. This can be achieved with a node pooling module (Fig. 1D). We summarise the node pooling modules utilised in the surveyed papers in Table V.
There are both learnable and non-learnable node pooling modules in the literature. Please see the corresponding papers for a detailed description of these methods (Table V). Node pooling modules remain a relatively unexplored topic in the EEG-GNN classification models. Node pooling can (1) remove redundant nodes, (2) reduce the size of the graph embedding in a setting where it is formed by concatenating node embeddings, and (3) aid in the explainability of the model by identifying node importance with respect to the classification task.

VIII. FROM NODE EMBEDDINGS TO GRAPH EMBEDDING
The output of the graph convolutions is a set of learned node embeddings. Node embeddings in this form are suitable for tasks such as node classification and link prediction. However, for graph classification, the set of node features needs to be transformed into a unified graph representation (Fig. 1E). We summarise the methods for this transformation in Table VI.
The most straightforward method to form a graph embedding is to simply concatenate the node features. This approach poses a few limitations. First, the resulting graph embedding grows with the number of nodes; thus, the classification layer requires a large number of parameters. Second, all input graphs need to have the same number of nodes, limiting the model's generalisation to other datasets. Finally, such an approach is likely to include redundant or duplicated information in the graph embedding since a GNN produces node embeddings by aggregating information from neighbouring nodes.
A readout function is one of the methods to form a graph embedding that addresses these issues. A readout forms the embedding by passing the node features through a permutation-invariant function. A general definition of a readout to obtain the graph embedding $h_{G_i}$ of a graph $G_i$ from a set of $V$ node embeddings $H = [h_1, \ldots, h_V]$ is given by:

$$h_{G_i} = \bigoplus_{k=1}^{V} h_k \tag{8}$$

where $\bigoplus$ can be any permutation-invariant function. In the surveyed papers, these functions were sum, average and maximum. A few papers also experiment with an attention-weighted sum to attenuate the role of unimportant nodes within the graph embedding [88]. An interesting alternative is to apply CNN-style average or maximum pooling node-wise [105].
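The contrast between a readout and plain concatenation can be illustrated with a short NumPy sketch. It implements the sum, average and maximum readouts named above and compares them with flattening the node embeddings; the function name `readout` and the toy embeddings are hypothetical.

```python
import numpy as np

def readout(H, mode="mean"):
    """Collapse an (N, d) matrix of node embeddings into one d-dimensional graph embedding."""
    if mode == "sum":
        return H.sum(axis=0)
    if mode == "mean":
        return H.mean(axis=0)
    if mode == "max":
        return H.max(axis=0)
    raise ValueError(f"unknown readout: {mode}")

H = np.array([[1., 4.], [2., 0.], [3., 2.]])   # three nodes, two features each
H_perm = H[[2, 0, 1]]                          # same graph, nodes listed in another order

graph_sum = readout(H, "sum")
graph_mean = readout(H, "mean")
graph_max = readout(H, "max")

concat = H.reshape(-1)                         # concatenation: size grows with N
concat_perm = H_perm.reshape(-1)               # and changes with node order
```

The readout always returns a vector of length d regardless of the number or order of nodes, whereas the concatenated embedding has length N × d and depends on the node ordering.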
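Alternatively, researchers explored various neural network models to obtain graph embeddings, such as CNN [52,69,78], (bi-)LSTM [51,83,84,99,100], Transformer [89] and capsule networks [73]. Additionally, graph pooling methods, such as DiffPool [109], SAGPool [110], iPool [111], TAP [112] and HierCorrPool [113] can be used for this purpose.

IX. DISCUSSION
Despite most of the surveyed papers being relatively recent, a wide range of GNN-based methods has already been proposed to classify EEG signals in a diverse set of tasks, such as emotion recognition, brain-computer interfaces, and psychological and neurodegenerative disorders and diseases (Fig. 3). This recent rise in popularity of GNN models for EEG might be attributed to (1) the development of new GNN methods and (2) advances in network neuroscience, which inspired an extension of this framework to deep learning.
This survey categorises the proposed GNN models in terms of their inputs and modules. Specifically, these are brain graph structure, node features and their preprocessing, GCN layers, node pooling mechanisms, and formation of graph embeddings. This categorisation allows us to provide a quick and simple overview of the different methods presented in the EEG-GNN literature, appreciate the current state of the art in this field and identify promising future directions.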

A. Limitations of Surveyed Papers
Surprisingly, we have identified the least variety and innovation in the category of GCN layers (Table IV). A significant proportion of the surveyed papers utilise either ChebConv or "vanilla" spatial GCN. This might be due to the relative novelty of the EEG-GNN field, and thus, many papers explore other areas of model design, such as node features and brain graph definitions. A few papers seem to successfully experiment with more complex types of GCN layers [47,50,91] and multi-branch architectures [58,63,80,92,97,100].
A major limitation of most surveyed papers is the lack of generalisability to external datasets that might use a different number of EEG signals. This is caused by (1) the use of ChebConv and (2) forming the graph embedding by node feature concatenation [47, 55-60, 64, 66, 67, 70, 74, 77, 80, 81, 86, 87, 90-93, 98, 100-102, 104]. Issue (1) can be addressed by utilising spatial GCN layers as suggested above, and issue (2) can be solved by using a readout function or a suitable node pooling mechanism, which coarsens the graph to a fixed number of nodes. Additionally, there is a general lack of transfer learning experiments for EEG-GNN models, which might be a promising direction for future research.
Finally, we have identified an interesting gap in EEG-GNN research: frequency band information is rarely utilised in a more complex way. A few papers train separate models for each frequency band in isolation [46,47,65]. Alternatively, they propose concatenating the graph embeddings generated from the frequency-band-GNN branches [52,87,101].

B. Future Directions
Several promising directions can be identified in the rapidly evolving landscape of EEG-GNN research. First, a comprehensive comparison of the various GCN layers (e.g. spatial GNN, ChebConv, GAT and graph transformer) with respect to their influence on classification performance should be carried out to address this crucial design question in a systematic manner.
Second, enhancing the generalisability of models by addressing issues related to the varying number of EEG signals/electrodes and exploring transfer learning approaches can open new avenues for research. For instance, GNN models pre-trained on large, cheap-to-obtain datasets, such as open databases for emotion recognition or BCI applications, would allow the application of complex GNN architectures to problems with limited data availability due to high costs or small populations (e.g. clinical data, rare diseases and disorders). Focusing on these issues would likely improve the generalisability of the models when evaluated on a diverse set of EEG datasets and different classification tasks.
Lastly, the rich frequency information of EEG signals should be explored more. For instance, we suggest a plausible utility of integrating cross-frequency coupling (CFC) approaches into EEG-GNN models. There is growing evidence in the literature concerning the advanced brain functions (e.g. learning, memory) enabled by CFC [114]. Thus, integrating findings from neuroscience research into the EEG-GNN design promises both performance and explainability gains.

C. Limitations of Our Survey
It is worth noting that this paper does not follow a systematic review methodology; therefore, we do not assert that our findings are exhaustive. Instead, our objective is to offer a succinct and cohesive overview of the current research on EEG-GNN models to facilitate the development of innovative approaches and assist researchers new to this field.
One of the major parts of EEG-GNN models we omit in this survey is model explainability. We suggest that a survey paper is not well suited for comprehensively covering this aspect of research. Instead, we suggest that a comparative experimental study would be better suited to explore the various options for GNN explainability. However, to maintain the comprehensiveness of this survey, we list the papers that report the use of certain methods of model explainability: [50,55,89,105,106].

X. CONCLUSION
In conclusion, this survey examined the current research on EEG-GNN models for classifying EEG signals.
Various GNN-based methods have been proposed for tasks such as emotion recognition, brain-computer interfaces, and psychological and neurodegenerative disorders. The surveyed papers were categorised based on inputs and modules, including brain graph structure, node features, GCN layers, node pooling mechanisms, and graph embeddings.
GNNs offer a unique method for analysing and classifying EEG in the graph domain, thus allowing the exploitation of complex spatial information in brain networks that other neural network architectures do not exploit. Additionally, GNNs can be easily extended with CNN and recurrent network-based modules at various stages of the GNN architecture, such as for node feature pre-processing, node embedding post-processing and graph embedding formation.
However, limitations and areas for improvement were identified. There is a lack of variety and innovation in GCN layers, with many papers utilising ChebConv or "simple" spatial GCN without clear justification. Generalisability to external datasets with varying numbers of EEG electrodes is limited. Transfer learning experiments and the integration of cross-frequency coupling approaches are potential future research directions to enhance the performance and explainability of GNNs.

FIG. 1: General architecture of a graph neural network model for classification of EEG. (A) The input to the model consists of node features and a possibly learnable brain graph structure. (B) Optionally, the node features can undergo pre-processing via a neural network. (C) Next, the node features are passed to a block of graph convolutional layers, where node embeddings are learned. (D) Then, a node pooling module can be utilised to coarsen the graph. Node pooling may contain learnable parameters as well. (E) Finally, the set of node embeddings forms a graph embedding, which can be used to predict the outcome.
features.Let G = (V, E, H) denote a featured graph, where V represents the set of nodes, E represents the set of edges connecting the nodes, and H represents the V × D matrix of D-dimensional node features.In the case of EEG, the EEG channels are the nodes, and edges represent structural or functional connectivity between pairs of nodes.Each graph G is associated with a label y, indicating its class.The goal is to learn a function f (G) → y that can predict the class label y given an input graph G.A general structure of a GNN model for EEG classification is presented in Fig 1.
Definition of brain graph structure • Type of node features • Type of graph convolutional layer • Node feature preprocessing • Node pooling mechanisms • Formation of graph embedding from the set of node embeddings
IX. DISCUSSIONDespite most of the surveyed papers being relatively recent, a wide range of GNN-based methods has already been proposed to classify EEG signals in a diverse set of tasks, such as emotion recognition, brain-computer interfaces, and psychological and neurodegenerative disorders and diseases (Fig3).This recent rise in popularity of GNN models for EEG might be attributed to (1) the development of new GNN methods and (2) advances in network neuroscience inspired an extension of this framework to deep learning.GNNs offer unique advantages

TABLE I: Overview of methods for obtaining the brain graph structure.

TABLE III: Overview of node feature pre-processing before GNN layers.

TABLE IV: Overview of graph convolutional layers.

TABLE V: Overview of node pooling mechanisms.

TABLE VI: Overview of methods for the formation of graph embedding from a set of node embeddings.