EEG Signal Epilepsy Detection With a Weighted Neighbor Graph Representation and Two-Stream Graph-Based Framework

Epilepsy is one of the most common neurological diseases. Clinically, epileptic seizure detection is usually performed by analyzing electroencephalography (EEG) signals. Deep learning models are now widely used for single-channel EEG epilepsy detection, but such models struggle to explain their classification results. To address this interpretability problem, researchers have increasingly combined graph representations of EEG signals with graph neural network (GNN) models for single-channel epilepsy detection. In this paradigm, the raw EEG signal is transformed into a graph representation, and a GNN model learns latent features and classifies whether the data indicates an epileptic seizure episode. However, existing methods face two major challenges. First, existing graph representations tend to have high time complexity, as they generally require each vertex to traverse all other vertices to construct the graph structure; some are also dense and therefore have high space complexity. Second, while separate graph representations can be derived from a single-channel EEG signal in both the time and frequency domains, existing GNN models for epilepsy detection learn from only a single graph representation, which prevents information from the two domains from complementing each other. To address these challenges, we propose a Weighted Neighbour Graph (WNG) representation for EEG signals. By removing the redundant edges of existing graphs, WNG is both time- and space-efficient, yet as informative as its less efficient counterparts. We then propose a two-stream graph-based framework to simultaneously learn features from WNG in both the time and frequency domains. Extensive experiments demonstrate the effectiveness and efficiency of the proposed methods.


Epilepsy detection is traditionally conducted by first extracting handcrafted features from raw EEG signals [6], [7], and then feeding these features into a classifier to determine whether the given signal contains epileptic segments [8], [9]. Such methods are suitable for cases in which the different classes are relatively easy to distinguish. However, EEG signals often have complex waveforms with rich underlying semantics that cannot be easily characterized by handcrafted features [10]. With the development of deep learning techniques, researchers have turned to deep learning models that can effectively learn complex features, especially when large amounts of data are available [11], [12].
Recently, researchers have applied deep learning models directly to learn latent features from raw EEG signals [13], [14]. However, raw EEG signals are highly random and heterogeneous. In particular, relationships among two or more data points can be of great value for epilepsy classification [15], but such relational information is hard for deep learning models to exploit without an explicit transformation of the data representation [10], [16], which limits both their performance and the interpretability of their classification results.
For a given single-channel signal, each vertex in its corresponding graph represents a data point in the signal, each edge represents the pairwise relationship between two data points [20], and a set of edges represents the relationship among the multiple vertices they connect. This provides a new feature for EEG signals, namely the relationship between data points. Compared with the classic VG, the limited penetrable visibility graph (LPVG) [21], the horizontal visibility graph (HVG) [22], and the limited penetrable horizontal visibility graph (LPHVG) [23], the Overlook Graph (OG) and Weighted Overlook Graph (WOG) have weighted directed edges and a significantly improved ability to distinguish different categories [24]. However, the time complexity and memory requirements of these methods are very high, making them suitable only for offline EEG classification tasks.
Existing graph-based methods suffer from the following two problems.
1. High time and space complexity. Some existing graph representations [19], [21] require each vertex to traverse all other vertices to construct the graph structure, which leads to O(n^2) (where n is the number of data points in the raw signal) or higher time complexity. Meanwhile, to maximally retain semantic information, some existing graphs [24] resort to dense graph representations that incur high memory costs.
2. Inability to simultaneously learn from time and frequency domain signals. Raw EEG signals reside in the time domain. However, a recent work [25] has highlighted the need for learning from graphs built upon frequency domain signals. While the time and frequency domain signals can be derived from each other, their different forms mean that different semantic information is better encoded in each. It is therefore reasonable to expect that simultaneously learning from time and frequency domain graphs can lead to superior results; however, very few deep learning models can undertake this task.
In response to these problems, we make the following contributions in this paper.
1. Weighted Neighbour Graph (WNG). We propose a novel graph representation called Weighted Neighbour Graph (WNG), which is essentially a lossy compression of a state-of-the-art dense graph representation called the Weighted Overlook Graph (WOG) [24]. By removing the redundant edges of the WOG, WNG strikes a balance between the amount of information retained and efficiency. Moreover, unlike previous graphs with high time and/or space cost, WNG has both linear time and linear space complexity.
2. A two-stream graph-based framework for time and frequency domain graphs. Inspired by the two-stream convolutional neural network [26] for video analytics, we present a novel two-stream graph-based framework, which utilizes two parallel branches within a single GNN to simultaneously learn from time and frequency domain graphs. Note that this framework is generic. It is compatible with any graph representation and any GNN model that can learn from this representation.
3. Experimental evaluation. We conduct several real-world experiments to demonstrate the effectiveness and efficiency of the proposed methods.
For the rest of the paper, we introduce our WNG representation and the two-stream graph-based framework in Section II. Afterward, we report our experiments and results in Sections III and IV. We discuss our framework in Section V. Finally, we conclude the paper in Section VI.

A. Preliminaries
We first introduce the preliminaries required in this section. We denote an EEG signal with n data points as T = t_1, t_2, . . . , t_n, where t_i is the value of the ith point in the signal. A graph representation of T ∈ R^n can be written as a graph G = (V, E), where V is the vertex set with n vertices, each corresponding to a data point in T. E is the edge set, in which e_{i,j} ∈ E denotes the edge connecting vertices v_i and v_j, representing the pairwise relationship between data points t_i and t_j.
Definition (Weighted Overlook Graph (WOG)): Given a single-channel EEG signal T = t_1, t_2, . . . , t_n, its WOG representation can be written as a graph G = (V, E). Given two arbitrary data points t_i and t_j in T, a weighted, directed edge e_{i,j} is created if t_i > t_j. Let A_T = {a_{i,j} | i = 1, 2, . . . , n; j = 1, 2, . . . , n} be the adjacency matrix of the time domain G, where a_{i,j} = t_i − t_j if e_{i,j} ∈ E and a_{i,j} = 0 otherwise. In this paper, we restrict our discussion to directed graphs.
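As a concrete sketch, the WOG construction can be written in a few lines of NumPy. We assume here, following the connection rule, that a directed edge e_{i,j} with weight t_i − t_j exists whenever t_i > t_j and that absent edges are stored as zeros; the function name is ours:

```python
import numpy as np

def wog_adjacency(t):
    """Dense WOG adjacency sketch: a[i, j] = t[i] - t[j] whenever t[i] > t[j].

    O(n^2) time and space, matching the complexity discussed in the text.
    """
    t = np.asarray(t, dtype=float)
    diff = t[:, None] - t[None, :]   # diff[i, j] = t[i] - t[j] for all pairs
    return np.where(diff > 0, diff, 0.0)
```

The pairwise difference matrix makes the O(n^2) cost of the dense representation explicit.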

B. Graph Representation
Existing graph representation methods for EEG signals have high time complexity and large memory requirements. They must traverse all other vertices when building the graph structure, so their time complexity is high, and they require a large number of edges to represent the topological information of the signal. In fact, the information carried by some edges in these graph structures can be inferred from other edges, which inflates the memory requirements of the graph (see II-B.2 for details). For example, the graph generated by WOG, currently the method with the highest classification accuracy, contains a large number of redundant edges, which greatly increases both graph-generation time and memory requirements. Therefore, our goal is to design a weighted directed graph representation that maintains the classification accuracy of EEG signals while minimizing time complexity and memory requirements.
1) Weighted Rand Graph (WRG): We introduce a random probability p ∈ [0, 1], meaning that among the n vertices in the graph structure, p × n vertices will be reconnected randomly [27].
WRG Connection Rule: Given an arbitrary data point t_i in an EEG signal T ∈ R^n, a weighted, directed edge e_{i,i+1} is created if t_i > t_{i+1}. Then p × n randomly selected vertices t_{r_1}, t_{r_2}, . . . , t_{r_{p×n}} are disconnected and reconnected.
For example, the vertex t_i at position (i, t_i) drops its original connection (setting a_{i,i+1} = 0 and a_{i+1,i} = 0), randomly draws an index r_i ∈ [1, n] with r_i ≠ i, and rebuilds the connection as a_{i,r_i} = t_i − t_{r_i} and a_{r_i,i} = t_{r_i} − t_i. This representation does not increase the number of edges, but effectively reduces the characteristic path length between vertices.
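A minimal sketch of the WRG construction, under the assumption that chain edges always carry the signed weights a_{i,i+1} = t_i − t_{i+1} together with the antisymmetric reverse entry, and that p × n vertices are rewired as described above; the function name and seeding scheme are illustrative only:

```python
import numpy as np

def wrg_adjacency(t, p, seed=0):
    """WRG sketch: start from the chain (neighbour) connections, then
    randomly rewire p*n vertices as in the rule above."""
    t = np.asarray(t, dtype=float)
    n = len(t)
    a = np.zeros((n, n))
    for i in range(n - 1):                  # chain connections (WNG case)
        a[i, i + 1] = t[i] - t[i + 1]
        a[i + 1, i] = t[i + 1] - t[i]
    rng = np.random.default_rng(seed)
    for i in rng.choice(n - 1, size=int(p * n), replace=False):
        a[i, i + 1] = a[i + 1, i] = 0.0     # disconnect t_i from t_{i+1}
        r = i
        while r == i:                       # draw r_i in [0, n) with r_i != i
            r = int(rng.integers(n))
        a[i, r] = t[i] - t[r]               # rebuild the connection
        a[r, i] = t[r] - t[i]
    return a
```

With p = 0 no vertex is rewired and the function reduces to the WNG chain described next.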
2) Weighted Neighbour Graph (WNG): In the extreme case of p = 0, each vertex only constructs edges with the vertices adjacent to it in the signal. We call this the Weighted Neighbour Graph (WNG), as shown in Fig. 2.
WNG Connection Rule: Given an arbitrary data point t_i in an EEG signal T ∈ R^n, a weighted, directed edge e_{i,i+1} is created if t_i > t_{i+1}, where the weight is a_{i,i+1} = t_i − t_{i+1}. It is worth noting that we also set a_{i+1,i} = t_{i+1} − t_i; taking the opposite sign at the diagonally symmetric position of the adjacency matrix is not a redundant operation, since it allows the vertex t_{i+1} at (i + 1, t_{i+1}) to also aggregate the information of the vertex t_i at (i, t_i). The sign of the value indicates the direction of the edge between vertices. In this way, we reduce the time complexity of the graph representation from O(n^2) to O(n), and the memory requirement of the generated graph is greatly reduced owing to the reduction in edges. Moreover, any edge of the WOG can be represented by WNG edges with a finite number of additions: for j < i, a_{j,i} = a_{j,j+1} + a_{j+1,j+2} + · · · + a_{i−1,i}, where a_{j,j+1}, a_{j+1,j+2}, . . . , a_{i−1,i} are adjacent edges of the WNG. Thus, once the adjacent edges a_{i,i+1} are known, all edges of the WOG can be represented. The graph representation designed in this paper can therefore effectively encode the information of a signal with far fewer edges.
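The WNG can thus be stored as just the n − 1 adjacent edge weights, and any WOG edge recovered by summing a consecutive run of them. A sketch (function names are ours):

```python
import numpy as np

def wng_edges(t):
    """WNG sketch: only the n-1 chain weights a[i, i+1] = t[i] - t[i+1],
    built in a single O(n) pass (the reverse edge is just the negation)."""
    t = np.asarray(t, dtype=float)
    return t[:-1] - t[1:]

def wog_edge_from_wng(edges, i, j):
    """Recover the WOG weight t[i] - t[j] (for i < j) by summing the
    adjacent WNG edge weights between positions i and j."""
    return edges[i:j].sum()
```

This makes the redundancy argument concrete: the dense WOG matrix is a prefix-sum expansion of the WNG chain.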
3) Frequency Domain Graph Representation: Existing graph representation methods usually represent time domain signals and rarely capture the frequency domain information of EEG signals. Time domain EEG signals have several limitations; for example, it is difficult to align the phases of signals, which results in a large variance among different EEG signals of the same category. Frequency domain features, as essential signal features, can alleviate this problem [25], [28]. In addition, frequency domain graphs can provide interpretability for classification results (see V-B for details). The signal can be converted to the frequency domain and aligned strictly according to the frequency ordering. The steps for building the frequency domain graph are: 1) convert the signal to the frequency domain through the Fast Fourier Transform (FFT); 2) represent the frequency domain signal as a graph structure.
The conversion of a time domain EEG signal into a frequency domain graph is shown in Fig. 3. First, we transform the time domain EEG signal T = t_1, t_2, . . . , t_n into the frequency domain using the FFT, obtaining F = f_1, f_2, . . . , f_n. Each f_k corresponds to the same frequency k for all input time domain signals, as long as they have the same sampling rate and length. Then, we build the frequency domain graphs for the EEG signals. Specifically, we create a vertex for each data point in the signal and create edges based on a connection rule; here we use the connection rule of our graph representation methods, although our frequency domain method is compatible with other graphs. The connection rule is as follows: for a data point |f_i|, if |f_i| > |f_{i+1}|, we create an edge connecting the vertices for |f_i| and |f_{i+1}|. Each edge indicates the relationship between two frequencies in the graph, which can be written as an n × n adjacency matrix A_F.
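A sketch of the two steps, assuming a real-valued input signal (so only the first half of the spectrum is kept) and reusing the signed chain weights |f_i| − |f_{i+1}| as edge weights; both choices are our assumptions, not a prescription from the text:

```python
import numpy as np

def frequency_domain_graph(t):
    """Sketch of the two steps above: FFT the signal, then apply the
    chain connection rule to the magnitude spectrum |f_i|."""
    t = np.asarray(t, dtype=float)
    mag = np.abs(np.fft.rfft(t))       # |f_0|, |f_1|, ... aligned by frequency
    m = len(mag)
    a_f = np.zeros((m, m))             # adjacency matrix A_F
    for i in range(m - 1):
        a_f[i, i + 1] = mag[i] - mag[i + 1]
        a_f[i + 1, i] = mag[i + 1] - mag[i]
    return a_f
```

Because FFT bins are strictly ordered by frequency, two signals of the same sampling rate and length produce vertex-aligned graphs, which is exactly the alignment property the text relies on.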
We design a two-stream graph-based framework to extract features from the time and frequency domain graph representations.
The READOUT function of graph neural networks aggregates vertices (e.g., sum, average, or max) for graph classification [29], [30], because in typical graph classification tasks there are only connection relationships between vertices and no sequential features. However, graphs of signals do contain sequential features between vertices. In addition, GNN models for graph classification extract the topological structure of the graph by aggregating the features of neighbouring vertices; yet for all graphs in the WNG, whose topology is a chain, such structural features can hardly distinguish different categories. To this end, we design a two-stream architecture based on deep learning models that simultaneously extracts the sequential features from the time and frequency domain graphs of EEG signals.
Existing graph classification methods based on graph neural networks comprise two steps: 1) multi-hop neighbourhood aggregation, and 2) readout aggregation over the whole graph [30], [31]. Through multiple iterations of the first step, each vertex obtains the characteristics of its neighbour vertices; through the second step, the characterization of the whole graph is obtained. However, current graph neural networks cannot extract the sequential information between the vertices in a graph, which degrades their performance when classifying our graphs: there is a sequence relationship between the vertices, and these sequential features are essential and cannot simply be discarded.
For this reason, we propose a new graph-based framework that preserves the sequential information between the vertices while extracting the structural information of the graph.
We first apply two identical aggregation operations to the time and frequency domain graph structures, aggregating each vertex in both graphs simultaneously. Specifically, the information between vertices is aggregated K (K ≥ 1) times.
After aggregating the vertices in the time and frequency domain graph structures, respectively, learnable weights are used to extract the features of the vertices. Because the time and frequency domain graph structures have sequential features, we further introduce a sequential graph convolution operation to extract the sequential features of the vertices in both domains.
The time and frequency domain graphs G = (V, E), each containing n vertices, are input into the model separately, as shown in Fig. 4 (Input data), where n is the number of time and frequency domain sampling points. We perform vertex aggregation on the graphs in the time and frequency domains, respectively, as shown in Fig. 4 (vertex aggregation). In this part, K-hop aggregation is performed on each vertex of the graph over K iterations, and the sequence of the vertices is kept the same as in the input graph.
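One plausible form of the vertex aggregation step, in which each of the K iterations lets every vertex add its weighted neighbours' features to its own while leaving the vertex order untouched (the exact update used in the paper may differ):

```python
import numpy as np

def aggregate(a, h, k):
    """K-iteration neighbourhood aggregation sketch: each step sums a
    vertex's own feature with its weighted neighbours' features, so the
    output keeps the vertex order of the input graph."""
    h = np.asarray(h, dtype=float)
    for _ in range(k):
        h = h + a @ h          # one hop of weighted neighbour aggregation
    return h
```

Applying the same function to the time and frequency adjacency matrices gives the two parallel streams.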
After the time and frequency domain graphs are aggregated, the corresponding feature vectors are input into the vertex sequential convolution module, which extracts the sequential features of the vertices through sequential convolution. As shown in Fig. 4 (vertex sequential convolution), we first separately initialize learnable time and frequency domain parameters W = {w_1, w_2, . . . , w_n} to weight each vertex of the feature vector.
After weighting the time and frequency domain feature vectors, respectively, we obtain the weighted feature vectors. We set ω_i^t, ω_i^f ∈ R^m as the weight parameters in the time and frequency domains, where m is the kernel size of the deep learning model; b^t and b^f are the bias parameters in the time and frequency domains, and σ is the ReLU activation function [32]. The stride of the kernel is 1, and the kernel sizes in the time and frequency domains can be set independently. The kernel size determines how many neighbouring vertices of the graph feature vector are covered, and the vertex sequential convolution operation is performed P times.
We then obtain the feature vectors y^t and y^f. We flatten the time and frequency domain data separately and fuse the outputs of the two domains: we fuse y^t and y^f into ŷ, and then flatten ŷ as the input y^{fc} (of size 1 × width × height × 2 × channel) to the fully-connected layer. The corresponding fully-connected layer output is computed from the trainable parameters ω_i^{fc} and b^{fc} (weight and bias), where the superscript fc indicates the fully-connected layer. In this part of the model, the fully-connected layer extracts the associated features of the time and frequency domains simultaneously, and is applied Q times. Finally, we use Softmax [33] to obtain the classification result Y ∈ [0, 1], which indicates seizure or non-seizure.
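The fusion head described above can be sketched as follows, with a softmax over the two classes; the concatenation order of the two streams is our assumption:

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a logit vector."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def two_stream_head(y_t, y_f, w_fc, b_fc):
    """Fusion sketch: flatten the time and frequency feature vectors,
    concatenate them, and pass the result through one fully-connected
    layer with a softmax over the two classes (seizure / non-seizure)."""
    y_hat = np.concatenate([np.ravel(y_t), np.ravel(y_f)])  # fuse streams
    return softmax(w_fc @ y_hat + b_fc)
```

Because the fully-connected layer sees both flattened streams at once, it can weight cross-domain combinations of features, which is the point of the two-stream design.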
The algorithm pseudocode is shown in Algorithm 1, and our source code is available at https://github.com/anonymous2020source-code/WNG-TS-Model/.

III. EXPERIMENTS
In this section, we conduct experiments to demonstrate the effectiveness and efficiency of the proposed methods.

A. Dataset
We use three epileptic EEG signal datasets in our experiments.
The Bonn dataset is a classic epilepsy detection dataset collected at the University of Bonn [34]. It contains five subsets, Set A to Set E, each consisting of 100 EEG signals. Each signal has 4097 data points and is sampled at 173.61 Hz. Sets A to D are epilepsy-free EEG signals collected under different conditions; only Set E contains epileptic EEG signals. We set up the Bonn dataset as four experiments (A vs. E, B vs. E, C vs. E, D vs. E) for comparison with existing methods. In our experiments, each EEG signal is divided into non-overlapping segments of 256 sampling points.
The SSW dataset targets a special type of epilepsy, absence epilepsy, which shows no convulsive symptoms in clinical practice and must be detected by identifying spike-and-slow-waves (SSWs) in the EEG signal. The dataset was annotated by neurologists from Xinhua Hospital, affiliated with Shanghai Jiaotong University. The sampling frequency is 200 Hz. The original signal is divided into segments of 200 sampling points, each corresponding to 1 second of recording. We chose this duration according to domain knowledge: a full SSW usually lasts less than 1 second [35]. Neurologists manually screened and labeled each segment, ensuring that every segment marked positive contains at least one intact SSW. The dataset includes 10473 positive samples and 10473 negative samples.
The CHB-MIT dataset contains EEG signals of epilepsy patients collected at the Children's Hospital Boston [36] from 23 subjects, with a total of 24 epilepsy records. The sampling frequency is 256 Hz, and each recording usually contains 23 channels. Since the method studied in this paper uses only single-channel EEG signals, we use only the data from the first channel. We collected the epileptic EEG signal of the first channel of each subject, selected a comparable amount of non-epileptic data, and segmented the EEG signals into lengths of 256 sampling points, each corresponding to a 1-second recording. This duration was chosen to match the data preprocessing scheme of the comparison methods. The resulting dataset includes 9255 positive samples and 12200 negative samples. More details about the dataset can be found in [37].
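The segmentation used for all three datasets amounts to splitting each recording into non-overlapping fixed-length windows, which can be sketched as:

```python
import numpy as np

def segment(signal, length=256):
    """Split a recording into non-overlapping windows of `length` points,
    dropping any incomplete tail, as in the preprocessing described above."""
    signal = np.asarray(signal)
    n_seg = len(signal) // length
    return signal[:n_seg * length].reshape(n_seg, length)
```

For a Bonn signal of 4097 points and a window of 256 points, this yields 16 segments with one trailing point discarded.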
For all datasets, we report the average of five-fold cross-validation as the classification result. In addition, for the WRG experiments, we fix r_i to ensure that the WRG in each experiment uses the same random reconnection.

B. Setting
In the experiment, we use the same indicators as other methods for evaluation, namely Accuracy (Acc), Specificity (Spe), and Sensitivity (Sen).
For parameter settings, we use the cross-entropy loss function and the Adam optimizer, with the number of epochs set to 100. We verify the proposed scheme in two parts. The first part is the performance evaluation of the graph representations: we evaluate the time complexity and memory requirements of the graph methods for EEG signals, comparing VG [19], HVG [22], LPVG [21], LPHVG [23], OG, and WOG [24] on single-channel EEG signals.
The second part is the classification performance evaluation. We will verify different two-stream models (TS-MLP, TS-GNN, TS-1DCNN, and TS-SGCN). We will also compare the classification performance with the current best baseline method.

TS-MLP:
The TS-MLP model has no sequential convolution structure and performs vertex aggregation only once. It verifies whether the sequential convolution operation and repeated vertex aggregation produce gains for epilepsy detection tasks.

TS-GNN:
The TS-GNN model has no sequential convolution structure. It verifies whether the sequential convolution operation produces gains for epilepsy detection tasks.
TS-SGCN: The TS-SGCN includes both vertex aggregation and the sequential convolution operation. It is a two-stream sequential graph convolutional network model based on the time and frequency domain graph conversion of EEG signals.
TS-1DCNN: The TS-1DCNN performs vertex aggregation only once. To replace the vertex sequential convolution operation in TS-SGCN, it directly performs one-dimensional convolution on the adjacency matrices of the time and frequency domain graph structures.
In order to make the results of the classifiers comparable, the parameters of each model's modules are kept consistent with the corresponding modules of TS-SGCN, and the number of layers is the same for every deep learning model; for example, each model consists of 1 aggregation layer and 6 layers in Table I. The number of aggregations K is set to 2 in TS-GNN and TS-SGCN (and to 1 in TS-MLP and TS-1DCNN). Note that all models perform graph vertex aggregation at least once; therefore, all of the above models are graph neural networks.

A. Analysis of the p Setting
Before starting the experiments, we need to determine the value of the random probability parameter p in the WRG. We select p from the range 0-50%. It should be pointed out that p = 0 corresponds to the special case of WNG.
We carried out classification experiments under different p values. As shown in Fig. 5, the classification accuracy of WRG does not improve as p increases; some models even show a decline in their classification results. For example, in the TS-GNN experiment, classification performance decreases as p grows.
We further conduct experiments on the efficiency of WRG at different values of p. As shown in Fig. 6, both the generation time and the memory requirement increase continuously with p.
Increasing p does not improve classification performance, but it does reduce efficiency. Therefore, in the classification task, we choose small p values to limit the consumption of computing resources. In the subsequent experiments, we use two small values of p: 0 (WNG) and 10% (WRG).

B. The Performance of the Graph
The performance of graph representation consists of time complexity, memory requirements, and classification performance.
The time complexity of the EEG signal graph representation is the time to transform the EEG signal into a graph. The memory requirement of the EEG signal graph representation is the size of the storage space occupied by the EEG signal graph. In addition, we verify the classification performance. Our purpose for testing classification performance is to evaluate the optimal combination of graph representation and classification model. The experiments are carried out on the Bonn, SSW, and CHB-MIT datasets.
1. The time complexity of the graph representations. We compare the graph representation methods proposed in this paper with existing methods. As shown in Table II, we use a single indicator (seconds, s). All experiments are carried out in the same environment. Each graph representation method is applied to all samples in the dataset, and the total time is averaged to obtain the per-sample generation time of each method. The methods proposed in this paper have the shortest generation time among all graph representations. WOG is the existing graph representation with the highest classification accuracy for epileptic EEG signals. Compared with WOG, WRG is 56 times faster on the CHB-MIT dataset (WNG: 125 times), 37 times faster on the SSW dataset (WNG: 100 times), and at least 42 times faster on the Bonn dataset (WNG: at least 122 times). The time complexity of both WRG and WNG is significantly lower than that of the other graph representation methods, showing that our methods need much less time to convert an EEG signal into a graph representation. To further verify the time efficiency, we randomly selected EEG segments of 256 sampling points from the epileptic EEG dataset and gradually increased the signal length, measuring the generation time of each graph representation method on these segments. A total of 16 signal lengths were used, ranging from 256 to 4096 sampling points. Each method was run 10 times on each length, and the average of the 10 generation times was taken as the final result.
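The timing protocol above (total time over all samples, averaged per sample and over repeats) can be sketched as:

```python
import time

def mean_generation_time(build_graph, samples, repeats=10):
    """Average per-sample graph generation time, measured with a
    monotonic clock, following the protocol described above."""
    total = 0.0
    for _ in range(repeats):
        start = time.perf_counter()
        for s in samples:
            build_graph(s)          # graph construction under test
        total += time.perf_counter() - start
    return total / (repeats * len(samples))
```

Any of the constructors sketched earlier (WOG, WRG, WNG) can be passed as `build_graph` to reproduce a comparison of this kind.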
It can be seen from Fig. 7 that the generation time of VG and LPVG is the largest, at O(n^3). The time complexity of the OG and WOG methods is O(n^2), and that of the HVG and LPHVG methods is O(n log n). The least time-consuming methods are WRG and WNG, with time complexity O(n). This intuitively confirms that the graph representation methods proposed in this paper have the lowest time complexity.
2. The memory requirements of the graph representations. We compare the graph representation methods proposed in this paper with existing methods. As shown in Table III, we use a single unit (Kilobyte, KB). For each graph representation method, we report the total memory required to represent all dataset samples. The memory requirements of OG and WOG are large, consistent with the hypothesis of this paper: these methods build many edges, and WOG occupies even more memory than OG because each of its numerous edges carries a weight. In contrast, the memory requirement of the representations proposed in this paper is significantly lower than that of OG and WOG: WRG is 51 times smaller than WOG on the CHB-MIT dataset (WNG: 61 times), 48 times smaller on the SSW dataset (WNG: 53 times), and at least 56 times smaller on the Bonn dataset (WNG: at least 60 times). Specifically, the memory requirements of the proposed representations lie between those of HVG and LPHVG. However, it is worth pointing out that HVG and LPHVG are undirected and unweighted graphs; at the same memory requirement, they contain more redundant edges than WRG and WNG.
3. The classification performance of the graph representations.
In terms of classification performance, we use four classification models: TS-MLP, TS-GNN, TS-1DCNN, and TS-SGCN to compare the performance of different graph representations in the time and frequency domains and also evaluate the performance of the different models. The purpose is to verify the classification performance of the graph representation proposed in this paper and evaluate the classification performance of the classification model proposed in this paper.
As shown in Table IV, in experiments with the same classification model, the WRG and WNG proposed in this paper achieve better results across different datasets, except with TS-GNN, which, as discussed above, cannot effectively extract the sequential features of the vertices in WRG and WNG. This shows that WRG and WNG retain important information in the EEG signal and can effectively improve the performance of the classification model.
In experiments with the same graph representation, we find that the classification performance of TS-1DCNN is better than that of TS-MLP, TS-GNN, and TS-SGCN. This result further demonstrates that our model can better extract features from graph representations.
Through experiments, we found that WNG-TS-1DCNN is the best-performing method at present.

C. Comparing With Time and Frequency Domain Graph Representation Methods
To verify the effect of the two-stream structure, we conducted experiments using only the time domain (T) or frequency domain (F) model to compare with the two-stream model (TS), as shown in Table V. The classification results of the two-stream model are better, indicating that extracting both time and frequency domain features is superior to using only one domain, which is also intuitive. Except for A vs. E and SSW, WNG is more accurate than WRG in almost all classification tasks.

D. Comparing With Existing Methods
To further verify the performance of the method proposed in this paper, we compare our method with the existing methods, and the experimental results are also very competitive, as shown in Table VI. We use time complexity (TC), memory requirement (MR), accuracy (Acc), specificity (Spe), and sensitivity (Sen) as evaluation metrics.
In terms of classification results, the accuracy of our method is comparable to or even better than the existing methods.
Compared with CT-LS-SVM [42], WNG-TS-1DCNN achieves better classification results, which illustrates the superiority of our method over traditional non-deep-learning methods in the classification of epileptic EEG signals. Compared with AdaBoost [40] on the SSW dataset and NCOV [38] on CHB-MIT, the classification accuracy of WNG-TS-1DCNN is likewise better than that of traditional non-deep-learning models. On the Bonn dataset, the classification accuracy of WNG-TS-1DCNN is better than that of LSTM [2] and AMWCNN [11], which further shows that the signal graph representation can boost the classification of deep learning models. Compared with SeizNet [13] on the SSW dataset and TF-CNN [39] on CHB-MIT, the classification accuracy of WNG-TS-1DCNN is also better, showing that our method outperforms approaches that use only a deep learning model.
Compared with VG-F-GIN [25], WNG-TS-1DCNN achieves better classification results, which indicates that the proposed method is more effective than graph neural networks that rely on graph aggregation alone. Note that since GIN only supports undirected graph structures, we use only VG as the graph representation of the EEG signal in this comparison.
Compared with VG-SGCN [25], the classification accuracy of WNG-TS-1DCNN improves significantly, especially on the SSW and CHB-MIT datasets. These results further verify that WNG-TS-1DCNN brings gains for epilepsy classification. Although the memory requirement of WNG is larger than that of VG, the WNG figure is the sum over both the time- and frequency-domain representations, whereas the VG figure covers the frequency domain only.
Compared with WOG-2DCNN [24], WNG-TS-1DCNN has comparable or even better classification accuracy; moreover, WNG is more lightweight than WOG, and TS-1DCNN has far fewer parameters than the 2DCNN. Compared with TF-CNN [39], WNG-TS-1DCNN achieves higher classification accuracy. It is worth noting that our method uses only a single channel of CHB-MIT, while TF-CNN uses data from all 23 channels. This further shows that our framework can maintain high classification accuracy even with limited information.
In summary, WNG-TS-1DCNN is a graph-based classification framework for epileptic EEG signals that combines high accuracy, low time complexity, and low memory requirements.

A. Critical Difference
To further illustrate the performance of our proposed method, we analyze the EEG signal graph representation and the two-stream graph-based framework with the critical difference diagram (CDD) [44] (α = 0.05), comparing accuracies across all datasets, averaged over five runs.
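A CDD compares methods by their average ranks across datasets; two methods are not significantly different when their average ranks differ by less than the Nemenyi critical difference CD = q_α · sqrt(k(k+1)/(6N)). The sketch below shows this computation under the usual α = 0.05 studentized-range constants; the helper names are hypothetical, and ties are broken arbitrarily for brevity.

```python
import math

# Nemenyi critical values q_alpha for alpha = 0.05, indexed by the
# number of compared methods k (standard studentized-range constants).
Q_005 = {2: 1.960, 3: 2.343, 4: 2.569, 5: 2.728, 6: 2.850}

def average_ranks(acc_table):
    """acc_table[d][m] = accuracy of method m on dataset d.
    The most accurate method on a dataset gets rank 1.
    Returns one average rank per method (ties broken arbitrarily)."""
    n_methods = len(acc_table[0])
    rank_sums = [0.0] * n_methods
    for row in acc_table:
        order = sorted(range(n_methods), key=lambda m: -row[m])
        for r, m in enumerate(order, start=1):
            rank_sums[m] += r
    return [s / len(acc_table) for s in rank_sums]

def critical_difference(k, n_datasets, alpha=0.05):
    """CD = q_alpha * sqrt(k(k+1) / (6N)); rank gaps below CD
    are not statistically significant at the given alpha."""
    assert alpha == 0.05, "only the alpha = 0.05 table is included"
    return Q_005[k] * math.sqrt(k * (k + 1) / (6.0 * n_datasets))
```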
We first evaluate the performance of different graph representations. As shown in Fig. 8(a), WOG and WNG are superior to all baseline methods when paired with TS-1DCNN. However, WOG contains a large number of redundant edges; taking classification accuracy, time complexity, and memory consumption into account together, WNG is the preferable graph representation of EEG signals.
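The edge-count gap behind this trade-off can be made concrete. A representation in which every vertex may link to every other vertex grows quadratically in the epoch length, whereas linking each vertex only to a small window of temporal neighbors grows linearly. The exact WNG construction rule is not restated in this section, so the neighbor-window form below is an assumption for illustration only.

```python
def dense_edge_count(n: int) -> int:
    """Upper bound for graphs where every vertex may link to every
    other vertex (as in dense representations with redundant edges)."""
    return n * (n - 1) // 2

def neighbor_edge_count(n: int, w: int) -> int:
    """Edges when each vertex links only to its w following temporal
    neighbors (assumed WNG-style sparsity, for illustration)."""
    return sum(min(w, n - 1 - i) for i in range(n))

# A 4096-sample epoch: ~8.4M possible edges in the dense case
# versus ~12K edges with a neighbor window of w = 3.
```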
We then evaluate the performance of different two-stream graph-based frameworks. As shown in Fig. 8(b), WRG and WNG achieve the best classification performance with TS-1DCNN. This also verifies that the two-stream graph-based framework with only one aggregation followed by sequential convolution is superior to the other ablation baselines on graph representations of EEG signals.
We also evaluate the classification performance of WRG and WNG in the time domain, the frequency domain, and both domains simultaneously. As shown in Fig. 8(c), for WRG there is little difference among the three settings; the frequency-domain WRG performs best, which shows that random edges do not improve the graph representation of EEG signals. As shown in Fig. 8(d), for WNG, extracting time- and frequency-domain features simultaneously is significantly better than using either domain alone, which verifies the effectiveness of joint time-frequency feature extraction.

B. Interpretability
We now look into the interpretability of our method. As explained previously, the interpretability comes from the vertex aggregation and the learnable weight vector. Concretely, Fig. 9 plots the learnable weight vectors of the two-stream graph-based framework on all datasets; each weight corresponds to a specific frequency and can be interpreted as the importance of that frequency to epilepsy detection.
On the CHB-MIT and SSW datasets, we find that the weights concentrate between 30 Hz and 60 Hz (gamma rhythm), which is consistent with the conclusions of [45]. On the Bonn dataset, the weights concentrate on 3-4 Hz (A vs. E, C vs. E), 8-12 Hz (D vs. E), and 30-100 Hz (A vs. E, B vs. E, C vs. E, D vs. E), which is consistent with the conclusions of [15], [45]. These observations provide further explanatory evidence for classifying epilepsy with EEG signal graph representations and a graph-based framework.
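Reading off the dominant bands from a learned weight vector amounts to mapping each weight back to its FFT bin frequency and sorting; a minimal sketch, assuming the weights are indexed by one-sided rfft bins (the function name and the synthetic example are illustrative, not the authors' code):

```python
import numpy as np

def top_frequency_bands(weights, fs, k=3):
    """Map a learned per-bin weight vector back to frequencies.

    weights : importance score per rfft bin (length n//2 + 1).
    fs      : sampling rate in Hz.
    Returns the k frequencies (Hz) with the largest weights, ascending.
    """
    n_bins = len(weights)
    # Recover the bin frequencies for an epoch of 2*(n_bins - 1) samples.
    freqs = np.fft.rfftfreq(2 * (n_bins - 1), d=1.0 / fs)
    idx = np.argsort(weights)[::-1][:k]
    return sorted(freqs[idx])

# Synthetic example: a weight vector peaked at the 35, 40, and 60 Hz bins.
w = np.zeros(129)
w[35], w[40], w[60] = 0.5, 1.0, 0.8
```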

C. Robustness of Graph Representation
To illustrate the robustness of our graph representation, we analyze the training curve of each EEG signal graph representation under the different two-stream graph-based frameworks. The experiment covers six classification tasks: CHB-MIT, SSW, A vs. E, B vs. E, C vs. E, and D vs. E.
The following observations can be made from Fig. 10:
1) In the TS-MLP experiments, TS-MLP has weighted feature-extraction ability; accordingly, WOG and WRG achieve the highest classification accuracy among all graphs, which shows that WOG and WRG have the stronger information-representation ability.
2) In the TS-GNN experiments, the training curve of TS-GNN is unstable and its classification accuracy is the lowest among the four models, especially on the SSW and CHB-MIT datasets. This shows that, without sequential feature extraction, the TS-GNN model has very limited classification capability on graph representations of EEG signals.
3) In the TS-SGCN experiments, the curve is stable and the accuracy converges to a higher value, which further verifies the importance of sequential feature extraction.
4) In the TS-1DCNN experiments, the curve is stable and the accuracy converges to the highest value. Compared with TS-SGCN, accuracy improves further after the aggregation operation, even though TS-1DCNN applies vertex aggregation only once.
Based on the above results, the proposed graph representation of EEG signals currently has the lowest time complexity and memory requirements, which greatly accelerates graph generation. The training-curve experiments further show that this graph representation improves the classification accuracy of deep learning models, and that TS-1DCNN performs stably on the classification of epileptic EEG signal graph representations. In summary, the proposed graph representation and the designed two-stream graph-based framework are of practical significance for classifying graph representations of epileptic EEG signals.
It should be noted that the proposed method mainly targets the epilepsy classification task on single-channel EEG signals. In future work, we will design a graph representation based on multi-channel EEG signals to address the same task.

VI. CONCLUSION
In this paper, we propose a new graph representation for EEG signals and develop a two-stream graph-based framework to extract time- and frequency-domain features from the resulting graphs. The proposed graph representation dramatically reduces time complexity and memory requirements while retaining the information of the EEG signal. The two-stream framework extracts time- and frequency-domain features simultaneously, together with the sequential features of the vertices in the graph. Applied to the epileptic EEG classification task and compared with state-of-the-art methods, our graph representation achieves the lowest time complexity and the smallest memory requirement, and the two-stream graph-based framework surpasses other methods in many experiments. These results demonstrate that our method is competitive for the epileptic EEG signal classification task.