Deep C-LSTM Neural Network for Epileptic Seizure and Tumor Detection Using High-Dimension EEG Signals

Electroencephalography (EEG) is a common and significant tool for aiding the diagnosis of epilepsy and for studying the electrical activity of the human brain. Previously, traditional machine learning (ML)-based classifiers were used to identify seizures from features extracted manually from the EEG signals. Although the effectiveness of these contributions has already been proved, they cannot achieve multiple-class classification with automatic feature extraction. Meanwhile, the EEG segment required for identification is too long, which limits the capability of real-time epileptic seizure detection. In this paper, a novel deep convolutional long short-term memory (C-LSTM) model is proposed for detecting seizures and tumors in the human brain and for identifying two eye states (open and closed). It produces a prediction every 0.006 seconds using a short detection duration (one second). Compared with two other types of deep learning approaches (DCNN and LSTM), the presented deep C-LSTM obtains the best performance for classifying these five classes. All of the total accuracies obtained are over 98.80%.


I. INTRODUCTION
Epilepsy is the most common severe neurological disorder, and nearly 50 million people are diagnosed with epilepsy worldwide [1]. Statistics from the World Health Organization (WHO) indicate that 2.4 million people are diagnosed with epilepsy annually. During seizure activity, patients might injure themselves, develop other medical problems, or face life-threatening emergencies. More seriously, they may suffer broken bones, concussions, head injury with bleeding into the brain, and breathing difficulty [2]. The overall risk of dying for a person with epilepsy is 1.6 to 3 times higher than for the general population. Sudden unexpected death in epilepsy (SUDEP) is likely the most common disease-related cause of death in epilepsy. It is not frequent, but it is a genuine problem, and people need to be aware of its risk [3]-[6].
The associate editor coordinating the review of this manuscript and approving it for publication was Shiping Wen.
The electroencephalogram (EEG) signal is generally used for epilepsy detection because epilepsy is a condition related to the brain's electrical activity. Analysis of EEG signals can also be adopted to facilitate advanced human-robot interaction [7]-[9]. EEG recordings are stored digitally for viewing on a computer display unit, which also lends them to automatic analysis. Using a common language for seizure classification also makes it easier to communicate among clinicians caring for people with epilepsy and doing research on epilepsy. An epileptic seizure can be detected by analyzing the EEG signals, because large numbers of brain cells are activated abnormally at the same time during a seizure. Since the treatment of seizures depends on an accurate diagnosis, it is essential to make sure that a patient has epilepsy and to know what kind of disorder it is. Separating seizures into different types helps guide further testing, treatment, and prognosis or outlook. The challenge is locating the ictal spikes and seizures during EEG recording. It is a time-consuming process for an expert to analyze the entire length of the EEG recordings in order to detect epileptic activity [10]. The likelihood of the expert misreading the data and failing to make a proper decision also grows with both the massive amount and the increased usage of long-term EEG recordings. Epilepsy is characterized by the occurrence of recurrent seizures in the EEG signals. During the last decades, many methodologies have been proposed to detect epileptic seizures and tumors in the brain based on EEG signals. Although these classifiers have achieved high classification accuracy, some drawbacks limit their performance.
VOLUME 8, 2020. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see http://creativecommons.org/licenses/by/4.0/
First, most of the previous works only consider binary classification problems, because it is difficult to establish a multiple-class model based on EEG signals without decreasing the accuracy. Second, few of the contributions consider predicting the seizure from a short-time segment. Traditional classification methods, in which most of the EEG is reviewed by a trained professional, are very time-consuming when applied to long recordings [11]. In addition, traditional feature extraction algorithms may miss useful information, which decreases the classification accuracy.
In this paper, we propose a novel deep C-LSTM neural network structure for epileptic seizure and tumor detection. The main contributions of this paper can be summarized as follows:
1) The deep C-LSTM model can classify five brain statuses, including eyes open, eyes closed, seizure activity, and two seizure-free statuses.
2) It enhances the accuracy and achieves noise robustness.
3) It can predict the result using a short EEG signal segment (1 second).
The paper is organized as follows: Section II lists the previous contributions to seizure detection. The adopted data are described in Section III, and the problem statement is introduced in Section IV. Section V describes the data reconstruction, the DCNN, the LSTM, and the proposed deep C-LSTM approaches. The experimental protocol is detailed in Section VI, and the comparison results are discussed in Section VII. Finally, Section VIII draws the conclusions and outlines future work.

II. RELATED WORK
During the last decade, several feature extraction methods and classification approaches have been presented to improve the performance of the built classifier and enhance the seizure detection ability. The Continuous Wavelet Transform (CWT) and the Discrete Wavelet Transform (DWT) are the two general methods used to extract features for building a classifier [23]. The epileptic seizure recognition dataset published on the UCI machine learning website is the most widely used. However, few studies attempt to classify all five classes. Table 1 lists several significant contributions to epileptic seizure detection using traditional machine learning (ML) classification methods and feature extraction approaches. Most of them treat the separation of seizure activity from the normal brain as a binary classification problem. In general, the selected features cover three aspects: time domain, frequency domain, and time-frequency domain [24]. The most popular features are Approximate Entropy (ApEn), Sample Entropy (SampEn), Phase Entropy, Average Frequency (AF), Spectral Entropy (SE), Normalized Spectral Entropy (NSE), Spike Rhythmicity (SR), and Relative Spike Amplitude (RSA) [13]. Meanwhile, the most popular and useful ML methods for identifying seizures from EEG signals are Fuzzy Sugeno (FS), Support Vector Machine (SVM), k-Nearest Neighbour (k-NN), Probabilistic Neural Network (PNN), Decision Tree (DT), Gaussian Mixture Model (GMM), and Naive Bayes (NB) [19].
Recently, many epileptic seizure detection methods based on deep learning (DL) neural networks have been proposed to enhance accuracy and automate feature selection. Table 2 lists four current studies using different deep learning methods. All of them achieve a higher accuracy than the traditional ML approaches, but some limitations have not been considered. First, the detection length is too long to achieve real-time monitoring. Second, few of them can classify multiple classes. Hence, we propose a novel deep learning based structure to solve the above problems. The C-LSTM approach is a popular method that has been applied in many research fields, such as text classification [36], [37], fast biomedical volumetric image segmentation [38], and web traffic anomaly detection [39]. This model has been shown to obtain a higher classification accuracy than several previous DL models, such as DCNN and LSTM.
Although all of the listed contributions achieve a high classification accuracy (over 84%), most of them recognize fewer than four classes. Hence, the applicability of the obtained accuracy is limited. In addition, most of the detection lengths are longer than 2 seconds, which limits the performance of real-time monitoring. Furthermore, noise interference is another problem that affects the recognition rate, yet none of these works attempts to solve it.

III. DATA DESCRIPTION
The EEG signals for epileptic seizures and tumors are taken from the UCI machine learning website. The raw data were recorded with the same 128-channel amplifier system, using an average common reference [40]. The dataset consists of five file sets, which represent two activities (eyes open in set A and eyes closed in set B) and three disease-related conditions (sets C, D, and E). The latter originate from the EEG archive of presurgical diagnosis. These EEGs were selected from five patients who had achieved complete seizure control after resection of one of the hippocampal formations. Set C was recorded from the hippocampal formation of the opposite hemisphere of the brain, while set D was captured from the epileptogenic zone. Both of them were measured during seizure-free intervals. Set E contains the epileptic seizure activity. The sampling rate of the data acquisition system is 173.61 Hz, with a band-pass filter setting of 0.53-40 Hz (12 dB/oct). Fig. 1 shows the collected EEG signals representing the five activities and diseases. Each original set consists of 100 files, with each file representing a single subject. Each file is a recording of brain activity lasting 23.6 seconds, and the corresponding time series is sampled into 4097 data points. Each data point is the value of the EEG recording at a different point in time. In total, there are 500 individuals, each with 4097 data points covering 23.6 seconds.
In this paper, we adopt a sliding window strategy with fixed detection length and overlap to recognize the epileptic seizure activities and identify the other classes.
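The sliding-window segmentation can be illustrated with a minimal sketch. It assumes NumPy, a hypothetical helper named `sliding_windows`, a hypothetical 50% overlap (the overlap value is not stated in this section), and random numbers standing in for one 100 × 4097 EEG set.

```python
import numpy as np

FS = 173.61          # sampling rate in Hz (Section III)
N_SAMPLES = 4097     # data points per recording (Section III)

def sliding_windows(x, length, overlap):
    """Cut x (channels x samples) into windows of `length` samples.

    Consecutive windows share `overlap` samples, so the hop is length - overlap.
    Returns an array of shape (num_windows, channels, length).
    """
    if not 0 <= overlap < length:
        raise ValueError("overlap must satisfy 0 <= overlap < length")
    hop = length - overlap
    starts = range(0, x.shape[1] - length + 1, hop)
    return np.stack([x[:, s:s + length] for s in starts])

duration = N_SAMPLES / FS        # length of one recording in seconds
L_d = int(round(FS * 1.0))       # 1-second detection length in samples
L_o = L_d // 2                   # hypothetical 50% overlap (assumption)

rng = np.random.default_rng(0)
recording = rng.standard_normal((100, N_SAMPLES))   # stand-in data, not real EEG
segments = sliding_windows(recording, L_d, L_o)
print(round(duration, 1), L_d, segments.shape)
```

A larger overlap yields more (and more correlated) segments per recording, while a zero overlap gives disjoint windows.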

IV. PROBLEM STATEMENT
Epileptic seizure and tumor detection can be regarded as a multiple-class classification problem using 100-dimensional EEG signals. The following notation describes the procedure of building the classifier and making predictions. Let the labeled sequence be $(s_i, y_i)$, $i = 1, 2, \cdots, N$, where $N$ is the length of the whole collected dataset. As described in Section III, each input matrix $s_i \in \mathbb{R}^{100 \times L_d}$, where $L_d$ is the detection length, and $y_i$ is a discrete class label. In this article, there are five classes (see Fig. 1), i.e., eyes open (A), eyes closed (B), seizure-free hippocampal (C), seizure-free epileptogenic zone (D), and seizure activity (E).
The first step is to build a five-class classifier with the proposed DL model from the labeled training pairs $(s_p, y_p)$, $p = 1, 2, \cdots, M$, where $M$ is the number of pairs in the training dataset and $M \le N$. The built DL classifier can be denoted as:

$$\hat{y} = f(s, \theta), \tag{1}$$

where $\theta$ is the overall set of parameters of the DL model. Testing is carried out in a supervised manner. When a new input $s_t$ is acquired, the DL classifier predicts the result as $\hat{y}_t = f(s_t, \theta)$. The recognition error $\varepsilon$ can be calculated by comparing $\hat{y}_t$ with the true label $y_t$ as follows:

$$\varepsilon = \frac{1}{T} \sum_{t=1}^{T} e_t, \tag{2}$$

where

$$e_t = \begin{cases} 0, & \hat{y}_t = y_t \\ 1, & \hat{y}_t \ne y_t \end{cases} \tag{3}$$

and $T$ is the length of the testing dataset. The DL classifier $f$ aims to find the optimal parameter set $\theta^{*}$ that minimizes the recognition error:

$$\theta^{*} = \arg\min_{\theta} \varepsilon. \tag{4}$$
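As an illustration, the recognition error can be read as the fraction of test segments whose predicted label differs from the true label, with the overall accuracy as its complement. The sketch below follows that reading; the function name `recognition_error` and the example labels are made up for illustration and are not results from the paper.

```python
def recognition_error(y_true, y_pred):
    """Fraction of test segments whose predicted label differs from the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    mistakes = sum(1 for t, p in zip(y_true, y_pred) if t != p)
    return mistakes / len(y_true)

# Made-up labels over the five classes A-E; one of the eight predictions is wrong.
y_true = ["A", "B", "C", "D", "E", "E", "A", "C"]
y_pred = ["A", "B", "D", "D", "E", "E", "A", "C"]

error = recognition_error(y_true, y_pred)
accuracy = 1.0 - error
print(error, accuracy)
```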

V. METHODOLOGY
A novel deep C-LSTM structure is designed to implement a multiple-target classifier with enhanced accuracy and noise robustness. To this end, the raw EEG signals are first reconstructed to prepare the training and testing datasets, and the C-LSTM neural network is then established.

A. DATA RECONSTRUCTION
To build the DL-based classifier for identifying the five brain statuses, especially for epileptic seizure recognition, a sliding window strategy [41] is adopted with a fixed detection length $L_d$ and overlap $L_o$. As described in Section III, the raw EEG signal is a time-varying sequence with 100 dimensions. Hence, each EEG segment is reconstructed as a matrix $s_t \in \mathbb{R}^{100 \times L_d}$. Finally, the inputs can be regarded as a sequence $\{s_1, s_2, \cdots, s_t\}$ that changes over time (see Fig. 2).

B. DEEP CONVOLUTIONAL NEURAL NETWORK
First, patterns are extracted from the reconstructed input matrix $s_t \in \mathbb{R}^{100 \times L_d}$ by the designed convolutional layer with a filter $F \in \mathbb{R}^{n \times m}$. The convolution operation obtains the feature matrix $c_t \in \mathbb{R}^{(100-n+1) \times (L_d-m+1)}$ by computing the convolution between $s_t$ and $F$ as follows:

$$c_{j,i} = \sum_{p=1}^{n} \sum_{q=1}^{m} \left( s_{[j-n+1:j,\; i-m+1:i]} \otimes F \right)_{p,q}, \quad j = n, \cdots, 100, \quad i = m, \cdots, L_d,$$

where $s_{[j-n+1:j,\; i-m+1:i]}$ is the sub-matrix of $s_t$ spanning $n$ rows and $m$ columns, and the operator $\otimes$ is the element-wise multiplication. By sliding the filter over $s_t$, each component $c_{j,i}$ is obtained from an element-wise product and is a single value, as shown in Fig. 2.
A single filter computes one convolution over the input matrix. To form a richer representation of the EEG signals, the DCNN model applies a set of filters to compute the convolutional matrices in parallel. The resulting multiple feature maps can be denoted as $c \in N @ n \times m$, where $N$ is the number of filters (also shown in Fig. 2). Meanwhile, a bias vector $b \in \mathbb{R}^{N}$ is added to the convolution results so that an appropriate threshold can be learned.
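The filtering step can be sketched in a few lines. This is a minimal illustration that assumes NumPy, a hypothetical helper `conv2d_valid`, random numbers in place of a real EEG segment, and four 8 × 8 filters as used for the first CNN layer in Section V-D; like most DL libraries, it does not flip the kernel.

```python
import numpy as np

def conv2d_valid(s, F, b=0.0):
    """Slide filter F over s without padding.

    Each output value is the sum of the element-wise product between F and the
    n x m sub-matrix of s under it, plus the bias b.
    """
    n, m = F.shape
    rows, cols = s.shape[0] - n + 1, s.shape[1] - m + 1
    c = np.empty((rows, cols))
    for j in range(rows):
        for i in range(cols):
            c[j, i] = np.sum(s[j:j + n, i:i + m] * F) + b
    return c

rng = np.random.default_rng(0)
s_t = rng.standard_normal((100, 174))       # stand-in for one 1-second segment
filters = rng.standard_normal((4, 8, 8))    # four 8 x 8 filters (random, untrained)
biases = np.zeros(4)                        # one bias per filter

# One feature map per filter, each of size (100 - 8 + 1) x (174 - 8 + 1).
feature_maps = np.stack([conv2d_valid(s_t, F, b) for F, b in zip(filters, biases)])
print(feature_maps.shape)
```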
However, full whitening of the convolutional outputs is costly and not everywhere differentiable. To solve these problems, a typical DCNN model adopts two necessary simplifications, BN and ReLU. Instead of whitening the inputs and outputs jointly, the BN layer normalizes each computed feature independently, so that it has a mean of zero and a variance of one. In more detail, the $d$-dimensional convolutional inputs $c = \{c_1, \cdots, c_d\}$ can be normalized as follows:

$$\hat{c}_k = \frac{c_k - \mathrm{E}[c_k]}{\sqrt{\mathrm{Var}[c_k]}}.$$

This step speeds up convergence even though the features are not decorrelated [42]. However, simply normalizing each input of a layer may change what the layer can represent. For instance, normalizing the inputs of a sigmoid would constrain them to the linear regime of the nonlinearity. Hence, a pair of parameters $\gamma_k$ and $\beta_k$ is introduced to scale and shift the normalized value:

$$z_k = \gamma_k \hat{c}_k + \beta_k.$$

The representation power of the network can then be restored by learning these parameters along with the other model parameters. A non-linear activation function is adopted to enable the learning of non-linear decision boundaries. In this paper, we use the ReLU function to build the DCNN model in order to enhance accuracy and accelerate calculation.
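The normalization, scale-and-shift, and ReLU steps can be sketched as follows. This is an illustrative NumPy version under stated assumptions: the statistics are taken over a mini-batch, a small constant `eps` is added for numerical stability (common practice, not stated in the text), and the input values are random stand-ins.

```python
import numpy as np

def batch_norm(c, gamma, beta, eps=1e-5):
    """Normalize each feature (column) over the batch, then scale and shift it."""
    mean = c.mean(axis=0)
    var = c.var(axis=0)
    c_hat = (c - mean) / np.sqrt(var + eps)   # zero mean, (almost) unit variance
    return gamma * c_hat + beta

def relu(x):
    """Rectified linear unit: keep positive values, set the rest to zero."""
    return np.maximum(x, 0.0)

rng = np.random.default_rng(0)
c = rng.normal(loc=3.0, scale=2.0, size=(32, 5))   # batch of 32 samples, d = 5 features
gamma, beta = np.ones(5), np.zeros(5)              # identity scale and shift

z = batch_norm(c, gamma, beta)
a = relu(z)
print(z.mean(axis=0).round(6), z.var(axis=0).round(3))
```

With `gamma` set to one and `beta` to zero the layer only normalizes; during training both would be learned together with the other parameters.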
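To prevent the DCNN model from overfitting, we add a dropout layer with a rate of 0.3 before passing the features to the FC and softmax layers. The probability distribution over the classes can be computed as follows:

$$P(y = v \mid x) = \frac{\exp\left(w_v^{\top} x + b_v\right)}{\sum_{k} \exp\left(w_k^{\top} x + b_k\right)},$$

where $x$ denotes the features passed to the softmax layer, the sum runs over all class labels, and $w_v$ and $b_v$ are the weights and bias of the $v$-th label [43].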
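For illustration, the softmax step can be written in a few lines of plain Python. The logits below are made-up numbers standing in for the FC-layer outputs of one segment, and the class names follow the five classes A to E.

```python
import math

def softmax(logits):
    """Convert FC-layer outputs into a probability distribution over the classes."""
    shift = max(logits)                          # subtract the max for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

CLASSES = ["A", "B", "C", "D", "E"]
logits = [1.2, 0.3, -0.5, 2.8, 0.1]              # made-up scores, one per class

probs = softmax(logits)
predicted = CLASSES[probs.index(max(probs))]     # class with the highest probability
print(predicted, [round(p, 3) for p in probs])
```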

C. RECURRENT NEURAL NETWORK
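As a special RNN structure, the LSTM network has proven robust and powerful for modeling general-purpose sequences with long-range dependencies in time-varying studies [44]. Because the collected EEG signals form a time-based sequence, the current status has a strong relationship with the previous states, which makes the LSTM model a natural choice for this problem. As shown in Fig. 3, the memory cell $c_t$ in the LSTM module accumulates the state information, and it is accessed, written, and cleared by several self-parameterized controlling gates. If the input gate $i_t$ is activated, the new information is accumulated in the cell. The forget gate $f_t$ controls whether the past cell status $c_{t-1}$ is kept or ''forgotten''. The output gate $o_t$ controls whether the latest cell output $c_t$ is propagated to the final state $h_t$. In the standard LSTM formulation, the related equations can be expressed as follows:

$$i_t = \sigma\left(W_{xi} x_t + W_{hi} h_{t-1} + b_i\right)$$
$$f_t = \sigma\left(W_{xf} x_t + W_{hf} h_{t-1} + b_f\right)$$
$$c_t = f_t \circ c_{t-1} + i_t \circ \tanh\left(W_{xc} x_t + W_{hc} h_{t-1} + b_c\right)$$
$$o_t = \sigma\left(W_{xo} x_t + W_{ho} h_{t-1} + b_o\right)$$
$$h_t = o_t \circ \tanh\left(c_t\right)$$

where $x_t$ is the input sequence, $\sigma$ is the logistic sigmoid function, $\circ$ denotes the element-wise product, and the $W$ and $b$ terms are the weights and biases of the corresponding gates.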
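A single LSTM update can be sketched as below. The sketch assumes the standard LSTM formulation without peephole connections, NumPy, small random untrained weights, an input size of 10, and 30 hidden units (the sizes quoted for the FC and LSTM layers in Section V-D); the stacked-weight layout is an implementation choice for illustration only.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM time step.

    W (4H x D), U (4H x H), and b (4H) stack the input-gate, forget-gate,
    output-gate, and candidate blocks, in that order.
    """
    H = h_prev.shape[0]
    z = W @ x_t + U @ h_prev + b
    i_t = sigmoid(z[0:H])              # input gate: how much new information to write
    f_t = sigmoid(z[H:2 * H])          # forget gate: how much of c_prev to keep
    o_t = sigmoid(z[2 * H:3 * H])      # output gate: how much of the cell to expose
    g_t = np.tanh(z[3 * H:4 * H])      # candidate cell content
    c_t = f_t * c_prev + i_t * g_t
    h_t = o_t * np.tanh(c_t)
    return h_t, c_t

rng = np.random.default_rng(0)
D, H = 10, 30
W = rng.standard_normal((4 * H, D)) * 0.1
U = rng.standard_normal((4 * H, H)) * 0.1
b = np.zeros(4 * H)

h, c = np.zeros(H), np.zeros(H)
for x_t in rng.standard_normal((5, D)):    # five made-up time steps
    h, c = lstm_step(x_t, h, c, W, U, b)
print(h.shape, c.shape)
```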

D. THE PROPOSED DEEP C-LSTM ARCHITECTURE
Although both the DCNN and LSTM models have proven powerful for handling time sequences and for noise robustness, they cannot maintain a high and stable classification accuracy, and the LSTM model contains too much redundancy, which makes it time-consuming. To address these problems, we propose a deep convolutional-LSTM (C-LSTM) model to classify the five classes of EEG signals. Fig. 4 displays the architecture of the designed deep C-LSTM network, which consists of a DCNN module and an LSTM network. The DCNN structure aims to extract the features and reduce the dimension of the raw EEG signals, while the LSTM layer enhances the recognition accuracy. In the DCNN module, two convolutional layers are adopted with the same filter size of 8 × 8. The first CNN layer has four filters, while the second one has eight. The rate of the dropout layer is 0.3, and the number of neurons in the FC layer is set to 10. Hence, the size of the parameter matrix obtained from the first FC layer is 10. The LSTM layer has 30 neurons, so the parameter matrix of the second FC layer is 5 × 30.
The parameters of each layer in the proposed deep C-LSTM framework are described as follows:
• Inputs: As described in Section V-A, the input can be regarded as a $100 \times L_d$ matrix, because the raw EEG signal has 100 dimensions. Fig. 4 shows an input EEG matrix of 1 second, namely $L_d \approx 174$.
To cope with the large amount of data and parameters, we use the "Adam" optimizer, which relies on adaptive estimates of lower-order moments [46]. This method is straightforward to implement, computationally efficient, and has low memory requirements. We set the initial learning rate to 0.005. The learning-rate drop period and drop factor are 50 and 0.1, respectively.
• LSTM: An LSTM network with 30 nodes is used to learn the information from the time sequence. Similarly, the LSTM model also adopts the "Adam" optimizer. The learning rate is set to 0.005, with a drop factor of 0.2 and a drop period of 5.
• Dropout: To address the over-fitting problem, two dropout layers are placed after the ReLU function and the LSTM layer. The dropout layers also aim to improve the generalization error as the number of layers in the neural network increases [47], and they can reduce the training time. We set the dropout rate of the two dropout layers to 0.5.
• Softmax: After the output of the FC layer is acquired, the softmax activation function is adopted to turn it into class probabilities. Once the probability of each node in the softmax layer is obtained, the class with the highest value is selected as the final classification result [48].
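The learning-rate settings quoted above can be made concrete with a small sketch. It assumes a piecewise schedule in which the rate is multiplied by the drop factor once every drop period (counted in epochs); this reading and the function name `piecewise_lr` are assumptions, since the text does not spell out the schedule.

```python
def piecewise_lr(epoch, initial=0.005, drop_factor=0.1, drop_period=50):
    """Learning rate at a given epoch (0-based) under a piecewise drop schedule.

    Assumption: the rate is multiplied by `drop_factor` every `drop_period` epochs.
    """
    if epoch < 0 or drop_period <= 0:
        raise ValueError("epoch must be >= 0 and drop_period must be > 0")
    return initial * drop_factor ** (epoch // drop_period)

# First settings quoted in the text: initial rate 0.005, factor 0.1, period 50.
first_schedule = [piecewise_lr(e) for e in (0, 49, 50, 100)]

# LSTM settings quoted in the text: initial rate 0.005, factor 0.2, period 5.
lstm_schedule = [piecewise_lr(e, 0.005, 0.2, 5) for e in (0, 5, 10)]

print(first_schedule)
print(lstm_schedule)
```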

VI. EXPERIMENTAL PROTOCOL
A. DATASET SETTING
As described in Section III, the EEG dataset used has 4097 × 5 samples (23.6 × 5 seconds). We choose three detection lengths (1 s, 1.5 s, and 2 s) to evaluate the classification accuracy. For each experiment, we adopt two strategies for building the training and testing datasets, namely 50%-50% and 60%-40%. For example, the 50%-50% experiment uses half of the EEG data to train the classifier and the remaining half for evaluation. To make the experiment more convincing, we select the training segments at random and run each experiment more than 20 times to obtain the average and standard deviation of the classification results.

B. EVALUATION PARAMETERS
We adopt the overall accuracy to evaluate the total classification accuracy of the classifiers. The computational equations are defined by Eqs. 2 and 3. To evaluate the performance of each binary classification task, sensitivity and F1-score are used. Sensitivity measures the proportion of actual positives that are correctly identified. A high F1-score means that the classifier has few false positives and few false negatives. Sensitivity and F1-score can be calculated by the following equations:

$$\text{Sensitivity} = \frac{TP}{TP + FN}$$

$$\text{F1-score} = \frac{2\,TP}{2\,TP + FP + FN}$$

where TP and FP denote the numbers of true positives and false positives, respectively. TP means that the model correctly predicts the positive class, while FP indicates that the model incorrectly predicts the positive class. Similarly, FN and TN are the numbers of false negatives and true negatives, which means that the model incorrectly predicts the negative class and correctly predicts the negative class, respectively. The best value for both F1-score and sensitivity is 1.
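As a worked example of these two measures, the sketch below counts TP, FP, FN, and TN for one class against the rest and evaluates sensitivity and F1-score. The helper names and the labels are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def one_vs_rest_counts(y_true, y_pred, positive):
    """Count TP, FP, FN, TN when `positive` is treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    return tp, fp, fn, tn

def sensitivity(tp, fn):
    return tp / (tp + fn)

def f1_score(tp, fp, fn):
    return 2 * tp / (2 * tp + fp + fn)

# Made-up labels: one seizure segment (E) is missed and one C segment is flagged as E.
y_true = ["E", "E", "E", "A", "B", "C", "D", "E"]
y_pred = ["E", "E", "A", "A", "B", "E", "D", "E"]

tp, fp, fn, tn = one_vs_rest_counts(y_true, y_pred, positive="E")
sens = sensitivity(tp, fn)     # 3 / (3 + 1)
f1 = f1_score(tp, fp, fn)      # 6 / (6 + 1 + 1)
print(tp, fp, fn, tn, sens, f1)
```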

A. CLASSIFICATION PERFORMANCE
The epileptic seizure detection performance of the proposed deep C-LSTM model was evaluated by comparing the overall accuracy, F1-score, and sensitivity. To obtain reliable estimates and limit the influence of overfitting, all of the experiments are run more than 20 times. In Table 3, the proposed deep C-LSTM model obtains a higher total accuracy than the other two types of DL methods in both the 50%-50% and 60%-40% strategies. The comparison results also show that the deep C-LSTM approach can classify the five classes using a short detection period (1 s, 1.5 s, and 2 s) with a high recognition rate (more than 98.80%). However, the total accuracy alone cannot describe the performance of the multi-class classification and the ability of the classifier. As described above, F1-score and sensitivity are the two main measures used to evaluate the classification ability for each class. Hence, both qualitative and quantitative analysis methods are adopted to evaluate the sensitivity and robustness in Fig. 5. The DCNN model with two layers of CNN modules is an unstable classifier, which makes it unsuitable for epileptic seizure detection. The LSTM model shows a lower F1-score for recognizing the seizure activity, while the proposed deep C-LSTM method acquires the best F1-score in each binary classification task, which indicates that it is the best of the three methods in terms of accuracy and robustness.
TABLE 3. Comparison of the total accuracy among DCNN, LSTM, and deep C-LSTM.
Similarly, the average sensitivity of each class is shown in Fig. 6. Although the proposed deep C-LSTM obtains the best sensitivity, the comparison results show that the DCNN model is good at recognizing the seizure activity, while the LSTM model is not. Nevertheless, the LSTM model achieves a higher recognition rate for identifying the eyes-open and eyes-closed classes.

B. NOISE ROBUSTNESS
To apply this model in practice, noise robustness is necessary. Hence, various Gaussian noise sources are added to the input at each epoch during network training [49]-[51]. The aim is to assess how the deep C-LSTM model behaves under noise whose level is characterized by its standard deviation. In this experiment, the proposed deep C-LSTM method is compared with the DCNN and LSTM approaches, because noise robustness depends on the regularization method and the selection of the hyperparameters.
By using Gaussian noise sources with different signal-to-noise ratios (SNRs), the noise robustness of the proposed deep C-LSTM model can be evaluated. In this experiment, we choose two SNR values, i.e., 10 dB and 30 dB. For a more detailed evaluation, the F1-score, sensitivity, and overall accuracy are compared. Table 4 shows that the proposed deep C-LSTM model has better noise robustness. Most of the F1-score and sensitivity values are close to 1.
To make the experiment more convincing, we add a 20 dB SNR condition to compare the overall accuracy among DCNN, LSTM, and C-LSTM. The proposed deep C-LSTM not only obtains the highest average accuracy (over 99.38%) but also has the lowest standard deviation.
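One common way to generate Gaussian noise at a prescribed SNR is to scale the noise power relative to the measured signal power. The sketch below follows that convention as an assumption; the paper does not specify how its noise was generated, and the 10 Hz sine wave is only a synthetic stand-in for a 1-second EEG segment.

```python
import numpy as np

def add_gaussian_noise(signal, snr_db, rng):
    """Add white Gaussian noise so that signal power / noise power matches snr_db."""
    signal_power = np.mean(signal ** 2)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise

rng = np.random.default_rng(0)
t = np.arange(174) / 173.61                  # one second at the dataset sampling rate
clean = np.sin(2 * np.pi * 10.0 * t)         # synthetic stand-in signal

for snr_db in (10, 20, 30):
    noisy = add_gaussian_noise(clean, snr_db, rng)
    # Empirical SNR of this realization; it fluctuates around snr_db for short signals.
    measured = 10 * np.log10(np.mean(clean ** 2) / np.mean((noisy - clean) ** 2))
    print(snr_db, round(float(measured), 1))
```

A lower SNR corresponds to a larger noise standard deviation, so the 10 dB condition is the hardest of the three.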

VIII. CONCLUSION
In this paper, we proposed a novel deep C-LSTM model to detect epileptic seizures and tumors in the human brain. A five-class dataset is adopted to evaluate the performance of the deep C-LSTM method. Meanwhile, its noise robustness is demonstrated by adding different levels of white noise to the raw EEG signals. The deep C-LSTM method is shown to recognize epileptic seizures with a short detection length (1 second). Compared with the LSTM and DCNN approaches, the deep C-LSTM model improves multiple-class classification accuracy and noise robustness. However, such model-based methods are generally vulnerable to model errors and unexpected measurement noise, and they are computationally expensive. This aspect needs further investigation in the future.
The trained deep C-LSTM model is expected to be applied to the practical detection of epileptic seizures and tumors. Because of the limitations of the dataset, the built deep C-LSTM model should be improved by training on a larger dataset.
Despite some limitations of the deep C-LSTM model, this approach is still promising, because improved performance can be readily obtained on a dataset that simulates various uncertain conditions, such as measurement noise.