Automatic Identification of Epileptic Seizures From EEG Signals Using Sparse Representation-Based Classification

Identifying seizure activity in non-stationary electroencephalography (EEG) recordings by visual inspection is a challenging task: it is time-consuming, burdensome, dependent on expensive human resources, and subject to error and bias. A computerized seizure identification scheme can mitigate these problems, assist clinicians, and benefit epilepsy research. Several attempts have been made to develop automatic systems that help neurophysiologists accurately identify epileptic seizures. In this research, a fully automated system is presented to detect the various states of the epileptic seizure. The study is based on sparse representation-based classification (SRC) theory and the proposed dictionary learning using EEG signals. Furthermore, this work does not require the additional preprocessing and feature extraction that are common in existing methods. The method reached a sensitivity, specificity, and accuracy of 100% in 8 out of 9 scenarios. It is also robust to measurement noise down to a signal-to-noise ratio of 0 dB. Compared with state-of-the-art algorithms and other common methods, the proposed method performed better in terms of sensitivity, specificity, and accuracy. Moreover, it covers the most comprehensive set of scenarios for epileptic seizure detection, including different combinations of 2- to 5-class problems. The proposed method can reduce the burden on medical professionals who analyze large data sets through visual inspection, and can help in deprived societies suffering from a shortage of functional magnetic resonance imaging (fMRI) equipment and specialized physicians.


Introduction
As reported by the World Health Organization, about 50 million people worldwide suffer from epilepsy [1]. Epilepsy, the second most common brain disorder after stroke, is characterized by unexpected seizures in which nerve cells generate abnormal electrical activity that leads to loss of consciousness for a limited period of time [2]. Proper diagnosis of epileptic seizures is essential to control and reduce the risk of epileptic attacks [3]. Currently, the diagnosis of epilepsy is based on neurological examination and auxiliary tests such as neuroimaging and electroencephalography (EEG). EEG signals can reflect epileptic abnormalities in both the inter-ictal (between seizures) and ictal (during seizures) stages. Typically, neurons communicate with each other by means of electrical potentials, which follow a normal pattern in healthy brain activity. During epilepsy, abnormal electrical activity occurs in the brain's neural network, and this increased activity can spread through the entire cortex. Traditionally, a neurologist inspects the EEG for epileptic abnormalities. Interpreting EEG signals by visual evaluation is time-consuming and tedious, and the results vary with, and are limited by, the knowledge and expertise of the physician. Anti-epileptic drugs have some restrictions and fail to control seizures in 20-30% of patients [3]. However, it is reported that taking anti-epileptic drugs during the pre-ictal stage might be more effective, preventing the ictal stage and the possible physical injuries caused by loss of consciousness [3,4]. Therefore, an automated computer diagnostic system that detects epileptic states from EEG signals using machine learning techniques appears essential. In addition to helping the expert diagnose the epileptic stages, such a system can continuously monitor high-risk patients, raise an alert before a seizure occurs, and inform the patient to take the drug. An epileptic seizure passes through several stages (of brain activity in an individual with epilepsy), which play a major role in anticipating seizures. Previous studies show that the seizure process is divided into four stages: pre-ictal, inter-ictal (pre-seizure disturbances), ictal (during a seizure), and post-ictal. Evidence suggests that seizures arise from a recognizable brain state called pre-ictal, which can be used as a clue to predict the upcoming ictal stage [4][5][6].
Recent studies on the automatic identification of epileptic seizures are reviewed in the following. Tzallas et al. [7] calculated the power spectral density (PSD) of EEG signal segments using a variety of time-frequency distributions and used the PSD as a discriminative feature to classify epileptic seizure stages. Adeli et al. [8] reported a classification algorithm using the wavelet transform and features based on nonlinear dynamics, such as the largest Lyapunov exponent and the correlation dimension. Oweis et al. [9] extracted frequency features from the Hilbert-Huang transform and used the t-test to verify the importance of the features. The accuracy and specificity of their algorithm for classifying the 2 states (epileptic and normal) were reported as 94% and 96%, respectively. Bajaj et al. [10] used empirical mode decomposition (EMD) to compute modulation bandwidth features and then utilized a least squares support vector machine (LS-SVM) for classification. They also used the Kruskal-Wallis statistical test to verify the features. The sensitivity, accuracy and specificity of their algorithm for classifying the epileptic and normal states were reported as 100%, 99% and 99%, respectively. Alam et al. [11] used EMD and artificial neural networks (ANN) for the identification of epilepsy. Both of these methods are affected by the mode-mixing problem of EMD, meaning that EMD may produce widely varying oscillations in the same mode or similar oscillations in different modes. Peker et al. [12] extracted five statistical features using the dual-tree complex wavelet transform and then applied complex-valued neural networks to classify epileptic seizure states in 4 different scenarios, evaluating their algorithm with 10-fold cross-validation. Wang et al. [13] introduced multivariate autoregressive modeling, partial directed coherence and SVM classification for automatic seizure detection. Samiee et al. [14] proposed a rational discrete short-time Fourier transform and statistical features for the classification of epileptic seizures. Das et al. [15] employed normal inverse Gaussian parameters in the wavelet domain in their seizure classification scheme. Guler et al. [16] proposed a seizure detection scheme using wavelet coefficients and a multi-class support vector machine based on the Lyapunov exponents. Guo et al. [17] presented a seizure detection model using the line length features of EEG wavelet sub-bands, followed by an artificial neural network for classification. Swami et al. [18] extracted features such as energy, Shannon entropy and a few other statistical features from EEG sub-bands and fed them to a general regression neural network classifier. Hassan et al. [19] presented an automatic diagnostic design for various epileptic seizures based on the tunable-Q wavelet transform and bootstrap classification, leading to an accuracy of 99%. Sharma et al. [20] used the analytic time-frequency flexible wavelet transform and calculated fractal dimensions to discriminate between various epileptic states. They reported an accuracy of 99% for their proposed method based on an LS-SVM classifier. Acharya et al. [21] proposed a convolutional neural network (CNN) for automatic identification of pre-ictal, inter-ictal and normal states from the EEG signal. The proposed CNN architecture includes 10 convolution and 3 fully connected layers, which led to an accuracy and sensitivity of 88% and 95%, respectively.
The challenging step in automatic epileptic seizure detection is selecting features that discriminate between the various stages of epilepsy. In the majority of existing works, different time, frequency, time-frequency and statistical features are first extracted, and the most discriminative features are then picked either manually or with traditional feature selection methods, a time-consuming procedure with high computational complexity. In addition, the best features for one case/subject may not be optimal for another. It is therefore crucial to implement an algorithm that learns the appropriate features for each case/subject, which is the main advantage of the approach in this paper. First, a sparsifying transform is introduced for the EEG signal of each designated state of epileptic seizure. Then, online dictionary learning is used to obtain the sparsest representation for each state, and sparse representation-based classification (SRC) is applied to identify the different classes. The proposed approach can be considered an end-to-end classifier: no feature selection/extraction procedure is needed, and the discriminative features of each class are learned automatically during dictionary learning.
In dictionary learning, two sets of parameters need to be optimized: the atoms of the dictionary and the sparse coefficients that relate the atoms to the training data set. Since the dictionary learning problem is NP-hard, dictionary learning algorithms alternate between the two. In the first step, called sparse coding, the sparse coefficients are calculated for a fixed dictionary. The most conventional algorithms for this step are Matching Pursuit (MP) and Orthogonal Matching Pursuit (OMP) [22,23]. In the second step, the sparse coefficients calculated in the previous step are used to update the atoms of the dictionary. These two steps are repeated until the algorithm converges. Most of the attention in the dictionary learning literature goes to improving the algorithms used in the second step. Important algorithms for this step include the Method of Optimal Directions (MOD) [24], Recursive Least Squares (RLS) dictionary learning [25], Online Dictionary Learning (ODL) for sparse representation [26] and the K-Singular Value Decomposition (K-SVD) method [27].
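The two alternating steps can be illustrated with a minimal NumPy sketch that pairs OMP (sparse coding) with the MOD update (dictionary update). It is only an illustration of the generic alternating scheme, not the algorithm used in this paper; the function names `omp` and `learn_dictionary`, the random initialization, and the re-seeding of unused atoms are assumptions made for the example.

```python
import numpy as np

def omp(D, y, k):
    """Orthogonal Matching Pursuit: approximate y with at most k atoms of D."""
    residual = y.copy()
    support = []
    coef = np.zeros(0)
    for _ in range(k):
        # Select the atom most correlated with the current residual
        j = int(np.argmax(np.abs(D.T @ residual)))
        if j in support:
            break
        support.append(j)
        # Least-squares fit of y on the atoms selected so far
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    x = np.zeros(D.shape[1])
    x[support] = coef
    return x

def learn_dictionary(Y, n_atoms, k, n_iter=10, seed=0):
    """Alternate sparse coding (OMP) and the MOD update D = Y X^+."""
    rng = np.random.default_rng(seed)
    D = rng.standard_normal((Y.shape[0], n_atoms))
    D /= np.linalg.norm(D, axis=0)
    for _ in range(n_iter):
        # Step 1: sparse coding with the dictionary held fixed
        X = np.column_stack([omp(D, y, k) for y in Y.T])
        # Step 2: dictionary update with the coefficients held fixed (MOD)
        D = Y @ np.linalg.pinv(X)
        # Re-seed atoms that were never used (assumption of this sketch)
        dead = np.linalg.norm(D, axis=0) < 1e-12
        D[:, dead] = rng.standard_normal((Y.shape[0], int(dead.sum())))
        D /= np.linalg.norm(D, axis=0)
    # Final coefficients for the returned dictionary
    X = np.column_stack([omp(D, y, k) for y in Y.T])
    return D, X
```

Calling `learn_dictionary(Y, n_atoms, k)` on a matrix `Y` whose columns are training segments returns a dictionary with unit-norm atoms and a coefficient matrix with at most `k` nonzeros per column.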
In this paper, we have also considered the various scenarios for the occurrence of epileptic seizures found in the related literature (and in the existing datasets) and evaluated the proposed algorithm in the 9 most complex scenarios to identify the specific states related to the epileptic seizure. The results are very promising: in 8 out of 9 scenarios the classification accuracy was 100%, while in the remaining one it was as much as 95%. Finally, unbalanced class data is another challenging issue in previous work, where authors either used data augmentation to balance the classes or chose classifiers that are not sensitive to class imbalance; the proposed method, in contrast, is almost insensitive to unbalanced class populations. The remainder of the paper is organized as follows. The database and the mathematical background of SRC are given in Section 2. The theory of the proposed algorithm is discussed in Section 3. The simulation results and a comparison of the proposed method with the state-of-the-art are given in Section 4, followed by concluding remarks in Section 5.

Materials and methods
In this section, we first introduce the EEG database from the University of Bonn. Then, the mathematical background of SRC theory is provided.

EEG database
In this paper, we have used the EEG database created by Andrzejak et al. [6] at the University of Bonn. This publicly available database is widely used in seizure detection studies. It consists of 500 single-channel EEG epochs in 5 subsets (A, B, C, D, and E) recorded from both healthy individuals and individuals suffering from seizures (100 epochs in each subset). Sample EEG epochs belonging to subsets A, B, C, D, and E are shown in Figure 1. Subsets A and B contain EEG data recorded in a relaxed, awake state from five healthy subjects with eyes open (subset A) and eyes closed (subset B). Subsets C and D were recorded from five patients who had achieved complete seizure control after resection of the epileptic focus. The EEG signals in subset C were recorded from the hippocampal formation of the opposite brain hemisphere (inter-ictal), while the signals in D were recorded from the hippocampal formation identified as the epileptogenic area. Finally, subset E contains only ictal activity in the epileptogenic area. Each of the 100 EEG segments in a subset was sampled at 173.61 Hz for 23.6 seconds (thus containing 4097 samples).
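As a quick consistency check of the figures quoted above, the segment duration and the size of the database follow directly from the sampling rate and the sample count. The constant names below are chosen for this example only.

```python
# Figures taken from the database description above
FS_HZ = 173.61               # sampling rate
SAMPLES_PER_SEGMENT = 4097   # samples in one EEG segment
SEGMENTS_PER_SUBSET = 100
SUBSETS = ["A", "B", "C", "D", "E"]

# Duration of one segment in seconds
segment_seconds = SAMPLES_PER_SEGMENT / FS_HZ
# Total number of segments and total recording time in minutes
total_segments = SEGMENTS_PER_SUBSET * len(SUBSETS)
total_minutes = total_segments * segment_seconds / 60

print(f"{segment_seconds:.1f} s per segment, {total_segments} segments")
```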

Sparse Representation-Based Classification
In the following, the mathematical background of the SRC algorithm is introduced. The main idea in SRC is to obtain a sparsifying transform for each class using the training data set and then classify the test data based on the residual reconstruction error of the test data under each of the sparsifying transforms [28]. In mathematical terms, a signal $y \in \mathbb{R}^N$ is called k-sparse if at most k out of its N samples are nonzero (this is also stated as $\|y\|_0 \le k$, where $\|y\|_0$ is the zero norm of vector y). Most natural signals, including EEG, are sparse or have a sparse representation in a specific domain (transform). Considering $D$ as the sparsifying dictionary, the sparse representation $x$ of the data signal vector y can be obtained by solving the linear system of equations $y = Dx$. Gathering the length-N data vectors of class i from S EEG recording electrodes in the columns of a single matrix $Y_i \in \mathbb{R}^{N \times S}$, the sparse representation model for the multi-electrode EEG signal can be written as follows:
$$Y_i = D_i X_i, \quad i = 1, \ldots, C \qquad (1)$$
where C is the total number of classes, $D_i$ is the sparsifying dictionary of class i, and $X_i$ is the corresponding sparse representation. Now, given the test data sample $Y = [y_1, \ldots, y_S]$, the corresponding sparse representation $\hat{X}_i$ is obtained by solving the following optimization problem with the dictionary of each class:
$$\hat{x}_{i,j} = \arg\min_{x} \|x\|_0 \quad \text{s.t.} \quad y_j = D_i x, \qquad i = 1, \ldots, C \ \text{and} \ j = 1, \ldots, S \qquad (2)$$
where $\hat{x}_{i,j}$ is the sparse representation of the j-th column of the test data matrix, i.e., $y_j$, using the sparsifying dictionary of class i, $D_i$. Finally, SRC classifies the data by comparing the residual error of the reconstructed EEG signal using the dictionaries of all classes, i.e.,
$$\hat{i} = \arg\min_{i} \|Y - D_i \hat{X}_i\|_F \qquad (3)$$
where $\hat{X}_i = [\hat{x}_{i,1}, \ldots, \hat{x}_{i,S}]$, $\|\cdot\|_F$ is the Frobenius norm and $\hat{i}$ is the estimated label of the test data. In many practical cases, however, the test data are accompanied by some bounded observation/measurement noise, and the optimization problem in (2) can be restated as follows to account for the noise component:
$$\hat{x}_{i,j} = \arg\min_{x} \|x\|_0 \quad \text{s.t.} \quad \|y_j - D_i x\|_2 \le \epsilon \qquad (4)$$
where $\epsilon$ is a small positive number that corresponds to the noise energy.
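The residual-based decision rule of SRC can be sketched as follows for a single-channel test vector (S = 1, as in the Bonn database). This is a minimal illustration that assumes the class dictionaries are already available and uses OMP with a fixed sparsity level k as the sparse coder; the names `omp` and `src_classify` are hypothetical and not taken from the paper.

```python
import numpy as np

def omp(D, y, k):
    """Orthogonal Matching Pursuit: approximate y with at most k atoms of D."""
    residual = y.copy()
    support = []
    coef = np.zeros(0)
    for _ in range(k):
        # Select the atom most correlated with the current residual
        j = int(np.argmax(np.abs(D.T @ residual)))
        if j in support:
            break
        support.append(j)
        # Least-squares fit of y on the atoms selected so far
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    x = np.zeros(D.shape[1])
    x[support] = coef
    return x

def src_classify(dictionaries, y, k):
    """Return the index of the class whose dictionary reconstructs y best."""
    residuals = []
    for D in dictionaries:
        x = omp(D, y, k)                       # sparse code under class dictionary
        residuals.append(float(np.linalg.norm(y - D @ x)))
    return int(np.argmin(residuals)), residuals

# Toy usage: two classes spanning disjoint coordinate subspaces
I = np.eye(8)
label, residuals = src_classify([I[:, :4], I[:, 4:]], 2.0 * I[:, 5] - I[:, 7], k=2)
```

In the toy usage the test vector lies entirely in the span of the second dictionary, so its residual is essentially zero and the second class is selected.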

The proposed method via dictionary learning and sparse representation-based classification
In this section, the proposed method for automatic classification of epileptic seizure states is described. The block diagram of the proposed method is shown in Figure 2. In the first phase, the recorded signals are divided into training and test subsets (data collection). In the second phase, the dictionary matrices of the different classes are updated using the training data (dictionary learning). In the third phase, the sparse representation of the test data is obtained using the dictionary matrices from the dictionary learning phase, and the test data are then reconstructed (reconstruction phase). Finally, in the fourth phase, automatic identification of epileptic seizures is performed based on the difference between the original signals and the signals reconstructed in the third phase (classification phase). In the following subsections, the online dictionary learning algorithm is discussed first, followed by the proposed classification procedure and its parameters.

Correlation Based Weighted Least Squares Update of Dictionary (CBWLSU)
In general, a dictionary refers to a set of atoms (the columns of the dictionary matrix) that can be used to represent underlying data as a linear combination of the atoms. In batch learning, the whole training data set is used at once to obtain the atoms of the sparsifying dictionary. This approach often has a high computational burden, whereas sequential methods, in which the training data is used one sample at a time, have a relatively lower computational burden. In online dictionary learning (a kind of sequential learning), starting from an initial guess for the dictionary, the atoms are updated recursively as new training data becomes available. In [29], a new online dictionary learning algorithm, namely correlation-based weighted least squares update (CBWLSU), is proposed to update the atoms of the dictionary one by one based on their correlation with the new training data. This method has two major advantages. First, it significantly reduces the computational burden of matrix inversion by reducing the dimension of the matrix that has to be inverted. Second, it avoids updating atoms unnecessarily. Algorithm 1 summarizes CBWLSU dictionary learning.
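The idea of updating only the atoms that are correlated with a new training sample can be illustrated with the simplified sketch below. It is not the CBWLSU algorithm of [29]: the weighted least-squares update is replaced here by a plain normalized gradient step, and the function name `selective_online_update`, the top-k correlation rule, and the step size `mu` are assumptions made purely for illustration.

```python
import numpy as np

def selective_online_update(D, y, k, mu=0.5):
    """Update only the k atoms of D most correlated with the new sample y.

    Simplified stand-in for a correlation-based online update; all other
    atoms are left untouched.
    """
    # Correlation of every (unit-norm) atom with the new sample
    corr = np.abs(D.T @ y)
    support = np.argsort(corr)[-k:]
    # Least-squares coefficients restricted to the selected atoms
    coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
    residual = y - D[:, support] @ coef
    D_new = D.copy()
    energy = float(coef @ coef)
    if energy > 0:
        # Normalized gradient step on the selected atoms only
        D_new[:, support] += (mu / energy) * np.outer(residual, coef)
        # Keep the updated atoms at unit norm
        D_new[:, support] /= np.linalg.norm(D_new[:, support], axis=0)
    return D_new, support
```

Because only k columns are touched per sample, the cost of each update depends on k rather than on the total number of atoms, which is the intuition behind the computational saving described above.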

SRC using CBWLSU dictionary learning
First, for the collected signals of the C epileptic seizure states, the over-complete dictionary learned from the training samples of the i-th state using the CBWLSU algorithm is denoted as $D_i$. Then, the sparse representation of a test data vector y (of unknown label) is obtained using each of the C learned dictionaries, leading to the corresponding sparse representations $\hat{x}_1, \ldots, \hat{x}_C$. The reconstruction error for the test data using the sparsifying dictionary of the i-th state, i.e., $D_i$, can be calculated as:
$$e_i = \|y - D_i \hat{x}_i\|_2 \qquad (5)$$
Finally, the data is assigned a label, $j^*$, based on the solution of the following optimization problem:
$$j^* = \arg\min_{i} e_i \qquad (6)$$
This procedure is depicted in Figure 3. The parameters of the proposed method are determined by trial and error. Since the length of each segment is taken to be equal to the length of the sample data (4097 samples), the dimensions of the sparsifying dictionary are set to 4097×6000. In the training and testing processes, 90% of the data is randomly selected for training and the remaining 10% for testing, and 10-fold cross-validation is used to evaluate the classifier. The sparsity parameter k is empirically set to 10 for both the learning and classification procedures.
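The 90%/10% split with 10-fold cross-validation described above can be sketched with the standard library alone. The helper name `ten_fold_splits` and the fixed random seed are assumptions of this example; the paper does not specify how the folds were drawn.

```python
import random

def ten_fold_splits(n_samples, n_folds=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into n_folds disjoint folds
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for i in range(n_folds):
        test_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx

# With the 100 segments of one subset, every fold uses 90 segments for
# training and 10 for testing.
for train_idx, test_idx in ten_fold_splits(100):
    pass  # train the class dictionary on train_idx, evaluate on test_idx
```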

Simulation results
The simulation results of the proposed method are presented in this section. The simulations are conducted on a PC with 8 GB of RAM and a 1.6 GHz Core i5 CPU. To assess the classification performance of the proposed algorithm in scenarios of different complexity and clinical relevance, nine scenarios (namely cases I to IX in Table 1) were considered based on different combinations of the five EEG subsets (A, B, C, D and E) introduced in Section 2.1. These cases consist of four 2-class, three 3-class, one 4-class and one 5-class problems, constituting a more practical and fair testbed for comparison with the existing state-of-the-art. To visually assess the reconstruction performance of the proposed algorithm, a random sample is picked from each of the subsets and the original and reconstructed signals are plotted in Figure 4, which shows that the reconstructed signals are quite consistent with the original ones. To gain more insight, the reconstructed signals of 90 samples of each subset (from the training dataset) are shown in Figure 5 at a particular time instance. Furthermore, as a quantitative measure of the reconstruction performance, the normalized reconstruction error for the segment of the signal in Figure 5 is computed and plotted in Figure 6.
Accordingly, it can be concluded that the samples could be efficiently encoded as sparse representations using the learned atoms. To make this clearer, we chose one test sample from each subset (A, B, C, D and E); the sparse representation coefficients of these five test samples based on their corresponding learned dictionaries are given in Figure 7.
In terms of the computational complexity of the dictionary learning procedure, the runtime of the proposed algorithm for training each dictionary on its training dataset is roughly 28 minutes. In other words, a total of 140 minutes was spent on training the 5 dictionaries (one per subset), while only 6 seconds were spent on classifying the entire test dataset given the trained dictionaries. To evaluate the proposed method on the 9 predefined cases, the classification performance in terms of accuracy, sensitivity and specificity is shown in Table 2. It is evident from Table 2 that, among the various clinically important cases, the maximum accuracy, sensitivity and specificity of 100 percent is obtained for 8 out of the 9 predefined cases, while the accuracy, sensitivity and specificity for the remaining case (case VIII) is still very promising. In recent years, several automatic seizure detection methods using EEG signals have been proposed.
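For reference, the three performance measures reported in Table 2 can be computed from true and predicted labels as in the sketch below. It assumes the usual definitions (sensitivity as the true positive rate, specificity as the true negative rate) with one class treated as positive against the rest; how the multi-class cases were aggregated is not stated here, so that choice is an assumption, as is the function name `binary_metrics`.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, sensitivity, specificity) for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = (tp + tn) / len(pairs)
    # Guard against classes that do not occur in y_true
    sensitivity = tp / (tp + fn) if tp + fn else float("nan")
    specificity = tn / (tn + fp) if tn + fp else float("nan")
    return accuracy, sensitivity, specificity
```

For a 2-class case such as normal versus ictal, the ictal class would typically be taken as the positive class.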
In Table 3, we compare various studies conducted on the same database to classify the different predefined cases using EEG signals. The best results are highlighted in boldface. It is clear from Table 3 that the proposed method offers the highest accuracy, sensitivity, and specificity for all 9 cases among all the compared methods. In previous studies, common methods such as the wavelet transform (WT) and EMD were used to extract the important characteristics and features of the signal, which raises the usual problems of tuning the feature selection/extraction procedure, such as choosing the type of mother wavelet and the number of decomposition levels. One of the most important advantages of the proposed method over the other methods is that feature extraction is done automatically through dictionary learning and no feature selection procedure is needed. To assess the performance of the proposed method against observation noise, white Gaussian noise at SNRs from -20 to 20 dB is added to the EEG signals as measurement noise, and the classification accuracy for all 9 cases is reported in Figure 8. As can be seen, the classification performance of the proposed method is considerably robust to measurement noise over a wide range of SNRs: the accuracy remains above 80% for SNRs from -4 to 20 dB. Despite its contributions, this work has some limitations, as do previous studies. First, notwithstanding the use of the Bonn database, a clinical validation study on a larger dataset is still necessary. Second, the training time of the proposed algorithm is relatively high, which can be addressed using graphics processing unit (GPU) systems.
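The noise experiment described above relies on adding white Gaussian noise at a prescribed SNR. A minimal sketch of that step is given below, assuming the SNR is defined as the ratio of average signal power to noise power in dB; the function name `add_awgn` and the fixed seed are assumptions of this example, and whether the noise was applied to the training data, the test data, or both is not specified here.

```python
import math
import random

def add_awgn(signal, snr_db, seed=0):
    """Return a copy of signal with white Gaussian noise at the given SNR (dB)."""
    rng = random.Random(seed)
    # Average power of the clean signal
    signal_power = sum(s * s for s in signal) / len(signal)
    # Noise power implied by SNR_dB = 10 * log10(P_signal / P_noise)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [s + rng.gauss(0.0, sigma) for s in signal]
```

At 0 dB the noise carries as much power as the signal itself, and at -20 dB the noise power is 100 times the signal power.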

Conclusion
In this paper, a new method for automatic identification of epileptic seizures is presented using SRC and dictionary learning. In the proposed method, the EEG signals are used to separate 2 to 5 classes in 9 different scenarios using the dataset recorded at the University of Bonn. We achieved 100% accuracy, sensitivity and specificity for all scenarios except case VIII, which is very promising compared with state-of-the-art seizure detection approaches. Furthermore, it is shown that the proposed method is robust to measurement noise down to an SNR of 0 dB. It is also expected that the automated system will reduce clinicians' workload in detecting subtle information hidden in large EEG data sets and thus save a lot of time in identifying seizures.

Fig 1. Sample EEG epochs belonging to the subsets; A, B, C, D, and E.

Fig 2. The block-diagram of the proposed method.
Dictionaries that are used to obtain sparse representations of signals are called sparsifying dictionaries and are divided into two categories: deterministic and training-based dictionaries. Deterministic sparsifying dictionaries, such as the FFT and DCT basis matrices, do not depend on the underlying signal, while the entries of training-based sparsifying dictionaries depend entirely on the signal to be represented. Training-based dictionaries are signal-specific and can obtain the sparsest representation of a specific signal. Dictionary learning algorithms use the training data in one of two ways: batch learning or sequential learning.

Fig 3. Block Diagram for automatic identification of epileptic seizures.

Fig 4. Original and reconstructed signals for sample no. 50 of each subset (A, B, C, D and E).

Fig 5. 90 samples of the reconstructed signals (for training dataset) at a particular time for each subset.

Fig 6. Reconstruction error for the samples of the subsets in Figure 5.

Fig 7. The sparse representation coefficients of the test samples from the five subsets.

Fig 8. Accuracy of the proposed method versus SNR in additive white Gaussian noise scenario.

Table 1. Nine different classification cases considered in this study and their description.

Table 2. Classification performance (Accuracy, Sensitivity and Specificity) for each class.

Table 3. The performance of the proposed method compared with the other methods on the Bonn EEG database.