Hybrid Machine Learning Scheme for Classification of BECTS and TLE Patients Using EEG Brain Signals

Approximately 50 million people have epilepsy worldwide. Prognosis may vary among patients depending on their seizure semiology, age of onset, seizure onset location, and electroencephalogram (EEG) features. Several researchers have focused on EEG patterns and demonstrated that the EEG patterns of individuals with epilepsy can be used to predict prognosis and treatment responses. However, accurate EEG analysis requires an experienced epileptologist with several years of training, and such specialists are often unavailable in small or medium-sized hospitals. In this paper, a novel machine learning (ML) model that accurately distinguishes Benign Epilepsy with Centrotemporal Spikes (BECTS) from Temporal Lobe Epilepsy (TLE) is proposed. BECTS and TLE show different seizure types and ages of onset, but differential diagnosis can be challenging due to the similar location and patterns of the EEG spikes. The proposed hybrid machine learning (HML) model performs the diagnosis in three steps: (1) creating feature matrices using statistical indexes after signal decomposition, (2) selecting features using a Support Vector Machine (SVM), and (3) classifying the results through ensemble learning based on decision trees. Simulation was performed using real patient data consisting of 112 BECTS and 112 TLE EEG signals, where 80% of the data was used for training and the remaining 20% was used for performance analysis against the labels assigned by medical doctors. The performance of the hybrid classification model is compared with other representative ML algorithms, which include logistic regression, KNN, SVM, and decision tree based ensemble learning. The proposed model shows an accuracy exceeding 99%, which is higher than the performance obtained from the other ML classification models.
The purpose of this study is to introduce a novel EEG diagnostic system that shows maximum efficiency to support clinical real-time diagnosis that can accurately distinguish epilepsy types. Future research will focus on expanding this ML model to categorize other types of epilepsies beyond BECTS and TLE and implement the HML diagnostic blockchain database into the hospital system.


I. INTRODUCTION
Epilepsy is the fourth most common neurological disease in the world. Incidence rates for epilepsy are typically between 30 and 80 per 100,000 population per year in developed countries, but rates are higher in countries with limited resources [1]. Mortality and morbidity vary among different types of epilepsies. Recognition of epilepsy types is very important, as it is the basis for predicting the prognosis and determining the treatment method to be applied to the epilepsy patient. This is why identifying the causes, types, and mechanisms of epilepsy has been a major topic of medical research for a very long time [2].
The associate editor coordinating the review of this manuscript and approving it for publication was Gang Wang.
The traditional method of EEG analysis is based on a skilled clinician visually examining the EEG signals. To evaluate seizures, reviewing long-term video EEGs of several days may be required, which is a very time-consuming and expert-centered activity. Over several years, many attempts have been made towards automating EEG analysis, but only a few research efforts have proposed a method to classify epilepsy, and these have had limited success [3]-[5]. The main reason for this difficulty is that EEG signals are patient-specific, non-stationary, nonlinear, and highly complex [6]-[8]. It is even more difficult to conduct accurate classification and diagnosis in real-time, which is why no available medical system has been able to do this so far.
In this paper, a novel hybrid machine learning (HML) EEG analysis technology was developed. The results show a significantly improved efficiency compared to existing state-of-the-art schemes, which makes the technology more useful for clinical real-time EEG diagnosis of epilepsy types, in particular for distinguishing Temporal Lobe Epilepsy (TLE) from Benign Epilepsy with Centrotemporal Spikes (BECTS), which is also called Benign Rolandic Epilepsy (BRE) or Self-limited Epilepsy with Centrotemporal Spikes. BECTS is a common childhood epilepsy syndrome that affects children from age 3 to 13. Seizures respond well to antiepileptic drugs and a relatively accurate prognosis can be made. Seizures usually remit by the age of 16. High amplitude spikes or sharp-and-slow wave complexes are noted in the central and temporal EEG leads. In TLE, spike-and-wave or sharp slow waves can also be recorded in the temporal EEG leads. However, in contrast to BECTS, TLE represents the most common form of drug-resistant focal epilepsy. TLE and BECTS can have similar EEG patterns, which include analogous temporal spike discharges, but have a completely different prognosis. Patients with BECTS grow out of their seizures, while patients with TLE often develop drug-resistant epilepsy and undergo surgery. Thus, it is important to differentiate these two conditions, but diagnosis can be challenging, especially when spikes are seen in only one hemisphere of the brain. Based on this need, the purpose of this research was to create a machine learning (ML) system that can accurately classify the epilepsy type of a patient in real-time and in a clinically applicable form.
Several researchers have applied machine learning classification to EEG, but mostly focused on EEGs recorded during seizures and other tasks [9]-[11]. However, in clinical practice, only a few EEGs capture seizures, and background EEG is used most of the time. Data on ML classification of EEG without seizures are limited. The importance of classifying both background and seizure related EEG changes has been emphasized in [12], where the researchers proposed a method of categorizing interictal EEG features in children by considering the organization of the background activity and the morphology and topography of epileptiform discharges. This paper proposes a classification system that clearly distinguishes BECTS and TLE in real-time with an accuracy exceeding 99% using a novel hybrid ML technique. The developed HML system operates as follows. After generating sub-bands using Empirical Mode Decomposition (EMD) signal decomposition technology, a feature matrix is created using statistical indexes. Based on the decomposed statistics and indexes generated, patients are classified into BECTS and TLE by the HML system. The proposed HML system prepares the diagnosis based on the following three steps.
1) Create feature matrices using the statistical index after EMD signal decomposition of the EEG signals.
2) Conduct feature selection using Support Vector Machine (SVM) ML technology.
3) Classify the results into BECTS and TLE using decision tree based ensemble ML technology.
In the developed system, classified patient data is transmitted to establish the database. Since EEG data is personal medical data, patient information is encrypted and saved in a blockchain database, which needs to be a certified privacy preservation data security system. Encrypted blockchain technology was used to solve the problem of medical data leaking of BECTS and TLE patient information, and shard-based parallelism was used to increase the data throughput [13]-[15].
As the amount of data available to the classification learning models increases, the accuracy of the HML classification of epilepsy patient groups can be further improved. Fig. 1 shows the process in which each patient is classified using the European Data Format (EDF) files created by the EEG recording HML system, after which the data (including patient information and seizure type labels) is stored on the blockchain.
The remainder of this paper is organized as follows. Section two introduces existing studies related to EEG analysis. Section three describes the data collection method and signal decomposition techniques applied. The fourth section introduces the process of feature extraction using statistical indicators. In the fifth section, a description of the HML structure using SVM and decision tree ensemble ML techniques is provided. In section six, patient EEG signal data is classified into BECTS and TLE types using the proposed HML model, and a performance comparison with other representative ML schemes is provided together with a description of the performance differences, followed by the conclusion of this paper.

II. RELATED WORKS
As epilepsy is a complex group of diseases, accurate classification is challenging [16]. Seizure types are very important to distinguish, but higher level diagnosis of epilepsy syndrome can be made only when other features including age of seizure onset, family history, and EEG findings are available. EEG diagnosis is based on electrical signals used to monitor the brain activity in order to sense action potentials in nearby neurons. The importance of EEG has been emphasized by a recent publication by the International League Against Epilepsy (ILAE) Neurophysiology Task Force [17].
VOLUME 8, 2020
There are several papers that focus on epileptic seizure classification using EEG data. In [18] and [19], complete ensemble EMD with adaptive noise (CEEMDAN) was applied to the EEG data. In [18], EEG classification was performed by modeling the CEEMDAN mode functions using normal inverse Gaussian (NIG) parameters and the Adaboost algorithm, whereas in [19], spectral features were extracted from the CEEMDAN mode functions and the EEG data was classified using linear programming boosting. In [20], a classification model was created using spectral analysis methods and various ML algorithms, where a multiscale principal component analysis (MSPCA) de-noising method was used to improve the performance. In [21], an SVM model that uses a genetic algorithm and particle swarm optimization is used to conduct EEG classification analysis. In [22], EEG data was classified using EMD and a multilayer perceptron neural network (MLPNN) classifier. In [23], EMD, discrete wavelet transform (DWT), and wavelet packet decomposition (WPD), which are effective signal decomposition techniques, were applied to EEG data. In addition, new classification models were proposed using various ML algorithms, such as SVM, KNN, and multilayer perceptron (MLP).
In previous studies, algorithms were developed to identify and classify the characteristics of ictal and interictal sections of the EEG signals. In this paper, an algorithm was developed to classify two epilepsy disease types with different characteristics. To specify the difference between BECTS and TLE, two channels of EEG data were selected to perform multi-channel analysis. Previous studies have compared the performance using the existing ML techniques individually rather than using a combination of ML algorithms. In this paper, SVM was used to create a new feature matrix that has a major influence on the classification task among the features extracted from the two EEG channels. Then, EEG data classification was conducted using random forest, which is an ensemble learning algorithm that applies decision tree technology.

III. EEG DATA SEGMENTATION

A. DATA SELECTION
A screening process for patient real-time diagnosis was conducted at the Severance Children's Hospital (located in Yonsei University, Seoul, South Korea). Patients undergoing EEG tests through inpatient and outpatient treatment in the Department of Pediatric Neurology were reviewed, and this study was approved by the institutional review board (IRB no. 4-2020-0197). Through this process, 112 routine EEG signal data sets were collected for the interictal state of each of the BECTS and TLE patient groups. Among the electrode positions of multi-channel EEG based on the standard international 10-20 system, the T3-C3 and T4-C4 channels were used to compare the symmetry properties between BECTS and TLE patients, as shown in Fig. 1. The sampling rate was 100 Hz and the length of each recording was unified to 30 seconds.

B. SIGNAL DECOMPOSITION METHOD
The input EEG data is a combination of different signals in different frequency domains. Before feature extraction is conducted on the signal, a decomposition process that divides signals into several sub-band signals was conducted. The EMD method was used among several methods, and the algorithm presented in Fig. 2 was used to generate the Intrinsic Mode Functions (IMFs) from the EMD results [24]. The EMD method was developed based on the assumption that any non-stationary and non-linear time series consists of different simple intrinsic modes of oscillation [25].
First, the local maxima and minima of each epoch of the EEG input signal x(t) were obtained. Then the median value (which is the center value when data samples are sorted in order of size) was obtained from the median function m(t). Next, h(t) was obtained by subtracting m(t) from the input signal. Thereafter, a process of discriminating whether the function h(t) can be classified as an IMF is performed. The process of distinguishing the IMF is based on determining if the following two conditions are satisfied [25].
1) An IMF has only one extremum between two subsequent zero crossings (i.e., the number of local minima and maxima differs at most by one).
2) An IMF has a mean value of zero.
Note that when h(t) does not satisfy these conditions, h(t) is treated as the input signal and is processed from the first step again. If both conditions are satisfied, one IMF is generated, and the above process is repeated on the residual signal until it becomes a monotonic function. The proposed HML scheme uses multiple IMFs to analyze the tendency of EEG in patients, where IMF1 represents the gamma band neuronal oscillation (>30 Hz), IMF2 represents the beta band oscillation (13−30 Hz), IMF3 reflects the alpha band oscillation (8−13 Hz), IMF4 reflects the theta band oscillation (3.5−8 Hz), and IMFs 5 and 6 represent the delta band oscillation (0.5−3.5 Hz) [26]. Through the EMD process, IMFs were generated from the raw EEG data. Fig. 3 shows IMF1-IMF5 of the EEG data of the BECTS and TLE patients, which are the first five IMFs of the T4-C4 channel data segments.
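The sifting loop described above can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in the study: it follows the standard formulation of EMD, in which m(t) is the mean of the cubic-spline upper and lower envelopes, and the function names (`sift_once`, `emd`), the end-point handling, the stopping tolerance `tol`, and the synthetic test signal are all assumptions introduced here.

```python
import numpy as np
from scipy.interpolate import CubicSpline
from scipy.signal import argrelextrema


def sift_once(h):
    """One sifting pass: subtract the mean of the upper and lower envelopes."""
    n = len(h)
    t = np.arange(n)
    maxima = argrelextrema(h, np.greater)[0]
    minima = argrelextrema(h, np.less)[0]
    if len(maxima) < 2 or len(minima) < 2:
        return None  # too few extrema to build envelopes
    # Assumption: pin both envelopes to the end samples to limit edge swings.
    upper = CubicSpline(np.r_[0, maxima, n - 1], np.r_[h[0], h[maxima], h[-1]])(t)
    lower = CubicSpline(np.r_[0, minima, n - 1], np.r_[h[0], h[minima], h[-1]])(t)
    return h - (upper + lower) / 2.0


def emd(x, max_imfs=5, max_sift=50, tol=0.05):
    """Extract up to max_imfs IMFs; returns (imfs, residual)."""
    residual = np.asarray(x, dtype=float).copy()
    imfs = []
    while len(imfs) < max_imfs:
        h = sift_once(residual)
        if h is None:  # residual has (almost) no oscillation left: stop
            break
        for _ in range(max_sift - 1):
            h_next = sift_once(h)
            if h_next is None:
                break
            # Assumed stopping rule: relative change between successive sifts.
            change = np.sum((h - h_next) ** 2) / (np.sum(h ** 2) + 1e-12)
            h = h_next
            if change < tol:
                break
        imfs.append(h)
        residual = residual - h
    return np.array(imfs), residual


# Synthetic 30 s segment at 100 Hz (stand-in for one EEG channel).
fs = 100
t = np.arange(30 * fs) / fs
rng = np.random.default_rng(0)
x = np.sin(2 * np.pi * 10 * t) + 0.5 * np.sin(2 * np.pi * 2 * t)
x = x + 0.1 * rng.standard_normal(t.size)
imfs, residual = emd(x)
print(imfs.shape)
```

By construction the IMFs and the residual sum back to the input signal, which gives a quick sanity check for any EMD implementation.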

IV. FEATURE SELECTION
EEG sub-band signals were generated by the EEG data decomposition process using the EMD method. The following four indicators were used for statistical analysis of the decomposed signals. Figs. 4 and 5 show the boxplot of four feature indices of the T3-C3 and T4-C4 channels among the decomposed signals of BECTS and TLE patients.

A. COEFFICIENT OF VARIATION
The coefficient of variation was used as a quantitative index of the relative variability of the EEG data signal [27]. The coefficient of variation represents the ratio of the standard deviation to the mean of the decomposed IMFs, which can be expressed as
$$ CV = \frac{\sigma}{\mu} \tag{1} $$
where σ and µ represent the standard deviation and mean of the IMF, respectively.

B. FLUCTUATION INDEX
The fluctuation index is used as an indicator to measure the intensity of the EEG signal [28]. The fluctuation index is the sum of the absolute differences between consecutive epochs of the IMF divided by the total signal length, which can be obtained from the following
$$ FI = \frac{1}{N} \sum_{i=1}^{N-1} \left| IMF(i+1) - IMF(i) \right| \tag{2} $$
where N denotes the total number of epochs in the IMF. As shown in Figs. 4 and 5, the fluctuation index of BECTS patients, who show unstable EEG signal characteristics at the central-temporal side, is slightly higher than that of the TLE patients in most IMFs.

C. SKEWNESS
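Skewness is the third statistical moment, which indicates the asymmetry of the EEG signal distribution [29]. If the mean is greater than the median, the value of skewness is typically positive. In the opposite case, it becomes negative. Skewness can be expressed as follows.
$$ Skewness = \frac{1}{N} \sum_{i=1}^{N} \frac{\left( IMF(i) - \mu \right)^3}{\sigma^3} \tag{3} $$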

D. KURTOSIS
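Kurtosis is the fourth statistical moment, which indicates the sharpness of the EEG signal distribution. The sharper the peak of the distribution, the higher the kurtosis value, which can be expressed as follows [30].
$$ Kurtosis = \frac{1}{N} \sum_{i=1}^{N} \frac{\left( IMF(i) - \mu \right)^4}{\sigma^4} \tag{4} $$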
As can be seen from Fig. 5, the average distribution of the kurtosis value is also higher in BECTS patients since the temporal lobe has a relatively unstable signal compared to TLE patients in most IMF functions.
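The four indices of this section can be computed directly with NumPy. The sketch below is illustrative only: the helper names `imf_features` and `feature_matrix` are hypothetical, the population standard deviation is assumed, and the fluctuation index is assumed to use absolute differences between consecutive samples. Because an ideal IMF has a mean close to zero, the coefficient of variation can become very large in practice, so it may need careful handling on real data.

```python
import numpy as np


def imf_features(imf):
    """Return [CV, fluctuation index, skewness, kurtosis] for one IMF."""
    imf = np.asarray(imf, dtype=float)
    n = imf.size
    mu = imf.mean()
    sigma = imf.std()  # assumption: population standard deviation
    cv = sigma / mu  # unstable when the mean is near zero
    fi = np.abs(np.diff(imf)).sum() / n
    skewness = np.mean((imf - mu) ** 3) / sigma ** 3
    kurtosis = np.mean((imf - mu) ** 4) / sigma ** 4
    return np.array([cv, fi, skewness, kurtosis])


def feature_matrix(imfs_t3c3, imfs_t4c4):
    """Stack the features of both channels: (5 + 5) IMFs x 4 indices."""
    rows = [imf_features(imf) for imf in list(imfs_t3c3) + list(imfs_t4c4)]
    return np.vstack(rows)


# Small hand-checkable example.
demo = imf_features([1, 2, 3, 4, 5])
print(demo)

# Random stand-ins for five IMFs per channel (30 s at 100 Hz).
rng = np.random.default_rng(0)
fake_t3c3 = rng.standard_normal((5, 3000)) + 1.0
fake_t4c4 = rng.standard_normal((5, 3000)) + 1.0
features = feature_matrix(fake_t3c3, fake_t4c4)
print(features.shape)
```

Flattening the resulting 10 × 4 matrix gives one 40-element feature vector per recording, which is the form most scikit-learn estimators expect.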

V. EEG CLASSIFICATION

A. SUPPORT VECTOR MACHINE
SVM is an ML model that is effective in pattern recognition and data analysis. In SVM, the main goal is to find the decision boundary with the maximum margin, which is defined as the distance between the hyperplane (decision boundary) that separates the classes and the closest training samples [31].
Let the training set S with N training samples be denoted as
$$ S = \{ (x_i, y_i) \}_{i=1}^{N}, \quad x_i \in \mathbb{R}^d, \; y_i \in \{-1, +1\} \tag{5} $$
which is separated by a hyperplane with margin ρ. Then for each training example
$$ y_i \left( w^T x_i + b \right) \geq 1 \tag{6} $$
where w is the normal vector of the hyperplane and b is the bias. Then the margin can be expressed as $\rho = 2 / \|w\|$, and the SVM problem of maximizing the margin subject to the hyperplane condition can be expressed as the following optimization problem.
$$ \arg\min_{w, b} \frac{1}{2} \|w\|^2 \quad \text{subject to} \quad y_i \left( w^T x_i + b \right) \geq 1, \; i = 1, \ldots, N \tag{7} $$
Equation (7) is expressed as an unconstrained problem using the Lagrange multipliers $\alpha_i \geq 0$. Next, the optimal support vectors can be found from the dual problem according to the Karush-Kuhn-Tucker (KKT) conditions, and w can be expressed as $w = \sum_i \alpha_i y_i x_i$.
Using the designed linear SVM, a feature selection process is performed to prevent overfitting. For this purpose, one of the wrapper methods, support vector machine-recursive feature elimination (SVM-RFE), is used [32]. In [33], SVM-RFE was used on EEG signals to detect the scalp spectral dynamics of interest. SVM-RFE is effective in distinguishing the importance of features, which is determined using the corresponding value $w_i$ in the weight vector. If the value $w_i^2$ approaches 0, the corresponding i-th feature has almost no effect on the performance. Therefore, the features are sorted in the order in which they affect classification according to the value of $w_i$. After removing the feature that does not affect the output, the step is repeated until the maximum accuracy is reached with the remaining features. Consequently, a new feature matrix $S^*$ is created that guarantees maximum accuracy. The detailed algorithm for SVM-RFE is shown in Fig. 6.
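The elimination loop of SVM-RFE can be written out explicitly to show how the squared weights drive the ranking. The following is a minimal sketch under stated assumptions, not the code used in the study: `svm_rfe_ranking` is a hypothetical helper, the data set is synthetic (generated with scikit-learn's `make_classification`), and one feature is removed per iteration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def svm_rfe_ranking(X, y):
    """Rank feature indices from most to least important using w_i^2."""
    remaining = list(range(X.shape[1]))
    eliminated = []
    while remaining:
        svm = SVC(kernel="linear").fit(X[:, remaining], y)
        w_squared = svm.coef_.ravel() ** 2
        worst = int(np.argmin(w_squared))  # smallest w_i^2 matters least
        eliminated.append(remaining.pop(worst))
    return eliminated[::-1]  # last feature removed is ranked first


# Synthetic stand-in for the feature matrix (224 recordings, 12 features).
X, y = make_classification(
    n_samples=224, n_features=12, n_informative=4, n_redundant=0, random_state=0
)
X = StandardScaler().fit_transform(X)
ranking = svm_rfe_ranking(X, y)
print(ranking)
```

The ranking alone does not decide how many features to keep. A cross-validated accuracy curve over the top-k features, as provided by scikit-learn's `RFECV`, is one common way to choose the cut-off.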

B. DECISION TREE
A decision tree is an analysis technique that classifies the entire data into several subgroups by representing decision rules in a tree structure, where the split variable and region are determined using the p-value of the Chi-square statistic, the Gini index, or the entropy index [34]. In this paper, entropy, which is a measure of impurity, is used as the classification criterion. Higher values of entropy indicate a higher level of impurity. Entropy can be represented by the following formula
$$ \mathrm{Entropy}(S^*) = -P_{BECTS} \log_2 P_{BECTS} - P_{TLE} \log_2 P_{TLE} \tag{8} $$
where $S^*$ is the newly defined collection of training examples, and $P_{BECTS}$ and $P_{TLE}$ denote the proportion of data belonging to the BECTS and TLE patient groups, respectively. After the separation process in the decision tree, the parameter called information gain is used to determine how much improvement has been gained compared to before the separation, which is presented in the following formula
$$ \mathrm{Gain}(S^*, I) = \mathrm{Entropy}(S^*) - \sum_{v \in values(I)} \frac{|S^*_v|}{|S^*|} \, \mathrm{Entropy}(S^*_v) \tag{9} $$
where values(I) is the set of all possible values of attribute I, and $S^*_v$ refers to the subset of $S^*$ for which attribute I has the value v. Since the features are continuous variables, the entropy is calculated based on the boundary value at which the class changes.
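A short worked example makes the entropy and information gain definitions concrete. The sketch below uses only the Python standard library; the function names and the toy feature values are hypothetical, and a continuous feature is handled by a single threshold split, which is an assumption about how the boundary value is applied.

```python
from math import log2


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    n = len(labels)
    result = 0.0
    for cls in set(labels):
        p = labels.count(cls) / n
        result -= p * log2(p)
    return result


def information_gain(values, labels, threshold):
    """Entropy reduction when splitting a continuous feature at threshold."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    weighted = sum(len(part) / n * entropy(part) for part in (left, right) if part)
    return entropy(labels) - weighted


# Toy data: one feature value per recording and its diagnosis.
values = [1, 2, 3, 4, 5, 6, 7, 8]
labels = ["BECTS"] * 4 + ["TLE"] * 4

parent_entropy = entropy(labels)                      # balanced classes
perfect_gain = information_gain(values, labels, 4.5)  # boundary between classes
partial_gain = information_gain(values, labels, 2.5)  # off-boundary split
print(parent_entropy, perfect_gain, round(partial_gain, 4))
```

The split placed exactly where the class changes removes all impurity, while the off-boundary split recovers only part of it, which is why candidate thresholds are evaluated at class boundaries.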
Decision trees can produce meaningful results even with relatively small amounts of data, and it can be used in combination with other classification methods. However, if the partitioning process continues, an overfitting problem may occur [35]. The proposed HML scheme was developed to prevent this problem from occurring, which results in improving the reliability and accuracy.

C. HYBRID MACHINE LEARNING
The proposed HML scheme was developed using EMD, SVM, and decision tree technology, where SVM was trained using the feature matrix generated with four feature indexes.
Based on the training and by removing one feature at a time from the feature matrix, the HML system records the performance of each training set to establish a ranking of the features. After all training has finished, the features with the lowest rankings are removed. By identifying the order in which the features affect the classification accuracy, a new set consisting only of features that guarantee the highest accuracy can be created. Some features are randomly selected from the newly proposed feature set. Among the selected features, the elements that properly differentiate the patient's disease group constitute the first step in the random forest. If the number of sub-trees is small, the training and test time can be shortened, but the generalization ability will degrade, making misclassification more likely. On the other hand, if the number of sub-trees increases, the training and test time will increase, but a more accurate performance will be obtained, which is a phenomenon commonly experienced in random forest related schemes. In the proposed HML scheme, based on Fig. 7, the number of decision trees was set to 100, considering that it enables the highest level of accuracy at a suitable computation time. In addition, the number of variables selected when a node is split in the sub-tree was set to the square root of the number of features in the feature matrix S*. In this way, sub-nodes are continuously created and one decision tree is completed. By repeating this process 100 times, each of the 100 decision trees judges whether the patient is more likely to have BECTS or TLE, and the final disease classification is made by combining these judgements. The HML scheme combines EMD based multi-IMF sub-band signal processing of the EEG data with feature selection through SVM, and then applies ensemble learning (EL) to perform patient group classification using multiple decision trees.
If the ML algorithm is processed without the feature selection process, some features will not have a reasonable basis suitable to be analyzed in the classification model, and therefore will not be suitable as training data. This could lead to overfitting problems with poor classification accuracy when applying the model to the actual test set. By using the proposed feature selection and tuning process techniques, it was possible to solve the overfitting problem that occurs and increase the classification accuracy. The flow chart for HML is presented in Fig. 6.
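The overall HML flow (SVM-based feature selection followed by a 100-tree entropy-based random forest with square-root feature sampling) can be approximated with a scikit-learn pipeline. This is a sketch under assumptions, not the authors' code: the data are synthetic, the features are standardized before the linear SVM, `RFECV` is used as a stand-in for the "repeat until maximum accuracy" step, and the 80/20 split uses an arbitrary random seed.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFECV
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in: 224 recordings, 40 features (10 IMFs x 4 indices).
X, y = make_classification(
    n_samples=224, n_features=40, n_informative=8, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

hml = make_pipeline(
    StandardScaler(),
    # Linear SVM ranks features by squared weight; CV picks how many to keep.
    RFECV(SVC(kernel="linear"), step=1, cv=5, scoring="accuracy"),
    # 100 entropy-based trees, sqrt(#features) candidates per split.
    RandomForestClassifier(
        n_estimators=100, criterion="entropy", max_features="sqrt", random_state=0
    ),
)
hml.fit(X_train, y_train)

y_pred = hml.predict(X_test)
accuracy = (y_pred == y_test).mean()
n_selected = hml.named_steps["rfecv"].n_features_
print(f"kept {n_selected} of 40 features, test accuracy {accuracy:.3f}")
```

Keeping the selector inside the pipeline means it is refit on the training portion only, so the held-out recordings never influence which features are kept.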

VI. SIMULATION RESULTS
In the experiments, 112 BECTS patients' data and 112 TLE patients' data were used from the Severance Children's Hospital database [36]. In our study, classifiers were implemented in Python based on the Scikit-learn library, which includes a wide range of state-of-the-art ML algorithms. Five parameters were used to evaluate the classifier performance, which were accuracy, sensitivity, specificity, F1-score, and Kappa value. In addition to the first three indicators (accuracy, sensitivity, and specificity; see Table 1), two indicators, F1-score and Kappa value, were used for performance analysis. The F1-score represents the harmonic mean of the positive predictive value (PPV) and true positive rate (TPR), where PPV can be expressed as TP/(TP+FP) and TPR as TP/(TP+FN). The Kappa value, which is commonly used in evaluating EEG classification performance [37], is obtained by removing the accuracy of random classification (rand) from the observed accuracy. The two indicators are presented in (13) and (14).
$$ F1 = \frac{2 \cdot PPV \cdot TPR}{PPV + TPR} \tag{13} $$
$$ Kappa = \frac{accuracy - rand}{1 - rand} \tag{14} $$
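The five indicators can be computed from the four confusion-matrix counts alone. The sketch below is illustrative: the function name is hypothetical, the chance-agreement term follows Cohen's kappa (an assumption about how rand is obtained; it equals 0.5 for two balanced classes), and the example counts are hypothetical values chosen only to be consistent with a 112/112 split, not figures taken from the paper's tables.

```python
def classification_metrics(tp, fn, fp, tn):
    """Accuracy, sensitivity, specificity, F1-score and kappa from counts."""
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)  # true positive rate (TPR)
    specificity = tn / (tn + fp)
    ppv = tp / (tp + fp)          # positive predictive value
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    # Chance agreement between predicted and actual labels (Cohen's kappa).
    rand = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / total ** 2
    kappa = (accuracy - rand) / (1 - rand)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
        "kappa": kappa,
    }


# Hypothetical counts with BECTS as the positive class:
# all 112 BECTS correct, 2 of 112 TLE recordings misclassified.
metrics = classification_metrics(tp=112, fn=0, fp=2, tn=110)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

For a balanced two-class problem such as this one, kappa reduces to 2 × accuracy − 1, which is a convenient consistency check between reported accuracy and kappa values.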
In addition, features of the BECTS and TLE patient groups were extracted using the EMD algorithm and statistical indicators. In the experiments, to find the difference between the two epilepsy groups, all statistical features were tested using a t-test. The t and p values are given in Table 2. In some features, the value of p was 0.05 or less, which confirmed that there was a significant difference between the two groups.
In the first performance analysis, the HML algorithm is compared with simulation results based on Principal Component Analysis (PCA), which is a representative dimensionality reduction algorithm. In the simulation, the T3-C3 and T4-C4 channel data were decomposed into five IMFs via EMD. Four feature indexes were generated for each IMF. Therefore, a feature matrix of size 10 × 4 per recording was created for the ML process. Then, in the feature reduction process, the PCA and SVM-RFE algorithms were used respectively. In the last stage, classification of BECTS and TLE patients was performed through ensemble ML technology using 100 decision trees, and the performance of each scheme was compared. The k-fold cross-validation was applied to verify the performance of this simulation. In the k-fold cross-validation process, the entire data set was divided into k portions, of which k-1 subsets were used as training sets and the remaining subset was used as a test set. In this way, learning was conducted a total of k times, and the results of the k learning processes were averaged. In this process, the value of k was set to 5. Fig. 8 shows a comparison of the PCA and SVM-RFE performance based on the classification accuracy and computation time. As a result of the 5-fold cross-validation, the classification accuracy with SVM feature reduction was 100%, 100%, 100%, 96%, and 100%, which was similar to the classification result using PCA, which showed a classification accuracy of 98.2%, 100%, 96%, 100%, and 98.2%. However, the time consumed when using SVM-RFE was 6.496, 5.152, 4.48, 6.72, and 5.152 seconds, whereas the PCA scheme required a processing time of 27.104, 34.496, 13.44, 31.808, and 33.6 seconds. Overall, the SVM-RFE scheme's average processing time (5.6 seconds) is much shorter than the PCA scheme's average processing time (28.0896 seconds).
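The comparison protocol described above (two feature reduction front-ends, the same 100-tree ensemble, 5-fold cross-validation, accuracy and time per fold) can be reproduced in outline with scikit-learn's `cross_validate`. This is a sketch on synthetic data: the number of PCA components and the number of features kept by RFE (both 10) are hypothetical choices, stratified shuffled folds are assumed, and the timings it prints depend on the machine and are not comparable to the figures reported in the paper.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in: 224 recordings, 40 features (10 IMFs x 4 indices).
X, y = make_classification(
    n_samples=224, n_features=40, n_informative=8, random_state=0
)


def build(reducer):
    """Same scaler and 100-tree ensemble; only the reducer changes."""
    forest = RandomForestClassifier(
        n_estimators=100, criterion="entropy", random_state=0
    )
    return make_pipeline(StandardScaler(), reducer, forest)


reducers = {
    "PCA": PCA(n_components=10),  # hypothetical size
    "SVM-RFE": RFE(SVC(kernel="linear"), n_features_to_select=10),
}
folds = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

results = {}
for name, reducer in reducers.items():
    out = cross_validate(build(reducer), X, y, cv=folds, scoring="accuracy")
    results[name] = {"accuracy": out["test_score"], "fit_time": out["fit_time"]}
    print(
        f"{name}: mean accuracy {out['test_score'].mean():.3f}, "
        f"mean fit time {out['fit_time'].mean():.3f} s"
    )
```

Reusing the same fold object for both pipelines keeps the comparison paired, so per-fold differences reflect the reducer rather than the split.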
Considering the slightly higher accuracy and the significantly faster processing time of SVM-RFE over the PCA scheme, the HML algorithm adopted SVM-RFE as the feature reduction pre-processing step for ensemble learning.
In the second simulation, before using various ML algorithms, pre-processing using the EMD algorithm and statistical indexes was performed in the same way. Then, the existing ML algorithms (which are logistic regression, K-nearest neighbor (KNN), decision tree, SVM, and ensemble learning (EL)) were compared with the HML algorithm proposed in this paper. In each ML algorithm, the main hyperparameters used in the simulation were set as follows: iteration was set to 80 and solver was set to 'liblinear' in logistic regression, number of neighbors was set to 3 in KNN, criterion was set to 'entropy' in decision tree, kernel was set to 'linear' in SVM, and number of trees was set to 100 in EL. When each ML algorithm was applied, the parameters that resulted in the highest accuracy and lowest calculation time were used in the classification process, and these results were compared with the HML algorithm's performance. Table 3 shows the average results obtained through the most representative ML technologies compared to the proposed HML scheme. Logistic regression, KNN, decision tree, SVM, and EL resulted in an accuracy of 78.13%, 82.59%, 85.71%, 87.5%, and 93.75%, respectively, and the HML algorithm resulted in a 99.11% accuracy, which exceeds the other ML algorithms. The F1-score and Kappa values of HML were 0.99 and 0.98, respectively, and the existing ML algorithms logistic regression, KNN, decision tree, SVM, and EL resulted in F1-scores of 0.78, 0.82, 0.85, 0.87, and 0.93 and Kappa values of 0.57, 0.65, 0.72, 0.75, and 0.92, respectively.
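The baseline configuration listed above maps naturally onto scikit-learn estimators. The sketch below is an interpretation rather than the authors' script: "iteration" is assumed to correspond to `max_iter`, EL is assumed to be a random forest, the features are standardized, the data are synthetic, and the accuracies it prints therefore have no relation to Table 3.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in: 224 recordings, 40 features (10 IMFs x 4 indices).
X, y = make_classification(
    n_samples=224, n_features=40, n_informative=8, random_state=0
)

# Hyperparameters as stated in the text; everything else is left at defaults.
models = {
    "Logistic regression": LogisticRegression(solver="liblinear", max_iter=80),
    "KNN": KNeighborsClassifier(n_neighbors=3),
    "Decision tree": DecisionTreeClassifier(criterion="entropy", random_state=0),
    "SVM": SVC(kernel="linear"),
    "EL": RandomForestClassifier(n_estimators=100, random_state=0),
}

mean_accuracy = {}
for name, model in models.items():
    scores = cross_val_score(
        make_pipeline(StandardScaler(), model), X, y, cv=5, scoring="accuracy"
    )
    mean_accuracy[name] = scores.mean()
    print(f"{name}: {scores.mean():.3f}")
```

Running every baseline through the same folds and the same pre-processing isolates the effect of the classifier itself, which is the comparison the second simulation is designed to make.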
When the feature reduction process using SVM-RFE was combined with each ML algorithm, the classification accuracy of logistic regression and the KNN algorithm decreased to 63.33% and 81.67%, respectively. In the case of the decision tree, the classification accuracy increased to 91.33%. However, the HML scheme's performance was confirmed to be superior compared to the other ML schemes. In addition, when HML was applied, all BECTS patients were properly classified without any errors, and TLE patients were properly classified with a 98% accuracy.

VII. CONCLUSION
Epilepsy can be classified into various types, and this analysis requires experienced epileptologists and neurologists who have received several years of medical training. In this paper, a novel technique for classifying BECTS and TLE patients using the proposed HML scheme is presented. In HML, the T3-C3 and T4-C4 channel data from the 10-20 EEG system was used, and sub-band signals were derived using the EMD method. By calculating the four statistical indices (i.e., coefficient of variation, fluctuation index, skewness, and kurtosis) of the EMD sub-band signals, the feature matrix was created to be used as input data to the ML classification system. In order to solve the overfitting problem that may occur during the HML process, SVM-RFE was used to remove some of the features that have little effect on the HML classification accuracy, thereby creating a new feature matrix. Using the newly created input data, the classification of the BECTS and TLE patient groups was conducted using 100 decision trees in an ensemble ML selection process.
The HML algorithm presented in this paper shows a high classification accuracy using limited patient data, and requires significantly less processing time compared to the existing analysis methods. As soon as the patient data is collected, the classification process can be performed in real-time, which can reduce the consumed time of the overall epilepsy diagnosis process. In addition, the proposed epilepsy classification HML algorithm can be combined with blockchain database technology to further enhance patient data protection and EEG signal sets storage expansion capabilities.
In future research, additional patient biometric information will be used along with the EEG signals to enhance the epilepsy classification accuracy. For this purpose, new multi-dimensional ML and deep learning customized techniques will be investigated.

YUJAUNG KIM received the B.S. degree in electronic engineering from Daegu University, the M.S. degree in biomedical engineering from Cleveland State University, and the Ph.D. degree in biomedical engineering from the University of Iowa. She participated at The Center for SUDEP Research. She is currently a Postdoctoral Researcher with Yonsei University health system. Her research interests include epilepsy, SUDEP, and signal processing. She has served as a member of the Society for Neuroscience and American Epilepsy Society.