Seizure Prediction Analysis of Infantile Spasms

Infantile spasms (IS) is a typical childhood epileptic disorder with generalized seizures. The sudden, frequent and complex nature of infantile spasms is a main cause of sudden death, severe comorbidities and other adverse consequences. Effective prediction is critical for infantile spasms subjects, but few related studies have been carried out. To address this, this study proposes a seizure prediction framework for infantile spasms that combines statistical analysis with a deep learning model. The analysis divides the continuous scalp electroencephalogram (sEEG) into 5 phases: Interictal, Preictal, Seizure Prediction Horizon (SPH), Seizure, and Postictal. A brain network based on the Phase-Locking Value (PLV) is constructed for 5 typical brain rhythms, and the mechanism of epileptic changes is analyzed by statistical methods. It is found that 1) the connections between the prefrontal, occipital, and central regions show large variability at each stage of seizure transition, and 2) 4 sub-bands of brain rhythms (θ, α, β, γ) are predominant. Group and individual variabilities are validated using the Resnet18 deep model on data from 25 patients with infantile spasms, and the results are consistent with the statistical analyses. The optimized model achieves an average accuracy, specificity, and recall rate of 79.78%, 94.46%, and 75.46%, respectively.
The method accomplishes the analysis of the synergy between infantile spasms mechanism, model, data and algorithm, providing a guideline to build an intelligent and systematic model for comprehensive IS seizure prediction.


I. INTRODUCTION
INFANTILE spasms (IS), also called West syndrome, is a common childhood epilepsy syndrome. The incidence rate of IS among all newborns is 0.031%. The disease is characterised by frequent, prolonged and uncontrollable seizures that often cause irreversible damage to the infant's brain [1], [2]. The physical pain and illness also place a huge burden on the patient's family. Meanwhile, the risk of death in epilepsy is 3 times higher than in other diseases. Factors contributing to death include sudden seizure death, persistent status epilepticus, and unintentional injuries. Sudden seizure death is closely associated with generalised tonic-clonic seizures, nocturnal seizures and poor seizure control. IS typically presents with generalised tonic-clonic seizures. Effective diagnosis and intervention during the golden period of treatment are therefore critical [3].
Epilepsy is a neurological disorder caused by repeated synchronous abnormal discharges of neurons or neural circuits in the brain. It can be divided into three main categories by seizure type: focal seizures, generalized seizures, and seizures that cannot be classified. IS seizures are generalized seizures. In the past, most scholars believed that seizure onset could not be predicted. In 2013, Cook et al. [4] demonstrated a closed-loop seizure advisory system based on clinically implanted invasive electroencephalography (iEEG). The results showed that a prediction system for focal epilepsy is feasible. Since then, seizure prediction based on data learning has become popular. These studies have succeeded in showing differential changes in intracranial EEG dynamics prior to focal seizures [5]. However, analysis of generalized seizures is still lacking [4].
Scalp EEG (sEEG) has been widely used in epilepsy prediction due to its ease of acquisition, low cost, and high temporal resolution [6], [7]. sEEG-based seizure prediction algorithms differ mainly in two steps, namely feature extraction and the classification of pre-ictal against interictal segments. However, predictive algorithms for IS seizures are lacking, as most studies address the associated EEG feature extraction and seizure detection [8], [9]. Smith et al. used signal amplitude and power spectral characteristics to quantify the differential changes between seizures and non-seizures in IS [10]. Nariai et al. adopted high-frequency oscillations in the interictal sEEG as an objective biomarker of IS [11]. Zheng et al. constructed multiple brain networks and analyzed the mechanisms of seizure network changes in IS across three states: pre-ictal, seizure and post-ictal [12], [13]. [15]. Yang et al. applied traditional nonlinear classifiers (e.g., SVM and random forest (RF)) to IS seizure detection using fused features of EEG and ECG [16]. It is worth noting that these studies confirm that large differential changes do exist between IS seizures and non-seizures. As compared in Table I, they all ignore transitional states and fail to quantify the dynamic changes in seizure mechanisms, which makes them inapplicable to IS seizure prediction.
In summary, the pathophysiological mechanisms of IS are still unclear and the seizure mechanisms are complex. The seizure prediction of IS remains an open problem. In this paper, we attempt to build an infantile spasms seizure prediction system based on statistical analysis and the Resnet18 deep network model using sEEGs, with the aim of further exploring the mechanism of IS seizures. The overall schematic diagram of the analysis is shown in Fig. 1. Compared with existing research, our main contributions include: 1) The continuous sEEG signals of IS (generalized seizures) are divided into five different periods (i.e., Interictal, Preictal, Seizure Prediction Horizon (SPH), Seizure, and Postictal). Combined with graph theory, an epileptic network based on the Phase-Locking Value (PLV) is constructed, and the mechanism of changes in the brain network during the transitional phases of seizures is analyzed by statistical methods. 2) The optimal feature vector combinations of the correlation matrices in different brain rhythms are selected and used to train a deep residual network model (Resnet18) for IS seizure prediction. The PLV correlation matrix is validated by the machine learning model as a pre-seizure biomarker both between groups and on individuals. The optimized model achieves an average accuracy, specificity and recall rate of 79.78%, 94.46%, and 75.46%, respectively. 3) An analytical framework for IS prediction is developed that can explain the synergy between data, models, seizure mechanisms and algorithms through statistical methods, deep network models and visualization. The proposed model can effectively achieve accurate IS seizure prediction. The study is conducted on real recorded sEEG of 25 pediatric IS patients from the Children's Hospital, Zhejiang University School of Medicine (CHZU).
The statistical analysis and the visualization of the features learned by the model on the CHZU database both reveal that 1) the associations between the prefrontal, occipital, and central regions exhibit substantial variability at each stage of the seizure, and 2) the brain rhythms of the θ, α, β, and γ bands play dominant roles in IS seizure prediction.

A. Subject Identification and sEEG Recordings
The sEEG data of 25 pediatric IS patients are adopted in the study. Table II describes the details of the participants; the ratio of male to female patients is 2:3 and the mean age is 11 months. The subject data collection and labeling are done by clinical neurophysiologists with pediatric expertise at CHZU. These subjects are diagnosed with infantile spasms (IS) and have at least one series of spastic seizures without prognostic control. All sEEG records are collected in the clinical environment with the informed consent of the patient's legal guardian. The experiments are performed in accordance with the Declaration of Helsinki and are reviewed and approved by CHZU.
Approximately 2 or 16 hours of EEG signals from IS patients are recorded using a NicoletOne EEG device with the standard 10-20 electrode placement, with the Fpz electrode as the reference channel and a sampling frequency of 1000 Hz. For the construction of the brain network, the EEGs of channels A1 and A2 are deleted, and the EEGs of the remaining 19 channels (e.g., Fp1, F3, Fz, etc.) are used for data analysis. Here, F, T, P and O represent the frontal, temporal, parietal and occipital brain regions, respectively, and z is the midline, where odd and even numbers indicate the left and right brain regions, respectively. Artifacts (e.g., EMG, EYE, etc.) are retained in the data for all subjects because they may relate to the behavioral activities of the subjects during seizures [17], [18]. The study aims to explore the transition mechanism of IS seizures using statistical and deep network models. The corresponding sEEG processing and analysis are performed on MATLAB R2019a and Python platforms. Power line interference in the EEGs is eliminated by a 50 Hz notch filter, and the frequency band of interest is extracted using a band-pass filter (1-80 Hz).
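As an illustration of the filtering step, the following Python sketch applies a 50 Hz notch filter followed by a 1-80 Hz band-pass filter to a multi-channel recording sampled at 1000 Hz. The text does not specify the filter designs, so the notch quality factor (Q = 30), the 4th-order Butterworth band-pass, the zero-phase (forward-backward) filtering, and the function name `preprocess` are assumptions made for illustration only.

```python
import numpy as np
from scipy import signal

FS = 1000  # sampling frequency in Hz, as stated in the text


def preprocess(eeg, fs=FS):
    """Remove power-line interference and keep the 1-80 Hz band.

    eeg: array of shape (n_channels, n_samples).
    """
    eeg = np.asarray(eeg, dtype=float)
    # 50 Hz notch filter; Q = 30 is an assumed quality factor.
    b_notch, a_notch = signal.iirnotch(w0=50.0, Q=30.0, fs=fs)
    out = signal.filtfilt(b_notch, a_notch, eeg, axis=-1)
    # 1-80 Hz band-pass; Butterworth design and order are assumptions.
    sos = signal.butter(4, [1.0, 80.0], btype="bandpass", fs=fs, output="sos")
    # Zero-phase filtering avoids shifting the phase used later for PLV.
    return signal.sosfiltfilt(sos, out, axis=-1)
```

Zero-phase filtering is chosen here because the later connectivity analysis depends on the instantaneous phase; a causal filter would be needed in a real-time setting.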

B. Data Segmentation
Epileptic seizures occur suddenly and are caused by abnormal, self-sustained discharges in some neurons and closed-loop networks in the brain, and the length of seizures fluctuates widely. This suddenness and individual variability make accurate seizure prediction difficult. sEEG has been widely adopted as the most powerful diagnostic and analytical tool for epilepsy [7]. EEG data from epileptic patients are generally classified into four phases: pre-ictal, seizure, post-ictal, and interictal (the periods outside the three previously mentioned states). The unpredictability of seizures is a direct threat to the life safety of epileptic patients, and how to provide effective alerts and interventions in the proximity of a seizure [19] remains an open issue. Therefore, a brief Seizure Prediction Horizon (SPH) is defined immediately before the seizure state, and the EEGs are divided into 5 states, as shown in Figure 2. Seizure predictors require that alerts occur prior to seizures, along with detection of the seizure onset duration. Based on the above classification of brain activity in epileptic patients, the seizure prediction problem can be considered as a classification task over the Interictal, Preictal, SPH, Ictal, and Postictal brain states. An alert is raised when a pre-seizure state is detected, indicating that a seizure is potentially imminent. During the SPH, effective pharmacological intervention and electrical stimulation can prevent seizures [20], which makes SPH detection very important.
For IS seizure prediction, the Interictal, Preictal, SPH, and Postictal phases are set as 30-40 minutes, 5-30 minutes, and 0-5 minutes before the seizure, and 0-30 minutes after the seizure, respectively. Meanwhile, the time of onset of significant EEG changes during a seizure is defined as the start of the seizure, and the seizure duration is labeled based on the neurologist's diagnosis. A sliding window of 4 s is used to sample the sEEG data with an overlap rate of 50% between consecutive frames.
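The segmentation rule can be sketched in Python as follows. The phase boundaries, the 4 s window and the 50% overlap come from the text; the function names `label_phase` and `sliding_windows`, the handling of boundary instants, and the choice to return `None` for times outside all five phases are assumptions for illustration.

```python
def label_phase(t, onset, offset):
    """Label a time point t (seconds) relative to one seizure.

    onset/offset are the neurologist-labeled seizure start and end (seconds).
    Returns None when t falls outside all five phases (assumed behavior).
    """
    if onset <= t < offset:
        return "Ictal"
    if t < onset:
        minutes_before = (onset - t) / 60.0
        if minutes_before <= 5:
            return "SPH"
        if minutes_before <= 30:
            return "Preictal"
        if minutes_before <= 40:
            return "Interictal"
        return None
    minutes_after = (t - offset) / 60.0
    return "Postictal" if minutes_after < 30 else None


def sliding_windows(n_samples, fs=1000, win_s=4.0, overlap=0.5):
    """Return (start, stop) sample indices of overlapping windows."""
    win = int(win_s * fs)
    step = int(win * (1.0 - overlap))
    return [(s, s + win) for s in range(0, n_samples - win + 1, step)]
```

For example, each window returned by `sliding_windows` could be labeled by passing its start time `start / fs` to `label_phase`; whether the start, center or end of the window is used is a design choice the text does not state.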

A. sEEG Pre-Processing
The human brain has multiple functional regions, serving functions such as learning, memory, cognition and emotion. The synergy between different regions generates different rhythms of brain waves [21], [22]. Different brain wave rhythms can reflect the activities of different functional brain regions. In this study, the sEEG is analyzed in 5 major sub-bands: δ (1-4 Hz), θ (4-8 Hz), α (8-13 Hz), β (13-30 Hz), and γ (30-80 Hz). A summary of the relationship between these frequency bands and childhood epilepsy is shown in Table III. We employ a 4th-order IIR band-pass filter to divide the 1-80 Hz sEEG signal into the 5 rhythms [23].
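A minimal Python sketch of the rhythm decomposition is given below. The band edges and the 4th-order IIR filter come from the text; the Butterworth design, the zero-phase filtering, and the names `BANDS` and `split_rhythms` are assumptions. Note that SciPy's `butter` with order 4 and a band-pass type yields an 8th-order filter overall, so the exact order used in the study may differ.

```python
import numpy as np
from scipy import signal

# Band edges in Hz, as listed in the text.
BANDS = {
    "delta": (1.0, 4.0),
    "theta": (4.0, 8.0),
    "alpha": (8.0, 13.0),
    "beta": (13.0, 30.0),
    "gamma": (30.0, 80.0),
}


def split_rhythms(eeg, fs=1000, order=4):
    """Split sEEG of shape (n_channels, n_samples) into the 5 rhythms."""
    eeg = np.asarray(eeg, dtype=float)
    rhythms = {}
    for name, (low, high) in BANDS.items():
        # Second-order sections keep the narrow low-frequency bands stable.
        sos = signal.butter(order, [low, high], btype="bandpass",
                            fs=fs, output="sos")
        rhythms[name] = signal.sosfiltfilt(sos, eeg, axis=-1)
    return rhythms
```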

B. Functional Connectivity Matrix Construction
EEGs are essentially nonlinear and non-stationary signals. Direct brain network analysis of the above brain rhythm signals generally relies on the amplitude and phase components of the sEEG. We use the Hilbert transform to obtain the instantaneous phase of each signal and construct the brain network correlation matrix based on phase synchronization.
Let the above brain rhythm signal be s(t). The analytic signal z(t) obtained by the Hilbert transform is

$$z(t) = s(t) + j\,\tilde{s}(t) = m(t)\,e^{j\theta(t)} \tag{1}$$

where

$$\tilde{s}(t) = \frac{1}{\pi}\,\mathrm{p.v.}\!\int_{-\infty}^{\infty} \frac{s(\tau)}{t-\tau}\,d\tau \tag{2}$$

$$\theta(t) = \arctan\frac{\tilde{s}(t)}{s(t)} \tag{3}$$

Here m(t) is the instantaneous amplitude, θ(t) is the instantaneous phase, and p.v. denotes the Cauchy principal value [30]. The real signal s(t) is extended to the complex plane by the Hilbert transform so that it satisfies the Cauchy-Riemann equations. The phase angle of the sEEG signal at time t can be calculated for each rhythm from the above equations. Suppose that the sEEG signals of two channels in the same rhythm are X(t) and Y(t), with instantaneous phases θ_X(t) and θ_Y(t), respectively. The phase difference between them can then be expressed as

$$\phi(t) = \theta_X(t) - \theta_Y(t) \tag{4}$$

When the phase difference between two signals is constant, the signals are synchronized. The phase-locking value (PLV) is an important parameter for quantifying signal synchronization and is often used to quantify task-induced changes in the long-range synchronization of neural activity [31], [32]. Let ϕ_n(t) denote the n-th sample of the phase difference calculated by (4). The PLV is defined as

$$\mathrm{PLV} = \left|\frac{1}{N}\sum_{n=1}^{N} e^{j\phi_n(t)}\right| \tag{5}$$

where N is the length of the phase difference sequence, and the calculated PLV ranges from 0 to 1. A PLV of 1 means that the phase difference is constant over time, whereas a PLV of 0 means that the phase difference is distributed uniformly within [0, 2π). In this study, to fully utilize the rich spatio-temporal and frequency domain features of sEEG, the PLV values between different channels are calculated, which helps address inherent limitations of sEEG signals such as low spatial resolution. A functional correlation matrix of the sEEG in each of the 5 brain rhythms is constructed as follows:
1) Define the network nodes. The 19 electrodes of the sEEG acquisition device are defined as network nodes.
2) Estimate the inter-node connections. The connection values between all nodes are computed by (5).
3) Generate an adjacency matrix to evaluate brain network connectivity patterns. The connectivity values between all nodes are assembled into an adjacency matrix. The brain network connectivity patterns are analyzed by threshold filtering on the strength of the connections between nodes.
4) Secondary interpolation. This step improves the spatial resolution of the PLV correlation matrix. The original 19 × 19 PLV correlation matrix is enlarged to 256 × 256 by the bilinear interpolation algorithm in this study.

The overall correlation matrix extraction for the 5 different states is shown in Fig. 3. As observed, rich detail information is obtained after interpolation, which can compensate for the poor spatial resolution of EEGs. The differences are mainly reflected in the connections between electrodes placed over the prefrontal, occipital, and central regions. As the seizure onset approaches, the PLV connection values between these channels first weaken and then strengthen. This pattern is evident in the γ band, while the δ band does not change.
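The construction of the PLV correlation matrix and its enlargement can be sketched in Python as follows. The sketch assumes the input is one band-limited sEEG window of shape (channels, samples); the function names `plv_matrix` and `upsample_bilinear` are hypothetical, and the use of SciPy's `hilbert` and `RegularGridInterpolator` is one possible realization rather than the implementation used in the study.

```python
import numpy as np
from scipy.signal import hilbert
from scipy.interpolate import RegularGridInterpolator


def plv_matrix(x):
    """PLV between every pair of channels.

    x: band-limited sEEG of shape (n_channels, n_samples).
    Returns a symmetric (n_channels, n_channels) matrix with values in [0, 1].
    """
    x = np.asarray(x, dtype=float)
    phase = np.angle(hilbert(x, axis=-1))   # instantaneous phase per channel
    unit = np.exp(1j * phase)               # unit phasors e^{j theta}
    # Entry (a, b) sums e^{j(theta_a - theta_b)} over time.
    return np.abs(unit @ unit.conj().T) / x.shape[-1]


def upsample_bilinear(m, size=256):
    """Enlarge a square matrix to (size, size) by bilinear interpolation."""
    m = np.asarray(m, dtype=float)
    grid = np.arange(m.shape[0], dtype=float)
    interp = RegularGridInterpolator((grid, grid), m, method="linear")
    new = np.linspace(0.0, m.shape[0] - 1.0, size)
    gi, gj = np.meshgrid(new, new, indexing="ij")
    return interp(np.stack([gi, gj], axis=-1))
```

For a 19-channel window, `upsample_bilinear(plv_matrix(window))` would produce one 256 × 256 input image per rhythm.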

C. Functional Connectivity Analysis
The different properties and functions of neurons in the brain constitute complex neural circuits and neural networks at different levels. These spatially separated functional regions and structures interact with each other to achieve the complex functions of the brain [33]. IS seizures are thought to result from hyper-synchronized activity of neuronal cells in these areas when an imbalance between excitation and inhibition occurs in the cerebral cortex. When the level of synchronization increases, the waveform frequency increases, meaning that more neurons are correspondingly synchronized; the amplitude of the sEEG then becomes larger, leading to the generation of spikes. Notably, once this phenomenon ends, the brain returns to normal function. By setting up nodes and defining edges, the complex relationships between different brain regions can be mapped into a network topology through complex networks. Such a mapping is purposeful and interpretable, which facilitates understanding and exploring the complex dynamics and behaviors of the brain, and is beneficial for exploring the mechanisms of transition from apparently normal brain activity to epileptic seizures.
Improving the neurophysiological understanding of pre-seizure states, in order to determine whether universal mechanisms exist for the various pre-seizure states observed, is beneficial for enhancing seizure prediction. The study aims to reveal which physiological aspects underlie the predictive features of the EEG through statistical analysis and deep learning models. Here, the variability of the mean PLV correlation matrix over all samples in the 5 different states of IS is analyzed by traditional methods. Thresholding the association matrix directly determines the estimated strength of connectivity between brain regions, and different threshold settings often lead to different connectivity results. In this study, the PLVs of multiple conditions are pooled and the thresholds are determined by analyzing the data distribution. This pooling can increase the signal-to-noise ratio (SNR) and provide a threshold that is more robust to non-representative data.
In Fig. 4 (a), the PLV values of all states do not follow a normal distribution, and each state should have its own threshold because the average connection value changes across states. Therefore, based on the distribution of PLV values in Fig. 4 (a), the threshold for each state is set to the sum of the variance and the median of the data for that state. Edges with values greater than this threshold are considered to have enhanced connectivity, and those with values below the threshold are considered to have no connectivity. The resulting brain network connectivity topology is shown in Fig. 4 (b). The correlation matrix is used as the input of the analytical model in this study, and it is also binarized to show more clearly the variability of the connection matrix in different states. The binarized correlation matrices are presented in Fig. 5. We find that (1) in the δ band, the Fp1, Fp2, F3, F4, C3, and C4 electrodes show synchronization variability with other electrodes, which also varies significantly across the Preictal, SPH, Ictal, and Postictal phases. (2) In the θ, α and β bands, the variability of the connections between the P3, P4, O1, and O2 electrodes and other electrodes is mainly in the SPH and Ictal phases. (3) In the γ band, the connection strengths of the leads in the Preictal, SPH, Ictal, and Postictal states all show variability. In summary, the connections between the prefrontal, occipital, and central regions show high variability across rhythms. The mean PLV correlation matrices of the different states were compared by paired t-test. A difference between two states is considered significant when the p-value is less than 0.05, which is indicated by *. A larger number of * represents a larger difference, and the results are shown in Fig. 6. The figure clearly shows that in the δ band, only the Ictal and Postictal phases differ significantly, with no significant difference between the other states, while in the θ, α, β, and γ bands the states differ from each other to varying degrees.
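The thresholding rule and the paired test can be sketched in Python as follows. The threshold (median plus variance) and the 0.05 significance level come from the text; computing the statistics over the off-diagonal entries only, pairing the two states edge by edge, and the names `binarize` and `compare_states` are assumptions for illustration.

```python
import numpy as np
from scipy import stats


def binarize(plv):
    """Binarize one state's mean PLV matrix with threshold = median + variance.

    Only the upper-triangular (off-diagonal) edges enter the statistics,
    which is an assumption; the diagonal is always 1 and carries no information.
    """
    plv = np.asarray(plv, dtype=float)
    edges = plv[np.triu_indices_from(plv, k=1)]
    threshold = np.median(edges) + np.var(edges)
    adjacency = (plv > threshold).astype(int)
    np.fill_diagonal(adjacency, 0)
    return adjacency, threshold


def compare_states(edges_a, edges_b, alpha=0.05):
    """Paired t-test on matching edge values of two states."""
    t_stat, p_value = stats.ttest_rel(edges_a, edges_b)
    return t_stat, p_value, bool(p_value < alpha)
```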

D. IS Seizure Prediction Model
The purpose of this study is to distinguish the different stages from the PLV correlation matrix with deep learning models. In particular, the residual network model (Resnet18) is applied for IS prediction. Compared with other popular deep networks, such as Alexnet and VGG, Resnet has the following advantages in our study: (1) In the residual blocks of Resnet, the input can be propagated forward faster through the cross-layer skip connections, which avoids the vanishing gradient problem. (2) Different Resnet models can be obtained by configuring the number of channels and the number of residual blocks in each module, so the structure is simple and the parameters are convenient to modify. The Resnet18 structure is shown in the predictive analytic model section of Fig. 1. Resnet18 mainly consists of three parts: 4 residual modules built from 16 convolutional layers, a convolutional layer, and a fully connected layer. The kernel size of each convolutional layer is 3. The numbers of convolutional kernels of the four residual modules are 64, 128, 256, and 512, respectively.
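The layer count implied by this description can be checked with a short arithmetic sketch in Python. The channel widths and the 3 × 3 kernel size come from the text; the split into two residual blocks per module with two convolutions per block follows the standard ResNet18 layout and is an assumption here, as is the 64-channel input to the first residual module. The weight count ignores biases, batch normalization and any shortcut convolutions.

```python
STAGE_CHANNELS = [64, 128, 256, 512]  # kernels per residual module (from the text)
BLOCKS_PER_STAGE = 2                  # assumption: standard ResNet18 layout
CONVS_PER_BLOCK = 2                   # assumption: two 3x3 convolutions per block
KERNEL = 3                            # kernel size stated in the text

# 4 modules x 2 blocks x 2 convolutions = 16 convolutional layers
residual_convs = len(STAGE_CHANNELS) * BLOCKS_PER_STAGE * CONVS_PER_BLOCK
# plus the initial convolutional layer and the fully connected layer
total_layers = 1 + residual_convs + 1


def stage_conv_weights(in_channels=64):
    """Number of 3x3 convolution weights in each residual module."""
    weights = []
    c_in = in_channels
    for c_out in STAGE_CHANNELS:
        n = 0
        for _ in range(BLOCKS_PER_STAGE * CONVS_PER_BLOCK):
            n += c_in * c_out * KERNEL * KERNEL
            c_in = c_out  # later convolutions in the module keep the width
        weights.append(n)
    return weights
```

Under these assumptions the count reproduces the 16 + 1 + 1 = 18 weighted layers that give the network its name.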

Algorithm 1 Seizure Prediction Analysis Algorithm
Number of categories K = 5.
Output: Statistical analysis results, Predictor identification results
1: for k = 1 to K do
2:   Extraction of brain rhythms
3:   Calculate the PLV correlation matrix
4: end for
5: Obtain the functional brain connectivity matrix
6: Calculate the average correlation matrix and perform statistical analysis
7: for i = 1 to N/32 do
8:   Randomly select a batch of PLV correlation matrices N_i from N
9:   Perform feature extraction on N_i through forward propagation
10:   Calculate the loss and compute the gradients through back propagation
11:   Update the network parameters
12: end for
13: return Evaluation Model

In summary, in building the prediction model, the sEEG signals of 19 electrodes were selected as nodes for the construction of the functional network. The connectivity between the nodes was calculated by PLV. The brain networks of the 5 rhythms in different states were constructed, and the connectivity analysis was carried out by thresholding and statistical methods. The two-dimensional enhanced correlation matrix was used as the input to Resnet18 for IS seizure prediction. Algorithm 1 summarizes the detailed steps.

IV. RESULTS AND DISCUSSIONS
The sEEG data recorded from 25 IS patients were used for the analysis in this section. A total of 40066 sEEG segments were adopted in the experiments, that is, 40066 PLV correlation matrices for training, testing and validation, each obtained from 4 s of sEEG. The number of samples in each state for each subject is shown in Fig. 7. We selected 19 EEG electrodes as the brain network nodes (Fp1, Fp2, F3, F4, F7, F8, C3, C4, T3, T4, P3, P4, T5, T6, O1, O2, Fz, Cz and Pz) and constructed the PLV correlation matrices for 5 rhythms in 5 different states. The connectivity between brain regions is analyzed using thresholding and binarization, and the association of the different rhythms with each state of IS is also studied by statistical methods. The interpolated correlation matrix was used for training the deep residual network model. Stochastic gradient descent (SGD) was applied as the neural network optimizer and the cross-entropy loss was adopted as the loss function. The learning rate is initialized to 0.001 and reduced to 1/10 of its previous value every 10 epochs, with a batch size of 32. The ratio of the training, testing and validation datasets is set to 8:1:1. Precision, recall, specificity, F-score and accuracy are derived for performance evaluation. In particular, the parameter settings of the model were not changed when the individual difference analysis model was constructed.
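The learning rate schedule and the dataset split described above can be written out as a short Python sketch. The initial rate, decay factor, decay interval, sample count and 8:1:1 ratio come from the text; the function names and the rounding rule used to turn the ratio into integer counts are assumptions.

```python
def learning_rate(epoch, base_lr=0.001, step=10, gamma=0.1):
    """Step decay: multiply the rate by gamma after every `step` epochs."""
    return base_lr * gamma ** (epoch // step)


def split_counts(n_samples, ratios=(8, 1, 1)):
    """Integer sizes of the training, testing and validation sets.

    Training and testing sizes are rounded down and the remainder goes to
    validation (an assumed rounding rule).
    """
    total = sum(ratios)
    n_train = n_samples * ratios[0] // total
    n_test = n_samples * ratios[1] // total
    n_val = n_samples - n_train - n_test
    return n_train, n_test, n_val
```

With 40066 samples, `split_counts(40066)` assigns roughly 32 thousand matrices to training and about 4 thousand each to testing and validation.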

A. Brain Rhythm Correlation Analysis
The resolution-enhanced correlation matrices of the δ, θ, α, β, and γ rhythms are input to Resnet18 to analyze the correlation between different rhythms and IS seizure status. The experiments were performed independently for single-rhythm, two-rhythm and multi-rhythm analyses. The radar plot in Fig. 8 shows the performance when a single rhythm is used as input for model training and testing. Fig. 9 shows the effect of pairing two rhythms on the accuracy of the model. Table IV shows the performance when some of the multi-rhythm feature combinations are used as input. It is evident from these three results that (1) for single rhythms, the higher the frequency, the greater the positive effect on identifying the state of IS; (2) the complementarity of useful information is strongest between the features of the β and γ bands and weakest between the low frequency bands; and (3) the combined features of multiple rhythms perform better than single rhythms. In terms of accuracy, the model built on the combined features of the θ, α, β, and γ bands offers the best performance.
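The single-rhythm, two-rhythm and multi-rhythm experiments amount to training on different subsets of the five rhythm matrices. A possible way to enumerate these subsets and assemble the corresponding inputs is sketched below in Python. Stacking the selected matrices as image channels is an assumption; the text does not state how the rhythm features are combined, and the names `rhythm_subsets` and `stack_rhythms` are hypothetical.

```python
from itertools import combinations

import numpy as np

RHYTHMS = ["delta", "theta", "alpha", "beta", "gamma"]


def rhythm_subsets(k):
    """All combinations of k rhythms, e.g. k=2 gives the rhythm pairs."""
    return list(combinations(RHYTHMS, k))


def stack_rhythms(matrices, subset):
    """Stack the PLV matrices of the chosen rhythms along a channel axis.

    matrices: dict mapping rhythm name to a 2-D PLV matrix.
    Returns an array of shape (len(subset), height, width).
    """
    return np.stack([np.asarray(matrices[name]) for name in subset], axis=0)
```

For instance, `stack_rhythms(mats, ("theta", "alpha", "beta", "gamma"))` would give a four-channel input for the best-performing combination reported above.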
The δ wave is the main component of the childhood sEEG. However, when only the δ band features are used, the IS seizure prediction performance is not satisfactory. The δ band result does not change significantly throughout the transition from normal EEG to seizure, as reflected in Figs. 8 and 9 and Table IV. By contrast, the features of the higher frequency bands are more effective in identifying the different states of IS. This observation is consistent with the statistical significance analysis results in Fig. 6. The phenomenon is attributed to the high activity of neuronal cells in various cognitive and motor brain regions before and after seizures and to the increase of high-frequency components in brain waves. To be a meaningful early warning system in clinical use, the pre-seizure biomarker should be detectable early so that the time spent in false warning is shortened. The PLV correlation matrix, combined with deep network models, was shown to be effective in measuring the brain network changes in different states of IS seizures, and it can potentially be used as a pre-seizure biomarker for IS seizure prediction. From the above analysis of IS prediction, the overall observations are that 1) the PLV matrices of brain rhythms in the middle and high frequency bands change significantly at each stage of IS episodes, 2) there is greater informational complementarity between pairs of the θ, α, β, and γ bands, and 3) the PLV correlation matrix can be adopted as a biomarker for IS seizure prediction.

B. Model Prediction Analysis
An additional aim of this study is to assess the feasibility of the PLV association matrix as a predictive feature or preictal biomarker for predicting the exact timing of seizures. Based on the above analysis, we selected feature vectors consisting of the PLV association matrices in the θ, α, β, and γ rhythms as biomarkers of preictal states. The confusion matrix of all 5 different states of IS epilepsy is shown in Fig. 10(a). It is clear that the biomarker can effectively distinguish the postictal state, while the results for the remaining 4 states are less satisfactory. According to discussions with neurologists, this is because the continuous sEEG signals we chose are highly similar across the different states. Fig. 11 visualizes the features extracted by Resnet18 through t-SNE and directly supports this clinical experience. It can be seen from the figure that the postictal state is clearly separated from the others, while the remaining states are not as distinguishable.
In addition, the receiver operating characteristic (ROC) curves of the 5 states are depicted in Fig. 10(b). The maximum value of the area under the curve (AUC) is 0.9804. The AUC values for both the micro-average and the macro-average reach approximately 0.94. The prediction performance was evaluated based on the ROC curve, which relates the true positive rate to the false positive rate, and the AUC quantifies the performance of the algorithm. Although the ROC curves and AUC values demonstrate the good performance of the Resnet18 model, prediction is not as effective as detection, probably because of the large individual differences.
In addition, we have visualized the feature vectors learned by the residual blocks in Resnet18, as shown in Fig. 12. It is clear that (1) after the multi-layer convolution, some correlations in the correlation matrix are amplified in characterizing the different epilepsy states of infantile spasms. These discriminating areas are mainly in the regions marked by the red boxes in Fig. 5. (2) The distinctions learned by the data-driven model and the results of the statistical analysis consistently show that the connectivity between the prefrontal, occipital, and central regions differs significantly among the five states. (3) The PLV connectivity changes are most significant in the prefrontal, occipital and central regions in the γ band.
Fig. 10. Confusion matrix of the Resnet18 and the ROC curves of the 5 categories (micro-average: each sample has equal weight; the outcomes are aggregated across all categories and the metric is computed from the aggregate. Macro-average: each class has equal weight; the metric is computed within each class and then averaged across categories).

Since little research has been done on generalized epilepsy seizure prediction, to exemplify the possibility of this combination of features as a determinant of epileptic states, we compared the results with existing focal epilepsy prediction algorithms based on continuous signals (i.e., classification of epileptic states into 5 states), as shown in Table V. The major difference between focal epilepsy and generalized epilepsy is whether there are significant changes in EEG features in the preictal state. Although the two kinds of epilepsy prediction are not fully comparable, prediction algorithms for focal epilepsy have some implications for generalized epilepsy prediction. Here, we compare the data types, models, processing methods and recognition results of the existing methods and ours. In [20], the performance is tested on a balanced dataset obtained by removing samples from the non-ictal periods. In real applications, however, there are far more data in the non-ictal phase than in the ictal phase, so this testing does not reflect realistic conditions. In [35], good recognition is achieved, but the algorithm is limited to small amounts of invasive EEG. In [34], [36], and [37], all studies are performed on simulated real seizures. Compared with these three methods, our proposed model achieves good accuracy, sensitivity, and specificity. The model shows a low misdiagnosis rate and a high underdiagnosis rate. Together with the confusion matrix, it is clear that the postictal state is rarely misclassified, while the ictal and SPH states are most often misclassified, which is caused by the high similarity of the data from 0 to 40 minutes before the infantile spasm seizure. The above comparisons indirectly demonstrate the feasibility of PLV as a biomarker for IS seizure prediction. Meanwhile, the brain network features have been explored to discuss the mechanism during IS episodes through statistical and deep learning methods. We use non-invasive scalp EEG, which is an advantage over [34].
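To make the micro-average and macro-average in the Fig. 10 caption concrete, the following Python sketch derives recall, specificity and precision from a multi-class confusion matrix. It assumes that rows are true classes and columns are predicted classes, and that every class appears at least once among both the true and the predicted labels; the function names are hypothetical and the matrix in the usage note is invented for illustration, not taken from the study.

```python
import numpy as np


def one_vs_rest_counts(cm):
    """Per-class TP, FP, FN, TN from a confusion matrix (rows = true class)."""
    cm = np.asarray(cm, dtype=float)
    tp = np.diag(cm)
    fn = cm.sum(axis=1) - tp
    fp = cm.sum(axis=0) - tp
    tn = cm.sum() - tp - fn - fp
    return tp, fp, fn, tn


def macro_micro(cm):
    """Macro- and micro-averaged recall, specificity and precision."""
    tp, fp, fn, tn = one_vs_rest_counts(cm)
    macro = {
        # metric computed within each class, then averaged across classes
        "recall": float(np.mean(tp / (tp + fn))),
        "specificity": float(np.mean(tn / (tn + fp))),
        "precision": float(np.mean(tp / (tp + fp))),
    }
    micro = {
        # counts aggregated across classes, then one metric computed
        "recall": float(tp.sum() / (tp.sum() + fn.sum())),
        "specificity": float(tn.sum() / (tn.sum() + fp.sum())),
        "precision": float(tp.sum() / (tp.sum() + fp.sum())),
    }
    return macro, micro
```

For a single-label problem the micro-averaged recall equals the overall accuracy, so a made-up matrix such as `[[5, 0, 0], [1, 3, 1], [0, 0, 10]]` gives a micro recall of 18/20 while the macro recall is pulled down by the weakest class.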
Overall, it is clear that (1) the PLV features of the θ, α, β, and γ rhythms are the optimal feature combination for IS seizure prediction. (2) The visualization results of the features learned by the model demonstrate the finding

C. Individual Variability Analysis
Seizure prediction algorithms should be specific to individuals and/or to the intervention used at the time of prediction. The prediction time associated with seizures and the duration of the assumed pre-seizure period vary considerably among individuals. The results in Fig. 10(b) also reveal that IS seizure prediction has obvious individual differences. Therefore, an independent modeling analysis is performed individually for the first 17 IS patients, and the results are shown in Fig. 13. Because of the small sample sizes in individual testing, the model performance is generally degraded in the SPH and preictal states compared with testing on mixed samples. In particular, the detection performance on subjects 5, 8, 16, and 17 is generally better, which may be due to the presence of in-sample testing and selection bias in the study, the limited number of episodes in the data, and the severe data imbalance between the individual states of the episodes. Therefore, to improve the performance of machine learning models, it is important to improve the neurophysiological understanding of preictal states and to determine whether prevalent mechanisms lead to the various preictal states.
In summary, we evaluated the possibility of using the PLV association matrix as a biomarker for IS seizure prediction through brain rhythm correlation analysis, model prediction analysis, and individual variability analysis. The consistency between the results of the statistical analysis and those of the deep learning models was also demonstrated. These results are useful as a guide for constructing generalized seizure predictors. However, some limitations remain: (1) more seizure data from infantile spasms are needed for analysis, (2) the performance testing in this paper is carried out on post-acquisition data, and analysis of real-time seizure prediction on new data is still lacking, and (3) besides Resnet, more up-to-date models should be analyzed.

V. CONCLUSION
In this study, we designed an analytical framework for IS seizure prediction based on statistical analysis with Resnet18 as the predictor, aiming to characterize the local and overall properties of the epilepsy network and to help understand how seizures occur in the epilepsy network. The sEEG data collected from 25 IS subjects were used in the study. The extracted continuous sEEG of generalized seizures was divided into 5 different periods, namely Interictal, Preictal, SPH, Ictal, and Postictal. The PLV epilepsy network was constructed using graph theory, and the mechanisms of changes in the brain network during the transitional phases of seizures were analyzed by statistical methods. Resnet18 was also employed as the IS seizure predictor to find the optimal combination of feature vectors of the correlation matrices of different rhythms, which was further used to validate the PLV correlation matrix as a biomarker of generalized preictal seizures in terms of intergroup and individual differences. Finally, the synergy between data, models, seizure mechanisms and algorithms is explained through comprehensive statistical analysis, deep network prediction and visualization. Effectively addressing the overfitting caused by data imbalance and by the large individual differences in the SPH period will be critical in our future research on enhancing the prediction performance.

ETHICAL STANDARDS
This study has been approved by the Second Affiliated Hospital of Zhejiang University and registered in the Chinese Clinical Trial Registry (ChiCTR1900020726). All patients gave their informed consent prior to their inclusion in the study.