Classifying Routine Clinical Electroencephalograms With Multivariate Iterative Filtering and Convolutional Neural Networks

Electroencephalogram (EEG) is widely used in basic and clinical neuroscience to explore neural states in various populations, and classifying these EEG recordings is a fundamental challenge. While machine learning shows promising results in classifying long multivariate time series, optimal prediction models and feature extraction methods for EEG classification remain elusive. Our study addressed the problem of EEG classification under the framework of brain age prediction, applying a deep learning model on EEG time series. We hypothesized that decomposing EEG signals into oscillatory modes would yield more accurate age predictions than using raw or canonically frequency-filtered EEG. Specifically, we employed multivariate intrinsic mode functions (MIMFs), obtained with multivariate iterative filtering (MIF), a variant of empirical mode decomposition (EMD), together with a convolutional neural network (CNN) model. Testing a large dataset of routine clinical EEG scans (n = 6540) from patients aged 1 to 103 years, we found that an ad-hoc CNN model without fine-tuning could reasonably predict brain age from EEGs. Crucially, MIMF decomposition significantly improved performance compared to canonical brain rhythms (from delta to lower gamma oscillations). Our approach achieved a mean absolute error (MAE) of 13.76 ± 0.33 years and a correlation coefficient of 0.64 ± 0.01 in brain age prediction over the entire lifespan. Our findings indicate that CNN models applied to EEGs, preserving their original temporal structure, remain a promising framework for EEG classification, wherein adaptive signal decompositions such as MIF can enhance CNN models' performance in this task.


I. INTRODUCTION
ELECTROENCEPHALOGRAM (EEG) is used for quantifying the brain's states by measuring time-varying electric potential differences across the scalp. EEGs can differentiate various neural states such as eyes-closed and eyes-open, resting or alert conditions [1], awake and sleep [2], emotional arousal [3], and others. EEG is an affordable and widely available methodology applied in clinical, cognitive, and basic neuroscience. In classification tasks based on machine learning approaches, EEG has been successfully used for emotion recognition [1], motor imagery identification tasks [4], seizure detection [5], brain injury monitoring [6], Alzheimer's disease classification [7], depression detection [8], sex classification [9], and classification of abnormal EEGs [10].
Brain age, or neurophysiological age, quantifies individual neural mechanisms as deviations from the population average. Serving as a potent predictive marker, brain age can highlight susceptibility to various mental health conditions. A significant discrepancy between an individual's chronological age and their predicted neurophysiological age may signal an elevated risk for cognitive decline and conditions such as Alzheimer's or Parkinson's disease, schizophrenia, and the severity of clinical symptoms. This metric can provide a specific window into the potential onset and progression of neurodegenerative and psychiatric disorders [11].
Studies that combine EEG with machine learning models generally follow one of two approaches. They treat EEG data either as a temporal sequence, capitalizing on the time-dependent nature of brain signals, or as a vector of extracted features, harnessing key characteristics for further analysis and prediction. Time series are high-dimensional data that are better suited to deep learning approaches than to classical (non-neural) machine learning methods. For instance, the work in [21] aims to predict the brain age of the participants using EEGs as a direct input to bidirectional long short-term memory (Bi-LSTM) and gated recurrent unit (GRU) models. The age of participants is categorized into six age groups, and classification accuracy is used as the metric for analyzing the performance of the models. Their model achieves an accuracy of 93.69% for predicting brain age.
The nature of EEG features extracted (evaluated) from EEG recordings remains diverse [2], [19], [22]. These EEG features can be defined within the frequency domain, exemplified by spectral power, within the time-frequency domain [23], as showcased through spectrograms, or within the temporal domain, indicated by a range of linear and non-linear features.
The study [19] extracted five distinct sets of features from EEG data with the aim of predicting brain age. These feature sets include amplitude, range, spectral, connectivity, and fractal dimension domain features. Those features were then given as input to a stacked ensemble of support vector regression, extreme gradient boosting (XGBoost), and Gaussian polynomial regression to obtain a mean absolute error (MAE) of 6.87 years. In another study [2], the authors extracted features based on their previous work [24] for predicting brain age. Their model obtains an MAE of 7.6 years for typically aging participants. A new BLSTM-LSTM model is proposed in [22] to predict the age and sex of participants using discrete wavelet transforms extracted from EEGs. Their analysis showed that, compared to other EEG rhythms, the beta band predicts an individual's age and sex more accurately. This study obtained an accuracy of 93.7% in predicting the participants' age.
Advanced EEG preprocessing is typically applied before feature extraction, as the artifacts present in raw data may bias the estimation of EEG metrics. Empirical mode decomposition (EMD) [25] has been used as a preprocessing stage for further feature extraction from EEG signals in several studies [26], [27]. In our study, we hypothesized that approaches such as multivariate iterative filtering (MIF) can provide an improvement in the performance of prediction models, which treat EEG as temporal sequences.
More specifically, we applied a novel approach for predicting brain age, which combines multivariate intrinsic mode functions (MIMFs), obtained with MIF, a variant of the EMD algorithms, with a convolutional neural network (CNN) designed ad-hoc for the task of brain age prediction. Previous studies have applied several versions of EMD for extracting features in the temporal domain [28], [29]. Our findings further support the view that the MIF technique, built upon EMD, can enhance prediction accuracy when compared to traditional methods based on wide-spectrum EEG or filters corresponding to canonical frequency bands of brain rhythms. Our approach is schematically illustrated as a block diagram in Fig. 1.
The primary aim of our study is to enhance the knowledge representation of routine clinical EEGs for classification purposes, ensuring the preservation of their temporal characteristics. Methodologically, we evaluate and contrast two distinct strategies for the signal decomposition of EEG time series, particularly in the context of estimating brain age: (i) the conventional method of band-pass filtering, which reconstructs canonical brain rhythms commonly used in cognitive and clinical neuroscience, and (ii) a novel technique of MIF. This latter method decomposes a signal into oscillatory modes and is frequently employed in the field of engineering sciences. Our findings indicate that although band-pass filtering yields satisfactory results, in particular based on alpha and beta rhythms, in accordance with previous studies, the components derived from MIF demonstrate superior performance. This suggests that adopting a similar MIF approach could significantly enhance the accuracy of clinical label predictions and the determination of parameters in real-world clinical scenarios.
Our research makes several significant contributions to the field of neurological study and clinical practice:
1) We introduced a custom-designed CNN architecture explicitly tailored for predicting brain age from EEG time series data. This novel approach allows for the direct input of time series data into the predictive model.
2) In an effort to maintain the practical applicability of our findings, we applied very basic preprocessing of EEG data. This decision ensures that the EEGs remain as close to their original clinical state as possible, facilitating easier adoption by healthcare professionals.
3) We employed a dual strategy for decomposing EEG recordings into MIMF components and canonical frequency oscillations (encompassing delta, theta, alpha, beta, and lower gamma frequencies).
4) A comparative analysis was conducted to evaluate the efficacy of our CNN architecture using two distinct methods of signal decomposition, providing insights into the most effective approaches for EEG analysis.
5) Given that our study is grounded in the analysis of routine clinical EEGs, recorded from both in-patients and out-patients, and prioritizes preserving the original temporal structure of the data, our findings are immediately applicable in real-world contexts. This positions our work as a valuable asset for developing decision-support systems to enhance clinical EEG evaluations.
The rest of the paper is structured as follows: Section II describes the EEG dataset used in this study. Section III discusses the methods we developed, tested, and validated. Results are presented in Section IV. Section V discusses our approach's performance, followed by a conclusion in Section VI.

II. DATASET
The EEG dataset employed in this study was recorded over a seven-year period, from 2012 to 2018, at a public hospital of British Columbia, in the process of diagnostic evaluation. The original dataset included a total of 7048 participants, with a wide age range spanning from 1 to 103 years. The dataset was highly heterogeneous. It included virtually all the EEGs from the given public hospital without any selection bias, including EEGs from both in-patients and out-patients.
The hardware and firmware used for EEG recordings were similar across all EEG stations. Each setup was equipped with a Natus Xltek EEG32U EEG amplifier and gold-cup electrodes. The setup followed the standard 10/20 system positioning, and twenty electrodes, namely FP1, FPZ, FP2, F3, F4, F7, F8, FZ, T3, T4, T5, T6, C3, C4, CZ, P3, P4, PZ, O1, and O2, were used for recording EEG. Figure 3 illustrates the EEG electrode placement. In addition, electrooculography (EOG) and electrocardiography (ECG) were recorded using two pairs of electrodes. In general, the locations of the reference and ground electrodes were unknown. The EEG recordings ranged from 10 minutes to several hours in length (average duration of about 35 minutes). The sampling frequency for the EEGs was either 500 Hz or 512 Hz.
The Research Ethics Board at Simon Fraser University and Fraser Health Authority approved the ethics protocol on 1st April 2022 (protocol number: H18-02728).

III. METHODOLOGY

A. Data Preprocessing
We applied a minimalistic preprocessing pipeline for EEG. First, we converted the EEG recordings from Natus's proprietary native format into the European data format (EDF) using Natus's Neuroworks software. Subsequently, the data were anonymized with the PyEDFlib library [30] in Python. We then applied a zero-phase, overlap-add finite impulse response (FIR) band-pass filter with a Hamming window across a frequency range of 0.5 Hz to 55 Hz, as implemented by the MNE-Python library. This frequency range was chosen to capture EEG activity across delta, theta, alpha, beta, and lower-gamma bands, while excluding interference at and above the 60 Hz power line frequency. The EEG data were resampled to a frequency of 128 Hz to standardize the dataset.
Authorized licensed use limited to the terms of the applicable license agreement with IEEE. Restrictions apply.
In the subsequent stage, we screened each EEG recording to identify flat intervals (signified by digital zeros with a minimum peak-to-peak value threshold of 1e-6) and segments corresponding to photic stimulation and hyperventilation procedures. We aimed to extract a randomly selected 32-second segment of resting-state EEG from each processed sample. This step failed in some cases, and those cases were discarded. We believe that a 32-second segment is long enough to properly include the lowest frequencies such as delta and theta. At the same time, we did not want to increase the complexity of our models, which take time series as input. Hence, this arbitrary choice of 32 seconds was a compromise between the need to include neurophysiologically relevant rhythms and the need not to increase the computational complexity. The final step involved normalizing the time series for each EEG recording, adjusting each channel-specific signal to achieve a standardized mean amplitude of zero and a variance of one across time points. The EEG preprocessing pipeline is detailed schematically in Figure 4, which provides a visual overview of the steps undertaken to prepare the final dataset of n = 6540 samples for analysis.
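The segment selection and normalization steps can be illustrated with a minimal NumPy sketch. It is not the authors' pipeline: the function names `pick_segment` and `zscore`, the retry limit, and the synthetic input are hypothetical, the flatness check is applied to the whole candidate window rather than to individual intervals, and the exclusion of photic stimulation and hyperventilation segments is omitted.

```python
import numpy as np

FS = 128            # sampling rate after resampling (Hz), from the text
SEG_SEC = 32        # segment length (s), from the text
FLAT_PTP = 1e-6     # peak-to-peak threshold for flat signals, from the text


def pick_segment(eeg, rng, max_tries=100):
    """Return one random 32-s window (channels x samples) with no flat channel, or None."""
    n_seg = FS * SEG_SEC
    n_samples = eeg.shape[1]
    if n_samples < n_seg:
        return None
    for _ in range(max_tries):          # max_tries is a hypothetical retry limit
        start = int(rng.integers(0, n_samples - n_seg + 1))
        seg = eeg[:, start:start + n_seg]
        # Reject the window if any channel is flat (peak-to-peak below threshold).
        if np.all(np.ptp(seg, axis=1) >= FLAT_PTP):
            return seg
    return None  # no usable window found: the recording would be discarded


def zscore(seg):
    """Normalize each channel to zero mean and unit variance across time."""
    mean = seg.mean(axis=1, keepdims=True)
    std = seg.std(axis=1, keepdims=True)
    return (seg - mean) / std


rng = np.random.default_rng(0)
eeg = rng.standard_normal((20, FS * 120))   # synthetic 20-channel, 2-minute recording
segment = zscore(pick_segment(eeg, rng))
```

At 128 Hz, a 32-second window contains 4096 samples per channel, so `segment` has shape (20, 4096) in this toy example.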

B. Multivariate Iterative Filtering
EEG signals exhibit characteristics similar to non-stationary signals, which can introduce biases into EEG metrics. To address this, EMD was designed to break down non-stationary signals into narrow-band components. An alternative to EMD, known as iterative filtering, was introduced in [31]. This method applies a moving average filter that iteratively processes the signal, decomposing it into narrow-band oscillatory modes termed intrinsic mode functions (IMFs).
The natural extension of this approach for multi-channel signals has been proposed in [4] and [32]. This extension has proved to be very useful, as biological signals such as EEG recordings are generally collected using multiple electrode systems for improving spatial resolution. Additionally, when channel-by-channel analysis is carried out, univariate decomposition techniques like iterative filtering fail to produce distinct IMFs across different channels because of the random nature of EEG data and its low signal-to-noise ratio. MIF effectively tackles this issue. MIF applies a single moving average filter, shared across all channels of the signal, to produce an equal number of MIMF bands with similar frequency content on each channel. The maximum number of signal extrema over all channels determines the length of the moving average filter. The length of the moving average filter is calculated as

$$L = 2\left\lfloor \kappa \frac{N}{e} \right\rfloor,$$

where κ is a constant, N is the total number of samples, and e is the highest extrema count among all channels present in the signal. The cutoff frequency of the filter at each stage is inversely proportional to the length (L) of the moving average filter [32]. The MIF approach is presented in the form of an algorithm (Algorithm 1).
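To make the role of the filter length concrete, the following NumPy sketch extracts only the fastest oscillatory mode from a two-channel toy signal. It is an illustration under stated assumptions, not the MATLAB implementation used in this study: the length is taken as L = 2⌊κN/e⌋ with κ = 1.6 (a value commonly used in the iterative filtering literature), the moving average uses a triangular kernel chosen here because its frequency response is non-negative, a fixed number of inner iterations replaces the stopping criterion of [4] and [25], and all function names are hypothetical.

```python
import numpy as np


def count_extrema(x):
    """Count local maxima and minima of a 1-D signal via sign changes of its difference."""
    d = np.diff(x)
    d = d[d != 0]                       # ignore flat steps
    return int(np.sum(d[:-1] * d[1:] < 0))


def filter_length(x, kappa=1.6):
    """L = 2 * floor(kappa * N / e), with e the largest extrema count over channels."""
    n_samples = x.shape[1]
    e = max(count_extrema(ch) for ch in x)
    return 2 * int(np.floor(kappa * n_samples / max(e, 1)))


def triangular_kernel(length):
    """Symmetric weights with length + 1 taps (length is even) that sum to one."""
    half = length // 2
    box = np.ones(half + 1) / (half + 1)
    return np.convolve(box, box)


def moving_average(x, length):
    """Apply the same low-pass kernel to every channel (channels x samples)."""
    w = triangular_kernel(length)
    return np.stack([np.convolve(ch, w, mode="same") for ch in x])


def extract_first_mode(x, kappa=1.6, n_inner=10):
    """Peel off the fastest mode by repeatedly subtracting the local mean."""
    length = filter_length(x, kappa)
    mode = x.copy()
    for _ in range(n_inner):            # fixed count stands in for the stopping criterion
        mode = mode - moving_average(mode, length)
    return mode, x - mode               # (mode, residual)


# Toy input: 8 s at 128 Hz, a 20 Hz oscillation riding on a 2 Hz oscillation.
t = np.arange(1024) / 128.0
fast = np.sin(2 * np.pi * 20 * t)
slow = np.sin(2 * np.pi * 2 * t)
x = np.vstack([fast + slow,
               0.5 * np.sin(2 * np.pi * 20 * t + 1.0) + np.sin(2 * np.pi * 2 * t + 0.5)])
mode, residual = extract_first_mode(x)
```

In this toy case the extracted mode follows the 20 Hz component and the residual follows the 2 Hz component, away from the edges where zero padding distorts the moving average. Repeating the procedure on the residual would yield the next, slower mode.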

C. Convolutional Neural Networks
CNN [33] is a specialized deep learning architecture designed to process structured data such as images or time series. It comprises numerous convolution blocks that extract pertinent information from the input data.
Generally, a typical convolution block comprises three fundamental layers: the convolution layer, the pooling layer, and the dense layer. A dropout layer may also be incorporated for regularization purposes. The role of each layer is explained below.

Algorithm 1: MIF ("..." marks steps missing from the source)
    ...    // where n_e is the number of extrema in x
    i = 1
    while stopping criterion as specified in [4] and [25] is not satisfied do
        ...
        find number of extrema for all channels, E ∈ R^{ch×1};
        using maximum(E), compute the filter length L_i;
        design moving average filter w_i(n) of length L_i;
        ...
    end while
    ...
    x = x − x_i
    end while
    MIMF = MIMF ∪ {x}

1) Convolution Layer: ... weights of the filter kernel. The filter kernel size depends on the application in which it is used (for instance, a larger kernel size is useful for spatio-temporal data to capture long-range spatial and temporal dependencies). Mathematically, the convolution operation (∗) can be defined [33] as

$$(x * w)(t) = \sum_{a} x(a)\, w(t - a),$$

where x is the input, w is the filter, and t represents time.

2) Pooling Layer: The pooling layer acts as a dimensionality reduction layer. It reduces the number of parameters in the input by performing a selection operation such as average pooling, which takes the average of the outputs of the previous layer.

3) Dense Layer: The dense layer (also known as the fully connected layer) acts as a predictor that uses the features generated by previous layers and predicts the required output.

4) Dropout Layer: The dropout layer acts as a regularization layer to prevent the architecture from overfitting. The regularization is done by randomly dropping some neurons while keeping the other neurons unmodified in the hidden layer, hence nullifying the contribution of the dropped neurons toward the next layer.
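The layer operations described above can be sketched on a toy single-channel input with plain NumPy. This is only an illustration, not the architecture of Table I: the kernel weights and the input are hypothetical, and the dropout shown is the "inverted" variant that rescales the surviving activations, which is the convention in common deep learning frameworks and an assumption here.

```python
import numpy as np


def conv1d_same(x, w):
    """Convolve one channel with kernel w, zero-padded so output length equals input length."""
    return np.convolve(x, w, mode="same")


def average_pool(x, size):
    """Non-overlapping average pooling; trailing samples that do not fill a window are dropped."""
    n = (len(x) // size) * size
    return x[:n].reshape(-1, size).mean(axis=1)


def dropout(x, rate, rng):
    """Inverted dropout (training time): zero a random subset and rescale the survivors."""
    keep = rng.random(x.shape) >= rate
    return x * keep / (1.0 - rate)


rng = np.random.default_rng(0)
x = np.arange(8, dtype=float)             # toy single-channel input
w = np.array([0.25, 0.5, 0.25])           # hypothetical smoothing kernel
features = conv1d_same(x, w)              # [0.25, 1, 2, 3, 4, 5, 6, 5]
pooled = average_pool(features, size=2)   # [0.625, 2.5, 4.5, 5.5]
dropped = dropout(pooled, rate=0.25, rng=rng)
```

Average pooling halves the temporal resolution while every sample still contributes to the output, whereas max pooling would keep only one sample per window.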

D. Proposed Approach
In our study, we evaluated three distinct scenarios or versions of EEG recordings to ascertain their effectiveness as input for a CNN model designed to predict patients' brain age. More specifically, we contrasted three types of filters applied to EEG data: a broad-spectrum band-pass filter ranging from 1 to 55 Hz (which most closely approximates the original EEG), five canonical bands spanning from delta to lower gamma frequencies, and six bands delineated by the MIMFs.
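The canonical-band scenario can be illustrated with a short NumPy sketch that splits a signal into five rhythms. Two caveats apply: the band edges for delta, theta, and lower gamma are conventional values assumed here (only the alpha and beta ranges are stated in this paper), and the sketch zeroes FFT bins instead of using an FIR band-pass filter, a simplification chosen to keep the example self-contained. The names `BANDS` and `band_limit` are hypothetical.

```python
import numpy as np

FS = 128  # Hz, sampling rate after resampling (from the text)

# Alpha and beta edges are from the text; the others are conventional values assumed here.
BANDS = {
    "delta": (0.5, 4.0),
    "theta": (4.0, 8.0),
    "alpha": (8.0, 13.0),
    "beta": (13.0, 30.0),
    "lower_gamma": (30.0, 55.0),
}


def band_limit(eeg, low, high, fs=FS):
    """Keep only FFT bins with low <= f < high for each channel (channels x samples)."""
    freqs = np.fft.rfftfreq(eeg.shape[1], d=1.0 / fs)
    spectrum = np.fft.rfft(eeg, axis=1)
    spectrum[:, (freqs < low) | (freqs >= high)] = 0.0
    return np.fft.irfft(spectrum, n=eeg.shape[1], axis=1)


# Toy single-channel input: 4 s containing a 10 Hz and a 20 Hz oscillation.
t = np.arange(FS * 4) / FS
eeg = np.vstack([np.sin(2 * np.pi * 10 * t) + np.sin(2 * np.pi * 20 * t)])
rhythms = {name: band_limit(eeg, lo, hi) for name, (lo, hi) in BANDS.items()}
```

Here the 10 Hz component ends up in the alpha output and the 20 Hz component in the beta output. Unlike these fixed bands, the MIMF bands are derived from the signal itself.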
We propose a CNN architecture, designed ad-hoc for the task of predicting brain age from EEGs. The CNN functions by accepting multivariate time-series data, represented by EEGs in our context, and predicting a scalar value that corresponds to the patient's brain age. Our CNN model is constructed with five convolution blocks that encompass convolution, pooling, and dropout layers. We chose average pooling over max pooling because average pooling smoothly extracts features from the data, whereas max pooling ignores a large chunk of data, which is not desirable in our case. Also, the padding is kept the same across all blocks. We chose the rectified linear unit (ReLU) as our activation function to prevent the problem of vanishing gradients. The ReLU function is mathematically represented as

$$f(x) = \max(0, x),$$

where max is the maximum operator.
Additionally, the dropout rate was set to 25% in all the convolution blocks. Table I summarizes our proposed CNN architecture. We extracted MIMF bands using the MIF technique and used them as feature inputs to our CNN model to predict the brain age. We generated six MIMF bands, namely MIMF1, MIMF2, MIMF3, MIMF4, MIMF5, and MIMF6, where MIMF1 corresponds to the higher frequency regime, and MIMF6 corresponds to the lower frequency regime. These six MIMF bands essentially capture the entire range of frequencies (0.5-55 Hz) present in the preprocessed EEG signals.
The data preprocessing was done in Python. The MIMF bands were generated using MATLAB 2022b. The CNN architecture and training scheme were implemented using TensorFlow and Keras. We ran each scenario 50 times to account for variability in the performance. Model training was done for 20 epochs for each scenario, and each epoch took about 21 minutes to complete. We randomly split the dataset in the ratio of 70%:30% for train and test splits.
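The evaluation protocol can be sketched as follows. This is a hedged illustration rather than the authors' training code: it assumes that a new random 70%:30% split is drawn for every run and that the reported "±" values are standard deviations across runs, neither of which is stated explicitly. The helper names and the `evaluate` callback, which stands in for training and testing the CNN, are hypothetical.

```python
import numpy as np


def train_test_indices(n_samples, rng, train_fraction=0.7):
    """Shuffle sample indices and split them into disjoint train and test sets."""
    order = rng.permutation(n_samples)
    n_train = int(round(train_fraction * n_samples))
    return order[:n_train], order[n_train:]


def repeated_runs(n_samples, n_runs, evaluate, seed=0):
    """Call evaluate(train_idx, test_idx) on a fresh split per run; return mean and std."""
    rng = np.random.default_rng(seed)
    scores = [evaluate(*train_test_indices(n_samples, rng)) for _ in range(n_runs)]
    return float(np.mean(scores)), float(np.std(scores))


# With n = 6540 recordings, a 70%:30% split gives 4578 training and 1962 test samples.
train_idx, test_idx = train_test_indices(6540, np.random.default_rng(1))
```

In use, `evaluate` would train the model on `train_idx`, predict ages for `test_idx`, and return a metric such as the MAE, so that `repeated_runs(6540, 50, evaluate)` yields one "mean ± spread" entry.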

IV. RESULTS
To ensure robustness and reliability, our final results represent the average performance metrics, calculated from the test splits of the dataset over 50 separate runs.

A. Data Visualization
The preprocessed EEG data are illustrated in Fig. 5. In this example, we showed five channels, namely C3, C4, CZ, F3, and F4, for plotting the preprocessed EEG. The extracted MIMF bands obtained from the five channels of the preprocessed EEG (Fig. 5) are shown in Fig. 6. The power spectral density plots obtained from the extracted MIMF bands for the above-mentioned five channels are shown in Fig. 7.

B. Band-Pass Filtered EEG
Our CNN model predicted the brain age of the test participants from the broad-spectrum (0.5-55 Hz) original EEG data with an MAE of 15.99 ± 0.37 years. The Pearson correlation coefficient between the actual age and the predicted age for the original EEG was 0.49 ± 0.03, and the explained variance was 0.23 ± 0.03. The scatter plot of the actual age versus the predicted age is shown in Fig. 8a. Each point represents a participant from the test split of the dataset, and the points are fitted with a linear regression line. In the ideal case, the line would have a slope of 45 degrees, with an MAE of 0 and a correlation coefficient of 1. The regression line for the original EEG is far from having a 45-degree slope because of a typical statistical phenomenon called "regression to the mean".
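For reference, the three reported metrics can be computed as in the sketch below. The explained variance is taken as 1 − Var(residual)/Var(target), one common definition that is assumed here because the exact formula is not given; the ages in the example are made-up numbers, not study data.

```python
import numpy as np


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between actual and predicted age (years)."""
    return float(np.mean(np.abs(y_true - y_pred)))


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between actual and predicted age."""
    return float(np.corrcoef(y_true, y_pred)[0, 1])


def explained_variance(y_true, y_pred):
    """1 - Var(residual) / Var(target); one common definition, assumed here."""
    return float(1.0 - np.var(y_true - y_pred) / np.var(y_true))


# Made-up ages for illustration only.
actual = np.array([10.0, 30.0, 50.0, 70.0])
predicted = np.array([20.0, 30.0, 40.0, 60.0])
mae = mean_absolute_error(actual, predicted)   # 7.5
r = pearson_r(actual, predicted)
ev = explained_variance(actual, predicted)     # 0.8625
```

A perfect predictor would give an MAE of 0 and a correlation coefficient and explained variance of 1.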

C. Canonical Frequency Bands
Compared to the broad-spectrum EEG, the CNN model demonstrated enhanced performance when applied to canonical EEG rhythms extracted from the original EEG signals, with certain rhythms standing out. Specifically, alpha (8-13 Hz) and beta (13-30 Hz) rhythms yielded results comparable to those of the original EEG, boasting an MAE of 15.64 ± 0.29 years and 15.99 ± 0.42 years, respectively. Furthermore, correlation coefficients of 0.51 ± 0.02 and 0.49 ± 0.02 were noted for alpha and beta rhythms, respectively. Other rhythms, namely delta, theta, and lower gamma, fell short in performance relative to alpha and beta rhythms. The alpha rhythm's scatter plot is depicted in Fig. 8b. A comparative representation of the performance metrics across all EEG rhythms is consolidated in Table II.

D. MIMF Bands
The CNN model's performance varied across the MIMF bands, MIMF1 through MIMF6, extracted from the broad-spectrum EEG. Superior performance was observed with the MIMF2, MIMF3, and MIMF4 bands when compared to the broad-spectrum EEG, while the MIMF1, MIMF5, and MIMF6 bands demonstrated performance levels akin to those of the original EEG.
Figure 8c shows a scatter plot illustrating the correlation between actual age and predicted age for the MIMF3 band.
A summary of performance metrics for all MIMF bands is presented in Table II.

V. DISCUSSION
Our study explored the hypothesis that breaking down EEG signals into narrow-band components could enhance the efficacy of deep learning models, particularly those that maintain the temporal structure of EEG recordings. To this end, we developed an ad-hoc CNN model applied to predict brain age from routine clinical EEG. Our findings revealed that a new method, which involves decomposing EEG recordings into MIMF components, significantly outperformed traditional approaches that rely on filtering EEG signals into canonical frequency bands such as delta, theta, alpha, beta, and lower gamma oscillations, as well as the use of broad-spectrum EEG signals. While not the initial focus of our investigation, we also successfully designed, implemented, and validated a CNN model capable of estimating patients' biological age based on multivariate EEG recordings, critically without compromising the temporal integrity of the EEG data.
Specifically, we used the MIF technique to extract MIMF bands from minimally preprocessed EEGs. We tested a range of inputs including the original broad-spectrum band-pass filtered EEG and the six MIMF bands. For a comprehensive comparison, we also tested our CNN model on EEG rhythms extracted from the original EEG, including delta, theta, alpha, beta, and lower gamma rhythms. We ensured robust testing of our approach by utilizing an extensive dataset of clinical EEGs, recorded from a large, diverse participant pool (n = 6540) spanning a wide age range (1 to 103 years).
Our approach focused on maintaining clinical relevance by preserving EEGs in the time domain, facilitating simplified brain analysis. To enhance performance within this constraint, we employed MIF-based feature extraction. This method has an advantage over alternatives like spectrogram computation, which require complex processing and could complicate clinical decision-making. We deployed our CNN model as a regressor to predict brain age, ensuring both accuracy and clinician-friendly results.
Our work's key message concerns the choice of knowledge representation (features) based on real-world routine clinical EEG data. We wanted to preserve the dynamics and original structure of the EEG signal. MIMF bands obtained from the MIF technique proved to be a promising knowledge representation method for our case. Within this context, our requirement was to have a model tailored to time series analysis. Consequently, we proposed a CNN model specifically designed for the task of brain age prediction.
The performance of our CNN architecture on the original EEG is better than that of another study performed on the same dataset [18]. For the EEG rhythms extracted from the EEG recordings, the performance of alpha and beta rhythms was comparable to that of the original EEG, whereas the performance of other rhythms, namely delta, gamma, and theta, was inferior to that of the alpha and beta rhythms. This suggests that alpha (8-13 Hz) and beta (13-30 Hz) rhythms incorporate sensitive markers for estimating brain age.
For the six MIMF bands, MIMF1, MIMF2, MIMF3, MIMF4, MIMF5, and MIMF6, the performance with the CNN architecture was either comparable or superior to the performance of the original EEG and of the study performed in [18]. The best performance was obtained with the MIMF3 band, with an MAE of 13.76 ± 0.33 years and a Pearson correlation coefficient of 0.64 ± 0.01. The performance of the MIMF2 and MIMF4 bands was also strong, with correlation coefficients of 0.62 ± 0.01 and 0.58 ± 0.01, respectively. The other three MIMF bands, i.e., MIMF1, MIMF5, and MIMF6, had a performance similar to that of the original EEG. This suggests that the MIMF2, MIMF3, and MIMF4 bands hold important hidden information about the nature of a participant's brain signals. These MIMF bands, along with the CNN architecture, may prove important for estimating the brain age of a participant based on the participant's EEG, which is a significant biomarker for analyzing the health of an individual's brain.
The results show that the MIMFs (MIMF2, MIMF3, and MIMF4) performed better than predefined EEG rhythms like alpha and theta. MIF-based adaptive extraction of frequency bands is more suitable than using predefined bands. MIF-based adaptive decomposition accounts for the variability of EEG signals in different participants, even at different times. Figure 9 shows the comparison of p-values. We essentially compare the performance of the CNN model across 12 different inputs. For that, we ran a series of pairwise Mann-Whitney U tests [34] for all the combinations among the 12 different inputs. A very low p-value indicates that the difference in performance between the two inputs in a pair is significant. As evident from the figure, the p-values for MIMF2 and MIMF3 are the lowest, showing a statistically significant difference in performance as compared to the other inputs, supported by the metrics in Table II.
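A pairwise comparison of this kind can be sketched as below. The sketch is a simplified stand-in rather than the analysis code behind Fig. 9: it implements the two-sided Mann-Whitney U test with the normal approximation and without tie or continuity corrections, and the per-run values are synthetic placeholders. With 12 inputs there are 66 unordered pairs to compare. The function names are hypothetical.

```python
import math
from itertools import combinations

import numpy as np


def mann_whitney_u(a, b):
    """Two-sided Mann-Whitney U test via the normal approximation (no tie correction)."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    n1, n2 = len(a), len(b)
    # U counts the pairs with a > b, with ties counted as one half.
    diff = a[:, None] - b[None, :]
    u1 = np.sum(diff > 0) + 0.5 * np.sum(diff == 0)
    u = min(u1, n1 * n2 - u1)
    mean = n1 * n2 / 2.0
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2.0))   # two-sided tail probability
    return float(u), min(p, 1.0)


def pairwise_p_values(results):
    """results maps input name -> per-run metric values; returns {(name_a, name_b): p}."""
    return {
        (a, b): mann_whitney_u(results[a], results[b])[1]
        for a, b in combinations(results, 2)
    }


# Synthetic per-run MAE values (placeholders, not study data), 50 runs per input.
results = {
    "input_a": np.linspace(13.0, 14.5, 50),
    "input_b": np.linspace(15.0, 17.0, 50),
    "input_c": np.linspace(15.0, 17.0, 50),
}
p_values = pairwise_p_values(results)
```

In this toy example the fully separated pairs give very small p-values, whereas the two identical inputs give a p-value of 1.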
In Table III, we compared the performance of our approach with a few other studies on predicting brain age from EEGs. The study [19] computes five different sets of features extracted from EEGs, including amplitude, range, spectral, connectivity, and fractal dimension domain features. They obtained an MAE of 6.9 years. However, the number of participants they used is relatively small (n = 468). The study [2] used a feature extraction technique based on their previous work [24]. They obtained an MAE of 7.6 years for healthy participants. However, the dataset used in that study consisted of only sleep data, which needs to be recorded for the entire sleep duration. In contrast, our dataset is highly diverse in nature, with a variety of participants and with the recordings being collected at different times. In addition, that study uses multiple epochs from a single EEG, whereas we extract only a single epoch corresponding to each participant's EEG. The authors in [18] utilize a byte-pair encoding-based feature extraction method for extracting features from EEGs by treating them as unstructured data. They used the same dataset as used in this study. However, their MAE was relatively high (15.7 years) as compared to the MAE of 13.76 years obtained in the present study.

A. Limitations of This Study
The modest performance of our method, as discussed in Table III, can be attributed to the diverse nature of our EEG dataset, which was collected from real clinical settings. Variability arose from EEGs being recorded by different technicians across various stations, with inconsistent placement of the reference electrode. Our analysis encompassed all EEG scans recorded for diagnostic purposes in a single hospital, including both in-patients and out-patients, leading to a highly varied sample population in terms of diagnoses, comorbidity levels, and medication use, which is known to influence EEG dynamics [35]. Furthermore, the EEG scans were not categorized into normal or abnormal groups upfront, complicating the identification of EEG status due to reliance on non-structured neurological reports.

TABLE III COMPARISON OF ACCURACY OF BRAIN AGE PREDICTION FRAMEWORK WITH OTHER WORKS

B. Advantages of Our Approach
The most important highlight of our work is that it was done in the context of routine clinical EEG from both in-patients and out-patients of a hospital, representing a real-world clinical decision-making scenario. Using MIF for extracting features, we still preserved the time-domain dynamics of the EEGs, hence making brain analysis simpler and maintaining clinical relevance. Additionally, despite all the confounding factors such as different EEG technicians, EEG stations, EEG reference, and medications, the performance based on MIMF bands generated using the MIF technique shows significant improvement as compared to the classical canonical frequency band-based features [35]. The MIMF bands have demonstrated superior performance on different tasks associated with multivariate EEG scans [4], [32]. Here, we used the MIMFs obtained from our EEG dataset as a direct input to our CNN model to predict the brain age of the participant. The performance metrics presented in Table II demonstrate that the MIMF2, MIMF3, and MIMF4 bands show superior performance relative to the original EEG and EEG rhythms. Though not the main focus of our work, we devised an ad-hoc CNN architecture for predicting age from a person's EEG scans. Since our input is a time series with spatial dependencies between electrodes positioned over the scalp, the use of a CNN can be useful as it can capture both spatial (effect of positioning of electrodes over the head) and temporal (time variations in the EEG signals) dependencies present in the data. This further opens up the possibility of using CNNs and MIMF bands for different regression tasks using physiological signals such as ECG, electromyogram (EMG), and EOG. A few popular deep learning architectures have been developed specifically for the EEG paradigm, such as EEGNet [36]. However, we recognize that EEGNet, while valuable, may not be the most appropriate choice for our study. EEGNet was primarily developed for event-related potentials (ERPs) with a relatively modest sample size. In contrast, our study involves a significantly larger sample size and focuses on routine clinical EEGs, which are essentially resting-state recordings. Our approach prioritizes knowledge representation through MIF over canonical frequencies and places less emphasis on the specific models used for feature comparison.
Our study acknowledges the high heterogeneity of clinical data collected in real-world settings. While this variability might initially appear as a drawback, it also serves as an asset in mitigating machine learning model drift. The conservative nature of clinical EEG recording and evaluation processes, if constrained to laboratory-setting norms, would limit the broader applicability of our findings. Yet, embracing this data diversity is crucial for minimizing algorithmic bias, thereby fostering the development of ethical artificial intelligence techniques to enhance healthcare efficiency.
The methodology proposed in this study addresses EEG data and holds potential for analyzing other biological signals, such as EOG and ECG.This broader applicability could culminate in a comprehensive medical decision-making system, offering significant benefits to clinicians and patients alike by facilitating rapid and accurate disease diagnosis.

VI. CONCLUSION
In this study, we compared two different knowledge representation techniques, learning features from minimally preprocessed routine clinical EEG data, while preserving the temporal structure of the data. These features were then coupled with a CNN model to predict patients' age across the entire life span. Our methodology specifically involved decomposing EEG signals into a set of amplitude-frequency modulated functions termed MIMFs. These functions served as inputs for our custom-built CNN. Performance of the CNN model was evaluated against more traditional EEG approaches, such as broad-spectrum band-pass filtered EEG and EEG rhythms falling within canonical frequency bands. Our work demonstrated that MIMF features can enhance neurophysiological evaluation with respect to real-world clinical data as compared to the traditional canonical filtering techniques. Our proposed framework demonstrated improved efficacy for brain age prediction, with potential applications for assessing age-related neurological disorders such as Parkinson's and Alzheimer's disease. The results also illuminate a future direction for combining CNN architecture and MIMF band-based features to analyze a range of physiological signals.

Fig. 1. Block diagram of the proposed approach for brain age prediction from EEG.

Fig. 2. Age distribution of participants present in the dataset.

Fig. 3. Schematic of the positioning of electrodes used in the EEG dataset.

Fig. 8. Scatter plots for actual age versus the predicted age for the: (a) Original EEG, (b) Alpha rhythm, and (c) MIMF3 band.

Fig. 9. Comparison of p-values for different inputs used for CNN. We compared the performance of the model across 12 different inputs to our CNN. We ran a series of t-tests for comparing the performance pairwise across 12 inputs. We summarize all those pairwise performance comparisons in the form of this heatmap. A lower p-value represents a more significant difference in the performances of the two inputs considered in the comparison. As evident from the figure, MIMF2 and MIMF3 have the lowest p-values, supporting the results presented in Table II.

TABLE II PERFORMANCE COMPARISON OF OUR CNN MODEL ON ORIGINAL EEG, EEG RHYTHMS, AND MIMF BANDS