Depression Identification Using EEG Signals via a Hybrid of LSTM and Spiking Neural Networks

Depression severity can be classified into distinct phases based on the Beck depression inventory (BDI) test scores, a subjective questionnaire. However, quantitative assessment of depression may be attained through the examination and categorization of electroencephalography (EEG) signals. Spiking neural networks (SNNs), as the third generation of neural networks, incorporate biologically realistic algorithms, making them ideal for mimicking internal brain activities while processing EEG signals. This study introduces a novel framework that, for the first time, combines an SNN architecture and a long short-term memory (LSTM) structure to model the brain’s underlying structures during different stages of depression and effectively classify individual depression levels using raw EEG signals. By employing a brain-inspired SNN model, our research provides fresh perspectives and advances knowledge of the neurological mechanisms underlying different levels of depression. The methodology employed in this study uses the spike-timing-dependent plasticity (STDP) learning rule within a 3-dimensional brain-template structured SNN model. Furthermore, it encompasses the tasks of classifying and predicting individual outcomes, visually representing the structural alterations in the brain linked to the anticipated outcomes, and offering interpretations of the findings. Notably, our method achieves high classification accuracy, with average rates of 98% and 96% for eyes-closed and eyes-open states, respectively. These results significantly outperform state-of-the-art deep learning methods.


I. INTRODUCTION
DEPRESSION is a prevalent and serious mental disorder affecting 280 million people worldwide [1], which extensively influences an individual's quality of life. This significant public health concern impacts an individual's physical and mental welfare in various aspects, such as alterations in appetite, diminished motivation and interest, irregular sleep patterns, and, in severe instances, contemplation of suicide. Early diagnosis of depression and treatment can prevent patients' conditions from worsening [2]. The revised Beck depression inventory (BDI-II) stands out as one of the most extensively employed psychometric assessments for quantifying the extent of depression [3]. Comprising a 21-question multiple-choice self-report inventory, this assessment method examines characteristic attitudes and symptoms associated with depression. This test's overall score, which ranges from 0 to 63, can classify the severity of depression into four groups [4]. It is important to acknowledge that the BDI-II test lacks a robust physiological underpinning and is qualitative in essence. Given that depression influences the neurotransmitter release within the human brain, it is reasonable to hypothesize that it also impacts the electrical neuronal activity captured through electroencephalography (EEG). EEG captures rich temporal data and offers reasonable spatial resolution, particularly when recorded with a larger number of electrodes, such as 64, although it does not match the precision of MRI images [5]. This brain activity-related dataset is amenable to evaluation and interpretation through diverse machine learning methodologies. Analysis of EEG is used for the diagnosis of different neuropsychiatric disorders such as schizophrenia [6], [7], [8], Alzheimer's disease [9], ADHD [10], [11], dementia [12], brain fatigue [13], [14], sleep disorders [15], [16], bipolar manic depression (BMD) [17], and seizure [18].
To differentiate between individuals with depression and those without, some studies first extract characteristics from the raw data and then input these features into machine learning (ML) models and artificial neural networks (ANNs). For instance, in [19], the authors extracted Higuchi's fractal dimension (HFD) and sample entropy (SampEn) features, applying them to seven machine learning algorithms, including Multilayer Perceptron and Logistic Regression. Their reported average classification accuracy was 93.5% for distinguishing depressed and healthy individuals. In another study conducted by Puthankattil and Joseph [20], several EEG features such as relative wavelet energy (RWE) and sample entropy were extracted and given to a two-layer feedforward ANN, achieving a classification accuracy of 98.11%. Acharya et al. [21] extracted a number of nonlinear features, such as detrended fluctuation analysis (DFA), fractal dimension, higher order spectra (HOS), Hurst's exponent (HE), largest Lyapunov exponent (LLE), recurrence quantification analysis (RQA), and SampEn, and fed them into five different classifiers. Using a support vector machine (SVM) classifier, they reported an average accuracy of 98%. Raw EEG signals have also been widely employed for classification using artificial neural networks. For example, Acharya et al. [22] applied raw EEG signals to a 13-layer convolutional neural network (CNN) and obtained classification accuracies of 93.5% and 96% over the left and right hemispheres, respectively. Xia et al. [23] proposed an end-to-end integrated deep-learning model for classifying major depressive disorder (MDD) patients and healthy controls using raw EEG data. They achieved 91.06% average accuracy. In another approach, Hashempour et al. [24] introduced a hybrid convolutional and temporal-convolutional neural network (CNN-TCN) to estimate the BDI score from raw EEG signals in a continuous manner. Their method achieved a mean squared error (MSE) of 5.64±1.6 and a mean absolute error (MAE) of 1.73±0.27 for the eyes-open state, as well as an MSE of 9.53±2.94 and an MAE of 2.32±0.35 for the eyes-closed state.
Moreover, in [25], researchers introduced "DeprNet," a deep learning-based CNN, achieving an accuracy of 0.91 and an AUC of 0.95 in the classification of EEG data from both depressed and normal subjects. Notably, their visualization of the final CNN layer revealed that right electrodes had higher prominence in depressed subjects, while left electrodes exhibited greater prominence in normal subjects. In [26], an automatic feature extraction method was employed using the Node2vec framework. This approach offered three fusion strategies: graph-level, feature-level, and decision-level fusion, with a peak accuracy of 93.3% attained with decision-level fusion. In another study [27], a technique for extracting features from EEG signal channels was developed, and these features were integrated using a fuzzy ensemble strategy. The K-Nearest Neighbor classifier delivered the highest classification accuracy on the three datasets, with accuracy scores of 91%, 96%, and 94%. Furthermore, [28] introduced a dataset and employed traditional supervised machine learning algorithms to differentiate between healthy subjects and those with depression. Notably, the XGBoost classifier demonstrated the best performance, achieving an 87% accuracy rate for the eyes-open (EO) state. These studies collectively exemplify inventive methodologies and robust classification accuracy in the domain of depression detection.
Although the mentioned methodologies have achieved commendable classification accuracy, none of them offered model interpretations that could facilitate the identification and comprehension of the brain mechanisms linked to depression. While deep learning techniques draw inspiration from certain observed properties in brain research [29], [30], the latest generation of ANNs, called spiking neural networks (SNNs), exhibits a greater degree of biological realism [31].
SNNs, as computational models, encompass spiking neurons as processing components, interconnected by biologically feasible learning algorithms [32], [33], [34]. SNNs inspired by the brain have found utility across diverse domains, including but not limited to forecasting [35], simulation of the impact of mindfulness on individuals with depression [36], real-world data classification, image recognition, odor recognition, motor control, trajectory tracking, and more. In 2014, Kasabov et al. introduced an SNN architecture called NeuCube [37], designed to facilitate effective learning, modeling, and classification of spatiotemporal brain data (STBD). Shah et al. [38] employed the NeuCube SNN architecture to model and visualize brain activity in individuals displaying symptoms of depression. They utilized the dynamic evolving spiking neural network method (deSNN) for classification, achieving an accuracy of 68.18% for the eyes-open state and 72.13% for the eyes-closed state. Despite the utilization of brain-inspired SNNs for diverse STBD modeling applications, a proficient supervised model for classifying NeuCube's output results has yet to be introduced.
Despite growing interest in processing EEG patterns for assessing depression, few studies have scored the degree of depression using EEG and, more importantly, identified the brain mechanisms associated with different degrees of depression. Addressing these critical aspects would motivate further development of this field. To achieve this goal, we have utilized an extensive dataset comprising EEG signals from 119 participants who underwent the Beck test and were stratified into four depression levels: minimum, mild, moderate, and severe. To estimate the depression level, we present a novel methodology, combining a brain-inspired SNN architecture with an LSTM neural network to model, visualize, learn, compare, and classify the subjects' EEG signals. To compare the results of our method, we have also applied the raw EEG to a CNN-TCN, a CNN-LSTM, and a 13-layer CNN network.
The organization of this paper is outlined as follows: Section II provides an overview of the dataset employed in this study, followed by a concise introduction to SNNs. Subsequently, we introduce a hybrid network that merges an SNN architecture with an LSTM model. Section III presents the visualization of the simulated network, accompanied by an analysis of the underlying brain structures associated with depression. We present the empirical results, compare them to state-of-the-art methods, and evaluate their respective strengths and limitations. Finally, in Section IV, we conclude our study and outline avenues for future research.

A. Dataset Description and Preparation
1) Participants: We utilized an openly available dataset from the PRED+CT website [39], initially comprising EEG recordings from 121 participants, including 72 females and 49 males. Subsequently, two subjects with incomplete practical information were identified and excluded from the analysis. All participants granted written informed consent, a protocol duly sanctioned by the University of Arizona's ethics review process. The recruitment process involved enrolling individuals from introductory psychology courses, with the selection based on their scores in the BDI mass survey. The eligibility criteria encompassed factors such as (a) age range of 18 to 25 years, (b) absence of any history of head trauma or seizures, and (c) no ongoing utilization of psychoactive medications [40]. The recruited subjects exhibited diverse levels of depression. Among the enrolled participants, 76 had a Beck score ranging from 0 to 13, classifying them into the control group (minimum depression). Furthermore, 14 participants received scores ranging from 14 to 19 (indicating mild depression), 24 subjects attained scores between 20 and 28 (reflecting moderate depression), and 5 individuals scored within the range of 29 to 63 (indicative of severe depression) [41].
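The score ranges above can be written as a small lookup. The following is a minimal sketch: the function name `bdi_level` and the dictionary `group_sizes` are illustrative names, not part of the dataset or of the authors' code, while the cut-offs and group counts are the ones stated in the text.

```python
def bdi_level(score):
    """Map a BDI-II total score (0-63) to one of the four levels used in the text."""
    if not 0 <= score <= 63:
        raise ValueError("BDI-II total scores range from 0 to 63")
    if score <= 13:
        return "minimum"
    if score <= 19:
        return "mild"
    if score <= 28:
        return "moderate"
    return "severe"


# Group sizes reported for the retained participants (121 recruited, 2 excluded).
group_sizes = {"minimum": 76, "mild": 14, "moderate": 24, "severe": 5}
total_participants = sum(group_sizes.values())
```

The four groups add up to the 119 retained participants, and the smallest group (severe, 5 subjects) is the one that later drives class balancing.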
2) Data Acquisition and Preprocessing: The dataset comprised 500 seconds of recorded signals, acquired through 64 channels along with two additional channels, HEOG and VEOG, following electrode settings aligned with the 10-20 standard EEG recording system [42]. The signals were recorded during a resting state, utilizing a sampling frequency of 500 Hz. The last two channels, along with the 'CB1' and 'CB2' channels, are dropped, and 'M1' and 'M2' are set as reference channels. This results in a total of 62 proper scalp channels. The recording paradigm encompassed events of both eyes-open and eyes-closed conditions, exhibiting varying durations for different individuals. Consequently, the EEG data has been segregated into two distinct datasets: eyes-open and eyes-closed resting states.
In the initial stage, the EEG signal of each individual is partitioned into distinct event points. There are a total of 12 unique events within the signal. Due to variations in the number of occurrences for each event across different subjects, differing quantities of segments are generated for each unique event. In pursuit of dataset balance, we homogenize the segment count for each distinct event to align with the minimum segment count of 120. The raw EEG signals are preprocessed before being fed to the model, in several steps. In the preliminary phase, we downsample the EEG signals by a factor of two (from 500 Hz to 250 Hz) to reduce the data volume; the resulting Nyquist frequency of 125 Hz remains well above the 50 Hz upper cutoff used in the subsequent filtering. Subsequently, the signal baselines are eliminated. Following this, a 50 Hz notch filter is employed, as outlined in [43] and [44], to counteract the power line interference. The signals then undergo bandpass filtering, with cutoff frequencies set at 0.2 Hz and 50 Hz. Next, the signals are processed using a fifth-order Butterworth filter with a high-cut at 50 Hz and a low-cut at 1 Hz. Finally, the filtered EEGs are passed through independent component analysis (ICA) to remove any remaining undesirable components. This study utilizes the MNE-python software [45] to mitigate data contamination, primarily through a semiautomated ICA approach. In this context, we employ FastICA due to its notable speed advantages over traditional ICA methods and its capability to accommodate non-Gaussianity. The procedure involves principal component analysis (PCA) for whitening the mixtures and ICA for decomposition. It is important to note that HEOG and VEOG channels are initially dropped from the analysis as they are not used for artifact removal. The artifacts, which mainly include eye artifacts (such as blinks and eye movements), muscle artifacts, heart artifacts (ECG), and other non-neural artifacts, are eliminated using FastICA. The final step involves back-projecting the remaining ICA components into the channel space.
In the featured dataset, the training samples are not equally distributed across the target classes. Therefore, we employ an undersampling technique in order to prevent the model from being biased toward the class that has a larger number of training cases, which would reduce the model's predictive ability. To achieve this, we first select two minutes (equivalent to 30000 samples) of the EEG signal from both the eyes-open and eyes-closed states for each subject. Secondly, the signals are divided into five-second windows (1250 samples), with each window having a 90% overlap. The 5-second window size has been chosen to accommodate SNNs that learn from spike occurrences. In the absence of specific cognitive tasks during data acquisition, this extended window supports more effective unsupervised learning through spike-timing-dependent plasticity (STDP) within the SNN reservoir. It allows for the capturing of subtle temporal patterns and enhances the modeling of spatiotemporal EEG patterns, aligning with the network's spike-driven processing. The data is then balanced across all depression levels based on the number of individuals. As a result, within each window, the data point count for depression classes is normalized to align with the count of the class possessing the smallest data point size. A data matrix with the dimensions (4554, 1250, 62) and float type values is the result of this windowing operation.
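The windowing arithmetic can be checked with a short sketch. It assumes the 250 Hz rate implied by 30000 samples in two minutes and a window step of 10% of the window length; the helper name `window_starts` is hypothetical, and how the authors treat a trailing partial window or reach the final count of 4554 samples after balancing is not stated in the text.

```python
def window_starts(n_samples, window, overlap):
    """Start indices of fixed-length windows with the given fractional overlap."""
    step = round(window * (1.0 - overlap))
    if step < 1:
        raise ValueError("overlap leaves no room to advance the window")
    # Only windows that fit entirely inside the recording are kept (assumption).
    return list(range(0, n_samples - window + 1, step))


FS = 250                      # Hz, after downsampling the 500 Hz recording by 2
N_SAMPLES = 120 * FS          # two minutes -> 30000 samples
WINDOW = 5 * FS               # five seconds -> 1250 samples

starts = window_starts(N_SAMPLES, WINDOW, overlap=0.9)
windows_per_recording = len(starts)
```

Under these assumptions, consecutive windows are 125 samples (0.5 s) apart and a two-minute recording yields 231 windows per subject and condition before class balancing.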

B. Spiking Neural Networks
In accordance with the all-or-none principle, information within human brains is encoded through distinct events referred to as action potentials or spikes. A neuron generates a spike when its cumulative potential surpasses a predetermined threshold; otherwise, it remains inactive. Information regarding external stimuli and other internal computations is carried by the timing of spiking, the neuron's location, the neurons' firing rate, and the temporal patterns. Due to their binary information processing, SNNs maintain an advantage in energy efficiency and effectiveness over conventional ANNs [46], [47]. Incorporating a more biologically realistic neuron model compared to traditional ANNs [48], SNNs, as the third generation of neural network models, mimic the mechanisms of the brain's neurons. This similarity makes SNNs particularly well-suited for the analysis and modeling of EEG data. Operating across a multitude of spiking neurons, this model effectively processes dynamic input information. The leaky integrate-and-fire (LIF) model, a representation of a spiking neuron, can be employed to emulate each neuron within the SNN model [49]. Notably, temporal dynamics are integrated into the operations, alongside the synaptic states of the neurons. This temporal consideration aligns well with scenarios where the timing of input signals is the main concern [50]. Consequently, SNNs emerge as an apt approach for applications involving STBD analysis, including EEG and fMRI [51].
LIF neurons represent the predominant neuronal model employed within SNNs. The model describes the behavior of a neuron as it integrates incoming signals and, when a certain membrane potential threshold is reached, fires an action potential. The LIF neuron model can be mathematically described by the following equation:
$$\tau \frac{dV(t)}{dt} = -\left(V(t) - V_{\text{rest}}\right) + R\,I(t)$$
where $V(t)$ is the membrane potential of the neuron at time $t$, $\tau$ is the time constant of the neuron's membrane, $V_{\text{rest}}$ is the resting membrane potential, $R$ is the membrane resistance, and $I(t)$ is the input current. When the membrane potential $V(t)$ crosses a predefined threshold, the neuron fires, resetting $V(t)$.
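The LIF dynamics can be illustrated with a forward-Euler simulation. This is a minimal sketch, assuming the standard form τ dV/dt = −(V − V_rest) + R·I(t); the parameter values, the threshold name `v_thresh`, and the choice to reset to `v_rest` after a spike are illustrative assumptions, since the text does not give numerical values or the reset level.

```python
def simulate_lif(current, dt=1.0, tau=10.0, v_rest=0.0, resistance=1.0, v_thresh=1.0):
    """Forward-Euler integration of tau * dV/dt = -(V - v_rest) + R * I(t)."""
    v = v_rest
    trace, spikes = [], []
    for i_t in current:
        v += (dt / tau) * (-(v - v_rest) + resistance * i_t)
        fired = v >= v_thresh
        if fired:
            v = v_rest          # reset level is an assumption of this sketch
        trace.append(v)
        spikes.append(fired)
    return trace, spikes


# A constant drive whose steady state (R * I = 1.5) lies above threshold fires
# repeatedly; a weaker drive (R * I = 0.5) saturates below threshold and stays silent.
trace, spikes = simulate_lif([1.5] * 100)
silent_trace, silent_spikes = simulate_lif([0.5] * 100)
```

The two runs show the all-or-none behaviour described above: the leak term pulls the potential toward rest, so only sufficiently strong or sustained input drives the neuron across the threshold.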
This model provides a basic representation of how neurons integrate and transmit signals in the brain. An illustration of the LIF neuron's function is shown in Fig. 1. The STDP rule [64] is a fundamental learning mechanism in SNNs. It governs the adjustment of synaptic weights based on the precise timing of spikes between pre-synaptic and post-synaptic neurons. STDP is inspired by the biological mechanism of synaptic plasticity, where the strength of synaptic connections is modified in response to the timing of neuronal spikes. The STDP rule can be mathematically represented as follows:
$$\Delta w = \begin{cases} A_{LTP}\, e^{-\Delta t/\tau_{LTP}}, & \Delta t > 0 \\ -A_{LTD}\, e^{\Delta t/\tau_{LTD}}, & \Delta t < 0 \end{cases}$$
where $\Delta w$ represents the change in synaptic weight, $\Delta t = t_{\text{post}} - t_{\text{pre}}$ is the time interval between the pre-synaptic and post-synaptic spikes, $A_{LTP}$ and $A_{LTD}$ are positive constants that control the magnitude of long-term potentiation (LTP) and long-term depression (LTD), respectively, and $\tau_{LTP}$ and $\tau_{LTD}$ are time constants that determine the rate of weight changes during the LTP and LTD phases. STDP allows SNNs to adapt their synaptic connections based on the temporal order of spikes, which is essential for various cognitive and computational tasks.
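The exponential STDP window described above can be sketched as a single function. The sketch assumes the common pair-based form in which Δt is the post-synaptic spike time minus the pre-synaptic spike time; the default amplitudes and time constants are placeholders, not the settings of this study.

```python
import math


def stdp_delta_w(delta_t, a_ltp=0.01, a_ltd=0.012, tau_ltp=20.0, tau_ltd=20.0):
    """Pair-based STDP weight change for delta_t = t_post - t_pre."""
    if delta_t > 0:      # pre fired before post: long-term potentiation
        return a_ltp * math.exp(-delta_t / tau_ltp)
    if delta_t < 0:      # post fired before pre: long-term depression
        return -a_ltd * math.exp(delta_t / tau_ltd)
    return 0.0           # simultaneous spikes: left unchanged in this sketch


# The magnitude of the change decays as the two spikes move further apart in time.
close_pair = stdp_delta_w(5.0)
distant_pair = stdp_delta_w(50.0)
reversed_pair = stdp_delta_w(-5.0)
```

Closely timed pre-before-post pairs therefore strengthen a synapse the most, while the reversed order weakens it.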
The depression identification method proposed in this study is built upon a customized and improved version of the NeuCube framework [37]. This adaptation has been tuned to address the specific requirements of the task at hand, particularly focusing on enhancing the classification component.

C. The Proposed SNN Architecture in Combination With an LSTM Network
Our proposed SNN architecture functions as a spatiotemporal machine, employing a brain-inspired spiking neural network design. Its overarching objectives encompass knowledge extraction, STBD learning modeling, and investigation into the neurological mechanisms underpinning data generation [52], [53]. In our proposed approach, the 3D SNN reservoir (SNNr) module is merged with an LSTM network, enhancing the comprehension and classification of depression. In Fig. 2, we present a diagram illustrating the consecutive steps of our proposed method. The subsequent steps pertain to the modeling phase.
1) Spike Encoding: An SNN-based architecture processes information through binary spiking events. Accordingly, the initial step is to encode all continuous variables into spike trains. In this context, our focus lies specifically on temporal spike encoding techniques, wherein spike timings signify alterations in the signal's value over time [54]. This strategy is motivated by the biologically tenable hypothesis that information is encoded by accurate relative spike timing [55]. The majority of widely used encoding algorithms [56] revolve around monitoring temporal signal changes, subsequently represented through the exact timing of spikes. Examples include the threshold-based representation (TBR) algorithm, step-forward (SF) encoding, moving-window (MW) encoding, and Ben's spiker algorithm (BSA) [57].
In this research, we employ the address event representation (AER) approach, a simplified adaptation of the TBR technique, to transform EEG data into spike trains [58], [59]. This approach proves particularly effective for data streams, as is the case with EEG signals. It hinges on the principle of thresholding the rate of change of an input variable over time [60]. Notably, each of the 62 input data channels is assigned an individual variable threshold value, which forms the core of the algorithm. To account for the potential divergence in signal dynamics and value ranges across input channels, a variable threshold array $VT$ is computed with one entry $VT(k)$ per channel, where $k$ ranges from 1 to the number of channels $N_{\text{channels}} = 62$, $T$ is the signal length, and $N$ represents the number of samples. $X$ is a $(T \times N_{\text{channels}} \times N)$ data matrix and $VT$ represents the resulting variable threshold array. When the rate of signal change in the $k$th input channel exceeds its variable threshold, a positive spike is emitted. Conversely, when the rate of change breaches the variable threshold in the descending (negative) direction, a negative spike is generated. Algorithm 1 details each step of this process.
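The thresholding step can be illustrated on a single channel. This is a sketch under stated assumptions: the per-channel threshold is taken here as the mean absolute first difference of the signal, which is a stand-in chosen for illustration because the exact threshold formula is not reproduced in this text; the function names `mean_abs_diff` and `encode_spikes` are hypothetical.

```python
def mean_abs_diff(signal):
    """Illustrative threshold: mean absolute change between consecutive samples."""
    diffs = [abs(b - a) for a, b in zip(signal, signal[1:])]
    return sum(diffs) / len(diffs)


def encode_spikes(signal, threshold):
    """Emit +1 / -1 when the sample-to-sample change exceeds the threshold."""
    spikes = [0] * len(signal)
    for j in range(1, len(signal)):
        change = signal[j] - signal[j - 1]
        if change > threshold:
            spikes[j] = 1        # steep rise: positive spike
        elif change < -threshold:
            spikes[j] = -1       # steep fall: negative spike
    return spikes


# One toy channel with a sharp rise followed by a sharp fall.
signal = [0.0, 0.1, 0.9, 1.0, 0.2, 0.1, 0.1]
threshold = mean_abs_diff(signal)
spikes = encode_spikes(signal, threshold)
```

In a multichannel setting the same two functions would be applied per channel, so that a quiet channel and a high-amplitude channel each receive a threshold matched to their own dynamics.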
2) The 3D SNNr and Input Mapping: The 3D SNN reservoir (SNNr) module essentially constitutes an assemblage of spiking neurons positioned spatially, with well-defined coordinates for the input neurons. This structure is designed to mimic the configuration of neurons within the brain. Here, we have implemented an SNNr with $N_{\text{reservoir}} = 1471$ leaky integrate-and-fire (LIF) neurons. These neurons are situated in accordance with the Talairach Atlas [61], [62], forming a cuboid shape resembling the human brain. Each neuron represents a 1 cm³ brain area. The number of channels resulting from the loaded dataset, which in this case is $N_{\text{channels}} = 62$, defines the number of input neurons. The coordinates of these input neurons are a subset of the SNNr coordinates. Using the Koessler et al. mapping method [63], the nearest neuron in the Talairach Atlas is allocated to the associated channel based on measurements of electrode placements, as shown in Fig. 3. Through the input neurons, the spike trains acquired following data encoding with the AER method are fed into the SNNr.
Fig. 3. Utilizing the Talairach template coordinates, the brain's three-dimensional coordinates are assigned to the designated spiking neurons within the SNNr. This procedure yields a three-dimensional SNNr structure that mimics the shape of the brain. Notably, red neurons (right) are designated as input neurons and correspondingly map to the positions of 62 EEG channels (left).
Algorithm 1 (spike encoding)
  ...
    VT(k) ← 0
    ...
      x ← channel k of the ith sample in X_input
      x_{T×1} ← |δx|
      ...
    end for
    VT(k) ← VT(k)/N
  end for
  for k = 1 to N_channels do
    ...
      x ← channel k of the ith sample in X_input
      x_{T×1} ← δx
      x_spike ← 0_{T×1}
      for j = 2 to T do
        ...
        x_spike(j) ← 1
        ...
      end for
    ...
  end for
An $N_{\text{reservoir}} \times N_{\text{reservoir}}$ matrix of distances, $L_{\text{dist}}$, is created, in which the L2 norm gives the distance between each pair of neurons. The "small-world" connectivity principle was selected for its biological basis: neighboring neurons become potentially coupled to one another. Small-world connectivity (SWC) introduces a parameter, the small-world radius (SWR), that defines the range within which neurons are linked. At the outset, all connections $C_{ij}$ among the neurons within the entire reservoir are initialized to 1. If $L_{\text{dist}}(i,j) > SWR$, the connection status between the two neurons is set to zero (disconnected). Each connection between neurons $i$ and $j$ designates $i$ as the pre-synaptic neuron and $j$ as the post-synaptic neuron. When a connection is bidirectional, we randomly assign a value of 1 to one direction and a value of 0 to the other, thereby preserving only one of the two. As a result, an SNNr with sparsely connected neurons is created. Our model's SNN initialization is carried out using the SWC connection rule with $SWR = 2.5$. After initialization, the weights $W_{ij}$ of the connections between the connected pairs of neurons $(i, j)$ are established according to (7), where rand(1) generates pseudorandom values drawn from the standard uniform distribution on the open interval (0, 1).
According to (7), the matrix $W$ is expected to contain around 70% positive and 30% negative weights. Algorithm 2 gives a step-by-step account of this procedure.
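The connectivity rule (distance cut-off at the small-world radius, then keeping only one direction per pair) can be sketched on a toy grid. The assumptions are loud ones: the coordinates are a hypothetical 4 × 4 × 4 lattice rather than the Talairach template, self-connections are excluded, and the weight assignment of (7) is deliberately not reproduced because its exact form is not available in this text.

```python
import itertools
import math
import random


def small_world_connections(coords, swr, rng):
    """Binary connection matrix: conn[i][j] == 1 means i is pre-synaptic to j."""
    n = len(coords)
    conn = [[0] * n for _ in range(n)]
    for i, j in itertools.combinations(range(n), 2):
        if math.dist(coords[i], coords[j]) <= swr:
            # A pair within the radius would be bidirectional; keep one direction
            # chosen at random so the resulting reservoir is sparsely connected.
            if rng.random() < 0.5:
                conn[i][j] = 1
            else:
                conn[j][i] = 1
    return conn


# Hypothetical stand-in for the brain template: 64 neurons on a unit-spaced grid.
coords = list(itertools.product(range(4), repeat=3))
conn = small_world_connections(coords, swr=2.5, rng=random.Random(0))
n_connections = sum(map(sum, conn))
```

With a radius of 2.5 grid units, each neuron is linked only to its spatial neighbourhood, which is the locality property the small-world rule is meant to impose.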
Algorithm 2 (SNNr initialization)
  ...
    for j = 1 to N_reservoir do
      ...
      if rand < 0.5 then
        ...
    end for
  end for
3) Unsupervised Learning in SNNr and Visualization: This methodology divides the learning process into two phases: unsupervised learning and supervised learning. Unsupervised learning adapts the initial connection weights of the SNNr as the model learns from the continuous EEG data presented in the form of spikes. The STDP rule, a biologically plausible unsupervised learning technique, is employed for this learning process. The STDP mechanism regulates the synaptic strength based on the temporal relationship between presynaptic and postsynaptic action potentials. The algorithm uses the following parameters:
• $N_{\text{iter}}$: number of training iterations
• $\beta$: spike generation threshold
• $\eta$: learning rate
• $R$: the resting period between spikes
• $D$: the leakage rate of neurons' potential while inactive
At each time step, $N_{\text{channels}}$ spike states are sent to the associated input neurons within the SNNr and potential propagations are computed. Take the $(i, j)$ neuron pair as an example, where $i$ represents the pre-synaptic neuron and $j$ represents the post-synaptic neuron. When presynaptic neuron $i$ fires and neuron $j$ is not in its refractory time, the potential of neuron $j$ is updated through the connection $W_{ij}$. At any given moment $t$, if the potential of a neuron $k$ surpasses the firing threshold potential $\beta$, the neuron fires, its potential $P_k(t)$ is reset to 0, and its refractory counter $R_k$ is set to $R$ (the refractory time). Conversely, if neuron $k$ fails to reach the firing threshold potential at time $t$, its potential is diminished by the leak rate $D$, and its refractory counter is updated.
TABLE I THE STDP MODEL PARAMETERS AND SETTINGS
Based on the STDP model grounded in the Hebbian learning rule, the connection weight between two neurons increases when a presynaptic neuron fires immediately before a postsynaptic neuron, and decreases in the opposite case. This study employs a modified version of the STDP model.
Following the general STDP principle, the connection weight $W_{ij}$ is modified in two cases. When presynaptic neuron $i$ fires at time $t$ and postsynaptic neuron $j$ has most recently fired at $t_j^f$, the postsynaptic spike preceded the presynaptic one, so the weight is decreased. On the other hand, when postsynaptic neuron $j$ fires at time $t$ and presynaptic neuron $i$ has most recently fired at $t_i^f$, the presynaptic spike came first, so the weight is increased. At the start of each subsequent training cycle $n_{\text{iter}}$ (of $N_{\text{iter}}$), the learning rate $\eta$ is adjusted to $\eta/\sqrt{n_{\text{iter}}}$. It is essential to emphasize that all time instances involved in the learning algorithm are discrete. The parameter settings for the STDP method, as presented in Table I, were determined by grid search. We established a parameter range guided by [65], aligning with the physiological ranges of actual neurons. Grid search was then employed to systematically investigate various parameter combinations, evaluating network performance. As network performance metrics, we employed statistical measures, including the mean, standard deviation, and coefficient of variation, to gauge synaptic weight and spiking rate balance during grid search. The reported values were selected as they yielded the most favorable results. After the training of the SNNr is done, an output matrix with boolean type values and a size of (4554, 1250, 1471) is obtained, which is used as input for the next part of our model. The output data has the same number of time steps as the input EEG data, but the number of channels has changed from 62 electrodes to 1471 neurons. Every neuron acts like an electrode in this situation; consequently, this data might be regarded as advanced EEG data. Algorithm 3 provides a step-by-step breakdown of this procedure.
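One discrete time step of the reservoir dynamics described in this subsection (spike propagation, threshold firing with reset, leak, and refractory counting) can be sketched as follows. The concrete update forms are assumptions made for illustration: a pre-synaptic spike adds the connection weight to the post-synaptic potential, the leak subtracts a fixed amount without letting the potential fall below zero, and the refractory counter decreases by one per step. The function name `reservoir_step` and its argument names are hypothetical, and the STDP weight update itself is omitted.

```python
def reservoir_step(potential, refractory, fired_prev, weights, beta, leak, rest):
    """Advance all neurons by one time step; returns the list of neurons that fired."""
    n = len(potential)
    # 1) Propagate spikes emitted at the previous step along existing connections,
    #    skipping post-synaptic neurons that are still refractory.
    for i in range(n):
        if fired_prev[i]:
            for j in range(n):
                if weights[i][j] != 0.0 and refractory[j] == 0:
                    potential[j] += weights[i][j]
    # 2) Fire and reset, or leak and count down the refractory period.
    fired = [False] * n
    for k in range(n):
        if potential[k] > beta:
            fired[k] = True
            potential[k] = 0.0
            refractory[k] = rest
        else:
            potential[k] = max(0.0, potential[k] - leak)
            refractory[k] = max(0, refractory[k] - 1)
    return fired


# Two neurons: neuron 0 acts as an input neuron driving neuron 1.
weights = [[0.0, 0.6], [0.0, 0.0]]
potential, refractory = [0.0, 0.0], [0, 0]
first = reservoir_step(potential, refractory, [True, False], weights, beta=0.5, leak=0.1, rest=2)
second = reservoir_step(potential, refractory, [True, False], weights, beta=0.5, leak=0.1, rest=2)
```

In the toy run, neuron 1 fires on the first step and then ignores the second input spike because it is still within its refractory period.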
Through unsupervised STDP learning, the unique spike trains in EEG STBD data are transformed into connections between neurons. These connections effectively capture the recurring patterns within the EEG data. Subsequently, these learned connections can be visually observed, represented graphically, and further scrutinized, enabling us to delve deeper into the data's underlying structure. Additionally, they allow comparative analyses of EEG data across diverse subject groups.
Algorithm 3 Unsupervised SNNr Weight Learning: STDP
  ...
    x ← all N_channels spikes of the ith sample in X_Spike
    ...
      for all j ∈ α do
        ...
      end for
    ...
  end for
4) Supervised Learning Using LSTM: Long short-term memory (LSTM) networks, which were introduced by Hochreiter and Schmidhuber in 1997 [66], have demonstrated their effectiveness in analyzing and interpreting EEG signals. These networks excel at capturing the temporal dependencies in EEG signals, effectively modeling both short-term and long-term patterns. By employing LSTM layers to process the sequential EEG data, the model can capture the temporal dynamics within each channel and the interdependencies among different channels. This capability allows LSTMs to leverage the spatial patterns and relationships in EEG signals, resulting in improved classification performance by considering the holistic information in the multichannel EEG data. LSTMs, as a special type of recurrent neural network (RNN), were explicitly designed to address the challenge of long-term dependency in RNNs. Traditional RNNs trained through back-propagation through time (BPTT) often encounter the vanishing/exploding gradient problem when learning from extended sequences. In order to overcome this challenge, LSTMs employ a gated cell structure as a replacement for the traditional RNN cell. Fig. 4 illustrates the basic architecture of an LSTM cell.
TABLE II THE LSTM MODEL PARAMETERS AND SETTINGS
Due to the temporal nature, large receptive field size, and large number of channels of the SNNr output, we employ an LSTM module to classify depression levels. The binary output matrix of the previous module, with the size of ($N_{\text{samples}}$, Timesteps, $N_{\text{reservoir}}$), is fed to an LSTM layer which is configured with 64 memory cells. Subsequently, in order to prevent over-fitting and enhance the model's generalizability, we utilize a dropout layer. This layer is connected to a fully-connected linear layer with 32 units and a ReLU activation function. Lastly, we employ another fully-connected linear layer with 4 units and a softmax activation function to carry out the classification. In Table II, we present the network's model parameters and settings, while Table III provides a more comprehensive description of the network's layers. Algorithm 4 exemplifies the final stage of the model.
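The size of this classifier can be estimated from the layer widths given above. The sketch below counts trainable parameters under two assumptions that the text does not state: the LSTM uses one bias vector per gate (as in the Keras convention; PyTorch keeps two), and only its final 64-dimensional hidden state is passed on to the dense layers. The helper names are hypothetical.

```python
def lstm_params(input_dim, units):
    """Four gates, each with input weights, recurrent weights and one bias vector."""
    return 4 * (units * (input_dim + units) + units)


def dense_params(input_dim, units):
    """Weight matrix plus bias vector of a fully-connected layer."""
    return input_dim * units + units


N_RESERVOIR = 1471  # features per time step coming out of the SNNr

layers = {
    "lstm_64": lstm_params(N_RESERVOIR, 64),
    "dense_32_relu": dense_params(64, 32),
    "dense_4_softmax": dense_params(32, 4),
}
total_params = sum(layers.values())  # dropout adds no trainable parameters
```

Under these assumptions almost all of the roughly 395 thousand parameters sit in the LSTM input weights, a direct consequence of feeding 1471 reservoir neurons per time step into the recurrent layer.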

Algorithm 4 Supervised Learning and Classification

III. RESULTS AND DISCUSSION
This study involves two analysis steps. In the first step, we investigate brain connectivity and patterns associated with depression through visualization and interpretation of the SNN model. In the second step, we evaluate the classification accuracy of our proposed model and compare it with existing methods for depression recognition.

A. Pattern Discovery of Dynamic Brain Activities Associated With Depression Through Visualization of the SNN Models
In this section, we explore the functional connections within the brain by analyzing the insights obtained from the learned SNNr models. To compare the underlying brain functions across different states of depression, we conduct separate STDP training of the SNN models using samples from each group. To analyze the trained networks, we construct graphs [67] that depict the extent of interaction among distinct brain regions. To quantify the interaction among the input neurons of each SNN model, we construct an $N \times N$ affinity matrix. This matrix captures the aggregated spikes exchanged between neurons $i$ and $j$ via the connection $W_{ij}$. Each input neuron establishes a cluster of surrounding neurons, consisting of those that receive more spikes from that input neuron than from any other. The level of interaction between any two clusters is calculated as the number of spikes exchanged between them. The strength of each connection is depicted by the thickness of the line connecting two nodes, which represents the intensity of spike transmission between the corresponding segments of the brain model.
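The construction of the affinity matrix can be sketched as follows. The sketch assumes that per-connection spike counts and a neuron-to-cluster assignment are already available from the trained SNNr. The names `cluster_affinity` and `strongest_edges`, the symmetric aggregation, and the exclusion of spikes exchanged within a cluster are illustrative assumptions.

```python
def cluster_affinity(spike_counts, cluster_of, n_clusters):
    """Aggregate neuron-level spike exchange into an N x N cluster matrix.

    spike_counts: dict mapping (source_neuron, target_neuron) -> spike count
    cluster_of:   dict mapping neuron index -> index of its input-neuron cluster
    """
    A = [[0] * n_clusters for _ in range(n_clusters)]
    for (src, dst), count in spike_counts.items():
        a, b = cluster_of[src], cluster_of[dst]
        if a == b:
            continue  # assumption: ignore spikes exchanged inside a cluster
        A[a][b] += count
        A[b][a] += count  # assumption: interaction is treated as undirected
    return A

def strongest_edges(A, k):
    """Return the k strongest cluster-to-cluster edges as (a, b, weight)."""
    n = len(A)
    edges = [(A[a][b], a, b) for a in range(n) for b in range(a + 1, n) if A[a][b] > 0]
    edges.sort(reverse=True)
    return [(a, b, w) for w, a, b in edges[:k]]

# Toy example: five reservoir neurons grouped around three input neurons.
cluster_of = {0: 0, 1: 0, 2: 1, 3: 1, 4: 2}
spike_counts = {(0, 2): 5, (3, 1): 2, (2, 4): 1, (0, 1): 9}
A = cluster_affinity(spike_counts, cluster_of, n_clusters=3)
top = strongest_edges(A, k=2)
```

In the toy example, clusters 0 and 1 exchange 7 spikes in total and clusters 1 and 2 exchange 1, so the first edge would be drawn with the thicker line.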
In Figs. 5 and 6, we present the 500 strongest connections for each level of depression during both eyes-closed and eyes-open states. These connections are represented by blue lines (excitatory connections) and red lines (inhibitory connections). Additionally, the brightness of each neuron represents its spike emission. In both states, the strongest connections shift towards higher brain regions as depression severity increases. This pattern suggests disturbances in normal connectivity, potentially reflecting the impact of depression on neural communication and network dynamics. This shift towards the top of the brain, particularly involving the prefrontal cortex, which is responsible for cognitive functions and emotional regulation, suggests a significant impact of depression on these processes. Our findings are aligned with previous research, including [68], which reported hyperconnectivity between the thalamus and the cortex in individuals with major depression, supporting the idea of increased connections between lower and higher brain areas in depressed patients. Conversely, in non-depressed states, stronger connections are concentrated in lower brain regions, such as the limbic system, which is associated with emotional processing and regulation, possibly indicating a more balanced emotional state. The observed changes in brain connectivity may also reflect the brain's adaptive response to depression, with neuroplasticity playing a role in forming new connections to cope with the challenges posed by the condition.

TABLE III DETAILED PARAMETERS SETTINGS OF THE LSTM MODULE
In Figs. 7 and 8, we present graph representations over the 62 EEG electrodes, based on the standard 10-20 EEG electrode system, for the four levels of depression during both eyes-closed and eyes-open states. Notably, our results reveal a marked increase in connections related to the frontal and prefrontal cortex regions as depression becomes more severe. This finding suggests that depression may have a specific impact on connectivity within these brain areas, which are known for their involvement in cognitive and emotional processing. Moreover, as the depression level increases, the connections become less sparse and stronger. This indicates that more connections are formed and existing connections become more robust, potentially reflecting a reorganization of neural communication during depressive states. In the minimally depressed group, robust connections were consistently observed between F6 and PO3, F6 and FT8, and C3 and PO3 in both eyes-closed and eyes-open states. In contrast, for individuals with severe depression, strong connections were identified between F5 and T7, FT8 and F6, FT8 and T8, and T8 and PO8.
The results are also consistent with known differences between eyes-closed and eyes-open states. In the eyes-open state, the connections are sparser and include long-range connections. This is consistent with the notion that the brain integrates information from distant regions to process sensory inputs. Thus, during eyes-open states, the brain's functional connectivity involves a broader network of regions communicating over longer distances to handle external sensory information. In addition, the sparsity observed during eyes-open states is in line with the understanding that synchronization and coherence may be weaker while the brain is actively engaged in processing sensory information. Weaker synchronization and coherence during these periods could lead to more isolated and less coordinated neural activity, resulting in a sparser connectivity pattern.
In Figs. 9 and 10, we show the correlation between each channel's weighted degree centrality and depression severity. Channels Fp1, Fpz, F3, PO5, and CP2 demonstrated the lowest correlation, suggesting that their connectivity weakens as depression worsens. Conversely, channels AF3, AF4, F5, F1, FT8, POz, and CP4 exhibited the highest correlation, indicating a stronger association with depression severity. These findings suggest that depression level significantly impacts brain network connectivity, particularly involving the frontal and prefrontal cortex regions. Channels with low correlation in the non-depressed group might play a crucial role in maintaining emotional balance, while their weakening connectivity may contribute to depressive symptoms. Channels with higher correlation may be more directly related to depressive symptoms, indicating their potential relevance in depression manifestation and progression.
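One way to compute such a per-channel correlation is sketched below. It assumes one group-level affinity matrix per depression level, with severity coded 0 (minimal) to 3 (severe), and uses the Pearson coefficient. The text does not specify the exact procedure, so the severity coding, the choice of coefficient, and the function names are assumptions made for illustration.

```python
import math

def weighted_degree(A):
    """Weighted degree centrality: sum of edge weights attached to each node."""
    return [sum(row) for row in A]

def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)

def centrality_severity_correlation(graphs_by_level):
    """Correlate each channel's weighted degree with the severity level.

    graphs_by_level: affinity matrices ordered from minimal to severe depression.
    """
    levels = list(range(len(graphs_by_level)))  # assumed coding 0, 1, 2, 3
    degrees = [weighted_degree(A) for A in graphs_by_level]
    n_channels = len(degrees[0])
    return [pearson(levels, [d[ch] for d in degrees]) for ch in range(n_channels)]

# Toy example with 3 channels and 4 severity levels (made-up weights).
graphs = [[[0, k + 1, 0], [k + 1, 0, 3 - k], [0, 3 - k, 0]] for k in range(4)]
r = centrality_severity_correlation(graphs)
```

In the toy data the first channel's connectivity grows with severity (correlation near +1), the third channel's connectivity shrinks (near -1), and the second stays constant, which the sketch reports as 0.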

B. Classification Results
To assess the classification efficacy of our proposed model, we conduct a comparative analysis with three well-established deep learning-based models using the same dataset. The first model is the hybrid deep CNN-TCN network introduced by Hashempour et al. [24]. The second combines a convolutional neural network with long short-term memory (CNN-LSTM), as proposed by Ay et al. [69]. Lastly, we examine a 13-layer deep CNN presented by Acharya et al. [22]. By comparing our proposed model against these benchmarks, we aim to gain insights into its performance and potential advantages over existing state-of-the-art approaches.
The model evaluation utilizes a 10-fold cross-validation procedure, with 10% of the subjects used as the testing set in each iteration. It is important to note that the unsupervised learning step is conducted once on the entire dataset, because this process is time-consuming. The aggregate performance is calculated by averaging the outcomes of all ten evaluations. The results are showcased and compared in Fig. 11 and Fig. 12.

IV. CONCLUSION
This study investigates the potential of depression recognition through a novel combination of LSTM and SNN. For the first time, these models are employed to model, map, learn, classify, visualize, and comprehend EEG signals associated with four distinct depression levels, namely minimal, mild, moderate, and severe. The proposed model integrates diverse methods and algorithms that facilitate the exploration of multiple aspects of EEG data. This encompasses the spatial mapping of data onto a three-dimensional SNN structure, unsupervised learning within the SNNr, visualization of connectivity and spiking patterns within the trained SNNr to unveil novel insights into the data and underlying brain mechanisms, and supervised learning within an LSTM network. Comparative analysis with other deep learning techniques showcases the advantages of the SNN approach in modeling spatiotemporal brain data. This study not only achieves improved accuracy in classifying samples from different subject groups but also reveals informative patterns of brain activity, shedding light on different severities of depression. Our findings unveil significant differences between depression levels that hold promise as potential markers for early prediction and prevention of depression. The proposed methodology exhibits wide applicability to diverse neuroimaging and clinical longitudinal data. Future work will focus on refining SNN hyper-parameter optimization and on further enhancing the visualization and analysis of the brain-structured SNN during both the learning process and the post-learning phase to deepen our comprehension of brain processes related to depression. Furthermore, larger and well-balanced datasets, as well as task-specific EEG signals, will be explored to investigate the effects of different tasks on the model's performance and insights.

Fig. 2. The architecture of the proposed method comprises three primary stages: EEG data encoding into spike trains, mapping to a 3-dimensional brain-inspired SNN reservoir with 1471 neurons, and a two-step learning process for the EEG dataset (unsupervised and supervised), ultimately leading to classification.

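Algorithm 1 AER Spike Encoding
Require: $X_{\mathrm{input}} \in \mathbb{R}^{T \times N_{\mathrm{channels}}}$
Ensure: $X_{\mathrm{Spike}} \in \{0, 1\}^{T \times N_{\mathrm{channels}}}$
1: $N \leftarrow \#(X_{\mathrm{input}})$ ▷ Number of data samples in the dataset
2: for $k = 1$ to $N_{\mathrm{channels}}$ do
3: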

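Algorithm 2 SNNr Connection and Weight Initialization
Require: $X_{\mathrm{brain}} \in \mathbb{R}^{1 \times 3}$, $X_{\mathrm{input}} \subset X_{\mathrm{brain}}$, $C : 1^{N_{\mathrm{reservoir}} \times N_{\mathrm{reservoir}}}$ ▷ Hyperparameters: SWR
Ensure: $C : \{0, 1\}^{N_{\mathrm{reservoir}} \times N_{\mathrm{reservoir}}}$, $W : \mathbb{R}^{N_{\mathrm{reservoir}} \times N_{\mathrm{reservoir}}}$
1: $N_r \leftarrow \#(X_{\mathrm{brain}})$ ▷ Number of reservoir neurons
2: $L_{\mathrm{dist}}$ is a matrix of distances between all pairs of neurons
3: for $i = 1$ to $N_{\mathrm{reservoir}}$ do
4: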
Algorithm 4 Supervised Learning and Classification
Require: $X_{\mathrm{reservoir}}$ ▷ Reservoir output data
Ensure: $Y_{\mathrm{classes}}$ ▷ Number of output classes
1: Split $X_{\mathrm{reservoir}}$ into training and testing sets
2: Initialize LSTM model for classification
3: for each epoch in training do
4: Train LSTM model on the training set of $X_{\mathrm{reservoir}}$
5: end for
6: Initialize empty array $Y_{\mathrm{classes}}$
7: for each data point in testing set do
8: Pass data through trained LSTM model
9: Classify data into one of the classes
10: Append class label to $Y_{\mathrm{classes}}$
11: end for
12: Return $Y_{\mathrm{classes}}$

Fig. 5. The connectivity results from four distinct SNNr modules in the eyes-closed state. For each SNNr, the top 500 strongest connections are shown. Positive (excitatory) connections are drawn as blue lines, whereas negative (inhibitory) connections are drawn as red lines. The brightness of each neuron corresponds to its spike emission level: (a) minimal depression; (b) mild depression; (c) moderate depression; (d) severe depression.

Fig. 6. The connectivity results from four distinct SNNr modules in the eyes-open state. For each SNNr, the top 500 strongest connections are shown. Positive (excitatory) connections are drawn as blue lines, whereas negative (inhibitory) connections are drawn as red lines. The brightness of each neuron corresponds to its spike emission level: (a) minimal depression; (b) mild depression; (c) moderate depression; (d) severe depression.

Fig. 7. Graphs summarizing the overall spike interaction during the eyes-closed state across regions within the SNN models, with the 62 EEG channels as input neurons, throughout the STDP learning process for: (a) minimal depression; (b) mild depression; (c) moderate depression; (d) severe depression. The nodes depict the areas (clusters) around the input neurons in the SNN model, while the thickness of the lines represents the degree of spike transmission between these clusters. Each cluster corresponds to one input neuron (EEG channel).

Fig. 8. Graphs summarizing the overall spike interaction during the eyes-open state across regions within the SNN models, with the 62 EEG channels as input neurons, throughout the STDP learning process for: (a) minimal depression; (b) mild depression; (c) moderate depression; (d) severe depression. The nodes depict the areas (clusters) around the input neurons in the SNN model, while the thickness of the lines represents the degree of spike transmission between these clusters. Each cluster corresponds to one input neuron (EEG channel).

Fig. 9. Correlation between each channel's weighted degree centrality and depression severity for eyes-closed state.

Fig. 10.Correlation between each channel's weighted degree centrality and depression severity for eyes-open state.

The results in Figs. 11 and 12 clearly indicate that our proposed model surpasses the other three models in classification accuracy. This success can be attributed to the model's effective feature extraction, specifically of potential connectivity relationships among different EEG channels, achieved through unsupervised STDP learning. The model's biological plausibility is a key advantage, as it is well-suited for processing biological EEG signals. This biologically inspired approach likely contributes significantly to the improved performance compared to alternative methods, aligning well with the inherent characteristics of EEG data. The fact that a single LSTM layer suffices to classify the SNNr output further supports the efficacy of this approach.
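The subject-level 10-fold protocol used for these results can be sketched as follows. The sketch assumes that every sample carries a subject identifier and that all samples of a subject fall in the same fold, so that 10% of the subjects form the test set in each iteration. The function names, the random seed, and the toy data are illustrative assumptions. Only the supervised split is covered, since the unsupervised step is run once on the entire dataset.

```python
import random

def subject_kfold(subject_ids, n_folds=10, seed=0):
    """Yield (train_indices, test_indices) with whole subjects held out per fold."""
    subjects = sorted(set(subject_ids))
    random.Random(seed).shuffle(subjects)
    # Deal the shuffled subjects into n_folds groups of (nearly) equal size.
    folds = [subjects[i::n_folds] for i in range(n_folds)]
    for test_subjects in folds:
        held_out = set(test_subjects)
        train_idx = [i for i, s in enumerate(subject_ids) if s not in held_out]
        test_idx = [i for i, s in enumerate(subject_ids) if s in held_out]
        yield train_idx, test_idx

def mean_accuracy(fold_accuracies):
    """Aggregate performance as the average over all folds."""
    return sum(fold_accuracies) / len(fold_accuracies)

# Toy example: 20 hypothetical subjects with 3 samples each.
subject_ids = [s for s in range(20) for _ in range(3)]
splits = list(subject_kfold(subject_ids))
```

Splitting by subject rather than by sample keeps segments of the same recording out of both the training and testing sets, which would otherwise inflate the measured accuracy.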