Bio-inspired Filter Banks for Frequency Recognition of SSVEP-based Brain-computer Interfaces

Brain-computer interfaces (BCIs) and their associated technologies have the potential to shape future forms of communication, control, and security. Specifically, the steady-state visual evoked potential (SSVEP) based BCIs have the advantages of better recognition accuracy, and higher information transfer rate (ITR) compared to other BCI modalities. To fully exploit the capabilities of such devices, it is necessary to understand the underlying biological features of SSVEPs and design the system considering their inherent characteristics. This paper introduces bio-inspired filter banks (BIFBs) for improved SSVEP frequency recognition. SSVEPs are frequency selective, subject-specific, and their power gets weaker as the frequency of the visual stimuli increases. Therefore, the gain and bandwidth of the filters are designed and tuned based on these characteristics while also incorporating harmonic SSVEP responses. The BIFBs are utilized in the feature extraction stage to increase the separability of classes. This method not only improves the recognition accuracy but also increases the total number of available commands in a BCI system by allowing the use of stimuli frequencies that elicit weak SSVEP responses. The BIFBs are promising particularly in the high-frequency band, which causes less visual fatigue. Hence, the proposed approach might enhance user comfort as well. The BIFB method is tested on two online benchmark datasets and outperforms the compared methods. The results show the potential of bio-inspired design, and the findings will be extended by including further SSVEP characteristics for future SSVEP based BCIs.


I. INTRODUCTION
Scientific advances in neuroscience and biomedical engineering have enabled a direct communication channel between the human brain and a computer. The electrical activity in the brain that is produced by neuronal post-synaptic membrane polarity changes can be monitored to detect the user's intentions [1]. A brain-computer interface (BCI) [2] analyzes the brain signals and translates them into commands for external devices such as a speller device, wheelchair, robotic arm, or a drone (Fig. 1). Since BCIs utilize the signals generated by the central nervous system, the primary target of this technology is people with severe neuromuscular disorders (e.g., amyotrophic lateral sclerosis, brain-stem stroke, spinal cord injury, and cerebral palsy). However, advanced BCI systems serve healthy people as well by providing an alternative way of communication, control, and security [3]-[5]. Hence, these systems have evolved to be a promising part of the body area network [6]-[10].
While there exist multiple approaches to measure brain activity, electroencephalography (EEG) is widely used in BCI applications because of its high temporal resolution, which is essential for BCIs to work as real-time systems [11]. In addition, EEG devices are inexpensive and portable. Various EEG signals could serve to drive BCIs. For example, a distinctive oscillation pattern in EEG is observed when a sensory stimulus, such as a visual or auditory one, is presented to a human. These oscillations are called evoked potentials (EPs), and they disappear after a short period. If the stimulus is repeated at a regular rate, the EPs do not have time to decay, and the result is a periodic response called a steady-state evoked potential [12]. More specifically, a periodic visual stimulus with a repetition rate higher than 6 Hz elicits steady-state visual evoked potentials (SSVEPs), which are more prominent in the occipital region of the brain [13], [14]. The targets that evoke SSVEPs are encoded in various ways [4], [15], and the users make a selection by shifting their attention to the desired target in SSVEP-based BCIs. Compared with BCI modalities that depend on other EEG signals (e.g., slow cortical potentials, sensorimotor rhythms, and event-related potentials), SSVEP-based BCIs have the advantage of a high information transfer rate (ITR) and a short training duration to operate the device [16]. SSVEPs are sinusoidal-like waveforms, and they appear at the fundamental frequency of the driving stimulus and its harmonics (Fig. 2) [13]. However, spontaneous oscillations (i.e., background activity), which are not related to the stimulation, exist in the EEG recordings as well, and a robust recognition algorithm is required to build a reliable BCI system. Numerous methods have been proposed for SSVEP recognition in the last decade [16]-[22].
Power spectral density analysis (PSDA) is a typical approach since the distinctive features of SSVEPs are observed in the frequency domain [16]. However, PSDA is susceptible to noise, and long durations are needed to increase the signal-to-noise ratio (SNR). A multivariable statistical method, namely canonical correlation analysis (CCA) [17], [19], exploits the multiple-channel covariance information to enhance the SNR and provides a better recognition accuracy compared to PSDA. Simple implementation, high robustness, and better ITR performance have made CCA attractive in SSVEP recognition research. On the other hand, CCA does not efficiently extract the discriminative information embedded in the harmonic components of SSVEPs, and filter bank canonical correlation analysis (FBCCA) [20] was proposed to handle this issue. Although FBCCA captures the distinct spectral properties of multiple harmonic frequencies successfully, it neglects any correlation information between SSVEP responses at different frequencies [21]. Furthermore, this approach disregards the frequency-selective nature of SSVEPs because it uses wide-band filters that cover the whole stimulus bandwidth.
To fully exploit and further increase the potential of SSVEP-based BCIs, it is necessary to employ an accurate SSVEP model in the recognition algorithm. For example, the inclusion of SSVEP harmonics in a recognition algorithm improves the accuracy [23] since the spontaneous EEG oscillations typically do not present any harmonic components [24]. Also, the subject-specific nature of SSVEPs is handled by an individualized parameter optimization and calibration (e.g., time-window duration, number of harmonics considered, and electrode location) [16], [17]. Moreover, the SSVEP response is frequency selective, and its power gets weaker as the frequency of the stimuli increases [11], [13], [15], [18]. Although the power of EEG background activity decreases as well with the increase in frequency (approximately with a 1/f behavior [12]), the resultant SNR is still considerably low at high frequencies.
Hence, a visual stimulus at a high frequency can be almost indistinguishable in the presence of noise, as shown in Fig. 3. This inherent feature not only results in a lower recognition accuracy but also causes exclusion of the stimulus frequencies that evoke a weak SSVEP response and decreases the total number of available commands in a BCI system.
This paper introduces bio-inspired filter banks (BIFBs) for improved SSVEP frequency recognition. The BIFBs are designed considering the inherent biological characteristics of SSVEPs, namely frequency selectivity, subject specificity, and harmonic SSVEP responses. They are utilized in the feature extraction stage to increase the separability of classes. The proposed approach is tested on datasets available online, and its performance is compared with the performances of various SSVEP frequency recognition methods. The preliminary results without an elaborate classification algorithm or a cross-validation procedure were presented in [25]. Also, a fair performance comparison with the utilization of unit filters is provided to validate the effectiveness of the proposed filter bank design in this study. The results show a notable ITR improvement with the bio-inspired design and highlight the promising potential of BIFBs in the high-frequency band, which causes less visual fatigue. Hence, the proposed method leads to more reliable, efficient, and user-friendly SSVEP-based BCI systems.
This article is structured as follows. Section II describes the performance metrics, evaluation methodology, and datasets. The proposed method is explained in detail, along with the comparison methods. Section III presents the performance of the SSVEP recognition algorithms and provides a thorough analysis of the results. Finally, Section IV summarizes the contributions and addresses future research directions.

A. Evaluation Metric
The most common measure to evaluate the performance of a BCI system is the ITR [3], which can be expressed in bits per minute as follows:
$$\mathrm{ITR} = s \left[ \log_2 K + \delta \log_2 \delta + (1 - \delta) \log_2\!\left( \frac{1 - \delta}{K - 1} \right) \right] \tag{1}$$
where K stands for the number of equiprobable commands, s denotes the commands performed per minute, and δ represents the accuracy of target recognition. In general, the BCIs with high ITR have a large number of commands. However, K is fixed in the datasets used in this study, and the ITR can be boosted with the joint optimization of s and δ. Also, a threshold can be set either on s or δ based on user comfort.
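As a worked illustration of the metric, the following minimal sketch computes the ITR from K, s, and δ. It assumes the commonly used definition ITR = s [log2 K + δ log2 δ + (1 − δ) log2((1 − δ)/(K − 1))]; the function name and the handling of the boundary cases δ = 0 and δ = 1 are choices made for this sketch, not something taken from the paper.

```python
import math


def itr_bits_per_minute(num_commands, selections_per_minute, accuracy):
    """ITR in bits/minute for K equiprobable commands (hypothetical helper)."""
    if num_commands < 2:
        raise ValueError("at least two commands are required")
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must lie in [0, 1]")
    k, d = num_commands, accuracy
    bits = math.log2(k)
    # The limits d*log2(d) -> 0 and (1-d)*log2(...) -> 0 are handled explicitly.
    if d > 0.0:
        bits += d * math.log2(d)
    if d < 1.0:
        bits += (1.0 - d) * math.log2((1.0 - d) / (k - 1))
    return selections_per_minute * bits


# Example: 3 commands, 10 selections per minute, 90 % accuracy.
print(round(itr_bits_per_minute(3, 10, 0.90), 2))
```

At chance-level accuracy (δ = 1/K) the bracketed term vanishes, which is a convenient sanity check when tuning s and δ jointly.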

B. Datasets and Pre-processing
Two publicly available datasets are utilized in this study to test the proposed method. Dataset-A [18] consists of EEG recordings from four healthy subjects with normal or corrected-to-normal vision. Small reversing black and white checkerboards were presented to the participants sequentially (i.e., one stimulus at a time) at three different frequencies (8 Hz, 14 Hz, and 28 Hz) during the recordings. The brain signal acquisition was performed at a sampling rate of 256 Hz with 128 active electrodes using the ABC layout standard (https://www.biosemi.com/headcap.htm) for electrode placement. The EEG recordings were re-referenced using the central Cz electrode and band-pass filtered from 6 Hz to 35 Hz. The subjects experienced a visual stimulus for 15 seconds in each trial. Each unique visual stimulus was repeated five times, which corresponds to 60 trials (4 subjects × 3 stimuli × 5 repetitions) in total.
Dataset-B [26], which is provided by another research institute, consists of EEG recordings from four healthy subjects as well. A single flickering box that changes color rapidly from black to white at seven different frequencies (6 Hz, 6.5 Hz, 7 Hz, 7.5 Hz, 8.2 Hz, 9.3 Hz, and 10 Hz) was used as the visual stimulus. The brain signal acquisition was performed at a sampling rate of 512 Hz with three electrodes (Oz, Fpz, Pz) using the 10-20 layout standard for electrode placement. The EEG recordings were referenced using the electrode Fz, and an analog notch filter at 50 Hz was applied to suppress the power-line noise. The subjects experienced a visual stimulus for 30 seconds in each trial. Each unique visual stimulus was repeated at least three times, with 92 trials in total.
An overview of these datasets is provided in Table I, and the reader is referred to the individual references for a more detailed description of the datasets.
Dataset-A is selected because it includes a stimulus in the high-frequency band that evokes a weak SSVEP response, whereas Dataset-B is selected to examine frequency selectivity even within a narrow band.

C. Proposed Method
The pre-processed EEG signal from the occipital channel Oz is segmented with an overlap, and each segment is windowed using a Hamming function [27]. Afterward, the power spectral density of the signal is estimated by the following equation: where $\mathrm{EEG}[n]$ and $w[n]$ represent the discrete EEG signal and Hamming window function, respectively. The features for SSVEP frequency recognition are extracted by multiplying $S_{\mathrm{EEG}}$ with the frequency response of BIFBs. The filter banks are designed in such a way that they capture the inherent biological characteristics of the SSVEPs. It is known that the SSVEPs are frequency-selective, and their power gets weaker as the frequency of the visual stimuli increases [11], [13], [15], [18]. Figure 4 presents the average SSVEP response power to pattern reversal stimuli ranging from 5.1 Hz to 84 Hz [18]. In particular, the stimuli in the high-frequency bands elicit weak responses and make the recognition challenging. Consequently, the gain and bandwidth of the filters are designed considering the frequency-selective nature of SSVEPs. Assume that there are K target stimulus frequencies ($f_k$), where $k \in \{1, \ldots, K\}$, in a BCI system. The array of filters in BIFBs is expressed as follows: where $BW_k$ and $g_k$ represent the bandwidth and gain of the k-th filter, respectively. Initially, a higher bandwidth and gain are assigned to frequencies with low SSVEP response power. Subsequently, these parameters are optimized for individual users in order to counter the subject-specific nature of the SSVEP response [16], [17]. This hyper-parameter optimization is performed by a grid search over a manually specified subset of the hyper-parameter space [28]. It should be noted that initial parameter guesses based on the average SSVEP response decrease the computational complexity.
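The feature-extraction idea can be sketched as follows. This is only an illustration under stated assumptions: the PSD is taken as a Hamming-windowed periodogram normalized by the window energy, the filters are triangles (the conclusions mention triangular filters) whose base width equals the bandwidth, and each feature is the sum of the PSD weighted by one filter response. The function names, the frequency grid, and the example gains and bandwidths are hypothetical and are not the tuned values of the study.

```python
import cmath
import math


def hamming(n_samples):
    """Hamming window coefficients for a segment of n_samples points."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (n_samples - 1))
            for n in range(n_samples)]


def windowed_psd(segment, fs, freqs):
    """Windowed periodogram of `segment` at each frequency (Hz) in `freqs`.

    Normalizing by the window energy is an assumption of this sketch.
    """
    window = hamming(len(segment))
    energy = sum(w * w for w in window)
    psd = []
    for f in freqs:
        acc = sum(x * w * cmath.exp(-2j * math.pi * f * n / fs)
                  for n, (x, w) in enumerate(zip(segment, window)))
        psd.append(abs(acc) ** 2 / energy)
    return psd


def triangular_response(freqs, center, bandwidth, gain):
    """Triangle peaking at `gain` on `center`, zero beyond center +/- bandwidth/2."""
    half = bandwidth / 2.0
    return [gain * max(0.0, 1.0 - abs(f - center) / half) for f in freqs]


def bifb_features(psd, freqs, targets, bandwidths, gains):
    """Return 2K features: the K fundamentals first, then the K second harmonics."""
    features = []
    for multiple in (1, 2):
        for f_k, bw_k, g_k in zip(targets, bandwidths, gains):
            response = triangular_response(freqs, multiple * f_k, bw_k, g_k)
            features.append(sum(p * h for p, h in zip(psd, response)))
    return features


# Toy usage: a clean 14 Hz sinusoid sampled at 256 Hz for 2 s.
fs = 256
signal = [math.sin(2 * math.pi * 14 * n / fs) for n in range(2 * fs)]
freqs = [6 + 0.25 * i for i in range(217)]  # 6 Hz to 60 Hz in 0.25 Hz steps
psd = windowed_psd(signal, fs, freqs)
features = bifb_features(psd, freqs, targets=[8, 14, 28],
                         bandwidths=[1.0, 1.0, 2.0], gains=[1.0, 1.0, 2.0])
print(features.index(max(features)))
```

In the toy call, the 28 Hz filter is given a larger gain and bandwidth to mimic the initial rule of boosting frequencies with weak responses; in practice these values would come from the per-subject grid search.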
Also, SSVEPs occur at the fundamental frequency of the driving stimulus and its harmonics, whereas spontaneous EEG oscillations typically do not present any harmonic components [24]. Accordingly, filters at the SSVEP harmonic frequencies are included in the filter bank design as well (i.e., $H^{\mathrm{BIFB}}_{K+1}[f]$ for $2f_1$, ..., $H^{\mathrm{BIFB}}_{K+K}[f]$ for $2f_K$) to improve the recognition accuracy, as shown in Fig. 5. Finally, the features are extracted using the BIFBs as follows: where $x_i$ represents the elements of the feature vector X. The extracted features for SSVEP recognition are classified with a logistic regression model using the one-vs-all strategy. Assume K classes, where each class represents a target stimulus frequency. The hypothesis function predicts whether a given input belongs to the k-th class or not, and it is formulated by the following equation: where g represents the sigmoid function, $\tilde{X}$ denotes the augmented feature vector (i.e., $[1, x_1, \ldots, x_{2K}]$) with a size of 2K + 1, and $\theta_k$ stands for the mapping weight vector of the k-th class. $\theta_k$ is chosen in such a way that it minimizes the cost function $J(\theta_k)$, which is a distance metric between the prediction and the actual class label (y), by the following equation [29]: where $\{(X^{(m)}, y^{(m)});\ m = 1, \ldots, M\}$ represents the training set with M training examples and $y \in \{0, 1\}$. Leave-one-out cross-validation is used to resample the training data because of its objectivity and its suitability for small datasets [30]. The last summative term in Eq. 6 prevents over-fitting of the classifier, and its influence is controlled by the regularization parameter λ. $J(\theta_k)$ is minimized with a gradient descent algorithm, and the optimal $\theta_k$ is calculated for all k.
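The classification stage can be illustrated with a small one-vs-all logistic regression trained by batch gradient descent. The sketch assumes the standard L2-regularized cross-entropy cost with a (λ / 2M) penalty that excludes the bias term; this normalization, the learning rate, the epoch count, and all function names are assumptions of the sketch rather than details confirmed by the text.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_one_vs_all(features, labels, num_classes, lam=0.1, lr=0.1, epochs=2000):
    """Fit one weight vector per class; element 0 of each vector is the bias."""
    m = len(features)
    augmented = [[1.0] + list(x) for x in features]  # prepend the constant 1
    dim = len(augmented[0])
    thetas = []
    for k in range(num_classes):
        y = [1.0 if label == k else 0.0 for label in labels]
        theta = [0.0] * dim
        for _ in range(epochs):
            grad = [0.0] * dim
            for xa, ym in zip(augmented, y):
                err = sigmoid(sum(t * v for t, v in zip(theta, xa))) - ym
                for j in range(dim):
                    grad[j] += err * xa[j]
            for j in range(dim):
                reg = lam * theta[j] if j > 0 else 0.0  # bias is not penalized
                theta[j] -= lr * (grad[j] + reg) / m
        thetas.append(theta)
    return thetas


def predict(thetas, x):
    """Return (index of the most probable class, per-class probabilities)."""
    xa = [1.0] + list(x)
    probs = [sigmoid(sum(t * v for t, v in zip(theta, xa))) for theta in thetas]
    return max(range(len(probs)), key=probs.__getitem__), probs


# Toy usage with three classes and three made-up features per example.
X = [[3.0, 0.1, 0.2], [2.6, 0.3, 0.1],
     [0.2, 3.0, 0.1], [0.1, 2.7, 0.3],
     [0.1, 0.2, 3.0], [0.3, 0.1, 2.8]]
y = [0, 0, 1, 1, 2, 2]
thetas = train_one_vs_all(X, y, num_classes=3)
print(predict(thetas, [2.9, 0.2, 0.1])[0])
```

For leave-one-out cross-validation, the same training routine would be called M times, each time holding out one example for testing.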
After the training stage, the probability that a given input belongs to each class is calculated using the hypothesis function in Eq. 5, and the class with the highest probability is labeled as a candidate frequency ($f_c$) for recognition as follows: The candidate frequency is labeled as recognized (i.e., $\hat{f} = f_c$) when the same $f_c$ occurs at least t times in the last T iterations, where the typical values for these parameters are three and four, respectively. If the selection criteria are not satisfied during the given period, it is evaluated as an unsuccessful recognition. A flowchart of the proposed BIFB method for SSVEP frequency recognition is presented in Fig. 7.
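The decision rule described above (accept a candidate once it occurs at least t times within the last T iterations) can be sketched with a sliding window. The function name and the return convention are hypothetical; the defaults t = 3 and T = 4 follow the typical values given in the text.

```python
from collections import Counter, deque


def recognize(candidates, t=3, T=4):
    """Return (frequency, iterations used), or (None, iterations) if never accepted."""
    window = deque(maxlen=T)  # keeps only the last T candidate frequencies
    steps = 0
    for steps, candidate in enumerate(candidates, start=1):
        window.append(candidate)
        winner, count = Counter(window).most_common(1)[0]
        if count >= t:
            return winner, steps
    return None, steps


print(recognize([14, 8, 14, 14]))         # one outlier is tolerated
print(recognize([8, 14, 28, 8, 14, 28]))  # no candidate is stable enough
```

Because one disagreeing iteration is tolerated within the window, an isolated misclassification delays the decision rather than causing a wrong selection.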

D. Comparison Methods
The performance of the proposed algorithm is compared with the performances of various SSVEP frequency recognition algorithms. PSDA and CCA are selected as comparison methods since they are the most common techniques in the literature to compare a new algorithm [19]- [21]. However, there is no training in these traditional approaches, and a direct comparison may not be proper. Therefore, the BIFBs are replaced with unit filters (UFs), and a similar classical training process is performed for classification to examine the effectiveness of the proposed bio-inspired filter design fairly. Also, the parameters are optimized/calibrated to maximize the ITR performance in all SSVEP frequency recognition methods.

1) UF:
It is an SSVEP frequency recognition method that follows a procedure similar to the proposed scheme in Subsection II-C, except that it does not use BIFBs. Instead, the features are extracted with unit filters, and they are expressed as follows: where D is the index for the dataset, and $BW_D$ equals 1 for Dataset-A whereas it equals 0.5 for Dataset-B. Since the only difference between the BIFB and UF methods is the filter type utilized in the feature extraction stage (as in a controlled experiment), any performance difference can be attributed to the filter bank design.
2) PSDA:
The EEG signal from the occipital channel is pre-processed, and the PSD is estimated similarly to the proposed approach. Afterward, the peak of the spectrum is determined as the target frequency ($\hat{f}$) in the traditional PSDA approach [16]. In this study, the harmonic responses are considered in the PSDA algorithm as well for a fair comparison. Initially, the class values, where each class represents a target frequency, are calculated by summing the energy in the fundamental frequency and harmonic bands. Subsequently, the class that has the maximum value is recognized as the SSVEP target frequency as follows:

3) CCA:
The final comparison method, CCA, is a multivariable statistical method that aims to reveal the underlying correlation between two sets of data [31] and has been widely used for SSVEP frequency recognition [17]. If A is a multi-channel EEG signal, and B is the Fourier series of a square-wave stimulus signal, CCA searches for the linear combination vectors ($\gamma_a$, $\gamma_b$) that maximize the correlation between $\alpha = \gamma_a^T A$ and $\beta = \gamma_b^T B$ by optimizing the following equation: The optimization problem in Eq. 11 can be solved by a generalized eigenvalue decomposition [32], and the maximum correlation coefficient (ρ) is computed for each $B_k$. Finally, the SSVEP target frequency is recognized as follows: A pre-processing procedure similar to that of PSDA is applied to the multi-channel EEG signal (i.e., A) in CCA as well.

Fig. 7. Flowchart of the signal processing stages of an SSVEP-based BCI using the proposed BIFBs.
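The harmonic-aware PSDA baseline described above can be sketched as a band-energy comparison. The band half-width, the number of harmonics, and the function names are assumptions of this sketch; the PSD values in the usage example are synthetic and only illustrate the scoring.

```python
def band_energy(psd, freqs, center, half_width):
    """Sum the PSD samples whose frequency lies within center +/- half_width."""
    return sum(p for p, f in zip(psd, freqs) if abs(f - center) <= half_width)


def psda_recognize(psd, freqs, targets, half_width=0.25, harmonics=2):
    """Score each target by its fundamental plus harmonic energy and pick the maximum."""
    scores = [sum(band_energy(psd, freqs, h * f_k, half_width)
                  for h in range(1, harmonics + 1))
              for f_k in targets]
    best = max(range(len(targets)), key=scores.__getitem__)
    return targets[best], scores


# Synthetic spectrum: flat background with peaks at 14 Hz and its harmonic at 28 Hz.
freqs = [6 + 0.5 * i for i in range(101)]  # 6 Hz to 56 Hz in 0.5 Hz steps
psd = [0.1] * len(freqs)
psd[freqs.index(14.0)] = 1.0
psd[freqs.index(28.0)] = 0.4
target, scores = psda_recognize(psd, freqs, targets=[8, 14, 28])
print(target)
```

Note that the second harmonic of 14 Hz coincides with the 28 Hz fundamental, so the 28 Hz class also collects some of that energy; summing fundamental and harmonic bands keeps the 14 Hz class ahead in this example.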

III. RESULTS AND DISCUSSION
The proposed BIFB method for SSVEP frequency recognition is tested on two datasets that include EEG recordings of eight subjects in 152 trials. The system performance is evaluated in terms of mean recognition time (MRT), recognition accuracy, and ITR by implementing a leave-one-out cross-validation methodology. It is worth noting that the ITR changes logarithmically with the number of available commands in Eq. 1. The number of commands in each dataset is different, and hence the ITRs need to be interpreted separately. The performance of the proposed algorithm is compared with three baseline methods, and the results are listed in Table II and Table III. The statistical significance of these results is examined by paired t-tests [33], and the corresponding p-values are presented in Table IV. No multiple-comparison correction is applied since the study is restricted to a small number of planned comparisons, and the results of individual tests are important [34].
The traditional PSDA approach requires longer time windows compared to the other three methods to provide sufficient accuracy, which leads to a longer MRT and a lower ITR. A shorter MRT not only improves the ITR but also diminishes the visual fatigue due to a reduced gazing duration. Also, PSDA, as well as CCA, is incapable of detecting stimuli in the high-frequency band. The low recognition accuracy for the 28 Hz stimulus, which is presented in Table V, explains the poor performance of these algorithms on Dataset-A. On the other hand, there are no high-frequency stimuli in Dataset-B, but the frequency selectivity decreases the ITR performances of PSDA, CCA, and UF.
PSDA and CCA have the advantage of not requiring training, and just a straightforward calibration that includes the selection of electrode locations, number of harmonics, and time window duration is sufficient to perform the recognition. However, these algorithms disregard the correlation information between the classes. A simple logistic regression model can capture the between-class information and enhance performance. Other classification models may achieve better performance; however, they are beyond the scope of this study, and the reader is referred to [35]-[37] for more detailed information. The SSVEP response is subject-specific, but the inter-trial variance is low within a subject. Therefore, one-time individualized training is acceptable to acquire a higher ITR. Furthermore, BIFB and UF implement the same classifier. However, a feature extraction stage with BIFBs, which capture the underlying biological features of SSVEPs, increases the separability and outperforms UF for SSVEP frequency recognition in both datasets.
User comfort is another important criterion in BCI design besides the ITR. It is reported that high frequencies cause less flicker-induced visual fatigue [16], [38]. The promising performance of BIFBs in the high-frequency band may let designers include this low-SNR band in their BCI systems. As a result, the user discomfort caused by the flicker is reduced, and the ITR increases due to the increase in the number of available commands. Furthermore, the number of electrodes is critical for user comfort. Although it is preferable to have a dense sensor system while mapping the brain, it is not suitable for practical BCI applications. In this study, BIFB utilized the information from one electrode for the sake of simplicity. The results show that a single-channel algorithm can provide superior performance compared to a multi-channel algorithm (i.e., CCA) and enhance user comfort as well.
However, the use of BIFBs is not restricted to single-channel utilization, and recognition accuracy might be further improved by taking advantage of multi-channel information in the feature extraction stage. For example, a simple way to utilize the BIFBs with multi-channel EEG would be to apply them on signals from the occipital channels and pass the weighted average of the extracted features to the feature classification stage.
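The multi-channel extension suggested above can be sketched as a weighted average of per-channel feature vectors. How the channel weights would be chosen is not specified in the text, so the weights, the function name, and the example values below are purely illustrative assumptions.

```python
def combine_channel_features(channel_features, weights=None):
    """Weighted average of equally sized feature vectors, one per EEG channel."""
    if not channel_features:
        raise ValueError("at least one channel is required")
    if weights is None:
        weights = [1.0] * len(channel_features)  # plain average by default
    if len(weights) != len(channel_features):
        raise ValueError("one weight per channel is required")
    dim = len(channel_features[0])
    if any(len(feats) != dim for feats in channel_features):
        raise ValueError("all channels must provide the same number of features")
    total = float(sum(weights))
    return [sum(w * feats[i] for w, feats in zip(weights, channel_features)) / total
            for i in range(dim)]


# Two hypothetical occipital channels with two features each.
print(combine_channel_features([[1.0, 2.0], [3.0, 4.0]], weights=[3.0, 1.0]))
```

The combined vector has the same length as a single-channel feature vector, so it can be passed to the classifier without changing the classification stage.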

IV. CONCLUSIONS
A novel SSVEP recognition method that exploits the inherent biological characteristics of SSVEPs is introduced in this paper. The BIFBs capture frequency selectivity, subject specificity, and harmonic SSVEP responses in the feature extraction stage and enhance the separability of classes. The proposed method is tested on two benchmark datasets available online and outperforms several established recognition algorithms. The BIFBs are particularly promising in the high-frequency band, where the SNR is low. Hence, this method not only increases the ITR of an SSVEP-based BCI but also might improve user comfort due to less visual fatigue. The results show the potential of bio-inspired design, and the findings will be extended to include further SSVEP characteristics. First, the best pulse shape to utilize in the filter banks remains unknown. The triangular filters in this study might need to be replaced with another shape, such as Gaussian or raised-cosine, to improve the performance further. Second, the BIFBs should incorporate the time characteristics of SSVEPs. The onset delay of the response is frequency selective [18], and including this distinct feature might increase the recognition accuracy as well. Last, the SSVEP response also strongly depends on the stimulus type [15], [39], and the BIFB adaptation considering the visual stimuli requires further investigation.
BCIs and their associated technologies will shape the future of communication, control, and security as a part of wireless body area networks (WBANs). To fully exploit and further increase the potential of these devices, it is necessary to employ an accurate model of the driving physiological signal in the recognition algorithm. Bio-inspired designs such as the proposed BIFBs will be key in enabling the development of reliable, efficient, and high-performance BCI systems.
Huseyin Arslan (S'95-M'98-SM'04-F'16) received the B.S. degree in electrical and electronics engineering from Middle East Technical University, Ankara, Turkey, in 1992, and the M.S. and Ph.D. degrees in electrical engineering from Southern Methodist University, Dallas, TX, USA, in 1994 and 1998, respectively. From January 1998 to August 2002, he was with the research group of Ericsson Inc., NC, USA, where he was involved with 2G and 3G wireless communication systems. He is currently a Professor of Electrical Engineering at the University of South Florida, Tampa, FL, USA, and the Dean of the College of Engineering and Natural Sciences at the İstanbul Medipol University, İstanbul, Turkey. His current research interests are on 5G and beyond, waveform design, advanced multiple accessing techniques, physical layer security, beamforming and massive MIMO, cognitive radio, dynamic spectrum access, interference management (avoidance, awareness, and cancellation), co-existence issues on heterogeneous networks, aeronautical (high altitude platform) communications, millimeter-wave communications and in vivo communications. He is currently a member of the editorial board for the IEEE Communications Surveys and Tutorials and the Sensors Journal. His research interests include deep machine learning theory and applications in semi-supervised and unsupervised settings, data-oriented applications of RFID systems in healthcare and food supply chains, and signal processing algorithms for brain-computer interfaces.