A Calibration-Free Hybrid BCI Speller System Based on High-Frequency SSVEP and sEMG

Hybrid brain-computer interface (hBCI) systems that combine steady-state visual evoked potential (SSVEP) and surface electromyography (sEMG) signals have attracted researchers' attention because they can significantly improve system performance. However, almost all existing studies build the hBCI on low-frequency SSVEP, which produces much more visual fatigue than high-frequency SSVEP. Therefore, the current study builds a hBCI based on high-frequency SSVEP and sEMG. With these two signals, this study designed and realized a 32-target hBCI speller system. The thirty-two targets were split down the middle into two groups. Each group contained 16 targets with different high-frequency visual stimuli (i.e., 31-34.75 Hz with an interval of 0.25 Hz). sEMG was utilized to choose the group and SSVEP was adopted to identify intra-group targets. The filter bank canonical correlation analysis (FBCCA) and the root mean square (RMS) methods were used to identify the signals. Because neither method requires training data, the proposed system allowed users to operate it without system calibration. A total of 12 healthy subjects participated in the online experiment, achieving an average accuracy of 93.52 ± 1.66% and an average information transfer rate (ITR) of 93.50 ± 3.10 bits/min. Furthermore, all 12 participants successfully completed the free-spelling tasks. These experimental results indicate the feasibility and practicality of the proposed hybrid BCI speller system.

I. INTRODUCTION
A brain-computer interface (BCI) aims to establish a direct communication channel between the brain and the external environment [1], [2]. Two important roles of BCIs in rehabilitation are to replace and to restore lost neurological function [3], [4]. When a BCI system acts as a replacement for lost neurological function, it can help people with language or motor difficulties regain their ability to communicate and control, for example by typing characters, controlling wheelchairs, or operating household electrical appliances. Alternatively, BCI systems can be used to restore lost neurological function. For example, controlling functional electrical stimulation (FES) through a BCI can promote neuroplasticity and functional recovery by activating the body's natural efferent and afferent pathways, thereby promoting motor learning and neural reorganization. According to how brain signals are collected, BCI systems can be divided into two categories: invasive BCIs [5], [6] and non-invasive BCIs [7]. Invasive BCIs provide high signal quality, which is conducive to high-precision decoding of brain signals. However, invasive methods have obvious drawbacks, such as surgical risk [8] and gradual degradation in the quality of the recorded signals [9]. Currently, various methods such as electroencephalography (EEG) [10], magnetoencephalography (MEG) [11], functional magnetic resonance imaging (fMRI) [12], and near-infrared spectroscopy (NIRS) [13] have been reported to monitor brain activity noninvasively and to build non-invasive BCIs [14]. Among them, EEG signals are widely used because they are non-invasive and low-cost and offer high temporal resolution [15], [16]. In recent years, EEG-based BCI paradigms have developed rapidly [15], [16], [17].
Furthermore, the performance of various BCI systems has been steadily improving. The highest information transfer rate (ITR) has reached more than 300 bits/min, a significant increase from the initial 20 bits/min [18], [19]. However, there is still a need to further improve performance to approach natural human-computer interaction [20].
Recently, studies have proposed that hybrid systems outperform single-modality systems [21], [22]. In particular, hybrid BCI (hBCI) systems that combine EEG and surface electromyography (sEMG) signals exhibit significantly improved system performance [23], [24], [25]. For example, Lin et al. used low-frequency (i.e., 6-11.6 Hz) steady-state visual evoked potential (SSVEP) and sEMG signals to develop a 60-target speller that achieved an ITR of 90.9 bits/min, significantly higher than the ITR of either single-modality system (sEMG: 30.7 bits/min, SSVEP: 60.2 bits/min) [24]. Rezeika et al. constructed a 30-target hBCI speller system based on low-frequency (i.e., 6.1-11.8 Hz) SSVEP and sEMG, and their results showed that the hybrid system is much faster than the single-modality system [26]. Chen et al. used low-frequency (i.e., 6 Hz, 8 Hz, 10 Hz and 15 Hz) SSVEP and sEMG to realize a non-invasive transhumeral prosthesis control method, in which SSVEP extended the control of hand movement to better meet the needs of daily life [27]. Davarinia and Maleki combined low-frequency (i.e., 5.88-11.11 Hz) SSVEP and EMG signals to build a system that predicts elbow angle trajectory, and their results showed that introducing the SSVEP signal can increase the robustness of the dual-modality structure [28]. Although the above studies have confirmed that hybrid systems based on SSVEP and sEMG can improve system performance, most of them adopt low-frequency SSVEPs. Low-frequency visual stimulation evokes a stronger response and thus facilitates signal detection [29], but it can also cause visual fatigue in subjects and reduce the comfort of the experiment [30]. Therefore, SSVEP-based hybrid systems need to be improved in terms of comfort.
Previous studies have demonstrated that flickering at a stimulus frequency above 30 Hz can relieve visual fatigue and improve visual comfort [30], [31], [32], [33]. Therefore, in this study, we introduced high-frequency SSVEP to build a hBCI system. By utilizing high-frequency SSVEP and sEMG signals, this study designed and implemented a speller with 32 targets. The thirty-two targets were split down the middle into two groups. Each group contained 16 targets with different high-frequency visual stimuli (i.e., 31-34.75 Hz with an interval of 0.25 Hz) [34]. sEMG was utilized to choose the group and SSVEP was adopted to determine the target stimulus within the group. The group was identified by calculating the root mean square (RMS) value of the sEMG signal. The filter bank canonical correlation analysis (FBCCA) method was used to detect SSVEPs. Fig. 1 shows the framework diagram of the system. Because neither method requires training data, the proposed system allowed users to operate it without system calibration. In this study, the feasibility of the system was verified by offline and online experiments.

II. METHODS

A. Subjects
Nineteen healthy participants (5 males and 14 females, aged 22-30 years) with normal or corrected-to-normal vision participated in the study. Ten of them took part in the offline experiment and 12 in the online experiment, with 3 participating in both. Several participants left the study because they graduated and were therefore excluded from the remainder of the experiment. During the experiment, the subjects sat in a comfortable chair 70 cm in front of the computer screen. Each subject signed an informed consent form prior to the experiment and received appropriate monetary compensation afterwards. This study was approved by the Institutional Review Board of Tsinghua University.

B. System Design
The proposed hybrid BCI speller system used two input signals, namely SSVEP and sEMG. The stimulation interface is shown in Fig. 2(A). The user interface was presented on a computer monitor (SAMSUNG C49HG90DMC, resolution: 3840 × 1080 pixels, refresh rate: 120 Hz). The interface contained a 4 × 8 stimulation matrix comprising 26 English characters and 6 symbols (i.e., backspace, enter, comma, exclamation mark, space, question mark). Each target measured 173 × 129 pixels, and the spacing between targets was 100 pixels. A dotted line in the center of the interface split the 32 targets into a left and a right group. Each group encoded its 16 targets with 16 different high-frequency visual stimuli (i.e., 31-34.75 Hz with an interval of 0.25 Hz). The stimulation frequency of each target is shown in Fig. 2(B). Flicker at the different stimulus frequencies was generated with the sampled sinusoidal stimulation method [35]. The sEMG signals were used to choose the desired group: the targets to the left of the dotted line were encoded by the sEMG signal produced by the flexion movement, and the targets to the right by the sEMG signal produced by the extension movement. A schematic diagram of the right-hand flexion and extension actions is shown in Fig. 3. The Psychophysics Toolbox Version 3 under MATLAB was used to present the visual stimuli.
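To illustrate how a sampled sinusoidal stimulus can be rendered on a 120 Hz monitor, the following Python sketch computes the per-frame luminance of one target. It assumes the commonly used form s(i) = 0.5 {1 + sin[2π f (i / R) + φ]}, where R is the refresh rate and i the frame index. The phase φ is not given in the text, so zero is assumed here, and the function and constant names are hypothetical. The study itself presented stimuli with the Psychophysics Toolbox under MATLAB, so this is only an illustration of the principle.

```python
import math

REFRESH_RATE_HZ = 120  # monitor refresh rate stated in the text
# 16 stimulation frequencies per group: 31 to 34.75 Hz in 0.25 Hz steps
FREQS_HZ = [31 + 0.25 * k for k in range(16)]


def luminance_sequence(freq_hz, duration_s, phase_rad=0.0, refresh_hz=REFRESH_RATE_HZ):
    """Per-frame luminance in [0, 1] for one flickering target.

    Assumed form: s(i) = 0.5 * (1 + sin(2*pi*f*i/R + phase)).
    The phase is an assumption; the text does not specify one.
    """
    n_frames = round(duration_s * refresh_hz)
    return [
        0.5 * (1.0 + math.sin(2.0 * math.pi * freq_hz * i / refresh_hz + phase_rad))
        for i in range(n_frames)
    ]


# Luminance of the 31 Hz target over one 1.8 s stimulation period
frames = luminance_sequence(FREQS_HZ[0], duration_s=1.8)
```

Because the luminance is sampled from a continuous sinusoid at every frame, the stimulus frequency does not have to divide the refresh rate, which is what makes the 0.25 Hz spacing possible.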

C. Offline Experimental Design
The offline experiment consisted of 12 blocks. Each block traversed all 32 targets in a random order determined by a program-generated random sequence. Each target was cued by a red square for 1 s before stimulation. During the cue, subjects shifted their attention to the cued target while preparing their wrist for the flexion/extension movement (i.e., flexion for the left group and extension for the right group). After that, all the stimuli flickered simultaneously on the monitor for 2 s, and the subject performed the corresponding flexion/extension movement. After the flickering ended, there was a 1 s rest period before the next cue. To limit eye movement artifacts, subjects were asked to avoid blinking as far as possible during the 2 s stimulation period. To reduce visual fatigue, subjects rested for a few minutes during the experiment as needed.

D. Online Experimental Design
The online experiment was divided into two parts, i.e., a cued-spelling task and a free-spelling task. The subject first completed 9 blocks of the cued-spelling task, followed by 3 blocks of the free-spelling task. As in the offline experiment, each block of the cued-spelling task traversed the 32 targets in random order. According to the offline optimization results, the system performed best when the target stimulation time was 1.8 s. Each target therefore took 2.8 s, comprising 1.8 s of stimulation and 1 s of attention switching. During the stimulation period, the subject focused on the cued target and performed the corresponding flexion or extension movement. The red square cue and the stimulation alternated: when the stimulation ended, the next red square appeared immediately, and the recognition result of the previous target was fed back to the subject by a short beep if the recognition was correct. The free-spelling task required participants to spell out "HIGH SPEED BCI!". In the free-spelling task, visual feedback replaced the beep (i.e., the identified target was displayed at the top of the interface). Each trial again lasted 2.8 s, i.e., 1.8 s for stimulus presentation and 1 s for attention switching. Subjects were allowed to delete misspelled characters using "backspace".

E. Data Acquisition
The SSVEP data were collected with a Neuroscan system using a 64-channel Ag/AgCl cap arranged according to the extended International 10-20 system, at a sampling frequency of 1000 Hz. Only 9 channels over the occipital region (i.e., Oz, O1, O2, POz, PO5, PO3, PO4, PO6, and Pz) were recorded. The REF electrode between Cz and CPz served as the reference electrode, and the GND electrode midway between Fz and FPz served as the ground electrode. The electrode impedances were kept below 10 kΩ. The sEMG data were recorded using patch electrodes connected to a differential amplifier of NeuroScan's Synamps2 system. The patch electrodes collected surface signals over the flexor carpi ulnaris and extensor carpi radialis longus of the arm, as shown in Fig. 4. Prior to recording, the subject's arm was wiped with alcohol to reduce electrical impedance and improve the contact between the electrodes and the skin.

F. Amplitude and SNR of SSVEP
Two-second data segments from the offline experiment were extracted to analyze the amplitude spectrum and SNR of the SSVEP signals. The frequency spectrum of the SSVEP data was computed with the fast Fourier transform (FFT). Because the selected data length was 2 s, the frequency resolution of the FFT was 0.5 Hz, which did not match the stimulation frequency interval of 0.25 Hz. Therefore, the data were extended to 4 s by appending zeros to the end of each segment, so that the frequency resolution matched the stimulus frequency interval [36]. The SNR is an important measure in SSVEP studies [37]. The SNR at frequency f is the ratio of the SSVEP amplitude at f to the average amplitude at the q surrounding frequencies, expressed in decibels:

$$\mathrm{SNR}(f) = 20\log_{10}\frac{y(f)}{\frac{1}{q}\sum_{k=1}^{q/2}\left[y(f - k\Delta f) + y(f + k\Delta f)\right]} \tag{1}$$

where y(f) is the FFT amplitude at frequency f and Δf is the frequency resolution of the spectrum. In this study, q was set to 10.
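The zero-padded spectrum and the SNR measure described above can be sketched in Python with NumPy as follows. This is a minimal illustration rather than the analysis code used in the study: the function names are hypothetical, the SNR is taken as the amplitude at f divided by the mean amplitude of the q nearest bins (q/2 on each side) and converted to decibels, and the test signal is synthetic.

```python
import numpy as np


def amplitude_spectrum(x, fs, pad_to_s):
    """Single-sided FFT amplitude spectrum after zero-padding x to pad_to_s seconds."""
    n_fft = int(round(pad_to_s * fs))
    spectrum = np.fft.rfft(x, n=n_fft)      # rfft appends zeros up to n_fft samples
    amp = 2.0 * np.abs(spectrum) / len(x)   # scale by the original data length
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)
    return freqs, amp


def snr_db(freqs, amp, f, q=10):
    """Amplitude at f over the mean of the q neighbouring bins (q/2 per side), in dB."""
    k = int(np.argmin(np.abs(freqs - f)))
    half = q // 2
    neighbours = np.concatenate((amp[k - half:k], amp[k + 1:k + half + 1]))
    return 20.0 * np.log10(amp[k] / neighbours.mean())


# Synthetic 2 s segment at 1000 Hz: a 33 Hz sinusoid in white noise (not real EEG)
fs = 1000
t = np.arange(2 * fs) / fs
rng = np.random.default_rng(0)
x = np.sin(2 * np.pi * 33.0 * t) + rng.standard_normal(t.size)
freqs, amp = amplitude_spectrum(x, fs, pad_to_s=4.0)  # 4 s gives 0.25 Hz bins
snr_33 = snr_db(freqs, amp, 33.0)
```

Zero-padding interpolates the spectrum rather than adding information, so the bins immediately next to a peak partly contain leakage from the peak itself.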

G. Target Recognition of SSVEP
The FBCCA algorithm is an improvement over the canonical correlation analysis (CCA) algorithm. It makes better use of harmonic information by introducing filter bank analysis [38]. FBCCA proceeds as follows: the EEG signals are decomposed into M subbands using a band-pass filter bank; CCA is then applied to each subband to compute the correlation coefficient between the subband component and the reference signals of each stimulus frequency; finally, the correlation coefficients of the M subbands are combined by a weighted sum, and the stimulus frequency with the maximum combined value is taken as the recognized target frequency.
CCA measures the linear correlation between two multidimensional variables. Its core is the optimization problem in equation (2): the signals X and Y are linearly combined to find the weight vectors $W_X$ and $W_Y$ that maximize the correlation coefficient between $x = X^T W_X$ and $y = Y^T W_Y$:

$$\rho(x, y) = \max_{W_X, W_Y} \frac{E\left[W_X^T X Y^T W_Y\right]}{\sqrt{E\left[W_X^T X X^T W_X\right] E\left[W_Y^T Y Y^T W_Y\right]}} \tag{2}$$

where X represents the multi-channel EEG data and Y is the reference signal, which consists of sine and cosine signals at the stimulus frequency $f_i$ ($i = 1, 2, \ldots, 16$) and its harmonics. Here, $Y_{f_i}$ is defined as

$$Y_{f_i} = \begin{bmatrix} \sin(2\pi f_i n) \\ \cos(2\pi f_i n) \\ \vdots \\ \sin(2\pi M_h f_i n) \\ \cos(2\pi M_h f_i n) \end{bmatrix}, \quad n = \frac{1}{f_s}, \frac{2}{f_s}, \ldots, \frac{N}{f_s} \tag{3}$$

where $M_h$ is the number of harmonics, N is the number of sample points, and $f_s$ is the sampling rate. After analysis, $M_h = 2$ was found to be the most reasonable choice. X and Y have the same data length.
In FBCCA, the filter bank extracts the subband components, and the correlation coefficient between each component and the reference signal of each stimulus frequency is computed separately on each subband. The canonical correlation coefficient between the m-th subband component $X_m$ and the reference signal $Y_{f_i}$ is

$$\rho_i^m = \rho\left(X_m^T W_X(X_m, Y_{f_i}),\; Y_{f_i}^T W_Y(X_m, Y_{f_i})\right) \tag{4}$$

The weight of the m-th subband is denoted $\omega_m$, $m \in [1, M]$. The combined correlation between the multi-channel EEG data X and $Y_{f_i}$ is then computed, following the standard FBCCA formulation [38], as

$$\rho_i = \sum_{m=1}^{M} \omega_m \left(\rho_i^m\right)^2 \tag{5}$$

$\rho_i$ is used as the feature for frequency identification. Finally, the frequency corresponding to the maximum $\rho_i$ is selected as the frequency of the SSVEP signal and output as the recognition result $f_{\mathrm{target}}$:

$$f_{\mathrm{target}} = \arg\max_{f_i} \rho_i, \quad i = 1, 2, \ldots, 16 \tag{6}$$
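The FBCCA procedure described in this subsection can be sketched in Python with NumPy and SciPy as follows. This is an illustrative sketch under several assumptions, not the implementation used in the study: the filter type and order (4th-order Butterworth, zero-phase) are not given in the text, the subband correlations are squared before weighting as in the standard FBCCA formulation [38], CCA is computed through a QR/SVD route, all function names are hypothetical, and the input is synthetic data rather than EEG. The passbands (29-90 Hz and 59-90 Hz), the weights (0.6 and 0.4), the two harmonics, and the 1.8 s data length are the values reported in this paper.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt


def reference_signals(f, n_samples, fs, n_harmonics=2):
    """Sine/cosine references at f and its harmonics, shape (2 * n_harmonics, n_samples)."""
    t = np.arange(1, n_samples + 1) / fs
    rows = []
    for h in range(1, n_harmonics + 1):
        rows.append(np.sin(2 * np.pi * h * f * t))
        rows.append(np.cos(2 * np.pi * h * f * t))
    return np.vstack(rows)


def cca_max_corr(X, Y):
    """Largest canonical correlation between X (channels x samples) and Y (refs x samples)."""
    Xc = (X - X.mean(axis=1, keepdims=True)).T  # samples x channels, zero mean
    Yc = (Y - Y.mean(axis=1, keepdims=True)).T
    Qx, _ = np.linalg.qr(Xc)  # orthonormal basis of each column space
    Qy, _ = np.linalg.qr(Yc)
    # The singular values of Qx^T Qy are the canonical correlations.
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return float(min(s[0], 1.0))


def fbcca(X, fs, freqs, passbands, weights, n_harmonics=2, order=4):
    """Return (detected frequency, per-frequency score) for X of shape (channels, samples)."""
    rho = np.zeros(len(freqs))
    for (low, high), w in zip(passbands, weights):
        # Butterworth band-pass is an assumption; the text does not name the filter type.
        sos = butter(order, [low, high], btype="bandpass", fs=fs, output="sos")
        Xm = sosfiltfilt(sos, X, axis=1)  # zero-phase filtering of every channel
        for i, f in enumerate(freqs):
            Y = reference_signals(f, X.shape[1], fs, n_harmonics)
            rho[i] += w * cca_max_corr(Xm, Y) ** 2  # squared, as in standard FBCCA
    return float(freqs[int(np.argmax(rho))]), rho


# Synthetic check: 9 noisy channels carrying a 33 Hz response and its second harmonic.
fs = 1000
freqs = np.arange(31.0, 35.0, 0.25)  # 31 to 34.75 Hz in 0.25 Hz steps
t = np.arange(round(1.8 * fs)) / fs
rng = np.random.default_rng(0)
signal = np.sin(2 * np.pi * 33.0 * t) + 0.4 * np.sin(2 * np.pi * 66.0 * t)
X = signal + 0.5 * rng.standard_normal((9, t.size))
detected, scores = fbcca(X, fs, freqs, passbands=[(29, 90), (59, 90)], weights=[0.6, 0.4])
```

No step in this sketch uses training data from the subject, which is the property that allows the SSVEP branch to run without calibration.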

H. Target Recognition of sEMG
Significant differences in the sEMG signals of the extensor carpi radialis longus and the flexor carpi ulnaris were observed during flexion/extension. The RMS value reflects the level of muscle activity, as shown in Fig. 5, which presents the sEMG processing results of all subjects. The RMS values of the two channels were calculated separately to reflect the activity of the extensor carpi radialis longus and the flexor carpi ulnaris. The RMS value of the j-th channel is

$$\mathrm{RMS}_j = \sqrt{\frac{1}{N}\sum_{i=1}^{N} x_i^2} \tag{7}$$

where N is the number of sample points and $x_i$ is the time-domain amplitude of the i-th sample. The subject's movement is determined by comparing the RMS values of the two channels. If $\mathrm{RMS}_1 < \mathrm{RMS}_2$, the movement is judged to be flexion and the target is assigned to the left group. Otherwise, the movement is judged to be extension and the target is assigned to the right group.
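The RMS comparison rule above, together with the way the sEMG group and the SSVEP frequency jointly select one of the 32 targets, can be sketched in Python as follows. The decision rule (flexion and the left group when the first channel's RMS is smaller than the second's) follows the text; the function names and the target indexing scheme (left group 0-15, right group 16-31) are hypothetical and only serve the illustration.

```python
import math


def rms(samples):
    """Root mean square of a sequence of time-domain samples."""
    return math.sqrt(sum(v * v for v in samples) / len(samples))


def classify_group(channel_1, channel_2):
    """'left' (flexion) if RMS_1 < RMS_2, otherwise 'right' (extension)."""
    return "left" if rms(channel_1) < rms(channel_2) else "right"


def select_target(group, freq_index, targets_per_group=16):
    """Combine the sEMG group and the SSVEP frequency index into one target index.

    The indexing (left: 0-15, right: 16-31) is a hypothetical convention.
    """
    offset = 0 if group == "left" else targets_per_group
    return offset + freq_index


# Toy samples: channel 2 is far more active than channel 1, so the rule returns 'left'
group = classify_group([0.1, -0.1, 0.1, -0.1], [1.0, -1.0, 1.0, -1.0])
target = select_target(group, freq_index=3)
```

Since the rule compares the two channels against each other instead of against a fixed threshold, it needs no per-subject threshold estimation.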

III. RESULTS

A. Amplitude and SNR of SSVEP
The grand-average amplitude spectrum and SNR of the SSVEP signal recorded at the Oz electrode for the 33 Hz stimulus were calculated, and the results are shown in Fig. 6. Both the amplitude spectrum and the SNR show clear peaks at 33 Hz (0.30 µV, 6.39 dB) and 66 Hz (0.13 µV, 4.37 dB). Fig. 7 shows how the mean amplitude spectrum and SNR vary with stimulation frequency and response frequency. There are clear peaks at both the fundamental frequency and the second harmonic, and the peak at the second harmonic is markedly lower than that at the fundamental frequency. Therefore, the number of subbands M was set to 2, and the signals at the fundamental frequency and the second harmonic were selected for analysis and processing.

B. Offline SSVEP Detection
The results of the SNR analysis provided a reference for the design of the filter bank. The starting frequency of the m-th subband is m × 31 Hz, and the cut-off frequency of every subband is 90 Hz. When designing the band-pass filters, an additional bandwidth of 2 Hz was added at the starting frequency to ensure that the fundamental frequency is not distorted. The passband of the first subband is 29-90 Hz, and the passband of the second subband is 59-90 Hz. In addition, the parameters of the system (the weight vector of the subband components and the data length) were optimized on the offline data to further improve performance, using ITR as the optimization criterion. For simplicity, the parameters were optimized by grid search. The weight coefficients [$\omega_1$ $\omega_2$] were each restricted to the range 0-1 in steps of 0.1, subject to $\omega_1 + \omega_2 = 1$. The data length was varied from 0.2 to 2 s in steps of 0.2 s, and the ITR was calculated at each step. Fig. 8 shows the ITR for different data lengths and weights based on the SSVEP signals. According to the grid search results, the SSVEP-based BCI performed best with $\omega_1 = 0.6$, $\omega_2 = 0.4$, and a data length t = 1.8 s, giving a maximum ITR of 65.83 ± 3.95 bits/min. The classification accuracy and ITR under the optimal weight coefficients (i.e., [$\omega_1$ $\omega_2$] = [0.6 0.4]) for different data lengths are shown in Fig. 9.
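The grid search described above can be sketched in Python as follows. The grids follow the text (weights on a 0.1 grid that sum to 1, data lengths from 0.2 to 2 s in 0.2 s steps), while the function names are hypothetical. The objective `toy_itr` is a placeholder with no physical meaning, used only so that the sketch runs; in practice it would be replaced by the mean offline ITR obtained with FBCCA for the given parameters.

```python
def weight_grid(step=0.1):
    """All (w1, w2) pairs on the given grid with w1 + w2 = 1."""
    n = round(1 / step)
    return [(k / n, (n - k) / n) for k in range(n + 1)]


def data_length_grid(start=0.2, stop=2.0, step=0.2):
    """Data lengths in seconds from start to stop inclusive."""
    n = round((stop - start) / step)
    return [round(start + k * step, 10) for k in range(n + 1)]


def grid_search(evaluate_itr):
    """Return (best_itr, (w1, w2), t) maximising evaluate_itr(w1, w2, t)."""
    best = None
    for w1, w2 in weight_grid():
        for t in data_length_grid():
            itr = evaluate_itr(w1, w2, t)
            if best is None or itr > best[0]:
                best = (itr, (w1, w2), t)
    return best


def toy_itr(w1, w2, t):
    """Placeholder objective peaking at w1 = 0.6, t = 1.8 s; not measured data."""
    return -((w1 - 0.6) ** 2) - (t - 1.8) ** 2


best_itr, best_weights, best_length = grid_search(toy_itr)
```

With 11 weight pairs and 10 data lengths the search evaluates only 110 combinations, so an exhaustive search is inexpensive.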

C. Offline sEMG Detection
This study analyzed the average classification accuracy of sEMG for different data lengths (see Fig. 9). As shown in Fig. 9, the classification accuracy first increases and then stabilizes. For consistency with the optimized SSVEP data length, a 1.8 s sEMG data length was adopted, for which the classification accuracy reached 98.93 ± 0.41%.

D. Offline Performance of The Entire hBCI System
The performance of the entire hBCI depends on the detection performance of both SSVEP and sEMG. As shown in Fig. 9, the hybrid system achieves optimal performance with a data length of 1.8 s. The highest ITR is 83.12 ± 4.66 bits/min, and the corresponding classification accuracy is 87.42 ± 2.87%. Since the accuracy of sEMG detection is almost 100%, the performance of the entire hBCI relies heavily on the classification performance of the SSVEP signal. As shown in Fig. 9, the classification accuracy of the entire hBCI system is similar to that of the SSVEP-based BCI system.
In addition, one-way ANOVA was performed to examine the classification accuracy of sEMG under different stimulus frequencies (F(15,144) = 0.71, P = 0.78) and of SSVEP under flexion and extension (F(1,18) = 0.42, P = 0.52). The results showed that neither signal significantly influenced the classification of the other. Finally, the correlation between sEMG and SSVEP classification accuracy was analyzed (ρ = 0.18), and the results also showed no correlation between them.
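As a consistency check on the reported statistics, the degrees of freedom match a design in which each of the 10 offline subjects contributes one accuracy value per condition. This is an inference from the reported values rather than something stated in the text. Under that assumption, with k groups and n observations in total:

```latex
\begin{aligned}
\text{sEMG across 16 frequencies:}\quad
  & df_{\text{between}} = k - 1 = 16 - 1 = 15,
  & df_{\text{within}} = n - k = 160 - 16 = 144, \\
\text{SSVEP across 2 movements:}\quad
  & df_{\text{between}} = k - 1 = 2 - 1 = 1,
  & df_{\text{within}} = n - k = 20 - 2 = 18.
\end{aligned}
```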

E. Online Performance of The Entire hBCI System
The optimal parameters obtained from the offline data analysis were used to build an online system for verification. Online results for all subjects are shown in Table I. The online system achieved a classification accuracy of 93.52 ± 1.66% and an average ITR of 93.50 ± 3.10 bits/min. Table II shows the results of the free-spelling tasks. All subjects successfully completed the spelling task.
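The text does not spell out the ITR formula. Assuming the standard definition widely used in the BCI literature, ITR = (60 / T) [log2 N + P log2 P + (1 - P) log2((1 - P) / (N - 1))], with N targets, accuracy P, and selection time T in seconds, the reported figures can be checked with the short Python sketch below. The function name is hypothetical.

```python
import math


def itr_bits_per_min(n_targets, accuracy, selection_time_s):
    """Standard (Wolpaw) ITR in bits/min for an accuracy in (0, 1]."""
    if not 0.0 < accuracy <= 1.0:
        raise ValueError("accuracy must be in (0, 1]")
    bits = math.log2(n_targets)
    if accuracy < 1.0:
        bits += accuracy * math.log2(accuracy)
        bits += (1.0 - accuracy) * math.log2((1.0 - accuracy) / (n_targets - 1))
    return bits * 60.0 / selection_time_s


# 32 targets, mean online accuracy 93.52%, 2.8 s per selection (1.8 s + 1 s)
itr_at_mean_accuracy = itr_bits_per_min(32, 0.9352, 2.8)
```

Evaluated at the mean accuracy, this gives roughly 92.8 bits/min, slightly below the reported 93.50 bits/min. A small gap of this kind is expected if the reported value is the mean of per-subject ITRs, because the formula is convex in accuracy.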

IV. DISCUSSION
In this study, a 32-target hBCI speller system was realized by combining high-frequency SSVEP and sEMG. This was the first time that high-frequency stimulation was introduced into a hybrid BCI based on SSVEP and sEMG. The ITR reached 93.50 ± 3.10 bits/min, the highest level among hybrid BCIs based on high-frequency SSVEP and sEMG. The results of the two online tasks (cued-spelling task and free-spelling task) verify the effectiveness of the proposed system, which can be used to communicate effectively with the outside world.
Currently, the majority of existing hybrid BCIs based on SSVEP and sEMG adopt low-frequency stimuli to elicit SSVEPs. Although low-frequency stimulation can induce a stronger SSVEP response [39], [40], it can cause visual fatigue in subjects. In contrast, high-frequency stimulation can alleviate fatigue [41]. Although the classification accuracy and ITR of SSVEPs induced by high-frequency stimulation are lower than those of low-frequency stimulation, the classification accuracy of a high-frequency stimulation system can exceed 80%, which meets the practical standard. In terms of SNR, low-frequency and high-frequency stimuli produced almost the same level. Therefore, to balance the performance and comfort of the system, it is necessary to study high-frequency stimulation systems. Hence, high-frequency visual stimulation (i.e., 31-34.75 Hz) was used in this study to induce SSVEPs, and the FBCCA algorithm, which can effectively extract harmonic information, was used to identify the signals. Fig. 6 and Fig. 7 show that the SNR of the SSVEP has an obvious peak at the fundamental and harmonic frequencies, which supports the feasibility of using harmonic information in FBCCA for frequency recognition. Table I shows that the hBCI system obtained an average accuracy of 93.52 ± 1.66%. In general, a BCI system with an accuracy above 80% is considered feasible and capable of effective communication [42]. In addition, the results demonstrate the effectiveness of FBCCA in a high-frequency stimulation system. At present, there are few studies on high-frequency SSVEPs, and this study can provide a useful foundation for high-frequency SSVEP research.
Additionally, sEMG detection is also very important for hybrid BCIs. For sEMG signal analysis, Lin et al. [24] and Chai et al. [43] extracted the envelope of the sEMG signals and then analyzed it using a threshold algorithm. In contrast, the current study directly uses the RMS value of the time-domain signal as the feature for identification, which makes the algorithm fairly simple. The sEMG classification accuracy reached 98.93% (see Fig. 9), which confirms that this method is feasible.
As far as we know, the ITR of the hybrid system proposed in this paper (93.50 bits/min) is the highest among the reported hybrid BCIs built on SSVEP and sEMG. There are two main reasons for this. On the one hand, adding sEMG signals to construct the hBCI system doubled the number of targets (from 16 to 32). On the other hand, the FBCCA algorithm was used to identify the target. Compared with the CCA algorithm, FBCCA improves the classification accuracy for the same data length [38], which shortens the target stimulation time and further improves the ITR of the system. By comparison, the system designed by Lin et al. had 60 targets, far more than the system proposed in this paper, but the recognition performance of the CCA algorithm is far lower than that of the FBCCA algorithm [38]. Therefore, its ITR (90.9 bits/min) is lower than that of the system proposed in this study. Similarly, in Rezeika et al. [26], although the identification accuracy of the Minimum Energy Combination (MEC) method reached 100%, the time required for each command was 3-8 s, much longer than that required by our proposed system, resulting in a low ITR (37.37 bits/min). In Chai et al. [43], the CCA algorithm was used to identify targets, and the classification accuracy reached 100% when the data length was 4 s. However, that system had only 8 commands, far fewer than our system, so its ITR (45 bits/min) is lower than that of the system proposed in this research. In addition, the ratio between the weight coefficients of the fundamental frequency and the second harmonic in FBCCA affects the classification of the target. When the data length is constant, the ITR first rises and then decreases as the weight coefficient $\omega_1$ increases (Fig. 8), which is related to the fact that the SNR is strongest at the fundamental frequency and decreases as the harmonic number increases. Therefore, we optimized the weight coefficients and data length on the offline experimental data and built the online system with the optimized parameters to achieve the best system performance.
The present system used the unsupervised FBCCA method to identify SSVEP signals, and sEMG signals were classified according to their average amplitude. Therefore, the entire system allowed users to operate it without system calibration. Although the performance of the proposed system has been improved, the system needs to be further refined before it can be applied in real life. Firstly, more efficient frequency recognition algorithms could be used to shorten the stimulation time of the SSVEP signals and thereby improve the ITR. Secondly, due to individual differences, the target recognition accuracy differs between subjects, so the number of selections needed to complete the free-spelling tasks also differs. In the future, the parameters could be optimized for each subject to reach optimal system performance. Thirdly, the subjects used the system only for a short period of time, and no effect of muscle fatigue on system performance was found. In future work, we will further investigate whether muscle fatigue caused by prolonged use affects the performance of the system. Finally, this study tested and validated the system with healthy subjects. In future work, we will attempt to further verify the system with motor-impaired patients.

V. CONCLUSION
The aim of this study was to use high-frequency SSVEP to build a hybrid BCI system. We designed and realized a 32-target hybrid BCI speller system. Two predefined wrist movements corresponded to two groups, each of which consisted of a 16-target high-frequency SSVEP-based BCI. The sEMG signal was utilized to select the group containing the target stimulus, and high-frequency SSVEPs were used to choose the target stimulus within the group. sEMG and SSVEP signals were identified by time-domain analysis and FBCCA, respectively. This allowed users to operate the proposed system without system calibration. The system was tested in healthy subjects and achieved an average accuracy of 93.52 ± 1.66% and a mean ITR of 93.50 ± 3.10 bits/min. This was the highest ITR among hBCIs based on SSVEP and sEMG, and the high-frequency stimulation was used to relieve participant fatigue. Both the online and offline results demonstrate the effectiveness of the proposed system and lay a foundation for research on hBCIs based on sEMG and SSVEP.