A Spectrally-Dense Encoding Method for Designing a High-Speed SSVEP-BCI With 120 Stimuli

The practical functionality of a brain-computer interface (BCI) is critically affected by the number of stimuli, especially for steady-state visual evoked potential based BCI (SSVEP-BCI), which shows promise for the implementation of a multi-target system for real-world applications. Joint frequency-phase modulation (JFPM) is an effective and widely used method in modulating SSVEPs. However, the ability of JFPM to implement an SSVEP-BCI system with a large number of stimuli, e.g., over 100 stimuli, remains unclear. To address this issue, a spectrally-dense JFPM (sJFPM) method is proposed to encode a broad array of stimuli, which modulates the low- and medium-frequency SSVEPs with a frequency interval of 0.1 Hz and triples the number of stimuli in conventional SSVEP-BCI to 120. To validate the effectiveness of the proposed 120-target BCI system, an offline experiment and a subsequent online experiment testing 18 healthy subjects in total were conducted. The offline experiment verified the feasibility of using sJFPM in designing an SSVEP-BCI system with 120 stimuli. Furthermore, the online experiment demonstrated that the proposed system achieved an average performance of 92.47 ± 1.83% in online accuracy and 213.23 ± 6.60 bits/min in online information transfer rate (ITR), where more than 75% of the subjects attained an accuracy above 90% and an ITR above 200 bits/min. The present study demonstrates the effectiveness of sJFPM in elevating the number of stimuli to more than 100 and extends our understanding of encoding a large number of stimuli by means of finer frequency division.


I. INTRODUCTION
A brain-computer interface (BCI) offers a direct communication path between the brain and the outside world by translating the brain measurements associated with sensation, perception and cognition into commands or objective reports [1]. The BCI technology can be broadly categorized into invasive and non-invasive paradigms; invasive BCI is emerging in clinical applications and non-invasive BCI expands the scope to non-clinical daily applications. Among the non-invasive paradigms, steady-state visual evoked potential based BCI (SSVEP-BCI) [2], [3] is widely used in research along with its counterparts of P300-based BCI and motor imagery BCI. Compared with its counterparts, the SSVEP-BCI usually has a lower BCI-illiterate rate [4] and a higher information transfer rate (ITR) [3], which are attributed to the high signal-to-noise ratio (SNR) of SSVEP. Physiologically, SSVEP is a time-locked and frequency-tagged brain response elicited by flickers or checkerboards alternating at a certain stimulus frequency. The frequency-tagged attribute of SSVEP makes it a prime candidate for channel encoding, where the stimulus of each target can be efficiently encoded by the widely used joint frequency-phase modulation (JFPM) [3], [5], [6], [7], [8], [9], [10], [11].

The encoding approach of JFPM is critical in implementing a high-speed SSVEP-BCI and has significant implications in the development of the technology as well. Inspired by frequency-division multiple access (FDMA) in the communication system [12], JFPM configures each stimulus with a stimulation frequency that is equally spaced within a frequency interval. To further increase the separation ability between stimuli, an initial phase is added in the modulation and usually adjacent stimuli have distinct phase information.
This joint modulation offers a high discriminability between stimuli for a short data length, which is a major advantage and critically important to implementing a high-performance

This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/

number of stimuli and BCI performance. Although impressive progress has been achieved, this area of research remains in its infancy, with many issues that await further investigation. For instance, the average performances reported in prior works cannot simultaneously achieve high accuracy and high ITR. Additionally, it remains poorly understood whether the widely used JFPM is capable of implementing an SSVEP-BCI system with over 100 stimuli.

To address these issues, this study utilized JFPM to encode a large number of stimuli and validate a 120-target SSVEP-

theoretically sound to encode such a high volume of stimuli in the SSVEP spectrum. Under this assumption, a spectrally-dense JFPM (sJFPM) is hereby proposed by efficiently tagging a low- and medium-frequency band with a frequency interval of 0.1 Hz. A state-of-the-art task-related component analysis (TRCA) was then adopted in the target recognition. To the knowledge of the authors, this is the first study that expands the number of stimuli encoded by JFPM to over one hundred, which is considered a challenging problem by the previous study [19]. To validate the proposed system, an offline experiment was designed at first to verify the effectiveness of the proposed system and optimize the system parameters. In a further attempt to identify the ground-truth performance of the system, an online experiment testing 13 healthy subjects was then performed.

This study recruited 18 graduate students as healthy volunteers (eight males and ten females).
The age of the subjects ranged from 23 to 28 with an average of 23.9 ± 1.6 years (mean ± standard deviation). Twelve of them participated in the offline experiment and 13 participated in the online experiment. Seven subjects participated in both experiments. All subjects were right-handed and had normal or corrected-to-normal vision. This study was approved by the institutional review board of Tsinghua University (No. 20200020), and all subjects signed informed consent before the experiment.

An SSVEP-BCI brain speller was designed in this study with 120 stimuli, which were arranged in a 6 × 20 matrix. Based on JFPM, the frequency and initial phase information was encoded as follows:

where $i$ ($j$) is the row (column) index of the stimuli, and $\Delta f$ ($\Delta\Phi$) is the frequency (initial phase) interval that starts with the lower limit $f_0$ ($\Phi_0$).
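The encoding rule described above can be illustrated with a minimal sketch of a generic JFPM code table, in which each stimulus receives a frequency on an evenly spaced grid and a linearly increasing initial phase. Several details are assumptions made only for illustration and are not given in this text: the function name `jfpm_table`, the column-major ordering of the linear index, and the example values of the lower frequency limit, initial phase and phase interval.

```python
import math


def jfpm_table(n_rows, n_cols, f0, df, phi0, dphi):
    """Return {(i, j): (frequency_hz, phase_rad)} for 1-based row i and column j.

    Generic JFPM rule: frequency and phase both grow linearly with a
    linear stimulus index k. The column-major ordering of k is an
    ASSUMPTION of this sketch, not a detail confirmed by the paper text.
    """
    table = {}
    for j in range(1, n_cols + 1):
        for i in range(1, n_rows + 1):
            k = (j - 1) * n_rows + (i - 1)  # assumed column-major index
            freq = round(f0 + k * df, 6)    # rounding hides float drift
            phase = (phi0 + k * dphi) % (2 * math.pi)
            table[(i, j)] = (freq, phase)
    return table


# Illustrative values only: the 6 x 20 layout and the 0.1 Hz interval come
# from the text; f0, phi0 and dphi below are hypothetical placeholders.
codes = jfpm_table(6, 20, f0=12.0, df=0.1, phi0=0.0, dphi=0.35 * math.pi)
freqs = sorted(f for f, _ in codes.values())
print(len(codes), freqs[0], freqs[-1])
```

With any choice of lower limit, a 0.1 Hz interval over 120 stimuli spans 11.9 Hz, which is why the scheme needs a contiguous low- and medium-frequency band.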
In conventional JFPM, the lower limit $f_0$ is configured to

is forming a 40-target SSVEP-BCI speller [3], [6], [24]. The

This study recorded nine channels of EEG data using SynAmps2 (Neuroscan Inc., Charlotte, USA) for both offline and online experiments. The nine channels were from the classical occipital montage [24] in the international 10-20 system, i.e., Pz, POz, Oz, PO3/PO4, PO5/PO6 and O1/O2, which were also used for online analysis in target recognition. The impedances of the channels were maintained below 20 kΩ and the reference channel was set at Cz. The sampling rate was set at 1000 Hz, and EEG data were synchronized to the event triggers of the visual stimuli via a parallel port. EEG data were acquired in an electromagnetic shielding room to reduce environmental noise, and the power-line interference was removed by a hardware notch filter. The data were then downsampled to 250 Hz for offline and online analysis.

This study used a state-of-the-art task-related component analysis (TRCA) [6] for target recognition. The performance of the proposed BCI system in target recognition was evaluated using classification accuracy and information transfer rate (ITR). The metric of ITR in bits per min (bits/min) is defined as [27]:

$$\mathrm{ITR} = \frac{60}{T}\left[\log_2 M + P\log_2 P + (1-P)\log_2\frac{1-P}{M-1}\right]$$

where $M$ is the number of stimuli, $P$ the classification accuracy, and $T$ (in seconds) the overall time for target selection and gaze shift. A gaze shift time of 1 s was used to calculate the ITR in both offline and online analyses.

1) Offline Analysis: First, parameter selection was conducted to determine the optimal number of sub-bands ($N_{sb}$) and stimulation duration ($T_s$). Based on the sJFPM scheme, the filter bank was designed with sub-bands ranging from $m \times 12$ Hz to 90 Hz, where $m$ is the index of the sub-band, which ranged from 1 to $N_{sb}$.
The stimulation duration $T_s$ was determined by a sliding window whose onset was set at $t_s + d$, where $t_s$ is the starting time point of the stimulation and $d$ is the latency, including the visual delay and system delay, which was set to 140 ms [3]. In the parameter selection, $N_{sb}$ was varied from 1 to 7 and $T_s$ from 0.2 s to 2 s with a step of 0.1 s. The parameters that yielded the best BCI performance were used in the online experiment.

for each stimulus frequency using the following formula [9]:

where $P(f)$ is the spectral power for the frequency $f$, and

interval, there were $10 \cdot \delta$ subsets that comprised $N_c$ stimuli, and an $N_c$-target classification was performed on each subject, where $N_c = 12 \cdot \delta^{-1}$. The performance of classification was then evaluated by the maximum average ITR across subsets, blocks and subjects.
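The subset construction described above can be sketched as follows, assuming the 120 stimuli are indexed in ascending frequency on a 0.1 Hz grid, so that a coarser interval δ keeps every (10 · δ)-th stimulus. The function name `interval_subsets` and the zero-based index convention are illustrative choices, not taken from the study.

```python
def interval_subsets(n_stimuli=120, base_interval_hz=0.1, interval_hz=0.2):
    """Split frequency-ordered stimulus indices into strided subsets.

    For a simulated frequency interval `interval_hz`, the stride is
    interval_hz / base_interval_hz, which gives `stride` subsets with
    n_stimuli / stride stimuli each (10 * delta subsets of 12 / delta
    stimuli for the 120-stimulus, 0.1 Hz case).
    """
    stride = round(interval_hz / base_interval_hz)
    if stride < 1 or n_stimuli % stride != 0:
        raise ValueError("interval must divide the stimulus grid evenly")
    # Subset `offset` holds indices offset, offset + stride, ...
    return [list(range(offset, n_stimuli, stride)) for offset in range(stride)]


subsets = interval_subsets(interval_hz=0.2)
print(len(subsets), len(subsets[0]))  # 2 subsets of 60 stimuli each
```

Each subset then defines its own smaller classification problem, so the ITR at a coarser interval can be simulated without recording new data.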

A 6-fold leave-one-block-out cross validation was performed in the offline analysis, where in each fold one block was used as the test data and the remaining blocks were used as training data. For multiple comparisons in the offline analysis, a repeated measures analysis of variance (ANOVA) was applied. To account for the violation of sphericity, as assessed by Mauchly's test of sphericity, Greenhouse-Geisser correction was then employed. Post hoc comparisons using t-test with Bonferroni correction were conducted when there was a statistically significant main effect (p < .05). The statistical analyses were performed in SPSS Statistics 26 (IBM, Armonk, NY, USA). Unless otherwise stated, data were presented as mean ± standard error in this study.

Figure 6 illustrates the change in performance with a varying number of training blocks. One-way repeated measures ANOVA revealed a statistically significant difference between different numbers of training blocks in accuracy, F(1.34, 14.743) = 43.127, p < .001, and in ITR, F(1.714, 18.853) = 118.439, p < .001. The result showed that the BCI performance increased significantly as more training data were available. By leveraging two blocks of training data, the average ITR surpassed 100 bits/min, i.e., 141.48 ± 16.07 bits/min. Using three blocks of training data, the average accuracy surpassed 80%, i.e., 83.67 ± 4.13%. Of note, with insufficient training data of one block, three subjects achieved relatively higher performance, i.e., S5 (97.64%), S8 (78.75%) and S7 (77.08%).

To investigate the misclassification pattern, the confusion matrix was constructed as shown in Figure 7A. Entries whose number of trials was greater than five, which were 72 in total, were colored blue for contrast. A clear diagonal reflects the high accuracy at 2 s, and errors in adjacent stimuli could also be identified.
On closer examination of the errors, Figures 7B and 7C depict the accuracy and R-squared maps for each stimulus, respectively. Here, the accuracies and R-squared features were arranged according to the user interface of the speller. For the accuracy map, stimuli in the center left region of the speller were recognized with higher accuracy (>90%), whereas the right part had a higher error rate. A similar pattern was observed in the R-squared map, where a higher value, as an indicator of better discrimination ability, was detected in the left center region. The distribution of the SNR was further delineated to probe the underlying cause. Figure 7D illustrates the bar plot of the SNR values with respect to each stimulus frequency, indicating a decreasing tendency (r = −0.964, p < .001) in SNR as the stimulus frequency increases. To reveal its spatial distribution, the SNR map was illustrated in a similar fashion in Figure 7E. The result showed that the SSVEP trials evoked by the stimuli in the left region of the speller (columns 1 to 10) had a significantly higher SNR than those in the right region (left: −10.433 ± 0.138 dB; right: −13.109 ± 0.127 dB; p < .001).

The relationship between frequency interval and simulated BCI performance is illustrated in Figure 8

Table I lists the individual and average classification accuracy and ITR recorded in the online experiment among 13 subjects. The average accuracy achieved was 92.47 ± 1.83%, while the ITR for the BCI system was 213.23 ± 6.60 bits/min. As for the individual performance, over 75% of the subjects (10/13) achieved an online accuracy above 90% and an online

The results demonstrated that the 120-target brain speller could achieve an average classification accuracy above 90%

Three issues concerning sJFPM and target recognition are further discussed in the following paragraph.
First, as for the frequency interval in sJFPM, the BCI performance was measured by means of sampling from the stimuli, suggesting that the traditional JFPM could be scaled to sJFPM to expand the number of stimuli without inhibiting the performance in ITR. The frequency interval of 0.2 Hz is widely used in the implementation of high-speed SSVEP-BCI spellers [3], [6], [9], [24], [29], whereas the frequency interval of 0.1 Hz has received little attention [30]. The use of a 0.1-Hz frequency interval enables the BCI system to configure a broad array of stimuli, and as suggested by the result in Figure 8, the ITR of the system does not suffer from a degradation. This phenomenon of ITR is attributable to the compromise between the number of stimuli M and classification accuracy P in Eq.

(Figure 7D). Third, as for the target recognition, TRCA was used to validate the system, and other state-of-the-art methods [15] could be applied to further reduce the error rate.
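The compromise between the number of stimuli M and the classification accuracy P can be made concrete with a short sketch. It assumes the widely used Wolpaw-style ITR definition, which matches the variables M, P and T described in the methods. The function name `itr_bits_per_min`, the 1 s stimulation duration and the accuracies in the usage example are hypothetical illustration values, not results from this study; only the 1 s gaze shift time is taken from the text.

```python
import math


def itr_bits_per_min(n_targets, accuracy, selection_time_s):
    """Information transfer rate in bits/min (Wolpaw-style definition).

    n_targets: number of stimuli M (M >= 2)
    accuracy: classification accuracy P in [0, 1]
    selection_time_s: time T per selection, stimulation plus gaze shift
    """
    if n_targets < 2 or not 0.0 <= accuracy <= 1.0 or selection_time_s <= 0:
        raise ValueError("invalid ITR arguments")
    bits = math.log2(n_targets)
    if accuracy > 0.0:  # the P * log2(P) term vanishes as P -> 0
        bits += accuracy * math.log2(accuracy)
    if accuracy < 1.0:  # the (1 - P) term vanishes as P -> 1
        bits += (1.0 - accuracy) * math.log2((1.0 - accuracy) / (n_targets - 1))
    return 60.0 / selection_time_s * bits


# Hypothetical comparison: doubling M while losing some accuracy.
T = 1.0 + 1.0  # assumed 1 s stimulation + 1 s gaze shift (from the text)
for m, p in [(60, 0.95), (120, 0.90)]:
    print(m, p, round(itr_bits_per_min(m, p, T), 2))
```

Because the log2 M term grows with the number of stimuli while the accuracy-dependent terms shrink as P drops, a denser code can hold the ITR steady as long as the accuracy loss stays moderate.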

Compared with the existing methods for a large number of stimuli in BCI, the present study based on the JFPM encoding is characterized by the following. In retrospect, the JFPM encoding method has long demonstrated its excellence in the implementation of a high-speed BCI. For the 40-target BCIs, previous studies [3], [6], [8] reported that BCI systems based on JFPM could achieve an average ITR of over 300 bits/min, which is the state-of-the-art performance in non-invasive BCIs to the knowledge of the authors. For the regime of a large number of stimuli, the effectiveness of JFPM was further demonstrated in the present study by implementing a high-speed BCI, which resolves the doubt about the feasibility of a high volume of stimulus frequencies for BCIs. The high-speed performance of JFPM is attributable to the fact that JFPM elicits a remarkably high SNR of brain signals (i.e., continuous SSVEP) compared with other methods in visual BCIs. In contrast to the P300 [10] and c-VEP [21] signals, SSVEP boasts a high single-trial SNR, which removes the need to repeat the visual stimuli multiple times and thereby increases the ITR of the system. In comparison with MFSC [19], JFPM provides a continuous encoding of SSVEP and elevates the SNR as the data length increases, whereas in MFSC the transition of stimulus frequencies interrupts the steady state of VEP and possibly offsets the amplitude of brain response. Because

Fig. 9. A keyboard interface for one-keystroke Chinese character input. Here each stimulus denotes 1 to 4 pinyin-based syllables, each of which corresponds to Chinese characters for selection. The letters above the first row of stimuli denote the initial letter of the syllables in each column.

The contribution of the present study is three-fold. First, methodologically, we extended the conventional JFPM to the spectrally-dense regime that prevents spectral overlap and encodes a large number of stimuli in the low- and medium-frequency band. Second, in system implementation, we successfully developed an SSVEP-BCI system with 120 stimuli using sJFPM, and the proposed system can achieve a high-speed performance as validated in the online experiment. Third, for practical applications, the proposed BCI system has the potential to enable a variety of new applications, e.g., BCI systems for one-keystroke Chinese character input. Nevertheless, this study is a proof-of-principle demonstration with limitations that leave room for improvement in future work. To improve the practical utility of the BCI system, its ease of use can be improved in terms of the EEG cap, the monitor and the calibration time. Dry EEG caps [33] and pre-gelled EEG caps [34] are more practical and can be used to replace the gel-based EEG cap in real-world applications. A standard monitor can be employed for stimulus presentation in our future work to make the proposed BCI system more readily available. Cross-stimulus transfer learning [35] could be employed to overcome the calibration burden brought about by the increased number of stimuli. Other transfer learning approaches in SSVEP-BCI [36], [37], [38] have the potential to further increase the system performance by leveraging data from other subjects [36], [37] or other sources [38]. Apart from the effort in reducing calibration time, visual fatigue can be mitigated by designing a paradigm with more visual comfort [39]. For the improvement of BCI performance, more electrodes can be employed to further enhance the ITR of the 120-target system [15].
Furthermore, a dynamic stopping strategy [40] could be implemented to tackle the individual difference in parameter selection and thereby enhance the ITR of the system.

The present study proposed a spectrally-dense joint frequency-phase modulation (sJFPM) encoding method to design a high-speed steady-state visual evoked potential based brain-computer interface (SSVEP-BCI) system with 120 stimuli. Two experiments, one offline and one online, involving 18 subjects were conducted to optimize and validate the system performance. The results demonstrated that the proposed 120-target BCI system based on sJFPM could achieve an online accuracy of 92.47 ± 1.83% and an online ITR of 213.23 ± 6.60 bits/min. By using finer frequency division to encode a high volume of stimuli, the present study provides insight into the JFPM method and opens opportunities for new BCI applications, contributing to the effort for developing novel non-invasive BCI systems.