Phase-Locked Time-Shift Data Augmentation Method for SSVEP Brain-Computer Interfaces

Steady-state visual evoked potential (SSVEP) based brain-computer interfaces (BCIs) have achieved an information transfer rate (ITR) of over 300 bits/min, but abundant training data is required. The performance of SSVEP algorithms deteriorates greatly under limited data, and the existing time-shift data augmentation method fails to improve it because the phase-locked requirement between training samples is violated. To address this issue, this study proposes a novel augmentation method, namely phase-locked time-shift (PLTS), for SSVEP-BCI. The similarity between epochs at different time moments was evaluated, and a unique time-shift step was calculated for each class to augment additional data epochs in each trial. The results showed that the PLTS significantly improved the classification performance of SSVEP algorithms on the BETA SSVEP datasets. Moreover, under the condition of one calibration block, by slightly prolonging the calibration duration (from 48 s to 51.5 s), the ITR increased from ${40.88}\pm {4.54}$ bits/min to ${122.61}\pm {7.05}$ bits/min with the PLTS. This study provides a new perspective on augmenting data epochs for training-based SSVEP-BCI, promotes the classification accuracy and ITR under limited training data, and thus facilitates the real-life applications of SSVEP-based brain spellers.


I. INTRODUCTION
BRAIN-COMPUTER interfaces (BCIs) directly decode users' intentions from electrophysiological signals to control external devices without the neuromuscular pathway [1]. Among the noninvasive modalities, the steady-state visual evoked potential (SSVEP)-based BCI has attracted widespread attention due to its high signal-to-noise ratio (SNR) and short user training time [2]. In recent
The authors are with the State Key Laboratory of Mechanical System and Vibration, Shanghai Jiao Tong University, Shanghai 200240, China (e-mail: mexyzhu@sjtu.edu.cn; mengjianjunxs008@sjtu.edu.cn).
Target detection methods play an important role in enhancing the performance of SSVEP-BCI [11]. In the past few decades, the detection methods have evolved from single-channel detection, e.g., power spectrum density analysis (PSDA) [12], to multi-channel detection, e.g., canonical correlation analysis (CCA) [13], minimum energy combination (MEC) [14], multivariate synchronization index (MSI) [15], improved versions of CCA [16], [17], and filter bank canonical correlation analysis (FBCCA) [7]. Moreover, training-based detection methods, especially spatial filtering-based algorithms, have been developed recently. Spatial filtering methods have gained wide attention and have proved to be effective in the field of BCI [18], [19]. In SSVEP-BCI, the spatial filters are calculated using the individual training data to suppress irrelevant brain activities and noise and to extract the relevant SSVEP activity, thereby improving the signal-to-noise ratio (SNR) of the given multi-channel EEG signal. The extended CCA (eCCA) proposed by Chen et al. [20] calculated the spatial filters using CCA between temporal templates, artificial reference signals, and the test data. The ensemble task-related component analysis (eTRCA) proposed by Nakanishi et al. [8] derived the spatial filters by solving the inter-trial covariance maximization problem, and it greatly improved the classification performance. Extended versions of eTRCA [21], [22] were proposed thereafter to further improve its performance. The task-discriminant component analysis (TDCA) proposed by Liu et al. [23] sought a uniform spatial filter for all classes with a discriminative model, and it significantly outperformed eTRCA on public datasets and a self-collected dataset.

Although spatial filtering-based detection methods have achieved high performance in SSVEP-BCI, they require a large amount of individual training data. For example, eCCA and eTRCA require a calibration session of 12 blocks, each containing 40 trials, before the SSVEP-BCI system can be applied for online detection. Collecting sufficient training data is time-consuming and may cause visual fatigue, resulting in low user acceptance of the BCI system. Therefore, researchers have explored the feasibility of detecting targets under limited SSVEP training data. For example, transfer learning-based methods [24], [25] can use the training data from existing subjects in the same experiment, or training data collected in other experiments with different acquisition devices, to train the detection model for a new subject directly. Generative adversarial networks (GAN) [26] can generate multiple synthetic data given the existing data from other subjects or experiments and a small amount of data from the new subject. In this way, the new subject is spared a long training time. However, these methods still need a large amount of data to be collected in advance, and numerous model parameters require fine-tuning. Some researchers utilize existing but unused data for training the detection model, including using data from other classes with neighboring frequencies [27] or neighboring locations on the screen [28] relative to the target class to train the spatial filters, or using the unlabeled test data to obtain new spatial filters for detection [29].

Another approach is to transform the data epochs to obtain multiple new augmented epochs. Luo et al. [30] proposed source aliasing matrix estimation (SAME) to augment new data by combining the SSVEP source component derived from the training data with estimated Gaussian noise. Li et al. [31] augmented new data by performing sample-based transformations on the data epochs, including performance-measure-based (PMB) time warp, frequency noise addition, and frequency masking.
Time-shift augmentation of the experimental data is another frequently used method in the BCI field to extract more training samples for model training [32], [33]. In the offline calibration data of SSVEP-BCI, the trial duration is set to 3 s or 5 s, and the data epoch of $[d, d + T_w]$ s in each trial is extracted for training the spatial filtering algorithms. Here $d$ indicates the SSVEP latency of 0.13 s or 0.14 s [34], [35], and $T_w$ indicates the selected window length for online decoding, which is set to around 0.6 s and is much shorter than the trial duration. However, low classification performance was obtained in previous SSVEP studies [36], [37] when the time-shift augmentation method was directly used to augment more data epochs for training the spatial filtering algorithm. A possible reason is that it violates the time- and phase-locked requirement between different data epochs [30], [36], [37]. Nevertheless, since subjects steadily gaze at the cued stimulation target for the entire trial, multiple epochs with similar SSVEP characteristics could be extracted within one trial. Thus, the feasibility of applying the time-shift augmentation method to spatial filtering-based algorithms needs to be explored further.
This study proposes a novel time-shift data augmentation method to augment more data epochs for training-based SSVEP algorithms. In detail, the similarity between the original data epoch $[d, d + T_w]$ s and data epochs at different time moments $[a_t, a_t + T_w]$ s within one trial was evaluated. Based on the similarity results, the phase-locked time-shift (PLTS) data augmentation method was proposed, in which a unique time-shift step for each SSVEP class was set to augment additional epochs within one trial. Then, the proposed PLTS augmentation method was verified with the state-of-the-art (SOTA) spatial filtering algorithms, i.e., eTRCA and TDCA, on the Benchmark and BETA datasets. Finally, PLTS was applied in a situation of short calibration time, the single calibration block condition, where it showed significant improvement, which promotes the real-life application of SSVEP-BCI.
The paper is organized as follows. The training-based SSVEP-BCI, the data augmentation process, the public datasets used, and the evaluation metrics are described in Section II. The results are presented in Section III, followed by the discussion and conclusion in Sections IV and V, respectively.

A. Training-Based Method in SSVEP-BCI
As shown in Fig. 1, in SSVEP-BCIs, $N_f$ visual stimuli are presented to subjects, and the experiment consists of $N_b$ blocks, each including $N_f$ trials corresponding to the $N_f$ stimuli. Each trial starts with a visual cue indicating the target stimulus, which lasts for 0.5 s. Subjects are instructed to shift their gaze to the target within the cue duration and gaze at it. After the cue, all stimuli start to flicker for $T_n$ s. Then, the screen is blank for 0.5 s before the next trial begins. The epoch of $[d, d + T_w]$ s in each trial is extracted, in which the time moment 0 indicates the stimulus onset, $d$ indicates the SSVEP response latency [34], and $T_w$ indicates the time window length for training and decoding.
The SSVEP detection model can be trained using the individual calibration data $X_n^j \in \mathbb{R}^{N_{ch} \times N_s}$, $n = 1, \ldots, N_f$, $j = 1, \ldots, N_b$. Here $n$ indicates the stimulus class index, and $N_f$ is the total number of stimulus classes; $j$ indicates the index of the training sample, and $N_b$ is the number of blocks, i.e., training samples for each class; $N_{ch}$ indicates the number of channels for each epoch, and $N_s$ indicates the data length extracted in each trial (equal to $T_w \times f_s$, where $f_s$ indicates the sampling rate). First, the spatial filter $W_n \in \mathbb{R}^{N_{ch} \times 1}$ is calculated from the training data (the detailed calculation of the spatial filter in eTRCA and TDCA can be found in [8] and [23]). Then, the temporal template for each class is obtained by averaging across the training samples:

$$\chi_n = \frac{1}{N_b} \sum_{j=1}^{N_b} X_n^j$$

Finally, given a test data $X_t \in \mathbb{R}^{N_{ch} \times N_s}$, the correlation coefficient between the projected temporal template and the projected test data is calculated:

$$r_n = \rho\left(W_n^{T} X_t,\; W_n^{T} \chi_n\right)$$

and the target class is identified as the class with the maximum correlation coefficient:

$$\hat{n} = \arg\max_{n} r_n, \quad n = 1, \ldots, N_f$$

The filter bank technique is usually implemented along with the training-based SSVEP detection methods, in which data are decomposed into $N_m$ sub-band components to utilize the harmonic information. In detail, the $m$-th zero-phase Chebyshev Type I infinite impulse response (IIR) filter, with lower and upper cut-off frequencies of $[8m, 90]$ Hz, is applied to both the training data and the test data. After calculating the above-mentioned spatial filters, temporal templates, and correlation coefficients $r_n^m$ in each sub-band component, a weighted sum of squares of the correlation coefficients is calculated, and the class with the maximum weighted correlation is selected as the target class:

$$\hat{n} = \arg\max_{n} \sum_{m=1}^{N_m} w_m \left(r_n^{m}\right)^2$$

The weight $w_m$ is defined according to [7]:

$$w_m = m^{-a} + b, \quad m = 1, \ldots, N_m$$

where $a$ and $b$ are constants.

B. Data Augmentation
The similarity is measured by the correlation between $x_o$ and $x_a$:

$$\rho(x_o, x_a) = \frac{\sum_{i=1}^{N_s} \left(x_o(i) - \bar{x}_o\right)\left(x_a(i) - \bar{x}_a\right)}{\sqrt{\sum_{i=1}^{N_s} \left(x_o(i) - \bar{x}_o\right)^2}\,\sqrt{\sum_{i=1}^{N_s} \left(x_a(i) - \bar{x}_a\right)^2}} \tag{8}$$

A grid search can be used, in which the time moment $a$ is stepped through every candidate start sample (e.g., $a = d \times f_s + 1, \ldots, d \times f_s + f_s$ in Section III-A) to calculate the similarity between $x_o$ and all possible $x_a$, and the time moment with the maximum similarity is selected as the optimal $a$.
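The template-matching steps of Section II-A (temporal template, projected correlation, arg max, and filter-bank weighting) can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the spatial filters are passed in as given (the placeholder `[0.5, 0.5]` is not an eTRCA or TDCA filter), the function names are hypothetical, and the defaults `a=1.25`, `b=0.25` are the values commonly used with FBCCA, assumed here because the source does not state them.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def template(epochs):
    """Average the training epochs (each channels x samples) of one class."""
    n = len(epochs)
    n_ch, n_s = len(epochs[0]), len(epochs[0][0])
    return [[sum(e[c][i] for e in epochs) / n for i in range(n_s)]
            for c in range(n_ch)]

def project(w, epoch):
    """Apply a spatial filter w (one weight per channel) to an epoch."""
    return [sum(w_c * ch[i] for w_c, ch in zip(w, epoch))
            for i in range(len(epoch[0]))]

def classify(x_t, filters, templates):
    """Return (predicted class index, list of correlations r_n)."""
    r = [pearson(project(w, x_t), project(w, chi))
         for w, chi in zip(filters, templates)]
    return max(range(len(r)), key=r.__getitem__), r

def filter_bank_score(r_subbands, a=1.25, b=0.25):
    """Weighted sum of squared sub-band correlations, w_m = m^(-a) + b.
    a and b are assumed values, not taken from the source."""
    return sum((m ** -a + b) * r_m ** 2
               for m, r_m in enumerate(r_subbands, start=1))

# Toy example: two classes (10 Hz and 12 Hz), two channels, 0.6 s at 250 Hz.
fs, n_s = 250, 150
def sine(f):
    return [math.sin(2 * math.pi * f * i / fs) for i in range(n_s)]

templates = [template([[sine(f), sine(f)]] * 2) for f in (10.0, 12.0)]
filters = [[0.5, 0.5], [0.5, 0.5]]          # placeholder spatial filters
pred, r = classify([sine(10.0), sine(10.0)], filters, templates)
print(pred, [round(v, 3) for v in r])
```

In a filter-bank setting, `classify` would be run once per sub-band and the per-class correlations combined with `filter_bank_score` before taking the maximum.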
2) PLTS Augmentation Method: In SSVEP-BCI, the augmented epoch $x_a$ shows high similarity with the original epoch $x_o$ periodically as the time moment $a$ changes, and this period is equal to the reciprocal of the stimulation frequency of the trial (see Section III-A for details). Therefore, the phase-locked time-shift (PLTS) SSVEP augmentation method is proposed in this study: given the trial data of the $n$-th class (with stimulation frequency $f_n$ Hz), in addition to the original epoch, time-shift augmentation can be performed with a unique step $s_n$ (in data samples):

$$s_n = \frac{f_s}{f_n}$$

and the augmented epochs $[a_t, a_t + T_w]$ s can serve as additional training samples. The time moment $a$ of the $p$-th augmented epoch is shifted by $p$ steps from the start of the original epoch:

$$a = d \times f_s + p \times s_n, \quad p = 1, 2, \ldots, N_a$$

In practice, a total of $N_a$ augmented epochs are extracted within one trial (the value of $N_a$ is determined in Section III-B). Note that the PLTS SSVEP augmentation method differs from the conventional time-shift augmentation method because the time-shift step $s$ in PLTS is different for each class, whereas $s$ in the conventional method is set to a fixed value, e.g., 50 ms or 100 ms, for all classes. By implementing the PLTS augmentation method, the number of training samples for each class in $N_b$ blocks is enlarged from $N_b$ to $N_b \times (1 + N_a)$. The pseudocode for PLTS augmentation and model training is described as follows, and the process of PLTS is summarized in Fig. 1.
Authorized licensed use limited to the terms of the applicable license agreement with IEEE. Restrictions apply.
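The PLTS epoch extraction described above can be sketched in Python as follows. This is a hedged illustration rather than the authors' code: the function names are hypothetical, a trial is represented as a plain list of per-channel sample lists, and rounding each shift to the nearest sample is an assumption (the source does not state how non-integer steps $f_s / f_n$ are handled).

```python
def plts_epoch_starts(f_n, fs, d, n_a):
    """Start samples of the original epoch (p = 0) and the N_a augmented epochs.

    Assumption: each shift p * fs / f_n is rounded to the nearest sample.
    """
    s_n = fs / f_n              # class-specific step: one stimulation period
    base = round(d * fs)        # start of the original epoch (latency d)
    return [base + round(p * s_n) for p in range(n_a + 1)]

def plts_augment(trial, f_n, fs, d, t_w, n_a):
    """Cut the original and the augmented epochs out of one trial.

    trial: list of channels, each a list of samples (channels x samples).
    Returns a list of (1 + n_a) epochs, each channels x N_s.
    """
    n_s = round(t_w * fs)
    epochs = []
    for a in plts_epoch_starts(f_n, fs, d, n_a):
        if a + n_s > len(trial[0]):
            raise ValueError("trial too short for the requested N_a")
        epochs.append([ch[a:a + n_s] for ch in trial])
    return epochs

# Example: one-channel trial whose samples are just their own indices,
# so the extracted start positions are easy to read off.
trial = [list(range(1000))]
epochs = plts_augment(trial, f_n=10.0, fs=250, d=0.14, t_w=0.6, n_a=2)
print([e[0][0] for e in epochs])
```

Applying `plts_augment` to every trial of every block yields $N_b \times (1 + N_a)$ training samples per class, which are then passed to the unchanged spatial filter and template calculations.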

Algorithm 1 PLTS Data Augmentation
Input: the original four-dimensional data matrix $X_o \in \mathbb{R}^{N \times N_{ch} \times N_b \times N_f}$ ($N$ is the total number of data samples; $N_{ch}$ is the number of channels), the $n_a$-th augmented epoch extracted from the $n_b$-th block $X \in \mathbb{R}^{N_s \times N_{ch}}$, the stimulation frequency for the $n$-th class $f_n$, the time-shift step for the $n$-th class $s_n$, the sampling rate $f_s$, the number of data samples for each extracted epoch $N_s$, the number of training blocks $N_b$, the number of augmented epochs $N_a$
Output: the augmented training data $X_a$
1 Initialize $X_a$ as empty
2 for $n = 1$ to $N_f$ do
3   for $n_b = 1$ to $N_b$ do
4     for $n_a = 0$ to $N_a$ do
5       Extract $X$: the $N_s$ samples of $X_o$ starting at time moment $d \times f_s + n_a \times s_n$ for block $n_b$ and class $n$
6       Append $X$ into $X_a$
7 Train the detection model using $X_a$

C. Evaluation
1) Data Used in This Study: The similarities between the original epoch and the augmented epochs were evaluated by analyzing the SSVEP public dataset [38]. Ten healthy subjects participated in 15 blocks of a cued-spelling task on a 4 × 3 matrix of a virtual keypad, and each block contained 12 trials. The frequencies of the flickering stimuli ranged from 9.25 Hz to 14.75 Hz with an interval of 0.5 Hz, and the phase was set to 0 initially and incremented by 0.5π for different stimuli. The stimulation duration of each trial was 4 s, and the sampling rate was 256 Hz. The epoch data of channel Oz was used to calculate the similarity.
The value of $N_a$ in the proposed PLTS data augmentation method was determined using the Benchmark dataset [34], and the performance of PLTS was verified on the BETA dataset [35]. For the Benchmark dataset, 35 subjects participated in six blocks of a cued-spelling task on a 5 × 8 matrix of a virtual keyboard. For the BETA dataset, 70 subjects participated in four blocks of a cued-spelling task on a QWERTY virtual keyboard. The stimulus frequencies for the Benchmark and BETA datasets ranged from 8.0 Hz to 15.8 Hz with an interval of 0.2 Hz, and the phase was set to 0 initially and incremented by 0.5π for different stimuli. The stimulation duration was 5 s in the Benchmark dataset, and 2 s or 3 s in the BETA dataset. The sampling rate was 250 Hz. Signals from nine classical EEG channels (Pz, PO5, PO3, POz, PO4, PO6, O1, Oz, and O2) were chosen for detection. The SSVEP response latency $d$ was set to 140 ms in the Benchmark dataset and 130 ms in the BETA dataset. The time window length $T_w$ was set to 0.3 s, 0.4 s, . . ., 1 s.
2) Performance Comparison: In this study, the SSVEP detection methods (eTRCA [8] and TDCA [23]) with and without the proposed PLTS data augmentation method were compared to evaluate the performance of PLTS. Moreover, the SAME [30] data augmentation method used in SSVEP-BCI was also included for comparison.
3) Evaluation Metrics: The correlation coefficient was used to evaluate the similarity between the original epoch and the augmented epochs. The classification accuracy and information transfer rate (ITR) were evaluated using k-fold cross-validation (k = 6 for the Benchmark dataset, and k = 4 for the BETA dataset). The ITR in units of bits per min is calculated by [39]:

$$\mathrm{ITR} = \frac{60}{T}\left[\log_2 M + P \log_2 P + (1 - P)\log_2\frac{1 - P}{M - 1}\right]$$

where $M$ indicates the number of classes, $P$ indicates the classification accuracy, and $T$ (in seconds) indicates the target selection time, which is the sum of the gaze time $T_w$ and the 0.5 s gaze shifting time.
4) Statistical Analysis: When applicable, results were expressed as mean ± SEM (standard error of the mean) unless otherwise stated. The error bars shown in the figures indicate the SEM. One-way and two-way repeated measures analysis of variance (ANOVA) was applied to test the difference in classification accuracy and ITR of eTRCA and TDCA for different numbers of training blocks $N_b$ and different augmentation conditions. The Greenhouse-Geisser correction was used if the data did not conform to the sphericity assumption according to Mauchly's test of sphericity. All post hoc pairwise comparisons were Bonferroni corrected. The alpha level of statistical significance was set at 0.05.
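The ITR definition in the evaluation metrics above can be computed with a short Python function. This is a minimal sketch of the standard formula cited as [39]; the function name and the explicit handling of the edge cases $P = 1$ and $P = 0$ (where a logarithm term would otherwise be undefined) are assumptions added for illustration.

```python
import math

def itr_bits_per_min(p, m, t_w, gaze_shift=0.5):
    """Information transfer rate in bits/min.

    p: classification accuracy in [0, 1]
    m: number of classes
    t_w: gaze time (window length) in seconds
    gaze_shift: gaze shifting time in seconds (0.5 s in the text)
    """
    t = t_w + gaze_shift                 # target selection time T
    bits = math.log2(m)
    if 0.0 < p < 1.0:
        bits += p * math.log2(p) + (1 - p) * math.log2((1 - p) / (m - 1))
    elif p == 0.0:
        # limit of the formula as P -> 0 (the P*log2(P) term vanishes)
        bits += math.log2(1.0 / (m - 1))
    # p == 1.0: both extra terms vanish, bits = log2(M)
    return 60.0 / t * bits

# Example: 40 classes, 0.7 s window, perfect accuracy.
print(round(itr_bits_per_min(1.0, 40, 0.7), 2))
```

At chance level ($P = 1/M$) the bracketed term is zero, so the function returns an ITR of zero regardless of the window length.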

A. Similarity Between Original and Augmented Epochs
To find the optimal augmented epoch, the similarity between the original epoch $x_o$ and the augmented epoch $x_a$ was computed for time moments $a$ in $[d \times f_s + 1, d \times f_s + 256]$ time samples. Fig. 2(a) shows examples of how the similarity changes with the time moment $a$ at 9.75 Hz, 10.75 Hz, and 14.75 Hz, respectively. The blue dotted curve indicates the calculated similarity for each subject, and the black curve indicates the average similarity across all subjects. It can be seen that the maximum similarity occurs periodically at all three frequencies. The time moments of the first three similarity maxima were 26, 52, 79 for 9.75 Hz, 24, 48, 72 for 10.75 Hz, and 18, 34, 52 for 14.75 Hz, respectively. The period, i.e., the interval between these time moments, decreases as the stimulation frequency increases, and moreover, it is equal to the period of the stimulation, $f_s / f_n$ samples. Furthermore, the similarity for all stimulation frequencies is illustrated in Fig. 2(b), in which the periodic maximum similarity is present at all frequencies, and the period likewise decreases as the stimulation frequency increases. Therefore, given the trial data of stimulation frequency $f_n$, optimal augmented epochs can be obtained by extracting epochs at time moments $a = p \times f_s / f_n$, $p = 1, 2, \ldots$, counted from the start of the original epoch, which show maximum similarity to the original epoch. Here $p$ indicates the ordinal number of the augmented epoch.
To evaluate how the periodically occurring maximum similarity changes, the first 15 similarities between the original epoch and the $p$-th augmented epoch were calculated for each stimulation frequency and each subject and then averaged, as shown in Fig. 3. The similarity tends to decrease as the ordinal number $p$ increases, but it remains above 0.4 for all augmented epochs.

B. PLTS Parameter Optimization
To explore the classification accuracy improvement obtained with the PLTS augmentation method for different augmentation numbers $N_a$, a grid search was conducted to evaluate eTRCA with PLTS on the Benchmark dataset. As shown in Fig. 4, as the augmentation number $N_a$ increased, the classification accuracy first increased, then remained stable or slightly decreased. Two-way repeated measures ANOVA revealed that the two within-subject factors $N_b$ and $N_a$ and their interaction were statistically significant for the Benchmark dataset.

C. Classification Performance With PLTS
To verify the efficacy of the PLTS augmentation method, the eTRCA and TDCA performance without any data augmentation method (Original), with PLTS (w/PLTS), and with SAME (w/SAME) were compared, as shown in Table I. The results showed that PLTS improved the classification accuracy of eTRCA by 49.60%, 9.80%, and 5.00% for 1, 2, and 3 training blocks, respectively, and it improved the classification accuracy of TDCA by 33.67%, 2.86%, and 0.57%. Compared to the SAME, the PLTS changed the classification accuracy of eTRCA by 12.78%, −0.43%, and −1.04% for 1, 2, and 3 training blocks, respectively, and that of TDCA by 4.95%, 0.19%, and −0.92%. Two-way repeated measures ANOVA showed that the augmentation method, the number of training blocks, and their interaction were statistically significant for both eTRCA and TDCA ($p < 0.001$). One-way repeated measures ANOVA and the post hoc test revealed that the classification accuracy of PLTS was significantly higher than that without any data augmentation for both eTRCA and TDCA under all training block conditions ($p < 0.001$); compared with the SAME, the classification accuracy of PLTS for both eTRCA and TDCA was significantly higher for one

D. Spatial Filters and Temporal Templates Improvement by PLTS
To further explore the contribution of PLTS to SSVEP detection methods, four augmentation conditions were considered: classification without PLTS augmentation (w/oPLTS), classification with updated spatial filters after utilizing PLTS (w/PLTS-sf), classification with updated temporal templates after utilizing PLTS (w/PLTS-t), and classification with both updated spatial filters $W_n$ and temporal templates $\chi_n$ after utilizing PLTS (w/PLTS). As shown in Fig. 6, the condition without PLTS had the lowest performance, and that with PLTS achieved the highest performance for both eTRCA and TDCA and for all numbers of training blocks $N_b$. Two-way repeated measures ANOVA revealed that the two within-subject factors, $N_b$ and augmentation condition, and their interaction were statistically significant in eTRCA and TDCA ($p < 0.001$). One-way repeated measures ANOVA and the post hoc test showed that the performance with PLTS was significantly higher than that of all other three conditions in both eTRCA and TDCA for all training block conditions, except for $N_b = 3$ in TDCA. The performance of w/PLTS-sf was significantly higher than that of w/PLTS-t in both eTRCA and TDCA for $N_b = 1$.

E. Performance Under Single Calibration Block
It is necessary for training-based SSVEP-BCI to shorten the calibration time before online application, since a long calibration time causes visual fatigue and low user acceptance of the BCI system. In this study, four simulated conditions of the calibration session before online decoding were compared: (1) only one calibration block with a trial duration of $T_o$ ($N_b = 1$); (2) only one calibration block with the expanded trial duration of $(T_o + 1 \times 1/f_n)$, in which PLTS with one augmented epoch was implemented ($N_b = 1$, $N_a = 1$); (3) only one calibration block with the expanded trial duration of $(T_o + N_a \times 1/f_n)$, in which PLTS with a total of $N_a$ augmented epochs was implemented ($N_b = 1$, $N_a = 6$); (4) two calibration blocks ($N_b = 2$). Supposing that the gaze shift time for each trial was 0.5 s and the rest time between training blocks was 60 s, the calibration times $D$ (unit: second) for the four conditions were:

$$D_1 = N_f \times (0.5 + T_o)$$
$$D_2 = \sum_{n=1}^{N_f} \left(0.5 + T_o + \frac{1}{f_n}\right)$$
$$D_3 = \sum_{n=1}^{N_f} \left(0.5 + T_o + \frac{N_a}{f_n}\right)$$
$$D_4 = 2 \times N_f \times (0.5 + T_o) + 60$$
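The calibration times of the four conditions can be checked numerically with a few lines of Python. This is a sketch under the assumptions stated in the text (0.5 s gaze shift per trial, 60 s rest between blocks) and the 40 stimulus frequencies from 8.0 Hz to 15.8 Hz; the function name is hypothetical.

```python
def calibration_times(freqs, t_o, n_a, gaze_shift=0.5, rest=60.0):
    """Return (D1, D2, D3, D4) in seconds for the four calibration conditions."""
    one_block = len(freqs) * (gaze_shift + t_o)    # plain single block
    extra = sum(1.0 / f for f in freqs)            # one extra period per trial
    d1 = one_block
    d2 = one_block + extra                         # N_b = 1, N_a = 1
    d3 = one_block + n_a * extra                   # N_b = 1, N_a = n_a
    d4 = 2 * one_block + rest                      # N_b = 2 with one rest
    return d1, d2, d3, d4

freqs = [8.0 + 0.2 * k for k in range(40)]         # 8.0, 8.2, ..., 15.8 Hz
d1, d2, d3, d4 = calibration_times(freqs, t_o=0.7, n_a=6)
print(round(d1, 1), round(d2, 1), round(d3, 1), round(d4, 1))
```

For $T_o = 0.7$ s the extra stimulation time summed over all 40 classes is about 3.5 s per augmented epoch, which is where the reported 51.5 s and 69 s come from.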

IV. DISCUSSION
This study proposes the PLTS augmentation method for SSVEP-BCI under limited training data. The state-of-the-art algorithms eTRCA and TDCA showed improved performance when combined with the PLTS. In particular, with a single training block, the classification accuracy improvement was 49.60% for eTRCA and 33.67% for TDCA. Moreover, the PLTS significantly outperformed the SAME by 12.78% and 4.95% for eTRCA and TDCA, respectively. The results validated the efficacy of the proposed PLTS data augmentation method.

A. New Time-Shift Augmentation Strategy: PLTS
Data augmentation can improve the classification performance when the amount of training data is limited [40]. In particular, the time-shift data augmentation method with a fixed step of 50 ms or 100 ms is commonly implemented in the BCI field to extract additional training samples from limited training data [32], [33]. However, this augmentation method resulted in low performance when combined with the spatial filtering-based SSVEP algorithms [36], [37]. One of the reasons is that, in spatial filtering-based SSVEP algorithms, all the training samples in one class are temporally averaged to eliminate the noise contained in EEG signals and to obtain the pure SSVEP component, i.e., the temporal templates referred to in this study. To do so, the training samples must be phase-locked to each other, or the SSVEP component would be canceled out. For example, the sum of two sinusoids with the same amplitude and period but a phase difference of π is 0. The same initial phase of stimulation in each trial and the precise timing in the SSVEP-BCI experiment guarantee the phase-locked requirement [34], [35], but the conventional augmenting process causes different phases between the original training samples and the augmented training samples, resulting in low classification performance.
In this study, the PLTS augmentation method augmented new epochs based on the period of the stimulation frequency $f_n$ Hz, so that the phase differences between the SSVEP component contained in the original epochs and those contained in the augmented epochs were integer multiples of 2π, which is equivalent to 0. The phase-locked requirement was therefore maintained, and the additional augmented training samples enhanced the classification performance. The main difference between the PLTS augmentation method and the conventional augmentation method is that, by utilizing the stimulation frequency information of each class, the time-shift step in PLTS is unique for each class, namely $f_s / f_n$ data samples rather than a fixed step of 50 ms or 100 ms, so that the SSVEP component is preserved after the temporal averaging and the target can be decoded accurately.
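The effect of the shift on the averaged template can be illustrated with a noise-free sinusoid. The following Python sketch is a toy example, not an analysis from the study: it assumes a pure 15 Hz sinusoid sampled at 250 Hz, compares a fixed 100 ms shift (1.5 periods, i.e., a phase difference of π) with a PLTS shift of one period rounded to the nearest sample, and uses hypothetical function names.

```python
import math

def sine_epoch(f, fs, start, n_s):
    """n_s samples of a unit sinusoid at f Hz, starting at sample `start`."""
    return [math.sin(2 * math.pi * f * (start + i) / fs) for i in range(n_s)]

def template_amplitude(f, fs, shift, n_s=150):
    """Peak absolute value of the average of an epoch and its shifted copy."""
    x_o = sine_epoch(f, fs, 0, n_s)
    x_a = sine_epoch(f, fs, shift, n_s)
    return max(abs((a + b) / 2) for a, b in zip(x_o, x_a))

fs, f_n = 250, 15.0
fixed = template_amplitude(f_n, fs, shift=round(0.1 * fs))   # fixed 100 ms step
plts = template_amplitude(f_n, fs, shift=round(fs / f_n))    # one period (PLTS)
print(fixed, plts)
```

With the fixed step the two epochs are in anti-phase and the averaged template collapses to numerical zero, whereas the PLTS step leaves the template amplitude close to 1; the small residual loss comes from rounding $f_s / f_n$ to a whole number of samples.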

B. Single Calibration Block in SSVEP-BCI
The training-based SSVEP detection methods show superior classification performance compared to the training-free methods [35], but they need sufficient calibration data to train the detection model. Previous studies reported low performance of the training-based methods when only one training block was available [8], [23]. Since a relatively long stimulation duration of 3 s or 5 s was set in the offline datasets, instead of extracting only one epoch as conventional studies did, this study proposed the PLTS augmentation method to extract multiple epochs within one trial. As a result, eTRCA and TDCA with the PLTS achieved significantly higher classification accuracy, with the improvement ranging from 34.1% to 62.5%.
When conducting a calibration session before the online SSVEP-BCI, the stimulation duration in each trial is set to the optimal time window $T_o$ determined in the offline analysis, so the PLTS cannot be implemented directly because there is no additional data in each trial to perform the time shift. Nevertheless, with a slightly prolonged trial duration $(T_o + 1/f_n)$ and PLTS implementation, the classification performance can be greatly improved. In this study, the ITR was 5.42 to 40.88 bits/min when only one calibration block was used for training, and it was enhanced to 95.56 to 163.57 bits/min by PLTS when one calibration block with a slightly prolonged duration was used for training. Note that, given the stimulation frequency ranging from 8 Hz to 15.8 Hz, the extra stimulation time $1/f_n$ in each trial was 0.064 to 0.125 s, which is nearly imperceptible since the average human reaction time is 0.2 s [41]. The total extra time was 3.5 s, which was small compared to the one-block duration of 48 s (less than 8%). To sum up, by slightly prolonging the stimulation duration in each trial and implementing the PLTS, an online SSVEP-BCI system with a high ITR can be achieved with one calibration block.

C. Explanation of the Periodic Similarity
In this study, the similarity between the original epoch $x_o$ and the augmented epochs $x_a$ at different time moments in each trial was measured by the Pearson correlation coefficient, and the results showed that the maximum correlation occurred periodically, with a period equal to $f_s / f_n$, where $f_n$ is the stimulation frequency of the given trial. The Pearson correlation coefficient of two signals can be high only if they have a similar period and phase. Since the evoked SSVEP in each trial was a periodic signal, $x_o$ and $x_a$ at different time moments all had the same period; their correlation reached the maximum when the time moment difference between $x_o$ and $x_a$ was an integer multiple of the period, $k \times f_s / f_n$, in which case they had the same phase (phase-locked to each other). Because of the nonstationary characteristics and noise contained in the measured SSVEP signals, the maximum correlation was less than 1 and varied with time.
In the case of zero mean and unit variance, equation (8) can be simplified into:

$$\rho(x_o, x_a) = \frac{1}{N_s} \sum_{i=1}^{N_s} x_o(i)\, x_a(i)$$

which is equivalent to the definition of the autocorrelation function (ACF) $R_{x,x}(n)$ of the trial signal $x$ at lag $n$:

$$R_{x,x}(n) = \frac{1}{N_s} \sum_{i=1}^{N_s} x(i)\, x(i + n)$$

The ACF of a signal with the period $T_n$ is also a periodic signal, and its maximum occurs periodically at $k \times T_n$, which coincides with the calculated similarity in Fig. 2. Therefore, these results validate the rationality of extracting epochs at time moments $k \times f_s / f_n$ for data augmentation.
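The periodicity of the ACF can be made explicit for the simplest case. The short derivation below is an illustrative sketch that assumes an idealized, noise-free SSVEP modeled as a single sinusoid with amplitude $A$, phase $\varphi$, and angular frequency per sample $\omega = 2\pi f_n / f_s$; these modeling choices are assumptions for illustration and are not taken from the study.

```latex
\begin{align*}
x(i) &= A \sin(\omega i + \varphi), \qquad \omega = \frac{2\pi f_n}{f_s} \\
x(i)\,x(i+n) &= \frac{A^2}{2}\left[\cos(\omega n) - \cos\bigl(\omega(2i+n) + 2\varphi\bigr)\right] \\
R_{x,x}(n) &= \frac{1}{N_s}\sum_{i=1}^{N_s} x(i)\,x(i+n) \;\approx\; \frac{A^2}{2}\cos(\omega n)
\end{align*}
```

The second cosine term oscillates with $i$ and averages to approximately zero over an epoch spanning several stimulation periods, so the ACF is maximal when $\omega n = 2\pi k$, i.e., at lags $n = k \times f_s / f_n$, consistent with the periodic similarity maxima described above.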

D. PLTS Implementation
PLTS is a data augmentation method for SSVEP-BCI and can be effectively combined with existing training-based detection algorithms: only an additional step to extract augmented epochs in each trial is required, and the spatial filter calculation, temporal template calculation, and detection are all the same as in the conventional algorithm implementation.

E. Limitations and Future Works
Although the efficacy of the proposed PLTS was validated in this study, there is still room for further improvement. First, as shown in Fig. 2(b), the similarity level between the augmented and original epochs differed for classes of different stimulus frequencies. This indicates that the optimal augmentation number $N_a$ might be different for each class, and finding the optimal $N_a$ for each individual class could further improve the performance of the PLTS. Second, when the number of training blocks $N_b$ was one, the performance improvement brought by the PLTS augmented epochs was mainly achieved by estimating valid spatial filters, rather than valid temporal templates, as Fig. 6 indicates. A similar phenomenon was also found in the SAME study [30]. The possibility of acquiring a valid temporal template using augmented epochs will be investigated in the future. Third, the performance enhancement of the PLTS was smaller when $N_b > 1$ than when $N_b = 1$. Meanwhile, the SAME was comparable to the PLTS when $N_b = 2$, and it outperformed the PLTS by about 1% when $N_b = 3$. Therefore, the proposed PLTS and existing data augmentation methods, e.g., the SAME, might be combined using a feature fusion technique [42], which may further improve the classification performance and will be explored in our future work.
V. CONCLUSION
This study proposes a new time-shift data augmentation method, PLTS, for SSVEP-BCI. By setting a unique time-shift step for each SSVEP class, the new augmented epochs retain SSVEP characteristics similar to those of the original epochs and, therefore, can be used for training the spatial filtering-based detection model. The results on two public datasets illustrate that PLTS can significantly improve the performance of the SOTA algorithms, eTRCA and TDCA. Furthermore, the PLTS can solve the insufficient performance problem under a single calibration block, which promotes the ITR from 34.90 ± 6.25 bits/min to 163.57 ± 6.25 bits/min. Therefore, PLTS advances the SSVEP-BCI performance under limited training data and facilitates real-life applications in high-speed brain spellers.

Manuscript received 24 May 2023; revised 16 September 2023; accepted 18 September 2023. Date of publication 10 October 2023; date of current version 24 October 2023. This work was supported in part by the National Natural Science Foundation of China under Grant 52175023 and Grant 91948302. (Corresponding authors: Jianjun Meng; Xiangyang Zhu.)

Fig. 1. Methodology overview. In SSVEP-BCI, the calibration session includes $N_b$ blocks, each containing $N_f$ trials corresponding to all $N_f$ visual stimuli with different frequencies and phases. In each trial, subjects are required to gaze at the cued stimulus for $T_n$ s. Conventionally, the data epoch of $[d, d + T_w]$ s in each trial is extracted for model training, where $d$ indicates the SSVEP response latency and $T_w$ indicates the data length for training and decoding. In the proposed PLTS augmentation method, a unique time-shift step is assigned for each SSVEP class and therefore $(1 + N_a)$ epochs can be extracted in each trial, providing more data for training the model.

1) Finding Epochs With High Similarity: To fully utilize the experimental data, apart from the original epoch $x_o$ of $[d, d + T_w]$ s, we hypothesize that data in $[d + T_w, T_n]$ s in each trial can also be extracted for training the SSVEP detection model, since subjects steadily gaze at the flickering stimulus and the SSVEP is evoked for the entire trial. In detail, the augmented epoch $x_a$ extracted from $[a_t, a_t + T_w]$ s should have SSVEP characteristics similar to those of $x_o$ and can serve as a new training sample for the detection model. Here $a_t$ indicates the start time moment for extracting the augmented epoch, and $(a_t + T_w)$ indicates the end time moment. The optimal augmented epoch $x_a$ can be determined by finding the time moment $a$ with the maximum similarity (where $a = a_t \times f_s$):

$$a^{*} = \arg\max_{a} \rho(x_o, x_a)$$
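The search for the most similar epoch can be sketched as a grid search over candidate shifts. The Python example below is a hedged illustration on a synthetic single-channel trial (a 10 Hz sinusoid plus Gaussian noise at 250 Hz); the function names, the noise level, and the search range of 40 samples are assumptions chosen for the example, not values from the study.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def best_shift(trial, start, n_s, max_shift):
    """Grid search over shifts 1..max_shift (in samples) after `start`.

    Returns (shift, similarity) of the epoch most similar to the original
    epoch trial[start:start + n_s].
    """
    x_o = trial[start:start + n_s]
    scores = {s: pearson(x_o, trial[start + s:start + s + n_s])
              for s in range(1, max_shift + 1)}
    best = max(scores, key=scores.get)
    return best, scores[best]

# Synthetic 3 s trial: 10 Hz sinusoid with additive Gaussian noise.
fs, f_n, d, t_w = 250, 10.0, 0.14, 0.6
rng = random.Random(0)
trial = [math.sin(2 * math.pi * f_n * i / fs) + 0.1 * rng.gauss(0, 1)
         for i in range(3 * fs)]
shift, sim = best_shift(trial, round(d * fs), round(t_w * fs), max_shift=40)
print(shift, round(sim, 3))
```

For a clean 10 Hz signal sampled at 250 Hz, the search range of 40 samples contains exactly one full period (25 samples), which is where the similarity peaks.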

Fig. 2. Similarity between the original epoch and the augmented epochs. The time moment of the augmented epochs ranges from $d \times f_s + 1$ to $d \times f_s + f_s$. (a) Example similarity calculated using class data of 9.75 Hz, 10.75 Hz, and 14.75 Hz, respectively. The blue dotted curve indicates the calculated similarity for each subject, and the black curve indicates the average similarity across all subjects. (b) Similarity of all classes from 9.25 Hz to 14.75 Hz. Color indicates the similarity at each time moment of each class.

Fig. 3. Change of the periodically occurring maximum similarity. The x-axis indicates the ordinal number $p$ of the augmented epochs, and the y-axis indicates the similarity between the original epoch and the $p$th augmented epoch.

Fig. 4. Classification accuracy of eTRCA under different numbers of training blocks $N_b$ and different augmentation numbers $N_a$ for the Benchmark dataset, presented as (a) a heatmap and (b) curves. The accuracy values in the heatmap were normalized to [0, 1] for each $N_b$ to better illustrate the change in accuracy with $N_a$.

Fig. 5. The classification accuracy distribution of all subjects for the dataset when (a) $N_b = 1$, (b) $N_b = 1$, and (c) $N_b = 1$, respectively. Each blue circle indicates the accuracy of one subject. The x value indicates the accuracy obtained by the subject without the PLTS, and the y value indicates the accuracy obtained with the PLTS.

Fig. 6. The classification accuracy under different augmentation conditions for (a) eTRCA and (b) TDCA. Conditions include the algorithm without PLTS (Original), with updated temporal templates (w/PLTS-t), with updated spatial filters (w/PLTS-sf), and with both the updated temporal templates and updated spatial filters (w/PLTS). The green brackets indicate p < 0.05, the blue brackets indicate p < 0.01, and the red brackets indicate p < 0.001.

Fig. 7 illustrates the classification accuracy and ITR under different time window lengths $T_o$ for online decoding and different calibration conditions. For $T_o = 0.7$ s, the condition ($N_b = 1$) required a short calibration time of $D_1 = 48$ s, but it yielded extremely low ITR for eTRCA and TDCA, which is impractical for online applications. The condition ($N_b = 1$, $N_a = 1$) required a slightly longer calibration time of $D_2 = 48 + 3.50 = 51.5$ s but achieved much higher ITR than the condition ($N_b = 1$), i.e., 95.56 ± 7.34 vs. 6.47 ± 1.30 bits/min for eTRCA, and 122.61 ± 7.05 vs. 40.88 ± 4.54 bits/min for TDCA. The condition ($N_b = 1$, $N_a = 6$) required a calibration time of $D_3 = 48 + 21.0 = 69$ s and, compared with the condition ($N_b = 2$), had comparable ITR for eTRCA and lower ITR for TDCA (note that the calibration time is $D_4 = 156$ s for the condition $N_b = 2$). One-way repeated measures ANOVA was used to evaluate the difference among the maximum ITRs achieved under the different calibration conditions, and the results revealed that the effect of calibration condition was statistically significant for eTRCA (F(1.28, 88.40) = 375.65, p < 0.001) and TDCA (F(1.23, 84.93) = 589.22, p < 0.001). These results indicate that the problem of low SSVEP performance with one training block can be solved by prolonging the calibration time by only 3.5 s and implementing the PLTS augmentation method.
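The calibration times above can be checked with a short calculation. The sketch below assumes that each trial of class $f_n$ is prolonged by $N_a / f_n$ seconds and that the stimulus set consists of 40 targets from 8.0 Hz to 15.8 Hz in 0.2 Hz steps. The frequency list is an assumption that is not stated in this passage, and the function name is hypothetical.

```python
def extra_calibration_time(frequencies_hz, n_aug):
    """Extra seconds per block when each trial is prolonged by n_aug / f_n."""
    return sum(n_aug / f for f in frequencies_hz)

# Assumed stimulus set: 40 targets, 8.0 Hz to 15.8 Hz in 0.2 Hz steps.
frequencies = [8.0 + 0.2 * k for k in range(40)]
base_time = 48.0  # s, one calibration block as reported in the text

for n_aug in (1, 6):
    total = base_time + extra_calibration_time(frequencies, n_aug)
    print(f"N_a = {n_aug}: total calibration time = {total:.1f} s")
```

Under these assumptions, the extra time is roughly 3.5 s for $N_a = 1$ and 21 s for $N_a = 6$, consistent with the reported $D_2$ and $D_3$.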

Fig. 7. The classification performance achieved under different calibration conditions. Sub-figures (a) and (b) show the accuracy of eTRCA and TDCA, and sub-figures (c) and (d) show the ITR of eTRCA and TDCA, respectively. The orange line indicates the condition of only one calibration block with a trial duration equal to the optimal window length $T_o$, which is selected by offline analysis. The purple line indicates the condition of only one calibration block with a prolonged trial duration of $T_o + 1/f_n$ for each class of $f_n$ Hz. The blue line indicates the condition of only one calibration block with a prolonged trial duration of $T_o + 6 \times 1/f_n$. The green line indicates the condition of two calibration blocks with a trial duration of $T_o$. Note that PLTS is implemented in the conditions with the prolonged trial duration, whereas the conditions with a trial duration of $T_o$ do not allow for PLTS implementation due to the lack of extra data in each trial.
combined with the spatial filtering based training methods. With the PLTS, additional time is required to train the detection model since the number of epochs increases from $N_f \times N_b \times 1$ to $N_f \times N_b \times (1 + N_a)$. Still, the training time is acceptable since the original training time of eTRCA and TDCA is relatively short. On a PC with a 2.90 GHz CPU, with the time window set to 0.7 s and five training blocks, the training time is 0.21 s for eTRCA and 1.16 s for TDCA without PLTS, and 0.82 s for eTRCA and 5.89 s for TDCA with PLTS at $N_a = 3$. Besides, it is easy to implement PLTS into the training-based SSVEP-BCI because only one additional step of the time-shift process

TABLE I. CLASSIFICATION PERFORMANCE COMPARISON AMONG ALGORITHMS WITH DIFFERENT AUGMENTATION METHODS INCLUDING ORIGINAL, SAME, AND PLTS