A Time-Local Weighted Transformation Recognition Framework for Steady State Visual Evoked Potentials Based Brain–Computer Interfaces

Canonical correlation analysis (CCA), multivariate synchronization index (MSI), and their extended methods have been widely used for target recognition in brain-computer interfaces (BCIs) based on steady-state visual evoked potentials (SSVEP), and covariance calculation is an important step in these algorithms. Some studies have shown that embedding time-local information into the covariance can improve the recognition performance of these algorithms. However, the improvement can only be observed in the recognition results, and the principle by which time-local information helps cannot be explained. Therefore, we propose a time-local weighted transformation (TT) recognition framework that directly embeds the time-local information into the electroencephalography signal through weighted transformation. The influence mechanism of time-local information on the SSVEP signal can then be observed in the frequency domain: low-frequency noise is suppressed at the expense of part of the SSVEP fundamental frequency energy, and the harmonic energy of SSVEP is enhanced at the cost of introducing a small amount of high-frequency noise. The experimental results show that the TT recognition framework can significantly improve the recognition ability of the algorithms and the separability of the extracted features. Its enhancement effect is significantly better than that of the traditional time-local covariance extraction method, which suggests considerable application potential.

SSVEP [9], [10] is generated by the visual cortex of the brain in response to a visual stimulus (e.g., an image or light flashing at a fixed frequency), and is a frequency-specific periodic EEG signal consisting of multiple discrete frequency components. The frequency-domain components of SSVEP are essentially constant and contain a fundamental frequency component that is the same as the stimulus frequency as well as higher-order harmonic components [9]. Therefore, SSVEP recognition algorithms focus on its frequency-domain features [10], [12]. Power Spectral Density Analysis (PSDA) [10], the earliest method applied to SSVEP identification, makes good use of the frequency-domain characteristics of single-channel SSVEP. More recently, single-channel SSVEP recognition algorithms have gradually been replaced by multi-channel algorithms, such as Minimum Energy Combination (MEC) [13], canonical correlation analysis (CCA) [14], Likelihood Ratio Test (LRT) [15], and Multivariate Synchronization Index (MSI) [16]. Currently, many studies focus on improving MSI and CCA. Zhang et al. [17] presented an extension of MSI (EMSI) that uses time-delay embedding to supplement the spatial and temporal features of the SSVEP signal. Qin et al. [18] proposed a filter bank-driven MSI (FBMSI) algorithm by applying the idea of the filter bank to MSI.
Many studies have also sought to improve CCA. Zhang et al. [19] described a multiset CCA (Mset-CCA) algorithm that extracts common features of SSVEP from multiple sets of EEG data recorded at the same stimulation frequency and optimizes the reference signal based entirely on training data. Liu et al. [20] proposed Deep multiset CCA (DMCCA) by applying neural networks to add nonlinear representations of training data and sine-cosine reference signals to individual reference templates. To avoid extracting common features from noise components, Jiao et al. [21] designed a multilayer correlation maximization (MCM) model to improve the generalization ability of Mset-CCA. Nakanishi et al. [22] incorporated individual SSVEP training data into the reference signal of CCA and proposed the individual training data CCA (ITCCA). Chen et al. [23] applied filter bank analysis to the CCA algorithm and proposed filter bank CCA (FBCCA). Islam et al. [24] enhanced the identifiability of SSVEP signals with an unsupervised training method and proposed binary subband-based CCA (BsCCA). Nakanishi [25] combined the standard CCA and ITCCA to propose a more robust extended CCA algorithm (eCCA).
Different from the above ideas, Nakanishi et al. [26] recently proposed task-related component analysis (TRCA), which maximizes the reproducibility during the task period to enhance the task-related SSVEP components in the EEG signal. Liu et al. [27] further improved the performance of SSVEP-BCI by solving the redundancy problem of spatial filters in TRCA and proposed task-discriminant component analysis (TDCA).
The above methods all present satisfactory recognition performance in SSVEP-BCI, but they ignore the time-local information of EEG signals when constructing covariance models. EEG is a non-stationary time-varying signal [28], [29], and previous studies have shown that the time-local covariance extraction (TE) approach can help algorithms improve the recognition performance of BCI [30], [31], [32], [33]. Zhang et al. [34] embedded the time-local information into MSI to capture the discriminative features of the SSVEP signal more effectively. Jin et al. [35] introduced time-local information to the TRCA algorithm to further improve its recognition accuracy. Shao et al. [36] added time-local information to the fundamental frequency sub-band of the FBCCA algorithm and obtained significantly better classification performance than FBCCA. However, the improvement brought by the TE method can only be observed in the recognition results, which cannot explain the principle by which time-local information helps. In this paper, we propose a time-local weighted transformation (TT) recognition framework from the perspective of weighted transformation (WT). The framework directly embeds the time-local information into EEG through WT. In the frequency domain, we can observe the influence mechanism of time-local information on the SSVEP signal: low-frequency noise is suppressed at the expense of part of the SSVEP fundamental frequency energy. At the same time, the harmonic energy of SSVEP is enhanced at the cost of introducing a small amount of high-frequency noise. This study combines the TT framework with two common basic algorithms, CCA and MSI, to evaluate its improvement effect and compare it with the traditional TE method. The experimental results show that the TT recognition framework can significantly improve the recognition ability of the algorithms, and that the enhancement effect is significantly better than that of the TE method. Compared with the TE method, TT provides higher feature separability and has a wide range of application prospects.
The remainder of the paper is organized as follows: Section II describes the datasets and the preprocessing procedure. Section III introduces the algorithms used: CCA, MSI, the TE method, and the proposed TT recognition framework. Section IV shows the principle of TT improvement and the recognition results of each algorithm under the TT recognition framework. Section V discusses the factors that may affect the results. Finally, Section VI concludes this paper.

II. DATASETS AND PREPROCESSING
A. Dataset 1
Dataset 1 includes the EEG data from 9 healthy subjects (6 males and 3 females, with normal or corrected-to-normal vision), none of whom had previously participated in an SSVEP experiment. The experiment consisted of 30 blocks for each subject, and each block contained 12 trials. Within each block, each of the six target stimuli (10-15 Hz, with a 1 Hz interval) was presented twice in random order. During the experiment, the target frequency square flashed for 3 s after a 1 s cue. The EEG signal was recorded using a 16-channel wireless physiological signal acquisition device (NEURACLE). Each channel was placed according to the international 10-20 system standard, and the sampling rate was 1000 Hz. The impedance of each channel was below 15 kΩ.

B. Dataset 2
Dataset 2 included the EEG data from 6 healthy subjects (4 males and 2 females, with normal or corrected-to-normal vision), all of whom had participated in the SSVEP experiment of Dataset 1. The experiment consisted of 6 blocks for each subject, and each block contained 12 trials. Other experimental conditions were the same as in Dataset 1.

C. Benchmark Dataset
The Benchmark dataset [37] included the EEG data from 35 healthy subjects (18 males and 17 females, with normal or corrected-to-normal vision). The experiment consisted of 6 blocks for each subject, and each block contained 40 trials. Within each block, each of the 40 target stimuli (8-15.8 Hz, with a 0.2 Hz interval) was presented once in random order. During the experiment, the target frequency square flashed for 5 s after a 0.5 s cue. The EEG signal was recorded using a 64-channel Synamps2 system (Neuroscan) with a sampling frequency of 1000 Hz. The impedance of each channel was lower than 10 kΩ. To keep the stimulus frequencies consistent among the three datasets, only the EEG data at 10-15 Hz (1 Hz interval) were selected for subsequent analysis.

D. Data Pre-Processing
In this paper, Dataset 1 was used for the effect and parameter analysis of the TT framework, and Dataset 2 and Benchmark Dataset were used for effectiveness verification.
First, the EEG signals in the three datasets were downsampled to 250 Hz, and 9 classical parietal and occipital channels (Pz, P3, P4, POz, PO7, PO8, Oz, O1, and O2) were selected. Considering the time delay in the visual system [38], the data were processed with a delay of 0.14 s. Therefore, the EEG intercept range is [1.14, 4.14] s for Dataset 1 and Dataset 2, and [0.64, 3.64] s for the Benchmark dataset.
All data were band-pass filtered to 10-90 Hz using a zero-phase Chebyshev type-I filter, and an additional 2 Hz bandwidth [23] was added on both sides of the passband during implementation. In addition, a 50 Hz notch filter was used to remove power-line interference.
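The preprocessing chain described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the filter order, passband ripple, notch quality factor, and the choice to filter the continuous recording before cutting the epoch are all assumptions, since the text does not specify them. The function name `preprocess` and its parameters are hypothetical.

```python
import numpy as np
from scipy import signal

FS_RAW = 1000  # original sampling rate in Hz (from the text)
FS = 250       # sampling rate after downsampling in Hz (from the text)


def preprocess(raw, cue_s=1.0, delay_s=0.14, length_s=3.0):
    """Downsample, filter, and epoch one trial.

    raw: array of shape (channels, samples) at 1000 Hz, starting at cue onset.
    """
    # 1000 Hz -> 250 Hz with a polyphase anti-aliasing resampler.
    x = signal.resample_poly(raw, up=1, down=FS_RAW // FS, axis=-1)

    # Zero-phase Chebyshev type-I band-pass: 10-90 Hz widened by 2 Hz per side.
    # Order 4 and 0.5 dB ripple are assumptions, not values from the paper.
    sos = signal.cheby1(4, 0.5, [8.0, 92.0], btype="bandpass",
                        fs=FS, output="sos")
    x = signal.sosfiltfilt(sos, x, axis=-1)

    # 50 Hz notch filter; the quality factor 35 is an assumption.
    b, a = signal.iirnotch(50.0, 35.0, fs=FS)
    x = signal.filtfilt(b, a, x, axis=-1)

    # Keep [cue + delay, cue + delay + length], e.g. [1.14, 4.14] s.
    start = int(round((cue_s + delay_s) * FS))
    stop = start + int(round(length_s * FS))
    return x[:, start:stop]
```

With the defaults, a 1 s cue and a 0.14 s delay give the [1.14, 4.14] s window of Dataset 1 and Dataset 2; passing `cue_s=0.5` gives the [0.64, 3.64] s window of the Benchmark dataset.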

III. METHODS
This section first describes the SSVEP recognition algorithms combined with our framework, i.e., MSI and CCA, then illustrates the TE method used for comparison, and finally introduces the proposed TT recognition framework.

A. Multivariate Synchronization Index Algorithm
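The core of the MSI-based frequency identification algorithm is to calculate the synchronization index between two multidimensional signals. Assume that the matrices $X \in \mathbb{R}^{N_1 \times M}$ and $Y \in \mathbb{R}^{N_2 \times M}$ represent two multidimensional signals, where $N_1$ and $N_2$ represent the signal dimensions and $M$ represents the signal sample length.
First, the joint correlation matrix between $X$ and $Y$ is computed as:

$$C = \begin{bmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{bmatrix}, \quad C_{11} = \frac{1}{M} X X^{T}, \quad C_{22} = \frac{1}{M} Y Y^{T}, \quad C_{12} = C_{21}^{T} = \frac{1}{M} X Y^{T} \tag{1}$$

In the joint correlation matrix, the autocorrelation matrices $X X^{T}$ and $Y Y^{T}$ affect the calculation of the synchronization index. To reduce this influence, which is not conducive to SSVEP identification, the joint correlation matrix is first whitened. The whitening matrix is as follows:

$$U = \begin{bmatrix} C_{11}^{-1/2} & 0 \\ 0 & C_{22}^{-1/2} \end{bmatrix} \tag{2}$$

After the transformation, the new joint correlation matrix is:

$$R = U C U^{T} \tag{3}$$

The new joint correlation matrix $R$ is eigendecomposed to obtain $P$ eigenvalues $(\lambda_1, \lambda_2, \ldots, \lambda_P)$, $P = N_1 + N_2$, and the eigenvalues $\lambda_i$ are then normalized as follows:

$$\lambda_i' = \frac{\lambda_i}{\sum_{i=1}^{P} \lambda_i} = \frac{\lambda_i}{\mathrm{tr}(R)} \tag{4}$$

The synchronization index of the input signals $X$ and $Y$ is calculated using the normalized eigenvalues:

$$S = 1 + \frac{\sum_{i=1}^{P} \lambda_i' \log(\lambda_i')}{\log(P)} \tag{5}$$

The synchronization index between the SSVEP signal and each sine-cosine reference signal $Y_f$ is calculated separately, and the recognized target frequency $f_{target}$ is determined by (6).

$$f_{target} = \arg\max_{f} S_f \tag{6}$$

The reference signal $Y_f$ is constructed as follows, where $f_s$ denotes the sampling frequency, $N_s$ denotes the number of sampling points, $N_h$ denotes the number of harmonics, and $f$ denotes the stimulation frequency.

$$Y_f = \begin{bmatrix} \sin(2\pi f t) \\ \cos(2\pi f t) \\ \vdots \\ \sin(2\pi N_h f t) \\ \cos(2\pi N_h f t) \end{bmatrix}, \quad t = \frac{1}{f_s}, \frac{2}{f_s}, \ldots, \frac{N_s}{f_s} \tag{7}$$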
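The MSI steps described above (joint correlation, whitening, eigenvalue normalization, synchronization index, and selection of the best-matching reference) can be sketched in NumPy as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the function names are hypothetical, the signals are mean-centered before the correlation is computed, and eigenvalues are clipped to a small positive floor to avoid evaluating log(0).

```python
import numpy as np


def reference(freq, n_samples, fs=250.0, n_harmonics=4):
    """Sine-cosine reference with shape (2 * n_harmonics, n_samples)."""
    t = np.arange(1, n_samples + 1) / fs
    rows = []
    for h in range(1, n_harmonics + 1):
        rows.append(np.sin(2 * np.pi * h * freq * t))
        rows.append(np.cos(2 * np.pi * h * freq * t))
    return np.vstack(rows)


def inv_sqrt(c):
    """Inverse square root of a symmetric positive definite matrix."""
    vals, vecs = np.linalg.eigh(c)
    return vecs @ np.diag(vals ** -0.5) @ vecs.T


def msi(x, y):
    """Synchronization index between x (N1, M) and y (N2, M)."""
    x = x - x.mean(axis=1, keepdims=True)  # centering is an assumption
    y = y - y.mean(axis=1, keepdims=True)
    m = x.shape[1]
    c11, c22, c12 = x @ x.T / m, y @ y.T / m, x @ y.T / m
    c = np.block([[c11, c12], [c12.T, c22]])
    n1, n2 = c11.shape[0], c22.shape[0]
    u = np.block([[inv_sqrt(c11), np.zeros((n1, n2))],
                  [np.zeros((n2, n1)), inv_sqrt(c22)]])
    r = u @ c @ u.T
    lam = np.clip(np.linalg.eigvalsh(r), 1e-12, None)  # guard against log(0)
    lam = lam / lam.sum()
    return 1.0 + np.sum(lam * np.log(lam)) / np.log(lam.size)


def classify(x, freqs, fs=250.0):
    """Return the candidate frequency with the largest synchronization index."""
    scores = [msi(x, reference(f, x.shape[1], fs)) for f in freqs]
    return freqs[int(np.argmax(scores))]
```

When the two signals are uncorrelated, the whitened matrix is close to the identity, the normalized eigenvalues are nearly equal, and the index approaches 0; stronger synchronization spreads the eigenvalues and pushes the index toward 1.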

B. Canonical Correlation Analysis
Different from the recognition principle of MSI, CCA obtains the projection vectors $W_x$ and $W_y$ by solving the optimization problem in (8), where $E[\cdot]$ denotes the mathematical expectation.

$$\rho = \max_{W_x, W_y} \frac{E[W_x^{T} X Y^{T} W_y]}{\sqrt{E[W_x^{T} X X^{T} W_x]\, E[W_y^{T} Y Y^{T} W_y]}} \tag{8}$$

The multi-channel input signals $X$ and $Y$ are each compressed into a one-dimensional vector using the projection vectors so that the correlation coefficient $\rho$ between $W_x^{T} X$ and $W_y^{T} Y$ is maximized. Finally, the target frequency is obtained by identifying the position of the maximum value in the sequence of correlation coefficients computed for all reference signals.
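A compact way to evaluate the maximum canonical correlation is sketched below. It is an illustration under assumptions, not the authors' implementation: instead of solving the generalized eigenvalue problem directly, it uses the numerically equivalent QR-plus-SVD route, in which the singular values of the product of the two orthonormal bases are the canonical correlations. The function names are hypothetical, and the rows of the reference are ordered as all sines followed by all cosines, which does not affect the result.

```python
import numpy as np


def max_canonical_corr(x, y):
    """Largest canonical correlation between x (N1, M) and y (N2, M)."""
    x = x - x.mean(axis=1, keepdims=True)
    y = y - y.mean(axis=1, keepdims=True)
    qx, _ = np.linalg.qr(x.T)  # orthonormal basis for the row space of x
    qy, _ = np.linalg.qr(y.T)  # orthonormal basis for the row space of y
    s = np.linalg.svd(qx.T @ qy, compute_uv=False)
    return float(np.clip(s[0], 0.0, 1.0))


def sincos_reference(freq, n_samples, fs=250.0, n_harmonics=4):
    """Sine-cosine reference with shape (2 * n_harmonics, n_samples)."""
    t = np.arange(1, n_samples + 1) / fs
    h = np.arange(1, n_harmonics + 1)[:, None]
    return np.vstack([np.sin(2 * np.pi * h * freq * t),
                      np.cos(2 * np.pi * h * freq * t)])


def cca_classify(x, freqs, fs=250.0):
    """Return the best-matching frequency and all correlation coefficients."""
    rho = [max_canonical_corr(x, sincos_reference(f, x.shape[1], fs))
           for f in freqs]
    return freqs[int(np.argmax(rho))], rho
```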

C. Time-Local Covariance Extraction
The derivation and calculation of MSI and CCA require the covariance matrix, which is also a necessary step in many SSVEP algorithms [18], [26], [27]. However, the traditional covariance modeling process ignores the time-local information of the EEG signal. Studies [34], [35], [36] have shown that TE can uncover the temporal discriminant structure of the SSVEP signal, remove temporal artifact noise, and play a positive role in the identification of SSVEP.
First, the adjacency matrix and multi-channel signals are defined as $W \in \mathbb{R}^{M \times M}$ and $X \in \mathbb{R}^{N \times M}$, respectively, where $M$ represents the signal length and $N$ represents the signal dimension. The time-local covariance matrix $C$ can then be expressed as:

$$C = \frac{1}{2} \sum_{i=1}^{M} \sum_{j=1}^{M} W_{ij} (x_i - x_j)(x_i - x_j)^{T} \tag{9}$$

where $x_i$ denotes the $i$-th column (sampling point) of $X$. Then, the diagonal matrix $D$ and Laplace matrix $L$ are defined as $D_{ii} = \sum_{j} W_{ij}$ and $L = D - W$, respectively. Thus, (9) can be simplified as follows:

$$C = X L X^{T} \tag{10}$$

The adjacency matrix $W$ is defined by Tukey's tricube weighting function [34], [35], [36]:

$$W_{ij} = \begin{cases} \left(1 - \left(\dfrac{|i-j|}{\tau}\right)^{r}\right)^{r}, & |i-j| < \tau \\ 0, & \text{otherwise} \end{cases} \tag{11}$$

where $\tau$ represents the local time range, and $r$ represents the function order. In general, $r$ is set to 3 with $\tau \geq 2$, because when $\tau < 2$, i.e., $\tau = 1$, $W$ contains no nonzero weights between distinct sampling points and therefore carries no time-local information.
Compared with the covariance matrix used in the standard CCA and MSI, the time-local covariance matrix generated by (9)-(11) provides more discriminable information, thus improving the recognition ability of the algorithm [34], [36].
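The construction of the time-local covariance can be sketched as follows. The sketch assumes the standard form of Tukey's tricube weight, (1 - (|i - j| / τ)^r)^r for |i - j| < τ and 0 otherwise, and a pairwise-difference sum with a factor of 1/2, under which the double sum and the compact Laplace-matrix form agree exactly. The function names are hypothetical.

```python
import numpy as np


def tricube_adjacency(m, tau, r=3):
    """Adjacency matrix W (m, m) with tricube weights for |i - j| < tau."""
    idx = np.arange(m)
    dist = np.abs(idx[:, None] - idx[None, :])
    w = (1.0 - (dist / tau) ** r) ** r
    w[dist >= tau] = 0.0  # samples outside the local window get zero weight
    return w


def laplacian(w):
    """Laplace matrix L = D - W, with D the diagonal matrix of row sums."""
    return np.diag(w.sum(axis=1)) - w


def time_local_cov(x, tau, r=3):
    """Time-local covariance X L X^T for x of shape (channels, samples)."""
    l = laplacian(tricube_adjacency(x.shape[1], tau, r))
    return x @ l @ x.T
```

Because only differences between distinct sampling points enter the double sum, the diagonal of W cancels in L, so the result does not depend on how the weight of a sample with itself is defined.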

D. Time-Local WT Recognition Framework
The TE method extracts time-local information by embedding the Laplace matrix L in the covariance. However, the enhancement effect of TE can only be observed in the recognition results, and the corresponding improvement principle is not clear. Therefore, we propose a TT recognition framework from the perspective of WT. The framework uses the Laplace matrix L as the WT matrix to directly embed the time-local information into the EEG signal, and its structure is shown in Fig. 1.
First, both the EEG signal X and the reference signal Y undergo the time-local WT by introducing the Laplace matrix L. Then, the autocovariance and cross-covariance of the two transformed signals are calculated separately and passed to the covariance-based identification algorithm. Finally, the target frequency of SSVEP is obtained by identifying the position of the maximum value in the feature vector sequence. In theory, the TT recognition framework can be applied to most SSVEP recognition algorithms that use covariance calculation. Considering the complexity and workload of modifying many algorithms, this study uses only the commonly used algorithms MSI and CCA to analyze the performance and the enhancement principle of the TT framework. For ease of description, MSI and CCA combined with TT are named TTMSI and TTCCA, respectively, and combined with TE are named TMSI and TCCA, respectively.
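The first two steps of the framework can be sketched as follows: both signals are multiplied by the transpose of the Laplace matrix, and the covariances are then computed from the transformed signals. This is a minimal sketch under assumptions, not the authors' implementation; the tricube weights are assumed to take the standard form, and the function names are hypothetical.

```python
import numpy as np


def tricube_laplacian(m, tau, r=3):
    """Laplace matrix L = D - W for tricube weights over |i - j| < tau."""
    idx = np.arange(m)
    dist = np.abs(idx[:, None] - idx[None, :])
    w = np.where(dist < tau, (1.0 - (dist / tau) ** r) ** r, 0.0)
    return np.diag(w.sum(axis=1)) - w


def tt_transform(x, l):
    """Time-local weighted transformation: X -> X L^T."""
    return x @ l.T


def tt_covariances(x, y, tau):
    """Auto- and cross-covariances of the transformed signals.

    x: EEG of shape (N1, M); y: reference of shape (N2, M).
    """
    l = tricube_laplacian(x.shape[1], tau)
    xt, yt = tt_transform(x, l), tt_transform(y, l)
    return xt @ xt.T, yt @ yt.T, xt @ yt.T
```

The three returned matrices take the place of the ordinary products X X^T, Y Y^T, and X Y^T in a covariance-based algorithm such as MSI or CCA. Any normalization by the sample length is omitted here because it scales all three blocks equally.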

A. Influence Mechanism of TT on SSVEP Signal
The frequency-domain features are the most significant features of the SSVEP signal [9], [10], [14], [16], so this section explores the influence mechanism of TT on the SSVEP signal from the frequency-domain perspective. To compare the effects of TT and TE on the SSVEP signal, this section also uses the idea of WT to explain the improvement mechanism of TE. Therefore, it is necessary to first decompose the TE matrix $L$ into two WT matrices using Cholesky decomposition (CD), as shown in (12).

$$X L X^{T} = X D^{T} D X^{T} = (X D^{T})(X D^{T})^{T}, \quad L = D^{T} D \tag{12}$$

Here $D$ denotes the transform matrix of TE obtained from the CD, not the diagonal matrix used in the definition of $L$.
However, the CD requires a symmetric positive definite matrix, and $L$ is not positive definite in most cases: it has exactly one negative eigenvalue, on the order of 10e-15. Therefore, the matrix $L$ needs to be approximated by (13) before the CD.

$$\hat{L} = P\,|\Lambda|\,P^{T} \tag{13}$$

where $P$ denotes the eigenvector matrix of $L$, $\Lambda$ denotes the diagonal eigenvalue matrix of $L$, and $|\cdot|$ denotes the absolute value. The approximation converts the extremely small negative eigenvalue into a positive one so that $\hat{L}$ satisfies the positive definite condition required in (12). Taking 750 sampling points and $\tau = 7$ as an example, the 2-norm of $L - \hat{L}$ is calculated to be only 9.63e-13, which indicates that the error introduced by this approximation is tiny and that the two matrices can be used interchangeably.
After approximating $L$ by $\hat{L}$, the CD is applied to $\hat{L}$ to obtain the transform matrix $D$ of TE. Then the Fast Fourier Transform is used to analyze the raw EEG (REEG) signal $X$, the time-local EEG (TEEG) $X D^{T}$, and the time-local WT EEG (TTEEG) $X L^{T}$ in the frequency domain. Taking the stimulation frequency of 13 Hz as an example, Fig. 2 shows the power ratio of the frequency domain in the three kinds of EEG data from channel Oz in Dataset 1.
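The approximation and the frequency-domain comparison can be sketched as follows, with several stated assumptions. The tricube weights are assumed to take the standard form. The "power ratio" is assumed to be the share of total spectral power in each frequency bin. To avoid numerical failure of a Cholesky routine on a nearly singular matrix, the sketch swaps in an eigen-based factor whose product with its own transpose also reproduces the approximated matrix; because the transformed spectrum depends on which factor is chosen, the TEEG computed here is a stand-in and need not match the Cholesky-based TEEG of Fig. 2. The signal is synthetic, not recorded EEG, and all function names are hypothetical.

```python
import numpy as np


def tricube_laplacian(m, tau, r=3):
    """Laplace matrix L = D - W for tricube weights over |i - j| < tau."""
    dist = np.abs(np.subtract.outer(np.arange(m), np.arange(m)))
    w = np.where(dist < tau, (1.0 - (dist / tau) ** r) ** r, 0.0)
    return np.diag(w.sum(axis=1)) - w


def abs_eig_approx(l):
    """Replace the eigenvalues of symmetric L by their absolute values."""
    lam, p = np.linalg.eigh(l)
    lam_abs = np.abs(lam)
    l_hat = p @ np.diag(lam_abs) @ p.T
    factor = np.diag(np.sqrt(lam_abs)) @ p.T  # factor.T @ factor == l_hat
    return l_hat, factor


def power_ratio(sig, fs=250.0):
    """Frequencies and share of total power per bin (assumed definition)."""
    power = np.abs(np.fft.rfft(sig)) ** 2
    return np.fft.rfftfreq(sig.size, d=1.0 / fs), power / power.sum()


m, tau, fs = 750, 7, 250.0
l = tricube_laplacian(m, tau)
l_hat, d = abs_eig_approx(l)
approx_error = np.linalg.norm(l - l_hat, 2)  # expected to be tiny

# Synthetic single-channel stand-in: offset + 13 Hz + 26 Hz + noise.
rng = np.random.default_rng(0)
t = np.arange(m) / fs
x = (2.0 + np.sin(2 * np.pi * 13 * t) + 0.3 * np.sin(2 * np.pi * 26 * t)
     + rng.normal(0, 0.5, m))

freqs, raw_ratio = power_ratio(x)      # REEG
_, te_ratio = power_ratio(x @ d.T)     # TEEG stand-in (eigen-based factor)
_, tt_ratio = power_ratio(x @ l.T)     # TTEEG
```

Every row of the Laplace matrix sums to zero, so a constant offset is removed entirely by the transformation, which is consistent with the observation that time-local information suppresses the lowest frequencies.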
As shown in Fig. 2(a), the low-frequency components of TEEG and TTEEG have lower power ratios than those of REEG, and the frequency-domain energies of TEEG and TTEEG are always located below that of REEG before the fundamental frequency (13 Hz), indicating that the time-local information is beneficial for suppressing low-frequency noise. At the same time, the suppression of low-frequency noise also affects the fundamental frequency component and reduces its energy ratio, as shown in Fig. 2(b). In addition, we can observe an increased power ratio at multiple target stimulus frequencies in the high-frequency components of TEEG and TTEEG, as shown in Fig. 2(c), (d), and (e), suggesting that time-local information can reinforce the harmonic energy of SSVEP. However, the reinforcing effect on high frequencies also introduces a small amount of high-frequency noise; e.g., beyond the second harmonic (26 Hz), the frequency-domain energies of TEEG and TTEEG are always located above that of REEG. This means that in addition to the energy of the target harmonic frequencies (26 Hz, 39 Hz, 52 Hz), the energy of the other frequencies also increases. In conclusion, the influence mechanism of time-local information on the SSVEP can be divided into two parts. Low-frequency noise is suppressed at the expense of part of the SSVEP fundamental frequency energy. At the same time, the harmonic energy of SSVEP is enhanced at the cost of introducing a small amount of high-frequency noise.
Meanwhile, we can find that compared with TEEG, TTEEG has a lower proportion of low-frequency noise energy and a higher proportion of harmonic energy, both of which are more favorable to SSVEP identification.

B. Relationship Between TT Parameters and Signal Length
TT extracts time-local information by assigning weights to each sampling point and the sampling points in its nearby time window, and then summing them cumulatively. The transformation process is shown in Fig. 3.
As can be seen from Fig. 3, the computation is similar to that of convolutional neural networks (CNN). The time-local window length can be seen as the size of the convolution kernel, and the weighted summation of the sampling points in the time-local window can be considered as the convolution operation. The size of the convolution kernel directly affects the final results of a CNN, so the selection of τ may also be crucial to the recognition ability of the subsequent algorithm. Considering that signals of different time lengths may require different parameters [18], this section uses MSI and CCA (with the number of harmonics set to 4) to calculate the classification accuracy in the TT framework for different signal lengths (0.5-3 s, in steps of 0.1 s), with τ in the range [2, 12]. In addition, to compare the two time-local information extraction methods, the same operation was performed for the TE method. To show more clearly how the parameter τ at which the algorithms reach the highest recognition accuracy varies with signal length, we apply 0-1 normalization to the recognition accuracy across the values of τ within the same time window. The calculation results are shown in Fig. 4.
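The convolution analogy can be checked numerically. Away from the two ends of the signal, every row of the Laplace matrix holds the same symmetric set of weights, so multiplying by it is equivalent to a one-dimensional convolution with a kernel of length 2τ - 1. The sketch below assumes the standard tricube weights; the function names are hypothetical.

```python
import numpy as np


def tricube_laplacian(m, tau, r=3):
    """Laplace matrix L = D - W for tricube weights over |i - j| < tau."""
    idx = np.arange(m)
    dist = np.abs(idx[:, None] - idx[None, :])
    w = np.where(dist < tau, (1.0 - (dist / tau) ** r) ** r, 0.0)
    return np.diag(w.sum(axis=1)) - w


def interior_kernel(tau, r=3):
    """A row of L away from the signal edges, as a symmetric 1-D kernel."""
    lags = np.arange(-(tau - 1), tau)
    w = (1.0 - (np.abs(lags) / tau) ** r) ** r
    kernel = -w                               # off-centre taps: -W_ij
    kernel[tau - 1] = w.sum() - w[tau - 1]    # centre tap: sum of neighbours
    return kernel


m, tau = 100, 7
rng = np.random.default_rng(0)
x = rng.normal(size=m)

by_matrix = tricube_laplacian(m, tau) @ x
by_conv = np.convolve(x, interior_kernel(tau), mode="valid")

# The two agree for every sample whose full window lies inside the signal.
interior = by_matrix[tau - 1:m - tau + 1]
```

Only the first and last τ - 1 samples differ from a plain convolution, because their windows are truncated at the signal boundary and the row sums of the adjacency matrix shrink accordingly.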
The color bar shows the normalized recognition accuracy for signals of different time lengths. The blue line represents the τ value corresponding to the highest recognition accuracy at each signal length. When the four algorithms achieve the optimal recognition accuracy at different signal lengths, their requirements for the parameter τ do differ, but the range of variation is relatively limited. When TMSI and TTMSI reach the optimal performance, the parameter τ is about 5 and 7, respectively; the overall variation range of τ is small and the frequency of variation is low. Although TCCA can obtain the highest recognition accuracy when τ is around 5, τ changes too frequently. When TTCCA achieves the optimal effect, the overall variation frequency of τ is low and τ is stable around 7, but the range of variation is slightly larger than that of the other methods. Overall, the four algorithms have a relatively stable demand for parameter τ at different signal lengths, and the difference within the floating range is not obvious. To avoid overfitting in the subsequent exploration and to facilitate the subsequent discussion of the results, the parameter τ is set to 4 for TMSI, 7 for TTMSI, 5 for TCCA, and 7 for TTCCA.

C. Performance Improvement of TT for SSVEP Target Identification
This section compares the recognition accuracy of the four algorithms in Dataset 1, Dataset 2, and Benchmark Dataset using the parameter τ obtained in Section IV-B.To better observe the effect of the algorithm performance improvement, we also include the baseline test of the standard CCA and MSI algorithms.The number of harmonics is set to 4 for all six algorithms, and the results are shown in Fig. 5.
As shown in Fig. 5, TMSI, TTMSI, TCCA, and TTCCA show higher recognition accuracy than MSI and CCA in all datasets, indicating that time-local information is helpful for SSVEP recognition. Of the two methods of extracting time-local information, TT shows significantly better recognition performance than TE in the time window of [0.7, 3] s. In addition, the accuracy of TTMSI is higher than that of TTCCA in Dataset 1 and the Benchmark Dataset, but slightly lower than that of TTCCA in the time window of [0.6, 0.8] s in Dataset 2, which may be caused by the smaller amount of data in Dataset 2. The combined results of the three datasets show that the recognition ability of TTCCA is close to that of TMSI, while TTMSI is significantly better than TMSI. The information transfer rate (ITR) is also a metric that cannot be ignored when measuring the impact of TT on SSVEP target recognition performance. Considering the large gap between MSI, CCA, and the improved algorithms, only the ITR results of TMSI, TTMSI, TCCA, and TTCCA are shown in Fig. 6.
In Fig. 6(a), the four algorithms reach their highest ITR at different signal lengths in Dataset 1. Among them, TTMSI reaches the highest ITR of 90.84 bits/min at 0.8 s, an improvement of 5.46 bits/min compared to TMSI. TTCCA reaches the highest ITR of 84.05 bits/min at 0.9 s, an improvement of 12.53 bits/min compared to TCCA. The results indicate that the TT recognition framework performs better than the TE method for both algorithms.
In Fig. 6(b), TTMSI, TTCCA, and TMSI achieve the highest ITR at 0.8 s in Dataset 2. At this signal length, the ITR of TTMSI is 3.81 bits/min higher than that of TMSI, and the ITR of TTCCA is 15.38 bits/min higher than that of TCCA. Similar to the results of Dataset 1, CCA gains a larger performance improvement from the TT recognition framework, while TTMSI obtains even better overall performance. In Fig. 6(c), the ITR of TTMSI reaches 85.02 bits/min in the Benchmark Dataset when the signal length is 0.9 s, and the corresponding recognition accuracy is 77.06%. Compared with TMSI, the ITR of TTMSI has increased by 8.13 bits/min, and the recognition accuracy has increased by 3.09%. TTCCA and TCCA reach the highest ITR at 1 s with 75.38 bits/min and 68.10 bits/min, respectively, and the corresponding recognition accuracies are 76.59% and 73.49%. In addition, the accuracy and ITR of TTMSI at 0.9 s are better than those of TTCCA at 1 s. Therefore, the TT recognition framework should be more suitable for MSI. Combining the results in Fig. 5 and Fig. 6, TT shows superiority over the TE method in the three datasets in both recognition accuracy and ITR. The experimental results demonstrate the effectiveness of the TT recognition framework, which has great application potential.
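The text does not state how the ITR is computed. The sketch below assumes the standard Wolpaw definition, with six targets (the 10-15 Hz stimuli used in the analysis) and a selection time equal to the signal length, with no gaze-shift interval added. Under these assumptions the function reproduces the Benchmark values quoted above to within rounding, which suggests, but does not confirm, that this is the definition used. The function name is hypothetical.

```python
import math


def itr_bits_per_min(n_targets, accuracy, seconds):
    """Wolpaw information transfer rate in bits per minute.

    n_targets: number of selectable targets (N)
    accuracy:  recognition accuracy P in [0, 1]
    seconds:   time needed for one selection (T)
    """
    p = accuracy
    bits = math.log2(n_targets)
    if 0.0 < p < 1.0:
        bits += p * math.log2(p)
        bits += (1.0 - p) * math.log2((1.0 - p) / (n_targets - 1))
    elif p == 0.0:
        bits += math.log2(1.0 / (n_targets - 1))
    return 60.0 / seconds * bits


# Reported Benchmark operating points, used only as a consistency check.
ttcca = itr_bits_per_min(6, 0.7659, 1.0)   # text reports 75.38 bits/min
ttmsi = itr_bits_per_min(6, 0.7706, 0.9)   # text reports 85.02 bits/min
```

Because the selection time enters as a divisor, a shorter window can yield a higher ITR even at slightly lower accuracy, which is why the peak ITR occurs at 0.8-1 s rather than at the longest signal length.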
Previous literature [39] has shown that hypothesis testing results are more accurate for datasets with larger sample sizes. Therefore, this study uses the paired t-test on the Benchmark Dataset to conduct a statistical analysis of the recognition accuracy of the four algorithms. The data of [0.8, 1.2] s are chosen for the t-test because the data lengths at which the different algorithms reach their highest ITR are concentrated in this segment. The results are shown in Table I. As can be seen, there is a significant difference between TT and TE in the time window of [0.8, 1.2] s. Within the TT recognition framework, TTMSI and TTCCA also differ significantly, which once again shows that the TT recognition framework is more suitable for MSI.

V. DISCUSSION
This section first explores the impact of the TT framework on the separability of the features extracted by the algorithms and further explains how the TT framework enhances the algorithms. Then, we discuss the application potential of the TT framework by examining whether different algorithms have consistent requirements for the parameter τ in the TT framework. Finally, the stacking number of WT matrices, a factor that may affect the application effect of the TT framework, is explored. It is found that the stacking number affects the final recognition effect, but it also affects the interpretability of the TT framework.
Fig. 7. Separability of extracted features of the MSI, TMSI, and TTMSI algorithms in the Benchmark dataset with a time window of 2 s. The higher the R-square, the higher the SSVEP separability.

A. Effect of TT on the Separability of Extracted Features
Section IV-A describes the mechanism by which the two time-local information extraction methods act on SSVEP signals. To better understand the impact of this mechanism on signal recognition, this section uses the R-square [22], [26], [35] as an indicator to compare the separability of features extracted by the algorithms on the Benchmark dataset. R-square is computed based on the correlation coefficient between the target stimulus eigenvalue and the maximum eigenvalue of the non-target stimulus. For presentation purposes, only the more effective MSI algorithm is used for the separability comparison, and the results are shown in Fig. 7.
As can be seen in Fig. 7, the TT framework significantly increases the separability of the extracted features at all stimulus frequencies except 11 Hz, where a certain degree of degradation occurs. Except at 11 Hz, the TTMSI algorithm has the highest feature separability for all target stimuli. In addition, both TMSI and TTMSI extract features with higher separability than MSI at 12-15 Hz, which indicates that the enhancement of the harmonic energy and the reduction of the fundamental frequency energy in Fig. 2 ultimately produce a positive gain in the separability of the data and thereby improve the recognition performance of the algorithm. Meanwhile, at 10 Hz, the feature separability of TMSI is still lower than that of MSI, but that of TTMSI is significantly higher than that of MSI. Therefore, the frequency range in which TT produces a positive gain in separability is wider and the gain is larger, which further supports the superiority of TT over TE.

B. Consistency of TT Parameters τ
Fig. 4 in Section IV-B shows the stability of the parameter τ in Dataset 1 for the TMSI, TTMSI, TCCA, and TTCCA.The parameter τ does not change substantially for the four algorithms.The choice of parameter τ directly determines the recognition ability of the subsequent algorithms.If there is consistency in the demand for parameter τ across different datasets, the practical application of the TT recognition framework will be much more effective.Therefore, this section will carry out the same discussion as Section IV-B in Benchmark Dataset, exploring whether the requirements of the four algorithms for τ are consistent with Dataset 1, and the results are shown in Fig. 8.
Fig. 8 shows that, when the four algorithms achieve the best recognition accuracy under different time length signals, there are also some differences in the demand for parameter τ .Different from Section IV-B, when the signal length exceeds 2.5 s, the variation frequency and amplitude of τ increase significantly.In practical applications, the stimulation time is generally controlled within 2.5 s [23], [40], so data after 2.5 s is not included in the consistency discussion.
In the time window of [0.5, 2.5] s, TMSI achieves the optimal recognition accuracy when τ is around 5, which is slightly different from τ around 4 in Fig. 4(a).The optimal recognition accuracy of TTMSI is obtained when τ is around 7, which is consistent with the results in Fig. 4(b), and the overall floating frequency is also relatively low.The results of TCCA are also basically in line with Fig. 4(c), with τ stabilizing around 5 when the highest recognition accuracy is reached.Unlike the above results, parameter τ jumps repeatedly between 7 and 8 when TTCCA achieves the best recognition accuracy.Such a phenomenon is also inconsistent with that in Fig. 4(d).Overall, only TTMSI showed consistency in two completely different datasets for the parameter τ .Combined with the classification results in Section IV-C, perhaps TTMSI (τ = 7) can be considered as a stable training-free algorithm with higher application potential.

C. Influence of Stacking Number of WT Matrices on Recognition Effect
In this paper, L is used as a WT matrix to extract the time-local information of the SSVEP signal. In the practical calculation process, the method is equivalent to replacing L in TE with $L^{T} L$, that is, using a weighted stack of two L matrices to improve the SSVEP recognition performance of the algorithm. This section explores the impact of the stacking number of the transformation matrix L on the recognition effect. To clearly show this impact, only the data of [0.7, 1.2] s are taken for analysis in this section. The MSI algorithm, which is more applicable to the TT framework, is used to measure the influence of different stacking numbers on the recognition effect. The results are shown in Fig. 9. Ti represents the MSI algorithm with i stacked matrices L. When i = 1, T1 is TMSI, and when i = 2, T2 is TTMSI.
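The stacking described above can be sketched as a matrix power. The sketch assumes that stacking i matrices means inserting the i-th power of the symmetric Laplace matrix between the two signals, which matches the two cases named in the text (one factor for TE, two factors for TT) but is an extrapolation for larger i. The tricube weights are assumed to take the standard form, and the function names are hypothetical.

```python
import numpy as np


def tricube_laplacian(m, tau, r=3):
    """Laplace matrix L = D - W for tricube weights over |i - j| < tau."""
    idx = np.arange(m)
    dist = np.abs(idx[:, None] - idx[None, :])
    w = np.where(dist < tau, (1.0 - (dist / tau) ** r) ** r, 0.0)
    return np.diag(w.sum(axis=1)) - w


def stacked_cov(x, y, l, i):
    """Covariance X L^i Y^T for stacking number i (assumed form of Ti)."""
    return x @ np.linalg.matrix_power(l, i) @ y.T


rng = np.random.default_rng(0)
x = rng.normal(size=(9, 200))
l = tricube_laplacian(200, 7)

t1 = stacked_cov(x, x, l, 1)  # TE: X L X^T
t2 = stacked_cov(x, x, l, 2)  # TT: (X L^T)(X L^T)^T
```

Only even stacking numbers can be written as the covariance of a transformed signal, which is one way to read the remark that T2 has a direct weighted-transformation interpretation.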
Fig. 9(a) and Fig. 9(b) show the recognition accuracy of Ti under different signal lengths. Combining the results of the two figures, the stacking number of the WT matrix indeed affects the recognition accuracy of the algorithm and the stability of the τ value. In Fig. 9(a) and Fig. 9(b), T2 does not achieve the highest recognition accuracy in most of the time windows. The stacking number that achieves the highest recognition accuracy varies with signal length, and no single stacking number remains optimal throughout. From the perspective of WT, however, stacking more than two matrices has no theoretical interpretation, and an explanation from the perspective of neural networks may be needed. The convolution operation of a neural network usually requires a stack of multiple convolution blocks to obtain better network performance. However, as the number of convolutions continues to increase, the network performance may degrade, and the recognition performance begins to decline. The phenomena in Fig. 9(a) and Fig. 9(b) are consistent with this characteristic of the convolution operation: when the stacking number of the transformation matrix is increased to 3-4, better recognition performance is achieved and more separable features can be extracted. However, the accuracy begins to decline if the stacking number is increased to 5-6.
In summary, from the perspective of WT, T2 can achieve a stable recognition effect and be reasonably interpretable.Multiple matrix L stacks can achieve higher recognition accuracy but lack interpretability.Perhaps the mechanism of operation can be explained from the perspective of neural networks, which would be an open discussion.

VI. CONCLUSION
Previous works have shown that time-local information can improve the recognition of SSVEP, but its improvement mechanism has not been clear. This study proposes a TT recognition framework from the perspective of weighted transformation, which reveals the influence mechanism of time-local information on SSVEP recognition. Low-frequency noise is suppressed at the expense of part of the SSVEP fundamental frequency energy. At the same time, the harmonic energy of SSVEP is enhanced at the cost of introducing a small amount of high-frequency noise. Their combined effect ultimately increases the separability of SSVEP. Experimental results indicate that, combined with the TT recognition framework, MSI shows the best recognition results and higher application potential.

Fig. 1. TT identification framework. Colored blocks indicate the function of the module above.

Fig. 2. (a) The power ratio of the frequency domain in raw EEG, TEEG, and TTEEG from channel Oz in Dataset 1, τ = 7. The image on the colored orb is a local enlargement of the image inside the blue circle. (b)-(e) Variation of the energy distribution of the fundamental frequency and the higher harmonics of the three signals.

Fig. 3. Transformation process of TT.The weights of the sampling points in the window outside the dashed line are 0.

Fig. 4. Recognition accuracy of different algorithms at different signal lengths with τ in the range [2, 12] (Dataset 1). (a) TMSI, (b) TTMSI, (c) TCCA, and (d) TTCCA. The blue lines represent the τ values corresponding to the highest recognition accuracy at the same signal length. The color bar represents the normalized recognition accuracy; the closer to red, the higher the accuracy at the current data length.

Fig. 8. Recognition accuracy of different algorithms at different signal lengths with τ in the range [2, 12] (Benchmark Dataset). (a) TMSI, (b) TTMSI, (c) TCCA, and (d) TTCCA. The blue lines represent the τ values corresponding to the highest recognition accuracy at the same signal length. The color bar represents the normalized recognition accuracy; the closer to red, the higher the accuracy at the current data length.

Fig. 9. Influence of the stacking number of WT matrices L on the recognition effect. Ti represents the MSI algorithm with i stacked matrices L. The blue dashed line represents the recognition accuracy of T2. (a) Dataset 1. (b) Benchmark Dataset.

TABLE I
STATISTICAL ANALYSIS OF ACCURACY AMONG TTMSI, TMSI, TTCCA, AND TCCA BY THE PAIRED T-TEST