Alignment-Based Adversarial Training (ABAT) for Improving the Robustness and Accuracy of EEG-Based BCIs

Machine learning has achieved great success in electroencephalogram (EEG) based brain-computer interfaces (BCIs). Most existing BCI studies focused on improving the decoding accuracy, with only a few considering the adversarial security. Although many adversarial defense approaches have been proposed in other application domains such as computer vision, previous research showed that their direct extensions to BCIs degrade the classification accuracy on benign samples. This phenomenon greatly affects the applicability of adversarial defense approaches to EEG-based BCIs. To mitigate this problem, we propose alignment-based adversarial training (ABAT), which performs EEG data alignment before adversarial training. Data alignment aligns EEG trials from different domains to reduce their distribution discrepancies, and adversarial training further robustifies the classification boundary. The integration of data alignment and adversarial training can make the trained EEG classifiers simultaneously more accurate and more robust. Experiments on five EEG datasets from two different BCI paradigms (motor imagery classification, and event related potential recognition), three convolutional neural network classifiers (EEGNet, ShallowCNN and DeepCNN) and three different experimental settings (offline within-subject cross-block/-session classification, online cross-session classification, and pre-trained classifiers) demonstrated its effectiveness. It is very intriguing that adversarial attacks, which are usually used to damage BCI systems, can be used in ABAT to simultaneously improve the model accuracy and robustness.

A brain-computer interface (BCI) enables direct communication between the brain and an external device such as a computer [1]. Electroencephalogram (EEG), which records the brain's electrical activities from the scalp, is the most commonly used input signal in non-invasive BCIs, due to its affordability and ease of use [2]. An EEG-based BCI system typically includes four components: signal acquisition, signal processing, machine learning, and a controller, as illustrated in Fig. 1.
Most prior research on EEG decoding primarily focused on the accuracy and efficiency of machine learning algorithms [3]. Nonetheless, a critical discovery by Zhang and Wu [4] revealed that adversarial examples, generated using the unsupervised fast gradient sign method (FGSM) [5], can significantly degrade the performance of deep learning classifiers in EEG-based BCIs. They introduced an attack framework that transforms a benign EEG epoch into an adversarial one by injecting a jamming module before machine learning to add adversarial perturbations, as depicted in Fig. 2. Furthermore, Zhang et al. [6] demonstrated that adversarial examples can also fool traditional machine learning classifiers in BCI spellers, misleading them to output an arbitrary (incorrect) character specified by the attacker. Liu et al. [7] and Jung et al. [8] developed approaches to generate universal adversarial perturbations for EEG-based BCIs, making adversarial attacks much easier to implement. Bian et al. [9] employed simple square wave signals to generate adversarial examples for attacking steady-state visual evoked potential based BCIs. Wang et al. [10] investigated physically constrained adversarial attacks to BCIs. Meng et al. [11] also performed adversarial attacks in EEG-based BCI regression problems.
Adversarial attacks to EEG-based BCIs could have various consequences, from mere user frustration to life-threatening accidents. As pointed out in [12], "In BCI spellers for Amyotrophic Lateral Sclerosis patients, adversarial attacks may hijack the user's true input and output wrong letters. The user's intention may be manipulated, or the user may feel too frustrated to use the BCI speller, losing his/her only way to communicate with others. In BCI-based driver drowsiness estimation [13], adversarial attacks may manipulate the output of the BCI system and increase the risk of accidents. In EEG-based awareness evaluation/detection for disorder of consciousness patients [14], adversarial attacks may disturb the true responses of the patients and lead to misdiagnosis." In military applications, adversarial attacks to BCIs may generate false commands, potentially causing friendly fire [15]. Consequently, it is very important to develop BCI machine learning models that are robust against adversarial attacks.
Many adversarial defense approaches have been proposed in the literature [16], [17], [18], among which robust training [19] may be the most classical and effective strategy. Adversarial training (AT) [16] is a representative robust training approach, and many other approaches [17], [18] can be regarded as its variants. AT solves a minimax (saddle point) problem: during training, it repeatedly generates adversarial examples by perturbing the inputs along gradient directions that increase the model's loss, and then minimizes the model's loss on these adversarial examples [16]. This process minimizes the model's loss on adversarial examples, but does not explicitly optimize the performance on benign examples. Many studies [16], [20], [21], [22] have shown that robust training may significantly decrease the accuracy on benign samples, which is undesirable.
Few studies have explored the possibility of improving machine learning performance using adversarial examples. For image classification, Xie et al. [23] employed a separate auxiliary batch normalization for adversarial examples to prevent model overfitting. For EEG classification, Ni et al. [24] used a loss on adversarial examples to improve the cross-subject and cross-state transfer learning performance. However, Li et al. [21] and Meng et al. [22] have shown that conventional robust training approaches usually lead to an evident reduction in BCI model accuracy on benign samples, i.e., it is difficult to achieve both high accuracy and good robustness through robust training.
Robust models aim to maintain good classification performance under adversarial attacks, which is important in safety-critical applications. However, the accuracy degradation of robust models on benign samples seriously hinders their adoption. To mitigate this problem, we propose alignment-based adversarial training (ABAT), which aligns the EEG data of each session before performing robust training on them. This simple approach can be readily used in deep model training. After ABAT, the model's classification accuracy on benign samples and its robustness on adversarial samples can be simultaneously improved. Experiments on five datasets using two different BCI paradigms, three classifiers and three different experimental settings demonstrated its effectiveness. To our knowledge, this is the first work to simultaneously improve the accuracy and robustness of classifiers in EEG-based BCIs, and also the first time that EEG data alignment has been used in BCI adversarial defense. We hope that our findings can inspire more future research on robust EEG classifiers.
The remainder of this paper is organized as follows: Section II introduces related work. Section III proposes ABAT. Section IV describes the experimental settings. Section V presents the experimental results. Finally, Section VI draws conclusions.

II. RELATED WORK
This section introduces background knowledge on EEG data alignment, adversarial attacks, and AT.

A. Euclidean Alignment
EEG data from different subjects/sessions can be regarded as data from different domains. Due to inter-subject/-session variations, the marginal probability distributions of EEG trials from different subjects/sessions are usually (significantly) different [25]. Consequently, it is important to perform EEG data alignment to reduce the domain discrepancy.
Various EEG data alignment approaches have been proposed, which are reviewed and compared in [26]. Zanini et al. [27] introduced Riemannian alignment to align the covariance matrices of EEG trials from different subjects in the Riemannian space. He and Wu [28] extended Riemannian alignment to Euclidean alignment (EA), which aligns the raw EEG trials in the Euclidean space. EA is efficient and completely unsupervised, demonstrating promising performance in different BCI paradigms [29].
For $N$ EEG trials $\{X_n\}_{n=1}^N$ in a particular domain, EA first computes the Euclidean arithmetic mean $\bar{R}$ of all $N$ spatial covariance matrices:
$$\bar{R} = \frac{1}{N}\sum_{n=1}^{N} X_n X_n^\top, \qquad (1)$$
and then performs the alignment by
$$\tilde{X}_n = \bar{R}^{-1/2} X_n, \quad n = 1, \ldots, N. \qquad (2)$$
After EA, the aligned EEG trials $\{\tilde{X}_n\}_{n=1}^N$ in each domain are whitened, i.e., their average spatial covariance matrix becomes the identity matrix. Thus, the EEG data distributions from different domains become more consistent.
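As a concrete illustration, here is a minimal NumPy sketch of EA following (1) and (2); the function name and the use of SciPy's `fractional_matrix_power` are our own choices for illustration, not taken from the authors' released code.

```python
import numpy as np
from scipy.linalg import fractional_matrix_power

def euclidean_alignment(X):
    """Align the EEG trials of one domain.

    X: array of shape (N, C, T) -- N trials, C channels, T time points.
    Returns the aligned trials, with the same shape.
    """
    # (1) Arithmetic mean of the N spatial covariance matrices.
    R_bar = np.mean([x @ x.T for x in X], axis=0)      # (C, C)
    # (2) Whiten every trial with the inverse matrix square root.
    R_inv_sqrt = fractional_matrix_power(R_bar, -0.5)  # (C, C)
    return np.stack([R_inv_sqrt @ x for x in X])
```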

B. Incremental EA
In online applications, target domain EEG trials $X^t$ arrive one by one on-the-fly, so there is a need to perform incremental EA on them.
Incremental EA applies EA to online EEG classification [30]. Let $\bar{R}^t_n$ be the average spatial covariance matrix computed from the first $n$ target domain EEG trials. When the $(n+1)$-th target domain EEG trial $X^t_{n+1}$ arrives, we first update
$$\bar{R}^t_{n+1} = \frac{n}{n+1}\bar{R}^t_n + \frac{1}{n+1} X^t_{n+1} \left(X^t_{n+1}\right)^\top, \qquad (3)$$
and then perform EA on $X^t_{n+1}$ using
$$\tilde{X}^t_{n+1} = \left(\bar{R}^t_{n+1}\right)^{-1/2} X^t_{n+1}. \qquad (4)$$
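A minimal sketch of incremental EA, assuming trials arrive one at a time; the class and variable names are illustrative, not the authors' code.

```python
import numpy as np
from scipy.linalg import fractional_matrix_power

class IncrementalEA:
    """Maintains the running mean covariance of target domain trials."""

    def __init__(self, n_channels):
        self.n = 0
        self.R = np.zeros((n_channels, n_channels))

    def align(self, x):
        """Align one incoming trial x of shape (C, T)."""
        # (3) Update the running mean of the spatial covariance matrices.
        self.R = (self.n * self.R + x @ x.T) / (self.n + 1)
        self.n += 1
        # (4) Whiten the new trial with the updated mean.
        return fractional_matrix_power(self.R, -0.5) @ x
```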

C. Adversarial Attack
Adversarial examples should closely resemble benign ones, which is achieved by constraining the perturbation magnitude. Let $D$ be a distance metric (a common choice is the $\ell_p$ norm), and $X^{adv}$ an adversarial example satisfying $D(X^{adv}, X) < \epsilon$, where $\epsilon$ regulates the magnitude of the perturbation.
We consider the following three representative adversarial attack approaches:
1) Fast gradient sign method (FGSM) [5], a straightforward yet highly effective attack. It constructs an adversarial example through a single-step gradient computation:
$$X^{adv} = X + \epsilon \cdot \mathrm{sign}\left(\nabla_X L(C_\theta(X), y)\right), \qquad (5)$$
where $C_\theta$ is a classifier with parameters $\theta$, and $L$ its loss function. FGSM perturbs the input along the gradient direction, increasing the classifier's loss on the true label and leading to misclassification.
2) Projected gradient descent (PGD) [16], an iterative extension of FGSM. It starts from a perturbed version of the benign example $X$:
$$X^{adv}_0 = X + \xi, \qquad (6)$$
where $\xi$ is uniform random noise sampled from $(-\epsilon, \epsilon)$. The iterative step is
$$X^{adv}_{i+1} = \mathrm{Proj}_{X,\epsilon}\left(X^{adv}_i + \alpha \cdot \mathrm{sign}\left(\nabla_X L(C_\theta(X^{adv}_i), y)\right)\right), \qquad (7)$$
where $\alpha \le \epsilon$ is the step size, and $\mathrm{Proj}_{X,\epsilon}$ projects $X^{adv}_{i+1}$ back into the $\epsilon$-neighborhood of $X$ under the $\ell_\infty$ norm.
3) AutoAttack [31], which combines four distinct attack strategies: two budget-aware, step-size-free PGD variants with different losses (cross-entropy loss, and difference of logits ratio loss), square attack [32], and fast adaptive boundary attack [33], each serving a unique purpose. AutoAttack is parameter-free, and has demonstrated superior performance in defeating various defense approaches [31].
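The following PyTorch sketches implement FGSM (5) and PGD (6)-(7) under the $\ell_\infty$ constraint; the cross-entropy loss and function signatures are assumptions for illustration, not the authors' exact implementation.

```python
import torch
import torch.nn.functional as F

def fgsm(model, X, y, eps):
    """Single-step attack (5): perturb along the sign of the input gradient."""
    X = X.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(X), y)
    grad = torch.autograd.grad(loss, X)[0]
    return (X + eps * grad.sign()).detach()

def pgd(model, X, y, eps, alpha, n_iter):
    """Iterative attack: random start (6), then n_iter projected steps (7)."""
    X_adv = X + torch.empty_like(X).uniform_(-eps, eps)
    for _ in range(n_iter):
        X_adv = X_adv.clone().detach().requires_grad_(True)
        loss = F.cross_entropy(model(X_adv), y)
        grad = torch.autograd.grad(loss, X_adv)[0]
        # Gradient-sign step, then project back into the eps-ball around X.
        X_adv = X_adv + alpha * grad.sign()
        X_adv = X + torch.clamp(X_adv - X, -eps, eps)
    return X_adv.detach()
```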

D. AT
AT [16] is a classical robust training approach for enhancing the robustness of machine learning models against adversarial attacks, i.e., it improves the model accuracy on adversarial examples by adding them to the training data. It can be expressed as a min-max (saddle point) optimization problem:
$$\min_\theta \, \mathbb{E}_{(X,y)\sim\mathcal{D}} \left[ \max_{X^{adv} \in B(X,\epsilon)} L\left(C_\theta(X^{adv}), y\right) \right], \qquad (8)$$
where $\mathcal{D}$ is the data distribution, and $B(X, \epsilon)$ the $\ell_\infty$ ball of radius $\epsilon$ centered at $X$. $X^{adv}$ can be adversarial examples generated by PGD [16] (AT-PGD), FGSM [34] (AT-FGSM), AutoAttack [31], etc. Whereas AT can significantly enhance a model's robustness to adversarial examples, it often comes at the cost of reduced accuracy on benign examples [22].
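A sketch of one AT-PGD training epoch, built on the `pgd()` helper sketched above: the inner maximization of (8) is approximated by PGD, and the outer minimization by a standard gradient step. The optimizer and data loader are assumed to be set up elsewhere.

```python
def adversarial_training_epoch(model, loader, optimizer, eps, alpha, n_iter):
    model.train()
    for X, y in loader:
        # Inner maximization of (8): craft adversarial examples on the fly.
        X_adv = pgd(model, X, y, eps, alpha, n_iter)
        # Outer minimization of (8): update the model on the adversarial batch.
        optimizer.zero_grad()
        loss = F.cross_entropy(model(X_adv), y)
        loss.backward()
        optimizer.step()
```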

III. ALIGNMENT-BASED ADVERSARIAL TRAINING (ABAT)
AT learns model parameters $\theta$ that minimize the loss $L$ on strong adversarial examples $X^{adv}$ of the training samples, i.e., it increases the model's robustness. AT is one of the most effective adversarial defense approaches. However, it often comes at the cost of reduced accuracy on benign examples. This paper studies whether AT can be used to improve the model's robustness and accuracy simultaneously.
Ni et al. [24] were the first to include the loss on adversarial examples in the overall training loss for EEG classification, to improve the transfer learning classification accuracy. However, they did not consider adversarial robustness. Li et al. [21] and Meng et al. [22] showed that robust training, one of the most popular adversarial defense approaches, frequently degrades the accuracy of BCI models. This may be due to the lack of EEG data alignment to reduce the data discrepancy among different subjects or sessions. Multiple EEG data alignment approaches have been proposed, e.g., Riemannian alignment [27], EA [28] and label alignment [35]. They greatly improve the classification accuracy in traditional transfer learning scenarios [26]. However, EEG data alignment has not been used in BCI adversarial defense.
We propose the very simple yet effective ABAT to fill this gap, which performs EEG data alignment before AT. Data alignment aligns EEG trials from different domains to reduce their distribution discrepancies, and AT further robustifies the classification boundary, as illustrated in Fig. 3. EA is used in this paper for its simplicity and effectiveness.
Algorithm 1 gives the pseudo-code of ABAT. It first aligns the EEG data of each domain using EA, and then performs AT. The complete BCI flowchart with ABAT consists of data acquisition, data preprocessing (EEG epoching and filtering), data alignment, AT, and model evaluation (in terms of accuracy and robustness), as shown in Fig. 4. Particularly, after preprocessing the EEG data, ABAT is used to train the EEG classifier, which is then used in subsequent classification and robustness evaluation.
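Putting the pieces together, here is a hedged end-to-end sketch of Algorithm 1, reusing the `euclidean_alignment()`, `pgd()` and `adversarial_training_epoch()` helpers sketched earlier; `make_loader` is a hypothetical batching helper, and the details may differ from the released code.

```python
import numpy as np

def abat(model, domains, labels, optimizer, eps, alpha, n_iter, n_epochs):
    """domains: list of per-subject/per-session trial arrays of shape (N_s, C, T)."""
    # Step 1: data alignment, per domain, by (1) and (2).
    aligned = [euclidean_alignment(Xs) for Xs in domains]
    X = np.concatenate(aligned)
    y = np.concatenate(labels)
    loader = make_loader(X, y)  # hypothetical helper that batches (X, y)
    # Step 2: adversarial training on the aligned trials.
    for _ in range(n_epochs):
        adversarial_training_epoch(model, loader, optimizer, eps, alpha, n_iter)
    return model
```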

IV. EXPERIMENT SETTINGS
This section introduces our experiment settings, including the datasets, models, performance evaluation metrics, testing scenarios, and hyper-parameters. Our source code, including data preprocessing, can be found at https://github.com/xqchen914/ABAT.

A. Datasets
The following five datasets, summarized in Table I, were used:
1) Four-class motor imagery dataset (MI4) [36]: This is Dataset 2a in BCI Competition IV. It was collected from 9 subjects in two sessions on different days. There are four classes, i.e., left hand, right hand, feet, and tongue. The 22-channel EEG signals were sampled at 250 Hz. We extracted the data in [0, 4] seconds after each imagination prompt and band-pass filtered the trials at [8, 32] Hz. Each subject had 144 EEG epochs per class.
2) Six-class motor imagery dataset (MI6) [37]: It was collected from 10 subjects for seven-class classification, i.e., left hand, right hand, feet, both hands, left hand combined with right foot, right hand combined with left foot, and rest state. The data collection was divided into 9 sections, with 5 to 10 minute breaks between sections. We only used data from the first six classes. The 64-channel EEG signals were sampled at 200 Hz. We extracted the data in [0, 4] seconds after each imagination prompt and band-pass filtered the trials at [4, 32] Hz. Each subject had 80 EEG epochs per class.
3) Two-class motor imagery dataset (MI2) [38]: It includes EEG data from 60 users performing left hand and right hand motor imagery tasks for 6 runs. The 27-channel EEG signals were recorded at 512 Hz. After [4, 32] Hz band-pass filtering, we downsampled the data to 128 Hz and extracted the data within [0, 4] seconds of each imagination prompt. Each subject had 120 EEG epochs per class.
4) P300 evoked potentials dataset (P300) [39]: It was collected from four disabled subjects and four healthy ones in four sessions for two-class classification (target and non-target). The EEG data were recorded from 32 channels at 2048 Hz. We re-referenced the data, discarded the mastoid channels, filtered the data using a [1, 12] Hz band-pass filter, and down-sampled the data to 128 Hz. EEG epochs between [0, 1] second were extracted. Each subject had 3,300 epochs, among which 557 were target.
5) Event related potential dataset (ERP) [40]: It was collected from 10 subjects in three sessions for two-class classification (target and non-target). The EEG signals were recorded using 16 electrodes at 250 Hz. We used the MOABB API to get the preprocessed EEG data. EEG epochs between [0, 0.8] second were extracted. Each subject had 1,728 EEG epochs, among which about 288 were target.

B. Evaluation Metrics
We used balanced classification accuracy (BCA) to evaluate the classification performance. The frequently used raw classification accuracy (RCA) is the ratio of the number of correctly classified examples to the total number of examples.
The BCA is the average of the per-class RCAs. BCA was preferred in our experiments because the ERP and P300 datasets have significant intrinsic class imbalance, so RCA can be misleading. When all classes have the same number of samples (e.g., in MI4 and MI6), BCA reduces to RCA.
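For reference, BCA can be computed in a few lines (equivalent to scikit-learn's `balanced_accuracy_score`); this sketch is ours, not taken from the released code:

```python
import numpy as np

def balanced_accuracy(y_true, y_pred):
    """Average of the per-class raw accuracies (recalls)."""
    classes = np.unique(y_true)
    per_class_rca = [np.mean(y_pred[y_true == c] == c) for c in classes]
    return float(np.mean(per_class_rca))
```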
We repeated each experiment three times, and their averages were reported.

D. Testing Scenarios
We tested the performance of different models under adversarial attacks, i.e., their robustness, in the offline scenario, where adversarial attacks are most effective. We also tested their classification accuracies on benign samples in both offline and online scenarios. Algorithm 2 gives the pseudo-code of the corresponding online test, offline test, and offline robustness test procedures.
On MI4 and ERP, we used the first session as the training set and the remaining ones as the test set.On P300, we used the first two sessions as the training set.On MI6 and MI2, we used the first two blocks of data as the training set.

E. Hyper-Parameters
When training within-subject models, batch size 32 was used on the MI4, MI6 and ERP datasets, and batch size 128 on the P300 dataset. When training cross-subject models, batch size 128 was used on all datasets. All models were trained for 100 epochs with an initial learning rate of 0.01, which was reduced to 0.001 after 50 epochs.
Since the perturbation magnitude should be commensurate with the original EEG signal magnitude, we set the perturbation magnitude to $\epsilon$ times the EEG signal standard deviation. In adversarial attacks, we used 20 iterations for PGD and AutoAttack, with attack step size $\epsilon/10$ for PGD. In AT and ABAT, we used 10 iterations for PGD, with attack step size $\alpha = \epsilon/5$. Deep learning models perform very differently on adversarial examples with different perturbation amplitudes [22]. For a more comprehensive assessment, we evaluated models trained with different AT perturbation amplitudes on adversarial examples with different perturbation amplitudes.
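The std-scaled budget described above can be computed as follows; the function and variable names are illustrative.

```python
import numpy as np

def perturbation_budget(X_train, eps_rel):
    """X_train: (N, C, T) training trials; eps_rel: relative magnitude, e.g., 0.01-0.05."""
    eps = eps_rel * float(np.std(X_train))  # absolute perturbation budget
    alpha_abat = eps / 5     # PGD step size used in AT/ABAT (10 iterations)
    alpha_attack = eps / 10  # PGD step size used when attacking (20 iterations)
    return eps, alpha_abat, alpha_attack
```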

V. EXPERIMENTAL RESULTS
This section presents experimental results to verify the effectiveness of our proposed ABAT.

A. Offline Cross-Block/-Session Performance on Benign Samples
Offline cross-block/-session BCAs of EEGNet, DeepCNN and ShallowCNN under benign training (BT), AT-FGSM and AT-PGD, and ABAT-FGSM and ABAT-PGD with different training perturbation magnitudes $\epsilon$, on the benign samples of the five datasets, are shown in the 'No Attack' column of Tables II-VI, respectively. Observe that: 1) Without EA, AT-FGSM and AT-PGD had similar or lower BCAs on benign samples than BT. 2) With EA, as the ABAT-FGSM or ABAT-PGD perturbation amplitude increased from 0.01 to 0.05, the BCAs on benign samples either first increased and then decreased, or kept increasing. 3) The overall performance of ABAT-FGSM and ABAT-PGD was similar across the three classifiers and five datasets; however, ABAT-FGSM is much faster than ABAT-PGD.

C. Online Cross-Session Performance on Benign Samples
We simulated the online cross-session EEG classification scenario and performed incremental EA for each incoming EEG trial in the target session on the MI4 and P300 datasets. The classifiers were the same as those in Subsection V-A. The results are shown in Fig. 5. When EA was not used, the results in the online scenario were the same as those in the offline scenario.
Using incremental EA in online scenarios improved model performance, and ABAT further improved the BCAs.

D. ABAT Using Pre-Trained Models
In real-world applications, we could pre-train a classifier on other subjects' EEG data, and then fine-tune it with the target subject's data. The results on MI4 and MI6 are shown in Tables VII-VIII, respectively.
Compared with the BCAs without pre-training in Tables II-III, when EA was used, pre-training achieved higher BCAs on benign samples of the target user; however, the BCAs decreased on adversarial examples with larger perturbations. Nevertheless, ABAT further improved the BCAs on both benign samples and adversarial examples.

E. Discussions
To explore the influence of training data size on the classification performance in offline within-subject classification, we trained the classifiers using different numbers of blocks of data on MI6, and computed their BCAs under different AutoAttack amplitudes. We used EA in BT, and ABAT used PGD with $\epsilon = 0.01$. The results are shown in Figs. 6(a)-6(c). A perturbation amplitude of 0 corresponds to benign samples.
It can be observed that increasing the training data size steadily improved the classification performance on benign samples, which is intuitive. However, the classifiers were still vulnerable to adversarial attacks. ABAT improved both the classification performance on benign samples for different training data sizes, and the robustness of the classifiers. Subsection V-A pointed out that the optimal ABAT perturbation amplitude for a classifier to achieve its highest BCA on benign samples may be positively correlated with its capacity. This subsection performs further investigations.
We tested the BCAs of ShallowCNN with {5, 10, 40, 80, 100} convolution kernels on MI4. The results are shown in Fig. 7. Generally, as the number of model parameters increased, the optimal ABAT amplitude for benign samples also increased.

VI. CONCLUSION AND FUTURE RESEARCH
This paper has proposed a simple yet effective ABAT approach to perform AT on aligned EEG data, in order to make the trained model simultaneously more accurate and more robust. Experiments on five EEG datasets from two different BCI paradigms (MI and ERP), three CNN classifiers (EEGNet, ShallowCNN and DeepCNN) and three different experimental settings (offline within-subject cross-block/-session classification, online cross-session classification, and pre-trained classifiers) demonstrated its effectiveness. It is very intriguing that adversarial attacks, which are usually used to damage BCI systems, can be used in adversarial training to simultaneously improve the model accuracy and robustness.
Our future research will: 1) Study how to perform ABAT for traditional EEG classifiers. This paper proposed ABAT for deep neural network EEG classifiers, but there are also many promising traditional classifiers [43], [44], [45], [46], and it would be useful to adapt ABAT to them. 2) Study how ABAT can be used in cross-subject applications, to increase the accuracy and robustness simultaneously. EEG data exhibit large individual differences [29], so this problem is very challenging. 3) Study how data preprocessing/denoising approaches, e.g., multiscale principal component analysis [43], [45], [47], [48], can be integrated with ABAT for even better performance.

Fig. 2 .
Fig. 2. The attack framework proposed in [4], which injects a jamming module between signal processing and machine learning to generate adversarial examples.

Fig. 3 .
Fig. 3. The influence of ABAT on EEG data from different domains. Data alignment aligns EEG trials from different domains to reduce their distribution discrepancies, and AT further robustifies the classification boundary.

Fig. 4 .
Fig. 4. The complete BCI flowchart, incorporating ABAT. After preprocessing the EEG data using epoching and filtering, ABAT trains the classifier, which is then used in subsequent classification and robustness evaluation. ABAT aligns EEG data centers across different domains and robustifies the classifier's decision boundary through adversarial training.

Algorithm 1: ABAT.
Input: $\mathcal{S} = \{\mathcal{D}_s\}_{s=1}^S$, labeled data from $S$ source domains; $M$, the number of model training epochs.
Output: $C_\theta$, the trained EEG classifier.
Randomly initialize the classifier $C_\theta$, or pre-train $C_\theta$ on other available data;
// Data alignment
for $s = 1 : S$ do
    Perform EA on $\mathcal{D}_s$ by (1) and (2);
end
// Adversarial training
for $m = 1 : M$ do
    Generate adversarial examples $X^{adv}$ by (5), or (6) and (7), or AutoAttack [31];
    Update $\theta$ by minimizing the loss on $X^{adv}$;
end
Return $C_\theta$.

TABLE I. SUMMARY OF THE FIVE DATASETS

TABLE II. BCAS OF DIFFERENT TRAINING APPROACHES UNDER BENIGN SAMPLES AND VARIOUS ATTACKS ON MI4

TABLE III. BCAS OF DIFFERENT TRAINING APPROACHES UNDER BENIGN SAMPLES AND VARIOUS ATTACKS ON MI6

TABLE IV. BCAS OF DIFFERENT TRAINING APPROACHES UNDER BENIGN SAMPLES AND VARIOUS ATTACKS ON MI2

TABLE V. BCAS OF DIFFERENT TRAINING APPROACHES UNDER BENIGN SAMPLES AND VARIOUS ATTACKS ON P300

TABLE VI. BCAS OF DIFFERENT TRAINING APPROACHES UNDER BENIGN SAMPLES AND VARIOUS ATTACKS ON ERP

TABLE VII. BCAS OF DIFFERENT TRAINING APPROACHES ON PRE-TRAINED MODELS UNDER VARIOUS ATTACKS ON MI4

TABLE VIII. BCAS OF DIFFERENT TRAINING APPROACHES ON PRE-TRAINED MODELS UNDER VARIOUS ATTACKS ON MI6