Improving Motor Imagery of Gait on a Brain–Computer Interface by Means of Virtual Reality: A Case of Study

Motor imagery (MI) is one of the most common paradigms used in brain-computer interfaces (BCIs). This mental process is defined as the imagination of movement without any motion. In some lower-limb exoskeletons controlled by BCIs, users have to perform MI continuously in order to move the exoskeleton. This makes it difficult to design a closed-loop control BCI, as it cannot be assured that the analyzed activity is related to imagery rather than to motion. A possible solution is the use of virtual reality (VR). During a VR training phase, subjects can focus on MI while avoiding any distraction. This could help the subject to create a robust model of the BCI classifier that would be used later to control the exoskeleton. This paper analyzes whether gait MI can be improved when VR feedback is provided to subjects instead of visual feedback on a screen. Additionally, both types of visual feedback are analyzed while subjects are seated or standing up. In the analysis, visual feedback by VR was related to higher performance in the majority of cases, whereas the differences between standing and being seated were not relevant. The paper also presents a case study of the closed-loop control of the BCI in a virtual reality environment. Subjects had to perform gait MI or remain in a relaxation state and, based on the output of the BCI, the immersive first-person view remained static or started to move. Experiments showed an accuracy of issued commands of 91.0 ± 6.7, a very satisfactory result.


I. INTRODUCTION
Motor Imagery (MI) is defined as the mental process of imagining a motor act without actually executing any movement [1]. It is one of the most commonly used control paradigms in brain-computer interfaces (BCIs), as motor imagery produces brain patterns similar to the ones associated with the execution of the movement [2]-[4].
The cognitive involvement of a patient can improve rehabilitation processes thanks to neuroplasticity [5]. This has been demonstrated in clinical studies [6]. Therefore, this paradigm can be used not only for the control of mechatronic devices, but also as an active part of rehabilitation therapies.
The associate editor coordinating the review of this manuscript and approving it for publication was Gang Wang.
However, MI performance is affected by several conditions. First, it requires the subject to be highly focused during the training of the BCI in order to adjust the classifier. Any distraction of the subject can easily spoil the data, affecting the quality of the classifier model. Therefore, tight control of the experimental conditions is needed, avoiding any external noise or motion. On the other hand, when MI is applied to event-related desynchronization/synchronization (ERD/ERS) [7], it is important to assure that the MI epochs considered do not contain any data after the actual start of motion. Otherwise, it is difficult to state that the ERD/ERS detection is caused by imagery instead of actual motion activity or artifacts. Indeed, accuracy drops when only epochs before the motion are considered for the classifier creation, as shown in the literature [8], [9]. However, in a restorative therapy it is important to use MI as a continuous mental act instead of an event-related act (motion intention) to favor the neuroplasticity mechanisms [5]. If MI is used in a brain-machine interface in combination with a mechanical device, such as an exoskeleton or orthosis, the patient receives the feedback of the BCI classification as the motion of the device in a closed-loop control. This means that the MI act is actually performed during motion, which cannot be considered a proper MI act by definition. This makes it really hard to develop a closed-loop control BCI based on maintained MI. A way to avoid motion artifacts could be to focus on the gamma band (>30 Hz) [10] instead of the sensorimotor bands (alpha and low beta bands), or to apply state-of-the-art real-time motion artifact removal techniques [11]. However, this cannot assure that the analyzed activity is related to imagery rather than to motion.
VOLUME 9, 2021. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
Specifically, in the case of BCIs for lower-limb closed-loop control of exoskeletons, there are not many investigations in the literature, and they usually do not apply a maintained MI paradigm, but rather ERD/ERS as in [12], [13], Motion Related Cortical Potentials [14] as in [15], Steady State Visual Potentials as in [16]-[18], and combinations of them [19], [20]. An effective way to adjust the model of a BCI based on MI for its application in closed-loop control could be virtual reality (VR). In this case, the environment can easily isolate the subject from any external perturbation. In addition, motion feedback can be provided through the VR environment in an immersive way without executing any movement. Some works have explored the combination of BCI with VR or virtual screen feedback. For instance, in [21] a virtual avatar was applied in combination with a treadmill for closed-loop control of a BCI. However, the feedback included motion and the avatar was shown to the subject through a screen interface, which cannot be considered a proper immersive VR environment. In addition, the majority of studies that have employed VR are focused on hand MI [22]-[24] or a combination of hand and foot MI [25]. Reference [26] presents a study with VR to promote foot motor imagery, but in an open-loop approach.
Filter Bank Common Spatial Patterns (FBCSP) is a decoding algorithm based on spectral and spatial features [27]. Whereas many studies have tested this algorithm with offline competition datasets to distinguish among different motor imagery tasks [28], [29], only a few have applied this methodology to the closed-loop control of an external device [30]. Additionally, there are different variants of FBCSP in the literature that have reported higher accuracies in offline competition datasets [31], [32]. However, they have not been tested in an online approach applied to the whole EEG signal trial, but only to selected fragments. In addition, these FBCSP variants need to estimate hyperparameters for the selection of optimal features, which is time consuming and would affect the experimental length of online trials. Consequently, this hinders their application in a real experiment with patients, in which the duration of the session is crucial, as otherwise subjects could suffer from fatigue.
The current research explores the use of VR as a means to improve the execution of the MI task. This could help the subjects during the training phase of the classifier. This phase is performed in open-loop control in order to create a classifier model able to separate the classes. Usually, the classes to consider are rest vs. MI of walking, and it is critical for the subject to be focused on the mental tasks, avoiding any external distraction, in order to create a robust model to be used during the closed-loop test trials.
The paper is organized in two experiments. In both, FBCSP was employed for pattern decoding. The first experiment explored the use of an immersive VR environment in comparison to a screen interface. In order to take into account balance issues during MI, both interfaces were tested while the subject was seated and standing up. This first step of the research used the accuracy of the proposed BCI as the index to compare the performance of the different interfaces. The second part of the research presents a case study of the closed-loop control of the VR environment by means of the BCI.

A. PARTICIPANTS
Five subjects participated in the study (mean age 29.8 ± 6.46). They were informed about the experiments and signed an informed consent in accordance with the Declaration of Helsinki. They did not report any known disease and had no movement impairment. All subjects had some experience with BCI, but not with the same experimental setup or VR. All procedures were approved by the Responsible Research Office of Miguel Hernández University of Elche.
The VR hardware and software used consisted of a VIVE HTC headset (HTC, Taiwan) (2160 × 1200 resolution, 1080 × 1200 per eye, 90 Hz refresh rate) that participants wore, two base stations that tracked the exact location of the headset and Steam software (Valve, United States). In experiment (b), subjects first performed trials in which the visual feedback was predefined and these trials were employed to train the BCI classifier. Afterwards, subjects performed closed-loop trials in which the visual feedback changed based on the output of the BCI classifier previously trained. First, data was recorded. Then, it was pre-processed with different frequency filters and common spatial patterns were extracted from each frequency band. Finally, the algorithm performed a classification in two events: MI or relax.

C. EXPERIMENTAL DESIGN
Two different experiments were conducted in which users had to perform MI of gait. The objective was to investigate whether it is possible to differentiate between periods of MI and resting state while subjects receive only visual feedback. In the first experiment, different approaches for the visual feedback were compared and the performances were calculated offline. In the second experiment, the approach that showed the highest performance in experiment 1 was employed for closed-loop online sessions. The schema of the experimental setup can be seen in Fig. 1.

1) EXPERIMENT 1
In this experiment, the BCI performance was compared when users received visual feedback through a screen or a VR environment. The visual environment consisted of a starship corridor. The corridor followed a repetitive pattern to avoid any visual distraction, while allowing an easy perception of speed and motion. During the MI periods, the first-person view moved, recreating the gait action through the VR corridor and creating a realistic motion sensation thanks to some minor balancing animations of the camera. Otherwise, the first-person view was static during periods of resting. An example of the screen view of the corridor can be seen in Fig. 2. Two other conditions were compared: when subjects were physically standing still and when they were seated in a chair. The objective of this comparison was to study whether the feeling of stability, especially when using VR, has an effect on the performance. The sequence of experiments with these four conditions was randomized for each subject. Fig. 3 shows an example of a participant with VR and standing.
Subjects had to perform trials with a series of mental tasks, as can be seen in Fig. 4(a). They had to alternate periods of motor imagery of walking with resting. A visual cue indicated the beginning of each task. In order to avoid visual evoked potentials, the period of 2 seconds after the cue was labelled as a preparation task and was not considered for further analysis. The protocol had an extra class called 'Free', in which subjects could rest and perform free actions such as swallowing or blinking. The protocol of Fig. 4(a) was repeated 6-8 times for each one of the 4 approaches: VR + standing, VR + sitting, screen + standing, screen + sitting. Two subjects performed a session with 8 trials of each procedure and the other one participated in two sessions, but with 6 trials. The reduction to 6 trials for the third subject was made in order to limit the protocol duration and avoid fatigue, based on the feedback of the first two subjects.

2) EXPERIMENT 2
The approach in which subjects were standing still and using VR was employed for closed-loop online sessions. The first part of the session consisted of the open-loop trials used for the classifier training and the second part of the closed-loop trials used to test the performance of the BCI in real time.
The protocol of each kind of trial can be seen in Fig. 4(b). In the first part of the session, each subject performed the 10 open-loop training trials. In addition, the performance of these training trials was calculated offline following a leave-one-out cross-validation. Afterwards, each subject performed 5 closed-loop trials. Whereas in open-loop trials the first-person view feedback was predefined by the protocol, as in experiment 1, in closed-loop trials the motion of the first-person view was based on the output of the BCI.

D. BRAIN COMPUTER INTERFACE
The BCI consisted of four phases: pre-processing, feature extraction, classification and issuing commands.
For both experiments, EEG signals were recorded at a sampling frequency of 500Hz. In experiment 1, data were analysed offline following a pseudo-online approach, whereas all the analysis was done online in experiment 2. From each trial, epochs of 1s with 0.5s of shifting were extracted and processed.
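The epoching described above (1 s windows shifted by 0.5 s at 500 Hz) can be sketched as follows. This is only an illustrative sketch: the function name `extract_epochs` and the `(channels, samples)` array layout are assumptions, not details given by the authors.

```python
import numpy as np


def extract_epochs(trial, fs=500, epoch_s=1.0, shift_s=0.5):
    """Cut a trial of shape (n_channels, n_samples) into overlapping epochs.

    Returns an array of shape (n_epochs, n_channels, epoch_len).
    The trial must contain at least one full epoch.
    """
    epoch_len = int(round(epoch_s * fs))   # 500 samples for 1 s at 500 Hz
    step = int(round(shift_s * fs))        # 250 samples for a 0.5 s shift
    n_samples = trial.shape[1]
    starts = range(0, n_samples - epoch_len + 1, step)
    return np.stack([trial[:, s:s + epoch_len] for s in starts])
```

With these values, consecutive epochs overlap by 50%, so a 4 s trial yields 7 epochs rather than 4.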

1) PRE-PROCESSING
The first pre-processing step was a notch filter at 50Hz to remove the contribution of the power line. It was followed by a high-pass filter at 0.5Hz. For feature extraction, FBCSP algorithm [27] was employed to get spatial features associated with different frequency bands. As the first stage of FBCSP algorithm, a filter bank of band-pass filters was applied to study different frequency bands. Focusing on the MI, 4 band-pass filters were applied to get alpha and beta waves: 5-10Hz, 10-15Hz, 15-20Hz, 20-25Hz. In order to mitigate artifacts caused by the movement of electrodes or wires, they were fixed with clamps and a medical mesh. Additionally, subjects were asked to not blink, swallow or chew during periods of MI and resting state.
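A minimal sketch of this pre-processing chain and filter bank is given below, using SciPy. The cut-off frequencies and band edges come from the text; the filter family (Butterworth), the filter order, the notch quality factor and the zero-phase (forward-backward) filtering are assumptions made for illustration, since the paper does not state them. An online system would need causal filters instead.

```python
import numpy as np
from scipy.signal import butter, filtfilt, iirnotch, sosfiltfilt

FS = 500  # sampling frequency in Hz (from the text)
BANDS = [(5, 10), (10, 15), (15, 20), (20, 25)]  # band edges in Hz (from the text)


def preprocess(epoch, fs=FS):
    """Apply a 50 Hz notch and a 0.5 Hz high-pass along the last axis."""
    b, a = iirnotch(50.0, Q=30.0, fs=fs)          # Q = 30 is an assumed value
    x = filtfilt(b, a, epoch, axis=-1)
    sos = butter(4, 0.5, btype="highpass", fs=fs, output="sos")  # order 4 assumed
    return sosfiltfilt(sos, x, axis=-1)


def filter_bank(epoch, fs=FS, bands=BANDS, order=4):
    """Return an array of shape (n_bands, n_channels, n_samples)."""
    x = preprocess(epoch, fs)
    filtered = []
    for low, high in bands:
        sos = butter(order, [low, high], btype="bandpass", fs=fs, output="sos")
        filtered.append(sosfiltfilt(sos, x, axis=-1))
    return np.stack(filtered)
```

Each band-passed copy of the epoch is then passed separately to the spatial filtering stage.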

2) FEATURE EXTRACTION
Discriminant characteristics associated with each brain task were extracted from each pre-processed window of data for classifier training and testing. The second stage of the FBCSP algorithm was applied to the signals from each filtered frequency band. It designs spatial filters that enhance the differences between two types of EEG patterns in terms of their variances [33]. Thus, given an EEG signal $X$ of dimensions $N \times T$, where $N$ is the number of channels and $T$ is the number of samples, the algorithm estimates a matrix of spatial filters $W$. In this case, the two classes to discriminate are MI ($X_1$) and rest ($X_2$). The normalized covariance matrix for each class is

$$R_1 = \frac{X_1 X_1^{\top}}{\mathrm{trace}(X_1 X_1^{\top})}, \qquad R_2 = \frac{X_2 X_2^{\top}}{\mathrm{trace}(X_2 X_2^{\top})} \tag{1}$$

$\bar{R}_1$ and $\bar{R}_2$ are calculated by averaging over all the signals from each class. The composite spatial covariance matrix is obtained as the sum of these averaged normalized covariance matrices and can be factorized as

$$R = \bar{R}_1 + \bar{R}_2 = U_0 \Lambda U_0^{\top} \tag{2}$$

where $U_0$ and $\Lambda$ are the eigenvectors and the diagonal matrix of eigenvalues, respectively. The whitening transformation

$$P = \Lambda^{-1/2} U_0^{\top} \tag{3}$$

converts the averaged normalized covariance matrices as

$$S_1 = P \bar{R}_1 P^{\top}, \qquad S_2 = P \bar{R}_2 P^{\top} \tag{4}$$

The factorizations of $S_1$ and $S_2$ are computed as

$$S_1 = U \Lambda_1 U^{\top}, \qquad S_2 = U \Lambda_2 U^{\top} \tag{5}$$

They have common eigenvectors, and the sum of both matrices of eigenvalues is the identity matrix ($\Lambda_1 + \Lambda_2 = I$). Therefore, for a given eigenvector, if $S_1$ has the eigenvalue $s_1$, $S_2$ will have $s_2 = 1 - s_1$.

The projection matrix is obtained as

$$W = U^{\top} P \tag{6}$$

The original signal $X$ can be projected into another space of uncorrelated components:

$$Z = W X \tag{7}$$

The columns of $W^{-1}$ are the spatial patterns.

The resulting signal $Z$ has the same dimensions as $X$ ($N \times T$), but its first and last rows are the components whose variances are most suitable for discrimination between the two classes. These components are associated with the largest eigenvalues of $\Lambda_1$ and $\Lambda_2$. Consequently, only the variances of the $m$ first and $m$ last components of $Z$ are considered for feature extraction; these components are denoted $Z_p$.

The variances of $Z_p$ are computed and normalized with the logarithm as

$$f_p = \log\left(\frac{\mathrm{var}(Z_p)}{\sum_{i=1}^{2m} \mathrm{var}(Z_i)}\right) \tag{8}$$

Finally, $f_p$ is the vector of features and has $(f_{bands} \cdot 2 \cdot m) \times T$ dimension. In this case, since there are 4 frequency bands and $m$ was set to 4, the dimension is $32 \times T$.
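The CSP stage described in this subsection can be sketched for a single frequency band as follows. This is a generic textbook-style CSP implementation in NumPy written under stated assumptions (function names, array layout and the use of `numpy.linalg.eigh` are illustrative choices), not the authors' code. In FBCSP, the same procedure would be repeated for each of the 4 bands and the resulting feature vectors concatenated (4 bands × 2 × m = 32 features per epoch for m = 4).

```python
import numpy as np


def normalized_cov(epoch):
    """Normalized spatial covariance of one epoch of shape (n_channels, n_samples)."""
    c = epoch @ epoch.T
    return c / np.trace(c)


def csp_filters(epochs_1, epochs_2, m=4):
    """Estimate 2*m CSP spatial filters from two sets of epochs.

    epochs_k has shape (n_epochs, n_channels, n_samples); requires n_channels >= 2*m.
    Returns an array of shape (2*m, n_channels).
    """
    r1 = np.mean([normalized_cov(e) for e in epochs_1], axis=0)
    r2 = np.mean([normalized_cov(e) for e in epochs_2], axis=0)
    # Eigendecomposition of the composite covariance and whitening matrix.
    evals, u0 = np.linalg.eigh(r1 + r2)
    p = np.diag(evals ** -0.5) @ u0.T
    # Whitened class-1 covariance; class 2 shares the same eigenvectors.
    lam1, u = np.linalg.eigh(p @ r1 @ p.T)
    u = u[:, np.argsort(lam1)[::-1]]       # sort by decreasing class-1 eigenvalue
    w = u.T @ p                            # full projection matrix
    n = w.shape[0]
    keep = np.r_[0:m, n - m:n]             # m first and m last components
    return w[keep]


def log_var_features(w, epoch):
    """Normalized log-variance features of the projected epoch."""
    z = w @ epoch
    v = np.var(z, axis=1)
    return np.log(v / v.sum())
```

The first rows of the returned matrix favor variance of the first class and the last rows favor the second class, which is what makes the log-variance features separable by a linear classifier.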

3) CLASSIFICATION
The classifier employed for both experiments was the Linear Discriminant Analysis (LDA). In experiment 1, trials were evaluated performing cross-validation leave-one-out for each approach separately: VR + standing, VR + sitting, screen + standing, screen + sitting. Once the vector of features, f p , was obtained for each epoch of data for all the trials of each approach, the classifier was trained with all the trials but one, and tested with the unused one. This process was repeated using every trial once as test.
In experiment 2, the classifier was trained with open-loop trials. Afterwards, during closed-loop trials, each epoch of data was classified as MI or resting state in real time.
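The leave-one-trial-out evaluation with LDA can be sketched as below. The two-class LDA here is a plain Fisher discriminant with a pooled covariance and empirical class priors, written in NumPy for self-containment; the paper does not specify which LDA implementation or settings were used, so the helper names and these choices are assumptions.

```python
import numpy as np


def fit_lda(features, labels):
    """Fit a two-class LDA (labels 0 = rest, 1 = MI) with pooled covariance."""
    x0, x1 = features[labels == 0], features[labels == 1]
    mu0, mu1 = x0.mean(axis=0), x1.mean(axis=0)
    cov0 = np.atleast_2d(np.cov(x0, rowvar=False))
    cov1 = np.atleast_2d(np.cov(x1, rowvar=False))
    pooled = (cov0 * (len(x0) - 1) + cov1 * (len(x1) - 1)) / (len(x0) + len(x1) - 2)
    w = np.linalg.solve(pooled, mu1 - mu0)
    b = -0.5 * w @ (mu0 + mu1) + np.log(len(x1) / len(x0))
    return w, b


def predict_lda(model, features):
    """Return 1 (MI) or 0 (rest) for each row of features."""
    w, b = model
    return (features @ w + b > 0).astype(int)


def leave_one_trial_out(features, labels, trial_ids):
    """Per-trial accuracy (%) when each trial is held out once as the test set."""
    accuracies = []
    for trial in np.unique(trial_ids):
        test = trial_ids == trial
        model = fit_lda(features[~test], labels[~test])
        predictions = predict_lda(model, features[test])
        accuracies.append(100.0 * np.mean(predictions == labels[test]))
    return np.array(accuracies)
```

Holding out whole trials, rather than individual epochs, keeps overlapping epochs of the same trial from appearing in both the training and the test set.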

4) OUTPUT COMMANDS
In closed-loop trials of experiment 2, the VR environment was controlled by commands issued by the BCI. The commands were chosen based on the prediction of the classifier and the following rules:
• During the periods of free and preparation tasks, commands cannot be issued. The first two seconds of preparation are not considered, to avoid any evoked potential due to the user interface message shown to the subject at the beginning of the prepare rest or MI tasks.
• The prediction of the classifier was 1 for MI and 0 for resting state. This prediction was averaged every 3 s. If the resulting index was greater than or equal to 0.7, the command issued was to move the environment; if it was lower than 0.7, the command issued was to stop.
• After a command is issued, new commands cannot be sent for 3 s.
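The averaging and thresholding rule above can be sketched as follows. The sketch assumes that one prediction arrives every 0.5 s (the epoch shift), so that a 3 s window holds 6 predictions, and that windows do not overlap, which also enforces the 3 s gap between commands. Both points are interpretations of the text, and `issue_commands` is a hypothetical helper name.

```python
def issue_commands(predictions, epoch_shift_s=0.5, window_s=3.0, threshold=0.7):
    """Turn per-epoch predictions (1 = MI, 0 = rest) into move/stop commands.

    One command is produced per complete, non-overlapping window of window_s seconds.
    """
    per_window = int(round(window_s / epoch_shift_s))  # 6 predictions per 3 s window
    commands = []
    for start in range(0, len(predictions) - per_window + 1, per_window):
        window = predictions[start:start + per_window]
        index = sum(window) / per_window               # fraction of MI predictions
        commands.append("move" if index >= threshold else "stop")
    return commands
```

Under these assumptions, at least 5 of the 6 predictions in a window must be MI for the environment to move, so an isolated misclassification does not change the command.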

E. EVALUATION
As mentioned previously in II-D3, trials of experiment 1 were evaluated using leave-one-out cross-validation. The same method was employed for the open-loop trials of experiment 2. The performance was assessed with the percentage of correctly classified epochs of data during MI and resting state tasks. On the other hand, the performance of closed-loop trials of experiment 2 was evaluated with the following indices:
• Accuracy: percentage of epochs of data correctly classified.
• %Commands: percentage of epochs of data with correct commands.
• Accuracy commands: percentage of correct commands issued.
• True Positive Ratio (TPR): percentage of MI periods (each trial has 3) in which the VR environment is moving.
• False Positives (FP) and False Positives per minute (FP/min): moving commands issued during rest periods.
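Several of these indices can be computed from per-epoch sequences, as in the sketch below. It rests on interpretations that the text does not spell out: a false positive is counted as the onset of a "move" state that starts during a rest epoch, FP/min is normalized by the total rest time, and each epoch represents 0.5 s. "Accuracy commands" is omitted because it requires the list of issued commands rather than per-epoch states. All names are hypothetical.

```python
import numpy as np


def contiguous_runs(mask):
    """Return (start, stop) index pairs of consecutive True runs in a boolean array."""
    padded = np.concatenate(([False], mask, [False]))
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    return list(zip(edges[::2], edges[1::2]))


def closed_loop_metrics(labels, predictions, moving, epoch_shift_s=0.5):
    """labels/predictions/moving: per-epoch 0/1 sequences (1 = MI, or environment moving)."""
    labels = np.asarray(labels)
    predictions = np.asarray(predictions)
    moving = np.asarray(moving)
    accuracy = 100.0 * np.mean(predictions == labels)
    pct_commands = 100.0 * np.mean(moving == labels)
    # TPR: share of MI periods in which the environment moved at least once.
    mi_runs = contiguous_runs(labels == 1)
    tpr = 100.0 * np.mean([moving[a:b].any() for a, b in mi_runs])
    # FP: onsets of motion (0 -> 1 transitions) that occur during rest epochs.
    onsets = np.flatnonzero(np.diff(np.concatenate(([0], moving))) == 1)
    fp = int(np.sum(labels[onsets] == 0))
    rest_minutes = np.sum(labels == 0) * epoch_shift_s / 60.0
    fp_per_min = fp / rest_minutes if rest_minutes > 0 else float("nan")
    return {"accuracy": accuracy, "pct_commands": pct_commands,
            "tpr": tpr, "fp": fp, "fp_per_min": fp_per_min}
```

Because the command state is held between decisions, %Commands can differ from Accuracy even when both are computed on the same epochs.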

III. RESULTS
In this section, the results of the experiments described are presented.
A. EXPERIMENT 1
In experiment 1, subjects performed trials with periods of MI and resting while they were receiving visual feedback with VR or a screen. Additionally, they were standing up or seated. Results are shown in Table 1. A statistical analysis was performed in RStudio to test whether the differences among the procedures were significant. The chosen test was the repeated-measures ANOVA to study the differences between VR/screen and standing/seated. Firstly, the assumptions made by the ANOVA were verified:
• No significant outliers were identified.
• The Shapiro-Wilk test was employed to assess whether the performance indices obtained with each protocol followed a normal distribution. Results showed that all groups followed a normal distribution (p value > 0.05).
• The variances of the differences between protocols must be equal, which is defined as sphericity. This assumption was tested with Mauchly's test and the results confirmed it (p value > 0.05).
The repeated-measures ANOVA was used to study the differences in subject performance based on the visual feedback and the position of the user. The accuracy was defined as the percentage of correctly classified epochs per trial. From the results of these tests, no significant differences were identified between VR and screen, between standing and seated, or for the combination of both (p value > 0.05).
Since subjects responded differently to the protocols, the existence of statistically significant differences among them was tested with one-way ANOVA. Firstly, the normality assumption was tested for the data of each group with the Shapiro-Wilk test. Results revealed that the data of S11 and S13 followed a normal distribution but the data of S12 did not. Therefore, the non-parametric Kruskal-Wallis test was employed with a pairwise Mann-Whitney test. The performance of S12 differed significantly from that of S11 and S13. The average accuracy was 76.89 ± 10.14 for subject S12, 67.94 ± 10.20 for subject S11 and 64.14 ± 7.13 for subject S13.
Finally, the data from each subject was analyzed independently. Two-way ANOVA was applied for S11 and S13 because normality and homoscedasticity assumptions were verified. However, in case of S12 the normality assumption was not fulfilled so the Kruskal-Wallis test was employed. Results from S11 and S13 showed significant differences in terms of the visual feedback.
Although some subjects indicated they experienced fatigue, the average performance of the last procedure was similar. It is difficult to say which procedure was the best in terms of performance due to the significant differences among users. However, VR and standing had the highest accuracy in the majority of subjects.

B. EXPERIMENT 2
Results of open-loop trials are reported in terms of accuracy of correctly classified epochs, as in experiment 1. They can be seen in Table 2. The total average classification accuracy of subject S21 is 82.3 ± 6.0 and the accuracies in MI and relax events are 88.1 ± 13.4 and 76.5 ± 13.4, respectively. Regarding S22, the average classification accuracies are 84.6 ± 6.4 in total, 82.3 ± 8.8 in MI events and 86.9 ± 8.8 in relax events. Table 3 summarizes the performance of closed-loop trials. The average value of Accuracy is 71.7 ± 8.9 for S21 and 77.3 ± 8.6 for S22. This metric is equivalent to %total of open-loop trials. %Commands also provides a measure that is evaluated per epoch, but it is focused on the commands issued. From the results, it can be pointed out that %Commands is always higher than Accuracy. On the other hand, TPR is 100% for all trials, which means that there is an activation in all MI events. Regarding FP/min, S22 had a lower rate of False Positives than S21. A similar pattern of results was obtained in open-loop trials, in which S22 had higher Accuracy in relax events.
The spatial patterns of motor imagery and relax of S21 and S22 can be seen in Fig. 5 and Fig. 6 respectively. Results from S21 show that electrodes CP1 and CP2 seem to have a relevant role in MI of gait in frequencies from 5 to 20Hz. In relax events, electrode Pz is the one highlighted at 15-25Hz. Regarding S22, the patterns are different. In relax events, the most significant electrode is FCz followed by FC1 and FC2 at 15-25Hz. In MI of gait, the distribution of relevant areas is scattered for all frequency bands considered.

IV. DISCUSSION
Results from experiment 1 showed that visual feedback provided by VR was related to higher performance in the majority of cases. However, no statistically significant differences were found. With regard to the position of the user, there were no notable differences between standing and being seated. Intrinsic within-subject differences may affect performance more than other factors, such as procedure. A similar conclusion was reached by [24], although their work was focused on hand MI. They compared the performance of users and their embodiment when they were using VR or watching a screen. Whereas they found a trend for better performance and embodiment in VR, they could not find significant differences.
In experiment 2, the percentage of correctly issued commands (%Commands) is slightly higher than the percentage of correct outputs (Accuracy) and, in some cases, the difference is considerable. This might be due to the fact that maintaining MI or a state of relaxation during long periods can be challenging. Subjects can easily get distracted and eventually center their attention on another task. Therefore, by averaging the output of the classifier, short deviations can be mitigated. These findings are in line with the research shown in [10]. The spatial patterns from subjects S21 and S22 show inter-subject variability with respect to the MI pattern. A similar conclusion was reached by [34]. They found different optimal spectro-spatial characteristics across subjects and sessions, which could be explained by the non-stationary nature of EEG but also by unknown factors. In addition, the FBCSP methodology was employed with a motor imagery competition dataset and the spatial patterns showed dissimilarities across subjects [27]. Inter-subject differences with regard to MI BCI performance, as well as the underlying neural mechanisms, remain unclear. Reference [35] studied which characteristics of the fronto-parietal attention network (FPAN) could play a relevant role in the prediction of the performance of the BCI. They observed that certain FPAN structural and functional features could be associated with higher or lower performance. Furthermore, the presence of certain cognitive states during resting state also appears to be associated with the performance [34].
In line with previous studies, the current work addressed inter-subject variability with subject-specific BCI modeling [35]. The CSP filters are adapted to every subject and session.
Since it was demonstrated that some users find it more difficult to modulate brain rhythms in a volitional way, future research could focus on adaptive training approaches. For example, some subjects could need more training sessions or more trials in open-loop control before starting with closed-loop control. There are some methodologies in the literature that have been shown to predict the BCI performance of a user [36]-[38]. Therefore, these neurological predictors could be employed to identify which individuals would need more training and more assistance from the researchers before starting the experiment.
On the other hand, it is important to highlight that the percentage of correctly classified epochs is lower in closed-loop trials (Accuracy) than in open-loop trials (%total). This seems to depend on the fact that during closed-loop trials subjects know how well they are performing, so they have an additional element that could affect their focus on the mental task. Similar results were reported by [25], in which hand and foot MI were performed, and by [39], where hand MI was studied using an exoskeleton located next to the subjects as feedback.
When comparing the results to [25], the average Accuracy of online closed-loop sessions is slightly superior in our proposed algorithm. However, their experiment is based on the classification between hand and foot MI and our approach is based on the distinction between MI of gait and rest.
Reference [26] presents a study with VR and functional electrical stimulation to enhance foot MI in an open-loop approach. The average accuracy was 78.1 ± 7.6 for VR and 84.8.1 ± 7.0 for the combination of VR with stimulation. These metrics can be compared with %total in open-loop trials from experiment 2. While the performance is slightly higher with the application of functional electrical stimulation, the results achieved with our algorithm are higher than those of non-stimulated VR.
Even though the study of [21] had a different experimental setup including a treadmill, it was based on foot MI, providing visual feedback through a screen. They reported an accuracy of 71% correctly classified epochs in open-loop trials and 70% in closed-loop trials. Compared with these metrics, %total in open-loop trials and %Commands in closed-loop trials show that our approach achieved higher performance.
In [40], a BCI based on three brain tasks (left hand MI, right hand MI and foot MI) is presented, in which users did not get any feedback during training. Although it is not a direct comparison between foot MI and rest, the accuracy obtained during foot MI in the training phase can be compared with %MI from our open-loop trials. They report an average accuracy of 80.5 ± 5.9, which is lower than the results from our approach. Consequently, results suggest that the use of visual feedback can enhance the performance of MI.
The performance of maintained MI or rest state can be challenging for subjects as they can be suddenly distracted from their task. As a consequence, the feedback from the VR environment can be negative, i.e. different from what was expected, and false movements or false stops can happen. Previous studies have assessed the level of attention of the subject during the experiment and based on it, the output commands were adapted so the number of false positives can be reduced [10]. Future research could combine our MI BCI with another one that measures the concentration/attention of subjects.

V. CONCLUSION
In this paper, the use of VR is proposed to provide real-time feedback when subjects are performing MI of gait. Since subjects are static, it can be assured that the BCI algorithm is detecting MI and not the actual brain activity associated with motion. Firstly, it was assessed whether the immersion of the subject in the visual feedback played a relevant role. The visual feedback provided by VR was related to higher performance in most cases. On the other hand, the stability of the subject was also studied, but no differences were found between being seated or standing during the trials. Secondly, two subjects took part in online closed-loop sessions. They first performed some trials to train the classifier and then performed trials in which they received feedback in the VR environment based on the output of the BCI. The average accuracy in open-loop trials was 83.5 ± 6.2. Regarding closed-loop trials, the average accuracy of predictions was 74.5 ± 8.8 and the average accuracy of commands was 91.0 ± 6.7. These real-time closed-loop results improve on those of other methodologies presented in the literature.
Future work could design experiments that start with initial VR sessions. Therefore, subjects could learn to modulate their brain rhythms in an immersive environment that mitigates external distractions. Additionally, as the feedback provided by the VR environment is only visual, it is safer than any motion-related feedback, especially for users with motor disabilities with limited experience in the use of orthosis. Therefore, subjects could start with a BCI that controls a VR environment and when a certain level of performance is achieved, they could try other controlled devices, such as robotic exoskeletons.
Future research will study the introduction of a BCI with preliminary VR sessions before the subject employs a lower-limb exoskeleton in closed-loop control. It will assess the possibility of improving the BCI control success rate, as well as reducing the training time needed to control the exoskeleton successfully.