Time Series Shapelet-Based Movement Intention Detection Toward Asynchronous BCI for Stroke Rehabilitation

Brain-computer interface (BCI) systems for neurorehabilitation have received increasing attention over the past decade. These systems provide an alternative approach to restoring lost motor functions in stroke patients by inducing brain plasticity. To utilize BCI systems for stroke rehabilitation, user movement intentions detected from the electroencephalogram are used as triggers for rehabilitation tools such as electrical stimulators. Reliable movement intention detection plays a vital role in effective rehabilitation because it establishes the relation between movement intentions and the corresponding feedback from a rehabilitation tool. We therefore propose a novel movement intention detection algorithm based on time series shapelets. We collected a dataset of subjects performing self-paced ankle dorsiflexion and tested the algorithm in both classification and pseudo-online detection. The classification results demonstrate that our proposed algorithm significantly outperforms six other algorithms and achieves the highest classification performance, with an F1-score of 0.82. For pseudo-online detection, our algorithm achieves a 69% True Positive Rate (TPR), 8 False Positives per minute (FPs/min), and 44 ms Detection Latency (DL). Our proposed algorithm not only provides a TPR competitive with state-of-the-art algorithms, but also maintains low FPs/min and DL. In addition, the DL of our proposed algorithm is low enough to induce effective brain plasticity for neurorehabilitation. These promising results encourage the development of asynchronous BCI systems based on time series mining techniques to enhance stroke rehabilitation results.

The associate editor coordinating the review of this manuscript and approving it for publication was Senthil Kumar.
Among many interesting BCI applications in medical domains, BCI systems for stroke rehabilitation have been gaining more attention over the past decade, as more than 13 million people suffer from stroke annually [16]. Some patients fully recover, but many are left with a disability. Thus, much BCI research has been devoted to helping these stroke patients by providing neuromodulation or by combining BCI with traditional therapy to boost rehabilitation results. Based on Hebbian theory [17], BCI systems for stroke rehabilitation can provide an alternative therapy to restore lost motor functions in stroke patients by inducing brain plasticity [18], [19], i.e., the brain's ability to modify itself in response to experiences originating within the body or from the surrounding environment. BCI systems for stroke rehabilitation exploit this phenomenon to recover lost motor functions of stroke patients by using an additional tool, such as an Electrical Stimulator (ES), to provide neurofeedback [18]-[22]. In detail, a BCI system reads brain signals as its input. A non-invasive Electroencephalogram (EEG) is usually used as the input because it does not require surgery to place the electrodes. In addition, owing to its high portability, EEG is much more affordable to acquire than other brain signals. After data acquisition, the BCI system processes the input signals and detects the stroke patient's intention to perform the lost motor function. The system then translates the detected intention into control signals that drive an ES to stimulate the patient's muscles, providing neurofeedback that leads to restoration of the lost motor function. It is worth noting that active involvement of the patient plays a crucial role in improving the rehabilitation outcome [22]-[24]. 
By using movement intention from brain signals to drive an ES to stimulate the patients' muscles, we can directly gain an active involvement of the patient.
Recently, asynchronous BCI has been applied to induce brain plasticity after stroke [18], [19]. The results showed improvements in the Fugl-Meyer (FM) score and the Stroke Impact Scale 16 (SIS-16) in chronic stroke patients after BCI intervention. Moreover, there are findings indicating that an asynchronous BCI system can lead to a larger increase in corticospinal excitability immediately after BCI intervention than a synchronous BCI system [33], [34]. However, long-term results have yet to be observed. The main difference between asynchronous and synchronous BCI systems is the way users interact with the system. In asynchronous BCI systems, users can freely decide when to execute the task or imagination without any cues. In synchronous BCI systems, by contrast, users can only execute a task within a specific period of time dictated by the system's cues. Thus, asynchronous BCI systems provide a more comfortable experience for users than synchronous BCI systems.
Although asynchronous BCI systems for stroke rehabilitation can lead to larger increases in corticospinal excitability than synchronous BCI systems, designing an asynchronous BCI system is much more difficult because its signal-to-noise ratio is lower than that of a synchronous BCI system. Moreover, to successfully induce brain plasticity, the ES that is switched by an asynchronous BCI must deliver precisely timed stimulation [18], [21]. Thus, researchers need to dedicate much more effort to developing effective movement intention detection algorithms for asynchronous BCI systems to help stroke patients [26]-[28], [31], [32], [35]-[41]. Though some of these algorithms have been applied in the rehabilitation of stroke patients, patients still avoid using the BCI systems because of their insufficient performance [19], [42]. Moreover, some proposed algorithms are unsuitable for real-world scenarios because their complicated signal pre-processing and filtering techniques make them impractical for online scenarios [39], [40], [43]. There have also been attempts to apply deep learning algorithms to BCI systems [44]-[49]. However, to the best of our knowledge, there is no successful or completed study applying deep learning to self-paced movement intention detection in online scenarios. Thus, more sophisticated algorithms have to be developed to help stroke patients recover and improve their quality of life.
Recently, in our preliminary study, we showed that shapelets, a shape-based technique from the time series mining domain, yield superior results to traditional methods for movement intention detection from brain signals [50]. In this work, we propose, for the first time, a novel shapelet-based algorithm that detects self-paced movement intention for ankle dorsiflexion by utilizing both ERD/ERS and MRCP signals in an asynchronous BCI system. In addition, we compare our proposed algorithm to six other algorithms and analyze its parameter sensitivity. It is worth noting that shapelets have been proven to provide superior results in many time series problems [51], even though they were originally proposed to tackle problems with stationary time series data. Applying shapelets to non-stationary data such as EEG therefore cannot be done straightforwardly. We have devised a new shapelet-based algorithm that is more tolerant of the non-stationary nature of EEG data.
Contributions and impact of this work can be summarized as follows:
1) Our novel movement intention detection algorithm works effectively in an asynchronous BCI system for stroke rehabilitation as an alternative or additional rehabilitation tool to restore a lost motor function of stroke patients.
2) Our innovative shapelet-based algorithm demonstrates, for the first time, that shapelets can be successfully and effectively applied to detect movement intention from non-stationary EEG data.
3) Our algorithm outperforms four other movement intention detection algorithms and two shapelet-based algorithms in both classification and pseudo-online detection.
4) Our work provides promising results of utilizing an unconventional time series mining technique to tackle EEG problems.
5) We also provide a parameter analysis of our proposed algorithm as a guideline for parameter settings.
The rest of this paper is organized as follows. Section II briefly reviews previous work to point out progress and insights. Section III explains our methodology, and Section IV describes the experiment results and discussion. Section V presents the conclusion and future work.

II. BACKGROUND
In this section, we briefly provide progress on BCI for stroke rehabilitation, movement intention detection algorithms, and shapelet-based algorithms before describing our proposed algorithm in the next section.

A. BCI FOR STROKE REHABILITATION
The main objective of Active Motor Training (AMT) in stroke rehabilitation, such as Constraint-Induced Movement Therapy (CIMT) [52], is to recover the lost motor function by forcing the patient to use the affected limb. However, the outcome of AMT depends on the residual motor performance of the stroke patient. Unlike AMT, BCI is applied in rehabilitation without using the normal neuromuscular pathway; thus, the outcome of BCI intervention does not depend on residual motor performance [53]. In a clinical study in 2009, Daly et al. [20] reported improved motor outcome after applying cue-based BCI for motor imagery (MI) and motor execution (ME) with Functional Electrical Stimulation (FES) feedback to one middle-aged female stroke patient. The patient showed an increase in index finger extension. In 2010, Prasad et al. [54] reported the feasibility of physical practice followed by a motor imagery BCI intervention in which neurofeedback was provided via a ball-basket game. The experiment was done on 5 chronic hemiplegic stroke patients. The outcomes measured by the Action Research Arm Test (ARAT) [55] and grip strength showed improvement in all patients. Next, Caria et al. [56] used an EEG- and Magnetoencephalography (MEG)-based BCI intervention combined with physiotherapy to induce brain plasticity. A retired stroke patient, who suffered from severe hand paresis, was trained in BCI sessions to modulate the mu-rhythm in the alpha band during hand motor imagery. Improvements were shown in both clinical motor assessment and neuroimaging. In 2012, Mrachacz-Kersting et al. [21] studied the temporal association between cortical potentials evoked by motor imagery and afferent-induced cortical plasticity in a cue-based scenario. Twenty-four healthy subjects were included in a total of 4 experiments. 
They found that only a precisely timed stimulus delivered by the ES during the peak negativity of the Contingent Negative Variation (CNV), i.e., a cue-based version of MRCP, induces brain plasticity. Although these BCI interventions provided positive outcomes for stroke rehabilitation, they were synchronous, which resulted in patient fatigue. To overcome this problem, Eileen et al. [57] conducted an experiment to investigate the feasibility of detecting reaching movement intention in healthy and stroke subjects. Their results showed that detecting movement intention before the actual self-paced movement is possible. Moreover, they also mentioned the benefits of asynchronous over synchronous BCI systems. In the same year, experimental results on the use of an asynchronous BCI system for rehabilitation were reported by Niazi et al. in [22]. Their proposed algorithm [26] was applied to detect MRCP during motor imagery in real time and to turn on the ES. The experiment was done on 16 healthy subjects with self-paced imagination. Motor Evoked Potentials (MEP) triggered by Transcranial Magnetic Stimulation (TMS) were used to evaluate the effect of the BCI intervention. According to the results, the BCI intervention can improve the MEP triggered by TMS. The detection performance was reported as a True Positive Rate (TPR) of 67.15%. They also claimed that ''the peripheral stimulation combined with patient driven rehabilitative treatment could lead to more prominent behavioral gain than passive rehabilitative treatment'' [22]. Similar experiments and results were reported by Mrachacz-Kersting et al. in [18], but with 22 chronic stroke patients instead of healthy subjects. In addition, the authors reported an improvement in clinical measures in an associative group compared to a non-associative group. However, the detection performance was not reported in this study. In 2017, Ibáñez et al. [19] studied the usability of a low-latency movement intention detector [32], [58] to drive an ES in 4 chronic stroke patients over 8 intervention sessions during a 1-month period. They reported both the performance of their BCI system and the clinical assessment. For the clinical assessment, they reported improvements in the Stroke Impact Scale (SIS) and in kinematic analyses. For the BCI system's performance, the average true positive detection was 79.75% and the average detection latency was 112 ms. Although the clinical assessment improved, the BCI intervention was not clearly accepted by all patients, possibly due to its low detection accuracy. Furthermore, Jochumsen et al. [33] studied the difference in brain plasticity induction between asynchronous and synchronous BCI systems. Measuring the MEP triggered by TMS before, immediately after, and 30 minutes after BCI intervention in 15 healthy subjects, they reported that the MEP immediately after the self-paced BCI intervention was higher than that after the cue-based BCI intervention, whereas at 30 minutes after the intervention the two did not differ significantly. Regardless of these results, the authors still suggested that the immediate changes induced by an asynchronous BCI system may affect stroke rehabilitation results. In a recent study, Lu et al. [14] conducted an observational study of 26 chronic stroke patients to verify the effectiveness of motor imagery-based BCI in the recovery of wrist extension. The results showed some improvement in active range of motion and the modified Barthel Index.
VOLUME 10, 2022
As described above, BCI systems can be used as powerful alternative or complementary tools for stroke rehabilitation. Though clinical assessments have shown that BCI intervention can improve recovery of lost motor function in stroke patients, BCI systems are not widely used because of their insufficient performance.

B. MOVEMENT INTENTION DETECTION ALGORITHMS
In 2011, Niazi et al. [26] proposed an algorithm to detect MRCP in the time domain for asynchronous BCI via a Matched Filter (MF). The TPR was reported at 82.5±7.8%. Due to the simplicity and efficiency of the MF algorithm, it has been widely used as a baseline for newly proposed algorithms [28], [32], [33], [35]. In 2013, Niazi et al. [31] tested their hypothesis about the superiority of using a global MRCP template instead of an individual template for detecting self-paced movement intention via MF. The result showed that a global MRCP template can be used to detect movement intention from self-paced movement with some loss of accuracy. Next, Xu et al. [28] proposed a new algorithm (LPP-LDA) that combined Locality Preserving Projections (LPP) [59] and Linear Discriminant Analysis (LDA) to reduce the dimension of the data and used LDA as a classifier. Their reported TPR exceeded that of MF by roughly 10 percentage points, i.e., 79±11% versus 68±10%, respectively. In 2014, Ibáñez et al. [36] attempted to detect movement onset from the voluntary movement of three stroke patients by utilizing ERD with a Bayesian classifier. In the same year, they also proposed a new algorithm that utilized ERD together with MRCP to detect movement intention via logistic regression [32]. By utilizing features from both domains, the reported TPRs were 74.5±13.8% and 82.2±10.4% for healthy subjects and stroke patients, respectively, demonstrating that their proposed algorithm is superior to algorithms that used features from a single domain [26], [36]. In 2016, Lin et al. [27] proposed Locality Sensitive Discriminant Analysis (LSDA), a variant of LPP, and utilized One-Nearest Neighbor (1-NN) as a classifier, with a TPR of 75.5±12.0%. However, they did not compare it with LPP-LDA, its original counterpart. In 2017, Liu et al. [37] compared MRCP, ERD/ERS, and their combination as features for a Support Vector Machine (SVM). 
Their experimental results showed that utilizing features from both domains can boost accuracy by 10%. This result is consistent with Ibáñez et al. [32]. In 2018, Liu et al. [38] conducted an experiment to compare 3 algorithms, i.e., RF, LPP-LDA, and PCA-LDA, and to investigate the effect of leg side on movement intention detection. The result showed the superiority of RF over the two other algorithms, with TPRs of 82%, 78%, and 58%, respectively. The authors also reported no significant difference between leg sides.
Although many movement intention detection algorithms have been proposed and deployed in real BCI applications, their performance and usability may still be inadequate [19], [42]. Thus, in this work, we explore shapelet-based techniques to develop a novel movement intention detection algorithm that better meets patients' expectations and usability requirements.

C. SHAPELET-BASED ALGORITHMS
In time series classification, a time series is normally classified by measuring the similarity between sequences of known class and sequences of unknown class. An unknown sequence is then assigned the label of the known sequence that is most similar to it. However, the similarity of whole sequences is sometimes unsuitable for describing the similarity between two time series; instead, local shapes, i.e., shapelets, have recently been used. In 2009, the term ''shapelet'' was first introduced to the time series mining domain by Ye and Keogh [60]. According to the original paper, shapelets are subsequences of time series data that have good discriminating features for time series classification. The easiest way to extract shapelets from time series sequences is to extract subsequences of all possible lengths and then evaluate them using some quality measure. In the original paper, shapelet-based decision trees were used to classify time series sequences. The shapelet in each node is discovered based on its quality and information gain, using some speedup techniques. As exhaustive shapelet search is very time consuming, many attempts have been made to speed up the search process [61]-[64]. In 2014, Hills et al. [65] noted that the decision tree classifier is useful and robust, but that many other classifiers, such as SVM and Bayesian networks, can outperform decision trees. Thus, they proposed a novel method, called Shapelet Transform (ST), that included two techniques: one to speed up the shapelet search process and one to generalize shapelet usage to other classifiers. To reduce shapelet search time, they proposed a single-scan algorithm to find the best k shapelets, using the F-statistic from analysis of variance to measure shapelet quality. 
Moreover, by using the distances between each of the k shapelets and the time series sequences, the original dataset was transformed into distance features that can be used with other classifiers. Based on their results, an SVM with a linear kernel achieved the best accuracy. Although ST uses the single-scan algorithm, finding shapelets is still a time-consuming process. In 2015, Wistuba et al. [66] proposed Generalized Ultra-Fast Shapelets (U-F), which uses randomization to find shapelets without any quality assessment. Aside from a large speed improvement, randomization was claimed to extract more interacting shapelets, i.e., shapelets that are very useful when combined with other shapelets. However, their method requires a large number of random shapelets to guarantee high accuracy. In 2019, Ji et al. [67] proposed Fast Shapelet Selection (FSS), which takes advantage of a subclass splitting method [68] and Local Farthest Deviation Points (LFDPs) to reduce shapelet search time. Although FSS is less accurate than ST, it runs a thousand times faster. However, the paper reported results for only twelve of the more than one hundred datasets in the UEA & UCR Time Series Classification Repository [69]. Recently, Vichit and Ratanamahatana [70] devised a new shapelet discovery algorithm, called Dual Increment Shapelets (DIS). The algorithm combines LFDPs and an incremental neural network [71] to reduce the number of shapelet candidates and to cluster similar shapelets. Though the algorithm aims to speed up the process, it also outperforms both FSS and ST in terms of accuracy. Although FSS and DIS perform well, both use LFDPs to extract important points from time series sequences. Unfortunately, EEG data is non-stationary, so the important points extracted this way are likely to be less informative.
To summarize, a shapelet is a partial time series subsequence that represents a class based on its similarity. It gives better interpretability and accuracy without having to use the whole time series in a classification. Thus, shapelets have gained a lot of popularity and have been successfully applied in many applications such as gait recognition [72], gesture recognition [73], and medicine [74], [75]. Despite a wide usage of shapelets in various problems and domains, to the best of our knowledge, shapelets have never been applied to detect movement intention from brain signals.
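For reference, the distance-feature idea behind the Shapelet Transform reviewed in this subsection can be sketched as follows. This is only an illustrative sketch, not the implementation of [65]: the function names are hypothetical, the shapelets are assumed to be given rather than discovered, and z-normalization with the population standard deviation is an assumption about how the normalization is carried out.

```python
from math import sqrt
from statistics import mean, pstdev


def z_normalize(values):
    """Rescale a sequence to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sd = pstdev(values)
    if sd == 0:
        return [0.0 for _ in values]  # flat window: no shape left to compare
    return [(v - mu) / sd for v in values]


def min_subsequence_distance(shapelet, series):
    """Smallest Euclidean distance between the z-normalized shapelet and
    every z-normalized window of the same length in the series."""
    s = z_normalize(shapelet)
    m = len(s)
    best = float("inf")
    for start in range(len(series) - m + 1):
        window = z_normalize(series[start:start + m])
        dist = sqrt(sum((a - b) ** 2 for a, b in zip(s, window)))
        best = min(best, dist)
    return best


def shapelet_transform(dataset, shapelets):
    """Map each series to one distance feature per shapelet."""
    return [[min_subsequence_distance(s, t) for s in shapelets] for t in dataset]
```

Because every window is normalized, a bump of the same shape matches regardless of its offset or scale, which is exactly the amplitude information that a shape-only comparison discards.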

III. METHODOLOGY
In this section, we explain our experimental protocol, data acquisition, data labeling, Partial Shapelets algorithm, MRCP and Non-MRCP classification experiment setup, and pseudo-online detection experiment setup.

A. EXPERIMENTAL PROTOCOL AND DATA ACQUISITION
Nine healthy volunteers (seven males and two females, aged 22 to 26 years) participated in the experiment. None of them had any experience with BCI systems prior to the experiment. The participants provided written informed consent. The experiment was approved by the research ethics review committee for research involving human research participants, Health Science Group, Chulalongkorn University (COA No. 049/2018).
At the beginning of the recording session, each participant was asked to sit on a comfortable chair with both legs resting on the ground. Then, an Electro-Cap (Electro-Cap International Inc.) with 19 channels of monopolar EEG was attached to the subject's scalp and connected to a Nicolet w10-20HB amplifier (Natus Medical Inc.). The 19 channels were placed at the Fp1, Fp2, F7, F3, Fz, F4, F8, T3, C3, Cz, C4, T4, T5, P3, Pz, P4, T6, O1, and O2 positions. The ground electrode was the ground of the cap, placed midway among Fp1, Fp2, and Fz, and the reference was placed on the left earlobe. The impedance of all electrodes was kept below the 5 kΩ threshold. To facilitate EEG labeling, one channel of surface Electromyography (EMG) was also recorded by the Nicolet w10-20HB amplifier. EMG was recorded in a bipolar derivation from the Tibialis Anterior (TA) muscle and the bony surface of the knee of the dominant leg (the right leg in all subjects). All EEG and EMG signals were sampled at 1,024 Hz. EEG signals were filtered by a Butterworth band-pass filter with a pass band of 0.01-30 Hz and a notch filter to remove 49-51 Hz power-line interference [35].
In each recording session, participants were instructed to perform about thirty self-paced ballistic ankle dorsiflexions. The interval between consecutive trials was roughly 3 to 7 seconds, yielding about 280,000 data points per channel per session that needed labeling. During the recording session, to reduce artifacts, each subject was asked to stay relaxed, keep their eyes closed, and avoid moving other body parts. The protocol is shown in Fig. 1. Each subject had six separate recording sessions, with resting periods of any desired length in between. During the recording sessions, videos were also recorded for EMG signal synchronization. All recordings were made in an environment similar to a hospital setting.
This data was previously used to study the effects and relationship among factors affecting the performance of an asynchronous BCI system for movement intention detection. Detailed explanations and experiment results can be found in [50].

B. DATA LABELING
The acquired EEG data needed to be labeled for classifier training. EMG data and videos were utilized in this stage. First, the EMG data was annotated to mark where the movements took place. To specify the movements precisely, the EMG data was visually inspected to determine the movement onsets and offsets, and the recorded videos were then used to verify these annotations. Invalid trials, i.e., trials whose movement periods were less than 4 seconds apart from the previous trial, were rejected. After that, the 2-second period before each onset and the 2-second period after each offset were regarded as the Movement Intention and Non-Movement Intention periods, respectively.
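The labeling rule above can be illustrated with a short sketch. It assumes that onsets and offsets are already available in seconds from the visual EMG inspection, that the 4-second rule is measured from the previous trial's offset to the current onset, and that the windows are half-open sample ranges at the 1,024 Hz sampling rate; the function and key names are hypothetical.

```python
FS = 1024  # sampling rate in Hz, as stated in the data acquisition section


def label_trials(onsets_s, offsets_s, fs=FS, window_s=2.0, min_gap_s=4.0):
    """Return sample ranges (start, stop) of the Movement Intention (MI) and
    Non-Movement Intention (NonMI) periods for every accepted trial."""
    segments = []
    prev_offset = None
    for onset, offset in zip(onsets_s, offsets_s):
        # Assumption: the gap is measured from the previous offset to this onset.
        too_close = prev_offset is not None and (onset - prev_offset) < min_gap_s
        prev_offset = offset
        if too_close:
            continue  # reject trials that follow the previous one too quickly
        mi = (round((onset - window_s) * fs), round(onset * fs))
        non_mi = (round(offset * fs), round((offset + window_s) * fs))
        segments.append({"MI": mi, "NonMI": non_mi})
    return segments


# Three annotated movements; the second starts only 2 s after the first ends.
trials = label_trials([5.0, 8.0, 15.0], [6.0, 9.0, 16.0])
```

In this example the second trial is rejected, and each accepted trial contributes one 2,048-sample window per class.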

C. PARTIAL SHAPELETS
This section describes our proposed algorithm. As mentioned in Section II, a shapelet algorithm generally uses an exhaustive search, enumerating all subsequences of every possible length for all time series sequences in the dataset. These subsequences are then evaluated by a scoring function to extract only the latent and meaningful ones that can be used as discriminating features. However, the computational cost is exceedingly high. Moreover, the original shapelet algorithms were not specifically designed to handle non-stationary data such as EEG; shapelets exploit only the shape of the subsequences to determine a class label, while EEG signals do not have clearly defined shapes. In addition, all extracted subsequences are normalized in the pre-processing step, which may discard amplitude information. To effectively apply shapelets to detect movement intentions from EEG signals, we have to exploit the latent informative shape of MRCP and deal with the non-stationarity problem while, at the same time, preserving the amplitude information.
To explain our algorithm, we begin by introducing all necessary notation and definitions as follows:
Definition 8 (A set of shapelets): $Shapelets = \{\langle S_1, i_{start_1}, i_{end_1}\rangle, \ldots, \langle S_r, i_{start_r}, i_{end_r}\rangle\}$ is a set of $r$ shapelets, where each shapelet contains a subsequence $S$ extracted from a set of templates, its starting index $i_{start}$, and its ending index $i_{end}$.
Partial Shapelet (PS) is an algorithm that is designed to exploit the latent informative and discriminative shape of MRCP to extract shapelets and exploit the amplitude information and limited boundary to transform an original data space into a new distance feature space. The algorithm consists of two major steps, i.e., partial shapelet extraction and partial shapelet transformation.

1) PARTIAL SHAPELET EXTRACTION
Partial shapelet extraction algorithm is shown in Algorithm 1.
Algorithm 1 Partial Shapelet Extraction (only the end of the listing is recoverable): Shapelets = Shapelets ∪ tempshapelet 13: end 14: return Shapelets
Unlike traditional shapelet algorithms that utilize either an exhaustive search or important points to extract shapelets, our algorithm uses a subclass splitting technique to group the sequences of the same class that share the same characteristics into subclasses. Then, a shape averaging technique is applied to each subclass to create a subclass template that represents the subclass. These techniques are described in [68]. By utilizing them, we can extract shapelets faster because much less data has to be processed. In addition, the diversity of non-stationary EEG data is handled by 1) subclass splitting, which groups data with the same characteristics together, and 2) shape averaging, which creates a data representation for each group. After acquiring all subclass templates for every class, the interesting data points of each template are located and used to extract shapelets (lines 10-11). The process of locating interesting data points is shown in Algorithm 2.
To locate the interesting data points of each template, one very simple assumption is made: if a data point is informative, its intraclass difference must be smaller than its interclass difference. Only the informative data points are extracted and utilized for shapelet extraction. In detail, the absolute differences between the query and all candidates are calculated and accumulated according to their classes (lines 4-10). If the classes match, the absolute difference is added to scoreIntra; otherwise, it is added to scoreInter. Both scores are then averaged and used as representative scores. We calculate the difference between them, keep it as Diff, and compute the mean and standard deviation of Diff. Then, each data point of the query template is checked for interestingness: if Diff[i] > mean + sd, index i is marked as an interesting point (lines 17-18). These steps filter out the uninformative data points.
Algorithm 2 (only the end of the listing is recoverable): iPoints.add(i) 19: end 20: end 21: return iPoints
The next step is to extract shapelets by utilizing these identified interesting data points. The process of shapelet extraction is shown in Algorithm 3.
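As a minimal sketch of the interesting-point test described for Algorithm 2, the following Python function follows the prose step by step. Several details are assumptions rather than the published listing: Diff is taken as the averaged interclass score minus the averaged intraclass score, the standard deviation is the population one, the candidates exclude the query itself, and all templates have equal length. The names are hypothetical.

```python
from statistics import mean, pstdev


def locate_interesting_points(query, query_label, candidates):
    """Return the indices of `query` whose interclass difference exceeds the
    intraclass difference by more than mean + sd of that gap.

    `candidates` is a list of (template, label) pairs of the same length as
    `query`; the query itself is assumed not to be among them."""
    n = len(query)
    score_intra = [0.0] * n
    score_inter = [0.0] * n
    n_intra = n_inter = 0
    for template, label in candidates:
        same_class = label == query_label
        if same_class:
            n_intra += 1
        else:
            n_inter += 1
        target = score_intra if same_class else score_inter
        for i in range(n):
            target[i] += abs(query[i] - template[i])
    # Average each accumulated score (guarding against an empty group).
    intra = [s / n_intra if n_intra else 0.0 for s in score_intra]
    inter = [s / n_inter if n_inter else 0.0 for s in score_inter]
    diff = [inter[i] - intra[i] for i in range(n)]
    threshold = mean(diff) + pstdev(diff)
    return [i for i, d in enumerate(diff) if d > threshold]


query = [0, 0, 0, 0, 5, 5, 0, 0]
others = [([0, 0, 0, 0, 5, 5, 0, 0], "MI"), ([0] * 8, "NonMI")]
points = locate_interesting_points(query, "MI", others)
```

In this toy case only the two samples where the classes actually differ pass the threshold.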
To extract shapelets from a template, the interesting points are concatenated under a user constraint until the constraint is no longer met; this constraint, farness, limits how far apart two consecutive interesting points may be and still belong to the same shapelet. The first and last indices of the concatenated interesting points are then used as the starting and ending points of a shapelet. Specifically, we start by checking the difference between the indices of two adjacent interesting points kept in iPoints[i − 1] and iPoints[i] (line 4). If the difference is less than farness, we regard both points as part of the same subsequence and continue concatenating until the condition is no longer met. When the condition fails, we check the number of interesting points in the concatenated subsequence and regard it as a shapelet only if it contains more than one interesting point. We also keep the starting and ending points of this subsequence on the template for the subsequent shapelet transformation. We repeat the process, checking the index difference between the last considered index i and its adjacent index i + 1, until all interesting points are processed and all shapelets from the template are acquired.
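The grouping of interesting points into shapelets can be sketched as below. This is an illustration written from the prose description, not the Algorithm 3 listing itself; it assumes that the interesting-point indices are sorted in ascending order and that a shapelet is stored as a (subsequence, start index, end index) tuple with inclusive 0-based indices. The function name is hypothetical.

```python
def extract_shapelets(template, i_points, farness):
    """Group consecutive interesting points whose index gap is below
    `farness` and return each group with more than one point as a shapelet."""
    shapelets = []
    if not i_points:
        return shapelets
    run_start = run_end = i_points[0]
    count = 1  # number of interesting points in the current run
    for idx in i_points[1:]:
        if idx - run_end < farness:
            run_end = idx  # close enough: extend the current run
            count += 1
            continue
        if count > 1:
            shapelets.append((template[run_start:run_end + 1], run_start, run_end))
        run_start = run_end = idx  # start a new run at this point
        count = 1
    if count > 1:  # flush the last run
        shapelets.append((template[run_start:run_end + 1], run_start, run_end))
    return shapelets


template = list(range(20))
found = extract_shapelets(template, [2, 3, 5, 10, 15, 16], farness=3)
```

Here points 2, 3, and 5 merge into one shapelet, the isolated point 10 is discarded, and points 15 and 16 form a second shapelet.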

2) PARTIAL SHAPELET TRANSFORMATION
After acquiring shapelets from all templates, these shapelets can be used to transform the original data into a new distance feature space by Algorithm 4.
To transform each time series sequence T in the original dataset, a sliding window is used to calculate the minimal subsequence distance, i.e., the minimal Euclidean distance between a shapelet S and all possible subsequences of a sequence T. The distances calculated for each shapelet are then concatenated into a distance feature vector, which is regarded as the transformation of the original data. However, EEG data are non-stationary, and not every data point is informative. Thus, calculating the subsequence distance between a shapelet and every possible subsequence of T is unnecessary and too time consuming. We limit the boundary of the sliding window by using the starting and ending indices of a shapelet acquired in Algorithm 3, and we extend this boundary by partial * T.length, where partial is a percentage and T.length is the length of sequence T. This idea was influenced by partial dynamic time warping [76].

Algorithm 4 Partial Shapelet Transformation
Input: A set of extracted shapelets (Shapelets), a time series dataset with class labels (D), partial constraint (partial)
Output: Transformed dataset (D_T)
Initialization:
1: for each time series T in D do
2:     newT = ∅
3:     for each shapelet <S, i_start, i_end> in Shapelets do
4:         minDist = ∞
5:         i_start = max(0, i_start − partial * T.length)
6:         i_end = min(T.length, i_end + partial * T.length)
7:         for i = i_start to i_end − S.length + 1 do
8:             dist = EuclideanDist(S, T(i, ..., i + S.length − 1))
9:             if (dist < minDist) then minDist = dist
10:        end
11:        newT.add(minDist)
12:    end
13:    D_T.add(newT)
14: end
15: return D_T

In Algorithm 4, each shapelet S and its starting and ending indices are used to recalculate the sliding window boundary i_start and i_end of the shapelet S (lines 5-6). In lines 7-10, the subsequence distances are calculated between shapelet S and the subsequences of the same length as S that are extracted from time series T between indices i_start and i_end. It is important not to normalize the shapelets or the subsequences, as is normally done in shapelet transformation, because we want to preserve the amplitude information of both. Thus, not only the shape of a subsequence is taken into account, but also its amplitude. The minimum subsequence distance between the shapelet and the subsequences extracted between indices i_start and i_end of time series T is then regarded as a feature (line 11). Together, these features form the transformed feature vector of time series T.
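A minimal Python sketch of the transformation in Algorithm 4 is given below. It is an illustration under stated assumptions, not the authors' code: indices are 0-based and inclusive, the margin `partial * T.length` is truncated to an integer, and the names `partial_shapelet_transform` and `euclidean` are hypothetical. As in the text, no normalization is applied, so amplitude is preserved.

```python
import math
from typing import List, Sequence, Tuple

# Hypothetical container: (shapelet values, start index, end index), end inclusive.
Shapelet = Tuple[Sequence[float], int, int]


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Plain Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def partial_shapelet_transform(dataset: Sequence[Sequence[float]],
                               shapelets: Sequence[Shapelet],
                               partial: float) -> List[List[float]]:
    """Map each series to a vector of minimal shapelet distances."""
    transformed = []
    for series in dataset:
        n = len(series)
        margin = int(partial * n)  # assumption: truncate to whole samples
        features = []
        for values, i_start, i_end in shapelets:
            m = len(values)
            lo = max(0, i_start - margin)      # widened window start
            hi = min(n - 1, i_end + margin)    # widened window end (inclusive)
            best = math.inf
            # Slide the shapelet over every start position that fits in [lo, hi].
            for i in range(lo, hi - m + 2):
                best = min(best, euclidean(values, series[i:i + m]))
            features.append(best)
        transformed.append(features)
    return transformed
```

Under these assumptions, `partial = 0` compares the shapelet only with the subsequence at its original position, while `partial = 1` searches every position in the series, which matches the two extremes of the parameter discussed in Section IV-C.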

D. MOVEMENT INTENTION CLASSIFICATION EXPERIMENT SETUP
Before applying our proposed algorithm to the movement intention detection problem, we evaluate it on a more basic problem, movement intention classification. As mentioned earlier, both MRCP and ERD/ERS have their own advantages, and utilizing them together could bring more benefits to stroke rehabilitation than utilizing either signal alone. Because the two signals cannot be preprocessed together, we preprocess the acquired data separately for each before feeding them to further processes.
For the time domain, the acquired data were filtered with a causal second-order Butterworth band-pass filter with a frequency band of [0.01-1] Hz. The filtered signals were then segmented into Movement Intention and Non-Movement Intention periods. The segmented data were downsampled to 20 Hz, and a Surface Laplacian spatial filter was applied using Cz as the working electrode and Fz, Pz, C3, and C4 as the surrounding electrodes. This configuration is consistent with a previous study [50], which provided the dominant results in movement intention classification. Finally, each sample that had been frequency-filtered, segmented, downsampled, and spatially filtered was z-normalized and regarded as an MRCP feature.
For the frequency domain, the acquired data were processed in the same way as for the time domain, except that we filtered the data with a low-pass filter with a frequency band of [0.01-30] Hz and downsampled to 128 Hz. The power values were estimated by Welch's method with Hamming windows of 1 second and 50% overlap, as in [32]. The estimated power values for frequencies of 0.01-30 Hz in steps of 0.5 Hz were z-normalized and regarded as an ERD/ERS feature.
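The spatial filtering and normalization steps of the time-domain pipeline can be illustrated with the short Python sketch below. It assumes the common small-Laplacian form, in which the mean of the surrounding electrodes is subtracted from the working electrode, and a population standard deviation for z-normalization; the text does not specify either choice, and the function names are hypothetical. The band-pass filtering, segmentation, and downsampling steps are omitted.

```python
from statistics import fmean, pstdev
from typing import List, Sequence


def surface_laplacian(center: Sequence[float],
                      neighbours: Sequence[Sequence[float]]) -> List[float]:
    """Subtract the mean of the surrounding channels from the centre channel.

    Assumed form: Cz(t) - mean(Fz(t), Pz(t), C3(t), C4(t)) at every sample t.
    """
    return [c - fmean(vals) for c, vals in zip(center, zip(*neighbours))]


def z_normalize(x: Sequence[float]) -> List[float]:
    """Scale a sample to zero mean and unit (population) standard deviation."""
    mu = fmean(x)
    sd = pstdev(x)
    if sd == 0:
        return [0.0] * len(x)  # flat signal: avoid division by zero
    return [(v - mu) / sd for v in x]


# Toy usage with three samples per channel (values are made up).
cz = [4.0, 6.0, 8.0]
fz, pz, c3, c4 = [1.0] * 3, [1.0] * 3, [3.0] * 3, [3.0] * 3
mrcp_feature = z_normalize(surface_laplacian(cz, [fz, pz, c3, c4]))
```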
To evaluate the classification performance, we pooled the data from every session together, shuffled them, and sampled 2/3 for training and the remaining 1/3 for testing. We also changed the seed and repeated the experiment 100 times. The number of samples is given in Table I. The MRCP and ERD/ERS features from the training data were separately fed to Algorithm 1 to extract shapelets with farness = 7 and partial = 0. These parameter settings are explored and discussed in Section IV-C. The shapelets were used to transform the original MRCP and ERD/ERS features into a shapelet distance feature space by utilizing Algorithm 4. The transformed dataset was then used to train a linear-kernel Support Vector Machine (SVM) with C chosen from [0.001, 0.01, ..., 1000]. For the test data, we also transformed the original MRCP and ERD/ERS features into the shapelet feature space by utilizing the shapelets extracted from the training data before classifying them with the trained SVM. The overall process is depicted in Fig. 2.
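The repeated shuffle-and-split protocol can be sketched as follows. This is only an illustration: the seeding scheme (`base_seed + repeat`), the rounding of the training size, and the function name `repeated_splits` are assumptions, since the text states only that the seed was changed across 100 repetitions.

```python
import random
from typing import Iterator, List, Tuple


def repeated_splits(n_samples: int,
                    n_repeats: int = 100,
                    train_fraction: float = 2 / 3,
                    base_seed: int = 0) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) for each repetition."""
    cut = round(n_samples * train_fraction)
    for repeat in range(n_repeats):
        rng = random.Random(base_seed + repeat)  # a new seed per repetition
        indices = list(range(n_samples))
        rng.shuffle(indices)
        yield indices[:cut], indices[cut:]
```

Each repetition would then extract shapelets from the training indices only, transform both partitions with those shapelets, and score the classifier on the held-out third.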
To evaluate our proposed algorithm's performance on the classification problem, the F1-score was used as the performance metric. The F1-score takes precision and recall into account together, as described in (1):
$$F1 = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \qquad (1)$$
Precision is the number of correctly predicted Movement Intention samples divided by the total number of samples predicted as Movement Intention. Recall is the number of correctly predicted Movement Intention samples divided by the total number of Movement Intention samples.
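The metric can be computed directly from these definitions, as in the sketch below. The label encoding (1 for Movement Intention, 0 otherwise) and the convention of returning 0 when a denominator is zero are assumptions made for the example.

```python
from typing import Sequence


def f1_score(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """F1-score of the positive (Movement Intention) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

For example, with true labels `[1, 1, 1, 0, 0, 0]` and predictions `[1, 1, 0, 1, 0, 0]`, precision and recall are both 2/3, so the F1-score is also 2/3.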

E. PSEUDO-ONLINE DETECTION EXPERIMENT SETUP
To evaluate movement intention detection in the pseudo-online experiment, we used k-fold cross validation on each participant, where k is the number of sessions. The training data were processed, and shapelets were extracted and used to train the classifier as in the classification experiment. The trained classifier and extracted shapelets were applied to the test data to detect movement intention every 125 ms, as shown in Fig. 3. A movement intention was reported when two consecutive windows were classified as Movement Intention samples. The threshold for the classifier was automatically selected from the training set by utilizing the knee point of the ROC curve. We evaluated the performance by True Positive Rate (TPR), False Positives per minute (FPs/min), and Detection Latency (DL). A true detection was counted when a movement intention was reported within a time interval of -1 to 1 s with respect to the movement onset. A false detection was counted when a movement intention was reported outside the interval of -1 to 1 s with respect to the movement onset. The detection latency was the time interval between the detection time and the movement onset when a true detection was counted.
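The reporting rule and the three metrics can be sketched in Python as below. Several details are assumptions, because the text does not fix them: a detection is time-stamped at the index of the second positive window times the 125 ms step, a run of positive windows triggers only one report, the earliest detection inside an onset's interval defines the latency, and additional detections inside that interval are not counted as false positives. The function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def report_detections(window_labels: Sequence[int],
                      step_s: float = 0.125,
                      needed: int = 2) -> List[float]:
    """Return detection times: one per run of at least `needed` positive windows."""
    detections = []
    run = 0
    for i, label in enumerate(window_labels):
        run = run + 1 if label == 1 else 0
        if run == needed:  # fires once, when the run first reaches `needed`
            detections.append(i * step_s)
    return detections


def evaluate(detections: Sequence[float],
             onsets: Sequence[float],
             duration_s: float,
             tolerance_s: float = 1.0) -> Tuple[float, float, float]:
    """Return (TPR, FPs/min, mean detection latency in seconds)."""
    if not onsets:
        raise ValueError("at least one movement onset is required")
    latencies = []
    matched = set()
    for onset in onsets:
        hits = [d for d in detections if abs(d - onset) <= tolerance_s]
        if hits:
            latencies.append(min(hits) - onset)  # earliest detection in the interval
            matched.update(hits)
    false_positives = [d for d in detections if d not in matched]
    tpr = len(latencies) / len(onsets)
    fps_per_min = len(false_positives) / (duration_s / 60.0)
    mean_latency = sum(latencies) / len(latencies) if latencies else float("nan")
    return tpr, fps_per_min, mean_latency
```

A negative latency in this sketch means the intention was reported before the movement onset.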

F. COMPARISON AND STATISTICAL ANALYSIS
Although a number of movement intention detection algorithms for BCI systems have recently been proposed [39], [40], [43], [46], [48], [77], [78], only a handful have been successfully utilized in real-world scenarios. Thus, we compare our method with four other methods that have been utilized in real-world scenarios or are often used for comparison purposes, i.e., Matched Filter (MF) [26], Locality Preserving Projection - Linear Discriminant Analysis (LPP-LDA) [28], Weighted Average - Random Forest (RF) [38], and Event Related Desynchronization + Movement Related Cortical Potential classifier (ERD+MRCP) [32]. Moreover, we also ran Generalized Ultra-Fast Shapelet (U-F Shapelet) [66] and Shapelet Transform (ST) [65] to compare with our proposed method. Note that, to evaluate the real performance of each algorithm, we tried our best to optimize the hyperparameters of all the baseline methods, e.g., the number of dimensions for LPP, the number of trees for random forest, and the number of shapelets for U-F and ST.
Because we had a limited number of participants, we employed a nonparametric Kruskal-Wallis test to analyze the differences between the baseline methods and our proposed method on all described performance metrics. The Dunn-Bonferroni test was used as the post-hoc test. A significance level of p < 0.05 was applied for all tests.

A. MOVEMENT INTENTION CLASSIFICATION PERFORMANCE
F1-scores were averaged over one hundred iterations for each participant, and the overall performance of each method is summarized in Table II. The Kruskal-Wallis test revealed significant differences among classifiers (p < 0.05). The pairwise comparisons revealed that our proposed method provided the significantly highest F1-score, as shown in Fig. 4. The two lowest F1-scores were obtained by MF and U-F. ERD+MRCP yielded a significantly higher F1-score than MF and U-F but a lower one than the rest, except for ST. LPP-LDA, RF, and ST did not differ significantly from each other, but their results were significantly lower than those of our proposed PS method.
The accurate classification of movement intention is an essential part of a self-paced BCI system for stroke rehabilitation. Thus, building a high-quality classifier is the first step toward adopting a BCI system for stroke rehabilitation. In this work, we proposed a novel classification method based on time series shapelets to distinguish self-paced movement intention from non-movement intention. Although shapelet algorithms are normally used for stationary time series data, our proposed Partial Shapelets can effectively classify non-stationary data. The results revealed that our proposed method provides superior F1-scores compared to those of MF, LPP-LDA, ERD+MRCP, and RF, the state-of-the-art methods for classifying movement intention vs. non-movement intention EEG samples. As illustrated in Table II, the highest F1-score for each individual is attained by different classifiers. In other words, no single classifier provides the highest F1-score for all individuals. However, our proposed method provides the highest average F1-score, outperforming the traditional shapelet-based classifiers. Thus, it can be used as an alternative classifier to provide additional performance to other systems or as the main classifier for movement intention.

B. PSEUDO-ONLINE DETECTION PERFORMANCE
For pseudo-online detection, the average TPR, FPs/min, and DL are reported in Table III. The Kruskal-Wallis test revealed significant differences in TPR, FPs/min, and DL (p < 0.05). The post-hoc test results between methods for TPR, FPs/min, and DL are depicted in Fig. 5.
The post-hoc test revealed that only the TPR of MF was significantly lower than those of LPP-LDA, RF, and PS. No other pair of methods showed a significant difference in TPR. For FPs/min, the post-hoc test revealed significant differences between MF vs. RF, MF vs. U-F, MF vs. ST, LPP-LDA vs. RF, LPP-LDA vs. U-F, RF vs. ERD+MRCP, RF vs. PS, and ERD+MRCP vs. U-F. The lowest average FPs/min was attained by MF, but it was not significantly different from those of LPP-LDA, ERD+MRCP, and PS. The highest average FPs/min were attained by the RF and U-F methods, with no significant difference between them. However, the FPs/min of RF and U-F were significantly different from those of MF, LPP-LDA, ERD+MRCP, and PS (the last only for RF). For DL, the two longest average DLs were obtained by MF and ERD+MRCP. There was no significant difference between MF and ERD+MRCP, but the DLs of these two methods were significantly different from those of RF, U-F, ST, and PS. There were no significant differences among LPP-LDA, RF, U-F, ST, and PS.
The aim of this study is to propose a novel self-paced movement intention detection algorithm for asynchronous BCI using time series mining techniques and to demonstrate the capability of shapelets with non-stationary data. In pseudo-online detection, our proposed PS algorithm outperforms MF, a popular baseline method in many previous works [28], [32], [33], [35]. Although PS does not significantly outperform RF in terms of TPR and DL, its FPs/min is significantly lower. Compared with LPP-LDA and ERD+MRCP, PS shows no significant differences in TPR, FPs/min, or DL (the last only for LPP-LDA). In addition, PS provides a DL within the acceptable delay of 200 ms, while LPP-LDA does not, and PS significantly outperforms ERD+MRCP in terms of DL. For the other shapelet algorithms, TPR and DL are not much different from those of the traditional methods, while their FPs/min are significantly higher. These results demonstrate the superiority of our method over traditional shapelet algorithms. While the traditional shapelet algorithms can clearly classify Movement Intention and Non-Movement Intention samples, the non-stationary property of continuous EEG data can ruin their pseudo-online detection performance, whereas it has no such effect on our proposed PS method. Note that the FPs/min of all methods in this study are about 3 times higher than those reported in previous studies [26], [28], [38], [57] because our participants made a movement about every 5 seconds, while other works have a longer period between movements.
Our superior results compared with rival algorithms, along with a DL under the crucial 200 ms threshold for effective stroke recovery [28], [29], suggest high feasibility of applying our PS algorithm as a precise brain switch for an asynchronous BCI system to practically and effectively induce brain plasticity for neurorehabilitation. However, as mentioned in [29], [37], [38], a further study on stroke patients is still needed before real-world implementation.

C. PARAMETER EFFECTS
The Partial Shapelet algorithm has two user-defined parameters, i.e., farness and partial. Farness limits the interval length between two consecutive interesting points during the shapelet formation and concatenation steps. It ranges from 1 to half of the sample length. Setting this parameter to 1 allows only contiguous data points to form a shapelet, while setting it to a larger value forms longer shapelets. However, setting this parameter larger than half of the sample length disregards the shapelet, and the whole time series sequence is used instead. Partial limits the search boundary during the distance calculation between a shapelet and the subsequences for feature transformation in Algorithm 4. This parameter ranges from 0 to 1. If it is fixed to 0, a single subsequence is extracted from the same position as the shapelet for the distance calculation. When it is fixed to 1, subsequences with the same length as the shapelet are extracted from every position and used for the distance calculation. Normally, allowing shapelet shifting is better than fixing the position because the shapelet can be slid to discover the most similar subsequence. However, in the movement intention detection problem, EEG data are processed continuously; thus, shapelet shifting is unnecessary and too time consuming.
To show the robustness of our algorithm, we vary farness over 1, 3, 5, 7, 10, and 20 (half the length of the data) and vary partial over 0, 0.05, 0.1, 0.5, and 1. The average F1-scores for each setting are shown in Fig. 6. As shown in Fig. 6(a), the algorithm is not very sensitive to these two parameters: farness variation increases or decreases F1-scores by only a relatively small amount, about two percent for each value of partial. Fig. 6(b) shows F1-scores when farness is varied and partial is fixed to 0. F1-scores increase as farness increases from 1 to 7 and slightly decrease afterward. Fig. 6(c) demonstrates that F1-scores decrease as partial increases while farness is fixed to 7. However, variation in both farness and partial yields only slight effects on F1-scores, confirming the robustness of our algorithm.
Fig. 6. (a) Adjusting the partial parameter yields a shallow downward trend at every farness value, and farness variation yields slight differences in F1-score at each partial value. (b) When farness is increased, F1-scores increase and peak at farness = 7. If farness is too large, the F1-score slightly degrades. (c) The average F1-scores decrease slowly as partial is increased.
To summarize, the two parameters of our proposed algorithm are not sensitive and require no complicated expert knowledge. In addition, arbitrary settings affect the algorithm's performance only slightly. Although both parameters are not difficult to tune in our problem, as mentioned earlier, they should still be tuned properly on a validation set when the Partial Shapelet algorithm is applied, in order to achieve the best results.

V. CONCLUSION
In this work, we focus on developing an algorithm to be used in an asynchronous BCI system for stroke rehabilitation. To fully recover from strokes, efficient rehabilitation tools and software are necessary. Thus, we propose a novel and precise movement intention detection algorithm based on time series shapelets to detect movement intentions from EEG data. Our algorithm can be used as a brain switch to control an electrical stimulator in an asynchronous BCI system to induce brain plasticity for stroke rehabilitation. We want to emphasize that this is the first time that movement intentions from EEG data have been successfully detected by a shapelet-based algorithm. Though shapelet-based algorithms are normally applied to problems with stationary data, our proposed algorithm can effectively deal with the diversity of non-stationary data and maintain amplitude information to detect movement intention from EEG signals. We conducted the experiment on EEG data recorded from nine healthy participants performing ankle dorsiflexions in a self-paced manner without any cues. The performance of our algorithm was evaluated in both classification and pseudo-online detection. Our proposed algorithm outperforms state-of-the-art algorithms not only in classification, but also in the pseudo-online detection problem. Moreover, its detection latency is low enough to induce effective brain plasticity for neurorehabilitation. Although our experiments were based on data from healthy subjects, our promising results reveal the capability of our proposed algorithm as a precise brain switch for an asynchronous BCI system to aid the recovery of stroke patients.
For future work, we plan to extend our experiments to stroke patients to observe the capability of our proposed algorithm in inducing brain plasticity for stroke rehabilitation.