Determination of Optimum Segmentation Schemes for Pattern Recognition-Based Myoelectric Control: A Multi-Dataset Investigation

Pattern recognition (PR) algorithms have shown promising results for upper limb myoelectric control (MEC). Several studies have explored the efficacy of different pre- and post-processing techniques in implementing PR-based MECs. This paper explores the effect of segmentation type (disjoint and overlap) and segment size on the performance of PR-based MEC for multiple datasets recorded with different recording devices. Two PR-based methods, linear discriminant analysis (LDA) and support vector machine (SVM), are used to classify hand gestures. Optimum values of segment size, step size and segmentation type were determined on the basis of classification performance. Statistical analysis showed that the optimum segment sizes for disjoint segmentation are between 250ms and 300ms for both LDA and SVM. For overlap segmentation, the best results were observed in the range of 250ms-300ms for LDA and 275ms-300ms for SVM. For both classifiers, a step size of 20% achieved the highest mean classification accuracy (MCA) on all datasets for overlap segmentation. Overall, there is no significant difference between the MCA of disjoint and overlap segmentation for LDA (P-value = 0.15), but the two differ significantly for SVM (P-value < 0.05). For disjoint segmentation, the MCA of LDA is 88.68% and that of SVM is 77.83%; statistical analysis showed that LDA outperformed SVM (P-value < 0.05). For overlap segmentation, the MCA of LDA is 89.86% and that of SVM is 89.16%, with no statistically significant difference between the two classifiers (P-value = 0.45). The indicated values of segment size and overlap size can be used to achieve better performance, without increasing delay time, for a robust PR-based MEC system.


FIGURE 1. Pattern recognition based myoelectric control system. The recorded raw signal is pre-processed to remove motion artifacts and noise before segmentation. The features are then calculated from segments and fed to the classifier to perform classification.
state machine (FSM) and on-off controller for upper limb prostheses. These conventional techniques used the amplitude of EMG signals as a threshold to deliver the desired response with a single DOF [1]. Pattern recognition (PR) based techniques overcome the limitation of a single DOF and provide better and more intuitive control of upper limb prostheses with multiple DOFs by recognizing patterns in the EMG signals [2]. In PR-based MEC, a four-step process is usually performed: signal pre-processing, segmentation, feature extraction, and classification. The pre-processing step removes motion artifacts and other electrical interference by filtering certain frequencies. Then, the signal is segmented using a windowing mechanism. After segmentation, feature extraction and/or dimensionality reduction is performed per segment. A classifier is trained with the aggregated features (train data). To evaluate how well the MEC has been trained, a performance metric is used to measure its performance on test data. Classification accuracy (CA), recall, precision, and F1-score are widely used performance metrics. Figure 1 represents the general flowchart of a PR-based MEC.
The primary goal of a MEC is to provide natural control of upper limb prostheses with multiple DOFs [3]-[7]. The performance of PR-based MEC depends on multiple factors, such as the selection of features and classifiers. Various time and frequency domain features have been proposed for the classification of EMG signals [8]-[13]. Hudgins et al. (1993) proposed a set of four time-domain features that are widely used and provide promising results [8]. Zia ur Rehman et al. (2018) used these features with different machine learning (ML) algorithms and obtained an accuracy of 90% [13]. Ortiz-Catalan et al. (2015) claimed that cardinality is one of the most distinguishable features for the classification of EMG signals [3]. In addition to feature selection, several studies have investigated the effect of different classifiers and analyzed their performance [14]-[19]. One study investigated the performance of artificial neural networks (ANN), linear discriminant analysis (LDA), K-nearest neighbor (KNN), support vector machine (SVM), decision tree and naïve Bayes for MEC in hand gesture applications [20]. That study suggested that ANN has the highest CA among the ML-based classifiers for myoelectric control systems. On the other hand, another study on various ML-based classifiers for PR-based MEC achieved the highest accuracy using SVM among non-linear logistic regression (NLR), multi-layer perceptron (MLP), and LDA [21]. Phinyomark et al. (2013) compared several ML classifiers and found that LDA outperforms other classifiers [22]. LDA is also used in a commercial prosthetic control system [23]. Phinyomark et al. (2018) also investigated the effect of sampling frequency on the performance of MEC and suggested that a sampling frequency lower than 1000Hz reduces the CA significantly [24]. The segmentation of EMG signals affects the performance of PR-based MEC as well.
Researchers have used both disjoint and overlap segmentation (segmentation types) with segment sizes of up to 500ms and different incremental periods (overlap/step size), on datasets recorded at various sampling frequencies. One study used an overlap segmentation of 400ms with an increment/step size of 10ms to evaluate ML classifiers for sEMG-based hand movement classification [25]. Alkan et al. (2012) used a disjoint segmentation of 60 ms for the classification of EMG signals [26]. Fougner et al. (2014) reported that overlap segmentation is better than disjoint segmentation in terms of productive use of the processor for real-time myoelectric control [27], because disjoint segmentation leaves the processor idle most of the time, so it is not used to its full capacity. Englehart et al. (2001) suggested that a segment size of less than 300ms should be used for a real-time prosthetic control system; otherwise, the processor would not be able to process the information before the next input to the system [28]. Such a system, irrespective of its accuracy, is of no use to the user in real-time myoelectric control. Both Englehart et al. (2001) and Fougner et al. (2014) also suggested that CA increases with a longer segment size. Oskoei et al. (2008) conducted a study on a single dataset recorded at a sampling frequency of 1000Hz to examine the effect of different segment sizes and segmentation types for MEC based on SVM, and suggested that longer segments provide better results in the offline classification of EMG signals [14].
Although a good amount of work has been done to improve PR-based MEC, due to the stochastic and non-stationary nature of EMG signals there is still no consensus on which ML-based classifier (between LDA and SVM) provides better results for the classification of EMG signals. Nor is it clear whether a particular segmentation technique (overlap or disjoint), with generalized optimum segment and step sizes, is better in terms of CA irrespective of the sampling frequency of the dataset. To the best of our knowledge, no study has investigated the effect of segmentation type (overlap and disjoint), segment size and step size on the performance of MEC, and which ML classifier is best suited for the classification of sEMG signals, on multiple datasets recorded at different sampling frequencies. Therefore, the aim of this study is to explore the effect of segmentation type, segment size and step size in PR-based MEC over multiple datasets with different sampling frequencies. These parameters are explored by employing two widely used ML classifiers for MEC: SVM and LDA.
The rest of the paper is organized as follows: Section 2 gives details about subjects, data collection, experimental procedure and methodology adopted for PR based MEC. In section 3, all experimental results have been presented along with statistical analysis. Section 4 presents the summary and discussion of results and section 5 presents the overall conclusion.

A. DATASETS
Surface EMG data from 30 subjects, spread across 3 datasets, was used in the study. Dataset-1 was recorded for this study and has not been used in any previous study, whereas the other two datasets were pre-recorded and have been used in other studies. All data was recorded in accordance with approval from the local ethical committee of the National University of Science and Technology (approval no.: ref# NUST/SMME-BME/REC/000129/20012019). Written consent was obtained from all subjects before the experimental procedure. In dataset-1, 10 able-bodied subjects (6 males and 4 females, aged 23-28 years) participated in the experiment. They had no prior history of upper extremity disorder or musculoskeletal disease. The surface EMG data was recorded using a commercially available wearable MYO armband EMG sensor (MYB), developed by Thalmic Labs [29]. The MYB was worn on the right hand of participants such that it covered the flexor digitorum superficialis, extensor carpi radialis, palmaris longus, extensor digitorum, flexor carpi radialis, and extensor carpi ulnaris muscles. The experiments were conducted using the publicly available EMG platform BioPatRec, and the procedure was designed so that each participant performed eleven hand motions [30]. Each movement was shown to participants using the BioPatRec graphical user interface before recording. The following hand motions were performed in the experimental procedure: Close Hand (CH), Open Hand (OH), Flex Hand (FH), Extend Hand (EH), Pronation (Pro), Supination (SUP), Side Grip (SG), Fine Grip (FG), Agree (AGR), Pointer (POI) and a rest state (REST) or no-motion state. The data was recorded in a single session from each subject. Each movement lasted for 10 seconds, with a contraction period of 6 seconds and a relaxation period of 4 seconds.
Dataset-2 comprises 10 subjects, has a sampling frequency of 2000Hz and was taken from [4]. It contains 8 hand motions: Close Hand (CH), Open Hand (OH), Flex Hand (FH), Extend Hand (EH), Pronation (Pro), Supination (SUP), Side Grip (SG) and a rest state (REST) or no-motion state.
Dataset-3 was also pre-recorded and had been used in [20]. It was recorded using a commercially available myoelectric amplifier (AnEMG12, OT Bioelettronica, Torino, Italy). The dataset comprises 10 subjects with 11 hand motions: Close Hand (CH), Open Hand (OH), Flex Hand (FH), Extend Hand (EH), Pronation (Pro), Supination (SUP), Side Grip (SG), Fine Grip (FG), Agree (AGR), Pointer (POI) and a rest state (REST) or no-motion state. An analog bandpass filter with cut-off frequencies of 10Hz and 500Hz was applied to the EMG signals during recording. More details about the three datasets used in this study can be found in Table 1.

B. PREPROCESSING
The spectral range of the EMG signal is from 10 Hz to 500 Hz. During EMG signal recording, noise (line interference and motion artifacts) is mixed into the original signal. Thus, it is necessary to pre-process the EMG signal before analysis. All three datasets were pre-processed using a notch filter to reduce electrical interference. Additionally, a fourth-order digital Butterworth high-pass filter was applied to dataset-1 to reduce motion artifacts. Dataset-2 and dataset-3 were pre-processed using a fourth-order Butterworth bandpass filter to minimize the effect of motion artifacts. From every contraction period, to avoid non-stationarity, one second was reserved for both the onset and offset phase. Also, the relaxation period was eliminated from every EMG signal before further processing.
FIGURE 2. (a) Disjoint segmentation, in which the pre-processed EMG signal is segmented into disjoint segments of equal length. (b) Overlap segmentation, which is characterized by a segment and an overlap; the next segment contains some portion (overlap) of the previous segment.

C. SEGMENTATION
For some biomedical signals, such as ECG, each peak reveals information about the original signal, so these signals are segmented according to their shape. In the case of EMG signals, a single peak does not reveal the information required for PR-based MEC. Also, because EMG signals are non-stationary, i.e. their statistical properties change over time, they are analyzed over time segments. A single segment represents a sequence of data in a specific time slot and is used to estimate the features and characteristics of the complete signal. The longer the segment, the more information it contains about the original signal, but longer segments impose a computational load on real-time PR-based MEC. Thus, there is a trade-off between computational load and the representation accuracy of a segment. Also, a shorter segment is prone to variance and bias in feature extraction and is more sensitive to noise. A segment of less than 200ms does not contain enough information to represent the original signal [30], [31]. For both offline and real-time MEC, a segment size greater than 200ms is essential for an accurate representation of the original signal, whereas the real-time constraint requires the segment size to stay below 300ms for smooth real-time operation. Figure 2 illustrates the difference between the two types of segmentation techniques. A disjoint segment is characterized by its segment length, whereas an overlap segment is characterized by its segment length and a step size (increment/overlap). The step size or overlap is the time difference between two consecutive segments; it should always be less than the segment length and greater than the processing time of the MEC.
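The two segmentation types described above can be sketched in a few lines. This is a minimal illustration, not the implementation used in the study: the function name `segment_signal`, its parameters and the 1000 Hz example signal are assumptions made here. Disjoint segmentation is treated as the special case in which the step size equals the segment length.

```python
from typing import List, Sequence


def segment_signal(signal: Sequence[float], fs_hz: float,
                   segment_ms: float, step_ms: float) -> List[Sequence[float]]:
    """Split a 1-D signal into fixed-length segments.

    step_ms == segment_ms gives disjoint segmentation;
    step_ms < segment_ms gives overlap segmentation.
    """
    seg_len = int(round(segment_ms * fs_hz / 1000.0))  # samples per segment
    step = int(round(step_ms * fs_hz / 1000.0))        # samples between segment starts
    if seg_len < 1 or step < 1:
        raise ValueError("segment and step must each span at least one sample")
    if step > seg_len:
        raise ValueError("step size must not exceed the segment length")

    segments = []
    start = 0
    # Keep only complete segments; a trailing partial segment is dropped.
    while start + seg_len <= len(signal):
        segments.append(signal[start:start + seg_len])
        start += step
    return segments


# Hypothetical example: 1 s of samples at an assumed 1000 Hz sampling rate.
signal = list(range(1000))
disjoint = segment_signal(signal, 1000, segment_ms=250, step_ms=250)
overlap = segment_signal(signal, 1000, segment_ms=250, step_ms=50)
print(len(disjoint), len(overlap))
```

With the same segment length, overlap segmentation yields more segments per second of signal, so decisions are produced more often than with disjoint segmentation.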
As larger segments provide better results in PR-based MEC, in order to meet real-time constraint and smooth operation of MEC, overlap segmentation helps to employ  [3], [9]. These features include the mean absolute value (MAV), waveform length (WL), slope sign change (SSC), zero crossings (ZC) and cardinality (CARD). In the present study, we have also used these features to train and test our PR-based MEC system. Principal component analysis (PCA) was used to reduce the dimensionality of the data and thereby prevent overfitting, as high-dimensional data causes the model to overfit, which results in classification error. Figure 3 shows the scatter plot of the SSC and MAV features for the extensor digitorum muscle corresponding to all hand motions. For illustration purposes, only the feature plot of SSC vs. MAV for subject-1 from dataset-1 is shown.
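As a rough sketch of how the five features named above can be computed for one segment of one channel, the following uses common textbook definitions. The function name, the `threshold` parameter (set to zero here) and the example values are assumptions for illustration; the thresholds and exact definitions used in the study are not stated in this section.

```python
from typing import Dict, Sequence


def time_domain_features(seg: Sequence[float], threshold: float = 0.0) -> Dict[str, float]:
    """Compute MAV, WL, ZC, SSC and CARD for a single-channel segment."""
    n = len(seg)
    if n < 3:
        raise ValueError("segment needs at least three samples")

    # Mean absolute value: average rectified amplitude.
    mav = sum(abs(x) for x in seg) / n
    # Waveform length: cumulative absolute difference between neighbours.
    wl = sum(abs(seg[i + 1] - seg[i]) for i in range(n - 1))
    # Zero crossings: sign changes whose amplitude step exceeds the threshold.
    zc = sum(1 for i in range(n - 1)
             if seg[i] * seg[i + 1] < 0 and abs(seg[i] - seg[i + 1]) > threshold)
    # Slope sign changes: local extrema whose step exceeds the threshold.
    ssc = sum(1 for i in range(1, n - 1)
              if (seg[i] - seg[i - 1]) * (seg[i] - seg[i + 1]) > 0
              and (abs(seg[i] - seg[i - 1]) > threshold
                   or abs(seg[i] - seg[i + 1]) > threshold))
    # Cardinality: number of distinct sample values in the segment
    # (most meaningful on quantized, ADC-level data).
    card = len(set(seg))
    return {"MAV": mav, "WL": wl, "ZC": zc, "SSC": ssc, "CARD": card}


# Hypothetical six-sample segment, chosen only to make the counts easy to check.
feats = time_domain_features([1, -2, 3, -1, 2, 2])
print(feats)
```

In a multi-channel recording the same function would be applied to each channel of each segment, and the per-channel values concatenated into one feature vector.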

E. CLASSIFICATION
VOLUME 8, 2020
In the present study, we used both LDA and SVM to investigate the effects of segmentation type and segment size on classification accuracy. For SVM, a linear kernel with a box constraint of 1 was used to train the classifier. The data was divided into three parts: training, validation, and testing. The data was randomized, and 70% was reserved for training, 20% for validation and 10% for testing. All results are reported on the testing data. MATLAB 2018 was used for analysing the data. Figure 4 depicts the decision boundaries created by LDA and SVM, respectively, to classify the open hand, close hand and supination motions.
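The randomized 70/20/10 division can be illustrated as follows. This is only a sketch of the split itself, written in Python rather than the MATLAB used in the study; the function name and the fixed `seed` are assumptions added for reproducibility of the example.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_70_20_10(samples: Sequence[T], seed: int = 0) -> Tuple[List[T], List[T], List[T]]:
    """Shuffle the samples and split them into train/validation/test parts."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n = len(shuffled)
    n_train = (7 * n) // 10                # integer arithmetic avoids float rounding
    n_val = (2 * n) // 10
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]      # remainder, about 10%
    return train, val, test


# Hypothetical example: 100 feature vectors represented by their indices.
train, val, test = split_70_20_10(range(100))
print(len(train), len(val), len(test))
```

Note that with overlap segmentation, neighbouring segments share samples, so a fully random split places partially overlapping segments in different parts; whether and how the study accounted for this is not described here.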

F. PERFORMANCE VALIDATION
FIGURE 5. Classification accuracy vs. segment size for all datasets and both classifiers (LDA and SVM) using the disjoint segmentation technique. The upper graph depicts the relationship between classification accuracy and segment size for LDA, and the lower graph shows the same relationship for SVM.
For offline PR-based MEC, various performance metrics have been proposed to assess the performance of the classifier. In this study, classification accuracy (CA) has been used to evaluate and compare the performance of both classifiers. CA is the percentage of correctly classified instances out of the total number of instances. The results are presented as mean classification accuracies, obtained by averaging the results over subjects across all classes. A two-way analysis of variance (ANOVA) is performed, followed by Tukey's honest significant difference post hoc test. A P-value of less than 0.05 was considered significant for the evaluation of the results.
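The definitions of CA and MCA given in this subsection amount to the following small computation. The function names and the example labels are illustrative assumptions, not taken from the study's code.

```python
from typing import Sequence


def classification_accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Percentage of instances whose predicted label matches the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label sequences must be non-empty and of equal length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return 100.0 * correct / len(y_true)


def mean_classification_accuracy(per_subject_ca: Sequence[float]) -> float:
    """Average the per-subject accuracies into a single MCA value."""
    return sum(per_subject_ca) / len(per_subject_ca)


# Hypothetical predictions for two subjects (labels follow the motion codes above).
ca_s1 = classification_accuracy(["CH", "OH", "REST", "FH"], ["CH", "OH", "REST", "EH"])
ca_s2 = classification_accuracy(["CH", "OH", "REST", "FH"], ["CH", "OH", "REST", "FH"])
mca = mean_classification_accuracy([ca_s1, ca_s2])
print(ca_s1, ca_s2, mca)
```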

III. RESULTS
For all three datasets, both classifiers were trained corresponding to each subject for each segmentation technique and each segment size. The results are presented as mean classification accuracy (MCA), by averaging CA of individual subjects.

A. EFFECT OF SEGMENT SIZE
To find the optimum segment sizes, the classifiers (LDA and SVM) were trained with both segmentation techniques (overlap and disjoint). The segment sizes investigated range from 50 to 450 milliseconds in increments of 25 milliseconds, for both disjoint and overlap segmentation. Further, for overlap segmentation, each segment size was analysed with the 7 step sizes mentioned above. Figures 5 and 6 illustrate the effect of segment size by reporting the MCA across the three datasets for disjoint and overlap segmentation, respectively. As shown, the MCA generally increases with larger segment sizes for both types of segmentation.
For LDA with disjoint segmentation on dataset-1, a segment size of 375ms gave the statistically best MCA (90%), while the minimum MCA of 67% was observed at a segment size of 50ms. Two-way ANOVA revealed no statistically significant difference between the MCA at 375ms and the MCAs at 325ms (P-value = 0.63), 400ms (P-value = 0.91), 425ms (P-value = 0.95) and 450ms (P-value = 1). However, the segment size of 375ms differed significantly (P-value < 0.05) from all other segment sizes. Similarly, for LDA with disjoint segmentation on dataset-2, segment sizes of 375ms and 50ms gave the highest and lowest MCAs of 99% and 89%, respectively. The MCA at 375ms was significantly different from the MCAs at 50ms (P-value = 0) and 75ms (P-value = 0). Beyond that, there was no significant difference between the MCAs of the different segment sizes (P-values > 0.05). On dataset-3, LDA with disjoint segmentation achieved the highest and lowest MCAs of 92% and 81% at segment sizes of 300ms and 50ms, respectively. The segment size of 300ms has the statistically highest MCA among all segment sizes and is significantly different from the MCAs at 50ms (P-value = 0), 75ms (P-value = 0) and 100ms (P-value = 0). Apart from these, there was no statistically significant difference between the MCA at 300ms and the others (P-value > 0.05). For LDA with overlap segmentation on dataset-1, segment sizes of 450ms and 50ms gave the highest and lowest MCAs of 89% and 68%, respectively. Also, a segment size of 450ms was statistically superior in terms of MCA among all other segment sizes there was no statistically significant difference among MCAs of segment
FIGURE 6. Classification accuracy vs. segment size for all datasets and both classifiers (LDA and SVM) using the overlap segmentation technique. The upper graph depicts the relationship between classification accuracy and segment size for LDA, and the lower graph shows the same relationship for SVM. In contrast with the disjoint segmentation technique, the trend lines for all three datasets on both classifiers are very smooth.
For SVM with disjoint segmentation on dataset-1, the highest and lowest MCAs of 78% and 48% were recorded at segment sizes of 400ms and 50ms, respectively. The MCA at 400ms was significantly different from the MCAs at 50ms (P-value = 0), 75ms (P-value = 0), 100ms (P-value = 0), 125ms (P-value = 0), 150ms (P-value = 0), 175ms (P-value = 0) and 200ms (P-value = 0). There was no significant difference between the MCA at 400ms and the MCAs of the remaining segment sizes. On dataset-2, segment sizes of 275ms and 50ms achieved the maximum and minimum MCAs of 92% and 78%, respectively. The MCAs at 50ms (P-value = 0) and 75ms (P-value = 0) were significantly different from the MCA at 275ms. Apart from these, there was no significant difference between the MCA at 275ms and the others (P-value > 0.05). On dataset-3, segment sizes of 450ms and 50ms achieved the maximum and minimum MCAs of 82% and 68%, respectively. The MCA at 450ms was significantly different from the MCAs at 50ms (P-value = 0) and 75ms (P-value = 0.01). All other segment sizes have MCAs that do not differ significantly from the MCA at 450ms.
Dataset-2 and dataset-3 were pre-recorded and had been used in [4] and [20], respectively. In [4], dataset-2 was used to investigate the effect of threshold values on various combinations of sEMG time-domain features. For that purpose, the author used 7 time-domain features, i.e. MAV, WL, ZC, CARD, SSC, WAMP and MYOP. The features were extracted using the overlap segmentation technique with a segment size of 250ms and a step size of 25ms. Using all these features combined, LDA achieved a CA of 98.18%. In comparison, in this study LDA achieved an accuracy of 96.87% with the same segment and step size. Similarly, dataset-3 was used for multiday evaluation of techniques for EMG-based classification of hand motions. The features MAV, WL, ZC, CARD, SSC, WAMP and MYOP were extracted using the overlap segmentation technique with a segment size of 160ms and a step size of 35ms. LDA and SVM achieved accuracies of 95.41% and 90.05%, respectively, whereas in this study, with the same segment and step size, LDA and SVM achieved accuracies of 89.44% and 82.87%. The difference in CA between the current and previous studies is attributed to the number of features investigated: the previous studies used 7 features, whereas this study used 5 features to train and test the classifier.

C. EFFECT OF STEP SIZES
To investigate the effect of overlap size, each segment size was combined with the 7 overlap/step sizes mentioned above to train and test both classifiers with overlap segmentation on all datasets. The results indicated that CA decreases as the step size increases, for all datasets and both classifiers.
For LDA, an insignificant drop in CA of 0.53%, 1.1% and 0.37% was observed for dataset-1, dataset-2 and dataset-3, respectively, when the step size was increased from 20% to 80%. Similarly, for SVM, a drop in CA of 4.2%, 4.05% and 7.4% was observed, respectively, when the overlap was increased from 20% to 80%. For both classifiers, the maximum CA was observed at an overlap size of 20%, and CA decreased with increasing step size. The trend lines of all datasets in Figure 7 show that the drop in CA is significant for SVM.

D. DISJOINT VS. OVERLAP SEGMENTATION
All three datasets were segmented using both disjoint and overlap segmentation, and results were computed for both segmentation types. Figure 8 represents the mean classification accuracies of all three datasets across disjoint and overlap segmentation for both classifiers. Two-way ANOVA reveals that there is no significant difference between the MCA of disjoint and overlap segmentation for LDA on all three datasets (difference of 1.8% in MCA, P-value = 0.14). On the other hand, the MCA of the two types of segmentation differs significantly in the case of SVM (difference of 11.17% in MCA, P-value < 0.05).

E. LDA VS. SVM
The results of individual subjects have been averaged across all datasets to get mean CA (MCA) across both disjoint and overlap segmentation for both classifiers and are presented in figure 9. Two-way ANOVA revealed that, while using disjoint segmentation, MCAs of both classifiers across all three datasets differ significantly (P-value = 0). LDA outperformed SVM with the difference in MCA of 10.69%. Contrarily, on overlap segmentation, there is no statistically significant difference between MCAs of both classifiers (P-value = 0.44). The difference between MCAs of both classifiers on overlap segmentation against all datasets is 0.7%.

F. F1 SCORE
As mentioned in the introduction, besides classification accuracy there are other metrics to measure the performance of a classifier. Precision is calculated by dividing the number of true positive instances by the sum of true positive and false positive instances, whereas recall is calculated by dividing the number of true positive instances by the sum of true positive and false negative instances. The harmonic mean of precision and recall yields the F1 score. In this study, the F1 score is also used to evaluate the performance of both classifiers over all 3 datasets. Figures 10 and 11 illustrate the effect of segment size by reporting the F1 score across the three datasets for disjoint and overlap segmentation, respectively. As shown, the F1 score generally increases with larger segment sizes for both types of segmentation. From Figures 5, 6, 10 and 11 it can be observed that segment size has the same effect on both performance metrics, as the resulting curves are identical.
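The precision, recall and F1 definitions above can be written out per class as below. This is a minimal sketch with hypothetical labels; the function name is an assumption, and how per-class scores were averaged across motions in the study is not stated here.

```python
from typing import Dict, Sequence, Tuple


def per_class_prf(y_true: Sequence[str], y_pred: Sequence[str]) -> Dict[str, Tuple[float, float, float]]:
    """Return (precision, recall, F1) for every class, treated one-vs-rest."""
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        # F1 is the harmonic mean of precision and recall.
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        scores[label] = (precision, recall, f1)
    return scores


# Hypothetical test labels for three motions.
y_true = ["CH", "CH", "OH", "OH", "REST", "REST"]
y_pred = ["CH", "OH", "OH", "OH", "REST", "CH"]
scores = per_class_prf(y_true, y_pred)
for label, (p, r, f1) in scores.items():
    print(label, round(p, 3), round(r, 3), round(f1, 3))
```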

G. MOTION-WISE CLASSIFICATION ACCURACY
FIGURE 10. F1 score vs. segment size for all datasets and both classifiers (LDA and SVM) using the disjoint segmentation technique. The upper graph depicts the relationship between F1 score and segment size for LDA, and the lower graph shows the same relationship for SVM.
FIGURE 11. F1 score vs. segment size for all datasets and both classifiers (LDA and SVM) using the overlap segmentation technique. The upper graph depicts the relationship between F1 score and segment size for LDA, and the lower graph shows the same relationship for SVM. In contrast with the disjoint segmentation technique, the trend lines for all three datasets on both classifiers are very smooth.
To see the effect of variability in the type and size of the segment on classification, the per-motion CA for both classifiers is reported here. Figure 12 represents the worst, average and best mean classification accuracies averaged over 10 subjects of individual motions across dataset-1 for LDA in both segmentation techniques. For both LDA and SVM, with both disjoint and overlap segmentation, the worst classification accuracies were observed at a segment size of 50ms. For LDA, the best MCA was observed at a segment size of 375ms with disjoint segmentation and at 450ms with overlap segmentation. For SVM, the best MCA was observed at a segment size of 400ms with disjoint segmentation and at 425ms with overlap segmentation. Similarly, the average MCA for LDA was observed at 175ms with disjoint segmentation and at 200ms with overlap segmentation. For SVM, the average MCA for both disjoint and overlap segmentation was observed at a segment size of 175ms. Figures 12 and 13 represent the worst, average and best motion-wise MCA of individual motions for LDA and SVM with both segmentation techniques. Figure 12 depicts the motion-wise CA of individual motions across all three datasets for LDA using disjoint segmentation at the worst, average and best MCA corresponding to their segment size. Using LDA with disjoint segmentation, the worst MCA was observed at the segment size of 50ms. From Figure 12 it can be seen that, out of eleven hand motions, "rest or no-motion" achieved the highest MCA of 98.1%, followed by flex hand (85.1%), extend hand (76%) and close hand (73.9%), in contrast with fine grip, which achieved the lowest MCA of 51.5%. One-way ANOVA revealed that the MCA of the rest motion was significantly different from the MCA of all the other motions (P-value = 0). There was no significant difference among the MCAs of flex hand, extend hand (P-value = 0.15) and close hand (P-value = 0.42), in contrast to all other hand motions (P-value < 0.05). In the case of average MCA results for LDA with disjoint segmentation, rest and fine grip achieved the highest and lowest MCAs of 97.3% and 67.3%, respectively. Open hand, close hand, extend hand, flex hand, agree and rest motions have statistically higher MCA and are significantly different from other motions (P-value > 0.05).
Best MCA for LDA with disjoint segmentation was recorded at segment size of 375ms; flex hand and rest were the motions with the highest MCA of 98.3% in contrast to fine grip, which had the lowest MCA of 70.8%. Seven hand motions, out of eleven, including open hand, close hand, flex hand, extend hand, pronation, agree and rest have statistically higher and significantly different MCAs as compared to the remaining hand motions (P-value > 0.05).
For LDA with overlap segmentation, the worst MCA was recorded at a segment size of 50ms. Rest and fine grip were the motions with the highest and lowest MCAs of 98.2% and 53.2%, respectively. Close hand, flex hand, extend hand and rest motions have statistically higher and significantly different MCA as compared to others (P-value > 0.05). The average MCA was recorded at a segment size of 175ms with an overlap of 25ms. Motion-wise highest and lowest MCAs of 99.1% and 71.1% were observed for rest and fine grip, respectively. Open hand, close hand, flex hand, extend hand, agree and rest are the motions with higher MCA and are significantly different from the remaining motions (P-value > 0.05). LDA with overlap segmentation generated its best results at the segment size of 450ms with an overlap of 25ms; the highest and lowest MCAs of 99.2% and 80.4% were recorded for rest and fine grip. All hand motions except fine grip and pointer were statistically higher and significantly different in terms of MCA (P-value < 0.05).
For SVM with overlap and disjoint segmentation, the worst results were recorded at a segment size of 50ms: maximum and minimum MCAs of 92.2% and 61.5% for disjoint and 89.9% and 59.4% for overlap segmentation, corresponding to flex hand and fine grip, respectively. Besides that, close hand, flex hand, extend hand and rest were the motions with statistically higher and significantly different MCA with respect to the others (P-value < 0.05). For SVM with disjoint segmentation, the average MCA was reported at a segment size of 175ms. Maximum and minimum MCAs of 98.7% and 67.7% were recorded for close hand and fine grip, respectively. Except for fine grip, all other hand motions achieved an MCA greater than 84% and are significantly different. For overlap segmentation, maximum and minimum MCAs of 99.1% and 70.2% were recorded for flex hand and supination, respectively. Close hand, flex hand, extend hand, agree and rest were the hand motions with statistically higher and significantly different MCA (P-value > 0.05). The best MCA for disjoint segmentation was reported at a segment size of 400ms. Motion-wise highest and lowest MCAs of 100% and 85.6% were reported for flex hand and fine grip, respectively. Supination, fine grip, and pointer were the hand motions with statistically lower and significantly different MCAs (P-value > 0.05). In the case of overlap segmentation, maximum and minimum MCAs of 99.5% and 75.7% were reported for flex hand and fine grip, respectively. Supination, side grip, fine grip, and pointer were the motions with statistically lower and significantly different MCA with respect to the others (P-value > 0.05).

IV. DISCUSSION
The aim of the study was to determine whether there are optimum segment size limits for the segmentation of sEMG signals irrespective of the sampling frequency of the recorded data. For this purpose, both classifiers were trained and tested on all three datasets, for both types of segmentation, at various segment sizes. It was observed that CA increases with segment size, but only up to a point. For disjoint segmentation, CA increases significantly when the segment size is increased from 50ms to 250ms (P-value < 0.05) for both classifiers, but no significant difference in mean CA was observed when the segment size was increased from 250ms to 450ms for either LDA (P-value = 0.44) or SVM (P-value = 0.99). Increasing the segment size beyond 325ms does not improve classification results; it only increases the computational load and exceeds the delay time limit of a real-time MEC. For a robust MEC, the best classification accuracies can be achieved by setting a segment size between 250ms and 300ms for disjoint segmentation, without affecting delay time. For LDA with overlap segmentation, no significant difference was observed between the MCA at segment sizes of 250ms and 450ms (P-value = 0.08). Similarly, for SVM with overlap segmentation, no statistically significant difference was observed between the MCA at segment sizes of 275ms and 450ms (P-value = 0.12). As a real-time MEC imposes the constraint of keeping the segment size below 300ms, the best CA can be achieved by using overlap segmentation with a segment size of less than 300ms.
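Because segment sizes are specified in milliseconds, the same setting maps to a different number of samples at each sampling frequency. The following is a minimal sketch of disjoint and overlap segmentation under that convention; the function name `segment_signal`, its parameters, the 1000 Hz sampling rate and the zero-valued signal are illustrative assumptions rather than the implementation used in the study.

```python
def segment_signal(signal, fs_hz, segment_ms, step_ms=None):
    """Split a 1-D sEMG channel into fixed-length segments.

    step_ms=None gives disjoint segmentation (the step equals the segment
    length); a step shorter than the segment gives overlapping segments.
    """
    seg_len = int(round(fs_hz * segment_ms / 1000.0))
    step = seg_len if step_ms is None else int(round(fs_hz * step_ms / 1000.0))
    if seg_len < 1 or step < 1:
        raise ValueError("segment and step must span at least one sample")
    last_start = len(signal) - seg_len
    return [signal[start:start + seg_len]
            for start in range(0, last_start + 1, step)]


# Hypothetical 2 s recording sampled at 1000 Hz (placeholder values).
fs_hz = 1000
signal = [0.0] * (2 * fs_hz)

disjoint = segment_signal(signal, fs_hz, segment_ms=250)
overlap = segment_signal(signal, fs_hz, segment_ms=250, step_ms=50)
print(len(disjoint), len(overlap))  # prints: 8 36
```

For the same recording, overlap segmentation yields many more segments, and therefore more classifier decisions, than disjoint segmentation at an identical segment size.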
For overlap segmentation, the effect of overlap size on CA was also investigated and is illustrated in Figure 7. All datasets, with both classifiers, exhibit a decrease in CA as the overlap size increases. Two-way ANOVA revealed that, for LDA on all datasets, there is no significant difference in MCA across overlap sizes (P-value > 0.05); the MCA decreases from 89.38% to 88.85% when the overlap size is increased from 20% to 80% of the segment size. For SVM, on all datasets, the MCA decreases from 83.65% to 78.43% when the overlap size is increased from 20% to 80%. The MCA does not differ significantly when the overlap size is increased from 20% to 50% (P-value > 0.05), but it differs significantly when the overlap size is increased further to 80% (P-value < 0.05). As the overlap size should always be greater than the processing time of a real-time MEC, the overlap size should be chosen accordingly. Larger overlap sizes not only decrease CA but also delay the response time of a real-time MEC.
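The trade-off described above can be sketched as a simple selection rule: since CA falls as the overlap size grows, one would pick the smallest overlap size that still exceeds the processing time. The snippet below follows that reading literally; the function name `smallest_feasible_overlap`, the candidate percentages and the 60ms processing time are assumptions made only for illustration.

```python
def smallest_feasible_overlap(segment_ms, candidate_percents, processing_ms):
    """Return (percent, overlap_ms) for the smallest overlap size that is
    greater than the processing time, or None if no candidate qualifies."""
    for percent in sorted(candidate_percents):
        overlap_ms = segment_ms * percent / 100.0
        if overlap_ms > processing_ms:
            return percent, overlap_ms
    return None


# Hypothetical setup: 250 ms segments and an assumed 60 ms processing time.
candidates = [20, 30, 40, 50, 60, 70, 80]
choice = smallest_feasible_overlap(250, candidates, processing_ms=60)
print(choice)  # prints: (30, 75.0)
```

Here the 20% setting (50ms) is rejected because it is shorter than the assumed processing time, so the next candidate is selected.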
To investigate which segmentation type is better in terms of CA, the data was segmented with both segmentation techniques using the same segment sizes. Both SVM and LDA were trained and tested on each segment size for both segmentation techniques, and the results are presented in Section III. In all three datasets, for both SVM and LDA, overlap segmentation performs better in terms of MCA than disjoint segmentation. A significant difference of 11.2% between the MCA of the two segmentation types was observed in the case of SVM (P-value < 0.05). In contrast, there is no significant difference between the MCA of the two segmentation techniques for LDA (P-value = 0.15). From Figure 6 it can be observed that the MCA of overlap segmentation increases continuously, smoothly and linearly. The MCA of disjoint segmentation also increases with segment size, but not linearly.
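A paired comparison of this kind can be illustrated with a simple exact test. The sketch below uses a sign-flip permutation test on paired accuracies, which is a different technique from the ANOVA reported in this study and is shown only because it needs no external libraries; the function name `paired_sign_flip_test` and all accuracy values are invented placeholders, not results from the paper.

```python
from itertools import product


def paired_sign_flip_test(a, b):
    """Two-sided exact sign-flip permutation test on paired differences.

    Returns the fraction of sign assignments whose absolute summed
    difference is at least as large as the observed one.
    """
    diffs = [x - y for x, y in zip(a, b)]
    observed = abs(sum(diffs))
    extreme = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        statistic = abs(sum(s * d for s, d in zip(signs, diffs)))
        if statistic >= observed - 1e-12:  # tolerance for float rounding
            extreme += 1
    return extreme / total


# Placeholder accuracies (%) at six segment sizes, paired by segment size.
overlap_acc = [80.0, 84.0, 87.0, 89.0, 90.0, 90.5]
disjoint_acc = [70.0, 74.0, 76.0, 78.0, 79.0, 79.5]

p_value = paired_sign_flip_test(overlap_acc, disjoint_acc)
print(p_value)  # prints: 0.03125
```

With six pairs that all favor one technique, only the all-positive and all-negative sign assignments are as extreme as the observed one, giving 2/64 and hence a P-value below 0.05.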
A further aim was to investigate which ML classifier, LDA or SVM, is better suited in terms of CA for the classification of sEMG data. For this purpose, both classifiers were trained and tested using the same parameters and topologies on three different datasets, for both disjoint and overlap segmentation of various sizes. With overlap segmentation, no statistically significant difference in MCA between the two classifiers was observed (P-value = 0.45). In contrast, with disjoint segmentation, LDA outperforms SVM significantly (P-value < 0.05). As reported in Section I, multiple factors affect EMG signals, which makes them stochastic and non-stationary in nature. Therefore, EMG data recorded in certain conditions can differ in some properties from data recorded in other conditions.
The effect of segmentation type and segment size on the MCA of individual motions was also investigated in this study. Results have shown that, even at the segment size with the worst MCA for disjoint and overlap segmentation with both classifiers, flex hand, extend hand, close hand and rest are the distinguishable hand motions. At the segment size with average MCA, for both LDA and SVM and both segmentation types, open hand, close hand, extend hand, flex hand, agree and rest are the most distinguishable hand motions. At the segment size with the best MCA, seven hand motions (open hand, close hand, flex hand, extend hand, pronation, agree and rest) were the most distinguishable. From Figure 10 and Figure 11, it is evident that the classification performance of hand motions depends on the segment size. If the hand motions that a MEC system has to perform are known, the segment size can be chosen accordingly to provide the best possible classification results. Customized PR-based MEC systems can also be designed for specific motions, based on the optimum segment sizes.
The presented results can help in choosing the segmentation technique, segment size and step size for any MEC application, such as rehabilitation devices, brain-computer interface devices or robotic manipulators, such that the device can respond to the input signal within 300ms for smooth real-time operation, since studies have shown that keeping the segment size greater than 300ms yields unstable functionality of the MEC system [28].

V. CONCLUSION
The study presented a comparison of two state-of-the-art ML classifiers (SVM and LDA) for the classification of sEMG signals, along with a comparison of two types of segmentation techniques and an investigation of the optimum segment and step sizes. The effect of segmentation type and segment size on individual motions was also investigated. For the generalization of the results, three datasets with different sampling frequencies were chosen for the study. The paper investigated the behavior of the two classifiers on multiple datasets across both segmentation types and various segment sizes. The results showed that the classification of sEMG signals is case dependent and that segmentation type, segment size and step size have no association with the sampling frequency of the data.
His research interests include control system design for power electronic converter and mechatronic systems. His additional interests include myoelectric control, system identification, and renewable energy systems for smart grid technologies. He is the author and coauthor of several IEEE publications in different journals and peer-reviewed conferences. He was a recipient of different awards and funding grants. He is an Associate Editor of IEEE ACCESS.
SYED HAMMAD NAZEER GILANI (Senior Member, IEEE) received the bachelor's and master's degrees in mechatronics engineering from the National University of Science and Technology (NUST). He is currently pursuing the Ph.D. degree in mechatronics engineering with Air University, Islamabad. He is currently working as Lecturer with Air University. He has published more than ten peer-reviewed research articles. His research interests include robotic rehabilitation, brain-computer interface, bio robotics, neurorobotics, artificial intelligence, and machine learning.
VOLUME 8, 2020