Machine Learning and Signal Processing Based Analysis of sEMG Signals for Daily Action Classification


A non-invasive mechanism for categorizing physical actions based on multiple signatures of sEMG signals.

Abstract:

Many individuals around the world have their way of living impaired by mental and physical disabilities associated with the movement of limbs. Assistive technologies and systems can enhance the quality of life of affected people. One way toward a solution is to translate physical movements into input for computer-assisted applications. Surface electromyography (sEMG) provides a non-invasive procedure that converts physical activities into signals, which can be classified and then used in such applications. In this study, we propose a machine learning scheme for the identification of 20 physical movements. The scheme draws on distinct characteristics from several signatures of an sEMG signal: time-domain features, frequency-domain features, and inter-channel statistics. We then perform a thorough comparative examination of the k-NN and SVM classifiers, considering groups of features for multiple normal and aggressive activities. The impact of different dimensionalities is recorded as well. The SVM classifier gives 100% accuracy for 10 normal actions, whereas 1-NN with a subgroup of features achieves 98.91% accuracy for 10 aggressive actions. Additionally, we combine SVM and 1-NN into a hybrid approach to classify all 20 physical actions; the hybrid classifier gives an accuracy of 98.97%. These findings help algorithm designers select the most suitable approach given the resources available for executing an algorithm.
Published in: IEEE Access ( Volume: 10)
Page(s): 40506 - 40516
Date of Publication: 12 April 2022
Electronic ISSN: 2169-3536

CCBY - IEEE is not the copyright holder of this material. Please follow the instructions via https://creativecommons.org/licenses/by/4.0/ to obtain full-text articles and stipulations in the API documentation.
SECTION I.

Introduction

Physical disabilities cause major issues in daily life. Several factors are responsible for these disabilities, including limb impairment or gait disorder with increasing age [1] and occupational hazards or traumas such as sports accidents, all of which drastically affect life. Another major cause of disability in adults is stroke [2]. Most affected people require prosthetic or partial limb support for assistance in day-to-day tasks. In addition, neurological disorders contribute to physical disability, either directly by hindering daily activities or through injuries caused by accidents [3]. Epilepsy is a common neurological disease, normally caused by the activity of nerve cells in the brain [4], that affects more than 50 million people worldwide [5]. Hence, there is a pressing need for a system that classifies physical signals, either to design prosthetic limbs or to give timely notification of an epileptic attack in order to prevent injuries.

In this respect, a feasible solution is to detect the intended movement and act accordingly. Surface electromyography (sEMG) is known to be the most accurate non-invasive technique for analyzing activities [6]. sEMG records the electrical activity generated when muscles move. Figure 1 shows a side-by-side comparison of sEMG signals observed during an aggressive activity (such as elbowing) and a normal one (such as clapping). The nature of EMG signals allows their use for analysis in several settings, e.g., the development of modern human-computer interaction and the identification of ailments of the muscular system, as well as clinical and biomedical applications [3], [7]. These signals are used to examine medical abnormalities [8], [9], to control prosthetic limbs [10], and for emotion detection [7]. Numerous approaches, including wavelet analysis, the Fourier transform, empirical mode decomposition (EMD), and filtering, have been suggested over time to inspect sEMG signals.

FIGURE 1. - Raw sEMG signals showing the dissimilarity between two activities (normal and aggressive).

This paper proposes an algorithm that classifies multiple physical activities based on sEMG signals. The proposed framework pre-processes a window of raw sEMG signals, primarily improving the variance between the signals of normal and aggressive activities. The pre-processed window is then passed to a feature extractor that computes well-known time-based and frequency-based sEMG features. Afterward, further characteristics for the classes are calculated from the correlation and covariance between the channels. The feature set is then fed to Support Vector Machine and K-Nearest Neighbor classifiers to determine the output class of the signal. Even with simple classifiers, the suggested technique performs better than previous techniques with complex classifiers. The remainder of the paper is organized as follows: Section II discusses the algorithms proposed by researchers for the recognition of physical activities using sEMG signals. The proposed framework is given in Section III, followed by the dataset description, experimentation, and results in Section IV. Finally, the discussion and conclusion of these results are detailed in Sections V and VI, respectively.

SECTION II.

Related Work

In the literature, research has focused on numerous aspects, including pre-processing, hand-crafted feature engineering, and classification. With regard to classification, sEMG signals have been used for tasks ranging from muscle movement recognition to action identification and abnormality detection.

Akhundov et al. evaluate the quality of sEMG signals by comparing five distinct classifiers [11]. They used both supervised and unsupervised artificial neural networks. The supervised classifiers include the Adaptive Neuro Fuzzy Inference System and the Probabilistic Neural Network, whereas a Convolutional Neural Network, Alex-Net, and ResNet50 were used as unsupervised classifiers. For the supervised learning algorithms, the mean absolute value, variance, root mean square, and power spectrum ratio were extracted using a discrete wavelet transform (DWT) feature extraction algorithm. For all three CNNs, they extracted the envelope of the EMG signal and transformed it into an image for further processing. The unsupervised artificial neural networks were reported to improve the classification accuracy by up to 98% relative to the supervised artificial neural networks. Duan et al. address gesture motion recognition, collecting EMG data for 10 different hand gestures using a Myo armband [12]. They introduced the concepts of multi-task learning and multi-class classification to enhance the generalization ability of motion detection systems. On comparing CNNs and SVMs, CNNs have 94.06% better classification accuracy than SVMs because they have better translational symmetry due to their weight-sharing properties. Therefore, a spectrogram image is computed from the sEMG signal and used as the time-frequency input image of the CNN.

In [13], Sezgin et al. describe the analysis of the sEMG signal based on the bispectrum, which belongs to the family of higher-order spectra. The binary-class sEMG data set (normal action or aggressive action) was taken from the UCI machine learning repository. First, they used a bispectrum to analyze the sEMG signal, and then the QPC (quadratic phase coupling) of each sEMG segment was calculated. The characteristics of the analyzed sEMG signal were then input to the learning machine. In the end, the sEMG signal was classified as belonging to either normal activity or aggressive activity. The performance comparison was based on ANN (artificial neural network), SVM (support vector machine), LR (logistic regression), LDA (linear discriminant analysis), and ELM (extreme learning machine) classifiers. The train-test split of the extreme learning machine was taken randomly as 50:50 from the extracted features of the sEMG signal data. The results show that ELM is more efficient and gives a higher classification accuracy of 99.75% compared to conventional learning machines. In [14], Mishra et al. demonstrated better performance with an improved EMD (Empirical Mode Decomposition) method, in which a median filter is applied to the traditional EMD method to remove impulse noise from the IMFs (intrinsic mode functions). The amplitude modulation bandwidth, the frequency modulation bandwidth, Fourier moments of the power spectral density, and the first derivative of the instantaneous frequency are the features extracted from the improved IMFs to distinguish ALS (Amyotrophic Lateral Sclerosis) from normal sEMG signals. In [15], Jana et al. discussed the differentiation of aggressive and normal activities based on an adaptive neuro-fuzzy inference system (ANFIS). In that paper, sEMG features were extracted using the discrete wavelet transform (DWT) algorithm. 
On the basis of DWT, they used DB-4 (Daubechies) wavelet of level 5, approximate time-series coefficients, etc. to decompose the sEMG signal. The approximate coefficients from the signals were assumed to enter the ANFIS module to classify the physical activities. They used the training testing ratio as 70:30. The accuracy of the ANFIS algorithm based on extracted features was found to be 98% for the binary classification problem. Alaskar et al. presented a novel approach in which three convolutional neural networks are evaluated using the two time-frequency representations [16]. The Spectrogram and Scalograms images are produced from surface EMG signals as the input dataset of CNNs. From the analysis, it can be proven that EMG signal representation affects the performance of CNNs. Spectrogram Images are used as the input dataset to the convolutional neural network for the differentiation between normal and aggressive activity. As a result, this algorithm achieved the accuracy of 94.61% for a binary class problem.

Moving towards action classification, several researchers have presented models for classification of multiple actions either from normal or aggressive action. Sukumar et al. perform identification using Variational Mode Decomposition (VMD) for the ten normal physical activities like bowing, clapping, walking, waving, jumping, etc. of sEMG signal for the analysis of musculoskeletal disorder [17]. VMD decomposes the signal into several functions. These functions are used for the extraction of statistical features like coefficient of variance, zero crossing rate, standard deviation, entropy, mean and negentropy. Next, the extracted features are fed into MC-LS-SVM (multi class least square support vector machine) with RBF kernel for the discrimination of 10 normal activities and the system achieved an accuracy of 98.17% as compared to existing methods. In [18], Subasi, et al. proposed an EMG pattern recognition system for the exoskeleton robot control and rehabilitation purpose. In this study, they used multiscale principal component analysis to remove the noise from various sEMG signals to minimize the effect of outliers. The discrete wavelet transform based statistical features includes mean of absolute value, average of power, standard deviation and ratio of mean absolute value have been extracted. The extracted features are then fed into the SVM with gaussian kernel. The experimental results show that the proposed system got an accuracy of 92.27% for 10 normal classes. Sravani, et al. discussed that the extraction of features is based on Flexible Analytic Wavelet Transform (FAWT) and then fed these features into a feed-forward neural network called as extreme learning machine (ELM) classifier for the classification of multi-class problem in [19]. FAWT decomposes EMG signals into eight sub bands. 
The following features are extracted from each decomposed sub-band: negentropy, mean absolute value, variance, modified mean absolute value, Tsallis entropy, simple square integral, waveform length, and integrated EMG. These features are then fed into the ELM classifier for the identification of 10 normal activities, and the proposed algorithm achieves an accuracy of 99.36%. Moreover, in [20], Demir et al. discussed another method in which spectrograms are the input of a pre-trained convolutional neural network. They used VGG-16 and Alex-Net for deep feature extraction, whereas an SVM is used to classify physical movements based on sEMG signals. The highest accuracy of 99.04% for 10 normal activities, including bowing, handshaking, clapping, standing, seating, waving, jumping, hugging, and walking, is achieved by concatenating the deep features of the fully connected layers of both Alex-Net and VGG-16.

Furthermore, an improved classification framework for the multi-class problem is proposed in [21]. The EMG dataset was taken from a machine learning repository. The dataset comprises 20 physical activities, i.e., 10 normal actions and 10 aggressive actions. The ten normal actions include bowing, hand shaking, hugging, clapping, etc. The ten aggressive actions include elbowing, hammering, headering, slapping, etc. The classification framework includes a probabilistic neural network and subspace KNN. The features are extracted from different signal signatures, including time domain, inter-channel correlation, modified spectral moment based features, and local binary patterns. Afterwards, a sequential forward feature selection algorithm is used to reduce the dimensions. Classification is performed on the selected subset of features using multiple classifiers: subspace KNN, probabilistic neural network, cubic SVM, Gaussian SVM, functional KNN, bagged trees, and LDA. Of these, subspace KNN gives the highest accuracy of 93.91% for 20 physical actions. This study deals with the above-mentioned multi-class classification problem, as described in the sections below.

In the previous literature, researchers mostly address the binary-class problem of physical activities, using both traditional machine learning and deep learning. They also address multi-class normal actions but do not focus on multi-class aggressive activities. There is only one research paper that addresses the problem of 20 physical activities. In this article, we address the multi-class classification of 20 physical actions, as well as of 10 aggressive activities and of 10 normal activities, using multi-channel sEMG data. The contributions of this article are as follows.

  • We propose an improved feature set consisting of selected feature subsets from different feature signatures, including time domain (TD) statistical features, the inter-channel correlation and covariance, the moment ratios and products of the Fourier spectrum, and statistics based on the spectral band powers.

  • We have combined the SVM and 1-NN models to design a hierarchical classifier to maximize the classification performance.

  • We perform the classification of multiple aggressive activities with a 1-NN model on a selected subset of features, which results in an accuracy of 98.91%.

  • Finally, we demonstrate that combining the SVM and 1-NN models results in an accuracy greater than 98.97% for 20 physical actions.

SECTION III.

Methodology

In this work, the problem of multi-class classification of physical activities is considered, based on 8-channel sEMG data. The suggested approach is split into pre-processing of raw sEMG signals, feature extraction, and classification into 20 categories. Figure 2 shows a diagram of the proposed framework.

FIGURE 2. - Proposed framework based on hand-crafted features for classification of 20 physical actions.

A. Pre-Processing and Segmentation

First, the multi-channel sEMG signals are pre-processed. Let each signal s_{ch} have duration D , be sampled at rate f_{s} , and have c channels. In this step, each sEMG record is segmented into smaller chunks, \varphi _{i} , using a rectangular window wind(l) of length W , where W is less than the length of the signal, as in equation 1.\begin{equation*} \varphi = s_{ch}(l)*wind(l)\tag{1}\end{equation*}

where \begin{align*} wind(l) = \begin{cases} \displaystyle 1 & 0\leq l\leq W-1 \\ \displaystyle 0 & otherwise \end{cases}\end{align*}

The rectangular window is slid over the whole signal with an overlap of 25%, as shown in Fig. 3. The window length W is chosen according to the sampling frequency at which the signal is captured. After segmentation, each sEMG signal is converted into N_{s} trials of length W across the ch channels. Feature extraction is then performed on each sEMG pattern in each sub-window.
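
The sliding-window segmentation can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `segment_signal`, the toy record, and the choice to drop a trailing partial window are assumptions made here for illustration.

```python
def segment_signal(record, window_len, overlap=0.25):
    """Split a multi-channel record into overlapping rectangular windows.

    record     : list of channels, each a list of samples of equal length
    window_len : window length W in samples
    overlap    : fraction of W shared by consecutive windows (0.25 in the text)
    """
    if not 0 <= overlap < 1:
        raise ValueError("overlap must be in [0, 1)")
    n_samples = len(record[0])
    step = max(1, int(round(window_len * (1 - overlap))))
    segments = []
    # Slide the window over the record; a trailing partial window is dropped
    # (an assumption, the text does not say how the tail is handled).
    for start in range(0, n_samples - window_len + 1, step):
        segments.append([ch[start:start + window_len] for ch in record])
    return segments


# Toy record (hypothetical data): 2 channels, 20 samples each.
record = [list(range(20)), list(range(100, 120))]
segments = segment_signal(record, window_len=8, overlap=0.25)
print(len(segments), segments[1][0])
```

With W = 8 and 25% overlap the window advances by 6 samples, so consecutive segments share 2 samples.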

FIGURE 3. - sEMG signal segmentation using rectangular window.

The features are thus extracted from each sub-window of every sEMG pattern.

B. Feature Vector Generation

In an attempt to generate an accurate feature vector, the proposed methodology uses different features from both the time domain and the frequency domain. It contains signatures from Time Domain statistics, Frequency Domain statistics and Inter-Channel Correlation and Covariance. In the subsections we elaborate on these features.

1) Similarity Index and Covariance

This subset of features is based on a channel-wise analysis of the corresponding segments of two channels (ch_\alpha) and (ch_\beta) of an sEMG signal \varphi _{i} , where \alpha \neq \beta . The channel-wise pairing used to calculate these features is shown in Figure 4.

FIGURE 4. - Channel-wise pairing of sEMG segments for correlation and covariance calculation.

Initially, the maximum correlation between two channels is computed, following the work in [21], using equation 2.\begin{equation*} correlation(ch_\alpha,ch_\beta) = max(Corr(\varphi _{i, \alpha }(l), \varphi _{i, \beta }(l)))\tag{2}\end{equation*}

Additionally, we propose the use of a further feature, the maximum covariance between two channels, using equation 3.\begin{equation*} covariance(ch_\alpha,ch_\beta) = max(Cov(\varphi _{i, \alpha }(l), \varphi _{i, \beta }(l)))\tag{3}\end{equation*}

Equations 2 and 3 give the maximum correlation and the maximum covariance (a measure of joint variability), respectively, between the segments of two channels ch_\alpha and ch_\beta of an sEMG signal. Since there are 8 channels, computing the maximum cross-correlation and covariance among the corresponding channel pairs yields a feature vector f_{ICS} of 7\times 8 values.
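
As a rough illustration of equations 2 and 3, the sketch below computes the maximum of the cross-correlation and of the cross-covariance over all lags for every unordered channel pair. Several points are assumptions rather than details given in the text: the cross-correlation is the raw, unnormalized sum over lags, the covariance is taken as the cross-correlation of mean-removed segments, and the helper names are hypothetical. With 8 channels there are 28 pairs, and two values per pair give 56 = 7 × 8 features.

```python
from itertools import combinations


def max_cross_correlation(a, b):
    """Largest value of sum_l a[l] * b[l + lag] over all lags (unnormalized)."""
    n = len(a)
    best = float("-inf")
    for lag in range(-(n - 1), n):
        total = 0.0
        for l in range(n):
            m = l + lag
            if 0 <= m < len(b):
                total += a[l] * b[m]
        best = max(best, total)
    return best


def max_cross_covariance(a, b):
    """Same as above after removing each segment's mean (assumed definition)."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return max_cross_correlation([v - mean_a for v in a],
                                 [v - mean_b for v in b])


def inter_channel_features(segment):
    """Two features (max correlation, max covariance) per unordered channel pair."""
    feats = []
    for i, j in combinations(range(len(segment)), 2):
        feats.append(max_cross_correlation(segment[i], segment[j]))
        feats.append(max_cross_covariance(segment[i], segment[j]))
    return feats


# Toy 8-channel segment (hypothetical data), 4 samples per channel.
segment = [[(c + 1) * v for v in (1.0, -2.0, 3.0, 0.5)] for c in range(8)]
f_ics = inter_channel_features(segment)
print(len(f_ics))
```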

2) Spectral Band Power

The characteristics of the power spectral density were previously proposed to identify sEMG patterns in [22]. In the suggested algorithm, the power spectrum estimates are calculated using the Burg method for each channel in the sEMG sample. The power spectral density characteristics are assessed by sub-dividing the spectrum into N_{b} bands and calculating the power of each band. The feature set f_{PSD} comprises these band powers for all bands and channels, giving N_{b} \times 8 features.
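
The band-power idea can be sketched as follows. Note that this sketch does not use the Burg estimator named in the text: a plain DFT periodogram is swapped in to keep the example dependency-free, and the equal-width band split and the function names are assumptions.

```python
import cmath
import math


def periodogram(x):
    """One-sided power spectrum |X[k]|^2 / N for k = 0 .. N//2 (plain DFT)."""
    n = len(x)
    power = []
    for k in range(n // 2 + 1):
        coeff = sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t in range(n))
        power.append(abs(coeff) ** 2 / n)
    return power


def band_powers(x, n_bands):
    """Sum the spectrum inside n_bands contiguous, roughly equal-width bands."""
    power = periodogram(x)
    edges = [b * len(power) // n_bands for b in range(n_bands + 1)]
    return [sum(power[edges[b]:edges[b + 1]]) for b in range(n_bands)]


# Toy single-channel segment: a cosine at DFT bin 3 of a 32-sample window.
x = [math.cos(2 * math.pi * 3 * t / 32) for t in range(32)]
bands = band_powers(x, n_bands=4)
print(bands)
```

Applying `band_powers` to each of the 8 channels and concatenating the results would give the N_{b} \times 8 values of f_{PSD}.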

3) Moments of Logarithm of Fourier Spectrum

The logarithms of the moments and of their ratios are calculated from the frequency spectrum of the sEMG segment, based on [21]. In total, 17 log moment ratios are calculated for each channel. Hence we get a feature vector, f_{LMF} , of length 17 \times 8 for a segment of EMG activity.

4) Time Domain Statistics

One of the most frequently used signatures is time domain analysis of sEMG signal. This modality shows how a signal changes its parameters or shape over time.

a: Amplitude

The maximum amplitude of the signal is expressed as in equation 4:\begin{equation*} f_{t_{1}} = max(|\varphi _{i}(l)|^{2})\tag{4}\end{equation*}

b: Root Mean Square

The RMS is the square root of the mean power of the sEMG pattern over a given period, as shown in equation 5.\begin{equation*} f_{t_{2}} = \sqrt {\frac {1}{W} \sum _{l=1}^{W} |\varphi _{i}(l)|^{2} }\tag{5}\end{equation*}

Here, \varphi _{i} represents the segment of an sEMG signal and W is the length of the segment.

c: Variance

Variance measures the power of a signal, as expressed in equation 6.\begin{equation*} f_{t_{3}} = \frac {1}{W-1} \sum _{l=1}^{W} |\varphi _{i}(l)|^{2}\tag{6}\end{equation*}

d: Waveform Length

WL is the cumulative variation of the sEMG pattern and indicates how much the sEMG signal varies [23]. It can be calculated using equation 7.\begin{equation*} f_{t_{4}} = \sum _{l=1}^{W-1} |\varphi _{i}(l+1) - \varphi _{i}(l)|\tag{7}\end{equation*}

e: Mean Absolute Value

It is the average of the absolute value of the sEMG signal [23] and can be calculated using equation 8.\begin{equation*} f_{t_{5}} = \frac {1}{W} \sum _{l=1}^{W} |\varphi _{i}(l)|\tag{8}\end{equation*}

f: Simple Square Integral

It indicates the aggregate of the square values of the amplitude of the sEMG pattern [23], and it can be determined using equation 9.\begin{equation*} f_{t_{6}} = \sum _{l = 1}^{W} \varphi _{i}(l)^{2}\tag{9}\end{equation*}
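
The amplitude-based features of equations 4, 5, 6, 8 and 9 can be computed for one channel of one segment as in the following sketch. The function name and the toy window are assumptions; the variance follows equation 6 as written, that is, without subtracting the mean.

```python
import math


def amplitude_features(x):
    """Amplitude-based time-domain features of one channel of one segment."""
    w = len(x)
    energy = sum(v * v for v in x)           # sum of squared samples
    return {
        "max_amp": max(v * v for v in x),     # eq. 4: peak of the squared magnitude
        "rms": math.sqrt(energy / w),         # eq. 5
        "var": energy / (w - 1),              # eq. 6 as written (no mean removal)
        "mav": sum(abs(v) for v in x) / w,    # eq. 8
        "ssi": energy,                        # eq. 9
    }


# Toy window (hypothetical samples).
feats = amplitude_features([3.0, -4.0, 0.0, 4.0, -3.0])
print(feats)
```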


g: Zero Crossing

It counts the number of times the sEMG signal changes sign over time [23]. Given two adjoining sEMG amplitude samples \varphi _{i}(k-1) and \varphi _{i}(k) , the zero crossings are counted using equation 10.\begin{equation*} f_{t_{7}} = \sum f(s)\tag{10}\end{equation*}

where f(s) is set to 1 or 0 for consecutive samples of a segment:\begin{align*} f(s)= \begin{cases} 1 & sgn(\varphi _{i}(k)) \neq sgn(\varphi _{i}(k-1))\\ 0 & otherwise \end{cases}\end{align*}
where k \in \{1, 2,\cdots, W-1\}

h: Slope Sign Change

It counts the number of times the slope of the sEMG signal changes sign over time [23]. Given three neighboring amplitude samples of the sEMG signal, \varphi _{i}(k-1) , \varphi _{i}(k) and \varphi _{i}(k+1) , the count is calculated using equation 11:\begin{equation*} f_{t_{8}} = \sum f(s)\tag{11}\end{equation*}

where f(s) is set to 1 or 0 for three consecutive samples of a segment:\begin{align*} f(s)= \begin{cases} 1 & sgn(\varphi _{i}(k) - \varphi _{i}(k-1)) \neq sgn(\varphi _{i}(k+1) - \varphi _{i}(k))\\ 0 & otherwise \end{cases}\end{align*}

i: Willison Amplitude

It counts the number of times the change in amplitude between two neighboring samples of the sEMG signal exceeds a specified threshold [23]. It is calculated using equation 12.\begin{equation*} f_{t_{9}} = \sum _{l=1}^{W-1}f(|\varphi _{i}(l+1) - \varphi _{i}(l)|)\tag{12}\end{equation*}

where f(s) is calculated as:\begin{align*} f(s)= \begin{cases} \displaystyle 1 & s > threshold\\ \displaystyle 0 & otherwise \end{cases}\end{align*}
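
The three counting features (zero crossing, slope sign change, and Willison amplitude) can be sketched as below. The function names, the toy window, and the threshold value are assumptions; the slope sign change is computed on the differences of three neighboring samples, as the prose describes.

```python
def sign(v):
    """Return -1, 0 or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def zero_crossings(x):
    """Count consecutive sample pairs whose signs differ (eq. 10)."""
    return sum(1 for k in range(1, len(x)) if sign(x[k]) != sign(x[k - 1]))


def slope_sign_changes(x):
    """Count interior samples where the slope changes sign (eq. 11)."""
    return sum(
        1
        for k in range(1, len(x) - 1)
        if sign(x[k] - x[k - 1]) != sign(x[k + 1] - x[k])
    )


def willison_amplitude(x, threshold):
    """Count neighboring-sample differences larger than the threshold (eq. 12)."""
    return sum(1 for l in range(len(x) - 1) if abs(x[l + 1] - x[l]) > threshold)


# Toy window (hypothetical samples and threshold).
x = [1.0, -1.0, 2.0, 3.0, -2.0]
zc = zero_crossings(x)
ssc = slope_sign_changes(x)
wamp = willison_amplitude(x, threshold=2.5)
print(zc, ssc, wamp)
```

Because the comparison is on the sign function itself, a sample that is exactly zero counts as a sign change next to a non-zero neighbor; a noise threshold could be added if that is not wanted.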

j: Integrated EMG

It is the integral of the rectified sEMG pattern and indicates the pre-activation of muscle movement [23]. Simply put, it is the sum of the absolute values of the sEMG amplitudes, as shown in equation 13.\begin{equation*} f_{t_{10}} = \sum _{l=1}^{W}|\varphi _{i}(l)|\tag{13}\end{equation*}

k: Log Detector

It provides an estimate of the force exerted during muscle activity [23], and it can be expressed using equation 14.\begin{equation*} f_{t_{11}} = e^{\frac {1}{W} \sum _{l=1}^{W}log(|\varphi _{i}(l)|)}\tag{14}\end{equation*}
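
A short sketch of the integrated EMG (equation 13) and the log detector (equation 14) follows. The small constant `eps`, which avoids taking the logarithm of an exactly zero sample, is an assumption added here and is not part of the formula in the text; the function names are likewise hypothetical.

```python
import math


def integrated_emg(x):
    """Sum of absolute sample values (eq. 13)."""
    return sum(abs(v) for v in x)


def log_detector(x, eps=1e-12):
    """exp of the mean log-magnitude (eq. 14).

    eps guards against log(0); it is an assumption of this sketch.
    """
    return math.exp(sum(math.log(abs(v) + eps) for v in x) / len(x))


# Toy window (hypothetical samples).
x = [1.0, -4.0, 2.0, -2.0]
iemg = integrated_emg(x)
ld = log_detector(x)
print(iemg, ld)
```

The log detector is the geometric mean of the absolute sample values, which for this toy window is the fourth root of 16.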

l: Myopulse Percentage Rate

This feature is the mean value of the myopulse output, in which the absolute value of the raw sEMG signal exceeds the specified threshold [23], and it can be calculated as in equation 15.\begin{equation*} f_{t_{12}} = \frac {1}{W} \sum _{l=1}^{W}f(\varphi _{i}(l))\tag{15}\end{equation*}

where s is a wavelet coefficient of the respective channel and W is the window length of the coefficients.

m: Difference Absolute Standard Deviation Value

It is the standard deviation of the differences between adjoining sEMG samples [23], and it can be expressed as \begin{equation*} f_{t_{13}} = \sqrt {\sum _{l=1}^{W-1}\frac {(\varphi _{i}(l+1) - \varphi _{i}(l))^{2}}{W-1}}\tag{16}\end{equation*}

n: Enhanced Mean Absolute Value

This feature provides a good estimate of the force exerted during muscle activity [23]. It is expressed by the following function, in which the parameter p sets the influence of each sample in the signal.\begin{equation*} f_{t_{14}} = \frac {1}{W} \sum _{l=1}^{W}|(\varphi _{i}(l))^{p}|\tag{17}\end{equation*}

where p is set using the following \begin{align*}p = \begin{cases} \displaystyle 0.75 & 0.2W \leq l \leq 0.8W\\ \displaystyle 0.5 & otherwise \end{cases}\end{align*}

o: Enhanced Wavelength

The EMAV and EWL features use a larger value of the parameter p for samples between 20% and 80% of the window. Emphasizing the central part of the window in this way increases the amount of relevant information captured. In addition, EMAV and EWL are extensions of the MAV and WL features with simple changes, so they do not require much additional computation time, as noted in [23].\begin{equation*} f_{t_{15}} = \sum _{l=2}^{W}(|\varphi _{i}(l)- \varphi _{i}(l-1)|^{p})\tag{18}\end{equation*}

where p is set using the following \begin{align*}p = \begin{cases} \displaystyle 0.75 & 0.2W \leq l \leq 0.8W\\ \displaystyle 0 & otherwise \end{cases}\end{align*}
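
The position-dependent exponent of EMAV and EWL can be sketched as below. The function names are hypothetical, and both exponents are exposed as parameters: the defaults (0.75 inside the central 20% to 80% region, 0.5 outside) follow the EMAV definition, and the outer value listed for EWL can be passed through `p_out` instead. The magnitude is taken before exponentiation, which gives the same result as the absolute value of the power for real samples.

```python
def _exponent(l, w, p_in, p_out):
    """Exponent for the 1-based sample index l in a window of length w."""
    return p_in if 0.2 * w <= l <= 0.8 * w else p_out


def emav(x, p_in=0.75, p_out=0.5):
    """Enhanced mean absolute value (eq. 17)."""
    w = len(x)
    return sum(
        abs(x[l - 1]) ** _exponent(l, w, p_in, p_out) for l in range(1, w + 1)
    ) / w


def ewl(x, p_in=0.75, p_out=0.5):
    """Enhanced waveform length (eq. 18), summed over l = 2 .. W."""
    w = len(x)
    return sum(
        abs(x[l - 1] - x[l - 2]) ** _exponent(l, w, p_in, p_out)
        for l in range(2, w + 1)
    )


# Toy windows (hypothetical samples) chosen so the powers are easy to check.
flat = [16.0] * 10           # every |sample| is 16
zigzag = [0.0, 16.0] * 5     # every |difference| is 16
emav_value = emav(flat)
ewl_value = ewl(zigzag)
print(emav_value, ewl_value)
```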

p: Modified Mean Absolute Value

This feature extends the mean absolute value by applying a weighted window function, as in [23]. Mathematically, MMAV can be expressed as \begin{equation*} f_{t_{16}} = \frac {1}{W} \sum _{l=1}^{W}(w_{l}|\varphi _{i}(l)|)\tag{19}\end{equation*}

where w_{l} is set using the following \begin{align*}w_{l} = \begin{cases} \displaystyle 0.75 & 0.2W \leq l \leq 0.75W\\ \displaystyle 0.5 & otherwise \end{cases}\end{align*}

q: Modified Mean Absolute Value 2

This feature extends the mean absolute value by using a continuous weighted window function, as in [23]. It is computed using equation 19 but with a modified value of w_{l} , set using \begin{align*}w_{l} = \begin{cases} \displaystyle 1 & 0.25W \leq l \leq 0.75W\\[0.5pc] \displaystyle \frac {4l}{W} & l < 0.25W \\[0.5pc] \displaystyle \frac {(l - W)}{W} & otherwise \end{cases}\end{align*}

r: Maximum Fractal Length

This is a more recent technique used to measure low levels of muscle activation. Setting the minimum scale to 1 makes this feature equivalent to a modified version of WL that incorporates the RMS and a logarithm; it can be calculated using equation 20, similar to [24].\begin{equation*} f_{t_{18}} = log\Big(\sqrt {\sum _{l=1}^{W-1}(\varphi _{i}(l+1) - \varphi _{i}(l))^{2}}\Big)\tag{20}\end{equation*}

s: Average Amplitude Change

This feature is equivalent to the WL function, except that the wavelengths are averaged as shown in [25]. It can be formulated with equation 21.\begin{equation*} f_{t_{19}} = \frac {1}{W}\sum _{l=1}^{W-1}|\varphi _{i}(l+1) - \varphi _{i}(l)|\tag{21}\end{equation*}
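
Waveform length, average amplitude change, DASDV and maximum fractal length all derive from the first differences of the window, so they can share one pass over the data. The sketch below assumes the natural logarithm for the maximum fractal length (the text does not state the base) and uses a hypothetical function name and toy window.

```python
import math


def difference_features(x):
    """Features built from first differences of one channel of one segment."""
    w = len(x)
    diffs = [x[l + 1] - x[l] for l in range(w - 1)]
    sum_abs = sum(abs(d) for d in diffs)
    sum_sq = sum(d * d for d in diffs)
    return {
        "wl": sum_abs,                          # eq. 7
        "aac": sum_abs / w,                     # eq. 21
        "dasdv": math.sqrt(sum_sq / (w - 1)),   # eq. 16
        # eq. 20; natural log assumed, undefined for a constant window
        "mfl": math.log(math.sqrt(sum_sq)),
    }


# Toy window (hypothetical samples).
diff_feats = difference_features([0.0, 3.0, -1.0, -1.0, 2.0])
print(diff_feats)
```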


t: Kurtosis

Kurtosis is a statistical measure that describes the shape of the data distribution, in particular how peaked it is, as given in [26].

u: Skewness

Skewness measures the asymmetry of the data distribution. If the mean, median, and mode of the data coincide, the distribution is symmetric, as is the case for normally distributed data. If these values do not coincide, the distribution is skewed, as discussed in [26].
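
The text gives no formulas for kurtosis and skewness. One common choice, assumed here, is the moment-based estimators, where skewness is m_3 / m_2^{3/2} and kurtosis is m_4 / m_2^2, with m_k the k-th central moment. The estimator actually used in the study may differ (for example by a bias correction), and the function names are hypothetical.

```python
def central_moment(x, order):
    """k-th central moment (population form, divides by the sample count)."""
    mean = sum(x) / len(x)
    return sum((v - mean) ** order for v in x) / len(x)


def skewness(x):
    """Moment-based skewness m3 / m2^1.5 (assumed estimator)."""
    return central_moment(x, 3) / central_moment(x, 2) ** 1.5


def kurtosis(x):
    """Moment-based (non-excess) kurtosis m4 / m2^2 (assumed estimator)."""
    return central_moment(x, 4) / central_moment(x, 2) ** 2


# Toy windows (hypothetical samples): one symmetric, one right-skewed.
symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]
right_skewed = [0.0, 0.0, 0.0, 4.0]
print(skewness(symmetric), kurtosis(symmetric), skewness(right_skewed))
```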

After calculating the above-mentioned time domain features for each channel, all features are grouped to form a feature vector f_{t} . As there are eight channels, the 21 features per channel give a feature vector of length 21\times 8 = 168.

C. Classifier

The extracted feature vectors described in Section III-B are then entered into the classifier, and the activity class is predicted from the 20 possible activities. In this work, we implemented various classification techniques, focusing on simple classifiers such as support vector machines and K-nearest neighbors. The aggressive and normal activities are classified by KNN and SVM. After that, we hybridized SVM and KNN for better classification results. This hybrid approach achieves better classification accuracy for the multi-class problem. The performance of the classifiers is evaluated using classification accuracy, sensitivity, specificity, precision, F_{1} -score, and the kappa coefficient.

1) Support Vector Machine

A support vector machine (SVM) is a supervised machine learning algorithm used for both classification and regression problems. SVM is a fast and dependable classification algorithm that performs very well with a limited amount of data. In this algorithm, each data value is plotted as a point in an N-dimensional space, where N is the number of features/dimensions. To classify the data, SVM finds the hyperplane that not only separates the two classes but also maximizes the margin (i.e., the distance between the hyperplane and the closest data point of each class). In this work, the SVM model classifies the normal actions better than the k-NN model, as discussed in the experimentation Section IV-B. The performance of the classifier is analyzed using 5-fold cross-validation, the ‘Quadratic’ kernel function, and the ‘One vs. all’ multi-class method.

2) K-Nearest Neighbor

k-nearest neighbor (k-NN) is one of the most basic and easy-to-implement supervised machine learning algorithms, used for both classification and regression problems. It is widely used in pattern recognition, intrusion detection, and data mining. In this classifier, the label of a data point is determined by the data points around it, based on the majority voting principle. k-NN computes the distance (e.g., Euclidean, Manhattan, Minkowski, or Hamming) between a new data point and all examples in the training data and selects the specified number (k) of examples closest to the new data point. For classification problems, it then votes for the most frequent label among them. In our proposed methodology, k-NN is used to classify the aggressive actions with a subset of features, where it performs better than the SVM model, as discussed in the experimentation section. The performance of the classification model is analyzed with k = 1 and the Manhattan distance, also known as city block distance, as the distance metric. It is the sum of absolute differences between two points across all dimensions. Equation 22 gives the general formula for the distance between two points p and q in an n-dimensional space.\begin{equation*} dist(p, q) = \sum _{l=1}^{n}|p_{l} - q_{l}|\tag{22}\end{equation*}
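A minimal sketch of 1-NN classification with the city block distance described above. The function names and the toy data are illustrative assumptions, not the authors' implementation.

```python
def manhattan(p, q):
    """City block distance: sum of absolute coordinate differences."""
    if len(p) != len(q):
        raise ValueError("points must have the same dimension")
    return sum(abs(a - b) for a, b in zip(p, q))


def predict_1nn(train_x, train_y, query):
    """Return the label of the single training point closest to `query`."""
    nearest = min(range(len(train_x)),
                  key=lambda i: manhattan(train_x[i], query))
    return train_y[nearest]


# Toy usage with two made-up training points.
train_x = [[0.0, 0.0], [10.0, 10.0]]
train_y = ["normal", "aggressive"]
label = predict_1nn(train_x, train_y, [1.0, 2.0])
```

With k = 1 no voting is needed: the prediction is simply the label of the nearest training example.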


3) Hybrid Approach (1-NN and SVM)

In a hierarchical classification model, the classifiers are arranged in a hierarchy in order to solve a multi-class classification problem. For the given EMG data, the two-class problem is well separable and discriminative. This can be seen in Figure 5, which shows the mean EMG signals of all ten aggressive and all ten normal activities.

FIGURE 5. - Mean of normal v/s aggressive classes.

Based on this observation, we started with binary classification and then moved towards the multi-class problems (10 normal classes, 10 aggressive classes, and 20 physical action classes).

Finally, we combine the SVM and 1-NN classification models in a hierarchical manner to classify the 20 different physical actions, as shown in Figure 6. First, we trained three different models using the SVM and 1-NN classifiers. The binary-class model is trained using 1-NN on 440 features and classifies the data as either normal or aggressive. An SVM-based model uses the features from all signatures (i.e., 440 features) to classify the 10 normal actions. A 1-NN-based model uses a subset of features (i.e., 272 features from the inter-channel correlation and covariance, log moments of Fourier spectra, and spectral band power domains) to classify the 10 aggressive activities.
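The routing logic of such a hierarchy can be sketched as follows. This is only a structural illustration: the class name, the callables passed in, and the `subset_idx` argument are hypothetical, and the three trained models are represented by plain functions rather than real SVM or 1-NN classifiers.

```python
class HybridClassifier:
    """Two-stage router: a binary gate, then one 10-class model per branch."""

    def __init__(self, gate, normal_model, aggressive_model, subset_idx):
        self.gate = gate                          # full vector -> "normal" or "aggressive"
        self.normal_model = normal_model          # full vector -> normal action label
        self.aggressive_model = aggressive_model  # feature subset -> aggressive action label
        self.subset_idx = subset_idx              # indices of features kept for the aggressive branch

    def predict(self, features):
        if self.gate(features) == "normal":
            return self.normal_model(features)
        subset = [features[i] for i in self.subset_idx]
        return self.aggressive_model(subset)


# Toy usage with stub models standing in for the trained classifiers.
clf = HybridClassifier(
    gate=lambda f: "normal" if f[0] < 0.5 else "aggressive",
    normal_model=lambda f: "Bowing",
    aggressive_model=lambda s: "Punching",
    subset_idx=[1, 2],
)
```

Because the second-stage model is chosen by the gate, an error at the binary stage cannot be recovered downstream, which is why a well-separated two-class problem matters for this design.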

FIGURE 6. - Hybrid classifier for multi-class classification.

SECTION IV.

Experimentation and Results

A. Dataset

The dataset of physical activities used to evaluate our proposed methodology is taken from the UCI Machine Learning Repository [27]. The dataset contains sEMG data of 4 subjects, 3 males and 1 female, aged between 25 and 30. Each subject performed a total of 20 actions, which are divided into 10 normal actions (i.e. Bowing (1), Clapping (2), Hugging (3), Handshaking (4), Jumping (5), Running (6), Seating (7), Standing (8), Walking (9), Waving (10)) and 10 aggressive actions (i.e. Elbowing (11), Frontkicking (12), Hammering (13), Headering (14), Kneeing (15), Pulling (16), Punching (17), Pushing (18), Side-kicking (19), Slapping (20)). The sEMG signals collected from these subjects have a total of eight channels, four for the upper limbs and four for the lower limbs. Every channel corresponds to the time series data from one electrode and consists of around 10,000 samples.

B. Experimentation

The physical activity dataset described above holds around 10,000 samples per action per subject. First, these samples are broken into several overlapping segments of 1000 samples each, with an overlap of 25%. Each activity can thus be subdivided to give a healthy sample space of 600–900 segments, depending on the length of the signal. Each segment is then used to extract the feature set from the different signatures described in Section III. To assess the classification rate, the validation experiments measure it for the 20-class problem. SVM and k-NN were applied using the full feature vector and its subsets. The complete feature set was divided into subsets holding particular feature types (time, frequency, etc.) and their possible combinations. The details of these subsets are available in Table 1.
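The windowing step can be sketched as below, assuming that a 25% overlap means consecutive windows share a quarter of their samples and that trailing samples which do not fill a whole window are dropped. Both points are assumptions, as is the function name; the segment counts this sketch produces for a given input are illustrative and need not match the counts reported in the paper.

```python
def segment_signal(samples, window=1000, overlap=0.25):
    """Split a 1-D sequence into fixed-length windows with fractional overlap.

    Trailing samples that do not fill a whole window are dropped.
    """
    if not 0 <= overlap < 1:
        raise ValueError("overlap must be in [0, 1)")
    step = int(window * (1 - overlap))   # 750 samples for the defaults
    last_start = len(samples) - window
    return [samples[s:s + window] for s in range(0, last_start + 1, step)]


# Toy usage: one channel with exactly 10,000 synthetic samples.
segments = segment_signal(list(range(10000)))
```

Each returned window would then be passed to the feature extraction stage, one feature vector per window.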

TABLE 1 Feature Subset Details Used for Different Experimentation

Various parameters such as accuracy, sensitivity, specificity, F-measure, precision, misclassification rate, and kappa coefficient were calculated to measure the performance of the proposed algorithm. SVM and k-NN were evaluated with 5-fold cross-validation on the 20-class and 10-class problems, respectively.
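For readers unfamiliar with 5-fold cross-validation, the following sketch shows how sample indices can be partitioned so that every sample is used for testing exactly once. It is a plain, unstratified splitter written for illustration; the function name and the fixed seed are assumptions, and the paper does not state how its folds were drawn.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs for shuffled k-fold cross-validation."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]   # fold sizes differ by at most one
    for i in range(k):
        test = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test


# Toy usage: five train/test partitions over 974 sample indices.
splits = list(k_fold_indices(974, k=5))
```

The reported score is then typically the average of the per-fold test scores.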

Initially, the entire feature vector of length 440 is passed to the classifiers to categorize the data into 20 classes, and into 10 classes each for normal and aggressive actions. Table 2 shows that the subset containing features from inter-channel statistics, log moments of Fourier spectra, and spectral band power gives higher classification accuracy for the 10 aggressive classes.

TABLE 2 Comparison of 1-NN Based Classification Using Various Subset of Features

Table 3 shows that the SVM model using all features classifies the 10 normal actions better than the k-NN model. Based on this observation, we combine the SVM and k-NN models to classify the 20 physical actions.

TABLE 3 SVM Based Classification Using Various Subset of Features

In addition, a thorough analysis of SVMs was performed to classify physical activities using the individual feature subsets, but these subsets did not provide good classification accuracy.

SECTION V.

Discussion

The acquisition of non-invasive signals such as sEMG plays a vital role in improving the quality of life of people suffering from various neurological and physical disabilities or diseases. Correctly identifying the type of physical activity is the primary step in providing a possible solution for such patients. In this paper, we have proposed a framework that can assist in developing assistive technology by classifying physical activities.

In our methodology, pre-processing has a significant impact on classification accuracy, both for the comprehensive feature set and for the optimal subset. The result is that SVM and 1-NN with a combination of f_{ICS} and frequency-domain features provide better classification accuracy than the entire feature set and other classifiers. The corresponding classification accuracies for 1-NN and SVM are 98.91% and 100%, respectively.

A. Binary Classification

The confusion matrix is obtained by performing an 80:20 split on 974 observations of the binary classes (normal and aggressive) using the 1-NN classifier with features from all signatures. The resulting model was trained on 780 samples and tested on 194 samples. It gives a testing accuracy of 100%, as reflected in Table 4.

TABLE 4 Confusion Matrix for Binary Classification Using Proposed Technique

B. Normal Activities Classification

The confusion matrix is obtained by performing an 80:20 split on 483 observations of the 10 normal physical activities using the SVM classifier with features from all signatures. The resulting model was trained on 393 samples and tested on 90 samples. It gives a testing accuracy of 100%, as shown in Figure 7.

FIGURE 7. - Heat map of normal activities from SVM with 10-fold cross validation for 10 normal actions (All features).

C. Aggressive Activities Classification

The confusion matrix is obtained by performing an 80:20 split on 491 observations of the 10 aggressive actions using the 1-NN classifier with a subset of features drawn from the inter-channel statistics, log moments of Fourier spectra, and power spectral density signatures. Time-domain features are not considered in this classification. The resulting model was trained on 399 samples and tested on 92 samples. It gives an accuracy of 98.91%, as shown in Figure 8.

FIGURE 8. - Heat-map of aggressive classes for 1-NN classification with 10-fold cross validation for 10 aggressive actions (ICS, LMF and PSD subset).

D. All Activities (Hybrid Classifier)

The confusion matrix is obtained by testing 194 observations of the two classes using the 1-NN classifier with features from all signatures. The optimized k-NN first classifies each sample as either normal or aggressive. The samples classified as normal are fed into the optimized SVM classifier (trained using all features), whereas the samples classified as aggressive are fed into the optimized k-NN classifier, which is trained on the subset of features from the signatures f_{ICS} , f_{LMF} and f_{PSD} . Hence, the 20-class classification is performed by training two different classifiers in a hierarchy. The model gives an average testing accuracy of 98.97%, as shown in Figure 9.

FIGURE 9. - Heat-map of 20 classes for hybrid classifier.

A comparison of performance indicators such as sensitivity, specificity, precision, F-measure, misclassification rate, and kappa coefficient for both the binary and multi-class problems is shown in Table 5. Sensitivity and specificity reflect the true positive and true negative rates of the classifier for the proposed feature vector. Accuracy gives the proportion of samples that are correctly assigned to their activity.
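These indicators can all be derived from a confusion matrix. The sketch below uses the standard definitions (per-class, one-vs-rest counts for sensitivity, specificity, precision, and F-measure, and Cohen's kappa for chance-corrected agreement). The function name and the toy matrix are assumptions for illustration; the numbers are not taken from Table 5.

```python
def metrics_from_confusion(cm):
    """cm[i][j] = number of samples of true class i predicted as class j."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    row_sums = [sum(cm[k]) for k in range(n)]
    col_sums = [sum(cm[i][k] for i in range(n)) for k in range(n)]
    accuracy = sum(cm[k][k] for k in range(n)) / total

    per_class = []
    for k in range(n):
        tp = cm[k][k]
        fn = row_sums[k] - tp
        fp = col_sums[k] - tp
        tn = total - tp - fn - fp
        sens = tp / (tp + fn) if tp + fn else 0.0
        spec = tn / (tn + fp) if tn + fp else 0.0
        prec = tp / (tp + fp) if tp + fp else 0.0
        f1 = 2 * prec * sens / (prec + sens) if prec + sens else 0.0
        per_class.append({"sensitivity": sens, "specificity": spec,
                          "precision": prec, "f_measure": f1})

    # Cohen's kappa: agreement beyond what the marginals predict by chance.
    expected = sum(r * c for r, c in zip(row_sums, col_sums)) / total ** 2
    kappa = (accuracy - expected) / (1 - expected) if expected < 1 else float("nan")
    return {"accuracy": accuracy, "misclassification_rate": 1 - accuracy,
            "kappa": kappa, "per_class": per_class}


# Toy 2-class matrix (made-up counts).
result = metrics_from_confusion([[8, 2], [1, 9]])
```

For a multi-class problem, the per-class values are often averaged to give a single figure per indicator.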

TABLE 5 Comparison of Performance Parameters

The performance comparison of our proposed method with recent research work is shown in Table 6. The features obtained from different signal signatures for each segment respond well to the classification of 20 physical actions from sEMG and provide robustness to the variation between classes. This result shows that hybridizing the SVM and k-NN models provides better performance for automatic identification of surface EMG signals.

TABLE 6 Performance Comparison of Proposed Method With Same Dataset for the Classification of Physical Actions sEMG Signals

SECTION VI.

Conclusion

In this study, we have proposed a multi-class classification framework based on SVM and k-NN to categorize physical actions using features derived from eight channels of surface EMG data. A set of 440 features was extracted from various signatures, including the statistical features of the time domain, the inter-channel cross-correlation and covariance, the logarithm of moments of Fourier spectra, and the mean band power of power spectral density estimates based on Burg’s algorithm. The results show that SVM performs better for the classification of the ten normal classes, whereas k-NN improves the accuracy for the ten aggressive classes. For 20-class classification, adopting a hybrid approach that combines the SVM and k-NN models improves the accuracy, especially if the dataset is limited. The classification results of the proposed method show better accuracy than other existing methods.
