Deep Human Motion Detection and Multi-Features Analysis for Smart Healthcare Learning Tools

An unhealthy lifestyle causes several chronic diseases in humans. Many products have been introduced to prevent such illnesses and provide e-learning-based healthcare services. However, the main focus is still on providing comfortable and reliable solutions. Inertial measurement units (IMUs) are considered the most independent and non-intrusive way to monitor human health via motion pattern detection. In this paper, a deep-learning-based human motion detection approach for smart healthcare learning tools is proposed. A novel hybrid descriptors-based pre-classification and multi-features analysis algorithm is proposed to classify human motion for healthcare e-learning. For pre-processing, a quaternion-based filter is used to filter the IMU signals. An experiment is performed over the acceleration signals using minimum and average gravity removal techniques. Next, signal segmentation over multiple time intervals is applied, and the results are compared to decide which window length provides better performance. Then, pre-classification is done by identifying motion patterns as active or passive. During the features analysis phase, features are extracted based on both active and passive motion patterns. Further, an orthogonal fuzzy neighborhood discriminant analysis technique is used to reduce the dimensionality of the extracted feature vector. Finally, a deep learner known as long short-term memory is applied to classify the actions of both active and passive motion features for healthcare e-learning systems. For this purpose, we utilized two datasets: REALDISP and wearable computing. The experimental results show that our proposed system for smart healthcare learning outperformed other state-of-the-art systems. The proposed system achieved 87.35% accuracy on the REALDISP dataset and 85.18% accuracy on the wearable computing dataset.
Furthermore, the classified motion patterns are provided to a smart healthcare advisor in order to provide live feedback about human health for immediate action.

activities are also imperative [1]. Remote access to healthcare services falls under healthcare e-learning models, which provide access to healthcare facilities over the internet. Modern wearable healthcare monitoring is a way to deal with everyday human action recognition, which can become extremely complex when monitored through multiple sensors [2], [3]. Therefore, simple inertial measurement unit (IMU)-based human motion detection is becoming popular [4], [5]. A variety of feature extraction techniques have evolved in this domain of research [6], [7]. Human motion patterns can be both simple and complex [8]. Applications of the proposed system include healthcare monitoring facilities [9], security [10], emergency services [11], rehabilitation centers [12], and daily living assistance [13]. Regular monitoring is required for all of the above-mentioned applications.
There can be multiple uses for IMUs in research. These small but useful devices can observe and transmit important motion-related data for healthcare watching, appropriateness, and recognition of motion patterns to track and secure humans [14]. Different models have been proposed by researchers to sense human motion and aid in human healthcare. IMUs have been used in a variety of environments, including both indoor and outdoor settings [15], [16]. Some setups achieved higher accuracies when compared to the other systems. IMUs can also provide privacy to the users while monitoring the daily life routine [17].
Previous works show a few flaws that we address with our proposed methods. A few studies are based on fused-sensor systems that fail to demonstrate effectiveness in a variety of environmental settings [18]. Some researchers ignored how outcomes vary with age because they studied only limited age groups. A few methods achieved limited system efficiency due to limited sensor positions [19], [20].
IMU signals can be simple as well as complex [21]. The same feature extraction techniques cannot provide acceptable results for both types of motion patterns [22]. Multiple systems have focused on pre-processing followed by feature extraction [23], [24], [25]. In contrast, our proposed model adds a pre-classification step after pre-processing and before feature extraction. As a result, the proposed system extracts two types of features: active and passive.
Conventional systems extract the same features regardless of the motion pattern type. Here, we propose to extract features based on the two types of motion patterns determined in the pre-classification step. So, our model extracts features from multiple domains for both active and passive motion signals. Motion patterns are challenging for conventional models to sense due to the large discrepancies among active patterns and the resemblances among static signal features [26]. Hence, this model introduces a pre-classification step along with multi-features extraction. Compared to other state-of-the-art methods, these active-passive pre-classified motion patterns provided higher accuracy rates.
The proposed human motion detection model provides three-fold benefits in the form of pre-classification, multi-features extraction, and a smart healthcare system:
1. Usually, motion detection models become so complex in terms of time and computation that they are not able to provide good response times [27]. Therefore, this system was modeled to provide effective human motion detection while keeping these complexity factors in mind. Our proposed model contains the following important steps: filtering the IMU-sensed data, segmenting the data for further processing, classifying active and passive motion patterns, extracting the features for both patterns separately, optimizing the extracted features into codewords, and classifying the human motion with acceptable accuracy rates.
2. Deep learning-based classification has been a great opportunity for researchers [15]. Hence, we have taken advantage of this healthcare e-learning technique and applied LSTM [28] as our system's deep learner and classifier.
3. The system architecture is designed in such a way that it will aid a smart health advisor in making live decisions. A detailed architecture for the smart health advisor is also presented. Acceleration and Euler angles, along with active and passive motion pattern detection, also make our system smart.
The key contributions of this study are as follows:
• The system will support smart healthcare learning and will also provide live feedback to users. Hence, it is a comprehensive approach towards healthcare e-learning.
• A state-of-the-art filter for IMUs has been used. It will help the model in eliminating the noise from the motion sensors based signals.
• Minimum and average gravity has been calculated from the acceleration signal for at-rest motion pattern. It helped in identifying the best possible results from the proposed system.
• A unique pre-classification technique has been introduced. It will help the model identify the motion patterns in advance.
• Active and passive patterns will better support the human healthcare e-learning models. Due to defining these two types of patterns, the system has a better opportunity to classify the motion correctly.
• The deep learner LSTM can discriminate the patterns accurately enough to increase the efficiency and the effectiveness of the model.
• We have applied the model for diversified experimentation over two different datasets of human motion activities. A thorough comparison with other state-of-the-art methods is also a strength of this study.
This paper is organized into six sections. Section II contains related work in the field. Section III gives the proposed model's architecture in detail. Section IV presents the outcomes of the experimentation phase and provides a comparison with conventional approaches in the field. Section V provides a discussion of the smart healthcare advisor system. Lastly, Section VI gives the concluding remarks and mentions future research ideas.

II. LITERATURE REVIEW
Researchers have used a variety of indoor and outdoor environmental sensors for smart healthcare learning. This section provides background on IMU sensor-based human motion detection systems for indoor and outdoor environments.

A. IMU SENSORS FOR INDOOR-ENVIRONMENTS
In IMU and indoor environment-based models, different researchers have proposed a variety of systems. Petropoulos et al. [15] proposed a posture recognition and correction system via IMUs attached to the human body and signals are cleansed through an explicit complementary filter. However, healthcare monitoring is not limited to body postures. In [29], Xia et al. suggested a human activity recognition technique. First, they built a large synthetic dataset using IMU. Then, they proposed a model to align the distributions of low-level and high-level virtual along with real data. They also utilized three publicly available datasets to test the performance of their system. The proposed method performed very well over one dataset, whereas the performance was not very good for the other two. Yang et al. [30] have proposed a wearable device using different sensors by collecting muscles-related motion information. Features are calculated and activities are classified via five well-known classifiers. However, the technique was not able to process the dynamic activities recognition without an air pressure sensor.
Jalal et al. [31] proposed a human activity recognition system for smart home indoor environments. They suggested using depth silhouettes and R transformation to recognize the activities of disabled senior citizens. The system was trained via hidden Markov models. The system showed good results but was not able to identify the active patterns correctly. In [32], Lai et al. identified a problem related to elderly-body-posture and proposed its solution. They have utilized collaborative accelerometers as the sensing devices placed over multiple body parts including the neck, waist, and thigh. Based on a dynamic motion detection model, they identified the pose by recognizing the activity. But the proposed method couldn't detect the body situation after a collision i.e. active pattern. Shloul et al. [33] introduced a student's health exercise recognition framework to recognize students' indoor activities for physical education. The system used a modified Quaternion-based filter, data fusion, segmentation, static-kinematic patterns identification, features extraction along with optimization, and classification via extended Kalman filter-based neural networks. However, the classification results and analysis show that the system was not able to achieve high accuracy rates. The authors in [34] proposed a sequential steps-pretrained deep model selection for features classification. Initially, they considered two pre-trained models and fine-tuned through layers addition or deletion. Further, deep transfer learning was utilized to train the models and engineer the features via fully connected and average pooling layers. Moreover, discriminant correlation analysis was performed to fuse them together followed by optimization through an improved moth-flame optimization algorithm. Extreme learning machine was used for the final classification.

B. IMU SENSORS FOR OUTDOOR-ENVIRONMENTS
Recent approaches in sensors-based systems have shown a great influence on outdoor environment-based activities.
To monitor human motion, researchers have proposed multiple types of sensor fusions for outdoor environment sensing. But IMUs are considered to be the best when identifying human motion outside due to their feasibility. Li et al. [35] aimed to provide a system for pedestrian multi-motion recognition. Based on MEMS-IMU, they proposed pre-processing the data from IMU, then performing. They have used a machine learning technique to classify the activities. However, their system did not filter the raw data correctly. In [36], Mäkela et al. established threefold contributions. First, they offered a publicly available dataset named VTT-ConIoT. Then, they used a benchmark baseline for human activity recognition. Finally, they provided an analysis of their dataset's usefulness. The setup was in a highly regulated environment; therefore, there is still a need for data collection and experimentation in real-time scenarios.
Hölzemann and Laerhoven [37] presented a technique that can monitor basketball players' actions. They used a wrist-worn inertial sensor, which was able to recognize short actions. However, the limitations of this technique include not being able to recognize all types of active and passive actions with high accuracy. In [38], Kondo et al. proposed a detailed movement recognition method for amateur soccer players. The 3-axis acceleration data of six soccer movements have been utilized to validate the system. They also used the ensemble bagged trees classification method to test the system. However, the proposed system only detected six types of movements, requiring more effort. In [39], multiple gait recognition systems using lower limb exoskeletons have been reviewed. Multiple levels of data fusion, different features, a variety of pre-processing methods, and diverse classification models have been adopted. However, there are multiple reasons why accuracy does not reach 100%, including interference in signal acquisition, insufficient data for experimentation, and traditional classification algorithms that are not accurate enough.
An internet of healthcare things model is proposed in [40]. The IMU signals have been registered and filtered using a Butterworth low-pass filter. Next, the signals are segmented into windows of a specified size. Then, the features are represented through autoregressive coefficients, signal magnitude area, tilt and roll angles, mean, standard deviation, power of acceleration signal, and entropy of jerk signal. Further, the feature vector is scaled and normalized. Finally, random forest, multi-layer perceptron, support vector machines, and naïve Bayes algorithms are utilized to classify outdoor activities like walking, jogging, and climbing. The authors in [41] presented a system to generate automatic instructions and recognize worker activities in real time. They have integrated a convolutional neural network (CNN), a support vector machine (SVM), and a region-based CNN to recognize the tasks. After acquiring the reference videos, video frames followed by features were determined. Then, data is saved into a database and graphic instructions are designed. Test video material is acquired and, further, features are determined again using CNN and SVM.
VOLUME 10, 2022

III. PROPOSED SMART HEALTHCARE LEARNING
To address the shortcomings mentioned in the previous section, the suggested method applies a combination of algorithms chosen on the basis of earlier experiments [4], [8], [26], [33]. We use two publicly available datasets for this study, namely, REALDISP [42] and wearable computing [43]. Both datasets use IMU sensors to acquire the motion signals. A Quaternion-based filtration technique proposed in [26] has been utilized to denoise the IMU signals [44]. Next, to examine the effect of window length on the results, we have segmented the signals into windows of 5, 10, 15, and 20 seconds. Then, active or passive motion patterns have been identified, where active motion patterns are complex and variable in nature. In contrast, passive motion patterns are those signals that are taken from low-level motion activities and are static. Further, we have applied features extraction algorithms from different domains for each type of motion pattern, i.e., active and passive. Moreover, the relevant features have been selected from those, followed by activity classification via a deep learning model. The behavior detected in the classification step is then provided to a smart health advisor, and live feedback is given to the user, which is finally sent back to the system in order to improve its efficiency. Fig. 1 describes the overall architecture flow for the proposed healthcare e-learning system.

A. DATA FILTRATION
As the first part of the proposed smart healthcare learning system, data has been acquired from two different datasets consisting of multiple diverse activities. IMUs integrate the accelerometer, gyroscope, and magnetometer sensors' signals. We have used a Quaternion-based wavelet transformed filter. The filter processes IMU signals by handling missing values, bias, and noise to obtain regulated signals for the accelerometer, gyroscope, and magnetometer in the tuning phase [26].
Acceleration data has gravitational errors [45], and the Earth's gravity has been utilized to accommodate the momentary oscillations in the signal. The gravity has been calculated from the most passive, i.e., at-rest, activity of the dataset. It has been calculated as both a minimum and an average gravity, each of which is then subtracted from the acceleration signal. Fig. 2 shows the filtered acceleration signal with both minimum and average gravity deducted. Algorithm 1 describes the process of finding the minimum and average gravities from the at-rest activity and the gravity removal process. Next, because drift errors are present in gyroscope data [46], Quaternion-based mapping [47] and gradient descent [48] were applied to the rate of change to obtain normalized signal data.
Moreover, the magnetometer errors have been removed via Earth's magnetic field [49].
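The gravity removal step can be illustrated with a minimal sketch. Algorithm 1 is not reproduced in this text, so the following is only one plausible reading of it: the function names, the sample values, and the choice to work on a single stream of acceleration magnitudes (in m/s^2) are all assumptions made for illustration.

```python
from statistics import mean


def gravity_estimates(at_rest_acceleration):
    """Return (minimum, average) gravity estimated from an at-rest recording."""
    return min(at_rest_acceleration), mean(at_rest_acceleration)


def remove_gravity(acceleration, gravity):
    """Subtract a constant gravity estimate from every acceleration sample."""
    return [sample - gravity for sample in acceleration]


# Hypothetical acceleration magnitudes in m/s^2 (not taken from either dataset).
at_rest = [9.78, 9.81, 9.80, 9.83]
walking = [10.4, 9.2, 11.1, 9.9]

g_min, g_avg = gravity_estimates(at_rest)
walking_min_removed = remove_gravity(walking, g_min)
walking_avg_removed = remove_gravity(walking, g_avg)
```

Running the remaining pipeline on both corrected signals is what allows the two gravity estimates to be compared in terms of final accuracy.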

B. DATA SEGMENTATION ANALYSIS
After filtration, the whole filtered signal taken at once does not expose the required characteristics [50]. Therefore, we applied data segmentation to split the signal into windows [51]. We extracted windows of 5, 10, 15, and 20 seconds from the filtered signals [52]. This segmentation analysis let us compare the results of the different window lengths; based on that comparison, we selected 5-second windows for each sensor's filtered data.
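As a rough illustration of the segmentation analysis, the sketch below cuts one signal into fixed-length windows for each of the four candidate lengths. The 50 Hz sampling rate, the non-overlapping windows, and the function name are assumptions; the text does not state whether the windows overlap.

```python
def segment(signal, sampling_rate_hz, window_seconds):
    """Split a signal into consecutive non-overlapping windows, dropping any remainder."""
    size = int(sampling_rate_hz * window_seconds)
    if size <= 0:
        raise ValueError("window must contain at least one sample")
    return [signal[start:start + size]
            for start in range(0, len(signal) - size + 1, size)]


# Hypothetical 60-second recording sampled at an assumed 50 Hz.
sampling_rate_hz = 50
signal = list(range(60 * sampling_rate_hz))

windows_by_length = {
    seconds: segment(signal, sampling_rate_hz, seconds)
    for seconds in (5, 10, 15, 20)
}
```

Each entry of `windows_by_length` can then be passed through the same downstream steps so that the window lengths are compared on equal terms.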

C. PATTERN TYPES IDENTIFICATION
Following data segmentation, pattern type identification is introduced as a way to pre-classify the motion signal windows. Dynamic time warping (DTW) has been applied to separate the active and passive pattern types in the windowed signal. We have extracted the patterns of at-rest activities as reference patterns for comparison [53]. Fig. 3 represents the reference patterns selected from the acceleration signal for a 10-second window. If the current window's signal data match the reference pattern, the window is considered a passive signal; otherwise, it is considered an active signal. Eq. 1 represents the current window and Eq. 2 shows the reference pattern as: where Euclidean distance has been used to calculate the distance between P and R. Next, the warping path WP has been searched for as: where WP is the grid formed by r_m and p_m. Finally, the DTW has been calculated using:
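The DTW-based pre-classification can be sketched with the textbook dynamic-programming formulation. This is not necessarily the exact form of the paper's equations, which are not reproduced here. The decision threshold and the function names are hypothetical; the text only says that a window matching the at-rest reference is treated as passive.

```python
import math


def dtw_distance(p, r):
    """Classic DTW cost between sequences p and r using absolute (1-D Euclidean) distance."""
    n, m = len(p), len(r)
    # cost[i][j] is the minimal accumulated distance aligning p[:i] with r[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(p[i - 1] - r[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # stay on the same sample of r
                                     cost[i][j - 1],      # stay on the same sample of p
                                     cost[i - 1][j - 1])  # advance both sequences
    return cost[n][m]


def pre_classify(window, reference, threshold):
    """Label a window 'passive' if it is close to the at-rest reference, else 'active'.

    The threshold is an assumed tuning parameter, not a value from the paper.
    """
    return "passive" if dtw_distance(window, reference) <= threshold else "active"
```

In practice the threshold would have to be chosen from data, for example from the spread of DTW distances among known at-rest windows.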

D. MULTI-FEATURES EXTRACTION
After the pattern types identification, this study applied multiple types of feature extraction techniques for both active and passive pattern types. The extracted features are from cepstral coefficients, spectral, and transformation matrix domains.

1) SYNCHROSQUEEZING TRANSFORM
The first feature extracted for the active pattern type is the synchrosqueezing transform (SST), which decomposes complex activity signals into time-varying oscillatory constituents [54]. It is extensively utilized in analyzing and processing multi-component signals such as those from IMUs. Therefore, this paper applied SST over the signals. The formula for calculation of SST is provided as: where M is the number of iterations and T_s^[M](t, γ) is the time-frequency coefficient. Fig. 4 shows SST applied over the active motion pattern signal.

2) SPECTRAL ROLLOFF
The second extracted feature is the spectral rolloff, i.e., the frequency below which a specified proportion of the total spectral energy lies. There is a relation between the signal's frequency and energy [55]. Hence, we have applied it to the complex motion signal patterns. It is calculated as: where N is the total number of frequency ranges, n is the frequency range, fr is the spectral rolloff frequency at which a specified proportion k of the energy has accumulated, and PS_k represents the corresponding spectral magnitude. Fig. 5 shows the spectral rolloff points for the identified active patterns over the REALDISP dataset.
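A minimal sketch of the rolloff computation follows, assuming the common definition in which the rolloff is the lowest frequency bin at which the accumulated power reaches a chosen proportion of the total. The 0.85 proportion, the 50 Hz sampling rate, and the naive DFT are illustrative assumptions; the value of k used in the paper is not given.

```python
import cmath
import math


def power_spectrum(window):
    """One-sided power spectrum of a real signal via a naive DFT (fine for short windows)."""
    n = len(window)
    spectrum = []
    for k in range(n // 2 + 1):
        coefficient = sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                          for t, x in enumerate(window))
        spectrum.append(abs(coefficient) ** 2)
    return spectrum


def spectral_rolloff_bin(spectrum, proportion=0.85):
    """Return the first bin at which the cumulative power reaches proportion * total."""
    total = sum(spectrum)
    running = 0.0
    for index, value in enumerate(spectrum):
        running += value
        if running >= proportion * total:
            return index
    return len(spectrum) - 1


# Synthetic window: 4 full cycles of a sine over 32 samples.
window = [math.sin(2 * math.pi * 4 * t / 32) for t in range(32)]
rolloff_bin = spectral_rolloff_bin(power_spectrum(window))

# Convert the bin index to Hz under an assumed 50 Hz sampling rate.
sampling_rate_hz = 50
rolloff_hz = rolloff_bin * sampling_rate_hz / len(window)
```

For a pure tone, almost all energy sits in one bin, so the rolloff coincides with the tone's frequency; active motion windows with broader spectra push the rolloff higher.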

3) TEAGER ENERGY CEPSTRAL COEFFICIENTS
The first feature extracted for passive patterns is based on the Teager energy operator, which reflects the instantaneous energy of passive signals. It is then used to extract the Teager energy cepstral coefficients (TECC) [56] as: where n is the total number of windows and x is the current window. TECC has several stages, including pre-processing, Gabor filter band, Teager energy operators, framing, averaging, log, and discrete cosine transform with cepstral mean subtraction phases. Fig. 6 explains how the Teager energy affected the passive signals of the system.
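Only the Teager energy operator stage lends itself to a short example. The sketch below uses the standard discrete form, psi[x(n)] = x(n)^2 - x(n-1) * x(n+1); it does not implement the filter, cepstral, or mean subtraction stages of TECC, and the function name is an assumption.

```python
def teager_energy(x):
    """Discrete Teager energy operator: x[n]^2 - x[n-1] * x[n+1] for interior samples."""
    return [x[n] ** 2 - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


# A constant (fully static) signal has zero Teager energy everywhere.
static_energy = teager_energy([2.0, 2.0, 2.0, 2.0])

# A linear ramp has a constant, non-zero Teager energy.
ramp_energy = teager_energy([1, 2, 3, 4])
```

The output is two samples shorter than the input because the operator needs one neighbor on each side.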

4) SPECTRAL FLUX
Another feature extracted over the identified passive patterns is spectral flux. Spectral flux measures the rate of change of the signal's power spectrum. Two consecutive windows are used to compare the power spectrum [57], which is important for passive patterns. It can be calculated as: where SF is the spectral flux for the ith window with WL as the window length and EN is the normalized discrete Fourier transform coefficient. Fig. 7 gives details about four different passive motion patterns over the REALDISP dataset.
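The comparison of two consecutive windows can be sketched as the squared difference between their normalized magnitude spectra, which is a common definition of spectral flux. Normalizing each spectrum by its sum and using a naive DFT are assumptions made for this illustration.

```python
import cmath


def normalized_spectrum(window):
    """One-sided DFT magnitudes of a real window, scaled to sum to 1."""
    n = len(window)
    magnitudes = [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(window)))
        for k in range(n // 2 + 1)
    ]
    total = sum(magnitudes)
    return [m / total for m in magnitudes] if total > 0 else magnitudes


def spectral_flux(previous_window, current_window):
    """Sum of squared differences between consecutive normalized spectra."""
    previous = normalized_spectrum(previous_window)
    current = normalized_spectrum(current_window)
    return sum((c - p) ** 2 for c, p in zip(current, previous))


# Identical windows give zero flux; a jump from a constant to an alternating signal does not.
flux_same = spectral_flux([1, 1, 1, 1], [1, 1, 1, 1])
flux_changed = spectral_flux([1, 1, 1, 1], [1, -1, 1, -1])
```

Low flux between neighboring windows is what one would expect from static, passive activities such as sitting or standing.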

E. DIMENSIONALITY REDUCTION VIA ORTHOGONAL FUZZY NEIGHBORHOOD DISCRIMINANT ANALYSIS
When features are extracted from each window, the dimensionality of the feature vector increases. High-dimensional data can cause high computational complexity; therefore, it becomes important to reduce the number of features. For that purpose, we applied orthogonal fuzzy neighborhood discriminant analysis (OFNDA) due to its ability to maximize the distance between different motion patterns and minimize the distance within the same type of pattern [58]. The fuzzy partition matrix has been extracted from d samples of a activities as: where i represents the activity number, k is the sample number, µ_ik gives the membership grade, λ is the Lagrange multiplier, p is the parameter for fuzzification, and η_i is the chosen radius for each motion pattern. Fig. 8 (a) and (b) show the reduced and optimized features over the REALDISP dataset for active and passive patterns, respectively.

F. LONG SHORT TERM MEMORY CLASSIFICATION
The dimensionally reduced feature vector is provided to the deep learner, LSTM. It is a recurrent artificial neural network that learns from training sequences using feedback connections. The name LSTM refers to both long-term memory and short-term memory. The architecture of LSTM gives a short-term memory that can last for a long time. LSTM consists of three gates, namely, the forget gate, input gate, and output gate [59]. We applied LSTM to the motion pattern recognition problem because it can retain information from previously seen parts of an activity sequence. Fig. 9 explains the working flow of LSTM.
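To make the role of the three gates concrete, the sketch below implements one time step of a standard LSTM cell with scalar input and state. It is a didactic simplification, not the network configuration used in the experiments: the parameter names, the constant weight value, and the toy feature sequence are all hypothetical.

```python
import math

# Hypothetical parameter names: input weight (w), recurrent weight (u), and bias (b)
# for the forget (f), input (i), output (o) gates and the candidate state (c).
PARAM_NAMES = [f"{kind}_{gate}" for kind in "wub" for gate in "fioc"]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, p):
    """One scalar LSTM step returning the new hidden state h and cell state c."""
    forget_gate = sigmoid(p["w_f"] * x + p["u_f"] * h_prev + p["b_f"])
    input_gate = sigmoid(p["w_i"] * x + p["u_i"] * h_prev + p["b_i"])
    output_gate = sigmoid(p["w_o"] * x + p["u_o"] * h_prev + p["b_o"])
    candidate = math.tanh(p["w_c"] * x + p["u_c"] * h_prev + p["b_c"])
    c = forget_gate * c_prev + input_gate * candidate  # long-term memory update
    h = output_gate * math.tanh(c)                     # short-term output
    return h, c


# Run a toy feature sequence through the cell with arbitrary (untrained) weights.
params = {name: 0.5 for name in PARAM_NAMES}
h, c = 0.0, 0.0
for feature in [0.2, -0.1, 0.4]:
    h, c = lstm_step(feature, h, c, params)
```

The cell state `c` is what carries information across time steps, while the forget gate decides how much of it survives each step.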

IV. EXPERIMENTAL SETTINGS AND ANALYSIS
The proposed smart healthcare learning system has been implemented on a laptop with an Intel Core i7-8550U 1.80 GHz processor, 24 GB RAM, and x64-based Windows 10, using MATLAB for experimentation. The proposed model outperformed the compared systems in experiments over two datasets: wearable computing and REALDISP. The activities recognized in these two datasets are vital for the proposed smart healthcare learning system, as the detected activities support the healthcare advisor and give feedback to the users and the system. 10-fold cross-validation has been used to avoid the overfitting problem. Furthermore, a deep learning techniques-based comparison with other similar systems has been performed. Details of the datasets are as follows:

A. WEARABLE COMPUTING DATASET
The first benchmark dataset, wearable computing (WC) [43], has been collected using four accelerometers to capture 5 human motion patterns. These patterns include both active and passive motions such as sitting down, standing up, walking, standing, and sitting.

B. REALISTIC SENSOR DISPLACEMENT (REALDISP) BENCHMARK DATASET
The second nominated dataset is REALDISP [42], which has been collected using nine IMUs placed at the subjects' calves, thighs, arms, and back. The active and passive motion patterns are being captured by 17

C. MODEL EVALUATION AND EXPERIMENTAL RESULTS
Now, the proposed model, with minimum and average gravity error removal, multiple types of window segmentation, multi-features, OFNDA feature optimization, and classification via LSTM, has been evaluated using the above-mentioned two datasets. For this purpose, the trials were repeated five times to evaluate the model's performance precisely. Tables 1 and 2 illustrate the confusion matrices for passive and active motion patterns over the WC dataset, achieving 86.50% and 83.86% mean accuracies, respectively. Tables 3 and 4 portray the mean accuracy rates of 89.19% for passive motion and 85.50% for active motion over the REALDISP dataset. In Table 5, a comparison with other state-of-the-art models has been presented. The evaluation is based upon deep-learning techniques applied in different models for human motion recognition. It is observed that the proposed model has outperformed the conventional methods for indoor-outdoor environments with improved accuracy rates.
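The four reported accuracies can be combined with simple arithmetic. The sketch below assumes unweighted means, which the text does not state explicitly; under that assumption the per-dataset figures agree, to within rounding, with the 85.18% (WC) and 87.35% (REALDISP) given in the abstract.

```python
from statistics import mean

# Mean accuracies (%) as reported for Tables 1 to 4.
reported = {
    "WC": {"passive": 86.50, "active": 83.86},
    "REALDISP": {"passive": 89.19, "active": 85.50},
}

# Unweighted mean per dataset (assumed aggregation).
per_dataset = {name: mean(scores.values()) for name, scores in reported.items()}

# Unweighted mean per pattern type across both datasets.
per_pattern = {
    pattern: mean(scores[pattern] for scores in reported.values())
    for pattern in ("passive", "active")
}

# Overall unweighted mean of the four figures.
overall = mean(value for scores in reported.values() for value in scores.values())
```

Here `per_dataset["WC"]` is 85.18, `per_dataset["REALDISP"]` is 87.345, `per_pattern["active"]` is 84.68, `per_pattern["passive"]` is 87.845, and `overall` is 86.2625, up to floating-point error.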

D. ABLATION STUDY
In addition to the experimental results in the form of accuracy rates, the system's efficiency is also important. If the system's results are output immediately without filtration, they lose quality in terms of robustness and correctness. Therefore, to evaluate the contribution of the Quaternion-based wavelet transformed filter, we performed an ablation study over the WC and REALDISP datasets. Table 6 shows the results of the proposed human motion detection system with and without the proposed filter applied. The accuracies, peak signal-to-noise ratios (PSNR), and mean squared errors (MSE), both with and without the filter, are compared over the WC and REALDISP datasets. It is evident from the comparison that the system's performance has been largely boosted by the proposed filter.
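For readers unfamiliar with the two signal-quality metrics, a minimal sketch follows. The text does not specify which signal serves as the reference or how the peak value is defined, so taking the peak as the largest absolute value of the reference signal is an assumption, as are the function names.

```python
import math


def mse(reference, estimate):
    """Mean squared error between two equally long signals."""
    return sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)


def psnr(reference, estimate):
    """Peak signal-to-noise ratio in dB, using the reference's largest absolute value as peak."""
    error = mse(reference, estimate)
    if error == 0:
        return math.inf
    peak = max(abs(r) for r in reference)
    return 10 * math.log10(peak ** 2 / error)
```

A lower MSE and a higher PSNR both indicate that the processed signal is closer to the reference.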

V. DISCUSSION
This section discusses the use of the proposed system in applications such as a smart healthcare advisor. Health is a vital element of daily living and is associated with many aspects of daily life. Hence, this article proposes a score-based smart healthcare advisor that embeds the proposed human motion recognition system. On a scale of 1 to 10, a perfectly healthy individual will be scored between 8 and 10, whereas a score below 8 will trigger advice to the individual, e.g., to run related tests or perform exercise tasks on an urgent basis. The scores from each health category are averaged to obtain a score on the scale of 1 to 10. Fig. 10 gives an example of such a smart healthcare advisor. Multiple factors can help the advisor score an individual, such as physiological status, psychological status, potential disease analysis, user behavior analysis, elderly fall detection, and emotional health analysis. This article has focused on the user behavior analysis component of a smart healthcare advisor, which contributes to scoring an individual. The proposed system can also help in detecting falls in elderly individuals and in potential disease analysis by following systems such as [65] and [66].
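The scoring rule described above can be sketched in a few lines. The category names come from the text, while the individual scores, the function names, and the returned labels are hypothetical.

```python
from statistics import mean


def advisor_score(category_scores):
    """Average per-category scores, each expected on the 1 to 10 scale."""
    for category, score in category_scores.items():
        if not 1 <= score <= 10:
            raise ValueError(f"score for {category!r} must be between 1 and 10")
    return mean(category_scores.values())


def advisor_feedback(score):
    """Scores from 8 to 10 count as healthy; anything below 8 triggers advice."""
    return "healthy" if score >= 8 else "advice needed"


# Hypothetical scores for three of the categories named in the text.
example_scores = {
    "physiological status": 9.0,
    "psychological status": 8.0,
    "user behavior analysis": 6.0,
}
example_score = advisor_score(example_scores)
example_feedback = advisor_feedback(example_score)
```

In this example a low user behavior score pulls the average below 8, so the advisor would recommend action even though the other categories look healthy.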

VI. CONCLUSION
In conclusion, this paper has focused on multi-features and IMU-based human motion pattern recognition. First, the data is acquired from the selected datasets, i.e., WC and REALDISP. Next, the raw data has been pre-processed using a state-of-the-art IMU filtration technique along with average and minimum gravity removal procedures. Then, the filtered data has been segmented through multiple window types, and a 5-second window has been selected for the proposed model. This segmented data is further provided to the dynamic time warping method for active and passive motion pattern recognition. Further, SST and spectral rolloff have been selected for active pattern features extraction, whereas TECC and spectral flux are extracted for passive patterns. Moreover, an OFNDA optimization technique is utilized for feature vector reduction. Finally, LSTM has been applied to the optimized features of both publicly available datasets. The model achieved mean recognition accuracies of 84.68% and 87.85% for active and passive patterns, respectively. The overall mean accuracy of the proposed model is 86.26%, which implies that the proposed model will be useful for smart healthcare learning tools. A comparison with conventional state-of-the-art methods has shown that the proposed model is beneficial for human motion recognition.
However, this research also has some limitations, such as experiments in restricted surroundings, response time delays, and a limited set of recognized motion patterns. In future, we aim to include a wider variety of motion patterns in our model, covering smart environments, healthcare services, and outdoor complexes, by using multiple types of sensors. We will perform further experiments over the pattern identification techniques and features from different domains in order to improve the results. Moreover, the proposed filter accounts for most of the system's response time and makes the computational cost high. We will further improve the filter by utilizing different techniques for gyroscope-based rotational angles.
MOHAMMED ALARFAJ received the B.S., M. Eng., and Ph.D. degrees in electrical and computer engineering from Oregon State University, in 2011, 2014, and 2019, respectively. He is currently an Assistant Professor of electrical engineering at King Faisal University, where he is also the Head of the Electrical Engineering Department. His current research interests include MIMO, mmWave and wireless communications, signal processing, applications in wireless communication, and sensor networks.
KHALED ALNOWAISER received the Ph.D. degree in computer science from Glasgow University, Scotland. He is currently an Assistant Professor at the Computer Engineering Department, Prince Sattam Bin Abdulaziz University, Saudi Arabia. His research interests include computer vision, optimization techniques, and performance enhancement.
AHMAD JALAL received the Ph.D. degree from the Department of Biomedical Engineering, Kyung Hee University, Republic of Korea. He is currently an Associate Professor at the Department of Computer Science and Engineering, Air University, Pakistan. He was working as a Postdoctoral Research Fellowship at POSTECH. His research interests include multimedia contents and artificial intelligence.
NAWAL ALSUFYANI received the B.Sc. degree in computer science from Taif University, KSA, in 2009, and the M.Sc. degree in information security and biometrics and the Ph.D. degree in electronics engineering from the University of Kent, U.K., in 2014 and 2019, respectively. She is currently an Assistant Professor of computer science at the College of Computers and Information Technology, Taif University. Her research interests include biometrics, computer vision, pattern recognition, and machine learning algorithms.