Complex Human Activity Recognition Using a Local Weighted Approach

Because life expectancy is increasing, the elderly care sector will face a workforce shortage in the coming years. Ambient Assisted Living (AAL) systems can help address this issue. Human activity recognition (HAR), a subset of AAL, offers an efficient way to do so: it can help evaluate the general health and welfare of elderly people by automatically tracking their activities. Lifelogging and home diary applications will reduce the load on physicians and caregivers. Complex activities play a vital role here, as they have high-level semantic characteristics that truly represent the daily life of the user. The main objective is to track these high-level semantic motions with low-cost single-sensor systems and efficient machine learning frameworks. To achieve this objective, a framework is proposed to predict complex human activities from a single sensor using a machine learning approach. Time and frequency features are extracted from the PAAL ADL Accelerometry Dataset and fed to the Locally Weighted Random Forest (LWRF) machine learning algorithm. This algorithm is a hybrid structure that applies local weighting by introducing neighboring samples in the Random Forest tree building phases. The proposed approach achieved 91% accuracy for HAR and 91.3% for gender recognition, outperforming other machine learning algorithms and the previous study on the same dataset. This is the first study that utilizes a locally weighted approach in the accelerometer signal domain. As a prospective application, the proposed framework can be embedded in lifelogging and home diary applications in home environments to track the mental status of elderly people.


I. INTRODUCTION
In recent years, world life expectancy has increased due to advancements in healthcare, which has led to a rise in the elderly population in society [1]. With the reduction of birth rates all around the world, the ageing population makes up a larger proportion of society. In approximately 30 years, 16% of the human population will be over 65 years [...]

The associate editor coordinating the review of this manuscript and approving it for publication was Lei Shu.

[...] the elderly. These technologies can help with rehabilitation, monitoring chronic diseases, and tracking cognitive impairment and mild dementia in older adults [3].

AAL is the combination of several aspects: context awareness, the internet of things, machine learning and sensor technologies [4]. All of these aspects are combined to provide a better quality of life for the elderly and enable them to live independently. Human activity recognition (HAR), a subset of AAL, is a natural combination of these aspects. HAR, which involves analyzing data from different sensor sources to identify characteristics related to a person's activity, is a crucial component of AAL. It can be utilized to promote proactive behavior or even basic cooperation between the person and the environment [2].

Recognition of daily activities via HAR is a good approach for evaluating general health and welfare in elderly people. This evaluation can be done by asking the question ''Is the [...] defined by actions that repeat themselves and possess a single body pose. Examples of simple activities are sitting, running and walking. Simple activities cannot reflect the daily life of users on their own, because user behavior is made up of combinations of several activities. Complex activities, on the other hand, are combinations of simple activities.
For example, eating a meal can be considered a complex activity because it involves sitting and can contain several different hand motions. Examples of complex activities are cleaning, cooking, writing and eating [9]. These activities usually have high-level semantic characteristics that truly represent the daily life of the user. To create an AAL system for elderly care, it is necessary to examine complex activities instead of focusing on simpler ones, because complex activities contain more information about the daily life of a user [12]. However, finding appropriate features is another challenge in complex HAR tasks [13]. Machine learning approaches can fall short of making accurate predictions without prior knowledge of which features have the most representative power [14]. Feature ranking can provide this insight. With the help of ranking approaches, relevant features can be identified easily for recognition tasks [15]. Another research direction for these complex activities is to find robust machine learning frameworks that can cope with limited sample sizes. By their nature, these activities can have short motion cycles and therefore yield a small number of samples. Such limited sample sizes can have a negative effect on the prediction capability of machine learning approaches [5].

To this end, a framework is proposed to predict complex human activities from a single sensor using a locally weighted machine learning approach. The main challenge is to use, instead of multiple-sensor data, a robust framework that combines single-sensor data and machine learning algorithms to predict complex daily living activities. [...]

Tian et al. [10] proposed an ensemble-based filter feature selection (EFFS) approach to optimize the feature set in the human activity recognition task. They extracted wavelet decomposition features and filtered them using the EFFS approach on a private dataset that includes single-accelerometer signals. SVM and kNN were selected as classifiers. They reported that the EFFS approach combined with an SVM classifier can give high accuracy with fewer features. Lu and Tong [11] conducted research on human activity recognition using a single three-axis accelerometer. To reduce the burden of a heavy preprocessing phase, the authors proposed a modified recurrence plot that converts motion signals to images. After this conversion phase, the images go through a tiny residual neural network for classification. They used their own dataset and the ADL open access dataset. Several machine learning approaches were benchmarked, and their approach outperformed the others in terms of accuracy and computation time. Guney and Erdas [16] studied how deep learning architectures affect the human activity prediction rate from single-accelerometer signals. They aimed to perform feature-free classification using a deep Long Short-Term Memory (LSTM) model. They used an open access dataset that has single tri-axis accelerometer data. The authors compared their approach with previous studies and outperformed them in terms of accuracy. Acici et al.
[17] constructed a complex human activity dataset with a single wrist-worn Inertial Measurement Unit (IMU). They extracted time and frequency domain features from motion signals. After feature extraction, they compared traditional machine learning approaches on activity prediction and person identification tasks. The Random Forest model achieved the highest accuracy in all cases. They also found that combining accelerometer and magnetometer signals can increase prediction performance on person identification and complex activity recognition.

Lu et al. [18] provide a different perspective on activity recognition using a single accelerometer. The authors categorize human actions as countable (complex) and [...] Based on their findings, the authors reported that local features have a bigger impact on countable activities than on uncountable ones.

Lv et al. [19] aim to characterize complex human activ- [...]

VOLUME 10, 2022

[...] Each observation in the dataset has an activity and a gender label. The trained classifier uses the available training observations to predict the class label of a test observation. At the last stage, classifier performance is evaluated using various metrics. A general overview of the proposed machine learning approach is given in Fig. 1.

For this study, a publicly accessible human activity recognition dataset called the ''PAAL ADL Accelerometry dataset'' is used [22], [23]. The dataset consists of accelerometer measurements of 52 healthy participants, 26 men and 26 women. The age of the participants ranges between 18 and 77 years. The dataset has 24 daily living activities that include simple movements and complex daily living activities. Activities in the dataset are divided into 6 broad categories (Eating and Drinking, Hygiene, Dressing/Undressing, Miscellaneous and Communication, Basic health indicators, House cleaning). Although these 24 activities sometimes have similar movements, accelerometer signals can capture subtle motion differences between similar activities on all axes. These daily living activities are given in Table 1. Data capture is done with a single wrist-worn accelerometer. The measurements are recorded at 15 Hz. To capture the true nature of the activities, participants were asked to wear the wrist-worn accelerometer in a home or office setting instead of a laboratory environment [22]. Participants repeated every activity 5 times on average. This leads to 6072 recording files in total.

[...] ing classifiers need equal input in order to process data; this is not the case for accelerometer signals, since they have different signal lengths [17]. The other reason is that these signals are not in the same time dimension even though they sometimes possess identical signal lengths. This makes it hard for machine learning algorithms to exploit local patterns in signal data. Another reason is that temporal signal measurements sometimes make it harder for machine learning algorithms to predict the behavior of a motion, since these raw measurements sometimes do not reflect motion [...] [26].
With the contribution of both Random Forest and local weighting schemes, it outperformed other machine learning approaches in multi-channel gait [...] final process, outputs of these trees are combined to achieve a final output [34]. Majority voting over all tree outputs is used for classification tasks, and averaging of all tree output values is used for regression tasks.
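The combination step described above can be illustrated with a minimal sketch. It assumes each tree has already produced a prediction; the function names `majority_vote` and `average_output` and the example labels are hypothetical and chosen only for illustration.

```python
from collections import Counter
from statistics import mean


def majority_vote(tree_predictions):
    """Return the class label predicted by the largest number of trees."""
    counts = Counter(tree_predictions)
    # most_common(1) returns a one-element list of (label, count).
    return counts.most_common(1)[0][0]


def average_output(tree_outputs):
    """Return the mean of the per-tree outputs (regression case)."""
    return mean(tree_outputs)


# Hypothetical outputs of five trees for one query sample.
votes = ["eating", "drinking", "eating", "eating", "writing"]
final_label = majority_vote(votes)

# Hypothetical numeric outputs of three regression trees.
final_value = average_output([1.0, 2.0, 3.0])
```

In this toy case three of the five trees vote for the same label, so that label becomes the forest output; the regression output is simply the arithmetic mean of the tree values.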

The second part of the LWRF algorithm is local weighting. Local weighting is considered a non-parametric learning model that utilizes local relations in the dataset [35]. The points nearest to the query sample are used to build the locally weighted model, rather than building a global model on all available training data. The number of nearest points is usually user-defined, as in the k-nearest-neighbor algorithm. A weight value is assigned to every neighboring data sample in the dataset, and target value estimation is affected by these weight values [36]. Data points that are closest to the query have greater weight values than those that are further from it. Of these closest points, the selected k points are used in the training phase to finally define the label of a query point. In the LWRF algorithm, the local weighting scheme is infused into Random Forest when computing split points and selecting samples for the bootstrap [26]. The novelty of the LWRF algorithm is that, instead of focusing on all of the existing data, it focuses on similar data points, which are defined by distances. By adding weights to these similar data points, it incorporates them into the Random Forest decision making processes. Through this incorporation, Random Forest selects bootstrap samples among these weighted samples.
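To make the neighbor selection and weighted bootstrap concrete, the following is a minimal sketch rather than the LWRF implementation itself. The source does not state the distance measure or the weighting kernel, so Euclidean distance and an inverse-distance weight `1 / (1 + d)` are assumptions here, and all function names are hypothetical. How the weights enter the split-point computation is not shown.

```python
import math
import random


def k_nearest(train_X, query, k):
    """Return (distance, index) pairs for the k training points closest to query."""
    dists = [(math.dist(x, query), i) for i, x in enumerate(train_X)]
    return sorted(dists)[:k]


def inverse_distance_weights(neighbors):
    """Assumed kernel: closer neighbors receive larger weights."""
    return [1.0 / (1.0 + d) for d, _ in neighbors]


def weighted_bootstrap(neighbors, weights, n_samples, rng):
    """Draw a bootstrap sample (with replacement) of neighbor indices,
    where each neighbor is drawn with probability proportional to its weight."""
    indices = [i for _, i in neighbors]
    return rng.choices(indices, weights=weights, k=n_samples)


# Hypothetical 2-D feature vectors and a query point.
train_X = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (5.0, 5.0), (0.0, 0.2)]
query = (0.0, 0.0)

neighbors = k_nearest(train_X, query, k=3)
weights = inverse_distance_weights(neighbors)
sample = weighted_bootstrap(neighbors, weights, n_samples=10, rng=random.Random(0))
```

Each tree of a locally weighted forest could then be trained on such a bootstrap sample, so that training points near the query influence the prediction more than distant ones.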

All experiments are done using a k-fold cross validation (CV) setup. In this setup, the dataset is randomly split into k folds.
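The random fold assignment can be sketched as follows. This is an illustrative, standard k-fold split, assuming each fold serves as the test set once while the remaining folds form the training set; the function names and the seed are hypothetical, and the experiments in this study may have used a different tool for the split.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and deal them into k folds of near-equal size."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index starting at position i,
    # so fold sizes differ by at most one.
    return [indices[i::k] for i in range(k)]


def train_test_splits(n_samples, k, seed=0):
    """Yield (train_indices, test_indices) once per fold."""
    folds = k_fold_indices(n_samples, k, seed)
    for i, test_idx in enumerate(folds):
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train_idx, test_idx


# Hypothetical example: 23 observations, 10 folds.
splits = list(train_test_splits(n_samples=23, k=10))
```

Every observation appears in exactly one test fold, so each one is predicted exactly once over the whole procedure.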

Afterwards, each fold is selected for the testing phase while [...]

Overall prediction results for activity recognition with a 10-fold CV setting can be seen in Table 4. The proposed locally weighted framework achieved 91%, 90.9%, 91%, 99.5%, 0.91 and 0.91 in terms of Accuracy, Precision, Recall, Specificity, Matthews correlation coefficient and F1 score, respectively. As can be seen from Table 4, the LWRF algorithm outperformed the other classifier algorithms and the previous work by Climent-Pérez and Florez-Revuelta [5]. The kNN algorithm comes second and the previous work [5] comes third. Naïve Bayes performed worst when predicting activities from accelerometer signals. These classification results suggest that adding local weights to the Random Forest algorithm can increase accuracy when detecting complex human activities. The confusion matrix of the LWRF algorithm for human activity recognition is given in Fig. 3.

[...] is conducted. In this analysis, the gender recognition capabilities of machine learning algorithms are investigated on the same dataset. The main purpose of this investigation is to validate the capability of the proposed framework when exploiting relationships between human motions and gender characteristics. The obtained results can be seen in Table 5. LWRF achieved 91.3%, 91.3%, 91.4%, 91.2%, 0.83 and 0.91 in terms of Accuracy, Precision, Recall, Specificity, Matthews correlation coefficient and F1 score, respectively. As can be seen from Table 5, the LWRF algorithm outperformed the other classifier algorithms and the previous work by Climent-Pérez and Florez-Revuelta [5] when predicting gender characteristics of human motions. The kNN algorithm comes second and the previous work [5] comes third. Naïve Bayes performed worst when predicting gender characteristics from accelerometer signals. The confusion matrix of the LWRF algorithm for gender recognition is given in Fig. 5.
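For reference, the six reported metrics can be computed from a confusion matrix as in the sketch below. It covers the binary case only (as in gender recognition); for the 24-class activity task the per-class values would need to be averaged, and the averaging scheme used in this study is not stated in the text. The counts in the example are hypothetical and are not results from this study.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Compute standard metrics from binary confusion-matrix counts.

    Assumes every denominator is non-zero (each class is both present
    and predicted at least once).
    """
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)            # also called sensitivity or TP rate
    specificity = tn / (tn + fp)       # equals 1 - FP rate
    f1 = 2 * precision * recall / (precision + recall)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
        "mcc": mcc,
    }


# Hypothetical counts chosen only to illustrate the formulas.
metrics = binary_metrics(tp=90, fp=10, fn=10, tn=90)
```

Note that the Matthews correlation coefficient ranges from -1 to 1, which is why it is reported on a different scale from the percentage metrics.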

Experiments are extended to include feature rank analysis. Feature ranking approaches determine which features are important in decision making processes [14], [15]. In this study, feature ranking is done using a correlation-based feature selection method with a ranker approach for both tasks (Human Activity Recognition (HAR) and Gender Recognition (GR)) [45]. By analyzing feature ranks, the time and frequency signal features that are important for decision making are identified for both tasks [46]. The top 10 ranked features according to the feature rank analysis algorithm are given in [...]

A side-by-side comparison with previous work on the same dataset can be seen in Fig. 6. As can be seen from Fig. 6, for both activity and gender recognition tasks, the LWRF algorithm outperformed the previous study in terms of accuracy. The previous study utilized Random Forest only as a global model.

LWRF, in contrast, focuses on local models. By using local weighting, dataset variability is reduced and therefore accuracy is increased. Weighting samples in the Random Forest processes increases the chance that the algorithm pays attention to similar available data points, whereas the previous study only considers global samples for the Random Forest processes.

The authors who analyzed this dataset mentioned some issues regarding class confusion [5]. They reported that several activities have similar movements and therefore the classification model is confused when making activity predictions. The similarities came from the same position of the hands, symmetry along the time axis, and lack of wrist movement during the activities [5]. The reported most confused activities are ''put on a shoe, take off a shoe, open a bottle, open a box, put on glasses, take off glasses, stand up, sit down, phone call, sneeze or cough and blow nose''. As a final analysis, the proposed framework is compared with the previous study on the most confused activities. As can be seen in Table 7, when using the LWRF algorithm, accuracies are increased in most of the confused activities. The previous study failed to detect activities that have small sample sizes (open a bottle, open a box, take off glasses, sneeze/cough and blow nose), whereas the proposed approach identified these activities with a higher recognition rate. A possible inference from these results is that the LWRF algorithm's inclusion of local weights in the Random Forest decision making process can reduce the amount of data needed compared with a global machine learning model. This analysis also shows that the proposed framework can detect small changes in human motions and identify them correctly in many cases. In addition, ROC curve results of the LWRF algorithm for all activities are given as supplementary material (Figs. 7, 8 and 9). In these figures, the x axis refers to the FP rate and the y axis refers to the TP rate. As can be seen from these results, LWRF maintains good performance in predicting complex activities.

According to the results presented in the ''Empirical Results'' section, the proposed complex human activity recognition framework achieved higher performance than the other machine learning models and the sole previous study on this dataset. The HAR classification results suggest that adding local weights to the Random Forest algorithm can increase prediction quality when detecting complex human activities. The effect of the LWRF algorithm's parameter on the recognition tasks is also investigated, and a neighbor value of 55 is selected as a suitable candidate for the experiments. The gender recognition capabilities of the proposed framework are also investigated, and similar results are achieved. According to these results, the proposed framework can be a viable tool for exploiting relationships between human motions and gender characteristics.

Because life expectancy is increasing, the elderly care sector will face a workforce shortage in the coming years. AAL systems offer a way to overcome this issue. HAR, a subset of AAL, provides an efficient way to tackle this workforce shortage. It can help evaluate the general health and welfare status of elderly people by automatically tracking their activities. For example, lifelogging and home diary applications for dementia will reduce the load on physicians and caregivers. Complex activities play a vital role in these applications, as they have high-level semantic characteristics that truly represent the daily life of the user. Therefore, instead of focusing on simple activities, recent studies pay attention to activities that have complex motion behavior. It is also important to track these high-level semantic motions with low-cost single-sensor systems and efficient machine learning frameworks. To address this challenge, a framework is proposed to predict complex human activities from a single sensor using a locally weighted machine learning approach. The proposed framework has several contributions. First, it is the first study that utilizes a locally weighted machine learning approach in the accelerometer signal domain. Second, the proposed model outperformed the sole previous study on the same dataset when predicting activities and user gender. The novelty of this approach comes from its ability to accurately predict complex activities that have a short movement cycle. These low-sample-size activities can be harder for global models to predict due to lack of data, but the LWRF algorithm's inclusion of local weights in the Random Forest decision making process can mitigate these problems. Moreover, related studies on complex activities covered only a limited number of activity categories.
The proposed approach investigated a dataset that has the largest number of activity categories. The empirical results indicate that this study can provide robust solutions in AAL for assessing the progression of dementia-related diseases in elderly people in home environments by tracking their daily activities.

A hybrid machine learning algorithm called Locally Weighted Random Forest (LWRF) is used as the classifier in this problem domain. It combines the Random Forest classifier with local weighting. Frequency and time domain features are extracted and fed as input to the LWRF algorithm. The LWRF algorithm outperformed the other machine learning algorithms and the previous work on the activity recognition and gender recognition tasks. The obtained results suggest that the LWRF algorithm can distinguish complex activities even with a limited number of samples. Another conclusion that can be drawn from the experiments is that the proposed framework can reduce variation effects in accelerometer signals by introducing local weights in several phases of the Random Forest algorithm. In addition, feature rank analysis on the extracted features revealed that signal magnitude features, RMS and percentiles of the time domain signal hold the most valuable information when determining complex human activities, while correlation between axes and signal magnitude features play an important role in determining gender. The SMA feature is the most valuable one for both tasks. These high-ranked features can be beneficial for HAR applications that focus on a large number of complex activities, and also for differentiating gender in accelerometer signals.
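As an illustration of the kinds of features named above, the sketch below computes SMA, RMS, a percentile and an inter-axis correlation for one window of tri-axial samples. The exact definitions used in this study are not given in the text, so common textbook forms are assumed: SMA as the mean of the summed absolute axis values, a nearest-rank percentile, and Pearson correlation. The function names and the sample window are hypothetical.

```python
import math


def sma(x, y, z):
    """Signal magnitude area (assumed form): mean of |x| + |y| + |z| per sample."""
    return sum(abs(a) + abs(b) + abs(c) for a, b, c in zip(x, y, z)) / len(x)


def rms(signal):
    """Root mean square of one axis."""
    return math.sqrt(sum(v * v for v in signal) / len(signal))


def percentile(signal, p):
    """Nearest-rank percentile for p in (0, 100]."""
    ordered = sorted(signal)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]


def pearson(a, b):
    """Pearson correlation between two axes (assumes neither axis is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


# Hypothetical window of four tri-axial accelerometer samples.
x = [1.0, -1.0, 1.0, -1.0]
y = [2.0, -2.0, 2.0, -2.0]
z = [0.0, 0.0, 0.0, 0.0]

features = {
    "sma": sma(x, y, z),
    "rms_x": rms(x),
    "p75_x": percentile(x, 75),
    "corr_xy": pearson(x, y),
}
```

A feature vector like this, computed per recording or per window, is the kind of fixed-length input a classifier such as Random Forest expects, regardless of the original signal length.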

There are some shortcomings in this study. First short- [...] niques can also be considered to overcome this issue [49].

The other shortcoming of this study concerns the sample size [...] Table 8. As can be seen from Table 8, the dataset used in this study has the highest number of activities [5].

In other studies [9], [17], [18], [19], [20], high variety of [...] [16], [21] are mainly focused on a limited number of activities due to their challenging nature. In addition, as can be seen from [...] challenging task of predicting complex motions. In this study, by contrast, a dataset with the highest number of activities among the compared datasets is investigated with numerous machine learning algorithms using a single accelerometer sensor.

The strengths and limitations of the proposed approach can be summarized as follows. The main strengths of this study over others are: (i) it is the first study that utilizes a locally weighted machine learning approach in the accelerometer signal domain; (ii) the proposed framework outperformed the previous study on the same dataset and other machine learning approaches when predicting complex human activities; (iii) the proposed framework can reduce dataset variation effects by combining a local weighting scheme with the Random Forest algorithm; and (iv) by utilizing neighboring data points and local weighting in the prediction phase, this approach can be considered for predicting activities with a limited number of samples. The limitations of the proposed framework compared with other studies are: (i) the number of neighbors (k value) needs parameter tuning; and (ii) LWRF can reach a high computational complexity depending on the number of samples and features used. Another problem is that the sample size is too small to utilize deep learning architectures.

As a future direction, one aim is to benchmark the proposed framework on other open access human activity recognition datasets. Datasets that have different types of sensors can be considered. Fusion of these sensors and its impact on prediction tasks can be investigated. This will increase the validity of the proposed approach on different sensor [...]