Detecting Elderly Behaviors based on Deep Learning for Healthcare: Recent Advances, Methods, Real-World Applications and Challenges

Machine learning has been applied in the healthcare domain to develop smart devices that improve the lives of elderly persons. Caring for the elderly is a critical issue that calls for automation. In response, many researchers have developed smart devices based on deep learning algorithms that detect elderly behavior in order to improve elderly healthcare. Despite the progress made in applying deep learning algorithms to elderly healthcare systems, to the best of the authors' knowledge no comprehensive review of recent developments in this research area, with a specific focus on deep learning, has been published. In this paper, we present a comprehensive review of the recent advances, methods and real-world applications in developing smart devices for detecting elderly behavior in the smart home, smart clinic, smart hospital and smart elderly nursing home. The theories of the deep learning algorithms, the recent developments regarding the applicability of deep learning in elderly healthcare systems, and case studies are discussed. A taxonomy based on the data extracted from the literature on deep learning in elderly healthcare systems is created to make it easier to identify areas that need more attention. The review shows that the deep learning algorithm that has received the most attention from researchers is the convolutional neural network architecture and its variants. To support future development of the research area, we highlight the challenges associated with applying deep learning algorithms in elderly healthcare systems and point out new perspectives for future research. The research community can use our review as a benchmark for proposing novel deep learning based smart devices to detect elderly behavior for elderly healthcare systems.
Industries and organizations can use the paper as a guide in selecting machine learning based smart devices for detecting elderly behavior for elderly healthcare support.

homes, smart healthcare, smart environment monitoring, and smart homeland security. The interconnection of wireless sensor networks and the internet protocol produces the internet of things (IoT), which makes it possible to connect everyday objects to the internet [4] for easy, ubiquitous monitoring in real time. Different wireless sensor networks have been developed to support elderly persons, such as smart homes for the elderly and active assisted living devices that can help prevent the accidents and the physical, mental and social problems typically experienced by elderly persons [5]. Cutting-edge technologies such as the IoT and deep learning algorithms have revolutionized the monitoring of the medical condition of elderly persons [6]. This has led many researchers to develop smart devices based on deep learning algorithms for monitoring elderly behavior, with the aim of providing healthcare support and better welfare. Examples include smart devices for detecting elderly falls, depressed disorder, body gestures, facial gestures and hand gestures. Reviews on elderly healthcare based on machine learning have been published in the literature. For example, [7] presents a review of the applications of shallow machine learning algorithms for the prognosis of dementia in elderly people. The review mainly focuses on shallow machine learning algorithms such as the support vector machine, decision tree, Bayesian network, artificial neural network, etc. [8] conducted a review on elderly healthcare systems based on shallow machine learning algorithms. That paper reviews machine learning techniques for detecting chronic ailments in elderly persons and discusses the strengths and weaknesses of those techniques. However, the previously published reviews focus mainly on shallow machine learning algorithms in the domain of dementia and chronic ailments for elderly persons.
To the best of the authors' knowledge, no comprehensive literature review has been conducted on the detection of elderly behavior based on deep learning algorithms to support elderly healthcare, despite the fact that the research area is developing fast. In this paper, we conduct a comprehensive review of the adoption of deep learning algorithms in developing smart devices that provide healthcare support to elderly people. The contributions of the paper are summarized as follows:
- The fundamental flow of the deep learning algorithms in enhancing the detection of elderly behavior to provide improved health services to elderly persons is presented. The review classifies the literature according to the primary deep learning architectures: convolutional neural network (CNN) (and its variants such as GoogleNet, AlexNet, YOLO and VGG), long short term memory (LSTM), and hybrid deep learning architectures.
- A taxonomy of the deep learning algorithms adopted for developing smart devices to provide healthcare support to elderly persons is created.
- The pattern of research on deep learning based healthcare support technology for elderly persons is depicted, showing significant recent interest from both academia and industry.
- A new taxonomy of elderly behavior is created based on the data extracted from the literature: abnormal behavior, falls, facial recognition, hand gesture recognition and depressed disorder.
- The review finds that the CNN and its variants are the most heavily adopted deep learning architectures for developing smart devices to provide healthcare support to elderly persons in smart environments.
- The real-world applicability of deep learning in developing smart devices to support the healthcare of elderly persons is outlined.
- The open research challenges that prevent the full potential of deep learning algorithms from being realized in developing smart devices to support the healthcare of elderly persons are discussed. Possible research directions for resolving the open challenges are outlined to promote research on adopting deep learning algorithms for elderly healthcare support.
The remainder of the paper is structured as follows. Section II presents the methodology for the review. Section III presents the technical background of deep learning algorithms. Section IV describes the smart environment and its applications. Section V presents the detection of elderly behavior in the smart environment. Section VI presents the advances made in adopting deep learning to develop smart devices for elderly healthcare. Section VII presents the performance metrics used by different projects. Section VIII discusses the learning paradigms. Section IX presents different smart devices developed based on deep learning. Section X presents the datasets. Section XI shows the case studies for machine learning based smart devices. Section XII presents a discussion of the research area. Section XIII presents the challenges and future work, and Section XIV presents the conclusions.

II. METHODOLOGY
In this section, we discuss the detailed procedure used to conduct the literature review, to ensure that sufficient literature on the applications of deep learning algorithms to develop smart devices for supporting elderly healthcare is covered. The section covers keyword formulation, search strategies, inclusion and exclusion criteria, eligibility and data extraction. The keywords formulated in the previous section were used for retrieving articles from the academic search engines and abstract databases. The review strictly targeted peer-reviewed articles in reputable venues such as journals, conference proceedings, edited books, etc., indexed in credible academic search engines and abstract databases including DBLP Computer Science, ACM Digital Library, ScienceDirect, SpringerLink, Ei Compendex, IEEE Xplore, Web of Science and Scopus. The references of the articles found to match the keywords were scanned to extract further relevant papers. The search in the academic search engines and abstract databases was conducted in two phases. The first phase of the search was conducted between October 10, 2021 and October 20, 2021.

C. INCLUSION AND EXCLUSION CRITERIA
Inclusion and exclusion criteria were used as the benchmark for extracting only the relevant literature. The articles retrieved from the academic search engines and abstract databases were screened for relevance based on their title, abstract, conclusions and content. The inclusion and exclusion criteria used for article selection are outlined as follows. To be included, an article had to fulfil the following criteria. The main focus was on articles that applied any deep learning architecture to develop a smart device for supporting elderly healthcare, so any article describing such a device was included in the review. Articles had to be published in a peer-reviewed journal, conference proceedings or edited book. In terms of language, only articles written in English were included. On the other hand, the following categories of articles were excluded: articles that described smart devices for supporting elderly healthcare in a smart environment developed with a technology other than deep learning; articles published in the form of abstracts, editorials, commentaries, keynote speeches and textbooks; and articles that adopted statistical or conventional machine learning algorithms for developing smart devices to support elderly healthcare.

D. ELIGIBILITY
For an article to be eligible for selection, it must describe an empirical application of any deep learning architecture in developing a smart device for elderly healthcare support. The articles retrieved from the academic search engines and abstract databases were subjected to a set of criteria for determining which articles were eligible for selection. The search of the academic search engines and abstract databases returned an initial set of articles. This set was later reduced after a significant number of articles were eliminated based on titles and duplication. The screening continued at the second stage with the rejection of articles based on abstracts and conclusions.
This article has been accepted for publication in IEEE Access. This is the author's version which has not been fully edited and content may change prior to final publication.

E. DATA EXTRACTION
We extracted data from the empirical studies that reported deep learning based smart devices for elderly healthcare. A table was used to tabulate the extracted data, which made it easy to identify further redundant and duplicate articles for elimination. Two reviewers independent of the research were invited to assess the data against the objective of the study. Once the comments from the independent reviewers were received, it was observed that both agreed that the data conform to the study objective. As such, there was no discrepancy between the comments of the two independent reviewers.

III. THEORETICAL BACKGROUND OF DEEP LEARNING ALGORITHMS
This section discusses the operation of the different deep learning architectures that were found to be used in developing smart devices for smart environments to support elderly healthcare. We considered only the deep learning architectures relevant to the study; other architectures were not considered because they are outside its scope. Figure 2 is a taxonomy created from the deep learning architectures extracted from the papers that adopted deep learning for developing the smart devices.

A. CONVOLUTIONAL NEURAL NETWORK
The CNN operates on 3-dimensional volumes described by their height, width and depth. The height and the width correspond to the spatial dimensions of the image (a grey-scale image has a single channel), while the depth corresponds to the colour channels, i.e. red, green and blue (RGB), of the input image passed to the CNN layers. The term convolution refers to sliding a filter (also called a kernel) over the pixel values of the input image and, for each window position determined by the specified stride, combining the pixel values with the corresponding filter values [9]. The CNN is found to be the most widely known deep learning algorithm [10][11][12][13][14][15], and it is able to identify relevant features without any form of human intervention [16]. The CNN is used to solve problems in the areas of computer vision [17], speech processing [18], facial recognition [19] and bioinformatics [20]. The entire operation of the CNN is made up of six components/layers, namely: the image input, convolution, activation, pooling, fully connected and output layers.
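The convolution, activation and pooling steps described above can be illustrated with a minimal pure-Python sketch. It assumes a single-channel image, no padding, ReLU as the activation and a 2*2 max-pool; the function names, the toy image and the edge-detecting filter are illustrative choices and are not taken from the reviewed papers.

```python
def conv2d(image, kernel, stride=1):
    """Slide `kernel` over `image` (no padding) and take the weighted sum per window."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = (len(image) - kh) // stride + 1
    out_w = (len(image[0]) - kw) // stride + 1
    return [[sum(image[i * stride + a][j * stride + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]


def relu(feature_map):
    """Activation step: negative responses are set to zero."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def max_pool(feature_map, size=2, stride=2):
    """Pooling step: keep the largest value in each size*size window."""
    out_h = (len(feature_map) - size) // stride + 1
    out_w = (len(feature_map[0]) - size) // stride + 1
    return [[max(feature_map[i * stride + a][j * stride + b]
                 for a in range(size) for b in range(size))
             for j in range(out_w)]
            for i in range(out_h)]


# Toy 6*6 grey-scale image whose intensity grows from left to right.
image = [[float(6 * r + c) for c in range(6)] for r in range(6)]
# Illustrative 3*3 filter that responds to left-to-right intensity increases.
kernel = [[-1.0, 0.0, 1.0]] * 3

feature_map = max_pool(relu(conv2d(image, kernel, stride=1)))
print(feature_map)
```

With a stride of 1 the 6*6 image yields a 4*4 convolution output, which the 2*2 pooling step reduces to 2*2; a larger stride shrinks the output further.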
According to [21], for a given input image a , with a resultant height, width and depth of The activation function that is mostly used for the CNN is the

1) THE VGG ARCHITECTURE
The VGGNet is a variant of the CNN architecture proposed in [22]. The VGG model usually takes 224*224 RGB images as input, with pixel values ranging from 0 to 255; during pre-processing, the mean RGB value computed over the whole ImageNet training set is subtracted from each pixel. The pre-processed images are fed to the weight layers of the VGG network, which pass them through an ordered stack of convolutional layers. The VGGNet uses small 3*3 filters; a stack of three 3*3 convolution layers has the same effective receptive field as a single convolution layer with a 7 x 7 filter. In its 19-layer configuration, the VGGNet is composed of 19 weight layers, of which 16 are convolutional and 3 are fully connected, together with 5 pooling layers. In every variant of the VGGNet there are 2 fully connected layers with 4096 channels each, followed by another fully connected layer with a softmax that handles the classification problem. Architecturally, the VGGNet can be viewed as follows. The first two convolutional layers each consist of 64 filters of size 3*3 applied with a stride of 1, which yields a volume of 224*224*64. The next layer is a max-pooling layer with a 2*2 window and a stride of 2, which downsizes the 224*224*64 volume to 112*112*64 in height and width. Next come two convolutional layers with 128 filters each, which bring the volume to 112*112*128; the following pooling layer reduces it to 56*56*128. Two further convolutional layers with 256 filters each are then applied, after which the size is reduced to 28*28*256.
Then, there are two stacks of 3 convolutional layers each, separated by a max-pool layer. Lastly, the 7*7*512 volume is flattened and passed to the fully connected layers with 4096 channels, and the softmax function produces the output over 1000 classes.
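The volume sizes quoted above can be checked with a short shape trace. The sketch below assumes 3*3 convolutions with stride 1 and a padding of 1 (so that a convolution preserves height and width), 2*2 max-pooling with stride 2, and 512 filters in the last two stacks; the padding value and the block list are assumptions made for illustration, arranged to follow the walk-through in the text rather than any one published VGG configuration.

```python
def conv_out(size, kernel=3, stride=1, padding=1):
    """Spatial size after a convolution (assumed padding of 1 keeps the size)."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, window=2, stride=2):
    """Spatial size after max-pooling."""
    return (size - window) // stride + 1


# (number of convolutional layers, number of filters) per block,
# following the walk-through in the text; an illustrative assumption.
BLOCKS = [(2, 64), (2, 128), (2, 256), (3, 512), (3, 512)]


def trace_vgg(input_size=224, blocks=BLOCKS):
    """Return the (height, width, depth) volume after each block's pooling layer."""
    size = input_size
    shapes = []
    for n_convs, filters in blocks:
        for _ in range(n_convs):
            size = conv_out(size)
        size = pool_out(size)
        shapes.append((size, size, filters))
    return shapes


shapes = trace_vgg()
for shape in shapes:
    print(shape)

# Number of values handed to the first fully connected layer after flattening.
flattened = shapes[-1][0] * shapes[-1][1] * shapes[-1][2]
print(flattened)
```

Each pooling layer halves the height and width, so five pooling layers take a 224*224 input down to 7*7, which matches the 7*7*512 volume that is flattened before the fully connected layers.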

2) THE ALEXNET ARCHITECTURE
The AlexNet is a well-recognized variant of the CNN [10] that attained a high level of performance in image recognition and classification. [10] proposed the AlexNet, which enhanced the learning ability of the CNN by adding depth and applying additional parameter optimization techniques. The AlexNet uses dropout regularization and the ReLU [23] activation function to improve the convergence rate [24]; it has a depth of 8, an error rate of 16.4 and an input size of 3 × 227 × 227 on the ImageNet dataset.

3) THE GOOGLENET ARCHITECTURE
GoogleNet was introduced by [25] and is commonly used for image classification and object detection. GoogleNet originated from what is known as the inception network. It is a CNN with a 27-layer structure that includes 9 inception modules, and the model is also used to handle many other tasks.

B. LONG SHORT TERM MEMORY
The LSTM is a variant of the recurrent neural network (RNN) that is able to learn from time series and make predictions, a capability that is useful for solving dynamic and difficult problems [26]. The LSTM addresses limitations of the RNNs [27] such as oscillating weights and vanishing and exploding gradients [28]. An LSTM contains a cell that serves as a memory retaining information over time, together with a number of gates that regulate its operation: the input, output and forget gates [26]. The input gate controls the flow of input data into the cell, and the forget gate controls which parts of the information are retained or discarded. The output gate generates the required output from the cell. The sequence output h_t is obtained based on Eqn. (3).
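For orientation, one time step of the gate mechanism described above can be sketched in the commonly used textbook form. This is a minimal pure-Python illustration and is not necessarily identical to the Eqn. (3) referred to in the text; the parameter names (W_i, b_i, etc.) and the all-zero demo weights are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, v, b):
    """Return W @ v + b for a list-of-lists matrix W and list vectors v, b."""
    return [sum(w * x for w, x in zip(row, v)) + bias for row, bias in zip(W, b)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step in the standard formulation (hypothetical parameter names)."""
    z = h_prev + x_t  # concatenation [h_{t-1}, x_t]
    i = [sigmoid(a) for a in affine(params["W_i"], z, params["b_i"])]    # input gate
    f = [sigmoid(a) for a in affine(params["W_f"], z, params["b_f"])]    # forget gate
    o = [sigmoid(a) for a in affine(params["W_o"], z, params["b_o"])]    # output gate
    g = [math.tanh(a) for a in affine(params["W_g"], z, params["b_g"])]  # candidate memory
    # Cell state: keep part of the old memory, add part of the candidate.
    c_t = [f_k * c_k + i_k * g_k for f_k, c_k, i_k, g_k in zip(f, c_prev, i, g)]
    # Sequence output h_t: output gate applied to the squashed cell state.
    h_t = [o_k * math.tanh(c_k) for o_k, c_k in zip(o, c_t)]
    return h_t, c_t


# Demo with hidden size 1 and input size 2; all weights are zero for illustration,
# so every gate evaluates to 0.5 and the candidate memory to 0.
params = {}
for gate in ("i", "f", "o", "g"):
    params["W_" + gate] = [[0.0, 0.0, 0.0]]
    params["b_" + gate] = [0.0]

h, c = [0.0], [2.0]
for x_t in ([1.0, 0.5], [0.2, -0.3]):
    h, c = lstm_step(x_t, h, c, params)
print(h, c)
```

With zero weights the forget gate halves the cell state at every step, which makes the role of each gate easy to follow by hand; trained weights would make the gates depend on the input and the previous output.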

C. HYBRID DEEP LEARNING ALGORITHMS
Hybridization of models is typically practised in deep learning to achieve more robust, reliable and efficient results than a single-model approach. The hybridization process can involve two or more models that are combined either so that one complements the weaknesses of another through optimization, or so that one evaluates and transforms the data at certain processing stages before it is processed further by the subsequent models. Deep learning can be hybridized along different tracks, such as hybridizing two or more different deep learning architectures or hybridizing a deep learning architecture with a shallow traditional algorithm. For example, [29] hybridized ResNet-50 and support vector machine (SVM), [30] hybridized CNN-LSTM and auto-encoder-CNN-LSTM, [31] hybridized CNN-LSTM, [29] hybridized CNN and SVM, [32] hybridized VGG-16 and LSTM, [33] hybridized CNN and deep reinforcement learning (CNN-DQN), [34] hybridized ANN, CNN and LSTM (ANN-CNN-LSTM), and [35] proposed a hybrid of MLP and auto-encoder called the dense auto-encoder (DAE) and a hybrid of CNN and auto-encoder known as the convolutional auto-encoder (CAE). In these examples, each constituent algorithm in the model plays a particular role, and the algorithms work cooperatively to generate the required result. In [36], a hybridization of the genetic algorithm and dynamic time warping (GA-DTW) is proposed to optimize the operation of DTW, making it an instance where optimization is achieved through the hybrid system. It is found that the hybridized intelligent systems perform better than their constituent algorithms.

IV. ADVANTAGES AND DISADVANTAGES OF DEEP LEARNING
Deep learning algorithms do not require an extra step or a separate technique for feature extraction, because they perform feature extraction automatically. Therefore, the computing resources, feature extraction techniques such as PCA, and human effort typically required to process data before feeding it into conventional machine learning algorithms are eliminated. Deep learning can learn complex functions because of its ability to learn features at different levels of abstraction. Today, massive datasets are generated and collected from different sources, commonly in the form of images, audio and video. Deep learning is able to process such very large datasets, which makes deep learning algorithms highly relevant in today's world, where advances in technology make it possible to generate and collect very large amounts of data. The performance of a deep learning algorithm increases as the amount of data increases, because deep learning algorithms require a very large amount of data to perform well. The ability of deep learning to process a large number of features makes it suitable for processing unstructured data. For example, the filters embedded at different layers of the CNN architecture serve as feature extractors. However, the deep learning architecture has the following limitations: its performance suffers when the sample size of the data is small; it requires high computing resources to run; and its training takes a long time to converge, with millions of parameters requiring tuning.
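To give a sense of scale for the "millions of parameters" mentioned above, the following sketch counts the trainable parameters of two individual layers. The layer sizes are taken from the VGG description in Section III (a 3*3 convolution with 64 filters on an RGB input, and a fully connected layer with 4096 channels fed by a flattened 7*7*512 volume); the helper names are illustrative, and one bias per filter or unit is assumed.

```python
def conv_params(kernel, in_channels, filters):
    """Weights plus one bias per filter for a kernel*kernel convolution."""
    return kernel * kernel * in_channels * filters + filters


def dense_params(inputs, units):
    """Weights plus one bias per unit for a fully connected layer."""
    return inputs * units + units


first_conv = conv_params(kernel=3, in_channels=3, filters=64)
first_fc = dense_params(inputs=7 * 7 * 512, units=4096)

print(first_conv)  # 1792
print(first_fc)    # 102764544
```

A single fully connected layer of this size already holds over one hundred million parameters, which is consistent with the long training times and high computing requirements noted above.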

V. SMART ENVIRONMENT FOR SUPPORTING ELDERLY PERSON HEALTHCARE
A smart environment is created using the IoT to provide medical support to the elderly person. The IoT is a system of interconnected computing devices that can communicate and share information or data with one another over a network without any human intervention [37]. The IoT therefore involves physical objects that can connect to the internet, and any computing device within the IoT environment is referred to as a smart or intelligent device. Examples of IoT applications include the smart home, smart hospital, smart elderly nursing home, smart clinic, internet of medical things (IoMT), etc. Smart home: A smart home is simply an intelligent home or residence, that is, a systematic connection of internet-connected devices (electrical and services) that allows remote monitoring/surveillance and management of home activities [38]. Switching lights on or off remotely using a mobile phone, or closing doors and windows while far away from home, are examples of what a smart home makes possible. Smart elderly nursing home: A special building or room designated for elderly care is called a nursing home. When smart devices, usually involving cloud computing, artificial intelligence and digital health [39], are deployed in the nursing home to help caregivers take care of elderly people, it is referred to as a smart elderly nursing home. Smart hospital: In the hospital environment, smart devices are incorporated and integrated to support efficient and effective service delivery to patients. These devices help in the early detection of adverse patient conditions before they occur, in order to prevent or mitigate any further worsening of health, and also help manage the inflow and outflow of patients in the hospital [40]. The internet of medical things (IoMT) is also referred to as the healthcare internet of things.
The IoMT provides a direct internet connection and interaction between medical devices and applications and healthcare information technology devices, usually sensors, to improve the delivery of medical services [41]. Beyond the smart environment, the study found that computer vision plays a vital role in supporting elderly healthcare. Computer vision enables computers or smart devices to observe visual inputs in the form of digital images (pictures) or videos and then take the necessary actions, or simply provide recommendations, based on the information derived from the visual inputs. Computer vision is applied in classification, segmentation and object recognition [42]. Table 1 presents the different smart environments used by different projects in developing deep learning based smart devices to support elderly healthcare.

TABLE 1 THE SUMMARY OF THE TECHNOLOGY USED FOR CREATING THE SMART ENVIRONMENT/TASK FROM DIFFERENT PROJECTS
Reference | Technology | Smart environment/task
[43] | Security and privacy | Privacy protection
[44] | IoT | Sensor network
[29] | Computer vision | Hand gesture recognition
[3] | IoT | Projection augmented reality
[45] | Decision support | Depressed disorder prediction
[30] | IoT | Smart home
[6] | IoT | IoMT
[31] | IoT | Smart home and hospital
[46] | IoT | Smart home
[47] | Computer vision | Dynamic Motion and Shape Variations for Elderly Fall Detection
[48] | Computer vision | Medical (satisfaction) security system
[49] | IoT | Smart home
[36] | IoT | Smart home
[50] | IoT | Smart nursing home
[29] | Computer vision | Hand gesture classification
[51] | IoT | Smart home
[32] | IoT | Smart home
[33] | IoT | Smart hospital
[34] | IoT | Smart home
[35] | IoT | Smart home
[52] | Computer vision | Object/person recognition/identification
[53] | IoT | Home/Nursing home and hospital
[54] | IoT | Nursing home
[55] | Computer vision | Fall detection and alarm system

VI. DETECTING ELDERLY PERSON BEHAVIOUR BASED ON DEEP LEARNING ARCHITECTURES
Many systems have been developed for monitoring the behavior of elderly persons in the smart home, smart clinic, smart hospital and smart elderly nursing home. The focus of our review is on the development of systems that detect the behavior of the elderly person based on deep learning algorithms (refer to Section II). It is found that deep learning algorithms are making inroads into the development of assistive technology for providing healthcare to the elderly. In the papers analyzed on the detection of elderly behavior using deep learning algorithms, we found that the dominant cases are the detection of abnormal behavior, falling, hand gesture, body gesture, facial recognition, and depressed disorder. Data are collected from the monitoring of the activities of the elderly person and analyzed to provide better insight into the best way to deliver healthcare to the elderly person, as well as to improve the quality of the smart device or of the algorithm applied for the data analytics. This in turn provides medical support and improves the quality of life and welfare of elderly persons. In view of the evidence in the literature, computer vision is playing a significant role in providing healthcare to the elderly. We analyzed the literature and extracted the attributes related to each of the elderly behaviors to create the taxonomy shown in Figure 3. [30]. Deep learning algorithms can be used to detect the abnormal behavior of elderly persons based on the attributes previously listed.
Falls: Elderly falls are found to be the leading cause of fatal and non-fatal injuries among elderly persons. It is estimated that 27,000 elderly persons died, 2.8 million were treated in emergency departments and 800,000 were hospitalized as a result of falling. A study found that in 2014, 28.7% of elderly persons fell, and 7 million injuries from 29 million falls were recorded. The conventional means of reducing the number of falls include gait and balance assessment, strength and balance exercises, and medication review [56]. Recently, deep learning algorithms have been applied to develop smart devices for detecting elderly falls, as shown in Table 1. Some examples of fall positions are: forward lying, front knees, sideward lying and back sitting-chair [30]. Immediate response to falls by administering treatment increases the chances of fast recovery [43].
Hand gesture: The recognition of hand gestures by a computer can provide realistic, natural human-computer interaction by permitting an individual to point at or rotate a computer model with a simple hand gesture. Hand gestures are useful for controlling appliances in the house. A hand gesture can be dynamic or static [57]. The elderly person can use a hand gesture recognition system built on a deep learning algorithm to make a request. For example, elderly people with limb disabilities can use the hand gesture recognition system to communicate with family members and caregivers to request water, a meal or medication, to seek help, or to go to the toilet [29].
Facial recognition: The ability of elderly persons to recognize faces deteriorates, and they find it difficult to recognize people quickly. The errors that elderly people typically make when identifying people can cause them to miss out on the benefits of visits and delivery services intended for their welfare. Such errors also make it easier for unknown people to carry out a home invasion to commit crime. This makes it necessary to apply deep learning to build accurate facial recognition systems that elderly persons can use to recognize people's faces. The facial recognition algorithm takes care of facial occlusion and rotational angles [3].
Depressed disorder: Depressed disorder is believed to be one of the main health challenges facing elderly persons. As a result, an effective predictive model is required to predict the possibility of depressed disorder in an elderly person, so that a quick intervention can avert the disorder if it is predicted correctly. The attributes for the depressed disorder are found to be as follows: enjoyed life, restless sleep, felt depressed, was happy, felt sad, could not get going, everything was an effort, and felt lonely [45].
Body gesture: Body gesture is one of the significant forms of non-verbal communication. It includes the movement of different parts of the body, which allows a person to communicate feelings, thoughts and emotions in a variety of ways. Happy and angry faces are well known all over the world [58]. A body recognition system can be developed based on a deep learning algorithm to detect the body gestures of elderly persons and so reveal their behavior. Figure 4 shows the taxonomy of the deep learning algorithms and the tasks associated with the behavior of the elderly person. The hybrid deep learning algorithms and the CNN were heavily relied upon for the detection of elderly behavior. A detailed discussion of the adoption of deep learning algorithms to develop smart devices that provide healthcare to the elderly person is provided as follows:

1) APPLICATION OF CONVOLUTIONAL NEURAL NETWORK FOR DEVELOPING SMART DEVICE FOR ELDERLY HEALTHCARE
The CNN has been found to be used by different projects to develop smart devices for providing healthcare support to the elderly person in the smart environment. For example, [44] developed an activity recognition system based on the CNN to improve the accuracy of detecting the activities of elderly persons. The CNN based activity recognition system analyses the signals collected from distributed sensor networks embedded in the clinics of a hospital. The proposed system was able to detect the daily activities of elderly persons in a hospital or nursing home that are considered falling cases. It performs better than 12 classical approaches such as activity windowing (AW), fixed sample windowing (FSW), time weighted windowing (TWW), mutual information windowing (MI1), etc. [46] proposed a CNN based angel assistance system (AAS) to improve the fall detection accuracy of the AAS for the elderly by minimizing false positive alerts. The CNN model achieved 98% accuracy with a reduction of approximately 17% in false positive alerts compared to the conventional AAS. [51] proposed a CNN for multi-domain activity classification in elderly healthcare. The CNN extracts pattern features and classifies them into six different possible activities. The result shows that the CNN achieved 91% accuracy. Compared with the SVM, the CNN achieved higher accuracy at the cost of a longer task completion time, whereas the SVM achieved lower accuracy with faster task completion. [59] proposed a CNN for an AI-enabled elderly care robot. The CNN based robot assists the elderly person in object identification/recognition: it reveals the identity of an object via voice activation and then notifies the elderly person. The CNN model achieved an accuracy of 95%.
2) CONVOLUTIONAL NEURAL NETWORK VARIANTS FOR ELDERLY HEALTHCARE
[43] developed a GoogleNet based radar sensor for monitoring the movement of the elderly person. The GoogleNet based radar sensor can detect the fall of an elderly person with high accuracy. The performance of the proposed GoogleNet is compared with the classical algorithms, and its accuracy in detecting elderly falls is found to be better than that of AlexNet and VGG-19. [3] proposed a YOLO based projection-augmented reality system for providing auxiliary functions and ensuring the safety of the elderly person. The YOLO based projection-augmented reality is bidirectional, unlike the classical unidirectional projection-augmented reality. The proposed system was evaluated on pose estimation, face recognition and object detection, and the precisions were found to be high. It is found that the projection-augmented reality elderly healthcare system can support the elderly person better than the traditional systems. Table 2 presents the summary of the CNN in monitoring the elderly person in smart environments. [47] developed a fall detection system based on VGG-16 net for the elderly. The VGG-16 net used dynamic motion and shape to detect falls. The result indicated that the VGG-16 net achieved better sensitivity and specificity compared to the hand crafted features based methods (HCFBM), state of the art approaches (SAA), the optical flow and RGB streams (OFRS) as well as the UR fall detection (SAURFD). [48] proposed a VGG-net for the prediction of elderly medical satisfaction. The VGG-net achieved a reasonable accuracy with respect to tangibility, reliability, responsiveness, assurance and empathy. The accuracy of the VGG-net model was enhanced through optimization. [49] proposed AlexNet for fall detection of the elderly person. The AlexNet model achieved an acceptable performance level in terms of accuracy, specificity, sensitivity, detection time per frame/average evaluation of solution (DTPF/AES) and geometric mean (GM), but no other model was compared with it in the performance evaluation. [55] applied NSNet-S (slow), NSNet-M (medium) and NSNet-F (fast) for an elderly fall detection and alert system in residential homes. The models operate as a computer vision system that uses both real and synthetic data for training and for detecting elderly falls. The result obtained showed that ShuffleNet-v2 and ResNet-18 outperformed NSNet-S, NSNet-M and NSNet-F.

3) LONG SHORT TERM MEMORY FOR DEVELOPING HEALTHCARE SMART DEVICE FOR THE ELDERLY
The LSTM has been used by researchers to develop smart devices for monitoring elderly person healthcare. For example, [45] applied multi-task learning with a deep LSTM to predict depressive disorder in elderly people. The data for the study were collected from 20,000 elderly people in the US over a period of 20 years. It is found that the multi-task LSTM can capture temporal and high order interactions within the risk factors. The performance of the multi-task LSTM is compared with SVM, MLP, DBN, single LSTM, and multi-task LSTM without auxiliary. Multi-task LSTM with auxiliary is found to be better in predicting depressive disorder in elderly persons. Table 3 presents the summary of the LSTM project in monitoring elderly person healthcare.

4) THE HYBRID DEEP LEARNING FRAMEWORK FOR DEVELOPING SMART DEVICE TO DETECT ELDERLY BEHAVIOUR FOR ELDERLY HEALTHCARE
This section discusses the hybridization of deep learning algorithms to develop smart devices for monitoring elderly healthcare. For example, [29] developed a hand gesture recognition system based on ResNet-50 and support vector machine (SVM). The ResNet-50 is used for the extraction of features while the SVM performs the classification task. The proposed hand gesture system is intended for elderly persons who have voice problems or are deaf-mute and therefore have difficulties in communication.
The hand gesture of the elderly person can indicate a request for water, going to the toilet, seeking help, medication, or a meal. The proposed hand gesture recognition system was not compared with classical hand gesture systems; as such, it is difficult to measure its effectiveness. In [30], the effectiveness of CNN, LSTM, CNN-LSTM, and autoencoder-CNN-LSTM in predicting abnormal behaviour of elderly people in a smart home was investigated. The experiment was performed on two different datasets, namely simulated activities of daily living (SIMADL) collected by OpenSHS, and MobiAct, a public dataset for abnormal behaviour of elderly people in smart homes. It is found that the CNN-LSTM performs better in predicting abnormal behaviour on the SIMADL dataset while the autoencoder-CNN-LSTM performs better on the MobiAct dataset. The effectiveness of the algorithm therefore depends on the dataset used.
[31] hybridized CNN-LSTM for a mobile enabled fall detection (MEFD) framework to enhance the accuracy of fall detection for the elderly in homes and hospitals. MobiAct public datasets were used to train the CNN-LSTM offline whereas real life smartphone-sensor-enabled datasets were used to detect elderly falls online. The CNN-LSTM model achieved better accuracy than the LSTM, which is a constituent of the CNN-LSTM. [36] hybridized genetic algorithm and dynamic time warping (GA-DTW) for video calls, whereas a separate faster region based CNN (Faster R-CNN) (ResNet) handles indoor object detection and automatic health data collection in a remote health care system based on a moving robot for the elderly in a smart home. As the DTW finds the best path to reflect the relationship between the reference template and the speech, the GA-DTW finds the best matching path. The result obtained shows that the GA-DTW and R-CNN achieved better accuracy compared to the DTW/Schuldt/Dollar/Niebles/Jhuang and R-CNN/Fast R-CNN.
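The role that DTW plays in [36], finding the best alignment path between a reference template and an input sequence, can be illustrated with a minimal sketch. The code below is the textbook dynamic-programming DTW on one-dimensional sequences with an absolute-difference cost; it is not the GA-DTW of [36], and the function name `dtw_distance` and the toy sequences are assumptions made purely for illustration.

```python
from math import inf


def dtw_distance(reference, query):
    """Return the classic DTW alignment cost between two 1-D sequences.

    The local cost is the absolute difference; the allowed steps are
    match, insertion and deletion, with no window constraint.
    """
    n, m = len(reference), len(query)
    # cost[i][j] = best cumulative cost aligning reference[:i] with query[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(reference[i - 1] - query[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # reference advances alone
                cost[i][j - 1],      # query advances alone
                cost[i - 1][j - 1],  # both advance together
            )
    return cost[n][m]


# Hypothetical toy signals (not data from any reviewed study).
template = [0.0, 1.0, 2.0, 1.0, 0.0]
stretched = [0.0, 0.0, 1.0, 2.0, 2.0, 1.0, 0.0]  # same shape, slower tempo
different = [2.0, 2.0, 2.0, 2.0, 2.0]

print(dtw_distance(template, stretched))
print(dtw_distance(template, different))
```

The stretched copy aligns with the template at zero cost because DTW tolerates differences in tempo, while the flat sequence does not; this tolerance is what makes DTW suitable for matching speech against a reference template.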
[50] hybridized deep and shallow algorithms for the detection of postural control in young and elderly adults. The deep algorithms (VGG-16, VGG-19, AlexNet, ResNet-50, and DenseNet-201) were used to extract image features while the shallow algorithms (logistic regression (LR), SVM and naïve Bayesian (NB)) performed the classification. The result shows that the hybrid of VGG-16 and SVM achieved the best performance, followed by VGG-19 and SVM. [29] hybridized CNN and SVM for an elderly health care system. The CNN does the image feature extraction while the SVM does the hand gesture classification. The result obtained shows that the CNN-SVM achieved a recognition rate of 96.62%. [32] hybridized VGG-16 and LSTM for elderly fall detection. The VGG-16 does the feature extraction while the LSTM does the feature classification. The result shows that the VGG-16-LSTM model achieved a mean recall of 0.916. Table 4 presents the summary of the applications of hybrid deep learning algorithms.

Reference | Algorithm | Compared algorithms | Domain | Task | Result
--- | --- | --- | --- | --- | ---
[34] | ANN-CNN-LSTM | [66], [67], [68], [69], [70], [71] | Smart home | Walking behaviour detection | The ANN-CNN-LSTM model achieved an accuracy of 84% and a receiver operating characteristic (ROC) of 96%, and has the potential to prevent health problems in the elderly
[35] | CNN, MLP, CAE and DAE | SVM, DT, KNN, XGB, [72], [73] and [74] | Smart home | Fall detection | The CNN is the best model with an accuracy of 99.9% and makes falls easy to detect

[33] hybridized CNN and deep reinforcement learning (CNN-DQN) for fall risk reduction for the elderly. The CNN analyses the fall risks of the elderly using preparatory data on past incidents and accidents, whereas the DQN, which operates similarly to the Q-learning algorithm, controls a mobile robot according to the result of the fall risk analysis to reduce slip-induced falls for the elderly. The result obtained showed that the CNN-DQN based agent surpasses the rule based agent.
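Since the DQN in [33] is described as operating similarly to Q-learning, a sketch of the tabular Q-learning update may help to make the idea concrete. This is only an illustration under stated assumptions: [33] uses a neural network rather than a table, and the states, actions, reward, learning rate `alpha` and discount `gamma` below are hypothetical.

```python
def q_update(q_table, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """Apply one Q-learning step and return the updated value.

    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    """
    best_next = max(q_table[next_state].values())
    td_target = reward + gamma * best_next
    q_table[state][action] += alpha * (td_target - q_table[state][action])
    return q_table[state][action]


# Hypothetical toy setting: a robot chooses between two actions near a wet floor.
actions = ("warn", "ignore")
q_table = {s: {a: 0.0 for a in actions} for s in ("wet_floor", "safe")}

# Assumed reward: warning the person on a wet floor earns +1 and leads to "safe".
first = q_update(q_table, "wet_floor", "warn", reward=1.0, next_state="safe")
second = q_update(q_table, "wet_floor", "warn", reward=1.0, next_state="safe")
print(first, second)
```

Repeating the same rewarded transition moves the estimate towards the reward, which is the trial-and-error behaviour that a DQN approximates with a network instead of a lookup table.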
[34] hybridized ANN, CNN and LSTM (ANN-CNN-LSTM) for walking behaviour detection to help prevent possible health problems in the elderly. The LSTM handles recurrence and the learning of both short and long term dependencies, the CNN uses the pooling layer for image map extraction, and the ANN then detects the walking behaviour. The result obtained shows that the ANN-CNN-LSTM has the potential to detect walking behaviour and prevent potential health problems in the elderly. [35] proposed deep learning algorithms (CNN, multi-layer perceptron (MLP), a hybrid of MLP and auto-encoder called dense auto-encoder (DAE), and a hybrid of CNN and auto-encoder known as convolutional autoencoder (CAE)) for light weight neural network elderly fall detection. The CNN does the feature/image extraction whereas the CAE and DAE do the fall detection by condensing input signals into a representation of falls. The result obtained shows that the CNN alone outperforms the conventional methods (SVM, decision tree (DT), k-nearest neighbour (KNN) and extreme gradient boosting (XGB)), the other light weight neural networks (MLP, DAE and CAE) and the baseline models [72], [73] and [74].

5) OTHER DEEP LEARNING ARCHITECTURES
Some deep learning architectures fall in this class because they do not share a common category with the others. For example, [6] proposed a deep learning based IoMT framework for monitoring elderly patients through cardiac images. The deep learning based framework for cardiac image processing for elderly patients outperforms the constant transmission power control. [53] proposed a deep neural network (DNN) for the identification and analysis of fall situations for the elderly. The DNN detects and analyses fall situations for possible prevention. The result shows that the DNN model performs better than the traditional resultant acceleration (TRA). [54] proposed a deep learning model based on [75] for a motion detection system in order to prevent fall accidents among the elderly. The result obtained shows that the proposed model achieved an accuracy of 99%. The summary is presented in Table 5.

Reference | Algorithm | Compared algorithms | Domain | Task | Result
--- | --- | --- | --- | --- | ---
[53] | DNN | TRA | | Elderly fall identification and analysis | The DNN achieved higher performance compared with the TRA in fall identification and analysis
[54] | Deep learning | Not available | Nursing home | Motion detection and fall accidents prevention | The proposed method achieved 99% accuracy in motion detection and fall accidents prevention

IX. EVALUATION METRICS FOR MEASURING PERFORMANCE OF DEEP LEARNING BASED SMART DEVICE FOR ELDERLY HEALTHCARE
It is critical to measure the performance of the deep learning based smart devices developed for supporting the healthcare of the elderly person. Performance metrics are used to measure the degree of effectiveness or efficiency of a smart device or algorithm/model. Depending on the type of performance metric used, they provide information on the performance of the models/algorithms or methods and show the advantage or weakness of a proposed method/algorithm relative to the already established methods. Table 6 presents the different performance metrics used in evaluating deep learning based smart devices for supporting elderly healthcare services. Different projects use different performance metrics, and justification for choosing the metrics is hardly ever provided. We will only provide basic information on the major performance metrics used by different researchers (Eqns. 4-12).
Accuracy: the extent to which a measured or calculated value conforms to the actual value or to a defined standard.
Precision can equally be considered using the explanation given for recall.
Receiver operating characteristics (ROC): the ROC is a representation that shows how the outcomes of a decision rule vary between entities. It is associated with the chance of an event occurring, for example the event that the mark obtained by a non-regular student is equal to or less than a defined score. There is therefore a false alarm rate, which represents the fraction of the total population of non-regular students that falls under the wrong assumption when the standard rule given by the parameter s is applied.
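As a hedged illustration of the metrics discussed in this section, the following sketch computes accuracy, precision, recall (sensitivity), specificity, F1-score and geometric mean from confusion-matrix counts, and one ROC point (false alarm rate, true positive rate) for a threshold s. It uses the standard textbook definitions, which may differ in notation from Eqns. 4-12; the function names, counts, scores and labels are made up for the example.

```python
from math import sqrt


def confusion_metrics(tp, fp, tn, fn):
    """Standard metrics from true/false positive/negative counts.

    Assumes every denominator is non-zero (both classes occur and at
    least one positive prediction is made).
    """
    recall = tp / (tp + fn)        # sensitivity, true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    precision = tp / (tp + fp)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": 2 * precision * recall / (precision + recall),
        "g_mean": sqrt(recall * specificity),
    }


def roc_point(scores, labels, s):
    """Return (false alarm rate, true positive rate) when score >= s means 'fall'."""
    tp = sum(1 for x, y in zip(scores, labels) if x >= s and y == 1)
    fp = sum(1 for x, y in zip(scores, labels) if x >= s and y == 0)
    positives = sum(labels)
    negatives = len(labels) - positives
    return fp / negatives, tp / positives


# Hypothetical fall detector evaluated on 100 events.
m = confusion_metrics(tp=40, fp=5, tn=50, fn=5)
print(m["accuracy"], m["recall"], m["specificity"])

# Hypothetical detector scores with ground-truth labels (1 = fall).
scores = [0.9, 0.8, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0]
print(roc_point(scores, labels, s=0.5))
print(roc_point(scores, labels, s=0.25))
```

Sweeping s over all score values and plotting the resulting points gives the ROC curve; lowering the threshold raises the true positive rate at the price of a higher false alarm rate.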

X. LEARNING PARADIGM FOR TRAINING THE DEEP LEARNING TO DEVELOP THE SMART DEVICE
Learning is the ability to acquire knowledge, ideas or concepts through study, training, being taught or experience. The machine learning paradigms are the ways through which deep learning algorithms learn from experience or training and predict or determine future results or outcomes. The training process usually relies on a number of datasets. The different types of machine learning used in training deep learning algorithms to develop smart devices for supporting elderly healthcare in smart environments are as follows:
Supervised learning: it involves learning a function that captures an input-output relationship from input-output pairs. Each pair consists of an input object, called a vector, and an output value, known as the supervisory signal [76].
Unsupervised learning: a machine learning approach where the model does not have any pre-assigned labels or scores for the training data. The model simply determines naturally occurring behaviour in the training dataset, for instance by clustering, so that the data are grouped into classes meeting particular criteria or characteristics [76].
Semi-supervised learning: it combines a small part of supervised learning with unsupervised learning. Some of the data are labelled while others are unlabelled; in this way, an algorithm might achieve a reasonably higher accuracy than when using supervised or unsupervised learning alone [77].
Federated learning: also referred to as collaborative learning, it is a machine learning approach that enables the training of algorithms across multiple decentralized edge devices or servers holding local data samples without exchanging them, thereby addressing critical concerns such as data privacy, data security and data access rights.
Reinforcement learning: a machine learning approach in which an agent learns from the environment by perceiving and interpreting it and taking the necessary actions by trial and error. The desired behaviours are usually rewarded while the undesired behaviours are punished [78].
Multitask learning: it can be referred to as parallel learning that enables the learning of more than one task at a time. Normally, the components learned in one task help in the learning process of another task [79,80].
Transfer learning: a learning approach that involves acquiring knowledge or experience by solving a particular problem and then applying it to solve a different but related problem [81].
Table 7 presents the learning approaches used in different studies to develop deep learning based smart devices for healthcare support to the elderly person, as found in the literature.

Reference | Technology/smart device
--- | ---
[43] | Radar sensor
[44] | Wearable sensor
[29] | Hand gesture recognition system
[3] | Projection augmented reality
[30] | Behaviour monitoring systems
[6] | IoMT
[31] | Smartphone based accelerometer sensor
[46] | Angel assistance system
[47] | Fall detection system
[48] | Not applicable
[49] | Fall detection system
[36] | Remote health care system
[50] | Detection of postural control system
[29] | Hand gesture classification system
[51] | Multi-domain activity classification
[32] | Fall detection system
[33] | Accelerometer/Radio sensor/stereo camera and laser range finder
[34] | Accelerator, temperature and pressure sensors
[35] | Accelerometer sensor
[59] | Smart hospital
[53] | Wearable (pocket) device using accelerometer sensor
[54] | Sensor panel and controller
[55] | Sensor
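The federated learning paradigm defined in this section can be made concrete with a sketch of federated averaging, one common aggregation rule in which a server combines locally trained weights in proportion to each client's sample count without ever seeing the raw data. The function name `federated_average`, the client weights and the sample counts are hypothetical, and none of the reviewed studies is claimed to use this exact scheme.

```python
def federated_average(client_weights, client_sizes):
    """Return the sample-size-weighted average of per-client weight vectors."""
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[k] * n for w, n in zip(client_weights, client_sizes)) / total
        for k in range(n_params)
    ]


# Hypothetical: three homes train the same 2-parameter model locally.
weights = [[1.0, 0.0], [3.0, 2.0], [5.0, 4.0]]
sizes = [100, 100, 200]  # number of local samples per home

# Only the weights and counts leave each home, never the sensor data.
global_weights = federated_average(weights, sizes)
print(global_weights)
```

The third home contributes twice as many samples, so its weights pull the global model twice as strongly as either of the other two.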

XII. DATASETS
Data play a critical role in machine learning research. Thus, this section discusses the categories of data required in developing smart devices based on deep learning algorithms for supporting elderly healthcare. Different approaches for collecting different types of data exist in the literature. The focus of this paper is on the data collected for the purpose of developing smart devices based on deep learning algorithms. Table 9 presents the methods used for data collection, the category of data and the equipment/devices used for experiments in the different projects that developed deep learning based smart devices to support elderly person healthcare. There are different categories of datasets:
Real world data: data collected from the implementation of a real world system. They can be described as operational data because they are generated as the system operates during actual implementation.
Benchmark data: performance standard data or software metrics [83] that aim to compare different tools to identify the level of performance of technologies or models.
Synthetic (simulation) data: data collected by mimicking a real world scenario/instance to predict or determine a future situation [84].
Real world data have an advantage over benchmark and synthetic datasets because they capture the real dynamics of the real world environment. However, real world data are expensive, time consuming and tedious to collect because collection involves the use of real equipment/devices on a population in the real world, and they require more expensive data engineering than benchmark/synthetic datasets. On the other hand, benchmark and synthetic data are easier and less expensive to collect, since they do not require physical equipment for experiments. Different devices/equipment and collection methods were used in different projects to collect the data.

Real world
This article has been accepted for publication in IEEE Access. This is the author's version which has not been fully edited and content may change prior to final publication. Citation information: DOI 10.1109/ACCESS.2022.3186701

Reference | Data collection method | Equipment/device or dataset | Category of data
--- | --- | --- | ---
[44] | Real time | Published studies | Real world
[29] | Experiment | Hand gesture recognition system | Real world
[3] | Experiment | Projection augmented reality | Real world
[45] | Quantitative | Questionnaire | Real world
[30] | Simulation and experiment | OpenSHS, and MobiAct | Synthetic and real world
[6] | Experiment | NICTA published [85] | Real world
[31] | Experiment | MobiAct and Public | Real world
[46] | Experiment | Fall video recognition | Real world
[47] | Experiment | Video surveillance and depth maps/wireless accelerometer | Real world
[48] | Quantitative | Questionnaire | Real world
[49] | Experiment | Video surveillance environment | Real world
[36] | Experiment | Remote health care system | Real world
[50] | Experiment | Kinect device | Real world
[29] | Experiment | Microsoft Kinect sensor | Real world
[51] | Experiment | Frequency-Modulated Continuous Wave (FMCW) radar | Real world
[32] | Experiment | RGB videos (ImageNet and UCF-101) | Real world
[33] | Experiment | Caltech 101 | Real world
[34] | Experiment | Footwear (Smart insole) | Real world
[35] | Experiment | SisFall datasets (smart watch) [86] | Real world
[59] | Experiment | ImageNet datasets [52] | Real world
[53] | Experiment | Wearable device using Bluetooth low energy signal | Real world
[54] | Experiment | Sensor panel and controller | Real world
[55] | Experiment | Sensor | Real and synthetic

XIII. REAL WORLD APPLICATIONS
For readers to appreciate the bridge between theory and practice in developing smart devices based on machine learning to support elderly person healthcare, case studies are presented in this section. A real world application is the implementation of a basic research outcome to solve a problem in society, focusing on a particular area, person, group or situation. Table 10 presents the real world applications of smart devices developed based on machine learning to support elderly person healthcare in a smart environment. In the reviewed studies, the hybrid deep learning algorithms were found to perform better than their constituent algorithms in terms of the effectiveness of the smart device.
In the case of learning paradigm, most of the research adopted supervised learning because the nature of the research requires both input and output data for modelling the deep learning algorithm to develop a smart device for supporting elderly person healthcare. From the viewpoint of data collection, most researchers used experimental data collected from real world scenarios. A smart device developed from real world experimental data is likely to be more effective in the real world than one developed from benchmark/synthetic datasets, because the experimental data capture the dynamics of the real world, whereas benchmark/synthetic data have been cleaned and lack current real world dynamics. It is therefore better to develop a deep learning based smart device for supporting elderly person healthcare using real world data, since the device then has a high chance of performing its function as expected when deployed in the real world environment. When benchmark or synthetic data are used instead, there is no guarantee that the smart device will work as expected in the real world environment because of the dynamics of real world activities. In the next section, we discuss the challenges of the current methodology and suggest new perspectives for solving them in the future.

XV. OPEN RESEARCH CHALLENGES AND FUTURE DIRECTION
Although deep learning algorithms have been successfully applied in developing smart devices for supporting elderly person healthcare, there are still challenges that need to be resolved to further improve the healthcare of elderly persons. In this section, the challenges and possible future directions are discussed to give researchers alternative ways of solving them. The review shows that most researchers focus on detecting the behaviour of elderly persons via deep learning based smart devices in indoor scenarios such as the smart home, smart nursing home, smart clinic and smart hospital. As a result, the behaviour of the elderly person outdoors is not fully understood, which makes providing healthcare support outdoors extremely difficult. We suggest that future research should take into consideration the behaviour of the elderly person in outdoor scenarios, because the analysis of user behaviour outdoors can give a full understanding of the elderly person's healthcare [30]. Therefore, future deep learning based smart devices for elderly persons should be developed based on data from both indoor and outdoor scenarios to provide effective healthcare support and ensure safety in both settings. Many deep learning based smart devices were developed for the detection of elderly person behaviour such as fall detection, facial expression detection, gesture detection, etc.
It is difficult for the smart device to accurately differentiate between unintentional and intentional behaviour, so it mixes up the two. For example, an elderly person may decide to lie down intentionally on the floor of the house; in such a case, the smart device may misclassify that behaviour as a fall. This reduces the effectiveness of the smart device and may turn out to be a nuisance for the elderly person.
In the future, we propose that researchers develop deep learning based smart devices for the elderly person by gathering sufficient data on intentional behaviour related to falls, facial expression, depressive disorder, etc. This could improve the effectiveness of the smart device in accurately classifying the behaviour of the elderly person. It is evident that computer vision plays a critical role in developing deep learning based smart devices for supporting elderly person healthcare, and image processing is a core component of computer vision. Altering the background of images is likely to cause poor performance of the deep learning algorithm during modelling. [49] argued that a limitation of image processing is that alterations to the background of video images, such as shadows and movable objects, may lead to performance degradation of a model. We suggest that researchers in the future avoid data engineering that will degrade the performance of the deep learning algorithm.
The state-of-the-art accuracies achieved by the different deep learning architectures in the papers published on developing deep learning based smart devices for improving elderly person healthcare are presented in Table 11. For uniformity, we consider only projects that used accuracy as the performance metric. The type of deep learning architecture used in each study can be seen in Tables 2-5. It is clear that there is room for improvement, as none of the studies achieved 100% accuracy. The percentage improvement required by each study is provided in Table 11. How to achieve 100% accuracy remains an open challenge. Researchers can consider different deep learning architectures or modifications to investigate the same problem and find out whether the accuracy improves or reaches 100%.

Reference | Accuracy | Improvement required | Remark
--- | --- | --- | ---
 | 97% | 3% | Need improvement
[46] | 98% | 2% | Need improvement
[48] | 88% | 12% | Need improvement
[49] | 99.98% | 0.02% | Need improvement
[36] | 97.14% | 2.86% | Need improvement
[50] | 97% | 3% | Need improvement
[29] | 96.92% | 3.08% | Need improvement
[51] | 97.20% | 2.8% | Need improvement
[34] | 84% | 16% | Need improvement
[35] | 99.94% | 0.06% | Need improvement
[59] | 95% | 5% | Need improvement
[54] | 95% | 5% | Need improvement
[55] | 87% | 13% | Need improvement

It has been observed from the performance metrics analysis shown in Section 6 that different studies used different performance evaluation criteria. Similar findings were reported for mobile phone authentication systems [91,92]. This signifies that this research domain lacks uniform performance metrics for evaluating the efficacy of deep learning based smart devices.
As this is a healthcare domain, it is imperative to have uniform performance evaluation metrics, since healthcare is universal in nature. As such, universal evaluation metrics for deep learning based smart devices for supporting elderly person healthcare should be proposed by researchers in the future to avoid inconsistency in judging the performance of such devices.
Performing context-aware operations based on the behaviour of the elderly person and the environment is extremely difficult when using a single camera. A single camera has no capability for context-aware operations because it is limited in gathering images within the house and the environment. To overcome this limitation, it is suggested that in the future multiple cameras should be mounted to accurately capture the context of the elderly person both inside the house and in the environment [3]. We suggest the development of deep learning based context-aware smart devices for the elderly person to provide effective healthcare support. The functions of home appliances are becoming more complicated as technology improves, and operating the appliances likewise becomes more complex. As a result, those appliances can be difficult for the elderly person to operate at home. In addition, physical operations such as opening or closing doors and windows can be difficult and worrying for the elderly person [3]. We suggest that adaptive deep learning based smart devices should be developed for controlling the home appliances of the elderly person without requiring the person to physically operate the appliance or use a remote control.

XVI. CONCLUSION
In this paper, we presented a comprehensive review on the adoption of deep learning algorithms to develop smart devices for supporting elderly healthcare. The paper reviews deep learning in developing smart devices for elderly healthcare support from both the technical and the applications perspective. New taxonomies were created, and case studies, synthesis and analysis were presented. The paper maps the research landscape at the intersection between smart environments and deep learning in developing smart devices for supporting elderly healthcare in the smart home, smart elderly nursing home, smart clinic and smart hospital. Present challenges and future research directions that can help more practical work in the future were outlined and discussed.