IoT Based Approach for Load Monitoring and Activity Recognition in Smart Homes

Appliance load monitoring in smart homes has been gaining importance due to its significant advantages in achieving an energy-efficient smart grid. The methods to manage such processes can be classified into hardware-based methods, including intrusive load monitoring (ILM), and software-based methods, referring to non-intrusive load monitoring (NILM). ILM is based on low-end meter devices attached to home appliances, in contrast to NILM techniques, where only a single point of sensing is needed. Although ILM solutions can be relatively expensive, they provide higher efficiency and reliability than NILM. Moreover, future solutions are expected to be hybrid, combining the benefits of NILM with individual power measurement by smart plugs and smart appliances. This paper proposes a novel ILM approach for load monitoring that aims to develop an activity recognition system based on an IoT architecture. The proposed IoT architecture consists of the appliances layer, perception layer, communication network layer, middleware layer, and application layer. The main function of the appliance recognition module is to label sensor data and allow the implementation of different home applications. Three different classifier models are tested using real data from the UK-DALE dataset: feed-forward neural network (FFNN), long short-term memory (LSTM), and support vector machine (SVM). The developed activities of daily living (ADL) algorithm maps each ADL to a set of criteria depending on the appliance used. The features are extracted according to the consumption in watt-hours and the times at which the appliances are switched on. The FFNN and LSTM networks achieve an accuracy above 0.9, while the SVM classifier reaches around 0.8. Other experiments are performed to evaluate the classifier model using a new test set. A sensitivity analysis is also carried out to study the impact of the group size on the classifier accuracy.


I. INTRODUCTION
Nowadays, the applications of smart home concepts and home energy management systems (HEMS) have been gaining increasing attention in the research community due to the many advantages they offer. These technologies aim to help users operate and manage household appliances automatically and optimally. Furthermore, they represent a crucial step in achieving energy efficiency. To build such management systems, it is necessary to identify and control the energy consumption of the major appliances in the household responsible for a higher electrical consumption [1].
The associate editor coordinating the review of this manuscript and approving it for publication was Moussa Ayyash.
VOLUME 9, 2021. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
The identification of appliance usage opens the door for the implementation of a series of useful applications. Among them, demand response (DR) and load planning programs focus on analyzing individual load levels in homes or buildings. This analysis makes it possible to identify less efficient or malfunctioning devices and to implement appropriate actions intended to reduce consumption. In this context, consumers become a key factor; they not only participate effectively in the sustainable smart grid system, but they can also receive direct feedback on power consumption statistics in real time [2]. Additional useful information could also be inferred from appliance data, such as consumers' behavior patterns, including occupancy, sleep patterns, and other activities. These activities are commonly known as activities of daily living (ADL), with applications both in the energy domain and in other fields, ranging from commercial services (e.g., customer profiling and targeted marketing) and the legal sector (e.g., monitoring of curfews and detection of illegal activities) to remote healthcare monitoring for elderly people living alone [3].
To contribute to the development of an efficient HEMS, it is necessary to carry out a process that allows identifying and monitoring the main loads in the household. The methods to manage such processes can be classified into two categories: methods based on hardware and those based on software, as shown in Fig. 1. On the one hand, software-based methods rely on measurements from only a single point of sensing (the smart meter device). These methods, commonly known as non-intrusive load monitoring (NILM), offer an attractive solution, essentially due to their low-cost implementation, since they only need a single point of detection.
Although these solutions have been the focus of most studies in the field over the last five years, they have shown lower precision and greater implementation difficulty in real scenarios than hardware-based methods. NILM algorithms are mostly based on event detection, sampling the aggregated signal captured by smart meters to obtain individual profiles of electrical appliances. The aggregated signal can be very noisy, and only a few electrical appliances may be detected, depending on the sampling frequency. Even with advanced artificial intelligence (AI) algorithms, it may be possible to monitor only a few major appliances, e.g., oven, washing machine, air conditioner, and electric vehicle (EV) [2], [3]. When facing these kinds of scenarios in terms of the type of appliance used, performance remains inconclusive on different datasets [4]. On the other hand, techniques based on hardware include methods for intrusive load monitoring (ILM), also known as distributed sensing. This technique can be divided into two sub-categories. The first refers to a model in which energy consumption profiles are obtained at the device level using submetering sensors attached to appliances. The second is smart appliances (SA), which are devices with built-in capabilities to monitor and report their energy usage [2].
Although these solutions can be relatively expensive, they provide higher efficiency and reliability than NILM. Direct sensors have great potential because sensing and control of devices and appliances can be co-located (e.g., turning off a light when an occupant leaves a room). An additional benefit is that these methods typically require a less complex solution for appliance recognition. The appliance recognition system assigns a label that corresponds to the device or appliance connected to the sensor. Moreover, future load monitoring techniques are expected to be hybrid, combining the benefits of NILM and individual power measurement by smart plugs, smart appliances, and HEMS [5].
Since smart appliances are not widely used due to their high market prices and interoperability issues, distributed sensing becomes an attractive solution. To allow the integration of all electrical devices, a home area network (HAN) is required. This communication network will carry control data generated by sensors attached to home appliances, as well as control commands from the home gateway to the appliances and from the utility to the appliances registered in the home gateway [6]. Taken together, these considerations make it possible to view an ILM solution as an Internet of Things (IoT) platform for load monitoring and its numerous applications. Among these applications, identifying ADLs is a good choice in terms of resident autonomy. A frequently cited example is that it allows older people to be cared for at home; it also makes it possible to build a consumer profile that can contribute to more efficient energy use. Therefore, ADLs are best suited as inputs for different home applications [7]. Table 1 summarizes the comparison of both techniques. Compared with the ILM solution, NILM uses only one sensing point (smart meter). Therefore, a communication network that allows data exchange between sensors and the home gateway is not necessary. These aspects have favored the acceptance of a massive deployment of NILM solutions; however, the reliability of these systems is still a challenge. Most previous research works are related to NILM techniques. However, access to smart meter measurements is still limited and challenging in some countries due to regulation and implementation issues. Furthermore, high-resolution data cannot be obtained from most commercial smart meters today because of the complexity of setup, data storage, and cost. On the other hand, with the advances in IoT and communication technologies, the ILM solution has become an affordable option to overcome the difficulty of implementing NILM solutions.
ILM is a promising approach for the future development of residential load monitoring for different applications such as home automation, load forecasting, demand response, energy feedback, and health care system. However, different requirements should be considered concerning data resolution, accuracy, real-time, and the number of appliances to be covered. To the best of our knowledge, there is no previous work available of ILM for appliance recognition and ADL classification due to issues of cost, installation, and communication [5], [8].
This paper proposes an ILM approach for load monitoring and activity recognition based on IoT architecture in smart homes. First, appliances are identified by a machine learning (ML) based appliance recognition system. Three different models are tested in this regard: a vanilla feed-forward neural network (FFNN), a long short-term memory (LSTM) neural network, and a support vector machine (SVM) classifier. All three models are compared in terms of accuracy, precision, recall, and F1-score. Then, the best model is used for ADL identification. The ADL identification is an algorithm that maps each ADL to a set of criteria depending on the appliance used. The features are extracted according to their consumption in Watt-hours (Wh) and the times where they are switched on. Experimental results show that the proposed system is an efficient solution for the classification of activities of daily living. The main contributions of this paper are summarized in three main aspects:
• A novel ILM solution for load monitoring and ADL identification is developed and analyzed as part of an IoT architecture. This architecture can support other applications, which guarantees overall system scalability.
• Three ML classifier models are benchmarked to be integrated with the appliance recognition system: FFNN, LSTM, and SVM. The objective of appliance recognition module is to label sensor data to allow the implementation of different home applications such as ADL classification.
• The proposed ADL classifier is applied and tested in different experiments employing real data gathered in the UK-DALE dataset.
The remainder of this paper is organized as follows. Section II reviews recent proposals regarding appliance recognition and human activity classification in the context of smart homes. Section III discusses theoretical aspects of ILM systems, and Section IV presents the proposed system architecture. Results of the ML model benchmark and classification experiments are presented in Section V. Finally, a discussion of the results and the conclusions are provided in Sections VI and VII.

II. RELATED WORK
There is plenty of work done regarding appliance recognition in the context of smart homes [9]-[13]. In [9], the authors presented an approach for detecting and identifying in-use appliances by analyzing low-frequency monitoring data gathered by meters (e.g., smart plugs) distributed in a smart home. The system implements a supervised classification algorithm with artificial neural networks, validated using a dataset of power traces collected in real-world home settings. Since the objective was to develop an appliance recognition system, they mainly focused on the application level for experiments. In [10], the authors proposed an electrical device identification model based on three features: energy consumption, time usage, and location. The information captured by these features was used to train six different ML classifier models: Random Forest (RF), Bagging, LogitBoost, Decision Trees (DT), Naive Bayes, and SVM. Results showed a high level of accuracy, which indicates good performance of the proposed features. In that work, the authors focused on standard techniques, as the objective was to obtain a neutral assessment of the features. Thus, no specific application, such as ADL classification, was addressed. They considered the system as part of a smart grid environment. However, they only centered on application-related issues without giving any information about the infrastructure or the IoT-based architecture to support the system.
A supervised learning classifier was developed in [11] for appliance classification based on its power signature. Besides building an individual appliance metering device, the objective was to create what the authors called a ''load library'' of appliance power signatures for training and recognition. The model employed for classification was K-Nearest Neighbors (KNN), and results showed that the timing of data acquisition is critical. Even though experimental results showed high accuracy, the authors did not compare the KNN with any other ML model or algorithm. A recent approach in [4] aimed to design and develop an IoT end-to-end solution to recognize electric appliances that can operate in real time at low hardware cost. Three ML algorithms, K-Nearest Neighbors (KNN), Decision Tree (DT), and Random Forest (RF), were implemented for classifying the operating appliances. The authors do not impose any requirements regarding the instant when data collection needs to be carried out throughout the appliance's operational cycle, or the amount of data that needs to be collected before classification takes place. Using only a high-resolution CT sensor, they reduced cost while still obtaining satisfactory results. Their implementation, in a laboratory, was described as a data acquisition system that further processed the data for classification. Although they achieved a high classification accuracy, around 95%, the work did not give any details regarding ADL classification or any other application deployment.
The authors in [12] presented a survey on intrusive load monitoring, which gives details about its implementation requirements. Though the paper only focused on summarizing the main ILM techniques proposed in the literature, the authors have defined the architecture, feature extraction, and ML models typically used for ILM applications. That work allows envisioning the ILM systems as an IoT platform with more opportunities for enhancing different smart home applications. Regarding the classification of daily living activities, authors in [3] presented a deep learning approach based on multilayer feed-forward neural networks (FFNNs) that can identify common electrical appliances in a household from a typical SM measurement (i.e., a NILM solution). The performance of this approach was tested and validated using a publicly available UK-DALE dataset. The detected appliances were used to identify householders' activities. These activities are usually referenced as activities of daily living (ADLs). Thus, they developed an ADL classifier to provide useful information to consumers, including detailed feedback on the energy usage and its main contributors, allowing the creation of itemized energy bills. Moreover, the information can then be used to emphasize opportunities for energy saving and costs reduction and identify inefficient and/or malfunctioning home appliances. The proposed classification algorithm could be extended to be used as an ILM solution.
In [13], the authors presented an activity recognition and anomaly detection approach to identify daily activities in a smart home context. The system is a unified deep learning approach based on a Probabilistic Neural Network (PNN) classifier that processes pre-segmented activities, so there is no need for an appliance recognition system. Then, an H2O autoencoder detects anomalies within each activity class. This system could be implemented in COVID-19 scenarios, as recovery from this virus requires isolation to stop the spread of the disease and minimize the risk of contagion. Therefore, a remote healthcare system could help effectively treat patients without hospitalization.
On the one hand, in [5], the authors explained that the development of power electronics significantly improves the accuracy and flexibility of power control, but greatly affects the applicability of NILM methods. Power converters not only allow the power of appliances to be continuously adjusted, but also eliminate harmonics and compensate the reactive power. As a result, features extracted from the appliances will become indistinguishable. Furthermore, the authors agree that future residential load monitoring is expected to be a hybrid form with the combination of NILM, individual power measurement by smart plugs, smart appliances, and HEMS. The author in [14] presented a survey that establishes a base for the development of important applications for the remote and automatic intervention of energy consumption inside buildings and homes. The work provided a theoretical background of the load monitoring methodologies, concluding that it is feasible to have fine-grained monitoring and control of appliances using ILM in smart houses to provide healthcare, convenience, entertainment, energy efficiency, and security.
On the other hand, previous research in ADL classification [3], [8], [15] has been based on NILM techniques. In [16], the authors introduced a framework in which the daily activities are detected via a data-driven activity detection approach, using the data provided by a NILM system. They aimed to estimate the personalized appliance usage for different daily activities performed by regular occupants in a building. Experiments were carried out in three single-occupancy testbed apartment units, using a supervised learning model for activity recognition. The authors of [8] modeled an SVM and a random decision forest classifier using data from three test homes. The trained models were used to monitor two patients with dementia during a six-month clinical trial, undertaken in partnership with Mersey Care NHS Foundation Trust. Using the data collected from electricity readings, the technology can accurately identify the use of individual electrical devices in the home and the routine behaviors of people to detect when anomalies occur.
In a recent work [17], the authors examined different ways in which smart energy data could be used in remote health and well-being monitoring. The authors considered three broad application domains: ambient assisted living support, population-level screening and support, and self-monitoring. The report also considered energy-health sector research synergies and opportunities for realizing solutions at scale. It emphasized the potential benefits of smart energy data in supporting the health and care system, giving a complete description of the two main categories in which the research was focused on: NILM and IoT-based methods (ILM).
Other approaches, such as [18], [19], presented solutions based on IoT but relying on wearable sensors, such as accelerometers and smart devices. In the case of [20], the authors proposed an intrusive approach based on computer vision techniques: a background subtraction of images, followed by 3D Convolutional Neural Networks. They used a camera to record the video and a processor that performs the task of recognition, which raises privacy concerns and hence offers a low opportunity for mass adoption of the system.
A survey presented in [21] strives toward a fully integrated IoT-based health care system, acknowledging the need to integrate the various IoT services. These applications produce a large amount of data to be handled properly for monitoring. In that sense, cloud computing can take an important role, as it is a promising approach for efficient knowledge processing in the health sector. Another approach [22] presented an overview of sensor fusion technology and explored the relationship between sensor fusion and dense sensor networks. The multi-sensor approach can achieve an impressive result due to the comprehensive description of activities from the sensors deployed in an indoor environment. Recent applications in remote healthcare have reaffirmed the above approaches, proposing innovative solutions in this regard. The authors in [23] presented a smart home control platform that offered fully customized automatic control schemes and performed the analysis of historical records of the use of home automation devices, in order to detect residents' behavior patterns through IoT and machine learning, improving the comfort schemes of domestic systems.
A different solution is presented in [24], where the authors designed a distributed platform to monitor the patient's movements and status during rehabilitation exercises. This information can be processed and analyzed remotely by the doctor assigned to the patient. Real-time monitoring of the elderly can benefit from the use of data mining algorithms, namely Support Vector Machine (SVM), Gaussian Distribution of Clustered Knowledge, Multilayer Perceptron, Naive Bayes, Decision Trees, ZeroR, and OneR, to gain insights into the data in order to detect and even predict future falls.
Based on the above discussion, it is possible to state that:
• Future trends in energy and load monitoring need to feed on IoT technologies in order to achieve state-of-the-art performance.
• To the best of our knowledge, our work proposes a novel solution based on ILM for ADL classification, that allows identifying the daily activities of house occupants in a simple manner, which can be useful for a series of applications.
• Most promising results in IoT for activity recognition have been obtained in remote healthcare applications. There is a limited number of proposals in other domains such as energy consumption understanding, malfunctioning, and anomaly detection.
Table 2 gives a summary of the main aspects of interest analyzed in previous research work. The table highlights and compares different research domains regarding general system architecture, appliance recognition model, real-time implementation, and ADL classification. The given analysis can serve as a comparison that allows better understanding and visualizing the proposal given in this work. While almost every paper has focused either on appliance recognition or real-time implementation, this paper covers aspects of overall infrastructure and applications.

III. INTRUSIVE LOAD MONITORING
This section discusses the main concepts of intrusive load monitoring (ILM). The ILM technique is based on measuring the electricity consumption of appliances using a low-end metering device. The applications of ILM could be implemented in both household and building contexts.
The submeters or sensor nodes are typically placed close to the target appliances. The name intrusive is used since extra equipment, in addition to smart meters, is needed to identify electrical devices correctly. According to [12], ILM solutions can be categorized into three groups in terms of equipment deployment granularity:
• ILM Group 1: This category includes submeters used to monitor household zones or areas, measuring the consumption after the primary utility energy meter.
• ILM Group 2: This category groups submeters at the plug level to directly monitor appliances connected to the outlet or multi-outlet.
• ILM Group 3: This category consists of submeters directly embedded in the appliances or placed in a dedicated outlet (i.e., outlet for a specific appliance).
As previously mentioned, for an ILM deployment, not only sensor nodes are needed, but a home area communication network is also required. Hence, it is feasible to analyze the architecture of such a system from an IoT perspective. Fig. 2 shows a detailed description of different layers for an IoT architecture that could be implemented for load monitoring applications.
• The lower part holds the physical devices layer, in which data is collected to form a data flow that is further sent to the upper layer. This layer is where energy transactions take place. The physical devices layer includes common home appliances (e.g., refrigerators, lamps, iron, microwave, oven, etc.) and major loads such as EVs and heating, ventilation, and air conditioning (HVAC) systems.
• The second level is the perception layer, responsible for data acquisition, since many sensors and actuators are deployed to gather information. For this process, depending on the target application, the electricity consumption may vary [12]. A critical parameter to be considered is data sampling, classified into high-speed and low-speed sampling. Data sampling rates above 1 kHz are considered high, increasing the complexity of data storage, transmission, and processing compared to low-speed sampling. Therefore, high-speed sampling is often considered far from being a practical approach for large-scale applications [25]. It is possible to measure time-dependent and/or frequency-based features. Typical time-dependent features are active power (P), reactive power (Q), voltage (V), current (I), and V-I trajectories. From these features, others can be obtained, such as the complex power and apparent power. Popular frequency-based features include the Discrete Fourier Transform (DFT) and the Fast Fourier Transform (FFT), but these frequency-based features are preferred when a high sampling rate is considered [12].
• The third layer is the communication network layer, which enables the integration and communication among different devices. The technology adopted for load monitoring depends on the location of the server at which data is sent. In HAN communication architecture, data servers are often at the edges of the network. Thus short-range wireless communication is preferred where the most popular standards are Bluetooth, Wi-Fi, and Zigbee [5].
• The middleware layer mediates the interaction between IoT devices and software applications. Since computational requirements for this layer are very high, most referenced solutions sit in the domains of cloud and fog computing (e.g., VM-based: MagnetOS and TinyVM, databased: SINA and TinyDB, service-based: LinkSmart, SenseWrap, FIWARE and AutoSec, and Fog node-based: EMCP and Eclipse Kura). Therefore, this layer acts as an architecture abstraction between the user interface and all deployed devices. Functional requirements, including data management, data storage, big data analysis, real-time data analysis, and deep data analysis with AI should be considered [25].
• At the top of the architecture lies the application layer. It refers to the specific services dedicated to users. Thus, this layer defines a variety of applications in which ILM solutions could be performed. These solutions feed on the energy usages of appliances that are measured separately by submeters. Therefore, appliance-level load data needs to be directly labeled [5].
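As a minimal sketch of the derived features mentioned for the perception layer, the apparent power and the power factor can be computed from the measured active power P and reactive power Q, using the standard relation |S| = sqrt(P^2 + Q^2). The function names and the sample reading below are illustrative assumptions and are not taken from the proposed system.

```python
import math

def apparent_power(p_watts: float, q_var: float) -> float:
    """Apparent power |S| in VA from active power P (W) and reactive power Q (var)."""
    return math.hypot(p_watts, q_var)

def power_factor(p_watts: float, q_var: float) -> float:
    """Power factor P / |S|; returned as 0.0 when no power is drawn."""
    s = apparent_power(p_watts, q_var)
    return p_watts / s if s > 0 else 0.0

# Hypothetical reading: 300 W active power and 400 var reactive power.
s = apparent_power(300.0, 400.0)
pf = power_factor(300.0, 400.0)
```

Both quantities need only P and Q, so they remain available at the low sampling rates discussed above, whereas DFT- or FFT-based features require high-speed sampling.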
Among the most common ILM applications are understanding local and global energy consumption, evaluation and simulation of NILM environments, human activity recognition, and appliance localization. Details concerning each of these applications can be found in [12]. This work focuses on human activity recognition since this information can provide insights into household occupants' behavior, which has several applications in the energy domain and other fields [3]. For example, understanding human-building interactions (HBI) at the appliance level in a smart building context can improve demand-supply balance efficiency by identifying the usage patterns of different flexible loads (e.g., EV) in a building. In addition, it can provide a statistical measure to evaluate the benefits of engaging end-users in adaptive management of loads (e.g., engagement in DR programs) [26].
In healthcare monitoring, assistive technologies are often proprietary and tailored to specific application scenarios. By identifying ADLs, it is possible to provide early intervention for patients with a mental condition such as dementia [8] and monitor the wellbeing of elderly people living alone [15]. The concept of ADL was first proposed by Katz [27]. The authors demonstrate that age-related diseases directly impact ADLs by creating ADL indices to measure the dependence level of a person. Fig. 3 shows the proposed framework for the load monitoring and activity recognition system. Before performing ADL classification, it is necessary to recognize appliances properly. For this purpose, machine learning (ML) and deep learning (DL) models have proven to be very efficient [7], [11], [12]. Specifically, supervised learning has allowed correct generalization to unseen data [7], [11]. In Fig. 4, a typical ML-based framework for appliance recognition is shown. The feature extraction block provides a vector of features that captures individual characteristics of each sample (e.g., the shape of the consumption profile, maximum power value, number of transitions). The ML-based block is presented as a black box, meaning that it is possible to implement several ML models, such as Support Vector Machines [12], KNNs, DT, RF [4], FFNN, and LSTM networks [9]. The output of this system will be the target appliance class, in other words, the type of each appliance (e.g., kettle, boiler, washing machine).
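The feature extraction block can be illustrated with a short sketch that computes a few of the characteristics named above (maximum power value and number of transitions) from one load profile. The function name, the extra mean-power feature, and the ON threshold of 5 W are assumptions made for illustration; the feature set actually used by the proposed system is not reproduced here.

```python
from typing import Dict, List

def extract_features(profile: List[float], on_threshold: float = 5.0) -> Dict[str, float]:
    """Summarize one load profile (power in W) with a few simple features.

    on_threshold is a hypothetical power level (W) above which the
    appliance is treated as switched on.
    """
    if not profile:
        raise ValueError("profile must contain at least one sample")
    states = [p > on_threshold for p in profile]
    # A transition is any change between the ON and OFF states.
    transitions = sum(1 for a, b in zip(states, states[1:]) if a != b)
    return {
        "max_power": max(profile),
        "mean_power": sum(profile) / len(profile),
        "num_transitions": float(transitions),
    }

# Hypothetical kettle-like profile: off, one ON burst, off again.
features = extract_features([0, 0, 2000, 2100, 2050, 0, 0])
```

A vector built from such values would then be passed to the ML block, whichever classifier is chosen.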

B. LOAD CLASSIFICATION
George Hart, as part of his work in the early 1990s [28], states that appliances can be classified according to their operational states as follows:
• Type 1: Devices with only two operational states (ON/OFF appliances), e.g., toaster, kettle, etc.
• Type 2: Multistate devices that can be represented by finite state machines (FSMs), e.g., washing machines, refrigerators, heat pumps, etc.
• Type 3: Continuously variable devices, whose consumption varies continuously rather than switching among a fixed set of states.
• Type 4: Permanent consumer devices, which remain active for a long time (days or weeks) and consume energy at a constant rate, e.g., TV receivers, telephone sets, smoke detectors, etc.
Continuously variable devices are considered extremely difficult to recognize since they can exhibit significantly different patterns depending on their usage [4]. Based on their consumption patterns, the class of permanent consumer devices can be understood as a sub-class of ON/OFF appliances [5]. FSMs or multistate devices have a varying number of states, and so the consumption of each appliance varies accordingly. This work processes all appliance profiles in the same manner, analyzing the load in a general way, so the procedure can be applied to all four types above. The feature extraction module that makes this possible is further explained as part of the proposed system.

IV. PROPOSED SYSTEM
In this section, the proposed ADL classification system based on ILM techniques is presented. As previously discussed, the ADL classification system operates at the application layer of the IoT platform, depicted in Fig. 1. Such a system allows identifying the daily activities of house occupants in a simple manner, which can be useful for various applications. More specifically, the advantages of the proposed system can be analyzed from three points of view:
• Architecture: As part of an IoT infrastructure, this system can be coupled with different home applications that contribute to implementing a sustainable smart grid and more efficient energy usage.
• Classification: This process requires providing a label for each sensor. This label corresponds to the device or appliance connected to the sensor. Machine learning models, such as FFNN, LSTM networks and SVMs allow labeling the data and contributing to the correct classifier generalization, which implies a proper performance in front of unseen data and removes the need for manually setting a label for each sensor.
• Consumers: They will be aware of their electrical consumption and activities and act accordingly to use energy efficiently. The size of each group was set to 105 samples, and to analyze these power measurements from the first nonzero sample, as suggested in [9], allows that those devices with a long duration, such as washing machines or dishwashers, can be represented with a full-length load profile. Three different models were tested for ML-based classification, including two neural networks: a FFNN and a LSTM, and a SVM classifier implemented instancing the SVC class provided by the scikit-learn library referenced in [29].
Details of these three models will be discussed in the next subsections. In the three cases, the model was trained with the same number of training samples in each class. After every device is identified, these labels along with the timestamps of every sample are used to recognize common activities of consumers. The specific algorithm implemented for this process is further discussed in subsection IV-E.
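The grouping step described above can be sketched as follows. This is a minimal illustration, not the implementation used in this work: the function name `split_into_groups` is hypothetical, the readings are synthetic, and dropping an incomplete trailing group is an assumption, since the handling of remainders is not specified.

```python
from typing import List, Sequence

GROUP_SIZE = 105  # group size reported in the text


def split_into_groups(power: Sequence[float],
                      group_size: int = GROUP_SIZE) -> List[List[float]]:
    """Split a power series into fixed-size groups, starting at the first
    nonzero sample.

    Trailing samples that do not fill a complete group are dropped
    (an assumption; the text does not say how remainders are handled).
    """
    # Locate the first nonzero sample; an all-zero series yields no groups.
    start = next((i for i, p in enumerate(power) if p != 0), None)
    if start is None:
        return []
    active = list(power[start:])
    n_groups = len(active) // group_size
    return [active[k * group_size:(k + 1) * group_size]
            for k in range(n_groups)]


# Synthetic readings (not UK-DALE data): 20 idle samples, then 250 active ones.
readings = [0] * 20 + [2000] * 250
groups = split_into_groups(readings)
```

Starting each profile at the first nonzero sample keeps the beginning of an operating cycle aligned with the start of a group, which is what allows long cycles to be represented consistently.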

A. FFNN CLASSIFIER
A feed-forward neural network (FFNN), or multilayer perceptron (MLP), is a machine learning model where information flows from the input through intermediate computations to finally reach the output. There are no feedback connections, meaning that no layer's output is fed back into itself. When determining an FFNN model configuration, there is no established procedure for choosing the number of hidden layers and neurons. Too many parameters lead to overfitting, which affects model generalization and performance; on the contrary, a very simple model tends to underfit, and thus more features need to be extracted from the data. The number of hidden layers and neurons is directly related to system requirements such as computational power, time, and labeled data [3]. The proposed FFNN classifier architecture is represented in Fig. 6. It consists of 10 input neurons, since 10 features are extracted from the sensor data. These are followed by two hidden layers of 500 and 100 neurons, respectively, alternated with a dropout layer that reduces overfitting and helps achieve higher accuracy. The number of neurons in the output layer depends on the number of classes into which the input data will be classified. This paper considers five different appliances; therefore, the output layer has five neurons.
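To give a sense of the model size implied by this configuration, the following sketch counts the trainable parameters of a fully connected network with the stated layer widths. It assumes one bias per neuron in every layer, which the text does not state explicitly; dropout layers add no trainable parameters. The helper name `dense_param_count` is illustrative.

```python
from typing import List

# Layer widths stated in the text: 10 inputs, hidden layers of 500 and 100
# neurons, and 5 outputs.
LAYER_SIZES = [10, 500, 100, 5]


def dense_param_count(layer_sizes: List[int]) -> int:
    """Count weights and biases of a fully connected network
    (assumes one bias term per neuron)."""
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out + n_out  # weight matrix plus one bias per unit
    return total


n_params = dense_param_count(LAYER_SIZES)  # 5500 + 50100 + 505 = 56105
```

Under these assumptions the network has 56,105 trainable parameters, most of them in the connection between the two hidden layers.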

B. LSTM CLASSIFIER
A long short-term memory (LSTM) network is a type of recurrent neural network (RNN) that employs a memory cell with gated inputs, outputs, and feedback loops. Its main contribution is that it addresses the vanishing gradient problem, very common in RNNs, where gradient information vanishes or explodes as it is propagated back through time. Thus, this kind of model is reported to be better suited for time series data [30]. The proposed LSTM classifier architecture can be seen in Fig. 7. Similar to the FFNN classifier, the number of cells in the input and output layers is 10 and 5, respectively, since the feature vector length is 10 and the number of target classes is 5. An intermediate dropout layer and a second LSTM hidden layer were included to improve generalization and to capture nonlinearities in the input data.

C. SVM CLASSIFIER
Support vector machines (SVMs) are an ML technique based on the structural risk minimization principle of statistical learning theory. They can be used for both classification and regression problems, and their main advantage lies in their working principle. An SVM constructs a hyperplane or set of hyperplanes in a high- or infinite-dimensional space. A good separation is achieved by the hyperplane that has the largest distance to the nearest training data points of any class, which is often referred to as the functional margin. The larger the margin, the lower the generalization error of the classifier.
To adapt the model to nonlinear data, the only requirement is to change the kernel. A kernel function takes the input data and transforms it into the required form. In this case, the default 'rbf' option was selected, which means that a radial basis function (RBF) acts as the kernel. This is a real-valued function whose value depends only on the distance from the origin or, alternatively, on the distance to some center [31].
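As an illustration of a function whose value depends only on the distance to a center, the Gaussian RBF kernel can be written as K(x, c) = exp(-gamma * ||x - c||^2). The sketch below evaluates it in plain Python; the value of `gamma` is an arbitrary illustrative choice and not the setting used in the experiments.

```python
import math
from typing import Sequence


def rbf_kernel(x: Sequence[float], center: Sequence[float],
               gamma: float = 0.1) -> float:
    """Gaussian RBF kernel: exp(-gamma * ||x - center||^2).

    gamma is an illustrative value, not the one used in the paper.
    """
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, center))
    return math.exp(-gamma * sq_dist)


same_point = rbf_kernel([1.0, 2.0], [1.0, 2.0])   # distance 0
near_point = rbf_kernel([0.0, 0.0], [1.0, 0.0])   # distance 1
far_point = rbf_kernel([0.0, 0.0], [3.0, 4.0])    # distance 5
```

The kernel equals 1 when the two points coincide and decays towards 0 as they move apart, so nearby feature vectors are treated as similar regardless of direction.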

D. TARGET APPLIANCES
Target appliances were selected considering the daily activity that could be inferred from their use. To identify various activities, a set of FSM appliances was analyzed, but the same analysis could be performed for every device in a household. The only requirement is to attach a sensor node to each target device. Selected appliances, along with some useful metadata available in the UK-DALE dataset, are summarized in Table 2. In this work, only appliances from House 1 were used for training.

E. ADL CLASSIFICATION ALGORITHM
The proposed ADL classification algorithm maps each ADL according to a set of criteria based on appliance usage: their power consumption and the timestamps when they are switched on. Following this algorithm, it is possible to detect a total of eleven ADLs: Washing the dishes after breakfast, Washing the dishes after lunch, Washing the dishes after dinner, Baking food for breakfast, Baking food for lunch, Baking food for dinner, Ironing, Drying hair, Doing laundry, Sleeping, and Unoccupied.
The sleeping and unoccupied ADLs are identified by an absence of detections of major appliances during the hours when the householder is most likely to be ''asleep'' (night hours) or ''out of the house'' (day hours). According to [4], continuously variable devices are extremely difficult to recognize since they can exhibit significantly different patterns depending on their usage, while permanent consumer devices (low power draw appliances) can be regarded as a sub-class of ON/OFF appliances. In the proposed algorithm, we assumed a scenario with only five multistate (FSM) appliances; therefore, no other loads are considered in the decision for non-activity.
The pseudo-code for the classification algorithm is presented in Algorithm 1. The proposed algorithm analyzes a time window of sensor data. After the classifier model for appliance recognition is loaded, the active power and sample timestamps of every target device are read. Then, several groups of samples are formed, and features are extracted from each of them. The resulting array of features is stored in a CSV file to be fed to the ML classifier model. The total number of groups formed depends on the size of the time window. The group size can also be modified, but for training it was set to 105 samples. From each feature vector, an activity with its corresponding date and time is returned. If no activity is detected (the vector could be all zeros or could come from a standby mode), then no appliance-based activity is registered. In these cases, if the timestamps indicate a night hour, sleeping is returned; otherwise, the house is classified as unoccupied. The criteria for classifying ADLs should be customized to each individual household, and the algorithm could be adapted to its most common daily activities.
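The mapping from a recognized appliance and its timestamp to one of the eleven ADLs can be sketched as below. This is a simplified illustration and not Algorithm 1 itself: the hour boundaries for meals and for night time, the label strings, and the function names are all hypothetical, since the criteria are household specific. In line with the definition of the two non-activity ADLs, night hours without detections map to sleeping and day hours to unoccupied.

```python
from datetime import datetime
from typing import Optional


def meal_for_hour(hour: int) -> str:
    """Hypothetical meal windows; real criteria are household specific."""
    if 5 <= hour < 11:
        return "breakfast"
    if 11 <= hour < 17:
        return "lunch"
    return "dinner"


def is_night(hour: int) -> bool:
    """Hypothetical night window (22:00 to 06:00)."""
    return hour >= 22 or hour < 6


def classify_adl(appliance: Optional[str], timestamp: datetime) -> str:
    """Map a recognized appliance label (None if nothing was detected)
    and its timestamp to an ADL name."""
    hour = timestamp.hour
    if appliance is None:
        # No appliance activity: decide between the two non-activity ADLs.
        return "Sleeping" if is_night(hour) else "Unoccupied"
    if appliance == "dishwasher":
        return f"Washing the dishes after {meal_for_hour(hour)}"
    if appliance == "oven":
        return f"Baking food for {meal_for_hour(hour)}"
    # Appliances whose ADL does not depend on the time of day.
    fixed = {
        "iron": "Ironing",
        "hair_dryer": "Drying hair",
        "washing_machine": "Doing laundry",
    }
    return fixed[appliance]


activity = classify_adl("oven", datetime(2021, 1, 1, 12, 30))
```

Keeping the time windows in small helper functions makes it straightforward to customize the criteria for a given household without touching the mapping itself.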

V. RESULTS
This section discusses all experiments and results obtained with the designed system. First, the three ML-based classifier models are compared in terms of accuracy, precision, recall, and F1-score, and then a sensitivity analysis is performed considering different group sizes.

A. EXPERIMENTS
1) UK-DALE DATASET
To perform the experiments, real data is needed. Since no custom data was available, the United Kingdom Domestic Appliance-Level Electricity (UK-DALE) dataset [32] was employed. It contains aggregated and disaggregated appliance data for five houses in London, England, over several years. The dataset is available at two resolutions: 6 s and 1 s. The data is stored in CSV files. The first column is a UNIX timestamp, and the remaining columns vary depending on the resolution used. For the 6 s data, the second column in each CSV file is a nonnegative integer that records the active power of the individual appliance. The data were gathered through smart plugs attached to individual appliances [3] to measure their energy consumption. Only House 1 was considered during training, mainly because each house in the dataset has a different number of appliances, and House 1 offers more possibilities for identifying different daily activities. The same appliances from House 5 were used to test the performance on new data. The results of this last experiment are discussed later in this section.
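The file layout described above (a UNIX timestamp followed by an integer active power reading) can be parsed with the standard library as sketched below. The function name `read_appliance_channel` is hypothetical, the sample rows are made up, and the delimiter is exposed as a parameter because the exact separator is not stated here; power is assumed to be in watts.

```python
import csv
import io
from datetime import datetime, timezone
from typing import List, TextIO, Tuple


def read_appliance_channel(handle: TextIO,
                           delimiter: str = ",") -> List[Tuple[datetime, int]]:
    """Parse rows of (UNIX timestamp, active power) from an open text file.

    The delimiter is an assumption; adjust it to match the actual files.
    """
    samples = []
    for row in csv.reader(handle, delimiter=delimiter):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        ts = datetime.fromtimestamp(int(row[0]), tz=timezone.utc)
        samples.append((ts, int(row[1])))
    return samples


# Made-up rows in the described layout, spaced 6 s apart (not real UK-DALE values).
sample_file = io.StringIO("1357000000,0\n1357000006,1850\n1357000012,1843\n")
samples = read_appliance_channel(sample_file)
```

Converting the timestamps to timezone-aware datetimes at load time simplifies the later step of deciding whether an activation falls in day or night hours.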

2) CLASSIFIER MODELS
As previously mentioned, three different classifier models were employed: a vanilla neural network (FFNN), an LSTM network, and an SVM classifier. The neural network models were implemented in Keras running on top of TensorFlow, while the SVM used the scikit-learn SVC class. Parameters were initially set at random and, depending on the results, readjusted to obtain the final configuration detailed in the previous section. The confusion matrix for each of the three classifiers is presented in Fig. 8 (a-c). As can be seen, the best results are obtained with the FFNN classifier, with only three misclassified samples. A higher number of incorrectly classified samples is obtained with the SVM classifier. The LSTM classifier achieved a slightly lower accuracy than the FFNN, but it is still above 0.9 while using fewer parameters. Table 3 shows a per-class comparison of the classifiers in terms of accuracy, precision, recall, and F1-score. In all three cases, misclassifications are mostly related to the hair dryer, dishwasher, and oven. For the SVM classifier, the iron obtained the best results, with 100% precision, recall, and F1-score. For the LSTM, the best results in terms of precision, recall, and F1-score vary across appliances. Higher precision is obtained for the iron, washing machine, and oven; however, the oven has lower recall, resulting in an F1-score of 0.86957 for this appliance. For the FFNN classifier, the F1-score was above 0.9 for all classes.
Given these results, the FFNN was stored and used in the activity recognition module. A second experiment was performed to test how the model generalizes to new data. The same five appliances were selected from a different house, in this case House 5. The confusion matrix for this experiment is shown in Fig. 8 (d). From this figure, it is evident that all metrics decrease, meaning that if the system is deployed in a different house, it will be necessary to collect data and retrain the classifier. For this scenario, the accuracy decreased to 0.61.
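For reference, the metrics used in this comparison can be derived from a confusion matrix as sketched below. The matrix is a made-up three-class example and does not reproduce the values in Fig. 8; the convention that rows are true labels and columns are predictions is an assumption.

```python
from typing import Dict, List


def per_class_metrics(cm: List[List[int]]) -> List[Dict[str, float]]:
    """Precision, recall and F1 per class.

    Assumes rows are true labels and columns are predicted labels.
    """
    n = len(cm)
    results = []
    for k in range(n):
        tp = cm[k][k]
        fp = sum(cm[i][k] for i in range(n)) - tp  # predicted k, true class differs
        fn = sum(cm[k]) - tp                       # true k, predicted otherwise
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        results.append({"precision": precision, "recall": recall, "f1": f1})
    return results


def accuracy(cm: List[List[int]]) -> float:
    """Fraction of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in cm)
    return sum(cm[k][k] for k in range(len(cm))) / total if total else 0.0


# Made-up 3-class matrix for illustration; not the values from Fig. 8.
toy_cm = [[10, 0, 0],
          [1, 8, 1],
          [0, 2, 8]]
metrics = per_class_metrics(toy_cm)
overall = accuracy(toy_cm)
```

A class can show high precision and lower recall at the same time, as reported for the oven, when few other appliances are mistaken for it but some of its own samples are assigned elsewhere.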

3) SENSITIVITY ANALYSIS
A sensitivity analysis was performed to analyze the impact of the group size on the accuracy of each model. Fig. 9 shows the results of this test. Three different group sizes (50, 75, and 150) were set and compared to the initial value of 105. From the graph in Fig. 9, it can be seen that if the group size is decreased, the accuracy also diminishes, although more samples are obtained for training. On the contrary, if the group size is increased, the accuracy can either improve or decrease depending on the model employed. For the LSTM classifier, a group size of 150 yields no misclassifications and thus 100% accuracy, but with fewer training samples, which might not be beneficial for its behavior on unseen data.
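The trade-off between group size and the number of training samples follows directly from integer division, as the short sketch below shows. The recording length is hypothetical (one day of readings at 6 s resolution) and is chosen only to make the arithmetic concrete.

```python
def groups_available(n_samples: int, group_size: int) -> int:
    """Number of complete groups that can be cut from n_samples readings."""
    return n_samples // group_size


# Hypothetical recording length: one day at 6 s resolution (14,400 samples).
N_SAMPLES = 24 * 60 * 60 // 6
counts = {size: groups_available(N_SAMPLES, size) for size in (50, 75, 105, 150)}
# counts == {50: 288, 75: 192, 105: 137, 150: 96}
```

Going from a group size of 105 to 150 reduces the number of available groups by roughly 30% in this example, which illustrates why a higher accuracy at 150 comes with fewer training samples.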
In all cases, the algorithm can recognize simultaneous activities within the analyzed time window. It is also important to note that activities can only be simultaneous when none of them is based on the absence of activations (i.e., sleeping and unoccupied).

VI. APPLIANCE RECOGNITION AND ADL CLASSIFICATION
In this section, a discussion of the results and future work will be presented. Based on the results obtained through different experiments, the system performance is satisfactory. For a certain input window, groups of samples are gathered, and features are extracted from each of them to be later labeled and associated with an activity executed by house occupants. Although the system does not work in real-time, it can identify simultaneous activities as it analyses a sequence of feature vectors independently of the timestamps.
To be used in a different household, the classifier should first be retrained. This is because houses may have the same types of appliances, but their consumption can vary depending on the vendor or appliance model. Thanks to feature extraction, different appliances can be modeled in a general way, independently of the type of load (e.g., ON/OFF, FSM), which means that a TV, a kettle, or a vacuum cleaner will be analyzed in the same way. If the group size is varied, the accuracy will also change, and some modifications to the model will be needed to achieve a higher score. Regarding the state of the art, the target appliances covered by the authors in [9] included a washing machine, an iron, a microwave oven, and a dishwasher. However, they ran different experiments to validate their results, using real-world data collected in an initial phase of a trial of the Energy@home system. In the case of the ADL classification, our algorithm is similar to the one presented by the authors in [3]. Both map each ADL according to a set of criteria based on appliance usage, such as the power consumption and the timestamps when appliances are switched on. However, the authors in [3] used a NILM approach to recognize appliances; therefore, their system needs to disaggregate the smart meter power consumption signal. Both algorithms associate the lack of use of electrical appliances during a certain time period, specifically during night or day hours, with the ''sleeping'' and ''unoccupied'' activities, which represent the periods when occupants are asleep or the house is empty, respectively. The authors in [3] used a different algorithm for each inferred activity, while we present a unified proposal. To compare the appliance recognition and ADL classification processes in general terms, Table 5 describes the two classifiers with respect to their inputs and outputs.
Furthermore, Fig. 10 shows a visual representation of the two concepts, expressing how they complement each other and how they were conceived for the proposed system. In the proposed algorithm, we assumed a scenario with only five appliances; therefore, no other loads are considered in the decision for non-activity. This criterion is based only on the absence of activations of all appliances. For a practical implementation of the system, this restriction should be relaxed, and more appliances need to be considered to obtain proper insights into the household occupants' behavior. By testing on real data from the publicly available UK-DALE dataset, we show that the proposed classification system performs accurately and, therefore, that it can be implemented in practice.
Future work will focus on making the system capable of working in real time. More appliances can also be included to identify a higher number of activities and obtain proper insights into the behavior of the household occupants, which will also make the system suited for a smart home with a considerable number of appliances. Furthermore, future work will study ways of including special devices such as movement sensors to better distinguish between the sleeping and unoccupied situations. Such a system can be implemented in a laboratory environment to gather and process real data that reflects the habits of house occupants. The contribution of this study is significant since it enables a great number of applications. To implement the real-time system, a communication network should also be deployed to allow data exchange between the different appliances and the proposed ADL classification system. Therefore, in the next stage, our work will focus on designing and implementing the complete IoT platform in a laboratory environment.

VII. CONCLUSION
Smart homes aim to facilitate the operation and management of household appliances so that they can be operated automatically and optimally. With the identification of appliance usage, a series of smart grid applications could be carried out, such as demand response and load planning. This work presented a framework for an IoT approach able to support distributed sensing and an ADL classification system based on ILM in smart homes. ILM is based on low-end meter devices attached to home appliances, in contrast to NILM techniques, where only a single point of sensing is needed. This work proposed an ADL classification system that combines state-of-the-art solutions among its different modules. ML models are applied in the appliance recognition module. Specifically, three different models were tested using the UK-DALE dataset: an FFNN, an LSTM and an SVM. Accuracy was above 0.9 for the FFNN and the LSTM classifiers and around 0.8 for the SVM. Once appliances are recognized, the ADL classification algorithm infers an activity based on the appliance label obtained and on the timestamps of the samples. To test the performance of the classifier on new data, the system was applied to a different house in the same dataset, which notably decreased the accuracy. Results suggest that before putting the system into full operation in a new household, it might be necessary to retrain the classifier with the new data. Another experiment was performed to analyze the impact of the group size on the ML classifier accuracy. These groups gather a fixed number of samples from which appliances are identified. Accuracy generally decreases when the group size is either reduced or increased relative to the initial value, except for the LSTM model, whose accuracy increases with a larger group size. Future work aims to improve the results obtained and implement the system in a laboratory environment.