Unsupervised Recognition of Multi-Resident Activities in Smart-Homes

Several methods have been proposed in the last two decades to recognize human activities based on sensor data acquired in smart-homes. While most existing methods assume the presence of a single inhabitant, a few techniques tackle the challenging issue of multi-resident activity recognition. To the best of our knowledge, all existing methods for multi-inhabitant activity recognition require the acquisition of a labeled training set of activities and sensor events. Unfortunately, activity labeling is costly and may disrupt the users’ privacy. In this article, we introduce a novel technique to recognize multi-inhabitant activities without the need of labeled datasets. Our technique relies on an unlabeled sensor data stream acquired from a single resident, and on ontological reasoning to extract probabilistic associations among sensor events and activities. Extensive experiments with a large dataset of multi-inhabitant activities show that our technique achieves an average accuracy very close to the one of state-of-the-art supervised methods, without requiring the acquisition of labeled data.


I. INTRODUCTION
The ability to recognize the activities going on in smart-homes is a central requirement in several application domains, including healthcare and home automation [1]. As a consequence, a plethora of activity recognition methods have been proposed in the last decades to recognize activities based on sensor data and artificial intelligence algorithms [2]. However, most activity recognition methods assume the presence of a single person in the home, while it is a common situation for people to live together and to concurrently execute activities. Unfortunately, techniques for single-resident activity recognition are ineffective when multiple persons concurrently execute activities in the smart space. Indeed, multiple streams of sensor events generated by the execution of different activities by different persons are treated as a whole by the single-resident activity recognition system. In general, the resulting single stream of sensor events does not match any activity model, thus confusing the recognition system. This fact limits the applicability of single-resident methods to restricted scenarios or to specific user categories.
The associate editor coordinating the review of this manuscript and approving it for publication was M. Anwar Hossain.
A few previous works have tackled the challenging issue of recognizing multi-resident activities [3]. Existing multi-resident techniques generally apply a two-step recognition process. In the first step, ''data association'' is applied to identify the resident who triggered each sensor event. In the second step, the result of data association is used to recognize the activities of each resident. Activity recognition is generally achieved by probabilistic models, such as Hidden Markov Models [4] and Conditional Random Fields [5], or by other machine learning algorithms.
To the best of our knowledge, all existing multi-resident activity recognition systems are based on supervised learning: they rely on a labeled training set of sensor data and activities. Unfortunately, activity labeling by an external observer may undermine the inhabitants' privacy. Moreover, the acquisition of such datasets incurs high overhead. In the literature, it is reported that labeling one hour of single-user activities may require from 30 minutes to 10 hours, depending on the data acquisition modality, and on the level of detail of annotations [6], [7]. Of course, labeling is even harder when multiple activities are executed concurrently by multiple persons in a smart environment.
In this article, we propose a novel method to recognize multi-resident activities without the need for labeled datasets. Our method relies on a weaker form of data association, which we name ''resident separation'', that consists in determining whether a pair of observed sensor events was generated by the same resident or by different residents. We propose an unsupervised data mining method to perform resident separation based on a stream of sensor events acquired in the smart-home during the execution of everyday activities by a single resident. Then, we apply knowledge-based reasoning to recognize the multi-resident activities going on in the smart-home considering the mined resident separation model and the stream of observed sensor events. Unsupervised activity recognition is achieved by exploiting semantic correlations among sensor events and activity classes, which are extracted by reasoning with an OWL 2 ontology [8].
VOLUME 8, 2020. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
We have implemented our algorithms, and evaluated our system on a well-known dataset of multi-resident activities performed by 26 pairs of individuals in a smart-home instrumented with several kinds of sensors. The results show that our unsupervised method for resident separation is accurate, correctly classifying 87.9% of the pairs of sensor events as either being generated by the same resident or by different residents. Moreover, our multi-resident activity recognition algorithm achieves an accuracy close to the one obtained by a state-of-the-art supervised method based on Hidden Markov Models, without requiring any data labeling.
The rest of the paper is structured as follows. Section II discusses related work. Section III introduces our multi-inhabitant activity recognition system. Algorithms for resident separation are illustrated in Section IV, while Section V explains the algorithms for activity recognition. Section VI reports experimental results, while Section VII concludes the paper and illustrates directions for future work.

II. RELATED WORK
Techniques for multi-resident activity recognition are generally composed of two parts: data association, and activity recognition. The goal of the former is to associate each sensor event to the resident who triggered it. The latter aims at recognizing the activity of each resident based on the result of data association. Some studies, including [9]-[12], focus on the problem of data association, while others consider this problem already solved, and directly focus on the recognition of activities [13]-[16].
Activity recognition systems may be classified in two categories: data-driven or knowledge-based ones [2]. To the best of our knowledge, all existing methods for recognizing multi-resident activities adopt a data-driven approach based on supervised learning [3]. On the contrary, in our work, we propose a hybrid technique combining data-driven resident separation (i.e., a weaker form of data association) and knowledge-based activity recognition. A strong point of our work is that it does not rely on labelled activity datasets, whose acquisition is costly and may negatively impact the resident privacy. Furthermore, we do not need any technique or device to perform data association, because we apply resident separation based on streams of unlabeled sensor events.
Different techniques and tools have been proposed to perform accurate data association using cameras or wearable devices [17], [18]; however, those tools are often perceived as obtrusive by several people [19]. Hence, in our work, we aim at performing multi-resident activity recognition without the use of cameras or specific wearables.
Crandall et al. propose to use Hidden Markov Models (HMM) for resident identification [9]. In their model, the hidden states represent the possible residents, while observations represent the sensor events emitted as a consequence of the residents' activities. Their supervised method achieved around 90% accuracy on different datasets. However, their method relies on a dataset of sensor events labeled with the resident that triggered the event, while our method does not require any labeled training set. Cook et al. propose the use of event density maps and Bayesian models to automatically track resident movements and sensor activations in sensor-dense environments such as smart-homes and smart-workplaces [10]. The result of resident tracking is later used to recognize resident interactions through supervised machine learning using HMM. The use of Conditional Random Fields was proposed by Hsu et al. for both data association and multi-resident activity recognition [11]. Chen and Tong propose the use of combined labels to represent the activities performed simultaneously and independently by multiple residents [12], and adopt supervised probabilistic models for activity recognition.
Singla et al. propose the use of HMM for multi-resident activity recognition assuming exact knowledge of data association [13]. With exact data association, an HMM is built for each resident, achieving 73.15% accuracy on activity recognition on the CASAS multi-resident dataset, which is the same dataset that we use in this work. Without data association, a single HMM is built for the two residents, and accuracy drops to 60.6%. Their results indicate that the ability to associate sensor events to different inhabitants is of foremost importance for multi-resident activity recognition. Exact knowledge of data association is also assumed by Ul Alam et al., who mine association rules from labeled data to reduce the complexity of a Hierarchical Dynamic Bayesian Network that captures correlations among sensor events and activities [14]. Other researchers assumed exact knowledge of data association, and applied different supervised learning algorithms, including Recurrent Neural Networks used by Tran et al. in [15], and Incremental Decision Trees used by Prossegger and Bouchachia in [16].
Most existing techniques for multi-resident activity recognition do not explicitly consider the interaction among inhabitants. However, modeling interaction may increase recognition rates when the residents perform collaborative activities. For this reason, Chiang et al. introduce a binary ''interaction feature'' attribute stating whether two residents are in the same room [20]. Gu et al. use Emerging Patterns distinguishing individual and collaborative activities [21]. However, all these methods require the acquisition of labelled training sets, and accurate tools or devices to associate each sensor event to the resident that triggered it. On the contrary, our technique does not require labeled data, and relies on the existing smart-home sensor infrastructure, without the need to wear or install additional hardware for data separation.

III. MULTI-INHABITANT ACTIVITY RECOGNITION SYSTEM
Before explaining our algorithm for resident separation, we provide a formal description of the problem. We denote by event type the pair ⟨sensor, value⟩, which indicates a possible value of a given sensor. For instance, the event type ⟨fridge_door_sensor, open⟩ indicates that the door of the fridge has been opened. An actual occurrence of an event type is given by a sensor event, that is a tuple ⟨timestamp, event_type, resident⟩. A sensor event indicates that a certain event_type occurred at a certain timestamp as a consequence of the action of a given resident. For simplicity, we denote that the sensor event was triggered by a given resident. For instance, the sensor event ⟨'2019-05-29 11:58:01.424', fridge_door_sensor, open, Alice⟩ indicates that the fridge door has been opened on 2019-05-29 at 11:58:01.424 due to an action of the resident Alice.
Resident separation consists in determining whether a pair of sensor events were triggered by the same (possibly anonymous) resident, or by different residents. The objective is to facilitate activity recognition when multiple residents are doing activities in the same environment. Note that this problem does not correspond to the data association problem described in Section II, which aims at identifying the specific individual that activated the sensor.
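The definitions above can be sketched as plain data structures. This is only an illustrative sketch: the class names `EventType` and `SensorEvent`, the field names, and the helper `same_resident` are assumptions introduced here, not part of the described system.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class EventType:
    """A possible value of a given sensor, e.g. (fridge_door_sensor, open)."""
    sensor: str
    value: str


@dataclass(frozen=True)
class SensorEvent:
    """An occurrence of an event type at a given time.

    The resident is None in unlabeled streams (hypothetical convention).
    """
    timestamp: datetime
    event_type: EventType
    resident: Optional[str] = None


def same_resident(a: SensorEvent, b: SensorEvent) -> bool:
    """Ground truth for resident separation: same resident or not.

    Note that the identity of the resident is never returned.
    """
    if a.resident is None or b.resident is None:
        raise ValueError("resident label missing")
    return a.resident == b.resident


fridge_open = EventType("fridge_door_sensor", "open")
e1 = SensorEvent(datetime(2019, 5, 29, 11, 58, 1, 424000), fridge_open, "Alice")
e2 = SensorEvent(datetime(2019, 5, 29, 11, 58, 3), fridge_open, "Bob")
```

The helper only answers the same/different question, which is what distinguishes resident separation from full data association.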

A. FRAMEWORK
In this work, we assume that two residents live together in a smart home. Through an infrastructure of sensors, the smart home system detects the basic actions of residents (e.g., opening the fridge, entering the kitchen), which generate sensor events. Sensor events are processed by an artificial intelligence module to recognize the residents' activities (e.g., cooking, cleaning).
The high-level scheme of our system is represented in Figure 1. The first step (Sensor installation) consists in setting up the sensor infrastructure. As in any smart-home system, when a new sensor is installed, it is necessary to assign a semantics to the data it produces, so that the data can be used effectively for activity recognition. For example, for the recognition of the 'cooking' activity, it is useful to know when the sensors connected to the stoves have been triggered by a resident. In some cases, it may also be necessary to specify the position of the sensor within the home. We assume that the infrastructure includes a network system to convey the data to an artificial intelligence module, deployed either in the home or on the cloud, for processing.
In the second step (Data acquisition), for a certain initial period the system acquires unlabeled sensor events to be used for resident separation. Those data are acquired when only a single resident is doing activities in the home. Different methods may be used to automatically detect the presence of a single person in the home, based on power consumption analysis or inexpensive sensors [22], but this aspect is out of the scope of this article. Single-resident sensor events are used by our system to build the model for resident separation. As shown in our experiments reported in Section VI, a few days of unsupervised data acquisition are sufficient to reliably build the model.
Once enough data has been acquired, our system processes the data to build the model for Resident separation. In particular, for each couple of sensor events $se_1$, $se_2$ generated at time $t_1$ and $t_2$, respectively, the model states whether $se_1$ and $se_2$ were likely triggered by the same resident or by different residents. The model of resident separation is built offline. The model can be refined incrementally when new single-resident sensor events are available by re-executing the model construction algorithm.
Finally, considering the predictions of online resident separation, the Activity recognition algorithm is in charge of recognizing the current activities carried out in the home, as described in Section V. We denote by activity class an abstract activity (e.g., eating or working), and by activity instance the actual occurrence of an activity of a given class during a certain time period.

B. PROBLEM FORMULATION
We model the resident separation problem as a binary classification task, in which each record is a pair of sensor events $\langle se_1, se_2 \rangle$, and the class of the record is 1 if $se_1$ and $se_2$ were triggered by the same resident; it is 0 otherwise. Consequently, in order to evaluate the accuracy of our resident separation algorithm, we define:
• True positive (TP): records in which the sensor events are activated by the same resident, and correctly classified.
• True negative (TN): records in which the sensor events are activated by different residents, and correctly classified.
• False positive (FP): records in which the sensor events are activated by different residents, and misclassified.
• False negative (FN): records in which the sensor events are activated by the same resident, and misclassified.
As explained before, in order to make our system feasible for real-world applications, we adopt an unsupervised approach. In fact, a supervised approach requires, in an initial period, the annotation of the activities performed by the inhabitants, but it is well known that activity annotation is costly and obtrusive. In an unsupervised system, on the other hand, no manual annotation is required: this brings benefits in terms of costs, comfort and usability of the system in a real-world context.

C. ARCHITECTURE
Figure 2 shows our architecture for multi-inhabitant activity recognition. The smart-home is instrumented with sensors to detect presence at certain locations, and interactions with items, devices and furniture. Raw sensor events are collected by the smart home infrastructure, and passed to the Semantic aggregation module. That module is in charge of applying simple preprocessing rules to transform raw data into sensor events. For example, if at time t the fridge door binary sensor S_11 produces a raw value of type ''1'', the module transforms the raw sensor data into the sensor event ⟨t, fridge_door_sensor, open, r⟩. The stream of sensor events is passed to the Resident separation module, which applies resident separation according to the trained model, as explained in Section IV. Finally, the stream of sensor events, as well as the classified sensor records, are used by the Activity recognition module to recognize the activities occurring in the smart home based on an ontological model, as explained in Section V.
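As a rough illustration of the Semantic aggregation step, the following sketch maps raw readings to sensor events through a lookup table. Only the rule for sensor S_11 with raw value "1" comes from the text; the second rule, the table name `RULES`, and the function `aggregate` are hypothetical, and the resident field is omitted because it is unknown at this stage.

```python
# Preprocessing rules: (raw sensor id, raw value) -> (sensor name, value).
RULES = {
    ("S_11", "1"): ("fridge_door_sensor", "open"),   # rule given in the text
    ("S_11", "0"): ("fridge_door_sensor", "close"),  # assumed counterpart
}


def aggregate(raw_stream):
    """Turn raw readings (t, sensor_id, raw_value) into sensor events.

    Readings without a matching rule are dropped (an assumed policy).
    """
    for t, sensor_id, raw_value in raw_stream:
        event_type = RULES.get((sensor_id, raw_value))
        if event_type is not None:
            yield (t, *event_type)


raw = [(1.0, "S_11", "1"), (2.0, "S_99", "1"), (3.5, "S_11", "0")]
events = list(aggregate(raw))
```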

IV. RESIDENT SEPARATION ALGORITHMS
Since our algorithm is unsupervised, we use unlabeled records to extract statistics useful for our classification task. In particular, our intuition is that if two sensor events of given types often occur in consecutive temporal order (e.g., events of type open_fridge_door and close_fridge_door, respectively), or within a small time interval, they are likely triggered by the same person in order to perform a certain activity or a set of concurrent activities. On the contrary, if two sensor events of given types are rarely observed in temporal proximity (e.g., events of type open_fridge_door and flush_the_toilet, respectively), they are likely triggered by different residents executing different activities in the home. Hence, for building our resident separation model, we mine the training set of sensor events to extract statistical information about the co-occurrence of events of given sensor types, and use it to perform the classification of new records.
Formally, we represent the training set as a temporal sequence of sensor events:
$$T = \langle se_1, se_2, \ldots, se_n \rangle$$
where, for each couple $(se_i, se_j)$ with $i < j$, $se_i$ was generated before $se_j$. We denote by $type(se)$ the event type of the sensor event $se$.
We have devised different approaches and algorithms for building the resident separation model. The first approach is named consecutive events. With this approach, the model considers two statistical measures. The first measure is named event type occurrences. For each couple of event types $c = \langle et_i, et_j \rangle$, we count the number of consecutive events $se_k$, $se_{k+1}$ in $T$ of types $et_i$ and $et_j$, irrespective of their order:
$$n_c = \left|\left\{ k : \{type(se_k), type(se_{k+1})\} = \{et_i, et_j\} \right\}\right|$$
where $k$ ranges over the positions of consecutive events in $T$. Then, when we observe a consecutive pair of sensor events $se_k$, $se_{k+1}$ of types $et_i$ and $et_j$, respectively, we compute the value of $n_c$, where $c = \langle et_i, et_j \rangle$. If the value of $n_c$ is below a certain threshold $\tau$, we classify those events as triggered by different residents; we classify them as triggered by the same resident otherwise.
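The event type occurrences measure and the threshold rule can be sketched as follows. This is a minimal sketch, assuming event types are represented as plain strings; the function names are hypothetical.

```python
from collections import Counter


def count_consecutive_pairs(types):
    """Compute n_c for every unordered couple of event types.

    `types` is the sequence type(se_1), ..., type(se_n) of the training set.
    A frozenset key makes the count independent of the order of the two types.
    """
    counts = Counter()
    for a, b in zip(types, types[1:]):
        counts[frozenset((a, b))] += 1
    return counts


def classify_by_occurrences(counts, a, b, tau):
    """Return True for 'same resident' (n_c >= tau), False otherwise."""
    return counts[frozenset((a, b))] >= tau


training = ["open_fridge_door", "close_fridge_door",
            "open_fridge_door", "close_fridge_door", "flush_the_toilet"]
counts = count_consecutive_pairs(training)
```

With this toy sequence, the fridge open/close couple is observed three times, while the fridge/toilet couple is never observed consecutively and would be classified as triggered by different residents for any positive threshold.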
The second measure is named event type frequency $f_c$, and relies on the following formula:
$$f_c = \frac{n_c}{n_{se_k}}$$
where $n_c$ is the number of occurrences of the event types $c$ in $T$, and $n_{se_k}$ is the number of occurrences of $type(se_k)$ events in $T$. Event type frequency is introduced because the number of occurrences of the couple of event types does not take into account how numerous the individual types composing the couple are. For example, we could have an event type $et_1$ with only 3 occurrences in $T$, which in all 3 cases is observed immediately before a sensor event of type $et_2$.
In this case, although the number of occurrences of events of those types is low, its event type frequency is high, possibly indicating that the two events are related to the execution of a certain activity. Then, in order to classify pairs of sensor events, we adopt the same method explained above. The algorithm pseudo-code for building the model is reported in Algorithm 3.
The second approach is named temporally close events. With this approach, we consider all the couples of sensor events generated within a time window of $\delta$ seconds, including non-consecutive sensor events. In fact, it is likely that sensor events generated within a restricted time window were triggered by the activity of a single resident. We denote by $\tau(se)$ the UNIX timestamp of a sensor event. Given a couple of event types $c = \langle et_i, et_j \rangle$, we compute its event type occurrences measure $n_c$ according to the formula below:
$$n_c = \left|\left\{ (k, l) : k < l,\ \tau(se_l) - \tau(se_k) \leq \delta,\ \{type(se_k), type(se_l)\} = \{et_i, et_j\} \right\}\right|$$
The event type frequency measure is computed as:
$$f_c = \frac{n_c}{n_{se_k}}$$
where $n_{se_k}$ is the number of occurrences of $type(se_k)$ events in $T$. The classification method is the same used in the consecutive events approach.
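The temporally close events approach and the frequency measure can be sketched as below. Two assumptions are made here and should be checked against the original formulas: the frequency is taken to be the ratio between the couple count and the count of the first event's type, and the window length is passed as a parameter named `delta`. Function names are hypothetical.

```python
from collections import Counter


def count_close_pairs(events, delta):
    """Count couples of event types occurring within `delta` seconds.

    `events` is a list of (unix_timestamp, event_type), sorted by timestamp.
    Non-consecutive events are included as long as they fall in the window.
    """
    counts = Counter()
    for k, (t_k, et_k) in enumerate(events):
        for t_l, et_l in events[k + 1:]:
            if t_l - t_k > delta:
                break  # later events are even further away
            counts[frozenset((et_k, et_l))] += 1
    return counts


def frequency(counts, type_counts, first_type, second_type):
    """Assumed form of the event type frequency: n_c / n_{se_k}."""
    n_first = type_counts[first_type]
    if n_first == 0:
        return 0.0
    return counts[frozenset((first_type, second_type))] / n_first


events = [(0.0, "a"), (0.5, "b"), (1.0, "a"), (5.0, "b")]
pair_counts = count_close_pairs(events, delta=1.0)
type_counts = Counter(et for _, et in events)
```

A pair is then classified as "same resident" when its frequency reaches the chosen threshold, exactly as in the occurrences-based rule.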

V. RECOGNITION OF MULTI-INHABITANT ACTIVITIES
The Activity recognition module receives the stream of sensor events, as well as the results of the resident separation algorithm. The goal of activity recognition is to associate each sensor event to the class of the activity that generated it. We propose a novel unsupervised method for activity recognition, which relies on Hidden Markov Models (HMM). In the literature, HMMs have been extensively used for supervised activity recognition [2]. HMM is a statistical Markov model, characterized by joint probabilities among random variables representing hidden and observable states [23]. Figure 5 represents our formulation of HMM for activity recognition. The observable layer consists of the sensor events, while the hidden layer consists of the performed activities. We denote by $A$ the set of activity classes, and by $E$ the set of event types. The system is characterized by the following probability distributions:
• emission probabilities represent the probability that a sensor event type $se_i \in E$ is observed given that the current activity class is $ac_i \in A$;
• transition probabilities represent the probability of the current activity class being $ac_{i+1}$ given that the previous activity class was $ac_i$;
• initial probabilities represent the probability distribution of the first activity class $ac_1$.
Based on an observed sequence of sensor events and on the HMM parameters, we use the Viterbi algorithm [24] to derive the most likely sequence of activities that may have generated the observations.
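For reference, the decoding step can be sketched with a textbook Viterbi implementation over dictionaries. The activity names, event names, and probability values below are made up for illustration; a practical implementation would typically work with log-probabilities to avoid numerical underflow on long streams.

```python
def viterbi(observations, states, initial, transition, emission):
    """Return the most likely hidden state sequence for the observations."""
    # best[s]: probability of the best path that ends in state s.
    best = {s: initial[s] * emission[s][observations[0]] for s in states}
    back = []  # one dict of back-pointers per step after the first
    for obs in observations[1:]:
        prev, best, pointers = best, {}, {}
        for s in states:
            p = max(states, key=lambda q: prev[q] * transition[q][s])
            best[s] = prev[p] * transition[p][s] * emission[s][obs]
            pointers[s] = p
        back.append(pointers)
    # Backtrack from the most probable final state.
    path = [max(states, key=lambda s: best[s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


states = ["cooking", "eating"]
initial = {"cooking": 0.5, "eating": 0.5}
transition = {"cooking": {"cooking": 0.9, "eating": 0.1},
              "eating": {"cooking": 0.1, "eating": 0.9}}
emission = {"cooking": {"stove_on": 0.8, "chair_used": 0.2},
            "eating": {"stove_on": 0.1, "chair_used": 0.9}}
decoded = viterbi(["stove_on", "stove_on", "chair_used", "chair_used"],
                  states, initial, transition, emission)
```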
In the following, we present two approaches to instantiate the parameters of the HMM. The first, named baseline, is supervised, and it is commonly used for activity recognition [2]. The second is unsupervised, and it is based on automatic extraction of the HMM parameters through knowledge-based reasoning. We use the former as a baseline to evaluate the latter, which is the unsupervised HMM-based approach that we propose in this work.

A. BASELINE SUPERVISED HMM-BASED METHOD
In the baseline method, we extract the HMM parameters from a labelled dataset D of sensor data acquired during the execution of a set of activities.
The emission probability matrix EPM contains in each cell $[ac_i, se_j]$ the conditional probability of observing a sensor event type $se_j$ given that the current activity class is $ac_i$. This value is computed according to the following formula:
$$EPM[ac_i, se_j] = \frac{occurrences(se_j, ac_i)}{occurrences(ac_i)}$$
where we denote by $occurrences(se_j, ac_i)$ the number of times that $se_j$ is observed during $ac_i$ in $D$, while $occurrences(ac_i)$ is the number of times that $ac_i$ occurs in $D$.
The transition probability matrix TPM is calculated for each cell $[ac_i, ac_j]$ according to the following formula:
$$TPM[ac_i, ac_j] = \frac{occurrences(ac_i, ac_j)}{occurrences(ac_i)}$$
where we denote by $occurrences(ac_i, ac_j)$ the number of times that a sensor event emitted by $ac_i$ is observed in $D$, followed by the observation of a sensor event emitted by $ac_j$. We denote by $occurrences(ac_i)$ the total number of sensor events emitted by $ac_i$ in the dataset.
The initial probability array IPA contains for each activity class $ac_i$ the probability that it is the first performed activity class. The formula is:
$$IPA[ac_i] = \frac{occurrences(ac_i)_{t_1}}{N_{res}}$$
where $occurrences(ac_i)_{t_1}$ is the number of times that the first activity of a resident $k$ in $D$ is $ac_i$, and $N_{res}$ denotes the number of residents in $D$.
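The counting behind the three parameter sets can be sketched as follows. This sketch assumes that the labeled dataset is given as one sequence of (event type, activity) pairs per resident, and that occurrences of an activity are counted as the number of sensor events labeled with it; the function name and data layout are hypothetical.

```python
from collections import Counter


def estimate_hmm(sequences):
    """Estimate EPM, TPM and IPA by counting over labeled sequences.

    `sequences` holds one non-empty list of (event_type, activity) pairs
    per resident.
    """
    act, emit, trans, first = Counter(), Counter(), Counter(), Counter()
    for seq in sequences:
        first[seq[0][1]] += 1               # first activity of this resident
        for event_type, activity in seq:
            act[activity] += 1
            emit[(activity, event_type)] += 1
        for (_, a), (_, b) in zip(seq, seq[1:]):
            trans[(a, b)] += 1              # event of a followed by event of b
    epm = {key: n / act[key[0]] for key, n in emit.items()}
    tpm = {key: n / act[key[0]] for key, n in trans.items()}
    ipa = {activity: n / len(sequences) for activity, n in first.items()}
    return epm, tpm, ipa


labeled = [
    [("stove_on", "cooking"), ("stove_on", "cooking"), ("chair_used", "eating")],
    [("chair_used", "eating"), ("chair_used", "eating")],
]
epm, tpm, ipa = estimate_hmm(labeled)
```

Because the denominator of the transition estimate counts all events of an activity, including the last event of each sequence, the rows of the resulting matrix may sum to slightly less than 1 on small datasets.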

B. UNSUPERVISED HMM-BASED METHOD
In order to avoid the acquisition of a labelled dataset of activities, we propose a novel technique to derive the HMM parameters using a knowledge-based approach. In particular, our method relies on semantic correlations [25], which represent probabilistic dependencies among event types and activity classes. Given an event type $et \in E$ and an activity class $ac \in A$ (where $E$ is the set of event types and $A$ is the set of activity classes), the semantic correlation function $SC : E \times A \rightarrow [0, 1]$ gives the probability of $et$ being triggered by the execution of an activity of class $ac$. As a consequence, given any event type, $SC$ is a probability distribution over all activity classes:
$$\forall et \in E: \sum_{ac \in A} SC(et, ac) = 1$$
By definition, semantic correlations correspond to the emission probabilities of our HMM. In order to derive them in an unsupervised fashion, we compute semantic correlations extending the knowledge-based method described in [26]. In particular, we re-use an OWL 2 ontology [8] modeling activities, context, and sensors in the smart home. The ontology is available online. 1 In the ontology, activities are defined in terms of the key objects that are typically used during their execution.
Example 1: For instance, the ontology defines PreparingHotMeal as an activity that requires the usage of a CookingInstrument.
Moreover, the ontology is filled with instances of sensors in the smart home that are related to the usage of certain objects.
Example 2: Suppose that the smart home includes a power sensor monitoring the usage of the oven. Then, an instance of Sensor in the ontology is related by the property detectsUsageOf to an instance of the class Oven, which is a subclass of CookingInstrument.
Then, through ontological reasoning, we compute the correlations among sensor types and activity classes.
Example 3: Continuing the above example, the ontological reasoner determines that the oven power sensor is related to the activity ''preparing hot meal'', since it detects the usage of a cooking instrument, which is a key object for that activity according to its ontological definition.
The method to derive semantic correlations among object sensors and activity classes is described in detail in [26]. In our extension, we also consider the presence of the user at certain locations as an indicator of a given activity. For instance, PreparingMeal is defined in our ontology as an activity that is typically executed in the kitchen. Hence, using the same method described above, we derive semantic correlations among presence sensors deployed at certain locations and activity classes.
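The reasoning chain of Examples 1 to 3 can be mimicked with a toy dictionary-based model. This is not the OWL 2 reasoning procedure of [26]: the dictionaries stand in for the ontology, the activity WatchingTV is a made-up second class, and splitting the probability uniformly among the related activities is an assumption made only to obtain a valid distribution.

```python
# Toy stand-ins for ontology axioms (Examples 1 and 2).
SUBCLASS_OF = {"Oven": "CookingInstrument"}
DETECTS_USAGE_OF = {"oven_power_sensor": "Oven"}
REQUIRES = {"PreparingHotMeal": {"CookingInstrument"}}


def ancestors(cls):
    """Return the class together with all of its superclasses."""
    found = {cls}
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        found.add(cls)
    return found


def semantic_correlations(sensor, activities):
    """Distribution over activity classes for the events of one sensor."""
    classes = ancestors(DETECTS_USAGE_OF[sensor])
    related = [a for a in activities if REQUIRES.get(a, set()) & classes]
    # Assumed fallback: a sensor related to no activity is uninformative.
    candidates = related or list(activities)
    share = 1 / len(candidates)
    return {a: (share if a in candidates else 0.0) for a in activities}


sc = semantic_correlations("oven_power_sensor", ["PreparingHotMeal", "WatchingTV"])
```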
We manually set transition probabilities based on common sense. Since a user normally performs the same activity for a given lapse of time before changing activity, we assign a higher probability p to transitions between the same activity class, and we uniformly distribute the remaining (1−p) probability to transitions to the other classes. However, depending on the set of considered activities, the transition matrix can be fine-tuned based on the typical order of activity execution. For instance, it is common that ''eating'' happens after ''cooking'', while the contrary is unlikely.
Finally, we set the initial probability to the uniform distribution, since activity recognition may start at any time of the day and on every possible context condition; hence, we have no knowledge to set initial probability values based on common sense.
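The manually set transition and initial probabilities described above can be built in a few lines. The function names and the three activity labels are placeholders; the value 0.9 is used only as an example of a high self-transition probability.

```python
def transition_matrix(activities, p):
    """Self-transition probability p, remaining 1 - p spread uniformly.

    Requires at least two activity classes.
    """
    off_diagonal = (1 - p) / (len(activities) - 1)
    return {a: {b: (p if a == b else off_diagonal) for b in activities}
            for a in activities}


def uniform_initial(activities):
    """Uniform initial distribution over the activity classes."""
    return {a: 1 / len(activities) for a in activities}


activities = ["cooking", "eating", "cleaning"]
tpm = transition_matrix(activities, p=0.9)
ipa = uniform_initial(activities)
```

Fine-tuning for typical orderings, such as ''eating'' after ''cooking'', would amount to adjusting individual off-diagonal cells while keeping each row normalized.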

C. UNSUPERVISED HMM-BASED METHOD WITH RESIDENT SEPARATION
In the unsupervised HMM-based method with resident separation, we exploit the result of resident separation to assign each observed sensor event to an anonymous resident. Of course, the number of residents has an impact on the resident separation algorithm. For the sake of this work, we assume that there are two residents in the smart-home. The first event is assigned to an arbitrary ''resident 0''. Then, if the record composed of the first and second events is classified as generated by the same resident, the second event is also assigned to ''resident 0''. Otherwise, it is assigned to a different ''resident 1''. We repeat the same procedure for the record composed of the second and third sensor events, and so on.
Finally, we separately apply the unsupervised HMM-based method described in Section V-B to each resident's stream of sensor events.
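The assignment procedure for two residents can be sketched as follows, assuming a predicate `same_resident(prev, cur)` that wraps the trained resident separation model. Both function names and the toy predicate are hypothetical.

```python
def assign_residents(events, same_resident):
    """Label each event with anonymous resident 0 or 1.

    Each event keeps the label of the previous event when the pair is
    classified as 'same resident', and takes the other label otherwise.
    """
    if not events:
        return []
    labels = [0]  # the first event goes to an arbitrary "resident 0"
    for prev, cur in zip(events, events[1:]):
        labels.append(labels[-1] if same_resident(prev, cur) else 1 - labels[-1])
    return labels


def split_streams(events, labels):
    """Build one stream of sensor events per anonymous resident."""
    streams = {0: [], 1: []}
    for event, label in zip(events, labels):
        streams[label].append(event)
    return streams


events = ["open_fridge", "close_fridge", "flush_toilet", "tap_on"]
# Toy classifier: only the fridge/toilet pair is 'different residents'.
labels = assign_residents(events, lambda a, b: (a, b) != ("close_fridge", "flush_toilet"))
streams = split_streams(events, labels)
```

Each resulting stream can then be decoded independently with the HMM.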

VI. EXPERIMENTAL EVALUATION
In this section, we report our experiments about resident separation and multi-inhabitant activity recognition.

A. DATASET AND EXPERIMENTAL SETUP
In our experiments, we used a real-world dataset acquired and labeled by researchers of the Center for Advanced Studies in Adaptive Systems (CASAS) at Washington State University. The dataset is available online. 2 The dataset was acquired in a smart-home instrumented with more than 60 sensors, including passive infrared motion sensors, temperature sensors, and sensors attached to doors, furniture, and items. Based on manual inspection, each sensor event was manually annotated with the resident that triggered it, and with the activity that he/she was performing.
The data were collected while two participants performed a set of fifteen scripted activities in a smart-home. The activities were executed by 26 pairs of residents. The considered activities are listed in Table 1.
Since our activity recognition method does not aim at recognizing the identity of the actor, we had to disregard certain activities. In particular, we disregarded activities 4 and 10 because they are identical, except for the actor's identity. For the same reason, we disregarded activities 14 and 15, since they only differ on the actor and used tools, and the smart home does not provide any sensor to detect the interaction with those tools. Finally, we disregarded activity 5 (watering plants), since the smart home does not provide enough sensors to recognize it. Indeed, the sensor infrastructure cannot detect events related to the interaction with the watering can or with plants, which are essential to detect that activity. Hence, in the following, we report on experiments performed using the remaining 10 activities only.
2 http://casas.wsu.edu/datasets/adlmr.zip
When evaluating the recognition techniques, we count the number of true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN). The metrics used to evaluate the effectiveness of our algorithms are:
• accuracy is the percentage of correct predictions of the classifier and is defined as: $\frac{TP + TN}{TP + TN + FP + FN}$;
• precision $= \frac{TP}{TP + FP}$;
• recall $= \frac{TP}{TP + FN}$;
• $F_1$ score is the harmonic mean of precision and recall and is defined as: $\frac{2 \cdot TP}{2 \cdot TP + FP + FN}$.
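These four metrics translate directly into code. The sketch below uses a hypothetical function name and made-up counts, and it does not guard against empty denominators.

```python
def evaluation_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall and F1 from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
        "f1": 2 * tp / (2 * tp + fp + fn),
    }


scores = evaluation_metrics(tp=6, tn=2, fp=1, fn=1)
```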

B. RESIDENT SEPARATION
We model the resident separation problem as a binary classification problem, in which the positive class for an instance $\langle se_i, se_j \rangle$ is ''same resident'' (i.e., both $se_i$ and $se_j$ were triggered by the same resident), while the negative class is ''different residents''.
In these experiments, we evaluated our resident separation algorithms introduced in Section IV. We applied cross-validation with 26 iterations, iteratively using the data of one pair of residents for testing, and the data of the other 25 pairs for training the model. We recall that, since our technique is unsupervised, while training the model for resident separation we do not consider the dataset labels (residents and activities), but only the unlabeled sensor events.
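The evaluation protocol corresponds to a leave-one-group-out split, which can be sketched as below. The generator name is hypothetical, and integers stand in for the sensor event streams of the 26 pairs of residents.

```python
def leave_one_pair_out(pair_streams):
    """Yield (training streams, test stream), one fold per pair of residents."""
    for i in range(len(pair_streams)):
        yield pair_streams[:i] + pair_streams[i + 1:], pair_streams[i]


pair_streams = list(range(26))  # placeholders for the 26 recorded streams
folds = list(leave_one_pair_out(pair_streams))
```

Within each fold, the model would be built from the unlabeled events of the training streams, and the labels of the held-out stream would be used only for scoring.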

1) CONSECUTIVE EVENTS APPROACH
At first, we evaluated the consecutive events approach. The results of our algorithm based on event type occurrences are listed in Table 2. We evaluated the technique using increasing values of the $\tau$ threshold. The best results are obtained with $\tau = 8$, achieving an accuracy of 0.8741. In general, reducing the value of $\tau$, we reduce the number of false negatives (i.e., consecutive sensor events generated by the same resident but wrongly classified), but we increase the number of false positives. Values of $\tau$ between 6 and 8 provide the best tradeoff between false positives and false negatives.
We achieve slightly better results using the event type frequency measure. Results are reported in Table 3. The best results are obtained with values of threshold τ of 0.01 and 0.02. Even with event type frequency, reducing the τ value determines a reduction of false negatives and an increase of false positives.

2) TEMPORALLY CLOSE EVENTS APPROACH
Results obtained with the temporally close events approach are in line with those of the consecutive events one. In particular, the best results are obtained using the frequency-based measure; we do not report results achieved by the occurrence-based measure due to lack of space. Results using a time window of 1, 2, and 3 seconds are shown in Tables 4, 5 and 6, respectively. Overall, the achieved accuracy is slightly lower than the one obtained with the consecutive events approach. This result may be due to the nature of the dataset, which contains scripted activities carried out in short time periods. When training the model with activities carried out in naturalistic environments for longer time periods, we expect to obtain higher accuracy using the temporally close events approach.
In the rest of the experiments, we adopt the consecutive events approach, using the event type frequency measure. Overall, with that method we achieved an accuracy close to 88%. Compared to other approaches, we consider this result positive. For instance, the supervised HMM-based method for data association proposed in [9] achieves an accuracy of around 90%. The accuracy of our method is slightly lower, but our method has the advantage of not requiring the acquisition of a training set of sensor events labeled with the resident that triggered the events.

C. ACTIVITY RECOGNITION
In the following experiments, we evaluate our activity recognition method based on resident separation. For each sensor event se, the objective is to identify the class of the activity that triggered it.

1) BASELINE SUPERVISED HMM-BASED METHOD
As explained in Section V-A, we use this supervised method as a baseline to evaluate our unsupervised technique. For the sake of this experiment, we extract the HMM parameters from the data, using the same cross-validation approach explained in Section VI-B. However, the activities in the dataset were scripted, imposing a fixed order of activity execution. This fixed order is not representative of a real-world situation, in which humans perform most activities in variable order. Hence, extracting the TPM from the data would introduce a strong bias, which would affect the results. For this reason, we manually set the TPM with diagonal values d, uniformly distributing the remaining probability mass among the other cells of each row. We choose the value d = 0.9 because it achieves the highest recognition rates in our experiments.
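The manually set TPM can be sketched as follows: each diagonal cell holds d, and the remaining mass 1 - d is split evenly among the other cells of the same row, so that every row sums to 1. The function name `build_tpm` and the number of states in the example are hypothetical; only the value d = 0.9 comes from the text.

```python
def build_tpm(n_states, d=0.9):
    """Build an n_states x n_states transition matrix with diagonal d."""
    if n_states < 2:
        raise ValueError("at least two states are needed")
    if not 0.0 <= d <= 1.0:
        raise ValueError("d must be a probability")
    off_diagonal = (1.0 - d) / (n_states - 1)
    return [[d if i == j else off_diagonal for j in range(n_states)]
            for i in range(n_states)]


# Example with 4 states, an arbitrary size chosen for illustration.
tpm = build_tpm(4, d=0.9)
```

A high diagonal value encodes the expectation that consecutive sensor events usually belong to the same activity, without favoring any particular order of activities.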
We evaluate the supervised HMM approach without data association; i.e., considering the whole sensor data stream as triggered by a unique resident. The achieved average accuracy is 0.6962.

2) UNSUPERVISED HMM-BASED METHOD
In this experiment, we evaluate the unsupervised HMM-based technique described in Section V-B. The achieved results are reported in Table 7. Overall, the achieved accuracy is 0.6711. Compared with the supervised HMM-based method without data association, the unsupervised method is less accurate. This result is explained by the fact that semantic associations derived by ontological reasoning are less accurate than emission probabilities computed from the data. Indeed, semantic correlations represent generic relationships among sensor events and activities. On the contrary, emission probabilities extracted from the data are fine-tuned to the specific environment.
By closely inspecting the results, we notice that certain activities are particularly hard to recognize. The lowest $F_1$ score was achieved by activity 9 (set dining room table), probably because it involves several movements and the use of several items, which are difficult to distinguish from the execution of other activities. On the contrary, activity 1 (fill medication dispenser) achieved the highest $F_1$ score, probably because it involves the usage of specific items and the presence at a specific location.

3) UNSUPERVISED HMM-BASED METHOD WITH RESIDENT SEPARATION
Finally, we evaluate the HMM-based technique described in Section V-C, which exploits our resident separation method.
Results are shown in Table 8. Overall, the achieved accuracy is 0.7213. Hence, our unsupervised technique achieves higher accuracy than the baseline supervised HMM-based method. An additional benefit of our technique is that it does not need the acquisition of labeled datasets of activities and sensor events. Moreover, the introduction of resident separation into the unsupervised HMM-based method improves accuracy by 0.05.
Compared to other techniques applied to the same dataset, the accuracy obtained by our unsupervised method is close to the one obtained by the supervised HMM-based technique reported in [13]. Indeed, that method achieves 0.7315 accuracy assuming exact data separation. On the contrary, our method assumes neither data separation nor the existence of a labeled training set. Our method also outperforms the supervised technique based on Conditional Random Fields proposed in [11], which achieves 0.6416 accuracy.
By comparing the results of the unsupervised HMM-based method with and without resident separation, we notice that resident separation improves the $F_1$ score of most activities. The only activities that are negatively affected by resident separation are activities 3, 11, and 13. In fact, in all those activities, a resident asks the other resident for help to complete the task. Hence, they are group activities, which do not benefit from separating the actors. If we consider only the individual (i.e., non-group) activities in the dataset, the introduction of resident separation increases the accuracy from 0.7 to 0.77. These results show that the introduction of our unsupervised resident separation technique leads to a relevant improvement in recognition rates.

VII. CONCLUSION AND FUTURE WORK
In this article, we tackled the challenging issue of recognizing multi-resident activities based on sensor data acquired from a smart-home infrastructure. We have proposed a hybrid method based on unsupervised data mining, and on ontological reasoning. To the best of our knowledge, this is the first effort to recognize multi-resident activities without the need of labeled training data. Experimental results show that our method achieves an accuracy close to the one of a state-of-the-art supervised technique. Indeed, our unsupervised method achieves 0.7231 average accuracy, while the state-of-the-art supervised technique achieves 0.7315 average accuracy.
Several research challenges remain open. First of all, our current resident separation method assumes the presence of at most two persons in the home. Extending our method to more residents is not trivial, both because of the technical issues introduced by the extension, and because of the lack of extensively labeled datasets to evaluate the technique with more than two inhabitants. Another limitation of our resident separation framework is the lack of specific support for handling collaborative activities, in which multiple persons execute the tasks needed to perform a given activity. As future work, we will also investigate methods to explicitly model interaction among inhabitants, in order to effectively recognize collaborative activities.