User Modelling Using Multimodal Information for Personalised Dressing Assistance

Assistive robots in home environments are steadily increasing in popularity. Due to significant variability in human behaviour, as well as in physical characteristics and individual preferences, personalising assistance poses a challenging problem. In this paper, we focus on an assistive dressing task that involves physical contact with a human’s upper body, in which the goal is to improve the comfort level of the individual. Two aspects are considered to be significant in improving a user’s comfort level: having more natural postures and exerting less effort. However, a dressing path that fulfils these two criteria may not be found in a single attempt. Therefore, we propose a user modelling method that combines vision and force data to enable the robot to search for an optimised dressing path for each user and to improve it as the human-robot interaction progresses. We compare the proposed method against two single-modality state-of-the-art user modelling methods designed for personalised assistive dressing in user studies with 31 subjects. Experimental results show that the proposed method provides personalised assistance that results in more natural postures and less effort for human users.


I. INTRODUCTION
Assistive humanoid robots in home environments have gained significant popularity, not only because of the increasingly sophisticated hardware capabilities of robots and the rapid development of artificial intelligence but also due to the huge potential of reducing the need for human labour in daily care, especially considering the ageing population problem [1]-[3]. Assistive robots have been commonly used as companion robots for older adults, providing verbal or gestural interaction, reminders through touch screens, and house cleaning services. For example, the work in [4] integrated different functionalities for an assistive robot in the homes of users with mild cognitive impairments, including fall detection, assistance in turning off electric appliances, medication intake monitoring, and detection of improperly placed objects. To handle multipurpose assistive scenarios, Partially Observable Markov Decision Processes were adopted for decision making in [5]. However, for robots to become versatile assistants in the daily lives of human beings, they must handle complicated tasks that require close physical contact with the person, such as assistive dressing. For such challenging tasks, personalisation plays a significant role in increasing an individual's comfort level, as different users may have completely different behaviours.
The associate editor coordinating the review of this manuscript and approving it for publication was Xiwang Dong.
To provide personalised assistance, obtaining user models is an effective approach for assistive robots to adapt their behaviours to accommodate different users [6]-[9]. To build user models, user information is collected using sensors or questionnaires and analysed using machine learning to extract user-specific or class-specific information. When assisting in the interaction, robots can make use of the user models to optimise their behaviours for each particular individual.
Vision data was used in [7] to build user models that enable natural postures of different humans in assistive dressing. For each user, the movement space of each upper-body joint was modelled independently using Gaussian Mixture Models (GMMs) so that the most frequently reached position of each joint could be learned. The dressing assistance was personalised for each user according to the movement space modelling of the human upper-body joints and the real-time upper-body pose estimation. However, the movements of the upper-body joints are not entirely independent, and the movement relationships of these joints were not considered in [7]. In subsequent research [8], force data was used to build user models that reduce the user's effort in assistive dressing. For each user, the robot searched for a dressing path that minimised the force resistance during the interaction. However, the minimised force path could not guarantee natural user postures.
In personalised assistive dressing, enabling users to maintain natural postures with minimal effort plays a significant role in improving their comfort level. However, to the best of our knowledge, no existing research has yet considered the impact of both postures and user effort. In this paper, we propose a user modelling method using multimodal information to enable the robot to search for a dressing path optimised for comfort, while guaranteeing natural joint trajectories that demand less effort from a user (Figure 1). We use a human arm skeletal model to describe the movement relationships of the human upper-body joints and to link the vision and force data. We compare the proposed method against the methods presented in [7] and [8] on both synthetic data and real-world data. In the real-world experiments, a Baxter robot assisted human users in putting on a jacket with sleeves, which is more challenging than the sleeveless jacket setting used in [7] and [8].

FIGURE 1. The proposed multimodal user modelling method enables a Baxter robot to search for a dressing path optimised for comfort, while guaranteeing natural joint trajectories that demand less effort from a user. The red, blue and yellow areas represent the movement space modelling of the user's hand, elbow and shoulder. The black dashed line shows the initial dressing path, where black circles indicate the path points. The optimised dressing path is denoted by the blue circles, which are connected via blue lines.
The main contributions of this paper can be summarised as follows: (i) concurrent exploitation of both vision and force data in user modelling methods during human-robot interaction; (ii) a multimodal user modelling method that guarantees natural postures and less effort of a user in personalised assistive dressing; (iii) evaluation of the proposed method on an assistive dressing scenario with 31 human users and comparison experiments against two state-of-the-art user modelling methods in assistive dressing.

II. RELATED WORK

A. USER MODELLING
User modelling is defined according to [9] as the process of building up and modifying a conceptual understanding of the user, where the main goal of this process is customisation and adaptation of systems to the user's specific needs. According to user modelling studies, there are usually four types of user models: static user models [10], dynamic user models [11], stereotype-based user models [12] and highly adaptive user models [13]. Static user models are the most basic type of user models. Once these static models are built, they no longer undergo changes during an interaction with the system. However, such models lack the ability to adapt to the personal requirements of users. Dynamic user models instead allow changes of personal preferences during an interaction with the system, thus accommodating updates based on new goals or the latest available information on an individual. With stereotype-based user models, users are classified into different stereotypes after collecting their relevant information. For a new user with little-known information, the computer/application can still infer the relative characteristics of this user after classifying him/her into a stereotyped group. However, sometimes a user's personal attributes may not match any of the existing stereotypes and the user models cannot flexibly deal with such situations. Highly adaptive user models aim to build a representation for a particular user, therefore allowing the system to offer high flexibility and adaptivity. Since a unique model for each user is built, it can avoid the disadvantages of stereotype-based user models. In this paper, we focus on building a highly adaptive user model for each user using vision and force data.
There have been various applications of user modelling in human-robot interaction. Canal et al. [14] proposed a robot personalisation framework which considered the differences among users and the adaptation of generic pretrained robot skills and applied the method in an assistive feeding scenario. With a hands-off assistive robot during poststroke rehabilitation therapy, Tapus et al. [15] investigated the extroversion-introversion personality matching between the robot and users. Since there is a strong relationship between a human's personality and their behaviour [16], they argued that robots should act in accordance with the user's personality during human-robot interactions. Experimental results showed that human users preferred robot behaviours that aligned with their own personalities. Research has also taken place on studying user preferences in an object handover scenario. For a companion robot approaching a seated person in a helping context, Dautenhahn et al. [17] studied user preferences for the comfortable approach directions by considering factors such as gender differences, age, and handedness. Cakmak et al. [18] presented a user study on human preferences of robot hand-over configurations, using a simulated kinematics model of humans to collect information on user preferences. Moreover, the spatial reasoning of users, such as user visibility and arm comfort, was considered to be an important factor in [19] for object hand-over tasks. Understanding human behaviours is also helpful in learning user models. Kostavelis et al. [20] addressed the human behaviour modelling problem with a Dynamic Bayesian Network that operated on top of the Interaction Unit to enable a robot to understand daily activities such as ''meal preparation'', ''cooking'', ''having a meal'' and ''medication intake'' in real home environments.

B. ASSISTIVE DRESSING ROBOTS
Dressing is one of the most common daily activities of human beings, and providing dressing assistance therefore remains an important but challenging problem for robots. There has been interesting prior research on assistive dressing by humanoid robots in home environments. Some researchers focus on robot motion learning of dressing skills. Both Tamei et al. [21] and Matsubara et al. [22] proposed to use reinforcement learning to teach a dual-arm robot the motions of T-shirt dressing. Colomé et al. [23] enabled a WAM robot to wrap a scarf around a human mannequin's neck through reinforcement learning. Through human demonstrations, Pignat and Calinon [24] enabled a Baxter robot to learn adaptive dressing assistance with a hidden semi-Markov model to assist users in putting on the sleeve of a jacket, as well as putting on a shoe. Other researchers focus on the estimation of the human-cloth relationship in assistive dressing. Koganti et al. [25] proposed the offline learning of a cloth dynamics model with different sensor data and applied this learned model to track the human-cloth relationships online using a depth sensor. Later, the authors improved the work by using a GP-based nonlinear dimensionality reduction technique and a Bayesian non-parametric LVM to generalise the cloth-state model to an unseen environment [26]. Kapusta et al. [27] proposed to use data-driven haptic perception to infer the human-cloth relationship. With only force information, hidden Markov models were used to classify the tasks into one of three outcomes.
There are other interesting aspects of assistive dressing studied by researchers, such as force prediction, shoe wearing and garment differentiation. Erickson et al. [28] proposed a long short-term memory network using a 9D input of force, torque, and velocity data at each time step to infer a force map across the person's body in simulation. Colomé et al. [29] proposed to use a reward-weighted GMM as a decision-making system in a robotic shoe-wearing scenario. A body-part tracking method from partial-view depth data was investigated in [30] to support lower body tracking. Chance et al. [31] studied whether the robot, a load cell, and an IMU can differentiate between different garments and detect dressing errors by recording sensor data when dressing both a mannequin and human participants. Chance et al. [32] also proposed to resolve interaction ambiguity through non-visual cues for a robotic dressing assistant.
However, there has been less work on improving the user's comfort level in personalised assistive dressing.
Klee et al. [33] proposed a framework to enable the user and robot to take turns moving to complete a shared goal, where the user's limitations were gradually learned by the robot to provide personalised interactions when assisting the user in putting on a hat. Zhang et al. [34] proposed a hierarchical multi-task control strategy to plan the robot dressing motion when assisting the user to wear a sleeveless jacket. The robot was able to update the dressing trajectory based on a user-specific movement model whenever the user suddenly moved their arm for a secondary task. Later Zhang [35] combined the hierarchical multi-task control with a probabilistic filtering method to estimate user postures when the user performed unexpected movements such as pulling or pushing. Gao et al. [7] used vision data to model the movement space of the human upper-body joints for each user to enable more natural postures of users in the dressing assistance. However, severe vision occlusions can cause the failure of human upper-body pose estimation when the robot's arms, the human body, and the clothes are in close contact. Assistive dressing requires close contact with human users and this contact information was not considered in [7]. Later Gao et al. [8] used force data to enable the robot to search for a minimised force dressing path for each user to minimise the user's effort in the dressing assistance. In this paper, we focus on optimising a complete dressing path for a specific user while the user performs natural arm movement in daily dressing to improve the user's comfort level.

C. HUMAN-ROBOT INTERACTION USING MULTIMODAL INFORMATION
For robots to interact more naturally with humans, robots should be able to utilise multi-sensor or multi-modality information [36]-[38]. Using multimodal data could compensate for the disadvantages of using unimodal data when noise or ambiguity occurs in the unstructured environment.
Force and vision information could be used together in human-robot collaborative tasks. Rozo et al. [39] proposed an approach incorporating dynamical systems, probabilistic learning and stiffness estimation. They enabled a robotic arm to learn physical collaborative behaviours from human demonstrations by satisfying both the position and force constraints. The robot was able to learn behaviours such as lifting, moving and landing an object with a human partner, as well as an assembly task that involved holding a wooden table while a human partner screwed the four legs to the table. Kruse et al. [40] presented a feedback controller using both force and vision information and enabled a Baxter robot to collaboratively unfold a piece of cloth with humans by responding to force and vision changes. There are human-robot interactive tasks that intrinsically require multimodal information. Zambelli and Demiris [41] proposed an imitation learning method to enable an iCub humanoid robot to learn how to play a piano keyboard from a human teacher. Through self-exploration, the robot learned sensorimotor representations on multimodal task spaces including vision, touch and proprioception. Given a new task, the robot inferred its own motion by fulfilling multimodal constraints. Schmidts et al. [42] combined Gaussian Mixture Regression and Hidden Markov Models to teach a robot hand to imitate human grasping skills from motion and force data. In terms of the generalisation capability for grasping other similar objects, the study's results showed that learning from motion and force data outperformed learning from motion data alone. However, there have been no user modelling methods using both vision and force data to improve the user's comfort level in assistive dressing scenarios. In this paper, we explore how to combine vision and force data in building user models for personalised dressing assistance.

III. BACKGROUND
In this section, we briefly review two methods that are most relevant to this paper. The first is the human upper-body movement space modelling method [7], and the second is the online iterative path optimisation method to minimise the force resistance [8].

A. VISION-BASED UPPER-BODY MOVEMENT SPACE MODELLING
The human upper-body pose was recognised with a top-down view depth sensor using the randomised decision forest method [43], which provided the positions of the human upper-body joints in the camera coordinates. The positions of these joints were then converted into the robot coordinate space according to the spatial relationship between the two coordinate frames. The set of positions of joint $m$ is defined as $J_m$, and its spatial distribution is modelled using GMMs as follows [7]:

$$p(J_m) = \sum_{k=1}^{K_m} \pi_k^m \, \mathcal{N}\left(J_m \mid \mu_k^m, \Sigma_k^m\right)$$

where $\mathcal{N}(J_m \mid \mu_k^m, \Sigma_k^m)$ represents a Gaussian distribution with mean $\mu_k^m$ and covariance $\Sigma_k^m$, $\pi_k^m$ is a mixture weight and $K_m$ is the number of Gaussian components. The Expectation-Maximisation (EM) learning algorithm in [44] was adopted to estimate the parameters of the GMMs. However, the movement space of each human upper-body joint is modelled independently, which often fails to represent realistic body movement.
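To make the mixture density above concrete, the following is a minimal sketch of evaluating the log-density of one joint position under an already fitted GMM. The function name `gmm_log_density` and the parameter values in the usage note are illustrative assumptions, not part of the method in [7]; fitting the parameters with EM [44] is not shown.

```python
import numpy as np

def gmm_log_density(x, weights, means, covs):
    """Return log sum_k pi_k * N(x | mu_k, Sigma_k) for one joint position x.

    weights, means and covs are hypothetical, already fitted GMM parameters.
    """
    x = np.asarray(x, dtype=float)
    log_terms = []
    for pi_k, mu_k, cov_k in zip(weights, means, covs):
        diff = x - np.asarray(mu_k, dtype=float)
        cov_k = np.asarray(cov_k, dtype=float)
        d = diff.size
        # log-determinant and Mahalanobis term of the k-th Gaussian
        _, logdet = np.linalg.slogdet(cov_k)
        maha = diff @ np.linalg.solve(cov_k, diff)
        log_terms.append(np.log(pi_k) - 0.5 * (d * np.log(2 * np.pi) + logdet + maha))
    # log-sum-exp over components for numerical stability
    log_terms = np.array(log_terms)
    top = log_terms.max()
    return float(top + np.log(np.exp(log_terms - top).sum()))
```

For example, `gmm_log_density([0.6, 0.1, 0.3], [0.7, 0.3], [[0.6, 0.1, 0.3], [0.5, 0.2, 0.2]], [np.eye(3) * 0.01] * 2)` scores a hand position against a two-component model; a larger value means the position is reached more often by that joint.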

B. FORCE-BASED DRESSING PATH OPTIMISATION
In [8], an online iterative dressing path optimisation method was presented that enables the robot to iteratively search for a minimum resistance path using force sensor information. The method adopted a stochastic gradient descent approach (i.e. Adam [45]), which is designed for first-order gradient-based optimisation of stochastic objective functions. When external force resistance was detected during dressing trials, the robot updated the gripper's position according to the force information. The path kept updating until no force resistance was detected in the current trial or the maximum iteration number was reached. However, the path update is based only on the force data; whether the natural postures of human users can be guaranteed was not considered.

IV. METHODOLOGY
Vision and force data reflect different aspects of a user's information, and they were studied separately in [7] and [8]. The vision data can show the user's postures and the force data can reflect the user's effort during dressing. In this section, we propose a user modelling method using multimodal information, namely vision and force, to search for a path that is personally optimised for comfort. In this paper, a dressing path is defined as comfortable if it enables natural postures and less effort for a human user. If only one of these two measures performs well while the other performs poorly, the comfort level of the user can be reduced. In order to jointly model the movement of the human upper-body joints, we introduce a human arm skeletal model. This skeletal model connects two upper-body joints of the human arm to describe the movement relationships between the joints.

A. NOTATIONS
In this work, the robot uses two grippers to grasp the shoulder parts of a jacket. The dressing path is defined for one arm of a user and is the path of the robot's gripper. The path consists of several high-level path points that will be optimised. The high-level path points represent positions such as the user's hand, elbow, or shoulder. The low-level path of the robot's gripper from one high-level path point to the next is planned using the hierarchical multi-task control strategy in [34], where force resistance at the end-effector of the robot's gripper can be easily detected. The starting path point (e.g. the user's hand position) and the ending path point (e.g. the user's shoulder position) are fixed for the path.
The inputs of the algorithm (Algorithm 1) are the initial path $W^0_{0\to K}$, the length $L$ of the skeletal model for a user's arm, the maximum iteration number $t_{max}$, the energy threshold $\tau_{energy}$, the force threshold $\tau_g$, and the Adam parameters $\alpha$, $\beta_1$, $\beta_2$. The initial path $W^0_{0\to K}$ is composed of the sub-paths $W^0_{k\to k+1}$. Similarly, the path in the $t$-th iteration, $W^t_{0\to K}$, is composed of the sub-paths $W^t_{k\to k+1}$, each of which is a sequence of high-level path points $P_i$, where $N$ is the total number of high-level path points. $N$ can be different for each sub-path $W^t_{k\to k+1}$ in different iterations. Irrespective of how the last path point of sub-path $W^t_{k-1\to k}$ is updated during the iterations, the starting path point of sub-path $W^t_{k\to k+1}$ always takes the value of the last path point of sub-path $W^t_{k-1\to k}$. The output of the algorithm is the final optimised path, in which each sub-path is the optimised version of $W^0_{k\to k+1}$.

Algorithm 1 Online Iterative Path Optimisation Using Multimodal Information
  Plan low-level path $p$ from $P_{cur}$ to $P_{next}$ using hierarchical multi-task control in [34]
  for each $p(n)$ ($n$-th path point of the low-level path $p$) do
    Detect force resistance $g$
    if $g > \tau_g$ then

Either the forearm or the upper arm of a user can be viewed as a skeletal model, where the length of the skeletal model can be measured using the vision data. In assistive dressing, an initial dressing path for a user's arm contains three high-level path points, which are the user's hand (0), elbow (1), and shoulder (2). The movement space of each human upper-body joint is modelled with GMMs, and each of the three initial path points follows the distribution of the corresponding joint. Therefore, these three special path points of $W^t_{0\to 2}$ follow the distributions of the three high-level path points of the initial path.
There are two terminating conditions for a complete iteration. One is when the maximum iteration number $t_{max}$ is reached; the other is when the energy in the current iteration $E_{energy}$ is smaller than the energy threshold $\tau_{energy}$. In this paper, the energy $E_{energy}$ represents the total detected force resistance in a single iteration. We also adopt the Adam method [45] for parameter updates. We use $p$ to represent the planned low-level path from one high-level path point (e.g. $P_i$) to another (e.g. $P_{i+1}$) and we use $p(n)$ to represent an intermediate path point of $p$. When the force resistance $g$ is larger than the threshold $\tau_g$, we calculate $M_{temp}$, $V_{temp}$, $\hat{M}_{temp}$, $\hat{V}_{temp}$ and update $p(n)$ following the Adam update rule. $M_{temp}$ and $V_{temp}$ are the biased first and second moment estimates. $\hat{M}_{temp}$ and $\hat{V}_{temp}$ are the bias-corrected first and second moment estimates. $\beta_1$ and $\beta_2$ are the exponential decay rates for the moment estimates. $\alpha$ represents the learning rate and $\epsilon$ is the smoothing term. $M$ and $V$ are intermediate vectors that grow with $M_{temp}$ and $V_{temp}$.
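The Adam update of a single low-level path point can be sketched as follows. This is a minimal illustration of the standard Adam rule from [45], assuming that the measured force vector `g` plays the role of the gradient and that the gripper moves against it; the function name `adam_step`, the sign convention and the treatment of `g` as a 3-D vector are assumptions rather than details confirmed by the text.

```python
import numpy as np

def adam_step(p, g, m, v, t, alpha=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update of path point p given force reading g (assumed gradient).

    m, v: running first and second moment estimates; t: 1-based step count.
    """
    p = np.asarray(p, dtype=float)
    g = np.asarray(g, dtype=float)
    m_temp = beta1 * m + (1 - beta1) * g        # biased first moment estimate
    v_temp = beta2 * v + (1 - beta2) * g ** 2   # biased second moment estimate
    m_hat = m_temp / (1 - beta1 ** t)           # bias-corrected first moment
    v_hat = v_temp / (1 - beta2 ** t)           # bias-corrected second moment
    p_new = p - alpha * m_hat / (np.sqrt(v_hat) + eps)
    return p_new, m_temp, v_temp
```

On the first step the bias correction makes the displacement approximately `alpha` per axis in the direction opposite to the force, regardless of the force magnitude, which is why $\alpha$ controls how far the path moves in each update.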

B. ONLINE ITERATIVE PATH OPTIMISATION USING MULTIMODAL INFORMATION
In this work, the path of the robot gripper is iteratively updated to optimise a user's comfort level by minimising force resistance while maintaining natural postures. In each step of the iteration, we obtain the force resistance at each low-level path point. When the force resistance is larger than the force threshold, we update the current gripper's position and the next high-level path point. We optimise the path points $P_i$ of every sub-path $W^t_{k\to k+1}$ using the function UpdatePath. We subsequently update $m_{t+1}$ and $v_{t+1}$ by averaging over all elements of $M$ and $V$ within this iteration.
UpdatePath: $P_{cur}$ is one high-level path point and $P_{next}$ is the following one. For the robot to plan the low-level path from one high-level path point to another, we adopt the hierarchical multi-task control in [34]. A force controller is employed to enable the robot to be compliant during the interactions with users. The external force and torque applied at the end effector of the Baxter robot are estimated from the measured joint torques as proposed in [40]. The measured force is then mapped into the desired velocity for the robot's end effector using the standard generalised damper approach [46]. During the low-level path planning, if force resistance is detected at path point $p(n)$, the gripper stops and updates its position following the Adam update rule. The updated $p(n)$ is added to $W_{new}$ and becomes a high-level path point in $W_{new}$. $E_{energy}$ is updated with the force resistance. Vectors $M$ and $V$ grow with $M_{temp}$ and $V_{temp}$. $W_{0\to K}$ is the full path. If $P_{next}$ is not the last path point of $W_{0\to K}$, the function ChooseNextGoal is called to update $P_{next}$; otherwise $P_{next}$ is not updated, because the ending path point of the full path is fixed. After this, the function UpdatePath is called again with the updated parameters. The final updated $P_{next}$ is added to $W_{new}$ and also becomes a high-level path point in $W_{new}$. Otherwise, if no force resistance is detected during the low-level path planning from $P_{cur}$ to $P_{next}$, $P_{next}$ is directly added to $W_{new}$.
ChooseNextGoal: The goal is to update $P_{next}$ based on $P_{mid}$ and $L$. We make use of the skeletal model to update $P_{next}$. Within $j_{max}$ iterations, we randomly generate candidate positions $P^j_{nextTmp}$ within a predefined search range. $P_{mid}$ and $P^j_{nextTmp}$ should be on the same skeletal model, where the two ending points of the skeletal model are unknown. We use the function CalculateProbability to calculate the largest possible log probability of $P_{mid}$ given $P^j_{nextTmp}$, which is represented as $\log(P_{mid})^j_{max}$. Finally, we update $P_{next}$ with the candidate $P^j_{nextTmp}$ whose $\log(P_{mid})^j_{max}$ is the largest among $\{\log(P_{mid})^j_{max}\}_{j=0}^{j_{max}}$. Our reasoning for maximising $\log(P_{mid})^j_{max}$ inside the function CalculateProbability is explained in the following.
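The candidate search in ChooseNextGoal can be sketched as a simple random search. In this minimal illustration, `score_fn` stands in for CalculateProbability, and sampling each candidate as a uniform offset around the current $P_{next}$ is an assumption about how the predefined search range is applied; the function and argument names are hypothetical.

```python
import numpy as np

def choose_next_goal(p_mid, p_next, score_fn, j_max=30, search_range=0.2, rng=None):
    """Pick the candidate next goal with the highest score for p_mid.

    score_fn(p_mid, candidate) is a placeholder for CalculateProbability.
    Candidates are drawn uniformly within +/- search_range of p_next (assumed).
    """
    rng = np.random.default_rng() if rng is None else rng
    p_mid = np.asarray(p_mid, dtype=float)
    p_next = np.asarray(p_next, dtype=float)
    best_score, best_candidate = -np.inf, p_next
    for _ in range(j_max):
        offset = rng.uniform(-search_range, search_range, size=p_next.shape)
        candidate = p_next + offset
        score = score_fn(p_mid, candidate)
        if score > best_score:
            best_score, best_candidate = score, candidate
    return best_candidate, best_score
```

Passing a seeded generator, for example `rng=np.random.default_rng(0)`, makes the search reproducible between dressing trials.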
CalculateProbability: Given $P_{mid}$, $P_{nextTmp}$ and $L$, the goal is to calculate the largest log probability of $P_{mid}$ using the skeletal model. The largest log probability of $P_{mid}$ is represented as $\log(P_{mid})_{max}$, which is initialised to a large negative value. Because the starting point and the ending point of the skeletal model are independently modelled with GMMs, we use the sum of the log probabilities of the starting point and the ending point of the skeletal model to represent the log probability of $P_{mid}$. When the log probability of $P_{mid}$ is large, the log probability of the starting point or the ending point of the skeletal model is also large. The starting point or ending point of the skeletal model is used to simulate an upper-body joint of a human. The larger the probability is, the more easily this position is reached by the user's upper-body joint. When the probabilities of the user's joint positions are large, this human posture can be viewed as a natural posture of the user. The two ending points of the skeletal model are not known. We therefore randomly generate different combinations of the starting point and the ending point of the skeletal model along the line in space determined by $P_{mid}$ and $P_{nextTmp}$, in order to search for the largest log probability of $P_{mid}$. We use Figure 2 to help explain the spatial relationships among the path points. Within $q_{max}$ iterations, we first calculate the unit vector $\vec{v}$ between $P_{mid}$ and $P_{nextTmp}$. The length of the skeletal model $L$ is known in advance. In the assistive dressing application, $L$ is the Euclidean distance between the user's hand and elbow, or between the user's elbow and shoulder, which can be measured using the vision sensor. $l_{tmp}$ is randomly generated in the interval $(0, L)$. $\vec{V}$ is the vector pointing from $P_{mid}$ to $P_{endTmp}$. From vector calculus, we calculate the ending point $P_{endTmp}$ and the starting point $P_{startTmp}$ of the skeletal model.
Then we calculate the log probability of $P_{mid}$ in the current iteration, $\log(P_{mid})$, using the sum of the log probabilities of $P_{endTmp}$ and $P_{startTmp}$. If $\log(P_{mid})$ is larger than the largest log probability $\log(P_{mid})_{max}$, then $\log(P_{mid})_{max}$ is updated with $\log(P_{mid})$.
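The search inside CalculateProbability can be sketched as below. This is a minimal illustration under stated assumptions: the unit vector points from $P_{mid}$ towards $P_{nextTmp}$, the ending point is placed at $P_{mid} + l_{tmp}\vec{v}$, and the starting point lies a distance $L$ behind the ending point along the same line. The callables `log_prob_start` and `log_prob_end` are hypothetical stand-ins for the GMM log-densities of the two joints.

```python
import numpy as np

def calculate_probability(p_mid, p_next_tmp, L, log_prob_start, log_prob_end,
                          q_max=50, rng=None):
    """Random search for the skeletal-model placement that maximises the
    summed log probability of its two end points (geometry is an assumption)."""
    rng = np.random.default_rng() if rng is None else rng
    p_mid = np.asarray(p_mid, dtype=float)
    p_next_tmp = np.asarray(p_next_tmp, dtype=float)
    direction = p_next_tmp - p_mid
    norm = np.linalg.norm(direction)
    if norm == 0.0:
        raise ValueError("p_mid and p_next_tmp must be distinct points")
    unit = direction / norm
    best, best_start, best_end = -np.inf, None, None  # stands in for a large negative value
    for _ in range(q_max):
        l_tmp = rng.uniform(0.0, L)        # offset of the ending point from p_mid
        p_end = p_mid + l_tmp * unit       # assumed ending point of the skeletal model
        p_start = p_end - L * unit         # starting point lies L behind the ending point
        log_p = log_prob_start(p_start) + log_prob_end(p_end)
        if log_p > best:
            best, best_start, best_end = log_p, p_start, p_end
    return best, best_start, best_end
```

Because $l_{tmp}$ is drawn from $(0, L)$, $P_{mid}$ always lies on the segment between the sampled starting and ending points, which matches the requirement that the gripper position sits on the user's forearm or upper arm.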
The proposed user modelling method integrates the upper-body movement space modelling and a human arm skeletal model into an online iterative path optimisation method. Before dressing, a vision sensor is used to record a user's upper-body motion and a preferred initial pose. Then the movement space of each upper-body joint is modelled using GMMs. The user-specific parameters are calculated and sent to the path update method. The adaptation is performed in four aspects. First, the GMM parameters for each user's upper body are unique. Second, the lengths of the skeletal models of the user's forearm and upper arm are obtained specifically for this user. Third, the whole path update process is based on an initial human pose that is determined entirely by the user according to his/her preference. Fourth, the online iterative path optimisation method updates the path whenever force resistance occurs, taking the user-specific GMM parameters, skeletal models and initial pose into consideration. The optimised dressing path is personalised for a user according to his/her preferred arm movement trajectory, the interactive force resistance and the user-specific upper-body movement space modelling.

V. EXPERIMENTS AND DISCUSSION
We evaluate the proposed method by comparing it against the state-of-the-art methods presented in [7] and [8] on both synthetic data and real-world assistive dressing data. In this paper, we consider two significant aspects to improve the user's comfort level. One is to enable a user to have more natural postures. The other is to impose less strain and effort on a user during the dressing procedure. To evaluate the performance quantitatively, we define two criteria, which are the reachability criterion and the resistance criterion. We represent the final optimised path as $W_{final} = \{w_i\}_{i=0}^{N}$, where $w_i$ is a path point and $N$ is the total number of path points.

Reachability Measure: The reachability criterion is used to evaluate whether the dressing path enables a user to have more natural postures.

Resistance Measure: With real-world experiments, the resistance is the total detected force resistance of the final path, which can be directly measured. With synthetic data, we use the distance between the final path and a minimum force path to simulate the force resistance. The resistance criterion with synthetic data is calculated as the average of $d(w_i)$ over the path points of the final path:

$$\text{Resistance} = \frac{1}{|W_{final}|} \sum_{w_i \in W_{final}} d(w_i)$$

where $d(w_i)$ is the Euclidean distance from the path point $w_i$ to the minimum force path. The smaller the resistance, the smaller the average distance from the path points of the final path to the minimum force path. Thus, the resistance criterion can be used to evaluate whether the dressing path enables a user to undergo less effort.
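The resistance criterion for synthetic data can be sketched as follows. This minimal illustration assumes that the minimum force path is given as a polyline of 3-D points and that $d(w_i)$ is the distance to the nearest point on that polyline; both the representation and the function names are assumptions.

```python
import numpy as np

def point_to_segment(p, a, b):
    """Euclidean distance from point p to the segment from a to b."""
    ab = b - a
    denom = ab @ ab
    # projection parameter clamped to the segment; degenerate segment -> point a
    t = 0.0 if denom == 0 else np.clip((p - a) @ ab / denom, 0.0, 1.0)
    return float(np.linalg.norm(p - (a + t * ab)))

def resistance(final_path, min_force_path):
    """Average distance from the final path points to the minimum force path."""
    final_path = np.asarray(final_path, dtype=float)
    ref = np.asarray(min_force_path, dtype=float)
    dists = [
        min(point_to_segment(w, ref[i], ref[i + 1]) for i in range(len(ref) - 1))
        for w in final_path
    ]
    return float(np.mean(dists))
```

For instance, a final path whose middle point sits one unit away from a straight minimum force path, with both end points on it, yields a resistance of one third.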

A. EVALUATION WITH SYNTHETIC DATA
In order to evaluate the proposed method, we randomly generate 100 sets of synthetic data, where each set contains an initial path, a minimum force path and distributions of the initial path points modeled with GMMs. In the synthetic data, an initial path contains three path points, where each path point follows the distribution modeled with GMMs. A minimum force path represents the path with minimum external force. The starting point and ending point of an initial path are the same as the ones corresponding to the minimum force path. The goal is to search for an optimised path according to the minimum force path and the distributions of the initial path points.
With the synthetic data, we compare the proposed method against the methods in [7] and [8]. In [7], the method was directly applied in the real-world assistive dressing application. The dressing path was determined according to the movement space modelling of the human upper-body joints and the real-time human upper-body pose. However, real-time human upper-body pose estimation can fail when severe occlusions occur. The main factor when deciding a dressing path in [7] is the movement space modelling of the human upper-body joints. To decide an initial path in a set of synthetic data, we first randomly generate the distributions of the three initial path points modelled with GMMs. For each distribution, we calculate the position that maximises the probability of the model and use this position, with some added randomness, as the initial path point. With the method in [7], the initial path also becomes the final path, because only vision information is considered and there is no path update.
There are two main parameters that can affect the performance of the proposed method and the method in [8]: α and the maximum iteration number t_max in Algorithm 1. Therefore, we run comparison experiments with different combinations of α and t_max, where α ∈ {0.1, 0.3, 0.5} and t_max ∈ {10, 20, 30, 40, 50}. For each combination of parameters, we run experiments with the 100 sets of synthetic data. According to [45], good default settings for the Adam parameters are β_1 = 0.9, β_2 = 0.999, and ε = 10^−8. For the proposed method and the method in [8], the low-level path was planned using linear regression with the step length set to 0.3. The energy threshold τ_energy was set to 1 and the force threshold τ_g was set to 0. For the proposed method, j_max was set to 30 and q_max was set to 50. The search range for P j nextTmp was set to (−0.2, 0.2). The method in [7] is not affected by α and t_max. Therefore, its reachability and resistance results are the same no matter how α and t_max change. Because there is no path update with the method in [7], we only show the comparison results of the runtime between the proposed method and the method in [8]. We ran the experiments in Matlab without parallel processing. All computation was conducted on a standard desktop computer with a quad-core Intel i7 processor. The experiment results are shown in Figure 3.
In both the proposed method and the method introduced in [8], the path update terminates when either the maximum iteration number is reached or the energy is smaller than the energy threshold. With the synthetic data, the second terminating condition can be viewed as the final path being close enough to the minimum force path. For both methods, as the final optimised path becomes closer to the minimum force path, the resistance decreases. However, as the reachability of the minimum force path may be either small or large, being closer to the minimum force path cannot guarantee a larger reachability value. This explains why a decrease in the resistance can lead to a decrease in the reachability. For the proposed method, due to the constraints of both the minimum force path and the distributions of the initial path points modeled with GMMs, the optimised path may not be close enough to the minimum force path. Therefore, the maximum iteration number is the main terminating condition for the proposed method. This explains why the runtime always increases as t_max increases, regardless of α. For the method in [8], because there is no vision constraint, the only goal of the path update is to be as close as possible to the minimum force path. Since α controls the proportion of the adjusted distance in each update, more iterations are required when α is small. When α is larger, the path approaches the minimum force path more quickly and a larger maximum iteration number has little impact on the result.
For all combinations of α and t_max, the method in [8] performs best in resistance while it performs worst in reachability. This is because the minimum force path is the only reference during the path update and no vision information is considered. The method in [7] performs best in reachability but worst in resistance. This is because the initial path, which is determined only by the vision information, also becomes the final path: the minimum force path is not considered at all and no path update is performed. In contrast, the proposed method takes both the vision and force information into consideration when updating a path. With the synthetic data, a compromise between the minimum force path and the distributions of the initial path points can be made if necessary.
Two-sample t-tests on both the reachability and resistance measures show statistically significant differences between the proposed method and the other two single-modality methods. When α is 0.1, the mean reachability values of the proposed method and the force based method [8] are −6.27 and −15.05 (t(553) = 20.22, p < 0.01 when all iteration numbers are considered). When α is 0.3, the mean reachability values of the proposed method and the force based method [8] are −9.61 and −35.09 (t(711) = 50.80, p < 0.01). When α is 0.5, the mean reachability values of the proposed method and the force based method [8] are −20.30 and −38.70 (t(997) = 27.74, p < 0.01). When α is 0.1, the mean resistance values of the proposed method and the vision based method [7] are 1.56 and 1.72 (t(983) = 17.12, p < 0.01 when all iteration numbers are considered). When α is 0.3, the mean resistance values of the proposed method and the vision based method [7] are 1.08 and 1.72 (t(771) = 50.71, p < 0.01). When α is 0.5, the mean resistance values of the proposed method and the vision based method [7] are 0.69 and 1.72 (t(686) = 69.30, p < 0.01).
From the results, it can be seen that the proposed method outperforms the method in [8] for the reachability criterion and it outperforms the method in [7] for the resistance criterion. We visualise the comparison results of three sets of synthetic data in Figure 4.
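For readers who wish to reproduce this kind of comparison, a two-sample t statistic can be sketched as below. The paper does not state which variant of the test was used; because the reported degrees of freedom vary between comparisons, the sketch assumes the unequal-variance (Welch) form. The function name is hypothetical, and the p-value lookup is omitted because it needs the t distribution from a statistics package.

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance two-sample t-test."""
    va = variance(a) / len(a)  # squared standard error of the mean of a
    vb = variance(b) / len(b)  # squared standard error of the mean of b
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df
```

Here `a` and `b` would hold the per-run reachability (or resistance) values of the two methods being compared.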

B. REAL-WORLD PERSONALISED ASSISTIVE DRESSING
In [7] and [8], a Baxter humanoid robot assisted human users in putting on a jacket without sleeves. In this paper, we used a jacket with sleeves. We invited 31 healthy users (8 female) aged 23-33 (mean: 27.29, std: 2.38) to participate in the experiments; informed consent was obtained. For each user, we ran 3 comparison experiments with the proposed method and the methods in [8] and [7], where the order of the three methods was randomly chosen. In each experiment, the robot mainly assisted the user to wear the right part of the jacket.
FIGURE 4. Visualisation of the comparison results of three sets of synthetic data. For each set shown in the 1st column, we show the initial path, the minimum force path, and the distributions of the initial path points modeled with GMMs. In the 2nd, 3rd, and 4th columns, we show the experiment results with the methods in [7], [8], and the proposed method. The path points of the final path are shown as blue dots connected by a blue dashed line. For the three sets of synthetic data we visualise, we choose the parameter combinations of α = 0.1 and t_max = 50, α = 0.3 and t_max = 30, and α = 0.5 and t_max = 10. For each set, the final path with the method in [7] is the same as the initial path because there is no path update at all. While this path has the largest reachability, it also has the largest resistance. For the method in [8], the final path is very close to the minimum force path. This is because only the force information is considered during the path update. While this path has the smallest resistance, it has the smallest reachability. For the proposed method, the final path is situated in between the initial path and the minimum force path. The reachability of this path is larger than that of the path with the method in [8] and the resistance of this path is smaller than that of the path with the method in [7].

For the proposed method and the method in [8], the force resistance threshold τ_g was set to 4N, the energy threshold was set to 0N, α was set to 0.03, and the maximum iteration number t_max was set to 5. With the Baxter robot, the force was detected at 100Hz. In each update, the force resistance was averaged within the time steps when the detected force was larger than the force resistance threshold. For the proposed method, j_max was set to 20 and q_max was set to 40. The search range for P j nextTmp was set to (−0.05, 0.05). For each user, the motion of the right arm was first collected. The user's upper-body pose was recognised with a front-view depth sensor using the ROS OpenNI skeleton tracker. The user was asked to bend the right arm a little bit while moving, except in the area of the rest position. This motion information was used to generate the distributions of the hand, elbow and shoulder modeled with GMMs. A preferred initial pose of the user was subsequently recorded. It was used to determine the initial dressing path of the 1st trial, which consisted of three high-level goal positions: the user's initial hand position, elbow position and shoulder position. The depth sensor was only used to collect a user's motion before dressing and to record a preferred initial pose of this user. At these two steps, no occlusions occurred. Once the robot started to assist the user to dress, the depth sensor was no longer used due to the severe occlusions when the robot's arms, the human body and the clothes were in close contact. To tackle the skeletal uncertainties during assistive dressing, we made use of the skeletal model of the human upper body within the proposed method to estimate the spatial relationships of the human upper-body joints based on the detected force resistance and the current updated path point. The details of the calculations have been described in the functions ChooseNextGoal and CalculateProbability of the proposed method. For the proposed method and the method in [8], the robot's dressing path kept updating according to the human-robot interaction in each trial until one of the terminating conditions was met. Between trials, there was an 8s pause. When a new trial started, the robot moved the jacket to the initial hand position of the user again. The starting path point, which was the user's initial hand position, and the ending path point, which was the user's initial shoulder position, remained the same during the experiment. There was no path update for the method in [7]. We recorded the total iteration number and execution time in each experiment.
For each method, we evaluated the dressing path of the last trial using the reachability and the resistance criteria.
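The per-update force averaging used in the real-world setup (only samples above the force resistance threshold contribute) can be sketched as follows. The function name and the choice to return 0.0 when no sample exceeds the threshold are assumptions made for illustration.

```python
def mean_force_resistance(forces, threshold=4.0):
    """Average the force samples (in N) that exceed the threshold.

    Assumption: returns 0.0 when no sample exceeds the threshold.
    """
    above = [f for f in forces if f > threshold]
    return sum(above) / len(above) if above else 0.0
```

With force readings sampled at 100Hz, `forces` would hold the readings collected during one path update.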
The experiment results are shown in Figure 5. For the reachability criterion, the proposed method performs best. With the synthetic data, the method in [7] performs best in reachability. The reachability is calculated as the average of the sum of the log probabilities of all the path points. With the vision based method, there is no path update and the final path is the same as the initial path. The initial path has three path points where each one follows the distributions modeled with GMMs. The three path points simulate the user's joint positions of hand, elbow and shoulder. With the synthetic data, each path point is selected by choosing the position that maximises the probability of the corresponding model and adding some randomness. This choice of path points is an ideal situation without considering the geometric constraints of the three path points. With the real-world data, the user provides a preferred pose which is then used to decide the initial dressing path. The three path points have to satisfy the geometric constraints of the human arm and skeletal model of the upper-body joints. These real-world constraints make the initial path points less than ideal compared with the synthetic data, which leads to smaller reachability of the vision based method. Two-sample t-test shows statistically significant results. The mean reachability of the proposed method is 8.72. Comparing against the proposed method, the mean reachability of the vision based method [7] and the force based method [8] are 4.61 (t(57) = 7.81, p < 0.01) and 7.43 (t(60) = 2.20, p < 0.05). For the resistance criterion, the proposed method and the method in [8] reach 0N in the last trial of all the experiments, while a larger force resistance (mean = 5.34N) is detected in every experiment of the method in [7]. 
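The reachability computation described above (the average log probability of the path points under their joint distributions) can be sketched as below. For brevity the sketch assumes isotropic GMM components given as (weight, mean, sigma) tuples; the paper does not specify the covariance structure, and the function names are hypothetical.

```python
import math

def gmm_log_pdf(x, components):
    """Log-density of an isotropic GMM; components are (weight, mean, sigma)."""
    d = len(x)
    density = 0.0
    for weight, mean, sigma in components:
        sq = sum((xi - mi) ** 2 for xi, mi in zip(x, mean))
        norm = (2.0 * math.pi * sigma ** 2) ** (-d / 2.0)
        density += weight * norm * math.exp(-sq / (2.0 * sigma ** 2))
    # math.log raises ValueError if the density underflows to zero.
    return math.log(density)

def reachability(path, gmms):
    """Mean log-probability of each path point under its own joint's GMM."""
    return sum(gmm_log_pdf(w, g) for w, g in zip(path, gmms)) / len(path)
```

In this sketch, `path` holds the hand, elbow and shoulder path points and `gmms` holds the corresponding three models, in the same order.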
Similar to the experimental results on the synthetic data, the proposed method outperforms the method in [8] for the reachability criterion and it outperforms the method in [7] for the resistance criterion. Both the force resistance and the vision information are considered when optimising the path with the proposed method. The performances of the total iteration number between the proposed method and the method in [8] are similar, which are within the range of 2-4, with an exception of 5 for the method in [8]. For the proposed method and the method in [8], we observe that a larger force resistance is detected in the first one or two trials for almost all users. This is because users are not familiar with the robot's motion at the very beginning, especially when wearing a jacket with sleeves that can easily interfere with the dressing procedure. The robot is able to adjust its motion according to the human-robot interaction and gradually optimise the dressing path. Because there is no path update with the method in [7], the total iteration number is 1 in every experiment. This also explains the minimum execution time of the method in [7].
With the proposed method, our goal is not simply to assist a user in dressing, which was already achieved in [7], but rather to improve the user's comfort during the whole assistive dressing process. We not only evaluated the methods quantitatively with the defined criteria but also collected the users' subjective impressions of the three comparison experiments. After finishing all the experiments, we asked each user to rate the comfort level of the final optimised dressing path of each method. The comfort levels are very uncomfortable, uncomfortable, neutral, comfortable and very comfortable. The survey responses for the comparison experiments are shown in Figure 6. It can be seen that the proposed method receives the most votes for the comfortable and very comfortable levels and the fewest votes for the neutral and uncomfortable levels. If we assign the values 1 to 5 to the comfort levels from very uncomfortable to very comfortable, a larger value indicates a higher comfort level. The median votes of the proposed method and the vision based method [7] are 4 and 3; the distributions in the two groups differ significantly (Mann-Whitney U = 689.50, sample common language effect size f = 72%, n_1 = n_2 = 31, p < 0.01). The median vote of the force based method [8] is also 4. However, the distributions in this group and in that of the proposed method differ significantly (Mann-Whitney U = 836, sample common language effect size f = 87%, n_1 = n_2 = 31, p < 0.05). We collected comments from the participants to analyse why they made each choice. Users who disliked the vision based method commented that the robot's dressing path was not adaptable at all and the robot ignored the interactive force resistance. The main reason that some users preferred the vision based method was that the dressing process was fast.
Users who disliked the force based method thought that their arm postures were not natural enough during the robot's path update, while users who preferred this method thought that the robot could adjust its motion when force resistance occurred. For the proposed method, users who voted for the uncomfortable and neutral levels explained that the completion time was longer compared with non-adjustable dressing. Users who liked the proposed method felt that their arm postures were natural during dressing while the robot could keep updating the path based on the real-time force resistance. Some of them mentioned that they would be happy to have such a robot assistant for dressing in the future.
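The common language effect sizes reported for the comfort votes are consistent with f = U / (n_1 n_2): 689.50 / (31 × 31) ≈ 0.72 and 836 / (31 × 31) ≈ 0.87. A minimal sketch of both quantities is given below; the function names are hypothetical, and the significance test itself is omitted.

```python
def mann_whitney_u(x, y):
    """U statistic for x versus y: each win counts 1, each tie counts 0.5."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)

def common_language_effect_size(u, n1, n2):
    """Fraction of all (x, y) pairs favouring x, f = U / (n1 * n2)."""
    return u / (n1 * n2)
```

Here `x` and `y` would be the 1 to 5 comfort votes of the two methods being compared.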
We visualise the comparison results of three users in Figure 7. We show some screenshots of the assistive dressing experiment in Figure 8. A video of the experiment can be found at www.imperial.ac.uk/personal-robotics/videos.

VI. LIMITATIONS AND FUTURE RESEARCH
We conducted experiments with 31 human participants, collecting their upper-body motion trajectories, calculating user-specific parameters, optimising dressing paths and collecting survey responses. In this section, we discuss the current limitations and future plans.
Most users prefer the dressing paths after optimisation with the proposed method. This result is consistent with our premise that enabling natural postures and requiring less effort from users improve the comfort level. However, according to the survey responses, there are still a few participants who hold a neutral or less positive opinion of the optimised paths. What we learned from this is that, although the reachability and resistance criteria play significant roles in assistive dressing, considering other factors, for example speed of execution, may help further improve the user's comfort level.
Due to the iterative nature of the proposed method, the completion time was longer compared with the vision based method. Decreasing the completion time would be beneficial for reducing the user's fatigue in practical applications. If a user's upper-body pose can be consistently detected, the robot can learn an optimised dressing path for the user more quickly. However, this is challenging due to the visual occlusion of the human body. In real life, a human assistant can adapt the dressing path for another user without much effort by consistently inferring the user's upper-body pose from the configuration of the clothes. For instance, if the sleeve opening of the clothes points up, it can be inferred that the user's arm bends upward. If it points down, we can estimate that the user's arm stretches downward. For future research, markers can be attached to a human assistant's hands and the clothes to record the demonstrated dressing paths along with the clothes configuration information. From human demonstrations, an optimised dressing path can be generalised for the robot to provide dressing assistance more efficiently.
With the proposed method, we focus more on a high-level robot path update while the low-level robot path is planned using hierarchical multi-task control. A future extension is to learn the robot's controllers directly, which can further improve the user's comfort level during dressing. Recently, preference learning methods have attracted much attention within the reinforcement learning field [47]- [49]. Reward functions encoding a human's preferences can be efficiently learned to determine how a dynamical system should act. It is an interesting research direction to incorporate preference learning methods into assistive dressing so that a user can directly specify how he/she wants the assistive robot arm to move.
Since no assumptions were made on the clothes, the proposed method can be applied to other similar clothes with sleeves such as a coat or a gown. However, the types of clothes are still limited in our current research. With a jacket, the robot only needs to interact with the user's arms. The main challenge in wearing clothes such as a T-shirt is to assist the user's head to enter the head opening of the T-shirt while guaranteeing the safety of the head. The proposed method has not considered this part yet. Future work can extend the current method by considering the interaction of the clothes with the human head.
The participants in the experiments are all healthy subjects. Although healthy people can also benefit from assistive dressing, the target population would be the elderly or people with constrained mobility. Assisting this population is challenging as the requirements from users can be more diverse. It will be useful for future work to conduct some surveys with this population by asking them to fill in questionnaires on how they would like the robot to assist and to provide any additional demands on the robot or the clothes. The base of the Baxter robot in the experiments is not moveable yet. By enabling the mobility of the robot's base, the robot can significantly improve its operational space. For instance, after assisting a user to wear the clothes, the robot can move in front of the user and face the user to adjust the clothes or perform delicate operations on the buttons or zippers of the clothes.
FIGURE 8. Screenshots of the assistive dressing experiment. The Baxter robot assisted a human user in putting on a jacket with sleeves. For each user, three comparison experiments were run with the method in [7], [8] and the proposed method.

Currently, the robot only interacts with human users through motions. Adding verbal communications would make human-robot interactions more efficient. A human user can be more sensitive in feeling force resistance compared with the robot. For instance, if the user feels that the arm gets stuck in the clothes, he/she can immediately send verbal commands such as 'up', 'down', 'left', 'right' or a combination of these to notify the robot to adjust the dressing path.

VII. CONCLUSION
We have introduced a user modelling method using multimodal information to improve the user's comfort level during robot dressing assistance. We proposed an online iterative path optimisation method that incorporated both vision and force data to enable the robot to iteratively optimise the dressing path for a user. We used vision data to guarantee that users held natural postures during an interaction, whilst force data was used to minimise the effort imposed on a user. To make a link between vision and force data, we introduced a human arm skeletal model to describe the movement relationships of the upper-body joints. We compared the proposed method against two single-modality state-of-the-art user modelling methods designed for personalised dressing assistance in a dressing scenario with 31 human participants. Experimental results showed that the proposed method achieved the best performance by enabling more natural postures that required less effort from test subjects.
Although vision and force information are considered together in the user modelling method in this paper, more user factors could be included in future work, such as the shapes of the user's arms and the clothing already worn by the user. A user with a wide or large physique could cause a stronger force resistance in comparison to a user with a more slender build. Likewise, the force resistance could also be different depending on whether a user is already wearing a thin or thick garment. Features of the clothing, such as its texture or material, should also be considered in the future to enable the robot to assist human users that wear a diverse range of clothes.