The CoDyCo Project achievements and beyond: Towards Human Aware Whole-body Controllers for Physical Human Robot Interaction

—The success of robots in real-world environments is largely dependent on their ability to interact with both humans and said environment. The FP7 EU project CoDyCo focused on the latter of these two challenges by exploiting both rigid and compliant contacts dynamics in the robot control problem. Regarding the former, to properly manage interaction dynamics on the robot control side, an estimation of the human behaviours and intentions is necessary. In this paper we present the building blocks of such a human-in-the-loop controller, and validate them in both simulation and on the iCub humanoid robot using a human-robot interaction scenario. In this scenario, a human assists the robot in standing up from being seated on a bench.

The CoDyCo Project achievements and beyond: Towards Human Aware Whole-body Controllers for Physical Human Robot Interaction References Controller Intention (q, ν) (q, ν), ν, f Section IV Section II Section III Prediction τ f Fig. 1: Collaboration between robot and human.High-level objectives generate references for the human-aware controller that commands the robot torques.Thanks to the wearable sensors, we can estimate the human state and dynamical quantities and provide them as feedback to the controller.A human Intention predictor may be used to detect the human intention given the estimated human dynamics.
workers, i.e. the so called collaborative robotics described in the Industry 4.0 manifest [1, Sect.2.2].In all these scenarios, the human presence is central and cannot be overlooked.
Research in physical human-robot interaction (pHRI) focuses on trying to answer the following question: "How can we predict human intentions so as to synthesise robot controllers that are aware of and can react to the human presence?" The most common methods to estimate human intention are mainly based on minimum jerk models [2], or on imitation learning techniques.In the latter, the movements of two human actors are typically retrieved with motion capture techniques, clustered in motion databases [3], [4], [5] and then used to learn the interaction skills [6], [7], [8].In contrast to these approaches, we formulate the problem as the following: given a human model, and measurements provided by wearable sensors, we want to estimate the current human configuration (state) and intention (dynamical quantities: force-torques, accelerations, etc.); a robot controller formulated with this information can then be aware of future human actions, and act accordingly.
In this paper, we propose and describe the main blocks that are needed to perform interaction between a human and a human-aware-robot, as depicted in Fig. 1.In particular the contributions of this paper are mainly three:  shows the external forces acting on the human and the effort estimated at the human joints as grayscale coloured spheres • Design of a momentum-based balancing controller to take into account and exploit the support the human offers (c.f.Section II) • Generation of high-level references, bootstrapping from human experiments, and refining them through optimisation and machine learning methods (c.f.Section III) • Online estimation of human dynamical quantities such as force-torques, accelerations and internal torques (c.f.Section IV).We validate all the elements both in the Gazebo simulator and on the real iCub humanoid robot and we report the results in Section V. We perform a pHRI experiment, where the robot has to stand from a bench helped by a human subject, see Fig. 2a and 2b for a snapshot of the experiment and the proposed human dynamics monitoring tool respectively.
What is currently missing, is the dashed block and lines in Fig. 1.By adding the feedback from the human dynamics estimator to the robot controller we can synthesise fully humanaware controllers.Closing this loop is the main motivation of the newly founded EU Horizon 2020 An.Dy.project: designing human in the loop robots controllers.Future perspectives in Section VI conclude the paper.

A. Background
In this work, we model both the robot and the human with the same mathematical formalism.This choice is motivated by the fact that we would like to use the estimated human state and dynamics as additional feedback in robot controllers when pHRI scenarios are considered.Furthermore, a richer representation, e.g.modelling muscle activation, of the human dynamics seems premature, given the current state-of-the-art of robot controllers for pHRI.
The application of the Euler-Poincaré formalism [9] to the interacting agents, leads to four sets of equations describing: i) the dynamics of the robot, ii) the dynamics of the human, iii) the linking equations characterising the contacts between agents and environment, and iv) the contacts between human and robot: The state of an n (internal) degrees of freedom (DoF) freefloating dynamical system is composed of its configuration q ∈ SE(3) × R n and its velocity ν ∈ R 6+n .The matrix M and vector h are the mass matrix and nonlinear bias vector respectively, while actuation is provided by the internal torques τ .Eq. (1a) describes the dynamics of one of the two agents, e.g. the robot, while Eq.(1b) describes the dynamics of the other agent, i.e. the human.Note that, while the formalism remains the same, the quantities and degrees of freedom of the two systems are in general different, i.e. all quantities related to the human are denoted with the (•) symbol.The Jacobians J ∈ R 6ke×n+6 and J I ∈ R 6k×n+6 , similarly, group all the Jacobians corresponding to the k e contact force-torques with the environment, f e ∈ R 6ke , and the k interaction forcetorques, f ∈ R 6k .It is worth noting that the interaction forcetorques f are the same in both agents described by Eq. (1).Finally, equations (1c) and (1d) describe the constraints due to the rigidly-assumed contacts between the agents and the environments and between the two agents, respectively.

II. ROBOT CONTROLLER TO EXPLOIT HUMAN HELP
State-of-the-art whole-body controllers are often decomposed in two different stages, each of which solves a different control objective [10], [11], [12].The first stage is responsible of controlling the robot momentum that, when expressed at the center of mass and with the inertial frame orientation, is defined by H := m ẋ c H ω , with m the total mass of the robot, x c ∈ R 3 the position of the robot center of mass, and H ω the angular momentum of the multi-body system.The dynamics of the momentum is linearly related to all the external forces and torques acting on the system, i.e.
where mg is the force due to gravity, c X i ∈ R 6×6 is a proper frame transformation matrix and f i is the i-th external force-torque.All the force-torques and transformation matrices can be compactly written with f ∈ R 6k and X ∈ R 6×6k respectively.Ideally, we can consider these force-torques as virtual control inputs and one can choose f so that Ḣ = Ḣ * , where Ḣ * ensures that x c → x d c and H ω → 0. Additional constraints are usually enforced on the variable f , so that the problem is generally formulated in term of an optimisation problem, i.e. as a quadratic program (QP).Once f * , the optimal value for f is determined, the second stage consists in computing the joint torque required to actually reach the desired value for the contact force-torques at the feet.This computation is achieved by solving a constrained QP, with f * and torques τ related by equations (1a),(1c).A secondary postural task can be used in order to reach some desired configuration, e.g. the one corresponding to a standing posture, and implemented as an impedance controller in joint space, i.e. such that with q j the joint coordinates.The postural task is usually attained with a lower priority with respect to the realisation of the desired contact force-torques f * .More details and properties of this kind of controllers can be found in [13].
In this paper, we focus on the task of letting the robot stand from a bench helped by a human subject.This in turn implies that additional, unpredictable forces, i.e. the one applied by the human to the robot, act on the robotic system.Consider Eq. ( 2) and add the additional human force-torque f hum , expressed in the same frame of the momentum: Given a measure of f hum , one possibility is to completely cancel out this term by using the controllable force-torques f .The macroscopic effect of this cancellation is that if a user would like to help the robot stand up, the robot motion would be invariant with respect to the help provided by the user since the effects of the external force-torques are cancelled out.An alternative approach is to cancel out only a part of the human force-torque while keeping the component which may help the robot stand.Recall that Ḣ * , i.e.
. This equation stresses the fact that an eventual help from a user to lift the robot up is useless: the rate of change of V does not depend upon the external force-torques, so the standing up motion is invariant to the user interactions.The modification proposed here is based on a decomposition of the external force-torque f hum that highlights the component of this external forcetorque that helps decrease the function V .More precisely, one can decompose the external supportive force-torque as follows: Note that the scalars α and β are the components of the external force-torque f hum along and perpendicular to the momentum error H. Now, one can re-define Ḣ * as follows ) and choose the control input f such that Ḣ(f ) = Ḣ * .By computing the time derivative of (5) along the system evolution ( 4)-( 6), one easily verifies that: The fact that the external supportive force-torques help the robot stand up is encompassed in the right hand side of the above equation: a negative α, i.e. the external force-torques are in the direction of motion, make the Lyapunov function decrease faster.Hence, (6) can be used to compute the f * needed to help the robot during standing up motions.

DEFINITION
The controller described in the previous section needs to be provided with a center of mass (CoM) trajectory reference and joint references for the postural task.It is the responsibility of the module or the user providing these references to make sure they can be actually tracked by the robot, i.e. not leading to a fall.This section describes the procedure used to define the CoM and joints references, starting from human acquisitions, to their adaptation to the humanoid robot.

A. Bootstrap with Human Motion Capture Acquisitions
To obtain the motion of a human while performing a "standup" motion with assistance task, we designed an experiment where we record the standing-up subject's angles of the ankle, knee, hip, shoulder and elbow joints.In the experiment one of the subjects (Subject 1) performs a stand-up task and the other subject (Subject 2) participates as an assistant, i.e. (Subject 2) helps Subject 1 to stand.The experiment is repeated three consecutive times and we use the measurements of the last trial only.We measure the kinematics of Subject 1 using the 3D Investigator Motion Capture System (NDI, Waterloo, Ontario, Canada) consisting of 1 × 3 camera array at a sampling rate of 100Hz.To perform the motion capture of Subject 1, we placed clusters of active markers on 5 body segments of the right side, namely the foot, shank, thigh, upper arm and fore arm, and 2 on the back, i.e. lower back and upper back.By using the NDI First Principles Motion Capture Software, we compute the angles of the joints connecting the considered segments.The zero position, i.e. the position such that all the joint angles are equal to zero, is defined as their value when the subject is standing straight and with his arms extended in front of him, see Fig. 3.
The joint angles trajectories captured by the motion capture system are then used as initial guess for the robot joint angles, and consequently the initial CoM trajectory is thus computed.Obviously, the resulting trajectories cannot be directly applied as references to the robot, and the remainder of this section is dedicated to their modification.

B. Optimising the CoM task compatibility
The controller presented in Section II is an example of a hierarchical-based control architecture.In this kind of controllers, each level of the hierarchy is agnostic of the others by design, and is responsible of stabilising a desired reference that is usually provided by the higher level.However, the decoupling between levels eliminates any guarantee that the planned task trajectories will be executed properly by the lower control layers.The end result is typically unstable or undesirable whole-body behaviours, and these tasks can be qualified as incompatible.Prioritisation techniques use weighted sums [14], [15], hierarchies [16], [17] or a mix of both [18], [19] to manage task incompatibilities at the whole-body control level, but are difficult to tune and may not actually solve the problem.Given that it is the task reference values which generate the incompatible control objectives, an alternative to prioritisation tuning is to modify the task trajectories and make them compatible as initially suggested in [20].To do so, we introduce a feedback loop that measures the errors induced by incompatibilities and changes the task trajectories to reduce them.This loop should take into account the controller hierarchy, as well as the robot's dynamics and environment.Given the complexity of the proposed feedback loop, we improve the trajectories through a model-free trialand-error process that minimises a cost function using blackbox optimisation solvers [21].
Here, we iteratively improve the task trajectories by trialand-error learning.Following the work in [22], we define a task compatibility cost function that combines trajectory tracking, end-point reaching and energetic costs.This cost is evaluated by performing the prescribed tasks and we use Bayesian Optimization to update the trajectory parameters, namely way-points.Starting from the initial CoM trajectory as described in Section III-A, we use the task optimisation process to improve it so as to guarantee better performances in the stand-up motion.
In Fig. 4, the evolution of the CoM for the original and optimised movements is provided.The whole-body motion produced by the original CoM trajectory, Fig. 4a, is unstable and causes the robot to loose balance.The optimised CoM trajectory, on the other hand, produces a stable sit-to-stand transition as shown in Fig. 4b.At the moment the bench contacts are deactivated in the controller, the original motion immediately tends to lift the CoM upwards, despite an inappropriate x-location of the CoM (not close enough to the foot polygon of support).This inconsistent CoM trajectory does not respect the dynamic balancing conditions [23] and causes the robot to fall.The optimised trajectory moves the CoM more aggressively in the forward direction as well as lowering it prior to the contact deactivation instant, thus leading to a successful sit-to-stand transition.

C. Optimising the Postural Task
CoM dynamic manipulability is a metric to measure robots physical abilities to accelerate their CoMs in different directions.It is defined as a velocity independent metric which depends only on robot configuration and inertial parameters [24].CoM dynamic manipulability determines the CoM acceleration due to weighted unit norm of actuated joint torques.This norm is defined as where W τ is a symmetric positive definite weighting matrix.By using Eq.(1a), we can find the CoM acceleration as a function of joint torques as ẍc = J τ τ + ẍcvg , where J τ ∈ R 3×n is a Jacobian that maps the joint torques to the CoM acceleration and ẍcvg ∈ R 3 is the gravity and velocity dependent part of the CoM acceleration.By applying torques which satisfy (8), the CoM acceleration will be bounded as This inequality defines an ellipsoid in the CoM acceleration space.The center of this ellipsoid is ẍcvg and its radii and orientation can be determined by the eigenvectors and eigenvalues of matrix J τ W −1 τ J τ .This implies that this ellipsoid depends on J τ , which is a function of the robot parameters and configuration, and on W τ , which is a user dependent parameter based on the application.
In this paper, the weighting matrix is chosen so as that the resulting ellipsoid accommodates for all possible CoM accelerations due to torque limits, see [24, Sec.III-A].The maximum joint torques are assumed to be 40Nm for the legs and 20Nm for the arms of the iCub.The robot is in a sitting configuration with joint angles as result of the human motion capture experiment in Section III-A.We decide to use the shoulder pitch angle and elbow angle as two optimisation variables to maximise the CoM dynamic manipulability in a desired direction.The desired direction is assumed to be horizontal since this corresponds to the first segment of the desired CoM trajectory, i.e. moving the CoM from the bench to lie on top of the feet.
The CoM dynamic manipulability for different arm configurations (i.e.different values of the shoulder pitch and elbow angles) is shown in Fig. 5.The maximum manipulability can be found at shoulder pitch = −33[deg] and elbow = 30 [deg].The values corresponding to the maximum manipulability are then used as desired joint coordinates q d j in Eq. (3).

IV. REAL-TIME ESTIMATION OF HUMANS DYNAMICS
This section describes the theoretical formulation and the software architecture we used to estimate in real-time the human dynamics.
To efficiently describe the dynamics of a mechanical system, and of humans as previously motivated in Section I-A, we use the Newton-Euler equations 1 .This set of equations is commonly used to describe the dynamics of articulated rigid body systems such as robots, and we combine them with measurements equations for a multitude of sensors, such as accelerometers, gyroscopes, force/torque sensors, etc., which might be redundant.
Assuming that the considered system is composed of N B rigid bodies, we want to estimate the variable d ∈ R n D which contains dynamics quantities related to each rigid body and joint composing the system.We also denote with y ∈ R n Y the vector containing all the measurements coming from the various sensors located on the system body.We can thus rewrite the Newton-Euler equation and the measurements equations in a compact matrix form: where the matrix D ∈ R neq×n D , bias vector b D ∈ R neq and the matrix Y ∈ R n Y ×n D , bias vector b Y ∈ R n Y are statedependent elements obtained by manipulating the model and measurement equations respectively, and n eq is the number of equations resulting from the Newton-Euler formulation.The two systems in Eq. ( 10) can be grouped together in a linear system in the only unknown variable d.Solutions to the system (10) can be obtained with different methods depending on the number of measurements available.For example, if only joint accelerations are available, then ( 10) is a square system and the solution roughly corresponds to the one obtained by computing the inverse dynamics.In case of more measurements, the solution in general does not exists and it is common to obtain minimum error solutions, i.e. a least square solution.
In this paper, we adopt a different technique.We consider each of the variables composing the system as random variables with an associated a-priori probability density function.Given the actual measurements coming from the sensors, we want to find an estimate of d that maximises the likelihood of the a-posteriori probability density function p(d|y).We thus perform a Maximum A-Posteriori (MAP) estimation.We refer the reader to [25], [26] for a more thorough description of the method.
The MAP estimation has been implemented as YARP C++ modules and the whole software architecture (see Fig. 6) has been validated in the human-robot collaboration scenario.The human is modelled with 22 spherical joints, i.e. allowing three rotational degrees of freedom (DoF), connecting 23 bodies.We use the Xsens MVN motion capture wearable suit to obtain the 3D position and orientation of each of these bodies, together with their velocities.Interaction between the human and the floor occurs at specified location and it is measured by two force plates.The force-torques exchanged with the robot are estimated by the robot itself thanks to the presence of the artificial skin, the force/torque sensors mounted on the robot and of the whole-body estimation algorithm [27, Sec.As an intermediate step, we need to convert the information about the human motion coming from the motion capture system into a representation compatible with the formalism in Eq.(1b).In fact, we need to obtain the state of the system in term of its generalised coordinates, i.e. (q, ν).
We use inverse kinematics to map the bodies pose to the configuration q.In particular, denote with T i , T j ∈ SE(3) the pose of two connected bodies in the original human model and with T i (q), T j (q) ∈ SE(3) their pose in the rigid-body model in Eq. (1b).We can now define the relative pose to be T ij = T −1 i T j and T ij (q) = T −1 i (q) T j (q) for the human and the rigid-body model respectively, where q are only the degrees of freedom of the joints connecting the bodies i, j.We now solve the following nonlinear optimisation problem where error : SE(3) → R is an error function and qmin , qmax are the joint limits.
To compute the generalised velocities of the joints connecting the frame i and j, i.e. ν, we measure the relative angular velocity of the two bodies ω ij = ω i − ω j .The generalised velocities are then computed by inverting the following relation: where i J j is the relative Jacobian of the link j with respect to the link i and ω ij is the angular velocity of the body j with respect to the body i using the model in Eq. (1b).Note that, in general ω ij = ω ij as it depends on the rigid-body model.
To obtain ν we solve Eq.( 12) in the least square sense.
The optimisation problem (11) and Eq. ( 12) are solved for every (i, j) pair.Note that the mapping procedure (T ij , ω ij ) → (q, ν) is quite generic, allowing one to use simpler models in the estimation process by changing the kinematics of the rigidbody model.
Finally, the human state as processed by the inverse kinematics together with the force-torques measurements are supplied as input to the MAP estimation algorithm, which runs online at 100Hz.

V. SIMULATION AND EXPERIMENTAL SCENARIO
In the experimental scenario a human subject stands in front of the robot agent so as to help it stand up from being seated on a bench, see Fig. 2 for a snapshot of the different phases of the experiment.
Our test platform is the iCub robot, a state-of-the-art 53 degrees of freedom humanoid robot [28].For the purpose of the present experiment, only the principal 23 degrees of freedom, located in the legs, torso and upper arms are torque controlled.All the other degrees of freedom are position controlled.The robot is controlled by the momentum-based balancing controller described in Section II and implemented in Simulink R by using the WB-Toolbox [29].References to the controller are coordinated by an internal state-machine, whose states trigger depending on external signals, such as changes in the contact force-torques.The trajectories commanded by the state-machine are the one described in Section III.

A. Gazebo simulations
We first tested the proposed controller and references by using the Gazebo simulator [30] together with the YARPbased plugins [31] to connect the iCub simulated model to the controller.Interaction with the human partner is simulated by using the Geomagic touch haptic device, which has been integrated in the YARP middleware.The frame corresponding to the tip of the Geomagic is virtually attached to the endeffectors of the simulated iCub, so as to simulate the human grasping the robot arms at this location.When the human clicks and holds the button of the Geomagic and moves the device at the same time, the new position of the tip is used to compute the interaction force to apply to the robot, following a linear spring model whose constant value is determined by the max force that can be applied (chosen by the user, e.g.30N).At the current state, no force feedback is provided to the user.The reference frame is chosen with the origin on the left foot while the robot is standing; the z axis points against the gravity, the x axis points forward, and the y axis completes the right-handed base.Fig. 7 shows the results of the stand up experiment with and without the human assistance.An initial interaction force ≥ 10N in the x-direction is used to trigger the stand-up movement, whereas the proper assistive force is in the z-direction.Even if the applied forces are not very big, it is possible to notice the effect of the assistive force on the CoM position.

B. iCub Experimental results
During the interaction with the real robot, the human dynamics is continuously estimated and monitored by the software architecture described in Section IV.The human subject wears the Xsens sensorized suit and stands on the two force plates by positioning each foot on a platform.In the current experiment the human cannot move the feet outside the force plates, as those are the only source of information to measure the ground reaction force-torques.As we currently lack of a global base pose estimation for the robot, the distance between the robot and the human remains constant and known a-priori throughout the entire experiment.
We performed two different experiments for the standup motion.In the first experiment, we performed 10 trials where the robot stands without the human assistance.We then repeated the experiments with the presence of the human.In this second scenario, we asked 6 subjects of different height, sex and experience in interacting with robots to perform each 7 different trials trying to help the robot standing up, and 7 trials trying to hamper the robot action, as a test for the robustness of the controller.
To understand if the human provides help during the standup, we used, as a basis for comparison, the norm of the robot torques that can be assimilated to the electric power used by the robot motors, i.e.P e ∝ τ .Indeed, as the iCub robot is equipped with electric motors, motor torques are proportional to motor currents and, as they are driven by constant voltage, also to electric power.Fig. 8 shows the robot torque norm average, together with the 95% confidence region, in both scenarios.Notice that the robot needs to provide less torque when helped by the human.
Fig. 9 shows the estimation of the human torques during the robot standup.We plot the average torques, together with the 95% confidence region, of the L5-S1 (Lumbosacral) joint of one subject across the different trials.As all the joints are modelled as spherical (3 DoFs) joints, we plot the torques around the three axes.Nevertheless, only the torque around the y axis has as significant component as expected given the performed movement.

VI. CONCLUSION AND FUTURE PERSPECTIVES
In this paper we presented a pHRI scenario where a robot has to stand from a bench helped by a person.The state-of-theart momentum-based balancing controller has been modified accordingly, thus exploiting the human help instead of simply cancelling it out.The references for the controller have been generated by optimisation and machine learning techniques.We also presented a method and the necessary software architecture to perform online estimation of human dynamics during the pHRI experiment.
Fig. 1 describes a general pHRI scenario, and its blocks have been presented throughout this paper.What we did not present, and that will be the focus of future research, are the dashed block and connecting lines, i.e. how do we use the human feedback in the robot controller.
As an example, assume that the control objective for the robot is to keep the balance.Then, this objective can be viewed as the asymptotic stabilisation of an output function h ∈ R k , usually representing the robot center of mass, momentum, etc. [13].For instance, if the function h represents the robot center of mass, the function only depends on the robot's position q, i.e. h = h(q).Then, one may attempt at its control by computing the second order time derivative of h, and use the robot torque input τ to impose desired dynamics for h.This process of feedback linearisation may involve the knowledge of the state of the human (q, ν) as well as the human joint torques τ .
Another very promising perspective of the results presented in this paper is the control of human related quantities via the robot actuation.For instance, assume that the above task for keeping the robot balance leaves free some robot control actuation.Then, the question is: "Can we use the robot input redundancy to control the human body?"To answer this question, one must first define the control objective for the human.Assume that a paralysed human wants to balance by using the help of a robot.Then, the human control objective may be the control of the human center of mass, i.e. an output h H = h H (q) that clearly depends only on the position q of the human body.Then, the control of this quantity can be attempted by imposing a desired dynamic for ḧH that can be influenced by the redundancy of the robot input actuation while balancing.All these research directions will be the scope of forthcoming publications and the main focus of the newly funded EU H2020 An.Dy.project.

Fig. 2 :
Fig. 2: pHRI experiment.iCub stands up from a bench with the help of a human subject wearing a sensorized suit.The visualisation (b) shows the external forces acting on the human and the effort estimated at the human joints as grayscale coloured spheres

Fig. 3 :
Fig. 3: Motion Capture experiment performed with the human subject.The figure describes the configuration for which the zero angles are defined.

Fig. 4 :
Fig. 4: Original and optimised CoM reference trajectories and their resultant whole-body motions.The original trajectory produces an unstable standing motion causing the robot to lose balance.The optimised CoM trajectory, however, produces a successful sit-to-stand transition.

Fig. 6 :
Fig. 6: Schema describing the YARP modules used for the real-time human dynamics estimation.

Fig. 7 :
Fig. 7: CoM during the stand-up motion simulated in Gazebo with and without human assistance.

Fig. 8 :Fig. 9 :
Fig. 8: Robot torques norm with and without human assistance.The lines show the sample means across the different trials.The shaded regions represent the associated 95% confidence region.