Decentralized Ability-Aware Adaptive Control for Multi-Robot Collaborative Manipulation

Multi-robot teams can achieve more dexterous, complex and heavier payload tasks than a single robot, yet effective collaboration is required. Multi-robot collaboration is extremely challenging due to the different kinematic and dynamics capabilities of the robots, the limited communication between them, and the uncertainty of the system parameters. In this letter, a Decentralized Ability-Aware Adaptive Control (${DA}^3\!{C}$) is proposed to address these challenges based on two key features. Firstly, the common manipulation task is represented by the proposed nominal task ellipsoid, which is used to maximize each robot's force capability online via optimizing its configuration. Secondly, a decentralized adaptive controller is designed to be Lyapunov stable in spite of heterogeneous actuation constraints of the robots and uncertain physical parameters of the object and environment. In the proposed framework, decentralized coordination and load distribution between the robots is achieved without communication, while only the control deficiency is broadcast if any of the robots reaches its force limits. In this case, the object's reference trajectory is modified in a decentralized manner to guarantee stable interaction. Finally, we perform several numerical and physical simulations to analyse and verify the proposed method with heterogeneous multi-robot teams in collaborative manipulation tasks.


I. INTRODUCTION
Collaboration with other agents can often be beneficial.For example, a multi-robot team like the one shown in Fig. 1 is more dexterous and robust in heavy and large object manipulation tasks [1] than a single robot.Also, in human-robot collaboration scenarios [2], the human's input can improve the intelligence and adaptability of the team.Yet, collaboration is not trivial, due to the effects of one agent's actions on the planning, control and decision of others.
Here, we investigate multi-robot collaborative manipulation tasks, where a decentralized robot team needs to achieve a common objective, while each robot has different motion and force capabilities.To perform such collaborative tasks, each robot within the team should maximize its contribution to the task, appropriately distribute the load among other robots, and adapt its behaviour according to the capabilities of the other robots in the team.Traditional centralized control methods have been used for multi-robot collaboration, but assume access to an accurate model of the robot team and full observability of the state of the other robots and the object.However, when considering the characteristics of real-world multi-robot teams [3], the most pressing problems are: (i) heterogeneity of the robots' capabilities, (ii) uncertainty of the system's physical parameters and (iii) lack of high bandwidth communication between the robots.
Therefore, we propose a method to maximize the force capability of each robot while designing a decentralized adaptive controller.Using this framework, we achieve the shared manipulation task under modelling uncertainties, input constraints and band-limited communication.
Motion and force capabilities: A manipulability metric [4] was first proposed as a measure of the capability of robotic mechanisms, and has been broadly used for redundancy control [5].Utilizing the task-oriented manipulability measure, the optimal joint configuration of a redundant manipulator can be determined [6].An efficient closed-form calculation of the task space manipulability was presented for a 7-DOF manipulator [7].For a multiple-arm system, the task-space force and velocity manipulability ellipsoids were given in [8].
Based on a study of the dynamic manipulability of robots, two physically meaningful choices for weighting matrix were provided [9].To simplify the calculation of the dynamic manipulability, the weighted manipulability ellipsoid can be used to approximate the manipulability polytope [10].In this paper, the weighted force manipulability ellipsoid (WFME) will be adopted to optimize the force polytope of the manipulator.Redundancy exploitation: For dexterous manipulation [11], [12], redundant manipulators were adopted to enlarge workspace, avoid singularities and collisions with the environment.Further, manipulator's redundancy can also be explored to maximize manipulability [13].The stiffness feasibility regions of redundant manipulators and a global task-oriented stiffness optimization were used to select robot poses [14] and configurations [15] for force and stiffness control, respectively.This highlights the important role of the robot's configuration in interaction tasks.The configuration of a redundant robot was also optimized while manipulating an object in space, to minimize disturbances on its base [16].Geometry-aware methods were used to provide a manipulability-based redundancy resolution [17], [18], which enabled tracking of manipulability ellipsoids.Yet, the desired ellipsoid is either pre-defined by a human expert or pre-recorded from demonstrations.Here, we define a nominal task ellipsoid which is generated from the manipulation task automatically and present a task-oriented manipulability optimization to match the WFME of the robot with the nominal task ellipsoid.
Decentralized adaptive control: A decentralized model reference adaptive controller was proposed to deal with uncertain physical parameters of the collaborative multi-robot system [19].For manipulation tasks with inaccurate kinematic model, the adaptive controller was used to handle the closed kinematic chain constraint and achieve accurate motion tracking with minimum-norm actuation force [20].Recently, Deep Neural Networks were adopted to model system uncertainties in model reference adaptive control [21].In studies of multiple mobile manipulators, distributed coordination control and synchronous cooperation control were presented to deal with time delays and switching topologies [22].A distributed cooperation scheme was adopted for networked mobile manipulators, which exploits the formation-based task allocation and task-oriented strategy [23].A distributed impedance controller has also been used for collaborative manipulation with event-triggered communication [24].Recently, a decentralized adaptive controller [1] for multiple collaborative mobile robots was introduced.This controller can track the reference velocity trajectory without a priori knowledge of the agent's position and payload properties.Yet, all adaptive controllers described above did not consider force and torque constraints (manipulability) of the robots.
Considering a multi-robot collaborative manipulation task, a decentralized ability-aware adaptive control (DA 3 C) framework (shown in Fig. 2) is proposed.Our method can handle both uncertain system parameters and input constraints without full communication between the robots.In the investigated multi-robot collaborative manipulation setup each robot has access to the desired manipulation task, which is described as a nominal task ellipsoid.Each robot tracks the nominal task ellipsoid using the task-oriented manipulability optimization method, while the DA 3 C enables multi-robot coordination with respect to the common manipulation task.The main contributions of this paper are summarised as follows: 1) A nominal task ellipsoid is defined based on the common manipulation task, and it is used to optimize the force capability of each manipulator.2) A decentralized adaptive controller under input constraints is designed and proven to be Lyapunov stable.3) Different heterogeneous multi-robot systems with input and communication constraints realize collaborative manipulation tasks using the proposed decentralized ability-aware adaptive control that guarantees stability and convergence.The remainder of this paper is organized as follows.The preliminary work is presented in Section II.In Section III, we define the nominal task ellipsoid and present the task-oriented manipulability optimization.The decentralized ability-aware adaptive control is described in Section IV.In depth analysis of the proposed method is carried out in Section V, where several numerical and physical simulations of collaborative manipulation tasks are performed.The conclusion and future work are discussed in Section VI.

II. PRELIMINARIES A. Manipulability Ellipsoid and Force Polytope
Generally speaking, the force manipulability ellipsoid can be used to approximately describe the force capability of the manipulator.In the simplest case, if we consider an arbitrary n-DOF manipulator robot k with the same torque limit across all its joints, we can obtain the force manipulability ellipsoid [4] using joint torque τ k such that τ k 2 1.This ellipsoid is a subset of all realizable forces and is defined as where J k is the Jacobian matrix of robot k and F k is the force (and torque) at the end-effector of robot k.
However, typically manipulators have different torque limits for each joint, expressed as Thus, the force polytope [10] of the manipulator is described by 2n bounding inequalities as As shown on the left side of Fig. 3, the force polytope corresponding to joint torque limits in (2) can be approximated by the weighted force manipulability ellipsoid (WFME) as where is a weighting matrix used to formulate the WFME.

B. Object's Dynamics
For multi-robot collaborative manipulation scenarios, the equation of motion of the object can be written as where m o and I o are the mass and inertia of the object, v o and ω o are respectively the linear velocity and angular velocity of the object, G o = [m o g T , 0] T is the gravity, F f is the linear and rotational friction force which is modelled as T , µ l and µ r are the linear and rotational friction coefficients, respectively.F t is the external force exerted on the object (in this paper it is task-related).All the variables are expressed in the world coordinate system.

III. TASK-ORIENTED NULL-SPACE MANIPULABILITY OPTIMIZATION
To utilize each robot's maximum capability, its force polytope needs to be optimized with respect to the manipulation task.As WFME is a conservative approximation of the force polytope as shown in Fig. 3, we optimize WFME instead to maximize each robot's force capability.To do this, we first define the nominal task ellipsoid to encode the task's force characteristics.Second, we optimize the robot's null-space motion to match the WFME with the nominal task ellipsoid.

A. Nominal Task Ellipsoid
The nominal task ellipsoid is defined as an ellipsoid of revolution, also called a prolate spheroid, whose two principle axes have the same length, while the third principle axis is the longest.The nominal task ellipsoid can be generated by the transformation of unit sphere as where c e is the Cartesian coordinates of the nominal task ellipsoid, c s is the Cartesian coordinates of the unit sphere.
C is the transformation matrix and can be calculated as C = R r R s with R s being the scaling matrix and R r being the rotation matrix.The scaling matrix R s is defined as R s = diag (l x , l y , l z ), where l x , l y , l z , are the lengths of the principle axes.In this paper, l x = 1 corresponds to the longest axis that is aligned with the desired force.l y = l z = c f l x correspond to the other two axes and c f ∈ (0, 1] can be set according to the task requirements, e.g.c f = 1 (isotropic ellipsoid in Fig. 3) for manipulation tasks that require force to be equally distributed along different directions, while c f < 1 (other cases in Fig. 3) for manipulation tasks that require force along a specific direction.
Given the desired force F t on the object and the unit vector a x of the longest axis, the common perpendicular vector u and the included angle φ between them, can be obtained by respectively.Therefore, the rotation matrix R r -which is used to align the longest axis with the desired force-can be obtained according to the angle-axis representation as where s φ = sin(φ), c φ = cos(φ), E 3 is the 3 × 3 identity matrix and u × is the skew symmetric matrix of u.From (5), we can obtain the following equation where the symmetric positive definite matrix M t represents the nominal task ellipsoid.A few nominal task ellipsoids for different manipulation tasks are visualized in Fig. 3.The shape of a nominal task ellipsoid is determined by the coefficient c f , while its orientation is determined by the direction of the desired force F t .

B. Null-Space Manipulability Optimization
As multiple robots manipulate the object jointly, the endeffector of each robot can be assumed to be fixed on the object via the corresponding grasping point.Thus, the velocity of the end-effector of each robot ẋk can be derived from the velocity of the object ẋo as where G pk is the grasp matrix of robot k, r k is the position vector from the object's center of mass to the grasping point of robot k, and ẋo = v T o , ω T o T .By differentiating (9) the acceleration constraint can be obtained as At the same time, the pseudo-inverse solutions for joint velocity Θk and acceleration Θk of each robot are respectively; where J † k is the pseudo-inverse of J k .In this paper, the null-space motion of the redundant manipulators is used to optimize the force manipulability given the current WFME of robot k and a desired nominal task ellipsoid (Section III-A).By using the tensor representation and exploiting the fact that ellipsoids lie on the Riemannian manifold of symmetric positive definite (SPD) matrices [18], the velocity-level inverse kinematics with task-oriented manipulability optimization can be derived as The first term of the right hand side is same as in (11), while the second term projects the difference between the nominal task ellipsoid and WFME to the null space motion of the manipulator.K M is the scaling matrix, vec () denotes the vectorization of symmetric matrices with Mandel notation 1 , the Log operator is a logarithm map [25] which can find the tangent vector between two points in the SPD manifold.M f k represents the WFME of robot k and is defined as The force manipulability Jacobian J f M k projects the scaled rate of change of M f k to the joint velocity Θk and it is obtained as where [26].The force capability optimization is achieved in the null space of manipulation task by using the proposed taskoriented manipulability optimization (12).

IV. DECENTRALIZED ABILITY-AWARE
ADAPTIVE CONTROL According to the force polytope shown in Fig. 3, the maximum operational force along any specific direction can be calculated.Subsequently, the decentralized ability-aware adaptive controller (DA 3 C) computes the control inputs in accordance with the force capability of each robot.

A. Force Capability
The maximum operational force of each manipulator along the specific task direction F t is calculated by max 1 The Mandel representation [ε] (as a column-vector) of any second rank, symmetric tensor ε is defined as follows: where H k is the inertia matrix of robot k, C k is the Coriolis and centrifugal force of robot k, G k is the gravity of robot k.The force polytope is defined by (15b) and can be obtained from the dynamic equation of the manipulator, while (15c) is used to define the specific task direction.

B. Ability-Aware Adaptive Controller
In order to track the desired trajectory of the object during decentralized multi-robot collaborative manipulation, we propose an ability-aware adaptive controller in which the force capability of each robot is considered.
According to (4), the ideal reference dynamics model of the object can be written as where A * is a Hurwitz stable matrix written as , and is a stacked vector of gravity term and nonlinear term in (4), and m * o and I * o are the nominal mass and inertia matrix of the object.Given the bounded reference (desired) trajectory ẋ * o and ẍ * o of the object, the bounded reference control input F * t , which represents the external force exerted on the object, can be computed from (16).
To design the adaptive controller, the actual object's dynamics model can be rewritten in the following linear form with respect to the system state ẋo as where A and B k are unknown constant matrices, A = is the total number of robots, U k = W * T φk Φ k is an unknown nonlinear term caused by modelling uncertainties which can be approximated by a Radial Basis Functions (RBFs) neural network, W * φk is the weight matrix for the RBFs, Φ k is the vector of RBFs and output bias.
For each robot, the adaptive control input is designed as where K xk , K rk , K nk and W φk are control gain matrices.
Considering the force capability of each robot, the control input constraints of DA 3 C are guaranteed by positive µmodification [27].Therefore, the modified ideal reference dynamics model (see (16)) and the actual system dynamics (see (17)) with input constraints are rewritten as where K f k is the control gain matrix, ∆F k is the control deficiency, which is described in Appendix A, and is used to generate the adaptive augmentation for the reference model.Furthermore, by defining the tracking error e = ẋo − ẋ * o , the tracking error dynamics can be obtained as where K = K − K * is the error matrix of control gain and K * is ideal control gain (see Appendix B).Last, the following adaptive laws are adopted in order to guarantee asymptotic tracking of the reference trajectory.
where P is unique SPD solution of the Lyapunov equation, P A * + A * T P = −Q < 0, where Q is any SPD matrix; Γ xk , Γ rk , Γ nk , Γ φk , and Γ f k are SPD gain matrices.The proof for asymptotic stability and asymptotic tracking using the above adaptive laws can be found in Appendix B.

C. Computed-Torque Control for Each Robot
According to the constrained operational force and the adaptive reference trajectory of the object, the controller of each robot can be designed as the following well known computed-torque feedforward control, where Θk , Θk are the error vectors of joint angle and joint angular velocity, and K p , K v are the corresponding gains.

D. Level of Communication
In the proposed ability-aware adaptive controller, the levels of communication between the robots in the team may vary depending on the different states of the multi-robot system.As long as the desired operational force of each robot is within its force polytope, each robot can be controlled in a fully decentralized manner.Each robot should know the object's velocity and the grasping location on the object.However, in cases where a robot would need to exceed its capabilities to track the reference input-desired operational force lies outside of the force polytope-then, the control deficiency ∆F k of each robot should be broadcast to all the robots of the team.The control deficiency of all the robots will be used in (19) to modify the unfeasible reference control input for each robot, which results in a coordinated adaptation of the robot team.

V. RESULTS
First, we perform three ablation studies to demonstrate the benefits of each one of the proposed components and their combination.Second, we demonstrate the adaptation capabilities of the DA 3 C framework in a decentralized manipulation setup.Third, we validate DA 3 C on a physical simulation, where three heterogeneous robots manipulate an object.The control frequency of all the simulations is set to 500 Hz.

A. Ablation Studies
1) Force Capabilities with and without Manipulability Optimization: In this ablation study, three 4-DOF planar robots manipulate an object from initial position [0, 0] m to final position [-0.1, -0.2] m.The maximum joint torques of robot A, B and C are set to 0.8, 0.6 and 0.6 N m, respectively.The simulation results without and with task-oriented manipulability optimization are shown in Fig. 4. Using the proposed method the WFME of each robot is optimized through the null-space motion to track the nominal task ellipsoid (yellow ellipsoid), and consequently the force capability (red arrow) along the specific task direction is increased.
2) Ability-Aware vs Ability-Agnostic Adaptive Controller: The second ablation study compares the tracking performance of the proposed ability-aware adaptive controller (DA 3 C) with an ability-agnostic adaptive controller [1].The task considers three robots manipulating an object, with 20 kg mass and 20 kg m 2 inertia (along z-axis), on a plane with a sliding friction coefficient of 0.2.The adaptive controllers are initiated with 80% of these values.The reference control input is set to F * t = 4 * [sin(0.4t),sin(0.3t)]T N. The maximum force F max of each robot is set to [1.0, 1.0] N, and the constant vector δ (see Appendix A) is set to 10% of F max .The object's trajectories with ability-agnostic control (green line) and ability-aware control (blue line) are shown in Fig. 5.For the ability-agnostic adaptive controller, the control input (see Fig. 6) exceed the force constraints, hence, we limit     the control input to the maximum force.This results in a significant deviation of the object's trajectory from the desired one (see Fig. 5).On the other hand, given the limited capability of each robot in the ability-aware controller (see Fig. 7) the new reference trajectory (red dashed line) deviates from the original one (blue dashed line) according to the control deficiency of all the robots as shown in Fig. 5.In this way, the multi-robot system can track the modified trajectory accurately with average tracking error 0.009 m/s and 0.013 m/s along x-axis and y-axis, respectively.
3) DA 3 C with and without Manipulability Optimization: Here, we compare the proposed ability-aware adaptive control with and without manipulability optimization.The three manipulators of Section V-A1) are used to manipulate the same object as in Section V-A2).The reference trajectory is circular and is described with v x = −0.1 * π 6 * sin( π 6 t) m/s, v y = 0.1 * π 6 * cos( π 6 t) m/s and w z = 0.The nominal task ellipsoid is a sphere to equally allot the force capabilities along different directions.The velocity tracking errors of the object are shown in Fig. 8 (left).The average tracking errors along two axes with manipulability optimization are (0.0138, 0.0143) Fig. 10: A heterogeneous multi-robot team manipulates an object.The object's circular motion is illustrated with respect to the yellow circle, which is fixed in the world frame.Fig. 11: Desired, modified reference and actual trajectory of the object (1st column), and desired (dashed line) and actual (solid line) end-effector forces of robot A (2nd column), robot B (3rd column) and robot C (4th column).By setting robot-specific safety zones (green region), the force of each robot is guaranteed not to violate its actual limits after a few iterations.m/s, which outperforms (0.0138, 0.0301) m/s corresponding to the one without.In the right part of Fig. 8, we can observe all the control inputs along with the force limits (regions).The DA 3 C with manipulability optimization regulates the robot's configuration to achieve similar force capabilities along both directions, such that tracking performance is balanced in both directions.

B. DA 3 C on-the-fly Adaptation
Here, we show that the proposed method is capable of handling changes with respect to the mass of the object, the friction coefficient and is also tolerant to faults of team members.We consider the same circular task, where the mass of the object is increased from 20 kg (white region 1 ) to 30 kg (green region 2 ) at 30 s, the friction coefficient is decreased from 0.2 (green region 2 ) to 0.1 (blue region 3 ) at 50 s, and the max force of robot B is reduced to 1/3 while robot C is shut off (purple region 4 ) at 70 s.The results are shown in Fig. 9, the control inputs of the robots increase when the object's mass is increased and decrease when the friction coefficient is decreased.Further, it is worth noting that in the purple region 4 , the control input of robot A is increased and the reference trajectory is adapted, due to the loss of robot C and the reduced capability of robot B.

C. DA 3 C for Multi-Robot Collaboration
To verify the proposed DA 3 C framework, three heterogeneous robots, the torque-controlled Kuka-iiwa (A), the Frankapanda (B), and one position-based admittance controlled mobile manipulator (C), are used to manipulate an object on a surface (see in Fig. 10), where both the object's mass, and the friction between the object and the surface are unknown.The task is a circular motion and the maximum force for all the robots is limited to 10 N, while the constant δ (see Appendix A) are set to 1, 2, and 3 N, respectively.For the torquecontrolled robots, the feedback gains K p , K v in ( 23) are set to diag( [16,16,16,12,12,12,12]) and diag([0.2,0.2, 0.2, 0.1, 0.1, 0.1, 0.1]), respectively.The interaction between the object and the robots is modelled with a spring-damper (stiffness 5000 N/m and damping 500 Ns/m).The physical simulation and visualization are developed in MATLAB Simscape and Simulink.
The object's desired and actual trajectories, and the force profiles of each robot are shown in Fig. 11.The reference trajectory is modified according to the control deficiency, due to the limits on the robots' force capabilities.Robot C does not always track the desired force accurately, due to the admittance controller.Thus, its safety zone is set larger (green area) to always experience forces within its limits.Also, such force deviations introduce further unmodelled interaction forces, due to the formed closed-chain.Yet, each robot adapts its control gains on-the-fly to cope with these deviations too.

VI. CONCLUSION
In this paper we propose a decentralized ability-aware adaptive control (DA 3 C) framework for multi-robot collaborative manipulation, which can handle uncertain system parameters, input constraints and band-limited communication.The key idea is that the force capability of each robot is maximized by exploiting its null-space motion, while the designed adaptive controller enables decentralized coordination according to the capability of each robot.The proposed method achieves accurate trajectory tracking irrespective of the low-level controllers, and can be used for heterogeneous fixed-base and mobile-base multi-robot systems.An open challenge is the inclusion of joint position limits into the ability-aware adaptive controller.In our future work, we plan to use DA 3 C for human-robot co-manipulation experiments where the access to human's capability is not straightforward.

APPENDIX A CONTROL INPUT WITH µ-MODIFICATION
The control input with µ-modification is written as where µ is positive design constant, F δ kmax = F kmax − δ, δ is a constant vector, 0 < δ < F kmax , and F kmax is the maximum force of robot k which is obtained from (15).The corresponding control deficiency can be calculated as

APPENDIX B LYAPUNOV STABILITY ANALYSIS
In order to match (19) and (20), we can choose the ideal gain matrices K * xk , K * rk , K * f k , and K * nk according to the following forms: According to the error dynamics in (21), we consider the following Lyapunov function candidate: Therefore, given a bounded reference input, we can conclude that the system can achieve asymptotic tracking by using Barbalat's Lemma.

Fig. 1 :
Fig.1: Pictorial description of the multi-robot collaborative manipulation setup, where the ability of each robot is illustrated as a force polytope.

Fig. 3 :
Fig. 3: Force polytope and WFME (left).Nominal task ellipsoids corresponding to different manipulation tasks where Ft represents the desired force along different directions (right).

Fig. 5 :
Fig. 5: State of the object with ability-agnostic and ability-aware controller.

Fig. 8 :
Fig. 8: Tracking errors of the object (1st column) and control inputs of robot A (2nd column) with and without manipulability optimization.The cyan regions represent the force capability envelope of the robot without optimization while the orange regions represents the capability envelope with optimization.

Fig. 9 :
Fig.9: State of the object (1st column) and control inputs of robot A (2nd column) and robot B (3rd column) and robot C (4th column).In region 2 , the mass of the object is increased from 20 kg to 30 kg.In region 3 , the friction coefficient is decreased from 0.2 to 0.1, while in region 4 , the force capability of robot B is reduced to 1/3 while robot C is shut off.