Efficient and Collision-Free Human–Robot Collaboration Based on Intention and Trajectory Prediction

Human–robot collaboration (HRC) is an important topic for manufacturing and household robotics. It is very challenging to ensure both efficiency and safety in HRC. This article presents an HRC pipeline that generates efficient and collision-free robot trajectories based on predictions of the human arm and hand (AH) motions. We train a recurrent neural network for AH trajectory prediction based on observed initial trajectory segments. To increase the accuracy of target estimation at an early stage, the observed and the predicted hand palm trajectories are combined to predict the current AH motion target using Gaussian mixture models (GMMs). An optimization-based trajectory generation algorithm is proposed to ensure the safety of the human while collaborating with the robot. The proposed system is validated in a shared-workspace scenario with human pick-and-place motions. The task can be safely and efficiently completed. The results demonstrate that our proposed pipeline can predict the human AH trajectory and estimate the motion target intended by the human accurately and early.


I. INTRODUCTION
Robots are powerful and fast, while humans are intelligent and can carry out dexterous manipulation tasks that may be hard for robots. Human-robot collaboration (HRC) is increasingly used in order to improve work efficiency and flexibility. However, it is very challenging to ensure the safety of the human and the efficiency of the robot at the same time. In this article, we consider a scenario shown in Fig. 1, where a human and robot share a narrow workspace. Physical interactions, such as compliance control in [1], are not considered here. However, the robot should work together with the human.

Fig. 1. Side view of the human-robot pick-and-place platform used for our experiments. In the shared workspace, the human puts objects to one of twelve target positions. The robot observes the current AH trajectory and predicts both the short-horizon human motion and the human reaching target position. The robot then generates collision-free and goal-oriented trajectories online to collaborate with the human for the assembly task.
To improve the joint assembly tasks' efficiency and ensure human safety in the shared workspace, the robot needs to be able to predict the human arm and hand (AH) trajectory and infer the human's target position in a short time horizon.
We model human motions using four joints, i.e., shoulder, elbow, wrist, and palm. Several approaches have been proposed to predict human trajectories in similar scenarios, such as [2], [3], and [4]. However, these works consider only one or two joints, such as the wrist and elbow, which is not enough to make sure that the robot can avoid the human AH, especially in this narrow workspace. In [5], an adaptive method is used to predict the human hand trajectory. However, it is hard to use in our task because the problem dimension is too high to adjust the weights online. In general, creating an accurate dynamic model for human AH motion prediction is difficult, especially for different persons. Therefore, state-of-the-art methods, e.g., [6] and [7], are based on data-driven models. Inspired by [6], we also adopt a position-velocity encoder-decoder neural network for AH trajectory prediction.
In fact, we could train a neural network to predict the short-term AH trajectory and estimate the intended final target position at the same time. However, training such a multitask neural network would be challenging because there are more parameters to tune, and it would also be hard to take semantics (like a set of known target positions) into account for human intention prediction. Instead, probabilistic methods have been preferred for intended target inference or motion regression [8], [9], and they can generalize well to new scenarios. The computational load of these methods is quite low when the number of possible target positions is limited (a total of 12 targets in this article, as shown in Fig. 2), so they are efficient enough for fast motion prediction. Besides, the probabilistic methods can be trained quickly in an unsupervised way. However, the above work estimates human intentions based only on the observed trajectory and is therefore not suitable for our task. In our scenario, the targets are very close to each other (with 10 cm in between), and the initial parts of the trajectories (about 50%) are very similar, as shown in Fig. 3(a) and (b). To increase the target estimation accuracy during the early stage of the reaching motion, we propose using both observed and predicted palm trajectories based on Gaussian mixture models (GMMs).

Safe robot trajectories were generated online by solving two optimization problems in [10] and [11], but under the assumption that the AH trajectory prediction was known already. Based on the predicted AH trajectory from our proposed motion prediction module, we can efficiently generate a safe robot trajectory by solving only one optimization problem with fewer objective functions. To significantly reduce the number of geometric constraints in the trajectory optimization problem, we employ capsules to model AH and robot links instead of a large number of spheres as in previous work.

© 2023 The Authors. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/ Authorized licensed use limited to the terms of the applicable license agreement with IEEE. Restrictions apply.
To summarize, the key contributions of this work are as follows.
1) We propose an HRC pipeline that combines high-level and multijoint trajectory prediction and intended target estimation with low-level online collision-free trajectory generation for HRC tasks.

II. RELATED WORK

A. Human Trajectory Prediction
Different methods have been proposed for human trajectory prediction. Human joint-space trajectories are predicted based on dynamical movement primitives (DMPs) and then used to predict human joint torques for intention estimation during walking [12]. Another category of algorithms for AH motion prediction is based on inverse optimal control (IOC), which tries to approximate a cost function explaining the observed behavior, e.g., [13]. However, with IOC, the goal information needs to be known first, which is not possible for the task we are interested in. Other work predicts human motions with explicitly defined dynamic equations derived from the physical theory, such as [14], [15], and [16]. But it is hard to model human dynamics, and the model-based methods usually only work well for a very short time horizon.
There are also some pattern-based approaches. They can learn complex dynamic models from data sets based on various approximation methods (e.g., neural networks, hidden Markov models, and GMMs). Luo et al. used GMMs to model the human AH trajectory in [17]. This unsupervised method can generalize to new persons by dynamically updating or generating new models. Wang et al. [6] proposed a position-velocity recurrent encoder-decoder neural network (PVRED). A velocity connection is added to the input of the long short-term memory (LSTM), and their results show that this method can achieve a better performance than previous results. We revised this model to predict the AH trajectory in our scenario.

B. Human Intention Estimation
Human gaze, gestures, electroencephalography (EEG), electromyography (EMG), etc., could be used for human intention estimation [18], [19], [20]. Here, we focus on the algorithms that make use of human reaching motions. Arpino and Shah predicted the reaching target by time-series classification in [21]. They encoded each time step as a multivariate Gaussian distribution and calculated the class posterior probability with the observed trajectory. The result shows that they can achieve a rather accurate target prediction. A similar idea has been promoted in [17], where GMMs are used to approximate one class of trajectories. Landi et al. [22] combined the minimum jerk model with an adaptive neural network to predict whether the human will react to the robot end effector. The similarity between the observed short-term movements and the learned user behavior was used to predict the human reaching goal in a teleoperation task [23].
The Q-learning method was also used for this task. Cheng et al. [2] proposed that humans optimize a reward function during the pick-and-place task, related to the distance and velocity from the human hand to the target position. Assuming that the human motion follows a Boltzmann policy, they estimated the posterior probability distribution over all targets based on the observed trajectory. However, this method does not work so well when targets are located close to each other (e.g., 10 cm in our scenario), because there will be several similarly probable target positions in this situation, especially during the initial motion stage.
Therefore, we will use GMMs as the probabilistic model for target estimation instead of end-to-end deep learning methods. The benefit of GMMs is that they are easier to train and also provide us with probability information. As the target positions in our task are close to each other, the trajectories are very similar at the early stage. Unlike the work mentioned above, we make use of both the observed AH trajectory and the short-term prediction as the input of GMMs to improve the estimation accuracy at the beginning of the reaching motion.

C. Online Trajectory Generation
Only specific motion planning algorithms can deal with the dynamic obstacle-avoidance problem, such as trajectory optimization [24] and sampling-based methods [25]. Considering the whole volume of obstacles across all prediction time steps for safe trajectory generation results in conservatively planned trajectories. Zheng et al. [11] proposed a framework to deal with this problem. They reformulate the obstacle-avoidance problem into two quadratic programming (QP) problems. This way, they can generate a collision-free trajectory very fast. However, in some scenarios, e.g., when the separating plane used in their approach is close to vertical, the generated trajectory is not safe anymore because of local minima and the linearized kinematics. Other works, such as [26] and [27], generated collision-free and custom-preferred waypoints in Cartesian space online, without considering the dynamics limitations. They then control the robot end effector to track these points.
In our recent work [28], we solved a trajectory tracking problem in a static environment. In this article, we model a predicted AH trajectory as several moving capsules and solve the trajectory optimization problem in a model predictive control (MPC) style. A set of penalty terms is added to the cost function to efficiently generate a smooth and safe trajectory for the dynamic HRC task.

III. METHODOLOGY
As mentioned above, our proposed HRC pipeline is divided into three main parts, namely, trajectory prediction, final target estimation, and online trajectory generation, as shown in Fig. 4.

A. Human Trajectory Prediction and Target Estimation
Predicting the human arm trajectory is the fundamental step in our system. The predicted human trajectory is not only employed to infer the intended hand position but also enables the controller to generate a safe trajectory.
To predict a trajectory, we use the encoder-decoder structure of the sequence-to-sequence (Seq2Seq) model similar to [6]. This encoder-decoder model is based on LSTM [29], which can account for dependencies in long sequence data. The architecture of the encoder-decoder network in this article is shown in Fig. 5.
Although the trained GMMs described below could also be used for regression, the shape of the predicted trajectory is not similar to the actual trajectory [17]. Hence, we still make use of the Seq2Seq model for motion prediction.
The Seq2Seq model can be expressed as $\hat{X}_{i,t+1:t+T} = f(X_{i,0:t})$ and is trained on a data set $D = (X_i)_{i=1}^{N}$, where $N$ is the number of demonstrated trajectories, $t$ is the observed trajectory length, and $T$ is the length of the predicted trajectory. $X_{i,t}$ are the positions of the human shoulder, elbow, wrist, and palm in the Cartesian space. We split every trajectory into pieces, i.e., $X_{i,0:t+T}$. The input of the model is the observed trajectory $X_{i,0:t}$ and the labels are $X_{i,t+1:t+T}$. Our goal is to train this model to make the prediction $\hat{X}_{i,t+1:t+T}$ close to $X_{i,t+1:t+T}$. The loss function for the network is a weighted prediction error over the four markers, with the weight values chosen by grid search.

With the data set $D$, we also train a GMM library $G = (g_m)_{m=1}^{M}$ by the well-known unsupervised expectation-maximization (EM) algorithm, where $M$ represents the number of potential target objects. For the target position estimation task, we just use the trajectory of the human palm in the data set. This is because the palm motion encodes the most relevant information for trajectory classification [17]. The observed trajectory $X_{ob}$ is an $m \times n$ matrix, where $m$ is the number of waypoints and $n$ is the number of dimensions per waypoint. We use $K$ multivariate Gaussians $(gc_k)_{k=1}^{K}$ to approximate every $g_m$ in $G$.

So, the probability of $X_{i,t}$, one trajectory point at time step $t$ during demonstration $i$, belonging to $g_m$ is given by the mixture of the $K$ components, where the probability of $X_{i,t}$ given $gc_k$ and $g_m$ is a Gaussian distribution. Due to the similarity of the trajectories at the initial stage of the reaching motion, we will use both the observed and predicted trajectories for target position estimation. Then, the probability of the observed trajectory $X_{i,0:t}$ together with the predicted trajectory $\hat{X}_{i,t+1:t+T}$ given $g_m$ is seen in (4), where $X_{i,s} \in X_{i,0:t} \cup \hat{X}_{i,t+1:t+T}$. According to the Bayesian rule, the log likelihood of $g_m$ given $X_{i,0:t}$ and $\hat{X}_{i,t+1:t+T}$ is given by (5). We will choose $g_m$ with the highest posterior probability as the intended target position.
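The target estimation step can be illustrated with a minimal NumPy sketch. It assumes that waypoints are treated as independent given a GMM, so the trajectory log likelihood is a sum of per-point mixture log densities, and that an optional log prior is added before taking the maximum. The function names, the toy two-target library, and the numerical values are hypothetical and are not taken from the article.

```python
import numpy as np

def gaussian_logpdf(x, mean, cov):
    """Log density of one multivariate Gaussian at each row of x."""
    d = mean.shape[0]
    chol = np.linalg.cholesky(cov)
    # Solve L z = (x - mean)^T so that sum(z**2) is the squared Mahalanobis distance.
    z = np.linalg.solve(chol, (x - mean).T)
    maha = np.sum(z ** 2, axis=0)
    log_det = 2.0 * np.sum(np.log(np.diag(chol)))
    return -0.5 * (d * np.log(2.0 * np.pi) + log_det + maha)

def gmm_loglik(points, weights, means, covs):
    """Sum over waypoints of log sum_k pi_k N(x | mu_k, Sigma_k)."""
    comp = np.stack([np.log(w) + gaussian_logpdf(points, m, c)
                     for w, m, c in zip(weights, means, covs)])
    peak = comp.max(axis=0)
    per_point = peak + np.log(np.exp(comp - peak).sum(axis=0))  # log-sum-exp
    return per_point.sum()

def estimate_target(observed, predicted, library, log_prior=None):
    """Index of the GMM with the highest (unnormalized) log posterior."""
    points = np.vstack([observed, predicted])
    scores = np.array([gmm_loglik(points, *g) for g in library])
    if log_prior is not None:
        scores = scores + log_prior
    return int(np.argmax(scores)), scores

# Toy library (hypothetical): one single-component GMM per target.
library = [
    (np.array([1.0]), [np.array([0.0, 0.0, 0.0])], [np.eye(3) * 0.01]),
    (np.array([1.0]), [np.array([0.3, 0.0, 0.0])], [np.eye(3) * 0.01]),
]
observed = np.array([[0.10, 0.0, 0.0], [0.15, 0.0, 0.0]])
predicted = np.array([[0.25, 0.0, 0.0], [0.30, 0.0, 0.0]])

with_prediction, _ = estimate_target(observed, predicted, library)
observed_only, _ = estimate_target(observed, predicted[:0], library)
```

In this toy case the observed points alone are closer to the first model, while appending the predicted points moves the decision to the second model, which is the effect the combined input is meant to exploit.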

B. Online Trajectory Generation
Our optimization-based online trajectory generation method allows the robot to perform manipulation tasks while, at the same time, avoiding human AH motions, workspace boundaries, joint position limits, and dynamic constraints.
A limited quadratic position loss $l_P$ is calculated from the distance between end-effector position $P_E$ and goal position $P_G$. This loss is clipped at a maximum value $m$ to avoid overriding other objectives such as collision avoidance, i.e., $l_P = \min(m, \lVert P_E - P_G \rVert^2)$. We also minimize a quadratic orientation loss $l_R$ between end-effector orientation matrix $R_E$ and goal orientation matrix $R_G$.

Robot joint angles $(p_i)_{i=0}^{T_R}$ are constrained by joint position limits $(p_{i,\min}, p_{i,\max})_{i=0}^{T_R}$, velocity limits $(v_{i,\min}, v_{i,\max})_{i=0}^{T_R}$, and acceleration limits $(a_{i,\min}, a_{i,\max})_{i=0}^{T_R}$, where $T_R$ is the length of the generated robot trajectory per optimization loop.

During the real experiment, time delay exists due to calculation and communication between different modules. To prevent robot motion jumps because of the trajectory replacement between adjacent optimization loops, we constrain the first two steps of any newly generated trajectory to be the same as the corresponding steps in the previous trajectory.
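As a rough illustration of the clipped position loss and the joint limits, the following sketch evaluates both for a given trajectory. It is only a check, not the optimizer itself. It assumes that velocities and accelerations are obtained by finite differences over the step size, and the clipping value and limits used in the example calls are hypothetical.

```python
import numpy as np

def clipped_position_loss(p_e, p_g, m):
    """Quadratic end-effector position loss, clipped at the maximum value m."""
    diff = np.asarray(p_e, dtype=float) - np.asarray(p_g, dtype=float)
    return min(m, float(np.sum(diff ** 2)))

def within_limits(q, dt, q_lim, v_lim, a_lim):
    """Check joint positions and finite-difference velocities/accelerations.

    q has shape (steps, joints); each limit is a (min, max) pair.
    """
    q = np.asarray(q, dtype=float)
    v = np.diff(q, axis=0) / dt
    a = np.diff(v, axis=0) / dt
    ok = lambda x, lim: bool(np.all((x >= lim[0]) & (x <= lim[1])))
    return ok(q, q_lim) and ok(v, v_lim) and ok(a, a_lim)

# Near the goal the loss is quadratic; far away it saturates at m (assumed 0.25).
near = clipped_position_loss([0, 0, 0], [0.1, 0, 0], m=0.25)
far = clipped_position_loss([0, 0, 0], [2.0, 0, 0], m=0.25)

# One joint, step size 0.1 s, hypothetical limits.
slow = within_limits([[0.0], [0.001], [0.002]], 0.1, (-3.14, 3.14), (-0.02, 0.02), (-1.0, 1.0))
fast = within_limits([[0.0], [0.01], [0.02]], 0.1, (-3.14, 3.14), (-0.02, 0.02), (-1.0, 1.0))
```

Because the loss saturates at `m`, a distant goal cannot outweigh other cost terms, however far away it is.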
We also add a velocity and acceleration regularizer $r_i$ with weights $b$ and $c$ to prefer smooth motions.

Finally, we need to avoid collisions between the robot and humans as well as between the robot and the fixed workspace boundaries. We model human limbs and robot links as capsule-shaped collision objects, i.e., cylinders with hemispherical caps as shown in Fig. 6(a). This results in a much lower number of collision geometries compared to the familiar representation using sets of collision spheres. Capsules with radius $(r_{H,j})_{j=0}^{T}$ are created between all connected human joints with predicted positions $(P_{H,j})_{j=0}^{T}$, and capsules with radius $(r_{R,k})_{k=0}^{T_R}$ are created between all connected robot joints with positions $(P_{R,k})_{k=0}^{T_R}$. Based on the method provided by Sunday [30], we compute the pairwise closest distances between the segments connecting robot joints and the segments connecting human joints. Subtracting the capsule radii from these distances gives the pairwise closest distances between the human limb capsules and robot link capsules, as shown in Fig. 6(b). We use the directions of the shortest distance vectors as separating plane normals $N_{j,k}$, as shown in Fig. 6(c).
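The capsule distance computation can be sketched as follows. The code uses a standard clamped closest-point computation between two line segments, which may differ in detail from the method in [30], and then subtracts both radii. The function names and the example geometry are hypothetical. The normal is taken to point from the second capsule to the first, which is an assumed convention.

```python
import numpy as np

def closest_points_on_segments(p1, q1, p2, q2, eps=1e-12):
    """Closest points between segment p1-q1 and segment p2-q2."""
    p1, q1, p2, q2 = (np.asarray(v, dtype=float) for v in (p1, q1, p2, q2))
    d1, d2, r = q1 - p1, q2 - p2, p1 - p2
    a, e, f = d1 @ d1, d2 @ d2, d2 @ r
    if a <= eps and e <= eps:            # both segments degenerate to points
        s = t = 0.0
    elif a <= eps:                       # first segment is a point
        s, t = 0.0, float(np.clip(f / e, 0.0, 1.0))
    else:
        c = d1 @ r
        if e <= eps:                     # second segment is a point
            t, s = 0.0, float(np.clip(-c / a, 0.0, 1.0))
        else:
            b = d1 @ d2
            denom = a * e - b * b        # zero for parallel segments
            s = float(np.clip((b * f - c * e) / denom, 0.0, 1.0)) if denom > eps else 0.0
            t = (b * s + f) / e
            if t < 0.0:                  # clamp t and recompute s
                t, s = 0.0, float(np.clip(-c / a, 0.0, 1.0))
            elif t > 1.0:
                t, s = 1.0, float(np.clip((b - c) / a, 0.0, 1.0))
    return p1 + s * d1, p2 + t * d2

def capsule_distance(p1, q1, r1, p2, q2, r2):
    """Signed gap between two capsules and the unit normal separating them."""
    c1, c2 = closest_points_on_segments(p1, q1, p2, q2)
    gap = c1 - c2
    axis_dist = float(np.linalg.norm(gap))
    normal = gap / axis_dist if axis_dist > 0.0 else np.zeros(3)
    return axis_dist - r1 - r2, normal

# Hypothetical forearm and robot link, both with 10 cm radius.
dist, normal = capsule_distance([0, 0, 0], [1, 0, 0], 0.1,
                                [0.5, 0.5, -1], [0.5, 0.5, 1], 0.1)
```

A negative returned distance means the two capsules overlap. Each human-robot capsule pair needs only one such segment query, which is where the saving over many sphere-sphere checks comes from.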
For human-robot collision avoidance, we add a set of penalty terms $q_{j,k}$ with the desired minimum distance $d$. We use penalty terms instead of hard constraints because the speed at which the human approaches the robot might be faster than the maximum velocity at which the robot is allowed to move. In such cases, with a hard constraint, the problem could become infeasible, and the robot might stop, provoking a collision. With a soft penalty, the robot will keep moving away from the human as quickly as possible. Since the position loss $l_P$ is limited, collision avoidance still has precedence over reaching the target.
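The difference between a soft penalty and a hard constraint can be illustrated with a one-sided quadratic penalty. The exact form of the penalty terms is not reproduced in this text, so the hinge-squared shape and the weight below are assumptions made purely for illustration.

```python
def collision_penalty(distance, d_min, weight=1.0):
    """One-sided quadratic penalty (assumed form).

    Zero when the capsules are at least d_min apart, growing quadratically
    with the violation otherwise. It is finite for any distance, so the
    optimization problem never becomes infeasible.
    """
    violation = max(0.0, d_min - distance)
    return weight * violation ** 2

safe = collision_penalty(0.30, d_min=0.0)                    # capsules apart
overlap = collision_penalty(-0.05, d_min=0.0, weight=100.0)  # 5 cm overlap
```

With a clipped position loss, a sufficiently large weight makes the penalty exceed the maximum attainable goal cost once the violation grows, so the optimizer prefers moving away from the human over reaching the goal.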
In many scenarios, a large open space will be available into which the robot can safely retreat. When avoiding human motions, it may be preferable for the robot to move toward this area. We therefore support an optional bias $B$. In our experiments, the robot can safely move upward into a large open area above the table, so we set the bias $B$ to $(0, 0, 0.5)$.
Finally, the workspace boundaries are enforced by a set of planes with normals $(N_{B,m})_{m=1}^{6}$ and offsets $(o_{B,m})_{m=1}^{6}$, as shown in Fig. 6(d). We add one inequality constraint for each plane and robot sphere.

At each time step, we optimize a trajectory with ten future time steps and a step size of 0.1 s. The trajectory is reoptimized at 10 Hz with the most recent human motion predictions. As the basis for our implementation, we solve the optimization problem via sequential QP using a primal-dual interior-point method described in [28].
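A minimal sketch of the six-plane workspace check is given below. It assumes the sign convention that a sphere with a given center and radius is inside the workspace when the plane normal dotted with the center, minus the offset, is at least the radius, for every plane. The box dimensions are hypothetical and only serve as an example.

```python
import numpy as np

# Hypothetical box: x and y in [-0.5, 0.5] m, z in [0.0, 1.0] m.
# Four vertical planes and two horizontal planes, normals pointing inward.
NORMALS = np.array([[1, 0, 0], [-1, 0, 0],
                    [0, 1, 0], [0, -1, 0],
                    [0, 0, 1], [0, 0, -1]], dtype=float)
OFFSETS = np.array([-0.5, -0.5, -0.5, -0.5, 0.0, -1.0])

def plane_margins(center, radius):
    """Clearance of a sphere to each plane; all values >= 0 means inside."""
    return NORMALS @ np.asarray(center, dtype=float) - OFFSETS - radius

inside = plane_margins([0.0, 0.0, 0.5], 0.1)
too_close = plane_margins([0.45, 0.0, 0.5], 0.1)

# Planning horizon implied by ten steps of 0.1 s each.
HORIZON_S = 10 * 0.1
```

Each margin is linear in the sphere center, which is why such boundaries fit naturally as inequality constraints in a QP-based solver.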

IV. EXPERIMENTS
In this part, we evaluate the proposed method with real human motion data and the robot system in a desktop assembly task scenario as shown in Fig. 1. The results show that our method generates collision-free trajectories and helps the robot to collaborate with the human more efficiently and safely.

A. Trajectory Prediction and Target Position Estimation
Many devices have been developed for body tracking [31], [32]. In our experiment, four LED markers were placed on the human shoulder, elbow, wrist, and palm, respectively (Fig. 1). While the subject was performing the pick-and-place task, AH motions were recorded by a PhaseSpace Impulse X2 motion-capture system. We collected a data set of pick-and-place trajectories from five healthy subjects (four male and one female) of different body heights. The data were recorded at 270 Hz and afterward resampled down to 27 Hz for the use of the Seq2Seq neural network. The number of recorded human trajectories per subject is 240, and 25% of the data set per subject is held out for testing.
In the training phase of the Seq2Seq neural network, the data set was split into equally sized trajectory pieces of 0.7 s duration each. The first 0.35 s part of the trajectory (ten steps) is the input of the neural network, and the remaining 0.35 s trajectory (ten steps) is the label of the neural network output. One pick-and-place cycle consists of about 32 steps. Finally, the training data set contains 48.6K samples, and the test data set contains 14.6K samples.
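The preprocessing can be sketched as a downsampling step followed by a sliding window. The resampling method and the window stride are not stated in the text, so plain decimation by a factor of ten and a stride of one step are assumptions here, as are the function names.

```python
import numpy as np

def downsample(traj, factor=10):
    """Keep every factor-th frame (270 Hz -> 27 Hz for factor 10)."""
    return traj[::factor]

def make_windows(traj, n_in=10, n_out=10):
    """Slide a window over one trajectory and return (inputs, labels).

    Assumes the trajectory has at least n_in + n_out frames.
    """
    total = n_in + n_out
    starts = range(len(traj) - total + 1)
    inputs = np.stack([traj[s:s + n_in] for s in starts])
    labels = np.stack([traj[s + n_in:s + total] for s in starts])
    return inputs, labels

# Synthetic recording: 320 frames, 4 markers, 3 coordinates each.
raw = np.arange(320 * 4 * 3, dtype=float).reshape(320, 4, 3)
cycle = downsample(raw)                # 32 steps, as for one pick-and-place cycle
inputs, labels = make_windows(cycle)   # ten observed steps, ten label steps
```

Under these assumptions a 32-step cycle yields 13 overlapping training pieces, each with ten input steps and ten label steps.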
See Fig. 2 for the layout of the 12 target positions. They are placed in two rows, six targets per row. Targets in the same row are at an interval of 10 cm, and the distance between these two rows is 20 cm. The IDs for targets are numbered top-down and left-to-right from the human viewpoint, so 1, . . ., 6 in the first and 7, . . ., 12 in the second row.
During the initial training experiment, an LSTM layer with a 128-D hidden state was chosen. The Seq2Seq neural network is trained using PyTorch [33] with a batch size of 128 and a teacher forcing rate of 0.6. The learning rate is initialized to 0.005, with an exponential decay rate of 0.01. To speed up the training process, the batch normalization (BN) technique is also used. The weights in the loss function (hyperparameters) were set to 0.08, 0.16, 0.32, and 0.44 by grid search.
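A weighted per-marker loss with the reported weights could look like the sketch below. Two things are assumptions: that the error is a mean squared position error, and that the four weights map to shoulder, elbow, wrist, and palm in that order. The text states neither.

```python
import numpy as np

# Reported loss weights; the marker order (shoulder, elbow, wrist, palm)
# is an assumption made for this sketch.
MARKER_WEIGHTS = np.array([0.08, 0.16, 0.32, 0.44])

def weighted_prediction_loss(pred, target, weights=MARKER_WEIGHTS):
    """Weighted mean squared position error over four markers.

    pred and target have shape (steps, 4, 3).
    """
    sq_err = np.sum((pred - target) ** 2, axis=-1)  # (steps, 4)
    per_marker = sq_err.mean(axis=0)                # (4,)
    return float(weights @ per_marker)

target = np.zeros((10, 4, 3))
palm_off = target.copy()
palm_off[:, 3, 0] = 0.1       # 10 cm error on the last marker only
shoulder_off = target.copy()
shoulder_off[:, 0, 0] = 0.1   # same error on the first marker only

loss_palm = weighted_prediction_loss(palm_off, target)
loss_shoulder = weighted_prediction_loss(shoulder_off, target)
```

The four reported weights sum to one, so the loss is a convex combination of the per-marker errors, and the same position error costs more on a marker with a larger weight.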
The test results for trajectory prediction are plotted in Fig. 7(a). We can see that the biggest prediction error is from the palm and wrist because the motions of these two joints vary a lot. The prediction errors for all joints are less than 2 cm over five time steps.
For predicting the target position of the human reaching motion, 12 GMMs are trained with Python [34] on data set $D = (X_{i,0:T})_{i=1}^{N}$, with 75% used for training and 25% for testing. The hyperparameter $K$ of GMMs was set to 18 based on the Bayesian information criterion (BIC) [35]. As mentioned above, the initial parts (about 40%) of the recorded human trajectories are very similar to each other, and the data points overlap, as shown in Fig. 3(b). This makes it very difficult to classify the trajectory at the early stage, see Fig. 7(b). Based only on the observed motion waypoints, the GMM classification (orange curve) is initially mostly random, but improves after about 40% of the trajectories have been observed. Note that some false classifications remain even after 60% of the human motion is known.
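For reference, the BIC used to choose the number of components trades off the fitted log likelihood against the number of free parameters. The sketch below uses the standard definition and assumes full covariance matrices and three-dimensional palm positions. The covariance type and waypoint dimension are assumptions, since the text does not state them.

```python
import math

def gmm_free_parameters(k, d):
    """Free parameters of a k-component GMM with full covariances in d dims.

    (k - 1) mixing weights, k*d mean entries, and k*d*(d+1)/2 covariance entries.
    """
    return (k - 1) + k * d + k * d * (d + 1) // 2

def bic(log_likelihood, k, d, n_samples):
    """Bayesian information criterion; lower values are preferred."""
    return gmm_free_parameters(k, d) * math.log(n_samples) - 2.0 * log_likelihood

params_k18 = gmm_free_parameters(18, 3)
```

Under these assumptions a model with 18 components has 179 free parameters. In practice one would fit GMMs for a range of candidate component counts and keep the count with the lowest BIC.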
However, our LSTM network has already learned to disambiguate between the different reaching motions and to predict the human palm positions for the next few time steps. Therefore, we can significantly improve the accuracy of the reaching target classification during an ongoing motion by feeding the GMMs with a few predicted hand positions in addition to the observed positions.

Fig. 8. The figures in the upper row show the confusion matrices for intended position estimation using only the observed trajectories. The figures in the lower row show the confusion matrices for intended position estimation using observed and predicted trajectories. Each column corresponds to the condition that 20%, 30%, 40%, 50%, and 60% of the reaching trajectory have been observed.
The resulting improvement of the classification as a function of the percentage of observed trajectory is shown in Fig. 7(b) (blue curve). To analyze the algorithm's performance in more detail, we also present the corresponding confusion matrices, see Fig. 8. The diagrams show the classification results without and with the LSTM predictions (upper/lower row) after 20%, 30%, 40%, 50%, and 60% of a human reaching trajectory have been input to the GMMs. In the upper row, the initially rather random behavior can clearly be seen, only improving after 50% and 60% of the human motions are observed. Still, some false classifications between targets 3 and 10 and targets 6 and 12 remain; both are easily explained by the experiment layout, where the human hand passes over targets 3 and 6 to reach targets 10 and 12. With trajectory prediction enabled, much more accurate and robust estimates are obtained if at least 30% of the human motion has been observed.
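A confusion matrix of this kind can be computed as in the following sketch. The label lists are made-up illustrative values, not results from the experiments, and the function names are hypothetical.

```python
import numpy as np

def confusion_matrix(true_ids, predicted_ids, n_targets=12):
    """Rows: true target, columns: estimated target (IDs are 1-based)."""
    cm = np.zeros((n_targets, n_targets), dtype=int)
    for t, p in zip(true_ids, predicted_ids):
        cm[t - 1, p - 1] += 1
    return cm

def accuracy(cm):
    """Fraction of trials on the diagonal."""
    return np.trace(cm) / cm.sum()

# Illustrative labels only: two confusions of the 3/10 and 6/12 kind.
true_ids = [3, 3, 10, 10, 6, 12]
estimated_ids = [3, 10, 10, 10, 6, 6]

cm = confusion_matrix(true_ids, estimated_ids)
acc = accuracy(cm)
```

Off-diagonal entries such as `cm[2, 9]` (true target 3 estimated as target 10) make systematic confusions between specific target pairs directly visible.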

B. Online Trajectory Generation
The proposed online trajectory generation method is tested on a 6-DoF UR10e manipulator arm with a Shadow C6 hand. The testing scenario is shown in Fig. 1. To ensure human safety, the maximum joint velocity of the arm is limited to 0.02 rad/s and the acceleration is limited to 1 rad/s² (velocity and acceleration constraints described above). The radius for every capsule is 10 cm and the collision margin $d$ between capsules is set to zero.
The trajectory optimizer is implemented in C++ using the Eigen library for linear algebra functions. All of the modules communicate with each other via the ROS [36] platform. We use MoveIt [37] to load the robot model, and Roscontrol [38] with the ur_modern_driver [39] to command the robot in real time. We calculate the predicted LSTM and GMMs results at around 20 Hz, while the trajectory optimizer runs at 10 Hz.
This experiment has two phases: 1) a reaching phase and 2) a staying phase. In the reaching phase, the human takes one screw bolt from the initial position and puts it to target 3. Once the human moves from the initial position, the robot also starts to move from target 7 to the position of target 6. After the bolt is placed, the human stays at target 3, working for 5 s, while the robot is required to continue its task; this is the staying phase. Finally, the human and the robot return to their initial positions. To compare the performance between a reactive controller (considering only the current AH positions during robot trajectory generation) and the predictive controller (considering current and predicted AH positions during robot trajectory generation), we did the experiment multiple times, as shown in Figs. 9 and 10. In this scenario, the predictive controller uses five predicted palm positions to improve the target estimation at the early stage.
Fig. 9 shows the experimental results of the reactive controller. The figures from the first three columns show the human reaching phase. Due to the lack of human motion prediction, the robot starts moving toward its target, but also toward the human arm. When the distance between the human arm and robot becomes less than the predefined threshold (the third column of the figures), the robot automatically adjusts its motion to avoid the human arm and then continues to move to target 6. The overall motion is still collision-free of course, but far from optimal for the robot. Fig. 10 presents the performance of the predictive controller. As shown by the figures in the third column, the robot predicts much sooner that the human will enter the shared workspace, and the trajectory to target 6 is planned accordingly. Importantly, the closest distance between the human arm and the robot is also larger than that in Fig. 9, which means that the predictive controller is safer than the reactive controller.
For a direct comparison of the generated robot trajectories, Fig. 11 plots three successive task executions (reach, stay, return) for both controllers in a single diagram, where the large detour taken by the reactive controller is clearly visible for the reach phase. The trajectories from the predictive controller also include some noise initially, but become smooth once the target prediction remains stable. The return trajectories are almost optimal (straight lines) for both controllers. We also analyzed the minimum distance between the human palm and the robot's index fingertip, the length of the robot trajectories, and the execution time under both controllers, see Table I. As expected, the trajectories generated by the predictive controller are shorter, and the minimum distance between the human and robot with the predictive controller is larger than for the reactive controller. These results suggest that the predictive controller is more efficient and safer than the reactive controller.

Fig. 12. Once the human intention and AH trajectory prediction based on the observed AH trajectory indicate target 3 for the human, the robot will replan its trajectory online (here to target 5 instead of target 3), keeping a safe distance so that it will not disturb the human task.
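The two geometric metrics can be computed from recorded trajectories as sketched below. The sketch assumes that the palm and fingertip trajectories are sampled at the same time stamps. The sample points are invented for illustration and are not measurements from Table I.

```python
import numpy as np

def path_length(points):
    """Sum of the distances between consecutive trajectory points."""
    points = np.asarray(points, dtype=float)
    return float(np.linalg.norm(np.diff(points, axis=0), axis=1).sum())

def minimum_distance(a, b):
    """Smallest distance between time-aligned samples of two trajectories."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.linalg.norm(a - b, axis=1).min())

# Invented sample points (meters), three time steps each.
fingertip = [[0.0, 0.0, 0.0], [0.3, 0.0, 0.0], [0.3, 0.4, 0.0]]
palm = [[0.0, 0.5, 0.0], [0.3, 0.5, 0.0], [0.3, 0.5, 0.0]]

length = path_length(fingertip)
closest = minimum_distance(palm, fingertip)
```

A shorter path combined with a larger minimum distance is the combination reported for the predictive controller.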

C. HRC Efficiency
In order to investigate the efficiency of the whole algorithm, the participant was asked to carry out the collaboration task in more scenarios. During the experiments, the participant picked the object from the initial position and put the object in the 12 target positions in a random order, e.g., 7→12→5→11→3→9→2→8→10→4→1→6→7 in this experiment. As an example of the task flow, the person placed a screw bolt in a target position, like target 3, and waited there for 5 s. Meanwhile, the robot needed to change its target position from target 3 to target 5 and replan its trajectory based on the predicted AH trajectory as shown in Fig. 12, to avoid disturbing the human's work. The relationships between the human intended target position and robot available target positions are shown in Table II. The main idea of the workflow design is to let the robot and the human work concurrently without interfering with each other during the collaboration process. The workflow is coordinated by the robot goal scheduler module in Fig. 4. Once the robot finished touching target 6, it continued to touch targets in the order 7→8→9→10→11→12→1 if there were no human movements in the workspace. Otherwise, the robot target position was scheduled online according to Table II. We did this experiment in two different situations for the ablation study.
In situation 1, we predicted human intention (human target position) without any AH motion prediction (NP) and replanned the robot trajectory with the predictive controller. In situation 2, human intentions were estimated with a predicted AH trajectory (WP). The trajectory of the robot was also replanned with the predictive controller. During the total duration of the experiment, we counted the number of products assembled by the robot. From Fig. 13, we can see that the HRC efficiency was reliably improved based on our algorithm strategy. The improvement was not significant, and one reason was that many human-robot target combinations are conflict-free (e.g., human's intended target was at 7, and the robot was going from target 3 to target 4). If the conflict-free human-robot target combinations were excluded, the HRC efficiency improvement between WP and NP would be more significant.
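The goal scheduling idea can be illustrated with a simple geometric rule. Table II is not reproduced in this text, so the sketch below does not implement it. Instead it uses a hypothetical rule: take the next target in cyclic order that keeps an assumed minimum separation of 20 cm from the estimated human target. The target coordinates follow the described layout of two rows of six targets, 10 cm apart within a row and 20 cm between rows.

```python
import math

def target_position(target_id, spacing=0.10, row_gap=0.20):
    """Planar position (meters) of target 1..12 in the two-row layout."""
    row, col = divmod(target_id - 1, 6)
    return (col * spacing, row * row_gap)

def schedule_goal(preferred, human_target, min_separation=0.20):
    """Hypothetical rule, not Table II: first target in cyclic order,
    starting at `preferred`, that is far enough from the human target."""
    if human_target is None:          # no human motion: keep the plan
        return preferred
    hx, hy = target_position(human_target)
    for step in range(12):
        candidate = (preferred - 1 + step) % 12 + 1
        cx, cy = target_position(candidate)
        if math.hypot(cx - hx, cy - hy) >= min_separation - 1e-9:
            return candidate
    return None                       # no admissible target

conflict = schedule_goal(preferred=3, human_target=3)
no_conflict = schedule_goal(preferred=4, human_target=7)
idle = schedule_goal(preferred=8, human_target=None)
```

With these assumed values the rule reproduces the two cases mentioned in the text: a robot heading for target 3 is redirected to target 5 when the human also intends target 3, and a robot heading for target 4 is left unchanged when the human intends target 7.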

V. CONCLUSION AND FUTURE WORK
We proposed a pipeline to improve the efficiency and safety of HRC assembly tasks. We trained a Seq2Seq neural network to predict the human AH trajectory accurately. Unlike other methods, we made use of both the observed and predicted trajectory as the input of GMMs for target estimation. As shown by our experiments, this results in a much more accurate posterior probability distribution over all potential target positions from the early stages of human motion, even if the trajectories are initially very similar to each other.
The predicted trajectory and estimated target position were then combined to generate a goal-oriented and collision-free trajectory based on a novel trajectory generation method. We evaluated the effectiveness of the whole pipeline on our real robot system, and the results demonstrated an enhanced safety and efficiency of the HRC task.
For future work, more experiments need to be conducted to test the pipeline's effectiveness in more complex scenarios. We also plan to replace the motion-capture system and use the raw data from cheap RGB-D cameras for human trajectory prediction and intention recognition. It will also be interesting to fuse more information, such as human gaze or semantic task information, to improve the target estimation accuracy and robustness.

Fig. 2. Placement of the target and initial positions. The green markers indicate the target positions (1)-(12), and the red marker at the front represents the human hand's initial (and rest) position. The black boxes show the cameras of the motion-capture system used to track human motion.

2) Not only the observed trajectories but also predicted palm trajectories are used to estimate the final intended AH target, thus increasing the target estimation accuracy during the early motion stage by unsupervised learning.
3) We evaluate the proposed HRC pipeline by real physical experiments. The results show that the robot can generate goal-oriented and collision-free trajectories to improve the efficiency and safety of HRC.

Fig. 3. Units of axes are meters. (a) Recorded human reaching trajectories (human palm) from the rest position to the twelve target positions. (b) Close-up view of the first 40% of a few reaching trajectories for target positions 2-5, demonstrating the initial overlap between the trajectories.

Fig. 4. Architecture of our proposed HRC pipeline. It contains three main parts: trajectory prediction, target estimation, and online trajectory generation. The encoder-decoder network takes the observed human arm trajectory as input and predicts the next steps of the human arm trajectory. The GMM library then estimates the goal position of the human hand based on the observed and predicted trajectories. The encoder-decoder network and the GMM library are both trained on a self-collected human arm motion data set. The goal schedule module adjusts the robot goals based on human intentions. The trajectory generator yields collision-free trajectories during the collaboration tasks.

Fig. 5. Architecture of the encoder-decoder neural network. It consists of two modules, i.e., encoder and decoder. Both encoder and decoder modules are composed of LSTM cells.

Fig. 6. Separation plane generation and workspace constraints for collision avoidance. (a) We use capsules as efficient collision geometries. Several capsules are created to cover the robot and the human arm. (b) White lines visualize the pairwise closest distances between every robot capsule and human capsule. (c) Separation planes are calculated between every human capsule and robot capsule. These planes are used in our trajectory optimization to guarantee a safe separation whenever the distance between the human and robot is less than a threshold. (d) Static workspace boundaries are defined by six planes (four vertical planes and two horizontal planes) to restrict the robot motion to the defined volume.
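The construction in Fig. 6(b) and (c) can be illustrated with a short sketch. A capsule is represented here as a line segment (its axis) plus a radius; the closest points between two axes are found with the standard clamped segment-to-segment method, and a plane is placed midway in the gap between the two capsule surfaces. The function names, the tuple representation of a capsule, and the midpoint placement of the plane are assumptions made for illustration and may differ from the formulation used in the trajectory optimization.

```python
import numpy as np

def closest_points_on_segments(p1, q1, p2, q2, eps=1e-12):
    """Closest points between segments p1-q1 and p2-q2 (clamped method)."""
    d1, d2, r = q1 - p1, q2 - p2, p1 - p2
    a, e, f = d1 @ d1, d2 @ d2, d2 @ r
    if a <= eps and e <= eps:            # both segments degenerate to points
        return p1, p2
    if a <= eps:                         # first segment is a point
        s, t = 0.0, np.clip(f / e, 0.0, 1.0)
    else:
        c = d1 @ r
        if e <= eps:                     # second segment is a point
            s, t = np.clip(-c / a, 0.0, 1.0), 0.0
        else:
            b = d1 @ d2
            denom = a * e - b * b        # zero when the segments are parallel
            s = np.clip((b * f - c * e) / denom, 0.0, 1.0) if denom > eps else 0.0
            t = (b * s + f) / e
            if t < 0.0:                  # clamp t, then recompute s
                s, t = np.clip(-c / a, 0.0, 1.0), 0.0
            elif t > 1.0:
                s, t = np.clip((b - c) / a, 0.0, 1.0), 1.0
    return p1 + s * d1, p2 + t * d2

def separation_plane(robot_capsule, human_capsule):
    """Return (normal, offset, gap) of a plane {x : normal @ x = offset}.

    The normal points from the human capsule toward the robot capsule, and
    gap is the distance between the two capsule surfaces.
    """
    (rp, rq, r_radius), (hp, hq, h_radius) = robot_capsule, human_capsule
    c_robot, c_human = closest_points_on_segments(rp, rq, hp, hq)
    axis_dist = np.linalg.norm(c_robot - c_human)
    if axis_dist < 1e-9:
        raise ValueError("capsule axes intersect; no separating plane exists")
    normal = (c_robot - c_human) / axis_dist
    gap = axis_dist - r_radius - h_radius
    # Assumption: place the plane halfway between the two capsule surfaces.
    point_on_plane = c_human + normal * (h_radius + 0.5 * gap)
    return normal, float(normal @ point_on_plane), float(gap)

# Hypothetical capsules: (axis start, axis end, radius), units in meters.
robot = (np.array([0.0, 0.0, 1.0]), np.array([1.0, 0.0, 1.0]), 0.10)
human = (np.array([0.5, -1.0, 0.0]), np.array([0.5, 1.0, 0.0]), 0.05)
normal, offset, gap = separation_plane(robot, human)
```

With such a plane, a linear constraint of the form `normal @ x - offset >= r_radius` on points `x` of the robot capsule axis keeps the robot capsule on its own side of the plane, which is one way such planes could enter an optimization as inequality constraints.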

Fig. 7. Analysis of trajectory prediction and target estimation. (a) Mean trajectory prediction errors over different time horizons of the four tracked human AH joints. (b) Target estimation accuracy at the early stage, with and without the short-term trajectory prediction. The result shows that the predicted trajectory is beneficial for early target estimation.

Fig. 8. Figures in the upper row show the confusion matrices for intended position estimation using only the observed trajectories. The figures in the lower row show the confusion matrices for intended position estimation using observed and predicted trajectories. The columns correspond to the conditions in which 20%, 30%, 40%, 50%, and 60% of the reaching trajectory have been observed.

Fig. 9. Experiments with a reactive controller (no prediction). The figures in the upper row visualize Cartesian trajectories of the human and the robot. The blue line represents the trajectory of the robot (Shadow hand first finger tip, moving from target 7 to target 6), and the red line represents the trajectory of the human (palm joint, reaching target 3). The photographs in the lower row show the human arm movement during the reaching phase, then the human arm stays at the target for 5 s, as shown in the last figure.

Fig. 10. Experiments with our predictive controller. Human motions, start and goal positions, as well as the dynamic constraints of the robot, are the same as in Fig. 9.

Fig. 11. Trajectories of the robot Shadow hand first finger tip during the experiments with the reactive and predictive trajectory controllers.

Fig. 12. Dynamic goal scheduling. Initially, the robot plans to go to target 3. Once the human intention and AH trajectory prediction based on the observed AH trajectory indicate target 3 for the human, the robot will replan its trajectory online (here to target 5 instead of target 3), keeping a safe distance so that it will not disturb the human task.
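The rescheduling behavior described in Fig. 12 can be sketched as a simple rule: keep the current robot goal unless the estimated human target is confident and lies too close to it, and otherwise switch to the nearest pending goal that respects a safe distance. This rule, the function name `schedule_goal`, the confidence threshold, the safe distance, and the toy target layout are all hypothetical choices for illustration; the goal schedule module of the pipeline may use a different criterion.

```python
import numpy as np

def schedule_goal(current_goal, pending_goals, human_posterior, positions,
                  conf_threshold=0.8, safe_dist=0.2):
    """Pick a robot goal index that does not conflict with the human target.

    Returns the (possibly unchanged) goal index, or None if every pending
    goal is too close to the estimated human target.
    """
    human_target = int(np.argmax(human_posterior))
    if human_posterior[human_target] < conf_threshold:
        return current_goal                      # intention still uncertain

    def is_safe(goal):
        return np.linalg.norm(positions[goal] - positions[human_target]) >= safe_dist

    if is_safe(current_goal):
        return current_goal
    safe_goals = [g for g in pending_goals if is_safe(g)]
    if not safe_goals:
        return None                              # no safe goal: robot waits
    # Prefer the safe goal closest to the one the robot was heading to.
    return min(safe_goals,
               key=lambda g: np.linalg.norm(positions[g] - positions[current_goal]))

# Toy layout (hypothetical): six targets on a line, 0.15 m apart, 0-based indices.
positions = np.array([[0.15 * i, 0.0, 0.0] for i in range(6)])
confident = np.array([0.02, 0.02, 0.90, 0.02, 0.02, 0.02])   # human heads to index 2
uncertain = np.full(6, 1.0 / 6.0)

goal_when_confident = schedule_goal(2, [2, 4, 5], confident, positions)
goal_when_uncertain = schedule_goal(2, [2, 4, 5], uncertain, positions)
```

In this toy example the robot keeps its goal while the posterior is flat and switches to another pending target once the human target estimate becomes confident and conflicts with the current goal.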

Fig. 13. Number of assembled products by the robot in two conditions. 1) NP: Human intention classification with no predicted arm trajectory. 2) WP: Human intention classification with predicted arm trajectory.

TABLE I
COMPARISON RESULTS BETWEEN PREDICTIVE AND REACTIVE CONTROLLER