Robot Programming by Demonstration: Trajectory Learning Enhanced by sEMG-Based User Hand Stiffness Estimation

Trajectory learning is one of the key components of robot Programming by Demonstration approaches, which in many cases, especially in industrial practice, aim at defining complex manipulation patterns. In order to enhance these methods, which are generally based on a physical interaction between the user and the robot, guided along the desired path, an additional input channel is considered in this article. The hand stiffness, that the operator continuously modulates during the demonstration, is estimated from the forearm surface electromyography and translated into a request for a higher or lower accuracy level. Then, a constrained optimization problem is built (and solved) in the framework of smoothing B-splines to obtain a minimum curvature trajectory approximating, in this manner, the taught path within the precision imposed by the user. Experimental tests in different applicative scenarios, involving both position and orientation, prove the benefits of the proposed approach in terms of the intuitiveness of the programming procedure for the human operator and characteristics of the final motion.

autonomously in a separate cell to a scenario in which robots and humans work together and share the same workspace. Accordingly, methods for robot programming have changed as well. Traditional approaches for robot programming, based on offline programming and simulation and on the use of the teach pendant to define the sequence of desired configurations to be reached, have been supported, and in some cases replaced, by various methods that can be classified in a broad sense as programming by demonstrations (PbD) [1]. In PbD, the user directly trains the robot to perform the desired task with different forms of interaction and different levels of abstraction. These modalities of interacting with robots allow reducing the time needed for programming a task while minimizing users' errors [2]. Moreover, they also enable nonexpert users to program robots and are, therefore, one of the key technologies that can foster the use of robotic systems, especially in small-and medium-sized companies.
If the medium/long-term goal of PbD of robotic systems aims at reproducing the training processes that typically occur among humans [1], based, for instance, on skills transfer by means of practical demonstrations and/or vocal instructions, in industrial applications, PbD methods, such as lead-through programming or walk-through programming [3], [4], [5], which consist in manually guiding the robot, recording, and then reproducing its motion, are still the most widespread. Even if these approaches date back to the early 1980 s, when some painting robots could be programmed by manual guidance [6], it is in the last decade that, thanks to the availability on the market of collaborative robots specifically designed to safely interact with humans, programming via direct kinesthetic teaching has become a real option in many applications. However, the interaction between the user and the robot is often limited to a force exchange, and the robot task is a mere reproduction of the imposed trajectories.

A. Related Work
PbD can be categorized according to different features: the demonstration can be performed on a virtual or on a real environment, the user must hand guide the robot or simply perform the task with some kind of tracking system, and different input channels can be used (e.g., visual, kinesthetic, speech, etc.). or NURBS [7], or on different types of probabilistic methods, e.g., Gaussian mixture regression [8], [9], dynamical movement primitives (DMPs) [10], probabilistic movement primitives [11], kernelized movement primitives [12], or Gaussian process regression [13]. Despite the different features, a common characteristic of almost all the proposed approaches is the need for suppressing noise and unwanted/unnecessary motions that inevitably affect human demonstrations. To cope with these issues, the basic solution consists in recording multiple demonstrations, which are performed by an expert user, and then applying some kind of averaging or optimization method. In the context of probabilistic approaches, the differences among the input trajectories are translated into different variance levels of the stochastic processes used to model the task [13], [14], but alternative methods are also possible. In [15], variations between multiple demonstrations performed by a human operator are exploited to define the boundaries of a region free from obstacles. Then, a piecewise linear segment trajectory, fully contained in this admissible region, is built up to minimize the geometric path of the robot. In this approach, the human inconsistency in repeating the same desired trajectory is not treated as a noise to be canceled but is considered a measure of accuracy required by the task. A similar approach is proposed in [16], where several trajectories obtained by observing a human operator that replicates a given task are discretized in space and then condensed in a sequence of pairs composed by the average point and the standard deviation. On the basis of these pairs, a feasible space is constructed, and then, different trajectories for a robotic system are computed optimizing criteria such as the total length, the execution time, or the energy consumption. In [17], the motions performed by a force-controlled robot directly commanded by the human operator are used to define a safe volume, i.e., a volume free from obstacles, where the trajectory, which is needed for a given task, can be planned according to a given optimization criterium. In [18], NURBS curves are used for smoothing the trajectories learned via a user demonstration in a virtual environment. Spline curves are the tool used in [19] to design a smooth trajectory from the user demonstration. In this article, the reduction of the points of the input trajectory, to make the interpolation process less expensive from a computational point of view, has as a side effect a rejection of the noise affecting the motion.
As abovementioned, the generality of the PbD methods relies on multiple demonstrations of the same task, while only a few cases are based on a one-shot demonstration despite the clear advantages in terms of required effort and time for programming. In [20], the points of the recorded trajectory are modified by interpreting them as particles interacting among them and with the surrounding environment. In [21], the points acquired after the demonstration are used to define a continuously differentiable (polynomial) trajectory that approximates the piecewise linear path joining the points within a given tolerance, initially set by the user. The idea of modifying a given motion trajectory with a maximum deviation was initially proposed in [22]. In this case, the nominal motion is defined via multiple demonstrations, but then the user has the possibility of progressively refining the trajectory inside a tube, whose radius is determined automatically, based on the variation of the training data. The PbD approach, that we propose in this article, is not dissimilar. It is based on a one-shot demonstration and a procedure for approximating the resulting trajectory in an optimal manner within a given tolerance. However, in the framework that we propose, this tolerance is not constant over the entire length of the path or computed automatically but can be specified at runtime by the operator himself. This possibility requires an additional input channel besides the standard position recording, according to a multimodal paradigm.
Especially, in the context of semistructured tasks, additional modalities of interaction can be exploited in support of the physical interaction of kinesthetic trajectory teaching. In this way, further guidance information can be transferred to the robot. The additional interaction modality should be naturally and unobtrusively adjustable by the operator to convey human intentions continuously and during the entire robot teaching execution.
In recent years, human-robot interaction has largely benefited from wearable devices for the measurements of biological signals. Physical interaction augmented by such kinds of biointerfaces allows lightweight, effective, and intuitive information exchange with minimal constraint/encumbrance on the human body. In [23], it has been demonstrated that human motor intentions can be detected with good accuracy from surface electromyography (sEMG). The online extraction of information from sEMG signals represents a valuable approach for the multimodal guidance of robotic devices. In particular, the human ability to simultaneously activate antagonistic muscles both in static postures and dynamic motions can be exploited. The central nervous system continuously exploits muscle cocontraction (i.e., coactivation of antagonistic muscles) to regulate joint mechanical impedance [24]. We experience every day that we modulate muscle cocontraction to stiffer the joints of our limbs during the execution of motor tasks that require higher accuracy, while we reduce their mechanical stiffness when low precision is required. This behavior has been experimentally investigated and confirmed by clinical trials in [25], where human joint stiffening induced by muscle cocontraction has been measured by sEMG signals during a pointing task of randomized targets. On the basis of these considerations, the idea developed in this article consists in augmenting the kinesthetic teaching of collaborative robots, with an sEMG-based measurement of the forearm muscle cocontraction, that can be used by the operator to continuously modulate the desired level of accuracy. In order to follow the natural behavior of the human motor control, low cocontraction levels are mapped onto large tolerance radii and vice versa. Furthermore, the operator is provided with a vibrotactile feedback about the actual level of cocontraction that simplifies this operation.

B. Contribution and Structure
This article presents a novel PbD algorithm that starting from a single kinesthetic demonstration provides an optimized motion trajectory subject to geometric constraints directly defined by the user during the demonstration. In this manner, it is possible to cope at the same time with: 1) the compliance with the task constraints that are directly taught by the operator by varying the hand stiffness deduced from forearm muscles' sEMG; 2) the smoothness of the path, which by virtue of the adopted parameterization based on smoothing B-spline has the minimum curvature/acceleration. Note that the regularity of the trajectory obtained as a result of a PbD approach is neglected in a large body of literature focusing on this topic. Accordingly, industrial applications where the execution time must be minimized cannot take advantage of these programming methods because of the high curvature that often characterizes the resulting trajectories, which may induce large accelerations and even vibratory phenomena. The trajectories considered in this article are the so-called smoothing B-splines [26], which are used in the industrial field, and in particular in the robotic one when it is necessary to define a trajectory as a tradeoff between the need of minimizing the approximation error with respect a given set of via points and the need of limiting the geometric curvature. Interesting enough, smoothing B-splines are also a standard tool for solving nonparametric regression problems of estimating a curve g(t) given n observations affected by random noise, i.e., y(t i ) = g(t i ) + ε i for i = 1, . . . , n [27]. In our application, the curve g(t) is the ideal trajectory that the robot should track, and the noise ε i is caused by the unwanted motions affecting the human during the demonstration phase. Therefore, the approach based on smoothing B-splines can be seen as an attempt of eliminating this noise. Furthermore, to cope with the bounds imposed by the user, the standard approach has been enhanced with constraints on the maximum deviation with respect to the original trajectory. To do this, a novel use of the human stiffness estimation has been exploited. As a matter of fact, the idea of using an estimation of the user impedance for teaching a robot is not new in general, since many examples can be found in the scientific literature where the impedance of the human operator is reflected to the robot. For instance, in [28], the user can select the stiffness of the commanded robot while teaching its position trajectory by acting on an input interface based on a linear spring-return potentiometer that maps a button position to the arm stiffness. Previously, the concept of teleimpedance was introduced in [29], where in a leader-follower teleoperation scheme, a reference command composed of both the desired motion trajectory and the impedance profile is sent from the human operator to the remotely operated robot. Note that the impedance profiles are deduced from sEMG measurements on the operator's arm. In [30], a framework for robot PbD based on a single demonstration provided by the human operator is proposed, in which the teleimpedance approach, for teaching the robot stiffness, is integrated with a generalization of the trajectories demonstrated by the user in order to let the robot adapt to different object and environment configurations. In [31], sEMG signals and kinesthetic control are exploited in a PbD task, where the positions imposed by the human operator and impedance gains deduced from the muscles activity are recorded and then applied offline to the robot. Similarly, in [32], the trajectory demonstrated by a human operator together with sEMG signals is used to command both the position and the stiffness of a robot manipulator interacting with the environment. In these cases, the signals are encoded by using DMPs models, so that they can be easily adapted to new tasks.
Differently from the abovementioned examples, in the framework proposed in this article, we are not going to estimate the user stiffness at her/his arm's endpoint (i.e., Cartesian stiffness) with the objective of making the robot replicate the human stiffness profile. In our approach, we use sEMG to estimate the overall stiffness of the user's hand (fingers and wrist joints) from the activation of the predominant extrinsic hand's antagonistic muscles located in the forearm, to provide the user with an additional programming input. This can be used to specify at runtime-i.e., during the execution of kinesthetic teaching by physically guiding the robot-the geometric constraints for a B-spline trajectory optimization problem, which minimizes the noise and curvature of the trajectory taught by the user. Importantly, since the user should be able to voluntarily modulate the additional programming input during the operation, vibrotactile feedback is provided to inform him about the actual modulated level of hand stiffness. This approach allows us to better capture the human intentions and enhance the effectiveness of the PbD method in programming complex tasks, by maintaining the same level of intuitiveness. Experiments involving ten subjects in teaching trajectories of a seven-degree-of-freedom collaborative robot, while modulating sEMG-estimated hand stiffness, show that the proposed system can be successfully used for intuitive multimodal robot teaching of tasks in semistructured environments. Without pretraining or previous experience with sEMG/vibrotactile interfacing, subjects were required to teach trajectories with instructed target hand stiffness levels in a test scenario characterized by obstacle courses with narrow passages, variations of end-effector orientation, and specific workspace locations to be reached, with the underlying idea of reproducing some mock-up welding process teaching trajectory.
The rest of this article is organized as follows. In Section II, the general scheme for robot programming based on kinesthetic teaching and sEMG signals is illustrated; then, the constrained optimization problem for the B-spline robot trajectory definition is formulated in Section III. In Section IV, the robotic setup used in the experiments is presented; in Section V, the effectiveness of the proposed PbD approach is demonstrated both in terms of characteristics of the programmed motions and intuitiveness for operators. Finally, Section VII concludes this article.

II. TRAJECTORY LEARNING TECHNIQUE
The proposed programming framework aims at defining an optimal trajectory that approximates the geometric path demonstrated by the user, directly acting on the robot, within a tolerance defined at each time instant by modulating the cocontraction level of forearm muscles. A block scheme representation of this approach is illustrated in Fig. 1. The user steers the robot along the desired trajectory by applying a wrench F h (t) at its end-effector. It is possible to make the end-effector able to follow and adapt to the force exerted by the operator in many ways, for instance, by imposing that the dynamical behavior of the robot matches the dynamics of the mass-damper system [33]. A detailed description of the controller implemented in the experimental robotic setup is reported in Section IV-C.
While the user performs the position programming task, and the position q k of the robot is recorded with a given sampling period T s , the sEMG signals that activate the muscles of the forearm are also acquired and elaborated to obtain the profile of the desired precision ρ k at each sampling instant. This basic element of the proposed strategy is described in Section II-A. Finally, the pairs (q k , ρ k ), k = 0, . . . , n, are the input of an optimization process, illustrated in Section III, that provides the trajectory q (u) to be applied to the robot. It is worth noticing that: 1) the point q k , and consequently the trajectory q (u), can be considered both in the robot joint space and in the robot workspace; the difference consists in the meaning of ρ k , which in the former case is a maximum deviation with respect to joint variable values, while in the latter is a maximum distance from the geometric path demonstrated by the user; 2) when defined in the workspace, function q (·) only specifies the geometric path of the motion; in general, the variable u can be defined independently to specify time law along the curve.

A. Desired Precision Estimation by sEMG Analysis
As already mentioned, one of the key elements of the proposed scheme is the mapping between the sEMG signals of the human operator and the desired level of accuracy. This mapping is based on a two-stage process: first, the stiffness of the operator's hand is estimated based on a simple consideration of the physiological phenomena governing the regulation of the human stiffness, and then, this value is converted in a distance via linear scaling.
Various implementations of human stiffness estimations from sEMG have been presented in the literature. For instance, in [34] and [35], sEMG-based human stiffness estimation methods are developed and used for supervisory control and bilateral teleoperation of robotic systems. Other approaches for sEMGbased stiffness estimation involve detailed Hill-based muscle modeling techniques [36] (see, e.g., [37]), where an assistive control scheme control for a knee exoskeleton device has been developed. These methods mainly differ on the desired level of accuracy required by the specific application. As can be easily understood, as much as a higher accuracy is requested, the higher will be the complexity of the calibration on different users, the difficulty of parameter identification, and the time necessary to obtain them. In the application proposed in this article, a very accurate stiffness estimation is not necessary, because the objective is to obtain a rough measurement of the overall hand stiffness continuously modulated by the operator, and therefore, a simplified model of the stiffness regulation can be adopted. Hand stiffness is the overall stiffness seen by the object grasped by the human hand (in our case, the robot end-effector) and is the result of the cumulative stiffness of each finger, which, in turn, depends on the stiffness of each joint. It is well known that the joints of the fingers are actuated by the muscles located in the forearm through a net of tendons. Moreover, studies on the human muscle geometry and joint mechanics [38], [39] have demonstrated that the total human hand stiffness is a function of hand muscle activations. Since the mechanical impedance of the human hand can be seen as a consequence of the overlapping of several muscle activations working in an antagonistic configuration [40], it is possible to consider the hand stiffness as a summation of the activations of all the muscles involved in the grasping task [39]. According to these considerations, the hand stiffness estimation can be deduced from the sEMG signals acquired from the operator's forearm. This approach has been demonstrated to be suitable in several telerobotics works [41], [42], [43], [44]. In detail, we exploit the fact that the n sEMG signals taken from the forearm, as described in Section IV-A (in our work, n = 8), are located on the n/2 couples of predominant hand antagonistic extrinsic muscles [45], according to the well-known concept of the human antagonistic actuation model [39], [41]. We can, therefore, consider that such sEMG activity captures the information related to the activation of the predominant hand muscular antagonistic actions. Formally, considering n sEMG channels e 1 , e 2 , . . . , e n measured by a bracelet placed around the forearm, as described in Section IV-A, the estimation of the stiffness σ k of the operator's hand is computed as where is the maximum value of the sum of the maximum voluntary contractions (MVCs) [46] recorded from each sEMG channel during a calibration operation, in which the human operator is asked to stiffen at maximum level her/his hand for three times while grasping the robot handle end-effector (see Fig. 2), T s is the sampling time, and k = 0, 1, 2, . . .. According to its definition, the hand stiffness estimation σ k is an index normalized in the range [0,1]. From σ k , the desired precision ρ k can be readily obtained via a linear scaling, i.e., where ρ min and ρ max are arbitrary parameters denoting, respectively, the minimum and the maximum level of accuracy in order to match the requirements of a specific robotic task of interest.

A. Trajectory Definition
The learned trajectory is defined by using the so-called B-spline curves. They represent a different mathematical formulation of standard spline parameterization based on piecewise polynomial functions. A B-spline of degree p is a parametric curve s : [u min , u max ] → R d defined as convex combinations of control points p j ∈ R d weighted by B-spline basis functions of degree p, B p j (u): More details about the definition and computation of basis functions B p j (u) are provided in Appendix A. The vectorial coefficients p j , j = 0, . . . , N, determine the shape of the curve and are generally computed by imposing approximation/interpolation conditions on a given set of data points. In the proposed application based on the direct control of the robot manipulator by the human operator via physical interaction, these points are the samples q j , j = 0, . . . , n, of the recorded trajectory. Note that q j , and accordingly p j , are vectors whose dimension d depends on the specific task. For instance, in an application where only the position of the robot end-effector must be programmed and the orientation is kept constant, q j and p j will be 3-D vectors, while if also the orientation is considered, the two vectors will include a minimal representation of the orientation, like, e.g., the Euler angles, and they will be, therefore, defined in a 6-D space.
Since the via points q j are derived from the trajectory of the robot q(t) obtained via kinesthetic teaching, an exact interpolation is certainly unsuitable because of the unwanted movements that affect the user. In the consideration of the modest precision required by the task, which is comparable to that of a human operator, the idea is to design a trajectory path that lies in the neighborhood of the given via points with the lowest possible curvature and, therefore, the lowest acceleration. To this purpose, smoothing B-splines are the ideal tool [26], being the functions s(u) that minimize the quantity where λ ≥ 0 is a parameter that can be freely chosen to control their smoothness, while w j > 0 is a parameter used to selectively weight the contribution of the squared error at a particular point q j . Parameter u j denotes the time instant at which the approximation occurs. If λ = 0, the problem becomes a standard least-squares error fitting of the given trajectory points with a B-spline. In case n = N , the spline exactly interpolates all the points q j , while if n > N, the obtained B-spline only approximates these points in a quadratic sense. However, in the latter case, the reduction of the number of control points has a significant impact on the computational load required by the optimization problem and is, therefore, desirable. For this reason, we assume that n N , considering, for instance, a fixed ratio between the number of samples q j of the input trajectory and the number of control points p j .
Interestingly enough, by rewriting in a matrix form the standard expression of the B-spline (4) as where the cost function (5) can be written as where tr(·) denotes the trace of a matrix, W = diag(w j ) > 0, j = 0, . . . n, is the weight matrix, andQ,P, B, C, and A are matrices of proper dimensions, which depend on known quantities such as q j , u j , and u j . The solution of the optimal problem is unique and can be computed analytically.
For the mathematical details, including the expression of the matrices that appear in (9) and the expression of the optimal solution, see Appendix B.

B. Constraint Definition
One of the main drawbacks of the optimization problem (10) is the critical role played by the parameter λ, which must be selected by the user. More in general, as highlighted in [47], in many problems related to robot navigation/trajectory planning based on splines, it is difficult to balance the tradeoff between the smoothness of the curve and the desired shape. For this reason, smoothing splines with prescribed tolerance have been already proposed in the literature [26]. According to this approach, the parameter λ is selected with the purpose of bounding the maximum approximation error below a given threshold ρ, i.e., However, this type of bound may lead to overconservative solutions, since the problem is global, i.e., involving the entire curve. With the proposed approach, the maximum distance between the points of the input curve and the corresponding point of the B-spline trajectory is limited in a local sense, i.e., for each point, and with a value of the threshold ρ, which may be different for each of them, i.e., ρ j , j = 1, . . . , n − 1 (being the first and last point exactly interpolated). If the goal is quite clear, the ways of implementing it can be different in terms of complexity and implications. A possible solution is based on the pointwise selection of weight coefficients w j , as in [48]. However, this approach is generally based on heuristics and does not assure the exact compliance with the bounds. For this reason, the setup of a constrained optimization problem is preferable.
Three alternative types of constraints on the control points p j have been identified, each one with pros and cons. 1) Polyhedron: For each point q j , it is assumed that where subscript k denotes the kth component of the vector (s(u j ) − q j ). Note that for each component, it is possible to select a different bound ρ j,k . By considering (6), condition (11) be rewritten as where ρ j = [ρ j,1 , . . . , ρ j,d ] T . With these linear constraints, the optimization problem (10) becomes a convex quadratic programming problem, for which efficient solutions exist [49]. 2) (Euclidean) ball: For each point q j , it must be assured that where ρ j is a scalar. This is clearly a nonlinear constraint, but the convexity, and therefore the possibility of finding a solution of the optimization problem, is ensured in any case. In addition, only a parameter is sufficient to bound the distance between each point and the approximating spline, even if the meaning in the case of trajectories, that include also the orientation, is questionable. In this case, multiple constraints, one for the position and one for the orientation, are preferable. 3) Cylinder: The error between the B-spline trajectory and the points q j is decomposed by considering the tangent direction to the motion and the orthogonal plane to this vector (a similar approach can be found, e.g., in [50]). Obviously, this type of constraint can be applied only to position trajectories in the 3-D space. Let us assume that and that n j is the vector of maximum norm (properly normalized) among t j × i, t j × j, and t j × k being i, j, and k the unit vectors defining the base frame of the application, and where × denotes the cross product between vectors. Finally, assume that b j = t j × n j . With these three vectors, it is possible to define the rotation matrix R j = [t j n j b j ] T that brings the approximation error in a frame, whose x-axis is aligned with the tangent direction while yand z-axis are within the plane orthogonal to this direction, i.e., and then it is possible to impose that and with e j = [e j,1 e j,2 e j,3 ] T . The definition of this feasible set is much more complex than balls and polyhedra, but the convexity is still preserved. Moreover, even if limited to the 3-D position trajectory, its meaning is very clear. In Fig. 3 , a pictorial representation of the three different sets for a given point q j is reported. Note that the constraints only affect the geometry of the curve, while the way in which the B-spline is tracked and the related quantities, such as velocity, acceleration, etc., are not taken into account. The choice of a particular type of constraint has a strong influence on the computation time required to solve the constrained nonlinear optimization problem. In particular, it is possible to prove that ball-and polyhedron-type constraints lead to similar computation times, while the more complex constraint based on cylinders leads to a duration that is approximately double. Moreover, the different types of constraints affect differently the approximation error and the smoothness of the final trajectory, but the variation is rather marginal, and from this point of view, the three methods are almost equivalent.

C. Constrained Trajectory Optimization
Because of the convexity of the objective function and of the admissible sets, the solution of (10) subject to (12), (13), or (14) and (15) can be efficiently found with state-of-theart numerically optimization methods. However, since usually n N , and accordingly the number of constraints (n − 1) is much larger than the number of free parameters p j , it may happen that the intersection among all the admissible sets, which is still a convex set, does not contain any solution. For this reason, a two-stage optimization problem has been devised.
In the first step, it is checked whether a number N + 1 of control points guarantees that the intersection of all the convex sets deriving by the constraints is not empty. This can be done by computing the (unconstrained) approximating B-spline, by means of (23) with λ = 0, and then by verifying if all the constraints are met. In this case, the constrained optimization problem can be finally solved (in the second step of the procedure). If the approximating B-spline is not compliant with all the constraints, then it is necessary to increase N . To illustrate this procedure, we consider a simple pick-and-place task that also requires obstacle avoidance, which is shown in Fig. 4. It is assumed that obstacles remain stationary, and an initial trajectory (red line) is given together with some bounds that define a feasible space where the optimized trajectory can lie. The approximating B-spline trajectories for increasing values of N (varying from 3 to 14) are shown in the figure with back lines, whose thickness is proportional to N (thin lines correspond to B-spline with few control points and vice versa). It is possible to notice that as N grows, B-spline curves become closer and closer to the original trajectory (which is composed of 145 points). This is confirmed by trend of the cumulative approximating error as a function of the number N + 1 of control points defining the B-spline, which is shown in Fig. 5. It can be noticed that this error decreases as N grows, even if the trend is not strictly monotonic, and many local minima are present. As a matter of fact, the error tends to zero as the number of control points N + 1 approaches the number of given via points n + 1, since in this case an exact interpolation is obtained. Note that not only the global error but also the local error at a specific via point tends to zero, as N goes to n. On the other hand, as remarked in [51], an error function such as (16) is not convex with respect to the knot location. By considering a fixed knot spacing, the variation of the number of knots also modifies their position and, therefore, causes the local minima observed in Fig. 5. In any case, by progressively increasing N , the bounds on the error are certainly met. Therefore, when the initial value of N does not assure any feasible solution, or if parameter N is not an input of the problem but can be freely chosen, it is possible to find the value that minimizes the dimension of the trajectory optimization problem by starting from the given value, or from a small value (e.g., 3), of N and then iteratively increasing this value until the approximating trajectory is compliant with all the constraints. Since the analytic solution for determining the approximating B-spline is available (see (23) in Appendix B), this procedure can be performed very efficiently and in a very small amount of time.
Once that the value of N has been determined, the constrained optimization problem based on the cost function (10) with λ = 0 can be finally solved by using off-the-shelf convex optimization solvers and exploiting the solution found in the first step as an initial guess. Note that the value of N found is only a sufficient condition for the existence of the solution, but, providing a feasible solution used as the initial condition for the constrained optimization, it allows us to speed up the numerical computation process. Algorithm 1 summarizes the overall procedure.
By considering the pick-and-place example previously mentioned, the minimum value of N that guarantees the compliance of the approximating B-spline trajectory with all the constraints   Fig. 6. In the same figure, the constrained B-spline is compared with the approximating B-spline that satisfies all the constraints (initial solution) and with the unconstrained smoothing B-spline defined with the same value of λ. It is clear that the unconstrained trajectory is the smoothest one, but it collides with the obstacles being not included in the feasible region. On the other hand, the approximating B-spline is very close to the input trajectory but reproduces also unwanted phenomena, such as involuntary movements that may be superimposed to the ideal trajectory. The constrained smoothing B-spline represents the optimal tradeoff as confirmed also by Fig. 7, where the first and second derivatives with respect to u (which are proportional to velocity and acceleration, respectively, once the motion law u = u(t) has been defined) are shown. In particular, the constrained B-spline curve exhibits intermediate values between the unconstrained smoothing B-spline and the approximating B-spline, with a substantial reduction with respect to the input trajectory. Finally, it is worth noticing that the devised optimization procedure only affects the geometry of the trajectory, while the way in which the trajectory is tracked, i.e., the motion law, is not taken into  account, because many different approaches can be adopted according to the needs. For instance, it is possible to reproduce the input trajectory with the same timing or follow the geometric path at a constant velocity or in minimum time.

IV. EXPERIMENTAL SETUP
To validate the proposed approach, the experimental setup shown in Fig. 8 has been used. It is composed of a robot manipulator equipped with a force/torque sensor and an sEMG sensing device with vibrotactile feedback. In the following, a detailed description of the hardware is given.

A. sEMG Sensing
For the sensing of the sEMG activity, eight sEMG channels are acquired from the forearm of the operator. Specifically, a gForcePro wearable armband (by OYMotion) has been used for signal acquisition. Since the predominant muscles producing flexion and extension actions in the hand are the Flexor Digitorum Superficialis and Extensor Digitorum Communis, the armband has been placed on the forearm in the proximity of such Authorized licensed use limited to the terms of the applicable license agreement with IEEE. Restrictions apply. muscles-that is, close to the proximal part of the forearm [45]. The sEMG provided by the armband is a raw signal at the sampling frequency of 1 kHz, sent to a nearby PC through a Bluetooth interface. After the acquisition, each raw sEMG channel is processed by a chain [41] composed of a 50-Hz notch filter for powerline interference cancellation, a 20-Hz high-pass filter for baseline noise reduction, and the root-mean-square filter computed over 200-ms running window.

B. Vibrotactile Feedback
The operator has been equipped with a wearable bracelet, based on a Grove vibration motor by Seeed Technology Co., that produces an adjustable vibrational skin stimulation. The intensity of such stimulation can be modulated by controlling the motor vibration magnitude, using the electronic control board Seeeduino V4.2, in a range of integer values from 100 to 255. In order to provide the operator with a vibrational feedback indicating the hand stiffness level in real time, σ k has been mapped in the input range of the bracelet. Note that this technological element is not strictly necessary for the proposed application, but is an aid that allows the user to better perceive the applied level of stiffness.

C. Robotic Manipulator
The experiments have been carried out by using the collaborative robot Panda (by Franka Emika GmbH, Munich, Germany) equipped with a 3-D printed handle and a six-axis force sensor (ATI Industrial Automation, Apex, NC, USA), placed between the handle and the robot flange. A kinesthetic teaching interface between the operator and the Panda robot has been implemented by imposing a partial compensation of the robot dynamics plus a standard proportional-derivative (PD) control along the motion directions that are constrained during the PbD tasks [33]. By using this mechanism, in the first experiment, the orientation has been kept constant. Let us consider the dynamic model of the robot manipulator, expressed in the operational space [52] C(θ,θ), and g(θ) are the corresponding quantities defined in the joint space. F c = J −T τ c is the end-effector wrench corresponding to the control (joints) torques τ c and F h is the wrench applied by the user to end-effector. The control action is characterized by three terms: 1) a compensation of the Coriolis and gravity terms based on a proper estimation of the functions Γ(θ,θ) and η(θ); 2) a PD controller where K p and K d are diagonal matrices with nonnegative elements; in particular, the gains of K p equal to zero correspond to the directions that are directly commanded by the user (in this case, the corresponding components of x ref andẋ ref are null); 3) a feedforward term proportional to the wrenchF h exerted by the user and measured by the six-axis force sensor; a low-pass filter is applied for rejecting high-frequency vibrations elasticity induced by robot's kinematic and actuation chains. By assumingF h ≈ F h , the dynamics of the robot equipped with this control along the workspace directions commanded by the user becomes where ρ(θ,θ) takes into account the mismatch between the actual Coriolis and gravity terms and the corresponding estimations and also the unmodeled dynamics like, e.g., frictional forces. Note that, even if the used controller does not modify the workspace inertia of the robot, the control action based on the external force measurement has the effect of scaling the inertia seen by the user acting on the end-effector, especially for a large value of gain α. Moreover, this term contributes to mitigating the consequences of the undesired dynamics represented by ρ(θ,θ), making the programming task by kinesthetic teaching easier.

V. EXPERIMENTAL RESULTS
In order to demonstrate the effectiveness of the proposed approach, in terms of both the intuitiveness of the programming method for the operator and the features of the final trajectory for the robotic application, various experiments have been carried out under different conditions. In particular, since the human operator plays a key role in the programming task, ten participants have been involved in the experiments to evaluate the proposed approach enhanced with sEMG measurements (age: 30 ± 5) In the following, the subjects are denoted as S1, S2, . . ., S10. The experiment has been conducted in accordance with the Declaration of Helsinki. An informed consent form has been signed by all the subjects, in which a detailed explanation of the experimental protocol has been provided.

A. Experimental Session 1: Obstacle Course With Welding Points
In this first experimental session, the subjects have been asked to lead the robot along a specific path characterized by narrow passages. In particular, as shown in Fig. 9, the robot must initially track a winding path, and then, the end-effector must be placed in specific points avoiding all the obstacles located in the workspace. As shown in Fig. 9(b), the path has been marked with different colors, according to a code that denotes the ideal level of stiffness/precision that the subjects should apply while moving the robot end-effector: maximum level (namely "highest accuracy level") in the red-coded zones, medium level (namely "intermediate accuracy level") in the yellow-coded zones, and minimum level in the remaining zones (namely "minimum accuracy level"). First, the calibration has been carried out, and thereafter, the subjects have briefly familiarized themselves with the system and started the experimental task. Note that in this experiment, the orientation of the end-effector has been kept constant and the users only act on the robot position.
In Fig. 10, the procedure for the robot motion definition is exemplified, by considering the PbD task performed by one of the subjects (S1) considered in the experiment.
1) The user teaches the robot by moving its end-effector along the prescribed path and modulates the stiffness of its hand at the same time, in order to specify the required accuracy. The color of the points along the trajectory denotes the stiffness level, ranging from 0 to 1. A simple inspection of Fig. 10(a) shows that the stiffness is augmented where a small variance with respect to the imposed trajectory is necessary, while lower values are applied where the final trajectory can differ much more from the input motion. 2) The stiffness information is translated into an admissible region about the trajectory applied by the human operator.
In particular, the bound on the variation is deduced by exploiting (3), where the minimum and maximum levels are selected on the basis of the specific application (in the experiment, ρ min = 0.005 m and ρ max = 0.025 m).
3) The smoothing B-spline is finally computed by assuming λ = 10 −5 and cylindrical constraints. By comparing the initial motion in Fig. 10(a) and the final trajectory in Fig. 10(b), the positive effect of the proposed approach is evident: the segments that are approximatively straight are further straightened and sharp corners are made smoother; in addition, unwanted fast motions of the user are canceled. These observations are confirmed by the motion profiles along the Cartesian axes reported in Fig. 11.
In this figure, the position velocity and acceleration of the initial trajectory and of the optimal B-spline are compared. Both the motion profiles are a function of the variable u ∈ [0, 1], obtained by normalizing the time of the demonstration by its duration. The use of the same parameterization also for the B-spline curve allows reproducing the original motion, including, for instance, possible stops of the end-effector at some locations. The derivatives of the demonstrated trajectory are obtained via numerical differentiation (this explains the noise superimposed to the first and above all to the second derivative), while the derivatives of the final trajectory are available in a closed form. For a specific insight into the optimized trajectory's smoothness, velocity, and acceleration characteristics, considering both single-subject and aggregated data, see Section VI-B. In order to evaluate also the intuitiveness of the proposed approach, the capability of the users of modulating the hand stiffness while operating the robot has been tested. As illustrated in Fig. 12, showing the results of the stiffness modulation for a single subject, the user is able to adequately modulate the stiffness in accordance with the requested levels during the kinesthetic teaching, that is to match  with good approximation the maximum, medium, and lower levels requested in the different zones of experimental test bed. Similar performances are achieved by all the subjects involved in the experiment, and further analyses of the aggregated data on the hand stiffness modulation capabilities are reported in Section VI-A.
Finally, the experiment ends with the position-controlled robot executing the trajectories taught by the subjects and optimized by the constrained smoothing B-spline. The effect of the optimization process is very clear, if the tracking errors θ i in the joint space and the corresponding control torques τ i , i = 1, . . . , 7, are considered. From the profiles reported in Fig. 13, a generalized reduction of both the tracking error and the related control action for all the joints, due to the improved smoothness of the optimized trajectory, is evident. A comparison between the peak absolute values (PAVs) of the trajectory demonstrated by the subject S1 and the related optimal path, reported in Table I

B. Experimental Session 2: Tilted Welding Points
In this second experiment, based again on the mock-up of a welding process, the users are asked to place the end-effector in   specific locations according to a task that, besides a specific motion, also requires a change in the orientation. More specifically, as can be observed in Fig. 14, from the starting point, the robot end-effector has to be tilted while moved down-left in order to be inserted in the first required location and then carried back to the initial position to repeat the same task on the left side with opposite inclination. The operator must impose a precise robot configuration on the two trajectory ends, while during the rest of the motion, the variation with respect to the "ideal" trajectory can be large. For this reason, the user must impose a high level of stiffness in the neighborhood of the mock-up welding points, while maintaining a low level elsewhere. This behavior can be easily achieved, as highlighted in Fig. 15, showing the Fig. 15. Stiffness modulation for subject S1 during the tilted welding points session. stiffness modulation for a single subject during the test. The same behavior is confirmed for all the subjects of the experiment, and we refer to Section VI-A for a dedicated analysis of the related aggregated results. Since in this application both position and orientation are involved, the optimization problem for defining the robot trajectory has been split into two parts.  In Fig. 16, the trajectory trained by subject S1 and the final trajectory are compared, and the benefits of the proposed method in regularizing the trajectory, eliminating unwanted motions of the operator within the limits imposed by the application, are clear. The fact that the two-way trajectories are almost coincident is worth noting. To better highlight the advantages of the procedure not only for the position but also for the orientation, in Fig. 17, the roll, pitch, and yaw angles of the original motion (with the given bounds) and of the final one are reported as a function of the normalized variable u, showing that, where allowed by the application, this approach tends to eliminate all the oscillations that affect the considered signals.

VI. DISCUSSION
In the following, observations related to the experimental findings reported in the previous section are discussed. By interpreting the obtained results with respect to the experimental task carried out, hardware/software choices, and future perspectives, specific aspects are analyzed and contextualized for a wider insight into the study of the proposed framework.

A. Aggregated Analysis of the Hand Stiffness Modulation
In order to analyze the aggregated results in relation to the hand stiffness modulation within Experimental Session 1 (as described in Section V-A), the data of all subjects have been grouped on the basis of the requested reference level; see the box plots reported in Fig. 18. From the aggregated results, it can be seen that the different stiffness levels are matched by all the subjects, with acceptable deviations (a maximum error of 0.22 and 0.096 is reported for the higher and lower reference bands) and, importantly, reporting for a statistically significant difference among all the requested reference levels. For the statistical analysis, we have set the statistical significance to p < 0.001 and conducted a one-way repeated measures analysis of variance (ANOVA) [53]. The Shapiro-Wilk test for normality and Mauchly's test for sphericity has been checked, concluding that the ANOVA assumptions have been not violated [53], [54]. The ANOVA results reveal a statistically significant difference among all levels, F (2, 6) = 63. 15. The ability of the subjects  in modulating the stiffness during the robot teaching task, as requested by the instructed levels, is therefore supported by statistical evidence.
The capability of the subjects to correctly modulate the hand stiffness during the programming task is also confirmed in relation to Experimental Session 2 (as described in Section V-A). The aggregated data are reported in the box plots of Fig. 19, which group the results on the basis of the requested lower and higher reference levels. Similarly to Experimental Session 1, a statistical analysis based on ANOVA has been conducted, setting the statistical significance to p < 0.001. The Shapiro-Wilk test and Mauchly's test report that the ANOVA assumptions are not violated. The ANOVA reveals a statistically significant difference between the requested stiffness modulation levels, F (1, 9) = 24.62, and p < 0.001. All the subjects are, therefore, statistically able to successfully modulate the stiffness (a maximum error of 0.3 is reported for the higher reference bands) according to requested reference levels in the presence of different orientations of the end-effector during the sEMG-enhanced kinesthetic teaching task.
According to the analyzed aggregated results, the proposed approach is, therefore, capable to provide the user with an  II  PEAK AND MEAN VALUES OF THE NORM OF VELOCITY AND ACCELERATION  DURING EXPERIMENT #1 FOR SUBJECT S1 additional programming input channel, which could be voluntarily modulated to program a specific parameter of a constrained B-spline optimization of the trajectory. Particularly, this highlights a clear improvement with respect to state-of-the-art works in which the sEMG information is exploited to observe a biomechanical behavior of the user, as in teleimpedance-based approaches. Indeed, in the proposed approach, the programming modality estimated from sEMG sensors is not "indirectly" emerging from human motions; instead, it is arbitrarily exploited as a programming input in accordance with the presence of a real-time vibrotactile feedback informing the user about the current value of the modulated hand stiffness level. Note that the presence of a real-time "biofeedback" represents a novelty with respect to state-of-the-art approaches. Another aspect of novelty also regards the usage of sEMG to estimate the level of stiffness of the human hand instead of the stiffness at the arm endpoint: this allowed us to provide the users with a programming input that could be intuitively and voluntarily modulated even during the performance of a robot kinesthetic teaching with both arms.

B. Smoothness Characteristics of the Optimized Trajectory
With reference to the results reported in Section V-A for Experimental Session 1, in Table II, the peak and mean values of the norm of velocity and acceleration of the trajectory demonstrated by subject S1 and the corresponding constrained smoothing Bspline are reported. By comparing the numerical values related to S1, it comes out that peak values of both velocity and acceleration are considerably lowered by the optimization process (−63.20% and −92.98%, respectively). The average velocity is reduced by −19.79%. In this case, since the duration of both the trajectory (original and optimized) is the same, this decrease can be explained by a reduction of the total traveled distance: in fact, the procedure that makes the trajectory smoother tends to reduce its length, by avoiding unwanted oscillations. Similar conclusions can be drawn by observing the data aggregated for all the subjects in Fig. 20. The peak and mean values of velocity and acceleration are reduced by similar amounts. In addition, it is possible to observe that the optimization process based on constrained smoothing B-splines causes a general reduction of the width of the values' distribution about the median element. This means that the influence of the single subject on (these features of) the final trajectory tends to become negligible.
In addition, the capability of the proposed procedure to increase the smoothness of the original trajectory at the price of larger approximation errors is confirmed by an analysis of the cost function (5), or equivalently of (9), and in particular of the two contributions J 1 and J 2 . Term J 1 takes into account the overall amount of the squared errors of the optimal curve with respect to the original trajectory, while J 2 is a measure of the smoothness of the optimal curve based on the integral of the acceleration. The data, reported in Table III and related to the experiment with subject S1, reveal that the constrained smoothing B-spline has an intermediate behavior between a simple approximating spline, which takes into account only the approximation error by assuming λ = 0 and accordingly minimizes J 1 , and a standard smoothing B-spline without any additional constraint. Interestingly enough, by reducing the size of the bound ρ with respect to the initial value denoted by ρ , the approximation error J 1 decreases, while the term J 2 becomes larger. This means that, if the admissible region becomes smaller, the optimal trajectory tends toward an approximating curve. On the other hand, by relaxing the constraint, e.g., by considering ρ = 2ρ , the approximation error grows, but the smoothness of the curve is improved, being J 2 smaller. In this case, the constrained curve gets closer to an unconstrained smoothing B-spline. It is possible to conclude that with the introduction of the constraints in the optimization process, it is possible to transform a subjective choice of the parameters, i.e., λ, whose final effect on the overall trajectory cannot be predicted a priori, into an objective parameter selection, since the constraints (ρ min and ρ max ) depend on the geometry of the problem, while it is sufficient to assume λ small enough.
Therefore, the proposed approach allows the operators to program trajectories that are smoother with respect to the ones demonstrated by means of kinesthetic teaching only, with also the advantage of having the possibility to specify spatial precision constraints. Importantly, at the same time, the resulting trajectory is completely determined in geometrical terms and, therefore, applicable by definition to any timing law. This is a clear advantage with respect, for example, to state-of-the-art methods for task parameterization, based, e.g., on DMPs, that require the knowledge of additional motion information, such as velocity and acceleration, which are typically affected by noise when computed from position information.

C. Subjective Vibrotactile Feedback Evaluation
In accordance with the architecture described in Section IV, the subjects performed the experimental sessions described in the previous section by exploiting a vibrotactile feedback conveyed on the upper arm's skin by a wearable bracelet, with a range of vibration intensity proportional to the modulated hand stiffness level. In order to qualitatively evaluate its usage, we provided the subjects with a questionnaire to subjectively assess the influence of the vibrotactile feedback during the experimental session executions. In particular, the questionnaire required the subjects to rate three statements related to the perceived ease of use (PE), perceived usefulness (PU), and comfort (C) in exploiting the vibrotactile feedback, on the basis of a Likert scale from 1 (entirely disagree) to 7 (entirely agree), as reported in Table IV. As can be observed by the questionnaire results reported in the table, all the average scores computed over the subjects were greater than or equal to 6, showing a qualitative positive evaluation of the interpretability and usefulness of the vibrotactile feedback for the voluntary modulation of the hand stiffness level during the experimental tasks.

D. Hand Stiffness Level and Trajectory Sharpness
Sections V-A and V-B report how the subjects involved in the experiments were successfully able to modulate their hand Authorized licensed use limited to the terms of the applicable license agreement with IEEE. Restrictions apply.
stiffness, according to the levels instructed by the task protocol during the physical teaching of end-effector robot trajectories. However, the characteristics of the trajectories that have been taught by the subjects with respect to the three different levels of stiffness (minimum, lower, and higher) are worth analyzing. In particular, we want to have an insight into the relationship between the sharpness of the human motion/demonstrated trajectories and different modulations of the stiffness, since this may have an impact on the actual capability of performing desired robot teaching as well as on how the robot is perceived by the user during the physical interaction, and then, we also want to check the effect obtained by the application of the constrained B-spline trajectory optimization. To this aim, we use jerk measures to quantify the level of trajectory sharpness, since it has been shown that-in a comparison scenario-the lower the trajectory's jerk amplitude, the more smooth and natural the performed motion (minimal jerk of the human motion [55]). Conversely, higher jerk amplitudes correspond to motions/trajectories with higher sharpness. Specifically, we consider the entire sample of jerk values shown by each subject over the robot teaching tasks. On this basis, we report for the mean absolute value (MAV) of the trajectory jerk, which is simply the average value of all the rectified jerk values, and for the mean higher absolute value (MHAV) of the trajectory jerk, which is the average value over the rectified jerk values that are greater than 1.5 times the interquartile range (i.e., the mean value of the higher outlier values w.r.t. the jerk values distribution itself) for each subject and grouped according to the different levels of stiffness modulation during the experiment executions. According to their definitions, the MAV of the jerk is related to the average trajectory sharpness, whereas the MHAV is related to less frequent sharper behaviors that are not appreciable looking only at the total average jerk. Fig. 21(a) and (b) reports the box plot of the jerk MAV and MHAV of the demonstrated trajectory for the subjects, grouped by the different hand stiffness levels performed during the teaching task. Since the Shapiro-Wilk test reports that the considered data groups violate the assumption of normality, the Kruskal-Wallis test was performed, revealing that both the MAV and MHAV jerk in Fig. 21(a) and (b) were affected by the level of stiffness, H(2) = 13.65, p < 0.01. Thereafter, a Tukey test was performed for pairwise comparison, reporting statistically significant smaller values for both the MAV and MHAV jerk in the case of the higher stiffness level with respect to the minimum and lower stiffness levels. We can, therefore, conclude that during the teaching of trajectory paths with higher stiffness, the subject performed less sharp trajectories with statistical evidence. This is reasonably related to the fact that, when the subjects most stiffened their hand, they were less prone to execute motions characterized by rapid oscillations and abrupt variations with respect to the case with lower stiffness levels. Therefore, higher stiffness may be interpreted as an injection of higher dumping in the motion generation process of the user during the teaching phase. Although this aspect did not affect the correct execution of the tasks, as described in Sections V-A and V-B, we will conduct future investigations to further study and characterize the role of this behavior. On the other hand, looking at Fig. 21(c) and (d), it is possible to observe the MAV and MHAV jerk after the application of the B-spline trajectory optimization. As expected, it is appreciable how the jerk amplitude was tremendously reduced by five to six orders of magnitude, with no statistically significant difference among the trajectory sections related to the different levels of stiffness, as reported by the Kruskal-Wallis test based analysis (H(2) = 0.8, p = 0.671) performed in an analogous manner as for the data of Fig. 21(a) and (b). An aspect of particular interest observable by looking at Fig. 21(c) and (d) is related to the fact that the B-spline optimization almost completely cancels the presence of MHAV jerk values for the case of the lower stiffness level, since, in this case, straight paths were taught by the user to the robot end-effector trajectory (see Fig. 9). This result has been allowed by the possibility of selecting smoothing parameters α particularly small because the constrained optimization problem guarantees, in any case, the compliance with the limits on the maximum deviation with respect to the original trajectory, without the need for a difficult tradeoff between smoothness of the curve and interpolation error.

E. Future Work and Perspectives
The results obtained so far are based on two pillars, which represent also the two main novelties of the article: 1) smoothing B-splines enhanced with a bound on the maximum deviation with respect to the demonstrated trajectory; 2) sEMG signals related to the modulation of the hand stiffness of the operator for the continuous specification of this bound. The former contribution has been completely developed with the proposed two-step optimization procedure that guarantees the definition of a minimum complexity trajectory, i.e., characterized by the minimum number of parameters (in the specific case, the control points of the B-spline function) compliant with the bound on the maximum deviation from the demonstrated path. The latter aspect will be the subject of further research. In particular, the use of sEMG signals as an independent communication channel between the user and the robotic system will be addressed to improve PbD tasks. Regarding the proposed application, the mapping between the muscle activation and the desired level of precision will be improved, by studying more in depth the role of vibrotactile feedback. In addition, novel PbD methods, based on multimodal interaction, namely force and stiffness, between the user and the robot will be developed.

1) Mapping Between Muscles Activation and Desired Precision:
Future studies will include the consideration of different types of mapping relations between muscle activation and desired precision constraints that the user wants to instruct to the system. In particular, since from biomechanics, it is known that the muscle activation characteristics are nonlinear [38], a very interesting perspective regards the investigation of a nonlinear mapping also between muscle activation and precision-level modulation in terms of intuitiveness for the users. Furthermore, since in the present work we have implemented a linear mapping that actually enabled the users to modulate the hand stiffness level in accordance with the levels requested by the tasks/experimenters, we can observe how the presence of the vibrotactile feedback played an important role in allowing the users "compensating" for the nonlinear muscle activation characteristics by means of their cognitive control exploiting the provided vibrotactile feedback. We are, therefore, planning to carry out a future dedicated study in order to determine the most intuitive mapping relation between muscle activation and precision level modulation. Such future investigation should include properly designed experimental protocols in which a group of subjects will test different combinations of: 1) linear mappings; 2) nonlinear mappings; and 3) presence/absence of vibrotactile feedback, during kinesthetic teaching.
2) Vibrotactile Feedback Influence: Future experiments will also be considered to investigate the influence of the vibrotactile feedback on the modulation of the hand stiffness level and related precision constraint of the trajectory. More specifically, we plan to carry out tests of the capability of the users in modulating the hand stiffness level/inferred trajectory precision during kinesthetic teaching tasks with and without the presence of vibrotactile feedback. In addition, we will also further analyze the influence of the vibrotactile feedback by comparing vibration profiles different from the one used in the present work (i.e., modulation of the intensity of a continuous vibration), such as the modulation of the frequency of intermittent vibrations, and more structured profiles combining asynchronous "vibrations bursts" when certain meaningful values of the hand stiffness level are matched by the users. Finally, we also plan to test different body locations for the placement of the vibrotactile device, such as in the proximity of the wrist or ankle.
3) Novel PbD Methods Based on Stiffness Modulation: Stiffness modulation can be exploited not only for specifying the desired precision of the trajectory to be planned, but, being an additional input channel, it can also be used for other different forms of interaction between the robot and the user. Future research will focus on the possibility for the human to physically interact with the (already) planned trajectory and locally reshape it according to the needs by modifying the stiffness level of the hand.

VII. CONCLUSION
The proposed multimodal teaching approach, based on kinesthetic interaction and information extracted from forearm sEMG, can be effectively used to enhance robot PbD procedures. The estimation of the desired accuracy level from the hand stiffness of the user that guides the robot along the desired path allows obtaining a smooth motion trajectory with a one-shot procedure that combines intuitiveness for the user, due to the natural tendency of humans to modulate their stiffness according to the requested precision, and optimality of the trajectory in terms of minimum curvature/acceleration within the bounds imposed by the tasks. Based on a single demonstration, this method leads to a significant reduction of the time and effort required to program robotic tasks, with an important impact on industrial applications, at the price of using a (low-cost) wearable device for recording sEMG signals.
Let u = [u 0 , . . . , u m ] be a vector of real numbers (called knots), with u j ≤ u j+1 . The jth B-spline basis function of degree p is defined, according to the Cox-de Boor recursive formula [56], as Note that B p j (u) ≥ 0 ∀u, and, in particular, B p j (u) = 0 everywhere except in the interval [u j , u j+p+1 ). The number of control points, N + 1, the number of the knot, m + 1, and the degree of the B-spline are related by m = N + p + 1.
Since the focus of this article is on the definition of the geometric path that the robot must track to accomplish a given task, the knots, that implicitly determine how this path is covered, are considered with a uniform distribution, i.e., where the internal knots u j are computed according to Therefore, fixing the degree p of the B-spline (in robotics applications, cubic B-splines, i.e., p = 3, are usually considered because of the continuity of velocities and accelerations) and the support [u min , u max ] where the curve is defined (typically = [0, 1], but other choices are also possible, the B-spline curve is completely defined by the sequence of control points p j , j = 0, . . . , N.

APPENDIX B SMOOTHING B-SPLINE DERIVATION
Smoothing B-splines are standard B-spline functions s(u) whose control points p j are determined by minimizing the cost function where λ ≥ 0 is a free parameter, and which, therefore, represent a tradeoff between the (squared) approximation error with respect to the given via points q j , j = 0, . . . , n, and the smoothness of the curve, which is measured by taking into account its curvature/second derivative. Generally, the number N + 1 of control points defining the spline is equal to the number n + 1 of data points to be approximated. However, when the number of data points is very large, it may be convenient for computational reasons to choose n > N and, in particular, n N . By recalling that points q j , j = 0, . . . , n, are obtained by sampling the demonstrated trajectory with a constant period T s , one can assume a fixed ratio between n and N , by considering a new control point every m via points. In this case, the values of u, at which the approximation of points q j occurs, are obtained by normalizing the actual sampling instant jT s with respect to the total duration nT s of the trajectory, i.e., u j = j n , j = 0, . . . , n.
By exploiting (6) and the relationship between the derivatives of a B-spline and its control points (for more details, see [26]), the cost function (19) can be written in a matrix form as where tr(·) denotes the trace of a matrix, W = diag(w j ) > 0, j = 0, . . . n, is the weight matrix, and P = [p 1 , . . . , p N −1 ] T is the matrix with the internal control points 2 to be determined. Note that the first and the last via points are supposed to be perfectly interpolated, i.e., s(u min ) = q 0 and s(u max ) = q n and, accordingly, p 0 = q 0 and p N = q n .
The analytic expression of the other matrices appearing in the cost function (19) is reported below. The matrix B ∈ R (n−1)×(N −1) is built up with the vector of the basis functions at "time instants" u j , j = 1, . . . , n − 1:  2 For the sake of simplicity, we use the same name of the matrix containing all the control points, which has been defined in (8). Note that the difference between the two matrices only consists of the first and last control points, which are already known. The procedure that leads to the definition of these matrices, even if only in the particular case N = n, can be found in [26].
Remark 1: Matrix W is symmetric and positive definite by construction. A simple inspection reveals that tridiagonal matrix A is symmetric. Its positive definiteness can be easily inferred by applying the Gershgorin circle theorem (see [57,Th. 1.2]). Both matrices can, therefore, be rewritten as Γ T Γ = W and Ω T Ω = A, respectively. Finally, it can be shown that, because of the choice of u j and u j , matrices B (see [58,Th. 5.18]) and C are full rank.
Remark 2: The trace function that appears in (21) is necessary to deal with the vectorial form of each control point p j that leads to a matrix form of the unknown P. The cost function (21) can be rewritten as where x F = tr(x T x) denotes the Frobenius norm of matrix X [59], while the meaning of the other symbols is the same as in (21). By rearranging the components of control points p j in P to obtain a column vector of length (N − 1)d, and modifying the matrices that appear in (21) accordingly, the objective function becomes a combination of the square of two standard Euclidean norms at the price of a less compact notation and an increase of the problem dimension by a factor of d.
Since matrices B and C are full rank, the two functions J 1 (P) and J 2 (P) are both strictly convex. Thus, the whole objective function (21), which is a positive combination of these functions, is strictly convex too. For this reason, the optimal set of the problem min P J(P) contains at most one point. By differentiating (21) with respect to P and equating the result to zero, the analytical solution of this minimization problem can be derived in an analytical form P = (B T WB + λC T A T C) −1 (B T WQ − λC T A TP ).
(23) Since matrix B T WB + λC T A T C is strictly positive, its inverse is well defined, and the unique solution P of the minimization problem can be found. He is currently a Full Professor with the University of Bologna. He has authored or coauthored more than 150 scientific articles presented at conferences or published in journals. His research interests include design and control of manipulation devices, humanrobot physical interaction, mobile manipulation, manipulation of deformable objects, and soft robotics.
Claudio Melchiorri (Fellow, IEEE) received the bachelor's and Ph.D. degrees in electronic engineering from the University of Bologna, Bologna, Italy, in 1985 and 1990, respectively.
In 1985, he started his research activity with the Department of Electrical, Electronic and Information Engineering (DEI), University of Bologna, where he became a Research Associate in 1990. Since 2001, he has been a Full Professor in Robotics with DEI, University of Bologna. Since 2014, he has been a Member of the Academy of Sciences, Bologna Institute. He has participated to and coordinated several research projects in robotics, automatic control, and automation, both at the national and international levels. He is the author or coauthor of about 300 scientific papers presented at conferences or published in journals and six books on digital control, and he is a coeditor of seven books on robotics. His research interests include robotics (dexterous manipulation, telemanipulation, haptic systems, and advanced sensors), control theory (nonlinear control, passivity, and robust control), and motion control (electric motors, trajectory planning, and hardware/software real-time control systems).
Open Access funding provided by 'Università degli Studi di Modena e Reggio Emilia' within the CRUI CARE Agreement