Self-Supervised Regression of sEMG Signals Combining Non-Negative Matrix Factorization With Deep Neural Networks for Robot Hand Multiple Grasping Motion Control

Advanced Human-In-The-Loop (HITL) control strategies for robot hands based on surface electromyography (sEMG) are among the major research questions in robotics. Due to the intrinsic complexity and inaccuracy of labeling procedures, unsupervised regression of sEMG signals has been employed in the literature, although it shows several limitations in realizing multiple grasping motion control. In this letter, we propose a novel Human-Robot interface (HRi) based on self-supervised regression of sEMG signals, combining Non-Negative Matrix Factorization (NMF) with Deep Neural Networks (DNN) in order to both avoid explicit labeling procedures and provide powerful nonlinear fitting capabilities. Experiments involving 10 healthy subjects were carried out, consisting of an offline session for systematic evaluations and comparisons with traditional unsupervised approaches, and an online session for assessing real-time control of a wearable anthropomorphic robot hand. The offline results demonstrate that the proposed self-supervised regression approach outperformed traditional unsupervised methods, even when considering different robot hands with dissimilar kinematic structures. Furthermore, the subjects were able to successfully perform online control of multiple grasping motions of a real wearable robot hand, achieving high reliability over repeated grasp-transportation-release tasks with different objects. Statistical support is provided along with the experimental outcomes.

have investigated the decoding of surface skin electromyography (sEMG) to realize HRi for human-in-the-loop (HITL) robot hand grasping control. In recent decades, research efforts have been mostly dedicated to exploiting pattern-recognition-based machine learning to achieve more natural and intuitive HRi based on muscular activation [2]. Specifically, one methodology consists in classifying sEMG signals in order to produce discrete commands for the activation of different grasping actions on the robot hand, resulting in very good classification accuracy (> 95%) even when controlling more than ten grasping motions [2]. However, classification presents inherent reliability issues, particularly related to the unpredictability of misclassifications and to the complexity that increases with the number of considered grasping actions/motions. In response, simultaneous and proportional (s/p) control has more recently taken center stage in the research community [3]; it relies on regression models to map sEMG signals into continuous motions of multiple robot hand grasps [4]. The core advantage of s/p HRi is that it allows the user to react to mapping inaccuracies and unpredicted motions, thanks to the continuous nature of the modulable robot hand inputs. A major issue of s/p approaches is that, when they are implemented in a supervised fashion, they require an sEMG data collection and explicit labeling in order to train the regression model. Note that, since in this work we are interested in s/p control enforced by means of regression of sEMG signals, the kind of labeling we refer to is any instant-by-instant continuous labeling of the sEMG signal denoting the grasping motions that are desired to be controlled on a robot hand. Unfortunately, labeling of sEMG signals for telemanipulation purposes is a tedious and frustrating procedure that is critical for the user. Furthermore, several systematic labeling imprecisions cannot be avoided, due to well-known difficulties in labeling biological data [5].
In order to bypass these limitations, unsupervised regression approaches have been explored. State-of-the-art approaches have typically exploited the synergistic organization of the human motor control system, assuming that the sEMG signals are the result of a mixture obtained by the product of a basis matrix (the muscular synergies matrix) with an encoding vector (the motor drives vector). In this context, unsupervised regression has been successfully realized by applying Non-negative Matrix Factorization (NMF) to unlabeled sEMG training data [6], [7]. However, a very limiting assumption of the traditional NMF-based approach is that different motions must correspond to motor drives vectors belonging to (mostly) orthogonal subspaces [8]. Unfortunately, this assumption is not fulfilled in practice, due to the complexity of musculo-tendon mechanisms in the execution of multiple grasping motions. This implies that specific single target hand motions can be represented by linear combinations of muscular synergy vectors corresponding to other interfering motions, which makes the regression output unreliable for robot hand control. Importantly, these limitations can be seen as a consequence of the intrinsic absence of nonlinear fitting capabilities of unsupervised approaches. More recently, variants of the traditional NMF-based approach considering sparsity constraints [9], time-varying muscular synergies [10] and autoencoders [11] have been proposed, without however resolving the performance drop in the presence of multiple grasping motions. For this reason, in this study traditional NMF-based approaches are considered for comparison.
In this work, a novel HRi based on self-supervised regression of sEMG signals for HITL robot grasp control is presented. Specifically, we demonstrate that NMF can be used to compute the labels to be provided to a Deep Neural Network (DNN) [12] architecture to reliably map sEMG signals into robot hand control inputs, even in the presence of multiple grasping motions. Experiments were performed by recruiting two groups of five subjects for the offline and online sessions, respectively (see Fig. 1). In the offline experimental session, the regression capabilities of traditional NMF-based approaches were compared with the self-supervised method, demonstrating the improved performance of the proposed solution. Also, systematic evaluations on three different simulated robot hands with dissimilar kinematic structures were carried out, supported by statistical analyses. In the online session, the involved subjects performed real-time control of a wearable robot hand, achieving smooth and highly repeatable regulation of the grasps during multiple object grasp-transportation-release tasks, with statistical support for the reliability of the system.
The letter is organized as follows: Section II presents the robot hand control strategy, the traditional NMF-based approaches and the proposed self-supervised method; Section III reports the descriptions and results of the offline and online experiments; Section IV discusses work implications, results comparisons and possible limitations; finally, Section V draws the conclusions.

A. Setup
1) sEMG Signal Acquisition: The 8-channel wearable sEMG armband gForcePro (see Fig. 1) by OYMotion was used to acquire sEMG signals from the forearm muscles of the user. The armband was positioned in proximity of the bellies of the Flexor Digitorum Superficialis and Extensor Digitorum Communis muscles, following the guidelines outlined in [13]. Raw sEMG signals were acquired at 1 kHz via a built-in Bluetooth interface, and transmitted to a nearby PC. A processing chain was then applied to each sEMG channel, consisting of a 50 Hz notch filter to eliminate powerline interference, a 20 Hz high-pass filter to remove baseline noise and, finally, the computation of the root mean square (RMS) value over a 200 ms sliding window.
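As an illustration, the per-channel processing chain described above (50 Hz notch, 20 Hz high-pass, 200 ms RMS window at 1 kHz) can be sketched in Python. This is a minimal sketch and not the implementation used in the study: the filter orders, the notch quality factor, the second-order (biquad) filter design and all function names are assumptions, since the text only states the cut-off frequencies and the window length.

```python
import numpy as np

FS = 1000.0  # sampling rate in Hz, as stated in the text


def notch_coeffs(f0, q=30.0, fs=FS):
    """Second-order notch at f0 (biquad form); q is an assumed quality factor."""
    w0 = 2.0 * np.pi * f0 / fs
    alpha = np.sin(w0) / (2.0 * q)
    b = np.array([1.0, -2.0 * np.cos(w0), 1.0])
    a = np.array([1.0 + alpha, -2.0 * np.cos(w0), 1.0 - alpha])
    return b / a[0], a / a[0]


def highpass_coeffs(fc, q=1.0 / np.sqrt(2.0), fs=FS):
    """Second-order high-pass at fc (biquad form); the order is an assumption."""
    w0 = 2.0 * np.pi * fc / fs
    alpha = np.sin(w0) / (2.0 * q)
    c = np.cos(w0)
    b = np.array([(1.0 + c) / 2.0, -(1.0 + c), (1.0 + c) / 2.0])
    a = np.array([1.0 + alpha, -2.0 * c, 1.0 - alpha])
    return b / a[0], a / a[0]


def biquad(x, b, a):
    """Causal direct-form-I filtering of a 1-D signal (a[0] must equal 1)."""
    y = np.zeros(len(x))
    for n in range(len(x)):
        acc = b[0] * x[n]
        if n >= 1:
            acc += b[1] * x[n - 1] - a[1] * y[n - 1]
        if n >= 2:
            acc += b[2] * x[n - 2] - a[2] * y[n - 2]
        y[n] = acc
    return y


def sliding_rms(x, win):
    """Causal RMS over the last `win` samples (shorter at the signal start)."""
    csum = np.concatenate(([0.0], np.cumsum(x ** 2)))
    idx = np.arange(1, len(x) + 1)
    start = np.maximum(idx - win, 0)
    return np.sqrt((csum[idx] - csum[start]) / (idx - start))


def preprocess(raw, win_ms=200):
    """raw: array (channels, samples) -> RMS envelope with the same shape."""
    win = int(round(win_ms * FS / 1000.0))
    bn, an = notch_coeffs(50.0)
    bh, ah = highpass_coeffs(20.0)
    out = np.empty_like(raw, dtype=float)
    for ch in range(raw.shape[0]):
        filtered = biquad(biquad(raw[ch], bn, an), bh, ah)
        out[ch] = sliding_rms(filtered, win)
    return out
```

Calling `preprocess` on an array of shape `(8, samples)` returns the eight RMS envelopes that play the role of the sEMG vector used in the remainder of Section II.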
2) Robot Hands and Controller: The robotic grasping device used for the online experiments was the AR10 Robot Hand by Activat8 Robots, a lightweight anthropomorphic robot hand with 5 fingers and 10 degrees-of-freedom (DoFs). The AR10 servomotors were controlled at low level via a Robot Operating System (ROS) interface. Considering that the AR10 presents $n_J = 10$ joints (2 joints per finger), let us denote with $q_{\mathrm{ref}}(t) \in \mathbb{R}^{n_J}$ the vector of reference joint angles. In this study, the robot hand was controlled so as to allow the regulation of the closure level of $n_G = 3$ different grasping motions corresponding to power, tripodal and ulnar grasps (see Fig. 1). These grasping motions were chosen as three volar grasps characterized by different Virtual Finger (VF) configurations, a VF being defined as a functional unit comprised of at least one real physical finger (which may include the palm) acting in unison to apply opposing forces on the object and against the other VFs in a grasp [14]. In particular, for the power grasp, three VFs are involved: (VF1) the palm, (VF2) the thumb and (VF3) the index-middle-ring-little fingers; whereas, for the tripodal and ulnar grasps, (VF3) was associated with the index-middle fingers and ring-little fingers, respectively. Specifically, the AR10 joint references were imposed as
$$q_{\mathrm{ref}}(t) = S\,\alpha(t), \tag{1}$$
where $S \in \mathbb{R}^{n_J \times n_G}$ is the grasp synergy matrix and $\alpha(t) = [\alpha_1(t)\ \alpha_2(t)\ \alpha_3(t)]^T \in \mathbb{R}^{n_G}$ is the vector of synergy activations, with $n_G$ the number of grasping motions.
In particular, in order to allow the regulation of the closure level of power, tripodal and ulnar grasps, the grasp synergy matrix $S$ was computed in accordance with the concept of postural synergies [15], as detailed in the following. A matrix collecting four hand configurations, $Q \in \mathbb{R}^{n_J \times 4}$, was built, defined as $Q = [q_{\mathrm{OH}}\ q_{\mathrm{PW}}\ q_{\mathrm{TR}}\ q_{\mathrm{UL}}]$, in which $q_{\mathrm{OH}}, q_{\mathrm{PW}}, q_{\mathrm{TR}}, q_{\mathrm{UL}} \in \mathbb{R}^{n_J}$ are the vectors of joint angles for the open hand (OH) configuration and the configurations corresponding to the maximum closure level of power (PW), tripodal (TR) and ulnar (UL) grasps, respectively (see Fig. 1). Then, the matrix $S \in \mathbb{R}^{n_J \times n_G}$ ($n_G = 3$) is obtained by performing Principal Component Analysis (PCA) on $Q$. From the PCA, the principal components $s_1, s_2, s_3 \in \mathbb{R}^{n_J}$ are obtained, corresponding to the orthogonal directions of maximum variance of the configurations collected in $Q$, and the grasp synergy matrix was then built as $S = [s_1\ s_2\ s_3]$. In this way, by defining the vectors of synergy activations corresponding to the maximum closure level of power, tripodal and ulnar grasps as $\alpha_{\mathrm{PW}}, \alpha_{\mathrm{TR}}, \alpha_{\mathrm{UL}} \in \mathbb{R}^{n_G}$, respectively, their values can be computed as
$$\alpha_{\mathrm{PW}} = S^{+} q_{\mathrm{PW}}, \qquad \alpha_{\mathrm{TR}} = S^{+} q_{\mathrm{TR}}, \qquad \alpha_{\mathrm{UL}} = S^{+} q_{\mathrm{UL}}, \tag{2}$$
where $S^{+}$ is the pseudo-inverse of the matrix $S$. In this way, (1) can be used for robot hand grasp control based on sEMG signals, as will be detailed in Section II-B. Note that postural synergies have been largely investigated in the literature: in [15], postural synergies were treated as a geometrical tool to structure the human hand behaviour for robot hand design and control, whereas, in [16], the role of postural synergies for robotic grasping quality, robustness and force regulation was investigated. In this study, the concept of postural synergies is exploited to embed robot hand configurations in a lower dimensional space with respect to the full-dimensional joint space, in order to be able to develop a less complex sEMG regression model for improved performance (see Section II-C).
Authorized licensed use limited to the terms of the applicable license agreement with IEEE. Restrictions apply.
Furthermore, three simulated robot hands were also considered for an offline study, see Section III-B. The considered additional robot hands were: (i) a ROS-based simulator of the AR10; (ii) a SynGrasp-based [17] simulator of the University of Bologna Hand IV (UBHand) [18], an anthropomorphic fully-actuated robot hand with 15 DoFs; and (iii) a SynGrasp-based simulator inspired by the Barrett Hand [19], a 3-fingered robotic gripper with 8 DoFs (see Fig. 1). In particular, for these additional simulated robot hands the same controller as the one introduced in (1)-(2) was used, by considering the specific values of $n_J$ in (1) and joint configurations in (2) for the simulated hands. Note that, even if more general procedures have been proposed for mapping human hand motions on non-anthropomorphic robot hands [20], in this study the definition of the matrix $Q$ for the Barrett Hand-inspired robotic gripper was done by a simple heuristic association of four predefined joint configurations of the gripper, namely $q_{\mathrm{OH}}, q_{\mathrm{PW}}, q_{\mathrm{TR}}, q_{\mathrm{UL}}$, that were desired to correspond to the open hand, closed power grasp, closed tripodal grasp and closed ulnar grasp configurations of the user's hand, respectively. In our specific case (see Fig. 1) we associated: $q_{\mathrm{OH}}$ with the gripper configuration with all fingers extended; $q_{\mathrm{PW}}$ with the configuration with all fingers flexed; $q_{\mathrm{TR}}$ with the configuration with only the first and second fingers flexed; and, finally, $q_{\mathrm{UL}}$ with the configuration with only the first and third fingers flexed.
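The construction of the grasp synergy matrix and of the maximum-closure activations can be illustrated with a small numerical sketch. The joint values below are hypothetical and are not the AR10 configurations used in the study. Two further assumptions are made loudly: joint angles are expressed relative to the open hand (so that $q_{\mathrm{OH}} = 0$), and the PCA is computed through an SVD of $Q$ without mean removal, since the text does not specify the centering convention.

```python
import numpy as np

n_J, n_G = 10, 3
flex = 1.2  # hypothetical fully flexed joint angle (rad), not from the paper

# Hypothetical configurations, ordered thumb/index/middle/ring/little (2 joints each).
q_OH = np.zeros(n_J)
q_PW = flex * np.ones(n_J)
q_TR = flex * np.array([1, 1, 1, 1, 1, 1, 0, 0, 0, 0], dtype=float)
q_UL = flex * np.array([1, 1, 0, 0, 0, 0, 1, 1, 1, 1], dtype=float)
Q = np.column_stack([q_OH, q_PW, q_TR, q_UL])  # n_J x 4

# Principal directions of Q (assumption: SVD without mean removal).
U, singular_values, _ = np.linalg.svd(Q, full_matrices=False)
S = U[:, :n_G]  # grasp synergy matrix, n_J x n_G

# Maximum-closure synergy activations via the pseudo-inverse of S.
S_pinv = np.linalg.pinv(S)
alpha_PW, alpha_TR, alpha_UL = (S_pinv @ q for q in (q_PW, q_TR, q_UL))


def joint_reference(alpha):
    """Joint references for a given vector of synergy activations."""
    return S @ alpha


# Half-closed power grasp: scale the maximum-closure activation by 0.5.
q_half_power = joint_reference(0.5 * alpha_PW)
```

Under these assumptions the three closed configurations lie exactly in the span of $S$, so `joint_reference(alpha_PW)` reproduces `q_PW`, and intermediate closure levels are obtained by scaling the activation vector.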

B. Traditional NMF-Based Unsupervised Regression Approaches for Robot Grasp Control
1) Single Grasping Motion Regression Via NMF: Aiming at performing an sEMG-based unsupervised regression of a single grasp, let us consider a user executing the specific grasping motion of interest while an 8-channel sEMG signal (see Section II-A1) training set of $d$ samples is recorded and collected in the matrix $E \in \mathbb{R}^{8 \times d}$. The matrix $E$ can be written as the mixture [6]
$$E = W H, \tag{3}$$
where $W \in \mathbb{R}^{8 \times 2}$ is the muscular synergy matrix and $H \in \mathbb{R}^{2 \times d}$ is the motor drive matrix. In particular, $W$ and $H$ can be written as
$$W = [w_e\ w_f], \qquad H = [h_e\ h_f]^T, \tag{4}$$
where $w_e, w_f \in \mathbb{R}^{8}$ and $h_e, h_f \in \mathbb{R}^{d}$ are the extension and flexion components of the muscular synergy and motor drive matrices, respectively. In order to estimate $W$ and $H$, the NMF can be applied to the training set $E$. However, among $W$ and $H$, we are mostly interested in the estimated muscular synergy matrix $W$, because it can then be exploited to compute online the vector of instantaneous motor drives $h(t)$ (which differs from $H$, since it denotes the instantaneous motor drives) as
$$h(t) = [h_e(t)\ h_f(t)]^T = W^{+} E(t), \tag{5}$$
where $W^{+}$ is the pseudo-inverse matrix of $W$ and $E(t) \in \mathbb{R}^{8}$ is the vector of the online sEMG signal. Finally, the grasp closure level $\sigma(t)$ is obtained from $h_e(t)$ and $h_f(t)$ according to (6), in which the scaling factors $k_1$, $k_2$ are properly set in order to normalize $h_e$ and $h_f$ in (4). It is then possible to control a single specific grasping motion of the robot hand by imposing, in (1),
$$\alpha(t) = \begin{cases} \alpha_{\mathrm{PW}}\,\sigma(t), & \text{for only power grasp regulation},\\ \alpha_{\mathrm{TR}}\,\sigma(t), & \text{for only tripodal grasp regulation},\\ \alpha_{\mathrm{UL}}\,\sigma(t), & \text{for only ulnar grasp regulation}, \end{cases} \tag{7}$$
where $\alpha_{\mathrm{PW}}$, $\alpha_{\mathrm{TR}}$ and $\alpha_{\mathrm{UL}}$ have been introduced in (2). Therefore, by defining the robot hand control input as in (7), the closure level of power, tripodal or ulnar grasps can be controlled individually from online sEMG signals ($E$ needs to be recorded accordingly), without the possibility of multi-grasp control. We refer to this unsupervised sEMG regression approach as "single-NMF".
2) Multiple Grasping Motions Regression Via NMF: In the general case, we now want to use the NMF for the regression of $N$ different grasping motions. According to the traditional approach, the procedure requires that, for each generic $n$-th grasping motion ($1 \le n \le N$), the matrix $E_n$ collecting the sEMG signal training set recorded during the execution by the user of only the $n$-th grasping motion is considered. The related muscular synergy matrix $W_n$ is then estimated independently from the other grasping motions by applying the NMF to $E_n$, therefore estimating, in total, $N$ muscular synergy matrices. The muscular synergy matrix $W$ for all grasping motions is then built by concatenating the single muscular synergy matrices of each specific grasping motion:
$$W = [W_1\ W_2\ \cdots\ W_N]. \tag{8}$$
Therefore, it is possible to consider the pseudo-inverse matrix $W^{+}$ of the matrix $W$ as given by (8), and to exploit equations of the form of (5)-(6) to compute the grasp closure levels for each of the $N$ considered grasping motions. Coming back to the case considered in this study of power (PW), tripodal (TR) and ulnar (UL) grasps, three calibration matrices $E_{\mathrm{PW}}$, $E_{\mathrm{TR}}$ and $E_{\mathrm{UL}}$ have to be collected, for which three muscular synergy matrices $W_{\mathrm{PW}}$, $W_{\mathrm{TR}}$ and $W_{\mathrm{UL}}$ are estimated, respectively. Thereafter, $W$ is built according to (8), and the three grasp closure levels $\sigma_{\mathrm{PW}}(t)$, $\sigma_{\mathrm{TR}}(t)$ and $\sigma_{\mathrm{UL}}(t)$ are computed as in (9), where $i \in \{\mathrm{PW}, \mathrm{TR}, \mathrm{UL}\}$, $\sigma_i(t) \in [0, 1]$, and $h_{e,i}(t)$ and $h_{f,i}(t)$ are computed from the online instantaneous values of the sEMG signal $E(t)$ as in (10); the terms in (9) and (10) have a meaning analogous to those in (6) and (5). The sEMG-based robot hand multi-grasp control is then obtained by imposing, in (1), the vector of synergy activations given by (11). In the following, we refer to this multi-grasp unsupervised regression approach as "concatenated-NMF".
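The two NMF-based schemes above can be sketched as follows. This is only an illustrative sketch on synthetic data: the text does not state which NMF algorithm was used, so the multiplicative-update rule below is an assumption, and `nmf`, `synthetic_calibration` and the random data are hypothetical. Note also that NMF does not fix which of the two columns of $W$ corresponds to extension and which to flexion; that assignment has to be made separately.

```python
import numpy as np


def nmf(E, rank=2, n_iter=1000, seed=0, eps=1e-9):
    """Factorize a non-negative E (8 x d) as W @ H via multiplicative updates."""
    rng = np.random.default_rng(seed)
    W = rng.random((E.shape[0], rank)) + eps
    H = rng.random((rank, E.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ E) / (W.T @ W @ H + eps)
        W *= (E @ H.T) / (W @ H @ H.T + eps)
    return W, H


def synthetic_calibration(seed, d=400):
    """Hypothetical single-grasp recording: two synergies driven in alternation."""
    rng = np.random.default_rng(seed)
    W_true = rng.random((8, 2))
    phase = np.linspace(0.0, 4.0 * np.pi, d)
    H_true = np.vstack([np.maximum(np.sin(phase), 0.0),
                        np.maximum(-np.sin(phase), 0.0)])
    return W_true @ H_true


# single-NMF: one calibration matrix, one 8 x 2 synergy matrix.
E_pw = synthetic_calibration(seed=1)
W_pw, H_pw = nmf(E_pw)
rel_err = np.linalg.norm(E_pw - W_pw @ H_pw) / np.linalg.norm(E_pw)

# Instantaneous motor drives for one online sample: h(t) = W^+ E(t).
e_t = E_pw[:, 50]
h_t = np.linalg.pinv(W_pw) @ e_t  # shape (2,)

# concatenated-NMF: per-grasp synergy matrices stacked column-wise.
W_tr, _ = nmf(synthetic_calibration(seed=2))
W_ul, _ = nmf(synthetic_calibration(seed=3))
W_cat = np.hstack([W_pw, W_tr, W_ul])  # shape (8, 6)
h_all = np.linalg.pinv(W_cat) @ e_t    # six motor drives, two per grasp
```

In the concatenated case the six motor drives are obtained from a single pseudo-inverse, so a sample recorded during one grasp generally produces non-zero drives for the other grasps as well; this is the interference discussed in the introduction.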

C. Proposed Self-Supervised Regression of sEMG Signals for Robot Hand Multi-Grasp Control
We propose an approach based on a DNN, which is trained in a self-supervised fashion by means of a proper application of NMF. Let us consider $N$ different grasping motions, and the user executing a generic grasping motion $n$ ($1 \le n \le N$). Accordingly, a matrix $E_n$ is defined as the matrix collecting the sEMG signal samples recorded during the execution by the user of the grasping motion $n$. In this way, a total of $N$ matrices $E_1 \in \mathbb{R}^{8 \times d_1}, \ldots, E_N \in \mathbb{R}^{8 \times d_N}$ is obtained. Thereafter, NMF is applied to each matrix $E_n$ in order to estimate the related muscular synergy matrix $W_n$. Instead of concatenating the $N$ synergy matrices $W_1, \ldots, W_N \in \mathbb{R}^{8 \times 2}$ to form a new matrix as in the concatenated-NMF approach, we use them to compute the offline estimated motor drive matrices
$$\hat{H}_n = W_n^{+} E_n \in \mathbb{R}^{2 \times d_n}, \qquad n = 1, \ldots, N. \tag{12}$$
Then, with a reasoning similar to that of (6), we can obtain the offline estimated grasp closure level vectors $\hat{\sigma}_1 \in \mathbb{R}^{1 \times d_1}, \ldots, \hat{\sigma}_N \in \mathbb{R}^{1 \times d_N}$ as in (13). Moving to the case considered in this study of power (PW), tripodal (TR) and ulnar (UL) grasps, (13) becomes (14), whose terms have a meaning analogous to those in (6). Since $\hat{\sigma}_{\mathrm{PW}}$, $\hat{\sigma}_{\mathrm{TR}}$ and $\hat{\sigma}_{\mathrm{UL}}$ in (14) correspond to the calibration matrices $E_{\mathrm{PW}}$, $E_{\mathrm{TR}}$ and $E_{\mathrm{UL}}$, respectively, they represent an (offline) estimation of the grasp closure level for the power (PW), tripodal (TR) and ulnar (UL) grasps, without any ambiguity of overlapping grasping motions. This allows an sEMG training set $E_T$ to be defined as
$$E_T = [E_{\mathrm{PW}}\ E_{\mathrm{TR}}\ E_{\mathrm{UL}}] \in \mathbb{R}^{8 \times (d_{\mathrm{PW}} + d_{\mathrm{TR}} + d_{\mathrm{UL}})}, \tag{15}$$
and the corresponding label $T \in \mathbb{R}^{3 \times (d_{\mathrm{PW}} + d_{\mathrm{TR}} + d_{\mathrm{UL}})}$ to be constructed as
$$T = [\alpha_{\mathrm{PW}}\hat{\sigma}_{\mathrm{PW}}\ \ \alpha_{\mathrm{TR}}\hat{\sigma}_{\mathrm{TR}}\ \ \alpha_{\mathrm{UL}}\hat{\sigma}_{\mathrm{UL}}], \tag{16}$$
where $\alpha_{\mathrm{PW}}, \alpha_{\mathrm{TR}}, \alpha_{\mathrm{UL}} \in \mathbb{R}^{n_G}$, with $n_G = 3$, are defined in (2). Thereafter, we exploit the training set $E_T$ and label $T$ to train a DNN for sEMG-based robot hand multi-grasp control. Importantly, note that the label $T$ of the training set $E_T$ is automatically extracted from the sEMG calibration data, which allows the DNN to be trained in a self-supervised manner as detailed in the following. Let us consider the vector of the online instantaneous sEMG signal $E(t) \in \mathbb{R}^{8}$ provided as input to a DNN architecture, as depicted in Fig. 2.
The network is composed of $n$ hidden layers and an output layer. The generic $j$-th hidden layer of the DNN contains $N_j$ neurons with Rectified Linear Unit (ReLU) activation function $F(\cdot)$, weight matrix $G^{(j)} \in \mathbb{R}^{N_j \times N_{j-1}}$ and bias vector $b^{(j)} \in \mathbb{R}^{N_j}$. The input vector $a^{(j-1)}(t) \in \mathbb{R}^{N_{j-1}}$ coincides with the output of the $(j-1)$-th hidden layer, characterized by $N_{j-1}$ neurons. Accordingly, the output vector of the $j$-th hidden layer $a^{(j)}(t) \in \mathbb{R}^{N_j}$ is given by
$$a^{(j)}(t) = F\big(G^{(j)} a^{(j-1)}(t) + b^{(j)}\big). \tag{17}$$
This structure describes each hidden layer, except for the first one, in which a weight matrix $G^{(1)} \in \mathbb{R}^{N_1 \times N_{in}}$, with $N_{in} = 8$, is applied to the input $E(t) \in \mathbb{R}^{8}$. The output layer is characterized by $N_{n+1} = 3$ neurons, a weight matrix $G^{(n+1)} \in \mathbb{R}^{N_{n+1} \times N_n}$ and a sigmoid activation function $S(\cdot)$. Thus, the output vector $a^{(n+1)}(t)$ of the DNN is given by
$$a^{(n+1)}(t) = [a_1^{(n+1)}(t)\ a_2^{(n+1)}(t)\ a_3^{(n+1)}(t)]^T = S\big(G^{(n+1)} a^{(n)}(t) + b^{(n+1)}\big), \tag{18}$$
where $a_1^{(n+1)}(t)$, $a_2^{(n+1)}(t)$ and $a_3^{(n+1)}(t)$ are the three scalar outputs of the network, and $a^{(n)}(t) \in \mathbb{R}^{N_n}$ and $b^{(n+1)} \in \mathbb{R}^{N_{n+1}}$ are the input and bias vectors of the output layer. The objective of the network introduced so far is to train the weight matrices and bias vectors such that the outputs of the network $a^{(n+1)}(t)$ allow the control of the closure level of power, tripodal and ulnar grasping motions of the robot hand. To this purpose, the network training can be performed through the scaled conjugate gradient back-propagation algorithm with mean squared error (MSE) as loss function, using as training set the matrix $E_T$ previously introduced in (15), and as label the target outputs $T = [\tau_1\ \tau_2\ \tau_3]^T$ defined as described in (16). Once the training is carried out, the control of the robot hand is realized by imposing, in (1), the vector of grasp synergy activations obtained from the network output $a^{(n+1)}(t)$, as in (19).
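The assembly of the self-supervised training pair and the network structure described above can be sketched with NumPy. This is a minimal sketch under stated assumptions, not the released implementation: the calibration data, closure levels and maximum-closure activations are hypothetical placeholders, the layer sizes are arbitrary, and plain gradient descent is used in place of the scaled conjugate gradient algorithm named in the text.

```python
import numpy as np

rng = np.random.default_rng(0)
grasps = ["PW", "TR", "UL"]
d = {"PW": 40, "TR": 30, "UL": 50}

# Hypothetical stand-ins for calibration data, closure levels and activations.
E = {g: rng.random((8, d[g])) for g in grasps}
sigma_hat = {g: 0.5 * (1.0 - np.cos(np.linspace(0.0, 2.0 * np.pi, d[g])))
             for g in grasps}
alpha = {"PW": np.array([0.9, 0.1, 0.1]),
         "TR": np.array([0.1, 0.9, 0.1]),
         "UL": np.array([0.1, 0.1, 0.9])}

# Training set and labels: per-grasp blocks concatenated along the sample axis.
E_T = np.hstack([E[g] for g in grasps])                            # 8 x 120
T = np.hstack([np.outer(alpha[g], sigma_hat[g]) for g in grasps])  # 3 x 120


def relu(z):
    return np.maximum(z, 0.0)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def init_params(sizes, rng):
    """One (weight matrix, bias vector) pair per layer."""
    return [(rng.normal(0.0, 0.1, (m, k)), np.zeros((m, 1)))
            for k, m in zip(sizes[:-1], sizes[1:])]


def forward(params, X):
    """ReLU hidden layers followed by a sigmoid output layer."""
    a = X
    for G, b in params[:-1]:
        a = relu(G @ a + b)
    G, b = params[-1]
    return sigmoid(G @ a + b)


def mse(params, X, T):
    return float(np.mean((forward(params, X) - T) ** 2))


def train_step(params, X, T, lr=0.5):
    """One gradient-descent step on the MSE (assumption: not the SCG algorithm)."""
    acts = [X]
    for G, b in params[:-1]:
        acts.append(relu(G @ acts[-1] + b))
    G, b = params[-1]
    y = sigmoid(G @ acts[-1] + b)
    delta = 2.0 * (y - T) / T.size * y * (1.0 - y)
    updated = []
    for (G, b), a_prev in zip(reversed(params), reversed(acts)):
        grad_G = delta @ a_prev.T
        grad_b = delta.sum(axis=1, keepdims=True)
        delta = (G.T @ delta) * (a_prev > 0)  # back-propagate through ReLU
        updated.append((G - lr * grad_G, b - lr * grad_b))
    return updated[::-1]


params = init_params([8, 16, 16, 3], rng)
loss_before = mse(params, E_T, T)
for _ in range(200):
    params = train_step(params, E_T, T)
loss_after = mse(params, E_T, T)
alpha_online = forward(params, E_T[:, :1])  # network output for one sample
```

The key point is that `T` is built entirely from the calibration recordings and the robot hand configurations, so no manual instant-by-instant labeling enters the training pair.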

A. Subjects
We recruited 10 healthy subjects (age: 30 ± 4.2, right-handed: 9 subjects, left-handed: 1 subject). Five subjects were involved in an offline experimental session (see Section III-B), for systematic evaluations of the proposed self-supervised regression, in order to perform comparisons with the single-NMF and concatenated-NMF approaches, and considering three different simulated robot hands (refer to Section II-A2). The other five subjects were involved in an online experimental session (see Section III-C), in which they were required to control online the real AR10 Robot Hand (see Fig. 5) in several grasp-transportation-release tasks. The two groups of subjects were formed randomly. None of the subjects had previous experience with the system and sEMG. Experiments were conducted in compliance with the Declaration of Helsinki, and participants signed an informed consent form.

B. Offline Experimental Session
The involved subjects were asked to repeat the following sequence six times (while sEMG signals were recorded): continuous modulation of the power grasp from the minimum closure level (open hand) to the maximum closure level and then back to the open hand, followed by the same continuous modulation of the tripodal and ulnar grasps. Note that the motion of the reference virtual hand on the screen was exploited to retrieve the (simulated) robot hand synergy activation references for the offline systematic evaluations and comparisons, see Fig. 3, as reported in the following section.
1) Systematic Evaluation of Regression Methods: Three nested cross-validations (nCV) were carried out independently for each of the considered regression methods, i.e. single-NMF, concatenated-NMF and the proposed self-supervised regression, using the sEMG datasets obtained as explained in the previous subsection. The nCV was performed in a subject-specific paradigm, meaning that the nCV carried out for each regression method was separately conducted for each of the subjects. The nCV was designed to minimize biases and/or artificial overestimations of the results, and was composed of the following steps: i) the sEMG dataset, constituted by 6 repetitions of the power-tripodal-ulnar motion, was partitioned in 6 different combinations of a subset of 5 motion repetitions plus a subset of 1 motion repetition; ii) for each of these partitions, the subset of 5 motion repetitions was used to perform a 5-fold CV (with each fold containing one power-tripodal-ulnar motion repetition) to conduct a grid-search for the selection of the regression model hyper-parameters, obtaining a total of 5 trained regression models. Therefore, this step constituted the inner nested loop of the nCV; iii) thereafter, for each of the dataset partitions, the performance of each of the 5 models trained in the inner nested loop was evaluated on the remaining external subset containing 1 motion repetition (i.e., the test set), resulting in 5 different model evaluations. The metric used to evaluate this performance was the Dynamic Time Warping (DTW) similarity measure between the model outputs and the reference synergy activations (see previous Section III-A and Fig. 3), since it is a distance measure particularly suited for sEMG-based s/p control [5]; iv) this evaluation of the 5 "inner" models on the external test set was repeated 6 times, once for each of the 6 partitions of the dataset, constituting the outer loop of the nCV; v) once outer and inner loop iterations were completed for a specific regression method and subject, a total of 30 DTW evaluations of as many trained models were available. Finally, the mean value over these 30 DTW measures was computed, constituting the result of the nCV for a specific regression method and subject. The nCV was therefore exploited for the evaluation of the single-NMF, concatenated-NMF and proposed self-supervised regression approaches by adequately arranging the offline sEMG dataset in accordance with Section II-B1 (single-NMF), Section II-B2 (concatenated-NMF) and Section II-C (self-supervised regression). The nCV and DNN code has been released at the repository: https://bit.ly/3BbJDZ8.
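The structure of the nCV (6 outer partitions, 5 inner folds, 30 DTW evaluations) can be sketched as follows. This is an illustrative sketch and not the released code: `fit` and `predict` are hypothetical callables standing for the hyper-parameter grid-search plus training and for the regression model, and the DTW shown is the classic dynamic-programming formulation with an absolute-difference cost on 1-D sequences, which may differ from the exact variant used in the study.

```python
import numpy as np


def dtw_distance(x, y):
    """Classic DTW between two 1-D sequences with absolute-difference cost."""
    n, m = len(x), len(y)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(x[i - 1] - y[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return float(D[n, m])


def nested_cv_splits(n_reps=6):
    """Yield (test_rep, val_rep, train_reps) for every outer/inner combination."""
    for test_rep in range(n_reps):                     # outer loop
        inner = [r for r in range(n_reps) if r != test_rep]
        for val_rep in inner:                          # inner 5-fold loop
            train_reps = [r for r in inner if r != val_rep]
            yield test_rep, val_rep, train_reps


def run_ncv(reps, fit, predict):
    """reps: list of (emg, reference) pairs, one per motion repetition.

    fit(train, val) is a hypothetical routine that selects hyper-parameters on
    the validation repetition and returns a trained model; predict(model, emg)
    returns the estimated activation sequence.
    """
    scores = []
    for test_rep, val_rep, train_reps in nested_cv_splits(len(reps)):
        model = fit([reps[r] for r in train_reps], reps[val_rep])
        emg, reference = reps[test_rep]
        scores.append(dtw_distance(predict(model, emg), reference))
    return float(np.mean(scores)), len(scores)
```

With six repetitions, `nested_cv_splits` yields exactly 30 combinations, and `run_ncv` returns the mean over the corresponding 30 DTW values, i.e. the per-subject, per-method result described in step v).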
2) Results of the Offline Systematic Evaluation: In Fig. 3, it is possible to observe how the DTW-aligned [5] robot hand synergy activations estimated by the single-NMF approach resembled the reference synergy activations with high fidelity. On the other hand, by looking at the middle row of Fig. 3, it can be clearly observed that the estimated values followed the references very poorly. This behaviour confirms the critical drop in regression performance of the concatenated-NMF multi-grasp regression. Conversely, in the bottom row of Fig. 3, the proposed self-supervised regression approach follows the references of all grasps very well, showing that it is able to overcome the performance degradation of concatenated-NMF. The single subject results are also confirmed by the results obtained considering all the subjects, reported in the boxplot of Fig. 4(a). Incidentally, it is important to note that the single-NMF is a regression approach that operates only with single grasps, and therefore it cannot be compared online with other regression methods able to deal with multiple grasping motions.

Statistical analysis of offline results: A two-way repeated measure Analysis of Variance (ANOVA) was carried out on the results reported in Fig. 4(a) for the factors Grasp Type and Regression Method. The statistical significance was set to p < .05. The ANOVA revealed a statistically significant influence of Regression Method (F(2, 36) = 76.85, p < .001), whereas no significant influence was reported for the Grasp Type and the factor interaction. Therefore, a Tukey Test was performed for pairwise comparisons, revealing a statistically significant difference only between the concatenated-NMF approach and the other two approaches. This demonstrates that the self-supervised regression performed statistically significantly better than the concatenated-NMF.
We then performed a further statistical evaluation of the offline experimental session, taking into consideration three different simulated robot hands: (i) a ROS-based simulator of the AR10, (ii) a SynGrasp-based simulator of the UBHand (anthropomorphic robot hand), and (iii) a SynGrasp-based simulator inspired by the Barrett Hand (3-fingered robotic gripper). In this case, the mean value of the DTW measure obtained for each subject from the nCV, namely $m_{\mathrm{DTW}}$, was normalized to make the results comparable, computing the DTW error ratio $e_{\mathrm{DTW,ratio}}$ as
$$e_{\mathrm{DTW,ratio}} = \frac{m_{\mathrm{DTW}}}{z_{\mathrm{DTW}}}, \tag{20}$$
where $z_{\mathrm{DTW}}$ is the DTW measure computed between the reference synergy activations of a specific grasping motion and the synergy activations corresponding to constantly keeping the open hand configuration. Fig. 4(b) reports the boxplot of the DTW error ratios for all subjects, on which a two-way repeated measure ANOVA was conducted, investigating the factors Robot Hand Type and Grasp Type. Statistical significance was set to p < .05. The ANOVA revealed no statistically significant influence for both Robot Hand Type and Grasp Type factors, as well as for the factor interaction (Fig. 4(b)). This demonstrates that the positive performance of the proposed sEMG-based self-supervised regression was not biased toward a specific robot hand kinematic structure.

C. Online Experimental Session
1) Grasp-Transportation-Release Task: Each subject was instructed to put on the wearable setup shown in Fig. 5(a) and perform two repetitions of the power, tripodal and ulnar opening/closing grasping motions while the sEMG data was acquired in order to train the self-supervised DNN according to Section II-C. Thereafter, based on the specific setup shown in Fig. 5(b), the subjects were required to: (i) grasp three different objects, one at a time, from the right side; (ii) transport the objects towards the opposite side, passing over a 20 cm wall; (iii) release each object inside the correct placeholder-box. The objects were a small rigid box (7×7×19 cm), a plastic cylinder (5×5×25 cm) and a soft rubber ball (8×8×8 cm), which necessarily required the use of ulnar, power and tripodal grasps, respectively. The task was repeated 5 times by each subject with a random order of the objects.
2) Results of the Grasp-Transportation-Release Task: In Fig. 6, a photo sequence of the online grasping task is reported. Fig. 7(a) reports the modulation, by means of the proposed self-supervised regression, of the trajectory in the synergy activations subspace. In particular, the green, blue, red and yellow black-edged circles represent the nominal open hand, power, tripodal and ulnar robot hand configurations. Therefore, it is possible to appreciate in Fig. 7(a) how the subject naturally moved to the neighborhoods of the power, tripodal and ulnar grasp configurations. We also report in Fig. 7(b) the synergy activations along the time axis. Considering aggregated results, Fig. 7(c) reports, for each subject, the mean value of the Euclidean distance between the robot hand synergy activations modulated during the object grasping phases (i.e., between $t_1$-$t_2$, $t_3$-$t_4$ and $t_5$-$t_6$ in Fig. 7(c)) and the respective nominal grasp in the synergy activations subspace, computed over the five task repetitions, and grouped for the different types of grasping and distances from nominal grasps.
Statistical analysis of online results: A two-way repeated measure ANOVA was performed on the results reported in Fig. 7(c) for the factors Online Object Grasping Type and Distance From Nominal Grasp Type. Statistical significance was set to p < .05. The ANOVA revealed a statistically significant influence of the interaction between Distance From Nominal Grasp Type and Online Object Grasping Type (F(4, 33) = 55.76, p < .001), whereas no significant influence of the single factors was reported. A Tukey Test was performed for pairwise comparisons, revealing that for all subjects the distance from the correct nominal grasps was statistically significantly lower than the distance from the other nominal grasps.
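The distance measure underlying Fig. 7(c) can be sketched as the mean Euclidean distance, in the synergy activations subspace, between the activations recorded during a grasping phase and each nominal grasp. The sketch below uses hypothetical nominal activations and a synthetic trajectory; the names and values are assumptions made for illustration only.

```python
import numpy as np

# Hypothetical nominal synergy activations (not the values used in the study).
nominal = {
    "OH": np.array([0.0, 0.0, 0.0]),
    "PW": np.array([0.9, 0.1, 0.1]),
    "TR": np.array([0.1, 0.9, 0.1]),
    "UL": np.array([0.1, 0.1, 0.9]),
}


def mean_distances(alpha_traj, nominal):
    """alpha_traj: (samples, 3) activations recorded during one grasping phase."""
    return {name: float(np.mean(np.linalg.norm(alpha_traj - a, axis=1)))
            for name, a in nominal.items()}


# Synthetic grasping phase: activations scattered around the tripodal grasp.
rng = np.random.default_rng(0)
grasp_phase = nominal["TR"] + 0.05 * rng.standard_normal((200, 3))

dist = mean_distances(grasp_phase, nominal)
closest = min(dist, key=dist.get)  # nominal grasp with the smallest mean distance
```

A reliable online regression should give, for each grasped object, the smallest mean distance to the nominal grasp required by that object, which is the effect tested by the ANOVA above.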

Implications of the work: The proposed self-supervised method removes the burden of labeling sEMG data, which is a complex and frustrating procedure. This study also demonstrates the viability of the proposed method for robot hands with dissimilar kinematic structures. Furthermore, our approach also implies a more reliable and natural sEMG-based multi-grasp control of robot hands with respect to previously proposed unsupervised regression methods, which are mainly limited to the decoding of wrist motions.
Comparison to existing literature: We first consider supervised regression approaches, since they constitute a benchmark of regression performance in the literature. To this aim, we select three remarkable studies reporting the following metrics of similarity between predictions and ground truth: in [21], an $R^2$ value within the range of 0.8-0.9 was reported using a method based on Kernel Ridge Regression (KRR); in [22], a normalized mean squared error (nMSE) within the range of 0.2-0.3 was obtained employing KRR; and, finally, in [23], a root mean squared error (RMSE) within the range of 0.07-0.08 was reported using Long Short-Term Memory (LSTM) neural networks. For comparison purposes, we report that the results shown in Fig. 4(a) correspond to an $R^2$ value, nMSE, and RMSE of 0.8423, 0.1355, and 0.0735, respectively, averaged over all the subjects, grasp types and repetitions. Therefore, the outcomes achieved through the proposed self-supervised method approach the levels of error observed in the selected literature works. Secondly, for literature works using unsupervised/semi-supervised methods, a qualitative comparison is reported, due to the intrinsic lack of a ground truth for metrics comparison. We therefore report in Table I the features of three selected sEMG-based unsupervised regression approaches for robot hand control, along with our proposed method. As can be seen in the table, the proposed self-supervised approach is the only one providing nonlinear fitting capabilities, which makes it possible to actually decode more complex actions like multiple grasping motions.
Possible limitations: Firstly, the variability of the sEMG signals due to long-time usage of the system was not considered in the context of this specific study. Indeed, the effects on the sEMG of aspects like muscle fatigue, mental tiredness, skin perspiration and even limb/body postures could be accounted for by extending the proposed HRi to an adaptive version capable of retraining with new sEMG data in an incremental fashion. Secondly, experiments were conducted only on healthy subjects, without involving amputees. Indeed, amputees typically show different muscle activation patterns and residual limb characteristics, and therefore a dedicated evaluation of the accuracy of the proposed self-supervised regression method is needed. Additionally, for experiments involving amputees, it will be crucial to conduct a proper clinical validation to test the required safety, reliability, and performance standards.
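The three similarity metrics quoted in the comparison to existing literature ($R^2$, nMSE and RMSE) can be computed as in the sketch below. The normalization of the nMSE has more than one convention in the literature; here the MSE is divided by the variance of the ground truth, which is an assumption and may not match the definition used in the cited works or for the values reported above.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean squared error."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def nmse(y_true, y_pred):
    """MSE normalized by the variance of the ground truth (assumed convention)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean((y_true - y_pred) ** 2) / np.var(y_true))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)


# Small worked example with a single wrong sample.
reference = [0.0, 1.0, 2.0, 3.0]
estimate = [0.0, 1.0, 2.0, 4.0]
metrics = (r2(reference, estimate), nmse(reference, estimate), rmse(reference, estimate))
```

With this convention $R^2 = 1 - \mathrm{nMSE}$ on a single sequence; values averaged separately over subjects, grasp types and repetitions need not satisfy this identity.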

V. CONCLUSION
The present study investigated the development of sEMG-based control strategies for robot hands avoiding explicit labeling procedures, by combining NMF with DNN for self-supervised regression of sEMG signals. The study reports experiments with 10 healthy subjects, and demonstrates the effectiveness of the proposed approach in both offline evaluations and online real-time control of a wearable anthropomorphic robot hand. The results show high reliability over repeated grasp-transportation-release tasks with different objects. Overall, this work contributes to the development of more advanced and effective sEMG-based control strategies in telemanipulation, prosthetics and teaching by demonstration scenarios.

Fig. 1. Schematic representation of the sEMG-based robot hand multi-grasp control realized in this study.

Fig. 4. (a) Mean DTW measure between reference and estimated AR10 synergy activations for all subjects. "*" indicates statistically significant difference. (b) DTW error ratio (see (20)) for all subjects, grouped by type of grasp and simulated robot hand.

Fig. 7. Single subject online results for one grasp-transportation-release task. (a) Grasp modulation in the synergy activation subspace (1st, 2nd and 3rd synergy activations correspond to $\alpha_1$, $\alpha_2$ and $\alpha_3$ in (1)). (b) Synergy activations plotted along time. (c) Mean value of the Euclidean distance from nominal grasps in the synergy activations subspace, for all subjects. "*" indicates statistically significant difference.

TABLE I. FEATURE COMPARISON BETWEEN PROPOSED METHOD AND REPRESENTATIVE SEMG-BASED UNSUPERVISED REGRESSION APPROACHES