Decoding Multi-DoF Movements Using a CST-Based Force Generation Model With Single-DoF Training

Recent developments in dexterous myoelectric prosthetics have established a hardware base for human-machine interfaces. Although pattern recognition techniques have seen successful deployment in gesture classification, their applications remain largely confined to certain specific discrete gestures. Addressing complex daily tasks demands an immediate need for precise simultaneous and proportional control (SPC) for multiple degrees of freedom (DoFs) movements. In this paper, we introduce an SPC approach for multi-DoF wrist movements using the cumulative spike trains (CSTs) of motor unit pools, merely leveraging single-DoF training. The efficacy of our proposed approach was validated offline against existing methods respectively based on non-negative matrix factorization and motor unit spike trains, using experimental data. The experimental process includes both single-DoF (for training) and multi-DoF (for testing) movements. We evaluated the performance using Pearson correlation coefficient (R) and the normalized root mean square error (nRMSE). The results reveal that our method outperforms comparative approaches in force estimation for both testing datasets (3 and 4). On average, for dataset 3, R and nRMSE of the flexion/extension DoF (the pronation/supination DoF) are 0.923±0.037 (0.901±0.040) and 12.3±3.1% (12.9±2.2%); similarly, those of dataset 4 are 0.865±0.057 (0.837±0.053) and 14.9±2.9% (15.4±2.0%), respectively. The outcomes demonstrate the effectiveness of our method in simultaneous and proportional force estimation for multi-DoF wrist movements, showing a promising potential as a neural-machine interface for SPC of dexterous myoelectric prostheses.


I. INTRODUCTION
D EXTEROUS myoelectric prostheses have significantly bridged the gap, mimicking the degrees of freedom (DoFs) offered by human hands [1], [2].However, the practical applications of prosthetic hands are hindered due to lacking of a robust human-machine interface with intuitive and flexible control systems [3].In the past decades, surface electromyogram (sEMG) played an essential role in establishing control interfaces because of its convenience and non-invasiveness [4].The sEMG signals are regarded as the summation of motor unit action potentials (MUAPs) containing the neural control information from the central nervous system (CNS) [5].Therefore, it has been widely attempted to establish intuitive neural control interfaces using sEMG signals.
Despite the successful applications of pattern recognition techniques, users expect controls to extend beyond a discrete number of gestures [6], [7], [8].They prefer simultaneous and proportional control (SPC) with multi-DoF akin to human hands.Classic machine learning-based [9], [10] and deep learning-based [11], [12] methods were introduced to capture the nonlinearity of the control system.However, these methods require a large training set that includes all possible task patterns [9], [10], [11], [12].Consequently, the training phase becomes extensive and cumbersome, posing a significant hurdle to the practical application of prosthetics [4], [13].This arduous and time-consuming training process significantly increases user's burden.
To address these issues, several SPC methods with less training data and convenient protocols were proposed.Ning Jiang et al. have presented a DoF-wise non-negative matrix factorization (NMF) algorithm, aiming to extract the continuous neural control information, assuming a linear and instantaneous mixture between muscle activation and the mean square values (MSVs) of sEMG signals [14].However, the method appears sensitive to surface crosstalk and necessitates selective electrode placement.Another notable strategy involves employing motor unit spike trains (MUSTs), decomposed from high-density sEMG (HD-sEMG) with hundreds of channels.A MUST is the firing time sequence of a recruited motor unit (MU), representing the fundamental functional unit of neuromuscular system [15].The MUSTs can be partially identified by blind source separation (BSS) algorithms [16], [17], [18] and classified into distinct MU pools by cross correlation, based on the motor neuron synergy perspective [19], [20], [21].This perspective considers that the CNS modulates muscle contractions through common input which can be estimated through the discharge rate (DR) of the cumulative spike train (CST) of the recruited MUs in an MU pool [22], [23], [24].Due to the strong linear relationship between muscle force and the DR of the CST [24], [25], the MUST-based method can achieve accurate and intuitive simultaneous and proportional force estimation.However, there are limitations related to the number of reliable MUs decomposed by BSS algorithms, making it challenging for MUST-based methods to classify more than three MU pools.Furthermore, it remains uncertain whether each pool contains an adequate number (> 5) of MUs, leading to unstable estimation of multi-DoF movements using these methods [26], [27].
In this paper, we propose an algorithm to achieve precise SPC for multiple DoFs while minimizing the training dataset through single-DoF training.We employ a spatial spike detection (SSD) method to extract the CSTs of adjacent electrodes, instead of decomposing accurate single MUSTs.The method directly estimate the CST of MUs under the coverage of a single electrode, deemed as the CST of channel (CSToC).This approximation can obtain more MU firing information than MU decomposition method with acceptable potential estimation errors [28].Inspired by the NMF, we propose a non-negative matrix optimization method to cluster the CSToCs into MU pools, differing from the typical clustering method of MUST-based approach.Subsequently, the DR of the CST in an MU pool can be obtained by summing up the spike trains in the corresponding MU pool within a sliding time window.Finally, the DRs of MU pools were projected to multi-DoF forces using a multiple linear regression (MLR) model.The efficacy of the proposed method was validated against NMF-based and MUST-based methods.An MLR method based on root mean square (RMS) was also involved as a baseline.We trained all the methods on single-DoF data and tested them on multi-DoF data in an offline environment.

A. Subjects
The experiment involved twelve non-disabled participants (nine males, three females, all right-handed, aged 24-34 years) without any previous neurological or muscular disorders.All of them were recruited from Shanghai Jiao Tong University and were presented with an informed consent form.The experimental protocols were approved by the local ethics committee of Shanghai Jiao Tong University (approval number: B2020026I), conforming to the principles of the Declaration of Helsinki.

B. Data Acquisition & Preprocessing
Figure 1(a) illustrates the experimental setup.All subjects were instructed to perform isometric contractions of the wrist

TABLE I DESCRIPTION OF MOTOR TASKS IN THE EXPERIMENT
according to the graphical user interface (GUI).A home-made torque measurement device recorded the force at a frequency of 1000 Hz, which was then relayed as real-time feedback to the GUI.The HD-sEMG signals were sampled at a frequency of 2048 Hz from both the dorsal and the ventral side of the forearm, using three grids of 64 channels electrode arrays (8 × 8, GR10MM0808, OT Bioelettronica, Italy) with 10 mm interelectrode distance (IED) and 4 mm electrode diameter.
The experimental protocol comprised four sessions as detailed in Table I.The movements include wrist flexion (Fle.), wrist extension (Ext.), pronation (Pro.), supination (Sup.) and their combinations.The level of excitation varies from 0 to 30% of the maximum voluntary contraction (MVC).Datasets 1 and 2, consisting only of single-DoF tasks (top panel of Fig. 1(b)) were used for training, whereas

C. Algorithm Description
Throughout this section, bold letters denote matrices, bold italic letters denote vectors, while italics denote scalars.The notation with a hat symbol signifies an estimated value.
1) Feature Extraction: A CST estimation method was employed to extract the CST of MUs under each channel [28].The firing spikes of activated MUs were allocated to the closest regions of channels through the SSD method.Consequently, the CSToC features of all 192 channels were obtained.The whitening matrix W and the spike detection threshold of the mth channel β m were calculated and retained based on training data.
2) Force Generation Model: Figure 2 presents an overview of the motor neuron synergy view and the force generation model for the force prediction.The force of K-DoF movements in each sample t is defined as: where F k stands for the force function of the kth DoF.F(t) can be factorized as follows: where S = (s kp ) K ×P is a K × P synergy matrix and s kp denotes the contribution of the pth MU pool in the kth DoF.
T contains the forces of MU pools at time t.Given that the CNS modulates muscle excitation levels through varying the recruitment and the DR of the recruited MUs in an MU pool, a strong linear relationship exists between muscle force and the cumulative DR of all recruited MUs in the corresponding MU pool.Then f p can be modelled as follows: where a pn indicates the extent of participation of the nth MU in the pth MU pool, and d pn is the DR of the nth MU in the pth MU pool.Equation (3) can be expressed in matrix form: where T comprises consecutive samples of the DRs of all N MUs.Since identifying all recruited MUs is challenging, we employed the SSD method to directly estimate the DRs of the CSToCs [28].We can then estimate the force function of MU pools on the tth sample as follows: where A = (a pm ) P×M symbolizes a P × M non-negative mixing matrix, and T comprises all the estimated DRs of the CSToCs close to different channels.dm is calculated as: where ŝm (t) consists only of 0 and 1. ŝm (t) = 1 denotes an identified spike on the tth sample on the mth channel.R is the duration of the sliding time window to calculate the DRs. Figure 3 shows the representative results of the SSD method.Substituting (5) into (2) produces: Authorized licensed use limited to the terms of the applicable license agreement with IEEE.Restrictions apply.where F(t) stands for the estimation of F(t) in (1).S and A are the mixing matrices, which need to be determined during the training phase.As the input feature of the model, D(t) is calculated using the SSD method.
3) Optimization of Mixing Matrices: To estimate S and A, we assume that only one functional group is activated during single-DoF movement.Subsequently, the recorded torque of the activated DoF can be used as an approximation for the output of an MU pool.This assumption aligns with the principles of DoF-wise NMF [14].Define F + k (t) and F − k (t) as where F k is the kth DoF of force function defined in (1).And rewrite f(t) as Then the estimation of A becomes an optimization problem by solving min where f , A and D are non-negative matrices.As D is obtained by the SSD method, this optimization problem can be approached as the NMF with a fixed right matrix.An iterative algorithm is used.With A being randomly initialized, its elements update in the hth iteration step as follows [29]: When max(|(a h pm − a h−1 pm )/a h−1 pm |) < ε or reaching maximum iterations, the iteration concludes.To avoid overfitting, we added an additional constraint.The pth row of A h is arranged in descending order: The upper limit of a h pm is set as: where K M represents the expected number of dominant channels in an MU pool.If a h pm > thr p , a h pm in (11) will be set to thr p .Thus, the matrix A can be estimated, and f is recomputed using equation (5).Then S in (7) can be calculated through MLR model using the least-square method: S and A are calculated and retained by training data (datasets 1 and 2), while for testing data (datasets 3 and 4), the force is estimated by (7). Figure 4 illustrates this training and testing procedure.

D. Comparative Methods
All the algorithms were trained on single-DoF data (datasets 1 and 2), and tested on multi-DoF data (datasets 3 and 4).The length of the sliding time window (R in ( 6)) for feature extraction was the same for all methods.
1) EMG-Based Method: The EMG-based method, serving as the baseline, extracted the RMS features from all 192 channels of the filtered sEMG.MLR model then projected these features to multi-DoF forces.
2) NMF-Based Method: The NMF method, a commonly used single-DoF training algorithm, was applied [14].The MSV was calculated for NMF as follows: where z m and x m represent the MSV and the filtered sEMG signal of the mth channel, respectively.
3) MUST-Based Method: All training data was decomposed to MUSTs by an improved CKC method [18].MUs with pulseto-noise ratio (PNR) higher than 25 and the coefficient of variation (CoV) for the interspike interval less than 50% were discarded [30].The DR of MUST was obtained by: where d MU n (t) is the DR function of the nth identified MU. s MU n (t) = 1 represents a firing spike of the nth MU on the tth sample.The MLR model was utilized to project the DRs of all MUSTs onto the recorded multi-DoF forces.

E. Performance Metrics
The Pearson correlation coefficient (R) and the normalized root mean square error (nRMSE) were employed to evaluate accuracy of methods.The R and nRMSE of testing results were calculated as follows: where F and F are the estimated and recorded force, respectively.T is the length of samples.We performed a one-way ANOVA to determine the statistical significance of the results obtained from CSToC against those from other methods, using SPSS Statistics 26.The level of significance was set to p = 0.05.All methods were programmed and analyzed using Matlab (R2020b) on a PC with Intel Core i7-8700 CPU and 16 GB of RAM.

III. RESULTS
All three training data trials were used to decompose MUSTs for MUST-based method, while the other algorithms employed a three-fold cross-validation method for training.The length R of sliding window was set to 300 ms for all algorithms.For the CSToC method, the impact of the parameter K M in ( 13) was evaluated on dataset 3. The values of K M were set to K M = 5, . . ., 40 to compare the results of R and nRMSE, as depicted in Fig. 5. Notably, an increase in K M improved estimation accuracy for K M < 30, but stabilized for K M > 30.Therefore, K M = 30 was utilized for subsequent tests and comparisons.
The estimation results on training data are shown in Fig. 6.Although all methods performed well on the training datasets, their estimation accuracy decreased in the testing phase.The comparisons of simultaneous and proportional force estimation accuracy of test data are presented in Fig. 7.The results show that the CSToC method obtained the best performance of force estimation with averaging values of R = 0.923 ± 0.037 (R = 0.901±0.040)and n R M S E = 12.3±3.1% (n R M S E = 12.9±2.2%)for DoF 1 (DoF 2) on dataset 3.And on dataset 4, the results were R = 0.865 ± 0.057 (R = 0.837 ± 0.053) and n R M S E = 14.9 ± 2.9% (n R M S E = 15.4 ± 2.0%) for DoF 1 (DoF 2).The MU decomposition results for the MUST-based method can be found in Table II.Moreover, the iteration results of the non-negative mixing matrix A in (7) are depicted in Fig. 8(a).To provide a coherent overview of the Authorized licensed use limited to the terms of the applicable license agreement with IEEE.Restrictions apply.results, an average matrix, representative of all twelve subjects, is depicted in Fig. 8(b).The spatial arrangement of electrodes is illustrated in Fig. 8(c).Lastly, to facilitate the interpretation of MU pools, a correlation matrix was employed on the DRs  of K M channels with the highest a pm in each MU pool per subject, as illustrated in Fig. 9.

IV. DISCUSSION
This study reports a simultaneous and proportional force estimation method of multi-DoF wrist movements, employing only single-DoF training.The proposed approach outperforms its counterparts, and effectively extracts MU pool characteristics from single-DoF data.Informed by the perspective of motor neuron synergy, the derived force estimation structure utilizes the SSD technique to approximate the CSToC features.Subsequently, the DRs of CSToCs are projected onto the MU pools via a non-negative mixing matrix, A, optimized through an iterative optimization process.S is determined by MLR model, mapping outcomes of MU pools to muscle forces.The resulting matrices S and A prove effective in estimating multi-DoF forces.

A. Number of Dominant Channels
When evaluating CSToC performance for varying dominant channel numbers K M , ranging from 5 to 40, it was noted that the method's accuracy had an increase trend with K M when K M ≤ 30.However, results variance became inconsistent when K M exceeded 30.Intuitively, the increased number of dominant channels reduces each individual channel's contribution to the force estimation, resulting in more robust estimations that remain unaffected by noise on selected channels.Similar conclusions have been reported by MUST-based methods in earlier studies [30], [31], [32].K M 's upper limit is predominantly constrained by muscular size and IED of electrodes.Hence, the optimized result of K M in this paper is specifically applicable to HD-sEMG signals with IED = 10 mm only when recorded from the forearms of non-disabled individuals.

B. Comparisons on Estimation Accuracy
The proposed CSToC method provides the best estimation accuracy on both testing datasets as illustrated in Fig. 7.It implies that our method successfully extracts the motor neuron synergy features from single-DoF movements.Dataset 4 has marginally lower accuracy than dataset 3 due to the former's more complex motion states.
The NMF-based method does not yield better results than the baseline (the RMS-based method) on our datasets, although it is considered a feasible single-DoF training method.The classic NMF method blindly solve the factorization problem.The DoF-wise NMF [14], along with its enhanced version [33], constrains the right factor of NMF to obtain improved muscle synergy features.In our study, we selected the most representative DoF-wise NMF for comparison.However, it primarily applies to the sEMG signals with a limited number of channels (typically less than 15) [14], [34], [35].Therefore, it lacks constraints on the left factor, whose dimensionality increases with the number of channels.As the dimensionality expands, it becomes challenging to generate a sufficiently sparse factorization of the left factor, leading to difficulties in preventing overfitting.This can potentially result in a decline in the performance of the NMF method.
MUST-based methods typically employ classification methods to assign MUs to different MU pools based on correlations [21], [36].As the number of classes (four movements in our study) increases, these classification methods tend to become unstable.The limited number of reliably decomposed MUs cannot assure a sufficient number of MUs in each MU pool, leading to unreliable estimates if an MU pool contains fewer than five MUs [26], [27].Therefore, we employed an MUST-LR method as a representative Authorized licensed use limited to the terms of the applicable license agreement with IEEE.Restrictions apply.
MUST-based method [19].This method is effective when multiple-DoF data constitutes the training datasets, but portrays a substantial deterioration when trained with single-DoF data.This is mainly because the MLR model cannot accurately determine the contribution of each MUST, which can easily cause deviation in force amplitude.
To address the issue of unstable classification with an upsurge in the number of MU pools, we applied an SSD method to extract CSToC features with equal channel numbers.Then, a non-negative matrix A, inspired by the NMF method, was used to classify these features into different MU pools.Unlike the NMF method, we used the matrices f and D in (5) to control the iteration outcomes, ensuring the convergence.And an additional constraint is used to avoid overfitting as described in (12) and (13).Figure 8 illustrates that the results of A from different subjects converge to the zones associated with corresponding muscles, confirming our method's efficacy in acquiring effective synergy features from single-DoF training data.

C. Motor Neuron Synergy
The channel correlation matrices for the MU pools are displayed in Fig. 9, with the color map representing the strength of the correlation coefficient.Human movements are controlled by the combinations of muscle synergies, and muscles are innervated by MU pools, wherein the MUs receive the same neural control signal named the common input [37], [38].As a result, there is high linear correlation amongst the DRs of CSToC in the same MU pool [39].In the traditional muscle synergy view, the entire MU pool receives a set of common inputs with fixed weights [40].However, the recently proposed motor neuron synergy view maintains that common inputs are projected to functional groups of the same or different muscles [22].Although the MU pool classification results ensure high intra class correlation, channels with high correlation between MU pools exist.This suggests that an MU pool can receive a common input set and a common input can also innervate more than one MU pools.Our optimization method did not impose any constraints on the adjacent positions of CSToC features in an MU pool.Thus it emerged that an MU pool innervated functional muscle groups of DoF 2 (pronation and supination) as seen in Fig. 7.These results are consistent with the motor neuron synergy view.We note that DoF 2 has more complex synergy with activated muscle groups.This complexity makes it challenging to estimate the corresponding elements in matrix A accurately, leading to an accuracy decrease for DoF 2 as shown in Fig. 7.

D. Limitations
Due to the precision and ease of torque measurement, this study is premised on the isometric contraction experiment.Nonetheless, in actual prosthetic limb control, the estimation of muscle contraction angles is also equal important.Similar to other MU-based methods, the performance of CSToC might deteriorate under dynamic contractions [41].The impact of dynamic contractions will be systematically investigated in the future.Note that this study was conducted offline.The proposed method can also be adapted into an online version by whitening filtered signals before extension.We have conducted initial verification that demonstrates its ability to fulfill the computational requirements for online application.For sEMG recorded using 192 channels and a sample rate of 2048 Hz, the average calculation time for data segments with a window length of 300 ms is 199.3 ± 13.3 ms.However, the online control performance still awaits verification.Lastly, the experiment comprised solely non-disabled participants.Future research on subjects with limb deficiency is requisite for enhancing the general clinical applicability.

V. CONCLUSION
We introduced a force generation model for the SPC of multi-DoF wrist movements, based on the principle of motor neuron synergy.Instead of identifying individual MUST, we employed CSToC features to estimate the DRs of MU pools directly.Furthermore, we proposed an optimization method to enhance the classification of MU pools, leading to a significantly improvement in reliability over conventional MU-based methods.The superior accuracy achieved from the application for multi-DoF wrist movements highlights the efficacy of our proposed method in estimating muscle force synchronously and proportionally.This technique could potentially provide an effective neural drive control method for human-machine interfacing applications.

Fig. 1 .
Fig. 1.Experimental setup.(a) The data acquisition platform comprises a visual feedback interface, torque transducers, and HD-sEMG recordings.(b) Illustrations of the recorded force from four datasets are provided.

Fig. 2 .
Fig. 2. Overview of the motor neuron synergy and the force generation model for simultaneous and proportional control implementation.

datasets 3 and 4 ,
which included combination movements (bottom panel of Fig. 1(b)), were implemented as testing data.In sessions 3 and 4, we designed various muscle contraction patterns to emulate daily life situations as closely as possible.Each task was carried out in three trials.The force signals were up-sampled to 2048 Hz to match the frequency of the HD-sEMG signals, followed by filtering through a 20 Hz low-pass filter.For the HD-sEMG signals, a bandpass filter with frequency of 20-500 Hz and a comb filter with a cutoff frequency of 50 Hz were applied.Channels with signal RMS values exceeding three times the average were considered outliers and subsequently set to zero.

Fig. 3 .
Fig. 3. Representative feature extraction results from single-DoF tasks.(a) Five out of 192 channels of the HD-sEMG signals.(b) The recorded force of two wrist DoFs.(c) The CSToCs supplied by SSD method.Each point along the horizontal line indicates one detected discharge from the same channel.

Fig. 4 .
Fig. 4.An example of training and testing procedures.(a) Results achieved from training with single-DoF tasks.(b) Estimated results on datasets 3 and 4.

Fig. 7 .
Fig. 7. Estimation performance of comparative methods on dataset 3 (a) and dataset 4 (b).R is presented in the top panels of (a) and (b), while nRMSE is presented in the bottom panels of (a) and (b).The results of DoF 1 (flexion and extension) are marked with the yellow background.In contrast, the results of DoF 2 (pronation and supination) are indicated with the blue background.Asterisk (*) denotes the significant difference between CSToC and other methods (p < 0.05).

Fig. 8 .
Fig. 8.The optimization results of MU pools (matrix A).(a) The normalized matrix A for four MU pools, trained using single-DoF data from all 12 subjects.(b) The mean matrix A across 12 subjects.(c) The spatial arrangement of each column of electrodes around the forearm.

Fig. 9 .
Fig. 9.The correlation matrices associated with K M (30) dominant channels in each MU pool.A cross-correlation function was applied on the DR of CST on each channel.Moreover, the color scale indicates the strength of the correlation, with values ranging from 0 to 1.