Estimation of Joint Torque by EMG-Driven Neuromusculoskeletal Models and LSTM Networks

Accurately predicting joint torque using wearable sensors is crucial for designing assist-as-needed exoskeleton controllers to assist muscle-generated torque and ensure successful task performance. In this paper, we estimated ankle dorsiflexion/plantarflexion, knee flexion/extension, hip flexion/extension, and hip abduction/adduction torques from electromyography (EMG) and kinematics during daily activities using neuromusculoskeletal (NMS) models and long short-term memory (LSTM) networks. The joint torque ground truth for model calibrating and training was obtained through inverse dynamics of captured motion data. A cluster approach that grouped movements based on characteristic similarity was implemented, and its ability to improve the estimation accuracy of both NMS and LSTM models was evaluated. We compared torque estimation accuracy of NMS and LSTM models in three cases: Pooled, Individual, and Clustered models. Pooled models used data from all 10 movements to calibrate or train one model, Individual models used data from each individual movement, and Clustered models used data from each cluster. Individual, Clustered and Pooled LSTM models all had relatively high joint torque estimation accuracy. Individual and Clustered NMS models had similarly good estimation performance whereas the Pooled model may be too generic to satisfy all movement patterns. While the cluster approach improved the estimation accuracy in NMS models in some movements, it made relatively little difference in the LSTM neural networks, which already had high estimation accuracy. Our study provides practical implications for designing assist-as-needed exoskeleton controllers by offering guidelines for selecting the appropriate model for different scenarios, and has potential to enhance the functionality of wearable exoskeletons and improve rehabilitation and assistance for individuals with motor disorders.


I. INTRODUCTION
EXOSKELETONS are investigated with increasing frequency due to their potential to facilitate movement for persons with disability, for instance, to facilitate neuromuscular rehabilitation in people with motor disorders [1]. Many recent studies have investigated exoskeletons' potential for stimulating neuromuscular recovery during neurorehabilitation training by operating in a feedback loop that incorporates users' biosignals, such as electromyography (EMG) [2], [3]. For example, Durandau et al. [4] developed an EMG-based controller that computes users' joint torque to determine the exoskeleton assistance level, enabling individuals with incomplete spinal cord injury or post-stroke impairment to voluntarily control the exoskeleton to follow target positions. Consequently, research on exoskeleton control strategies that integrate the user's torque capabilities continues to grow. The appropriate assistive torque is frequently predicted from the user's EMG signals, as they adapt to the user's muscle function and can encourage users' active participation, thereby facilitating motor recovery. This control strategy, often referred to as "assist-as-needed", must sufficiently assist the muscle-generated torque to ensure that the user successfully performs the intended task. Achieving an accurate prediction of the user's torque is the first step toward this goal.
To predict joint torque, neuromusculoskeletal (NMS) modelling and artificial neural networks (ANNs) are two common methods, and each has advantages and disadvantages [5], [6], [7]. Our recent study [8] was the first to compare the ankle joint torque estimation performance of a physics-based NMS model with that of a multilayer perceptron (MLP) model in gait and isokinetic movements. We highlighted the practicality of the NMS approach, which leverages domain knowledge to replicate the conversion process from EMGs to musculotendon forces and joint torque. However, setting up and calibrating the models involves sophisticated steps and can be labor-intensive, and physics-based NMS models offer only an approximate estimate of joint dynamics. Furthermore, EMG quality can be influenced by issues such as cross-talk and the precise positioning of sensors. The MLP model could estimate joint torques better but requires a varied set of training data. The MLP is a classical and simple type of neural network with no time-series or memory information. How torque predictions from other neural network structures compare to those from NMS models has not been thoroughly investigated. LSTM networks, a type of neural network reported to be among the most robust and effective methods for time-series prediction [9], [10], [11], can selectively remember patterns that correspond to a period of time by incorporating memory structures into the network. The memory structures update the information between time intervals with gates that select what to output, forget and remember. A similar memory structure is included in the physics-based NMS model, which uses muscle-tendon unit (MTU) neural activation information from the two previous time steps. Thus, comparing physics-based NMS models and LSTM networks is of interest. Despite the advances made in joint torque estimation using physics-based NMS and LSTM models, limited research has been conducted on torque estimation in multiple lower limb degrees of freedom (DOFs) during daily activities using NMS and LSTM models in parallel.
Cluster approaches have been applied to classify gait and movement patterns in persons with and without motor disorders [12], [13], [14], muscle activation levels in nonpathological groups [15], and moment patterns in athletes with identical anterior cruciate ligament injury rates [16]. Cluster analysis is a statistical technique for grouping observations or patterns into a number of clusters based on similarity in their characteristics. Phinyomark et al. [13] clustered running kinematic patterns into two distinct groups; their results suggested that the variability observed in running patterns may be attributable to different gait strategies. White and McNair [15] applied cluster analysis to identify muscle activation patterns in abdominal and lumbar muscles during gait. In our recent study of ankle joint torque estimation by NMS models in gait and isokinetic movements [8], an NMS model calibrated with a varied set of movement trials had lower prediction accuracy than Individual models that were each calibrated with a specific movement. However, calibrating one model for each movement is time-consuming and impractical in real-time applications. A logical question that arises is whether calibration with a cluster approach in both EMG-driven NMS models and LSTM models can result in high joint torque prediction accuracy.
The objective of this study was two-fold: (1) to estimate ankle dorsiflexion/plantarflexion (D/P), knee flexion/extension (F/E), hip flexion/extension (F/E), and hip abduction/adduction (Ab/Ad) torques by NMS and LSTM models with a cluster approach based on EMG signals and kinematics; and (2) to compare the prediction performance between Clustered NMS and LSTM models.

A. Data Collection and Data Processing
Eight non-disabled participants (age: 29±4 years; sex: 4F/4M; weight: 65.2±17.8 kg; height: 168.1±9.4 cm) were enrolled in this study. Approval for this research was granted by the Swedish Ethical Review Authority (Reference Number: Dnr. 2020-02311). All participants gave their informed written consent. All participants were required to perform ten movement types at least 10 times each (Fig. 1), specifically normal walking, slow walking, fast walking, stand-to-sit, sit-to-stand, squat, jump up to a 15-cm stair, jump down from a 15-cm stair, vertical jump up, and land from vertical jump. The movements were arranged in a randomized sequence.
A 3D motion capture system (100 Hz, V16, Vicon, Oxford, UK) was used to capture the trajectories of markers, which were placed according to the CGM2.3 marker set model [19]. Three force plates (1000 Hz, AMTI, MA, USA) were used to record ground reaction forces.
Segment and joint kinematics were estimated via inverse kinematics by solving a weighted least squares problem to minimize the error between modeled $x_i$ and measured $x_i^{exp}$ marker trajectories [20] (Eq. (1)). The modeled marker positions were placed at the same anatomical locations as the experimental marker positions (CGM2.3 marker set model).
In Eq. (1), $q$ is the vector of generalized coordinates of the model and $\theta_i$ is the weight of the $i$th marker. Kinetics, specifically joint torques, were computed via inverse dynamics; these were taken as the ground truth of joint torque for model calibration or training. Kinetics were computed by solving the dynamic equations of motion [21] (Eq. (2)), where $q$, $\dot{q}$, $\ddot{q}$ represent the generalized position, velocity and acceleration vectors, respectively; $M(q)\ddot{q}$ represents the inertial forces and torques, and $M(q)$ denotes the mass matrix; $C(q, \dot{q})$ represents the centripetal and Coriolis forces and torques; $G(q)$ represents the gravitational forces and torques; $R(q)$ denotes the muscle moment arms; $F^{mt}$ represents the musculotendon forces vector and $R(q)F^{mt}$ denotes the musculotendon torques vector; $F^{e}$ represents the external forces and torques. Joint kinematics and kinetics were filtered using a low-pass fourth-order zero-lag Butterworth filter (6 Hz) [22], [23], [24].
Fig. 2. Schematic diagram of joint torque estimation using different NMS and LSTM models. Joint torques were estimated using Pooled models calibrated/trained by all movements, Individual models calibrated/trained by one single movement, and Clustered models calibrated/trained by one single cluster. The inputs for NMS and LSTM models were EMGs and joint angles, acquired and post-processed from a 3D motion capture system.
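For reference, Eqs. (1) and (2) can be written out from the symbol definitions given in the text. This is a sketch only: it assumes the standard weighted least-squares inverse kinematics formulation and a convention in which all terms of the equations of motion are collected on one side; the sign convention of the original equations may differ.

```latex
\begin{align}
\min_{q} \; & \sum_{i \in \text{markers}} \theta_i \left\lVert x_i^{exp} - x_i(q) \right\rVert^{2} \tag{1} \\
& M(q)\ddot{q} + C(q,\dot{q}) + G(q) + R(q)F^{mt} + F^{e} = 0 \tag{2}
\end{align}
```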

B. Joint Torque Prediction Models
Both NMS models and LSTM neural networks were created to estimate ankle D/P, knee F/E, hip F/E, and hip Ab/Ad torques, using only input from the 13 captured EMG signals and joint kinematics (Fig. 2). The joint kinematics data included ankle D/P, knee F/E, hip F/E, and hip Ab/Ad joint angles.

(a) NMS Model
An open-source EMG-driven NMS model (CEINMS) [25] was used (Fig. 3 (a)), including muscle activation dynamics, musculotendon kinematics, muscle contraction dynamics, and joint dynamics [26]. Specifically, the muscle activation dynamics part computes muscle activation from EMG data. The relationship between EMG excitation $e(t)$ and neural activation $u(t)$ was formulated as Eq. (3) [27], where $\alpha$ represents the muscle gain variable and $d$ corresponds to the electromechanical delay. $\beta_1$ and $\beta_2$ represent the recursive variables; they are constrained in order to obtain a stable solution [25], [27], [28]. The muscle activation $a(t)$ can be represented as Eq. (4), where $A$ is the shape factor [18], [28].
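The activation dynamics described above can be sketched as follows. This is a minimal illustration, not the CEINMS implementation: it assumes the commonly used second-order recursive form $u(t) = \alpha e(t-d) - \beta_1 u(t-1) - \beta_2 u(t-2)$ with $\beta_1 = C_1 + C_2$, $\beta_2 = C_1 C_2$, $|C_1|, |C_2| < 1$ and $\alpha - \beta_1 - \beta_2 = 1$, and the exponential shape function $a(t) = (e^{A u(t)} - 1)/(e^{A} - 1)$. The function names and the delay expressed in samples are hypothetical.

```python
import numpy as np


def neural_activation(e, c1, c2, delay_samples):
    """Recursive filter from EMG excitation e(t) to neural activation u(t).

    Assumed form: u(t) = alpha*e(t-d) - beta1*u(t-1) - beta2*u(t-2).
    """
    beta1 = c1 + c2
    beta2 = c1 * c2
    alpha = 1.0 + beta1 + beta2  # assumed constraint: unit steady-state gain
    e = np.asarray(e, dtype=float)
    u = np.zeros_like(e)
    for t in range(len(e)):
        # Samples before the start of the recording are treated as zero.
        e_delayed = e[t - delay_samples] if t >= delay_samples else 0.0
        u_prev1 = u[t - 1] if t >= 1 else 0.0
        u_prev2 = u[t - 2] if t >= 2 else 0.0
        u[t] = alpha * e_delayed - beta1 * u_prev1 - beta2 * u_prev2
    return u


def muscle_activation(u, shape_factor):
    """Nonlinear map from neural activation u(t) to muscle activation a(t)."""
    u = np.asarray(u, dtype=float)
    return (np.exp(shape_factor * u) - 1.0) / (np.exp(shape_factor) - 1.0)
```

With $|C_1|, |C_2| < 1$ the filter poles lie inside the unit circle, which is the usual reason given for constraining the recursive coefficients.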
The musculotendon kinematics part calculates the musculotendon lengths and moment arms. Muscle force is determined through the muscle contraction dynamics component using a Hill-type muscle model. The muscle force $F$ can be represented as Eq. (5), where $F_0^m$ represents the muscle maximum isometric force; $l$ is the fiber length; $F_{al}(l)$ describes the muscle active force and fiber length relationship; $v$ is the fiber contraction velocity; $F_v(v)$ denotes the muscle force and fiber contraction velocity relationship; $F_{pl}(l)$ represents the muscle passive force and fiber length relationship; $\theta$ is the fiber pennation angle and $d_p$ represents the muscle damping.
Finally, the joint dynamics part is for computing the joint torque based on muscle forces and moment arms.
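As a reference for the two steps above, a Hill-type force expression consistent with the listed symbols, and the resulting joint torque, can be written as follows. This is a hedged sketch assuming the formulation commonly used in EMG-driven models; the index $k$ over muscles, the moment arm symbol $r_k(q)$ and the torque symbol $\tau$ are introduced here for illustration, and the velocity in the damping term is often normalized in practice.

```latex
\begin{align}
F &= F_0^{m} \left[ F_{al}(l)\, F_{v}(v)\, a(t) + F_{pl}(l) + d_p\, v \right] \cos\theta \tag{5} \\
\tau &= \sum_{k} r_k(q)\, F_k
\end{align}
```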
Parameter calibration in the EMG-driven NMS model was performed according to recommendations by Pizzolato et al. [25]. For each participant, optimal fiber length $l_0^m$ and tendon slack length $l_s^t$ were bounded within ±15% of their initial values. The parameter $A$ was limited to the interval (−3, 0), while the coefficients $C_1$ and $C_2$ were bounded in (−1, 1). To calibrate the maximum isometric force, a strength coefficient was employed, which was bounded within the range of 0.5 to 2.5. These parameters were determined by applying a simulated annealing algorithm, which minimized the discrepancy between estimated and measured joint torques (calculated by inverse dynamics) throughout the calibration process. The simulated annealing algorithm was set to run until the average change in the objective function value was less than the specified tolerance of $10^{-5}$.
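The bounded calibration idea can be illustrated on a toy problem. The sketch below is not the CEINMS calibration: it uses SciPy's `dual_annealing` (a generalized simulated annealing variant with local search) as a stand-in optimizer, calibrates only a single strength coefficient within the 0.5 to 2.5 bound, and uses synthetic torque data. All variable names are hypothetical.

```python
import numpy as np
from scipy.optimize import dual_annealing

rng = np.random.default_rng(0)

# Synthetic stand-ins: torque predicted with an unscaled model, and a
# "measured" torque generated with a known strength coefficient.
unscaled_torque = rng.uniform(10.0, 60.0, size=200)
true_strength = 1.3
measured_torque = true_strength * unscaled_torque


def torque_error(params):
    """Mean squared error between estimated and measured joint torque."""
    strength = params[0]
    return float(np.mean((strength * unscaled_torque - measured_torque) ** 2))


# The strength coefficient is bounded within [0.5, 2.5], as in the text.
result = dual_annealing(torque_error, bounds=[(0.5, 2.5)], seed=0, maxiter=200)
calibrated_strength = result.x[0]
```

In a full calibration the parameter vector would also contain the fiber length, tendon slack length, shape factor and recursive coefficients, each with its own bound.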

(b) LSTM Networks
Using identical input data as the NMS model, namely EMGs and joint angles, LSTM networks were constructed to estimate joint torques (Fig. 3 (b)) [25]. The time-series inputs were converted into time-slice data for the next LSTM layer. After that, LSTM layers were added, followed by a dense layer and a dropout layer. The dropout layer was applied to avoid overfitting, whereas the dense layer used the intermediate outputs of the LSTM layers to assist the joint torque estimation in the regression phase [29]. Finally, the output layer predicted joint torques.
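The conversion of time series into time-slice data can be sketched as a sliding window. This is an illustrative assumption about the preprocessing: the window length is not stated in the text, and the function name and the choice to pair each window with the torque at its last sample are hypothetical.

```python
import numpy as np


def make_time_slices(features, targets, window):
    """Cut a recording into overlapping windows for a sequence model.

    features: array of shape (T, n_features), e.g. EMGs and joint angles.
    targets:  array of shape (T, n_outputs), e.g. joint torques.
    Returns x of shape (T - window + 1, window, n_features) and y of shape
    (T - window + 1, n_outputs); each window is paired with the target at
    its final time step.
    """
    features = np.asarray(features, dtype=float)
    targets = np.asarray(targets, dtype=float)
    n_windows = features.shape[0] - window + 1
    if n_windows <= 0:
        raise ValueError("window is longer than the recording")
    x = np.stack([features[i:i + window] for i in range(n_windows)])
    y = targets[window - 1:]
    return x, y
```

For example, a 100-sample trial with 17 input channels (13 EMGs and 4 joint angles) and a window of 20 samples yields 81 slices of shape (20, 17).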
To find the hyperparameters for the LSTM networks, we used a "coarse-to-fine" random search method [30]. A Xavier weight initialization method with zero bias was applied, and a batch size of 50 was selected. Each LSTM layer was configured with 50 neurons and a tanh activation function. The dense layer had 64 neurons. The dropout rate for the dropout layer was set to 0.4. Throughout the training process, the loss function was defined as the mean square error between predicted and measured joint torques. The learning rate for the Adam optimizer was set to $10^{-3}$. The model was trained for up to 4000 epochs; however, training was stopped early if the accuracy of the predictions did not improve for 50 consecutive epochs.
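The stopping rule can be sketched independently of any deep learning framework. The example below assumes that the monitored quantity is a per-epoch loss (lower is better); the text does not state which metric was monitored, and the function name is hypothetical.

```python
def early_stopping_epoch(losses, patience=50, max_epochs=4000):
    """Return the 1-based epoch at which training would stop.

    Training stops once the monitored loss has not improved for
    `patience` consecutive epochs, or when `max_epochs` is reached.
    """
    best_loss = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(losses[:max_epochs], start=1):
        if loss < best_loss:
            best_loss = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return min(len(losses), max_epochs)
```

Most frameworks provide an equivalent callback; the sketch only makes the counting logic explicit.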

C. Cluster Approach
Using EMG signals and joint angles as inputs, we applied a t-distributed stochastic neighbor embedding (t-SNE) [31] clustering approach to reduce data dimensionality. Subsequently, we employed a K-means algorithm [32] to classify different movement types into K distinct clusters.
t-SNE is a valuable technique as it transforms high-dimensional data into a low-dimensional space while maintaining the data's local structure. Suppose $x_i$ and $x_j$ denote two data points in high-dimensional space. Their similarity, defined as $p_{j|i}$, is given by Eq. (6), where $\sigma_i$ symbolizes the variance of a Gaussian distribution with its center at $x_i$. The t-SNE algorithm minimizes the Kullback-Leibler divergence, denoted as $C$, between the similarity distributions in the high-dimensional and low-dimensional spaces. This minimization is accomplished through gradient descent, as illustrated in Eq. (7).
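For reference, the quantities described above can be written in the standard t-SNE formulation [31]. This is a hedged reconstruction rather than a copy of the original equations: $y_i$ (the low-dimensional counterpart of $x_i$), $q_{ij}$ (the low-dimensional similarity), $p_{ij}$ (the symmetrized high-dimensional similarity) and $N$ (the number of data points) are symbols assumed here, and in this standard form the Gaussian variance appears as $\sigma_i^2$.

```latex
\begin{align}
p_{j|i} &= \frac{\exp\!\left(-\lVert x_i - x_j \rVert^{2} / 2\sigma_i^{2}\right)}
               {\sum_{k \neq i} \exp\!\left(-\lVert x_i - x_k \rVert^{2} / 2\sigma_i^{2}\right)} \tag{6} \\
p_{ij} &= \frac{p_{j|i} + p_{i|j}}{2N}, \qquad
q_{ij} = \frac{\left(1 + \lVert y_i - y_j \rVert^{2}\right)^{-1}}
              {\sum_{k \neq l} \left(1 + \lVert y_k - y_l \rVert^{2}\right)^{-1}} \nonumber \\
C &= \sum_{i} \sum_{j} p_{ij} \log \frac{p_{ij}}{q_{ij}} \nonumber \\
\frac{\partial C}{\partial y_i} &= 4 \sum_{j} \left(p_{ij} - q_{ij}\right)\left(y_i - y_j\right)
              \left(1 + \lVert y_i - y_j \rVert^{2}\right)^{-1} \tag{7}
\end{align}
```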

D. Evaluation Protocol
We investigated the lower limb joint torque prediction accuracy of both NMS and LSTM models in three cases: Pooled, Individual, and Clustered models. We also compared the prediction accuracy between NMS and LSTM Clustered models.
• Pooled model (n = 1): used 80% of the data from all ten movements to train one Pooled LSTM model. To be specific, 80% of the data corresponds to 8 out of the 10 trials used for model training. Meanwhile, the Pooled NMS model was calibrated with one trial of each included movement and was subsequently evaluated on the identical dataset as the LSTM model.
• Individual models (n = 10): used 80% of the data from each movement to train ten Individual LSTM models. Meanwhile, NMS models were calibrated with three trials of each movement and evaluated on the identical dataset as the LSTM models.
• Clustered models (n = 3): used 80% of the data from each cluster to train three Clustered LSTM models. Meanwhile, NMS models were calibrated with two trials of each cluster and evaluated on the identical dataset as the LSTM models.
Normalized root mean square error (NRMSE) $E_{NRMS}$ was used to evaluate the estimation error of each model; low estimation error was regarded as high estimation accuracy. NRMSE was defined as the root mean square error $E_{RMS}$ between the estimated and measured/actual torque, normalized by the range of observed joint torque during the corresponding movements:
$$E_{RMS} = \sqrt{\frac{1}{N}\sum_{n=1}^{N}\left(y_{e,n} - y_n\right)^2}, \qquad E_{NRMS} = \frac{E_{RMS}}{y_{max} - y_{min}},$$
where $y_{e,n}$ and $y_n$ represent the estimated and measured/actual torque at sample $n$ of $N$, respectively; $y_{max}$ and $y_{min}$ are the maximum and minimum actual torque in the corresponding motions, respectively.
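The NRMSE definition above translates directly into a few lines of code. This is a minimal sketch; the function name is hypothetical, and the value is returned as a fraction, so it would be multiplied by 100 to match the percentages reported in this paper.

```python
import numpy as np


def nrmse(estimated, measured):
    """Root mean square error normalized by the range of the measured torque."""
    estimated = np.asarray(estimated, dtype=float)
    measured = np.asarray(measured, dtype=float)
    rmse = np.sqrt(np.mean((estimated - measured) ** 2))
    torque_range = measured.max() - measured.min()
    if torque_range == 0:
        raise ValueError("measured torque has zero range")
    return rmse / torque_range
```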
To assess data distribution normality, Shapiro-Wilk tests were conducted (p < 0.05 significance level). Subsequently, pairwise comparisons of the NRMSEs obtained with the three model types were conducted. Paired t-tests were applied to normally distributed data, and Wilcoxon signed-rank tests were used for data that did not follow a normal distribution, in both cases with Bonferroni correction (p < 0.05 significance level).
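The test-selection logic can be sketched with SciPy. Several details are assumptions rather than the authors' procedure: normality is checked on the paired differences, the Bonferroni correction multiplies each p-value by the number of pairwise comparisons (three, for three model types), and the function name is hypothetical.

```python
import numpy as np
from scipy import stats


def compare_paired(nrmse_a, nrmse_b, n_comparisons=3, alpha=0.05):
    """Compare paired NRMSE samples from two model types.

    Uses a paired t-test if the differences pass a Shapiro-Wilk test,
    otherwise a Wilcoxon signed-rank test; the p-value is then
    Bonferroni-corrected.
    """
    a = np.asarray(nrmse_a, dtype=float)
    b = np.asarray(nrmse_b, dtype=float)
    is_normal = stats.shapiro(a - b).pvalue >= alpha
    if is_normal:
        test_name = "paired t-test"
        p_value = stats.ttest_rel(a, b).pvalue
    else:
        test_name = "Wilcoxon signed-rank"
        p_value = stats.wilcoxon(a, b).pvalue
    p_corrected = min(1.0, float(p_value) * n_comparisons)
    return test_name, p_corrected, p_corrected < alpha
```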

A. Movement Clusters
The t-SNE algorithm combined with K-means identified three clusters among the ten movements. The t-SNE algorithm reduced the dimensionality of the data, allowing for visualization in a 2D space (Fig. 4). Subsequently, K-means grouped the samples of each movement into three clusters: $K_1$ comprising normal walking, slow walking, and fast walking; $K_2$ consisting of stand-to-sit, land from vertical jump, and squat; and $K_3$ including sit-to-stand, vertical jump up, jump up to a 15-cm stair, and jump down from a 15-cm stair. The median location of the movement samples was employed to assign them to their corresponding clusters based on relevance. The choice of the number K of clusters in K-means, as well as the naming of each cluster, was determined through visual inspection, ensuring meaningful and interpretable grouping of the movements.
Fig. 5. The NRMSE distributions between predicted and actual torques during ten motions among participants in NMS models: Pooled, Individual, and Clustered, shown with violin plots. Each violin plot includes a kernel density plot with a box plot. Kernel density plots illustrate the NRMSE probability distributions. Box plots denote NRMSE's minimum, lower quartile, median, upper quartile, and maximum values. * indicates a significant difference between two models, determined by pairwise comparisons.
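The median-based assignment of movements to clusters can be sketched as follows. The nearest-centroid rule is an assumption made for illustration (the text states only that the median location was used), and the function name and inputs are hypothetical: a 2D embedding, one movement label per sample, and the K-means centroids.

```python
import numpy as np


def assign_movements_to_clusters(embedding, movement_labels, centroids):
    """Assign each movement to the cluster nearest to its median 2D location."""
    embedding = np.asarray(embedding, dtype=float)
    movement_labels = np.asarray(movement_labels)
    centroids = np.asarray(centroids, dtype=float)
    assignment = {}
    for movement in np.unique(movement_labels):
        # Median location of all embedded samples of this movement.
        median_xy = np.median(embedding[movement_labels == movement], axis=0)
        distances = np.linalg.norm(centroids - median_xy, axis=1)
        assignment[str(movement)] = int(np.argmin(distances))
    return assignment
```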

B. NMS Models
Overall, the estimation accuracy was similar between Individual and Clustered models for walking movements and vertical jump up.The estimation accuracy by the Pooled model was much lower than Clustered and Individual models in most motions, i.e., walking movements, sit-to-stand, vertical jump up, and jump up to a stair (Fig. 5).
The ankle D/P joint torque prediction accuracy from both Individual and Clustered models was significantly higher than by Pooled models for gait at different speeds (Fig. 5, slow walking: p < 0.01 and p = 0.01; normal walking: p < 0.01 and p < 0.01; fast walking: p < 0.01 and p < 0.01). In ankle D/P torque prediction for gait at different speeds, compared to Individual and Clustered models, Pooled models underestimated peak plantarflexion. In addition, a plantarflexion offset during the swing phase was predicted by all three models (Fig. 6). In other movements, compared to Pooled models, significantly higher estimation accuracy was found in Individual models for vertical jump up (p = 0.04), jump up to a stair (p = 0.04), jump down from a stair (p < 0.01), and squat (p = 0.05), and somewhat, though not significantly, higher in Clustered models.
The knee F/E joint torque prediction accuracy from both Individual and Clustered models was relatively higher than by Pooled models (Fig. 5) for gait at different speeds (NRMSE Pooled ≤ 26.7%, Individual ≤ 13.9% and Clustered ≤ 14.3%), jump up to a stair (NRMSE Pooled ≤ 19.7%, Individual ≤ 15.2% and Clustered ≤ 17.3%) and sit-to-stand (NRMSE Pooled ≤ 10.5%, Individual ≤ 8.8% and Clustered ≤ 6.8%). Unlike Clustered and Individual models, Pooled models overestimated knee extension torque during preswing to early swing for gait at different speeds and underestimated knee extension moment in sit-to-stand (Fig. 6). All models failed to predict a smooth knee extension moment during squat, and Individual models failed to predict knee extension moment accurately in the stand-to-sit motion.
The joint torque estimation accuracy in hip F/E and hip Ab/Ad from Individual and Clustered models for all movements was overall lower than that in knee F/E and in ankle D/P: hip F/E (median NRMSE across ten movements: Individual 14.4%, Clustered 16.1%) and hip Ab/Ad (Individual 14.9%, Clustered 19.3%) vs. knee F/E (Individual 12.2%, Clustered 12.7%) and ankle D/P (Individual 11.0%, Clustered 15.5%) for the ten movements. In ankle D/P joint torque prediction, Pooled models had a lower prediction accuracy (median NRMSE 20.6%) compared to the other three joints (knee F/E 14.4%, hip F/E 18.9%, and hip Ab/Ad 18.9%).
It is worth noting that during squat and stand-to-sit (Fig. 5), the average prediction accuracy by Individual models in knee F/E torque was lower than both Pooled and Clustered models, but outliers with high prediction error were observed in the Individual models. Moreover, the predicted knee F/E and hip F/E torques by Individual models agreed poorly with measured torque compared to the Pooled and Clustered models (Fig. 6).

C. LSTM Models
Overall, the estimation accuracy was similar among all three models (Pooled, Individual, and Clustered, Fig. 7), and the estimated joint torques were all in good agreement with the actual torques during ten motions (Fig. 8).
Among the torque prediction models at different joints (Fig. 7), Pooled, Individual, and Clustered models all estimated joint torque accurately with relatively low error (NRMSE in hip F/E: Pooled ≤ 8.4%, Individual ≤ 9.2%,
Fig. 6. One example trial of measured and predicted torque during ten movements via three NMS models: Pooled, Individual, and Clustered.
Fig. 7. The NRMSE distributions between predicted and actual torques during ten motions among participants in LSTM models: Pooled, Individual, and Clustered, shown with violin plots. Each violin plot includes a kernel density plot with a box plot. Kernel density plots illustrate the NRMSE probability distributions. Box plots denote NRMSE's minimum, lower quartile, median, upper quartile, and maximum values. * indicates a significant difference between two models, determined by pairwise comparisons.

D. Comparison Between NMS and LSTM Clustered Models
Overall, compared to the NMS Clustered models, the torque predictions from LSTM Clustered models agreed better with the torque calculated through inverse dynamics at all joints and in all movement tests (Fig. 9); the median NRMSE across all ten movements was 16.1% (NMS) vs. 4.9% (LSTM) in hip F/E, 19.3% (NMS) vs. 4.7% (LSTM) in hip Ab/Ad, and 15.5% (NMS) vs. 6.7% (LSTM) in ankle D/P. Similarly, knee F/E torque from the LSTM models agreed significantly better with the torque calculated through inverse dynamics than that from the NMS models for most movements, including slow walking (p < 0.01), normal walking (p < 0.01), fast walking (p = 0.01), jump down from a stair (p < 0.01), jump up to a stair (p < 0.01), vertical jump up (p < 0.01), land from vertical jump (p < 0.01), squat (p = 0.03), and stand-to-sit (p < 0.01), and agreed somewhat, though not significantly, better for sit-to-stand (p = 0.16).

IV. DISCUSSION
We estimated ankle, knee and hip joint sagittal plane torques, and hip joint frontal plane torques for ten movements using two frequently used prediction approaches, EMG-driven NMS models and LSTM neural networks, and evaluated whether a cluster approach that grouped movements based on feature similarities can improve prediction accuracy, as compared to models based on individual or on pooled movements. We observed that, for LSTM neural network models, the Cluster approach did not have higher prediction accuracy than Individual or Pooled models; all three approaches predicted torques that agreed very well with inverse dynamics. Overall, the prediction accuracy in NMS models was lower than in LSTM models. In the NMS models, the Cluster approach did have higher prediction accuracy than the Pooled models in some movements. Whereas Individual and Clustered NMS models had similar prediction performance, the Pooled models predicted with the least accuracy. To the best of our knowledge, this is the first study that describes multiple lower-limb joint torque predictions using both NMS models and LSTM neural networks with a cluster approach in daily activities.
It is an interesting finding that the cluster approach did not improve prediction accuracy in LSTM neural networks but did improve it in NMS Pooled models in some movements.Clustered and Individual NMS models had similar joint torque prediction accuracy, but Pooled models had a lower prediction accuracy and predicted higher peak knee extension and lower peak plantarflexion torques for gait at different speeds (Fig. 6).This confirms the concept that movements within a cluster have similar coordination patterns, and that the NMS models can capture these features for better prediction accuracy.Among Pooled, Individual and Clustered NMS models, muscle-tendon unit parameters, such as tendon slack length, optimal fiber length, shape factor, and strength coefficient, were calibrated with trials from different movement types.During the calibration, these parameters were optimized by minimizing the error between estimated and measured torque among all the included calibration trials.Therefore, the calibrated Pooled NMS may be too generic to satisfy all movement patterns.Practical implications of these findings are that, in some applications, creating and training individual NMS models for each movement may be impractical; clustering movements can contribute to a more practical workflow and is preferable to an approach that pools all movements.
In contrast, Individual and Pooled LSTM models already had very high torque prediction accuracy, and a cluster approach was generally neither more nor less accurate.All LSTM models were evaluated using the unseen, "new" data from the same type of motions that were employed to train models.Generally, when the same type of data is used for training and evaluating the model, deep learning is typically expected to demonstrate great prediction performance.The designed LSTM network, which has multiple neurons and layers, is able to selectively remember patterns that correspond to a time interval; and it can identify the sophisticated nonlinear associations between EMG data and joint torques.Therefore, regardless of whether the network is formed by a Pooled, Clustered, or Individual model, LSTM networks can accurately estimate joint torques in unseen, "new" data within the same type of motions.
It is important to note that the prediction error of Individual and Clustered NMS models in hip F/E and hip Ab/Ad torque was generally higher than in knee F/E and ankle D/P.One reason for this could be attributable to the difficulty in experimentally acquiring surface EMGs on the layered and deep muscles that surround the hip joint.This finding agrees with Pizzolato et al. [25] who attributed the lower prediction accuracy and underestimation of sagittal hip moments compared to knee and ankle moments to the difficulty of obtaining reliable surface EMGs on the iliopsoas.It is also worth noting that, while inverse dynamics is generally considered a gold standard in computing joint torques, it may not be entirely accurate either, particularly at more proximal joints.Estimation of any joint center is critical in the inverse dynamics approach, as it is used to compute the moment arm of the muscles and serves as the point about which the joint rotation occurs.However, accurately estimating the hip joint center is generally considered more challenging than estimating the knee and ankle joint centers, as the hip is deeper and more difficult to characterize with biomechanical models.It is also important to note that during stand-to-sit, outliers with higher prediction error were observed in the Individual models in hip F/E and knee F/E torque, compared to Pooled and Clustered models.This may be attributable to overfitting in the Individual calibrated models, wherein a slight input difference would result in a large output difference.Therefore, when using individual models, caution for overfitting is warranted.
The torque predictions by LSTM models were closer to the values calculated through inverse dynamics than those of NMS models in all movement tests. In LSTM networks, each node is viewed as an artificial neuron, loosely simulating the neural synapse system of the human brain. This gives LSTM networks a strong ability to capture complicated and non-linear information. However, an LSTM network learns the correlations between inputs and outputs as a black box and does not explicitly model the relationships between physiological variables, for instance, joint angles and musculotendon kinematics, muscle force-length, and muscle force-velocity relationships. Therefore, only the designated outputs can be estimated. Although the NMS model had a lower prediction accuracy than the LSTM model, it analyzes and identifies participant-specific muscle-tendon parameters and relationships. As such, for applications in which participant-specific muscle-tendon parameters and relationships are desired, an NMS model-based solution is clearly more suitable. A model that combines relationships defined from NMS models with neural network structures may offer higher prediction accuracy than either of its constituent approaches, in addition to the benefit of defined muscle-tendon properties and relationships [7], [33], [34]. Therefore, in future work, it would be worthwhile to explore the potential of a physics-informed neural network approach that combines NMS features with an LSTM model to enhance the neural network method's extrapolation capability.
As noted above, obtaining ground truth values of joint torques is not trivial.The joint torque estimated through inverse dynamics is not a direct measure of the actual joint torque occurring in the body.We implemented the inverse dynamics approach by solving the dynamic equation of motion (Eq.( 2)) with kinematics.The kinematics (joint angles) were obtained through the inverse kinematics approach by solving a weighted least square problem to minimize the error between modeled and measured marker trajectories.This approach may lead to dynamic inconsistencies due to measurement and modeling errors, resulting in joint moments that do not follow muscle dynamics [35], [36].Despite these limitations, the joint torque calculated by the inverse dynamics approach is currently the closest estimation to the real joint torque during dynamic tasks.It is also worth noting that model training processes in the NMS model are not directly comparable to those of the LSTM model, particularly in that the Pooled NMS model is only trained with one trial from each movement whereas the LSTM is trained with 80% of the data.Some of the prediction errors of the Pooled NMS models may be attributed to variability in EMG and kinetic measurements.Likewise, its performance improvement in Individualized models may be attributable to the varied training data.
While we have not evaluated the real-time computational cost of the LSTM network in this study, it is theoretically capable of predicting joint torque in real-time with low computational cost as it only requires forward evaluation [37], [38], [39].It is important to note that for individual participants, there should be only one calibrated NMS model that covers the large ranges of force-length curves as much as possible.Ideally, a subset of movements that covers the full range would be identified, but our main focus was not on achieving this.We included all movements for calibration, which should work similarly to a subset, as seen in the Pooled model.However, we observed unsatisfactory performance, which may be due to the exclusion of other experimental measurements such as fiber length and pennation angle, that are important to calibrate muscle tendon parameters.Measuring them during dynamic movements is challenging, and the Pooled model may not be suitable to capture all muscle coordination patterns.Our primary goal was to obtain a highly accurate model for predicting joint torque, and to achieve this, we explored various mathematical approaches, including calibrating different models separately, clustering them, and pooling them.We also compared these approaches with LSTM models.Although this may not align with the physical perspective, it is a practical way to obtain a better-trained model from a mathematical perspective.
While our aim was to obtain a model that could predict joint torque with high accuracy, we have not yet implemented these models in practical settings, specifically in exoskeleton control. Further steps are required, such as selecting appropriate methods/sensors for capturing EMG and kinematic data, training models based on the application requirements, and optimizing the model for real-time processing. The developed model would then need to be integrated into the exoskeleton's control system to provide assist-as-needed support for joint movements. Our study provides guidelines for selecting the appropriate model in different scenarios, which has the potential to improve the control strategies for exoskeletons and enhance the functionality of wearable exoskeletons.
We only investigated one NMS and one machine learning model approach, i.e., LSTM, here, and our findings should be considered in this context. Other NMS models, and even different parameters and levels of sophistication in the currently used NMS model, such as electromechanical delay and tendon types, warrant investigation. There are also other machine learning models with complicated architectures, such as convolutional neural networks, that could be evaluated. Aspects of the experimental protocol may influence the generalizability of the findings, such as adjusting the chair height in relation to each participant's stature for tests involving sitting. Also, even though we used standardized instructions for all participants, there may have been subtle differences among participants in movement techniques for, e.g., squat and jump movements, and thus possible differences in muscle activation patterns. Additionally, this investigation exclusively involved non-disabled individuals, whereas the primary population of interest, exoskeleton users, primarily comprises individuals with motor disorders. Furthermore, the non-disabled participants were generally homogeneous in terms of age and overall physical condition.

V. CONCLUSION
In this study, we predicted ankle, knee, and hip joint sagittal plane torques and hip joint frontal plane torques during several different motions using physics-based NMS models and LSTM neural networks with a cluster approach. The cluster approach did not improve the prediction accuracy in LSTM neural networks, which was already high, but did improve it over NMS Pooled models in some movements. Among NMS models, Individual and Clustered models had similar prediction performance, but the Pooled model may be too generic to satisfy all movement patterns. In applications where individual NMS models for each movement are impractical, a clustered movement approach may be a good alternative to pooled movement models. In general, detailed comparative performance of physics-based neuromusculoskeletal and neural network models to predict torques in multiple lower limb DOFs and in a range of movements can provide useful guidelines on the preferred choice of a prediction model in specific circumstances, and thus holds great promise for application in, for instance, assist-as-needed exoskeleton control, by taking into account the physiological joint torque contribution of its users.

A. Trials for NMS Calibration
Some may raise concerns about the accuracy of the pooled NMS model calibration, as using only one trial might be insufficient. However, the NMS model is based on physics and utilizes muscle-tendon parameters and activation patterns specific to each movement type. Therefore, for a given movement type, the muscle is usually activated at a similar range of force-length relationship. Based on the physical structure of the NMS model, calibrating it with one trial is adequate. We also attempted to calibrate the NMS model using two trials from each movement type, but found that this did not result in a significant difference in prediction accuracy (Fig. 10), while calibration time was considerably longer. Hence, considering the practical calibration time and the physical structure of the NMS model, we did not include multiple trials from the same movement type in Pooled models for calibration.

Fig. 10. The NRMSE distributions between predicted and actual torques during ten motions among participants in NMS Pooled models, comparing one vs. two trials from each movement for calibration, shown with violin plots. Each violin plot includes a kernel density plot with a box plot. Kernel density plots illustrate the NRMSE probability distributions. Box plots denote NRMSE's minimum, lower quartile, median, upper quartile, and maximum values. * indicates that there was a significant difference between two models, determined by pairwise comparisons.

Fig. 3. (a) Schematic of an NMS model structure including four parts: (1) muscle activation dynamics part, which calculates muscle activation with EMG data; (2) musculotendon kinematics part, which computes musculotendon lengths and moment arms; (3) muscle contraction dynamics part, which calculates muscle force; and (4) joint dynamics part, which computes joint torques. (b) Schematic of an LSTM network model structure. The inputs to the LSTM networks were the same as those to the NMS models, namely joint angles and EMG signals. The time-series inputs were converted into time-slice data for the next LSTM layer. After that, LSTM layers were added, then a dense layer and a dropout layer. Finally, the output layer predicted joint torques.
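The conversion of time-series inputs into time-slice data mentioned in the caption of Fig. 3(b) can be sketched as a sliding window over the samples. The window length, the alignment of each torque label with the last sample of its slice, and the function name `make_time_slices` are assumptions made for illustration; the caption does not specify them.

```python
def make_time_slices(features, targets, window):
    """Cut a time series into overlapping slices for a recurrent network.

    features: list of per-sample feature lists (e.g., joint angles and EMG),
              length T.
    targets:  list of per-sample torque values, length T.
    window:   number of consecutive samples per slice (assumed hyperparameter).

    Returns (slices, labels). Each slice holds `window` consecutive samples,
    and its label is the target at the last sample of the slice (an assumed
    alignment, not taken from the source).
    """
    if len(features) != len(targets):
        raise ValueError("features and targets must have the same length")
    if window < 1 or window > len(features):
        raise ValueError("window must be between 1 and the series length")
    slices, labels = [], []
    for end in range(window, len(features) + 1):
        slices.append(features[end - window:end])
        labels.append(targets[end - 1])
    return slices, labels

# Usage: five samples with two features each, sliced with a window of three.
example_features = [[0.0, 0.1], [0.2, 0.3], [0.4, 0.5], [0.6, 0.7], [0.8, 0.9]]
example_targets = [10.0, 11.0, 12.0, 13.0, 14.0]
example_slices, example_labels = make_time_slices(example_features,
                                                  example_targets, window=3)
```

A series of T samples yields T - window + 1 slices, each of shape (window, number of features), which is the three-dimensional input layout that LSTM layers typically expect.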

Fig. 4. Illustration of K = 3 identified clusters in 2D space by the t-distributed stochastic neighbor embedding algorithm.

Fig. 5. The NRMSE distributions between predicted and actual torques during ten motions among participants in NMS models: Pooled, Individual, and Clustered, shown with violin plots. Each violin plot includes a kernel density plot with a box plot. Kernel density plots illustrate the NRMSE probability distributions. Box plots denote NRMSE's minimum, lower quartile, median, upper quartile, and maximum values. * indicates that there was a significant difference between two models, determined by pairwise comparisons.

Fig. 8. One example trial of measured and predicted torque during ten movements via three LSTM models: Pooled, Individual, and Clustered.

Fig. 9. The NRMSE distributions between predicted and actual torques during ten motions among participants in NMS and LSTM Clustered models, shown with violin plots. Each violin plot includes a kernel density plot with a box plot. Kernel density plots illustrate the NRMSE probability distributions. Box plots denote NRMSE's minimum, lower quartile, median, upper quartile, and maximum values. * indicates that there was a significant difference between two models, determined by pairwise comparisons.
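The NRMSE values summarized in the violin plots can be computed per trial along the following lines. This section does not restate how the RMSE is normalized, so the sketch assumes normalization by the range of the measured (inverse dynamics) torque; the function name `nrmse` is likewise hypothetical.

```python
import math

def nrmse(measured, predicted):
    """Root-mean-square error divided by the range of the measured signal.

    Normalizing by (max - min) of the measured torque is an assumption made
    for this sketch; other conventions (mean, standard deviation, body mass)
    exist and would give different values.
    """
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((m - p) ** 2 for m, p in zip(measured, predicted)) / len(measured)
    span = max(measured) - min(measured)
    if span == 0:
        raise ValueError("measured signal has zero range")
    return math.sqrt(mse) / span

# Usage: one short synthetic trial (values are illustrative, not study data).
measured_torque = [0.0, 1.0, 2.0, 3.0, 4.0]
predicted_torque = [0.0, 1.0, 2.0, 3.0, 5.0]
trial_error = nrmse(measured_torque, predicted_torque)
```

Normalizing by the measured range makes errors comparable across joints and movements whose torque magnitudes differ widely.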