Prediction of Achilles Tendon Force During Common Motor Tasks From Markerless Video

Remodeling of the Achilles tendon (AT) is partly driven by its mechanical environment. AT force can be estimated with neuromusculoskeletal (NMSK) modeling; however, the complex experimental setup required to perform the analyses confines its use to the laboratory. We developed task-specific long short-term memory (LSTM) neural networks that employ markerless video data to predict the AT force during walking, running, countermovement jump, single-leg landing, and single-leg heel rise. The task-specific LSTM models were trained on pose estimation keypoints and corresponding AT force data from 16 subjects, calculated via an established NMSK modeling pipeline, and cross-validated using a leave-one-subject-out approach. As proof of concept, new motion data of one participant were collected with two smartphones and used to predict AT forces. The task-specific LSTM models predicted the time-series AT force using synthesized pose estimation data with root mean square error (RMSE) ≤ 526 N, normalized RMSE (nRMSE) ≤ 0.21, and R² ≥ 0.81. The walking task was predicted most accurately, with RMSE = 189±62 N, nRMSE = 0.11±0.03, and R² = 0.92±0.04. AT force predicted from smartphone video data was physiologically plausible, agreeing in timing and magnitude with established force profiles. This study demonstrated the feasibility of using low-cost solutions to deploy complex biomechanical analyses outside the laboratory.

outcomes [1]. The AT responds to mechanical stimuli; in vitro assessments showed that optimal strain promoted expression of collagen type I, reduced cell apoptosis, and improved material properties [2], [3]. Human in vivo studies of the AT also showed that loading at 6.5% strain and 0.17 Hz (3 s on, 3 s off) for 5 sets, 4 days/wk over 14 weeks increased mechanical stiffness compared to loading at a 3% strain and 0.50 Hz (1 s on, 1 s off) using the same regimen [4]. As the AT mechanical properties can be lumped into a phenomenological model describing the mathematical relationship between force and strain [5], the AT force becomes a viable surrogate to target specific mechanobiological adaptations [6], [7]. However, measuring AT forces in vivo via implanted sensors is invasive and infeasible for routine assessments [8], [9]. As such, accurate and practical estimation of AT force during dynamic motor tasks remains challenging.
Neuromusculoskeletal (NMSK) models can estimate AT force through static optimization [10], [11] or electromyogram (EMG)-informed approaches [7], [12], and high levels of agreement between predicted free AT forces and invasive measurements have been reported [12]. While established NMSK modeling pipelines have been successfully applied in laboratory conditions [13], [14], [15], [16], their reliance on specialized equipment, such as force plates and marker-based stereophotogrammetry systems, coupled with extensive processing steps, has prevented their application in real-world settings [17]. Shear wave tensiometry, which leverages wave propagation delays to measure force, has been used to study the AT during a variety of motor activities [18], [19], even outside the research laboratory [20], [21]. However, shear wave tensiometry still requires the model parameters to be calibrated using dynamometry prior to use, thereby limiting widespread adoption.
Markerless pose estimation models, such as OpenPose [22], offer an alternative to laboratory-bound marker-based stereophotogrammetry systems. Via markerless approaches, the two- (2D) or three-dimensional (3D) kinematics of an individual can be estimated without any prior preparation or change of clothes [23], [24]. Pose estimation models have been combined with neural networks to predict clinically relevant biomechanical variables. For example, 2D pose estimation data were used to train a long short-term memory (LSTM) neural network to accurately predict ground reaction forces (GRF) across a variety of motor tasks [25]. Similarly, a feed-forward neural network was developed to predict peak knee adduction moment using pose data synthesized from marker-based stereophotogrammetry data, thereby showcasing how low-fidelity pose estimation could inform the treatment of patients with knee osteoarthritis within clinical settings [26]. Neural networks have been used to fast-track the prediction of key biomechanical features from marker data, inertial sensors, and EMG [27], [28], [29], [30], [31], and therefore have potential to provide feedback of internal tissue states (e.g., hip contact force) [32] that could be used to enhance training programs in accordance with known mechanisms of tissue adaptation [6], [33], [34]. Nonetheless, this hypothesis cannot be experimentally verified until technological approaches exist that enable internal biomechanics to be readily estimated with low-cost, portable, and easy-to-use solutions.
Extending our prior work, which demonstrated that an LSTM network yielded good estimates of AT strain during running relative to an NMSK model [5], the objective of this study was to compare the time-series AT force predicted from synthesized pose estimation data using LSTM neural networks with the ground truth estimated by an NMSK model for a variety of motor tasks, namely walking, running, countermovement jump (CMJ), single-leg landing (SLL), and single-leg heel rise (SLHR). Video data from two smartphones were used to demonstrate the feasibility of our proposed model outside the laboratory.

A. Experimental Data Collection and Processing
The dataset used in the present study included running data from our prior study [5]. Data collection was previously published [12], [35] and is described here in brief. Sixteen trained middle-distance runners (female: 6, age: 25.2±5.0 yr, height: 175.5±7.3 cm, body mass: 64.4±8.4 kg) with no history of AT injuries performed a series of motor tasks in the biomechanics laboratory, comprising walking at 1.3±0.1 m/s, running at 3.0±0.3 m/s and 5.0±0.5 m/s, CMJ, SLL, and SLHR at 1 and 1.2 body weight (BW) conditions. The 3D marker trajectories, GRFs, and EMG signals for lower body muscles were synchronously collected by a motion capture system (Vicon Vantage Cameras, Vicon Motion Systems Ltd., Oxford, UK) at 250 Hz, force plates (Kistler Instrument Corporation, Amherst, NY) at 1000 Hz, and EMG devices (TELEmyo DTS, Noraxon U.S.A. Inc., Scottsdale, AZ) at 1500 Hz. Additionally, magnetic resonance imaging (MRI) scans of the ankle joint were acquired to personalize the NMSK model.
The estimation of individual AT forces for different movement tasks was carried out using the EMG-informed NMSK modeling pipeline with MOtoNMS [36], OpenSim [37], and CEINMS [38] (Figure 1, a and b). Personalization of the NMSK model (gait2392) was achieved by linear scaling via anatomical marker coordinates obtained from a static trial. The muscle maximal isometric forces [39], moment arm of triceps surae muscles [40], [41], and Hill-type muscle model parameters [42] were also personalized as described previously [12]. Inverse kinematics, inverse dynamics, and muscle analysis were employed in OpenSim to calculate joint angles, joint moments, and musculotendon kinematics across all trials [43]. Within CEINMS, a calibration procedure was conducted to fine-tune the model parameters of musculotendon units that spanned the ankle, subtalar, and knee joints [38], [44]. Subsequently, the experimental muscle excitations were mapped to the complete set of muscles in the model [45]. The calibrated model was then used to estimate triceps surae muscle forces based on musculotendon kinematics and muscle excitations using the EMG-assisted mode in CEINMS [46]. The AT force was calculated as the sum of forces generated by the medial gastrocnemius, lateral gastrocnemius, and soleus muscles. The estimates of AT force were compared with direct measurements of AT force for walking, running, and CMJ available in the literature [8], [9], [47].

B. Input and Output Data Pre-Processing
Five motor tasks (i.e., walking, running, CMJ, SLL, and SLHR) were used to train task-specific LSTM models. The walking and running tasks focused on the stance phase, which is the period when the foot is in contact with the ground. The CMJ task targeted the push-off phase. The SLL task involved the landing period of a single-leg jump. Finally, the SLHR task included heel lifts from a wooden slant board that sloped downward by 10 degrees from toe to heel. For each motor task, ten keypoints matching the OpenPose model were synthesized from the marker trajectories [22] (Figure 1, c). For each trial, the bilateral hips, knees, ankles, and big toes were directly extracted using the point kinematics tool of OpenSim. The neck and pelvis keypoints of the pose estimation model were computed as the mid-points of the bilateral acromial landmarks and hip joint centers, respectively. The dataset was augmented 10 times by adding noise to the x, y, and z coordinates of each keypoint (Figure 1, d). The noise was drawn from axis-specific and keypoint-specific Gaussian distributions representing the errors of OpenPose compared to gold standard optical motion capture [5], [48]. As the noise levels for certain tasks (i.e., SLL and SLHR) were not reported in the literature, the noise levels from tasks with similar motion patterns (i.e., CMJ and walking, respectively) were used. For the unreported joints (i.e., neck, pelvis, and toe), the noise distribution of adjacent keypoints was used [5]. Noisy position data were low-pass filtered using a 2nd-order Butterworth filter with a cut-off frequency of 10 Hz, as recommended for real-time applications [49]. Velocities of the keypoints were calculated using backward numerical differentiation of the filtered keypoint positions. Through this approach, a total of 910 trials for walking, 1510 trials for running, 600 trials for CMJ, 1310 trials for SLL, and 1870 trials for SLHR containing positions and velocities of synthesized keypoints were generated for subsequent model training.
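The augmentation and differentiation steps described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the uniform 1 cm noise level, and the zero velocity assigned to the first frame are assumptions made for illustration, and the 10 Hz Butterworth filtering step is omitted to keep the example dependency-free.

```python
import numpy as np

def augment_keypoints(positions, noise_sd, n_copies=10, seed=0):
    """Return n_copies noisy versions of a (frames, keypoints, 3) trajectory.

    noise_sd has shape (keypoints, 3): one standard deviation per keypoint
    and axis. The values used below are placeholders, not those of the study.
    """
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, noise_sd, size=(n_copies,) + positions.shape)
    return positions[None, ...] + noise

def backward_velocity(positions, fs):
    """Backward finite difference along the frame axis.

    The first frame has no predecessor, so its velocity is set to zero
    (an assumption; the paper does not state how this frame is handled).
    """
    velocity = np.zeros_like(positions)
    velocity[1:] = (positions[1:] - positions[:-1]) * fs
    return velocity

# Toy trajectory: 50 frames at 250 Hz, 10 keypoints moving at 1 m/s along x.
fs = 250.0
t = np.arange(50) / fs
clean = np.zeros((50, 10, 3))
clean[:, :, 0] = t[:, None]

noise_sd = np.full((10, 3), 0.01)  # hypothetical 1 cm noise on every axis
noisy = augment_keypoints(clean, noise_sd)
velocity = backward_velocity(clean, fs)
```

In a full pipeline, each noisy copy would be low-pass filtered before differentiation, as in the text, because finite differences amplify high-frequency noise.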
Prior to model training, the keypoint information from 10 frames (i.e., 40 ms) was used to predict the AT force in the last frame of the time window. The height and mass of each participant were appended to each time frame. To make the prediction start from the first frame of the trial, nine frames of zero-padding were prepended to the beginning of each trial. All the input features were mean-removed and scaled to unit variance to facilitate convergence of gradient descent [50].
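The windowing scheme described above can be sketched as follows. The helper names `append_anthropometrics` and `make_windows` are hypothetical, the toy example uses a window of 3 frames instead of 10 for readability, and feature standardization is left out.

```python
import numpy as np

def append_anthropometrics(keypoint_features, height, mass):
    """Append the participant's height and mass to every time frame."""
    n_frames = keypoint_features.shape[0]
    extra = np.tile([height, mass], (n_frames, 1))
    return np.hstack([keypoint_features, extra])

def make_windows(features, window=10):
    """Build one input window per frame of a single trial.

    window - 1 frames of zeros are prepended so that the first window
    ends at the first frame of the trial. Window i ends at frame i,
    which is the frame whose AT force is the prediction target.
    Returns an array of shape (frames, window, n_features).
    """
    n_frames, n_features = features.shape
    padded = np.vstack([np.zeros((window - 1, n_features)), features])
    return np.stack([padded[i:i + window] for i in range(n_frames)])

# Toy trial: 4 frames with 3 keypoint features each.
keypoints = np.arange(12, dtype=float).reshape(4, 3) + 1.0
features = append_anthropometrics(keypoints, height=1.75, mass=66.0)
windows = make_windows(features, window=3)
```

Because the padding consists of exact zeros, a masking layer can later ignore those frames, which is why padded frames must remain distinguishable from real data.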

C. LSTM Model Training and Validation
The models for each motor task were trained and validated using leave-one-subject-out cross-validation (LOSOCV) (Figure 1, e). An LSTM neural network with two bidirectional LSTM layers was selected based on preliminary tests. A masking layer was applied directly after the input layer to mask the zero-padding in the input. Batch normalization and dropout were introduced after each bidirectional LSTM layer to minimize overfitting and improve model prediction performance [51]. For each motor task, hyperparameter tuning was implemented to search for the optimal number of LSTM cells in each layer, dropout rate, learning rate, and batch size. The hyperband algorithm [52] was applied to tune hyperparameters by minimizing the mean square error between the model predictions and ground truth. During the tuning process, the dataset was randomly split into 14 participants for training and 2 participants for validation. The weights of the LSTM model were optimized by Adam [53], with early stopping (patience of 10 epochs) monitoring the validation loss, so that training stopped when the loss no longer improved. Hyperparameter tuning was performed using TensorFlow (v2.6.0) on the cloud-based Machine Learning eResearch Platform (MLeRP) (32 GB RAM and Tesla A100 GPU).
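As an illustration of the leave-one-subject-out scheme, the following sketch generates one fold per participant from a list of per-trial subject labels. The function name and the labels are hypothetical; the key property is that all trials of the held-out participant, including their augmented copies, are kept out of training.

```python
def leave_one_subject_out(subject_ids):
    """Yield (held_out, train_idx, test_idx) for each unique subject.

    subject_ids holds one label per trial. Every trial of the held-out
    subject goes to the test set, so no data from that subject (noisy
    copies included) can leak into training.
    """
    for held_out in sorted(set(subject_ids)):
        train_idx = [i for i, s in enumerate(subject_ids) if s != held_out]
        test_idx = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train_idx, test_idx

# Toy example: six trials from three hypothetical participants.
trial_subjects = ["S01", "S01", "S02", "S03", "S03", "S03"]
folds = list(leave_one_subject_out(trial_subjects))
```

With 16 participants, this procedure yields 16 folds, each training on 15 participants and testing on the remaining one.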
The LOSOCV of each task-specific model was implemented using the best combination of hyperparameters selected by the previous tuning process (Table I). To avoid overfitting, early stopping with 20 epochs of patience was used during the training process. The difference between the LSTM and NMSK model estimates of AT force for each trial was quantified using the coefficient of determination (R²), root mean square error (RMSE), and normalized RMSE (nRMSE). The RMSE was normalized by dividing by the peak value of the corresponding ground truth data for each trial. As the raw dataset was augmented 10 times with noise, the average nRMSE of each sub-dataset was calculated. For each task-specific model, a two-tailed paired-sample t-test for 1D statistical parametric mapping (1D-SPM) [54] was run to compare the predicted AT force with the ground truth based on the sub-dataset with the lowest average nRMSE, and the differences were considered statistically significant for p-values < 0.01. The bias of each model was also calculated as the mean of the predicted values minus the mean of the ground truth.
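The per-trial metrics defined above can be written compactly as in the sketch below. It assumes that R² is computed as one minus the ratio of residual to total sum of squares; the paper does not state the exact formula, so this choice and the function name are illustrative assumptions.

```python
import numpy as np

def at_force_metrics(pred, truth):
    """Compare a predicted AT force curve with its ground truth (in N)."""
    pred = np.asarray(pred, dtype=float)
    truth = np.asarray(truth, dtype=float)
    err = pred - truth

    rmse = np.sqrt(np.mean(err ** 2))
    nrmse = rmse / np.max(truth)          # normalized by the peak ground truth
    ss_res = np.sum(err ** 2)
    ss_tot = np.sum((truth - truth.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot            # assumed definition of R^2
    bias = pred.mean() - truth.mean()     # prediction minus ground truth

    return {"rmse": rmse, "nrmse": nrmse, "r2": r2, "bias": bias}

# Toy curves: the prediction overshoots the ground truth by 100 N everywhere.
truth = np.array([0.0, 1000.0, 2000.0, 1000.0, 0.0])
pred = truth + 100.0
metrics = at_force_metrics(pred, truth)
```

For this toy case the RMSE and bias are both 100 N and the nRMSE is 0.05, since the peak ground truth force is 2000 N.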

D. LSTM Model Testing Using Real Pose Estimation Data
The final task-specific models were retrained using all available training data for model testing. These models were further assessed using in-field video data collected for one participant (height = 175 cm; mass = 66 kg) who was not included in the training dataset (Figure 1, f∼i). High-frequency videos (sample rate: 240 Hz; resolution: 1080 × 1920) were recorded with two smartphones (iPhone 8 and iPhone 13, Apple) on a sports track and manually synchronized using pulses. Prior to data collection, the smartphones were mounted on tripods and the cameras were calibrated using a checkerboard (5 × 6, square length: 3 cm) to find the distortion coefficients and the intrinsic and extrinsic camera properties [55]. The participant completed walking, running, CMJ, SLL, and SLHR consecutively with the same instructions as in the previous laboratory session. The actual walking and running speeds were 1.19±0.05 m/s and 3.12±0.32 m/s, respectively. The motion data were cropped to the same phase as the training data and OpenPose (v1.7.0) was used to automatically obtain 2D pose keypoints. Finally, the keypoints were triangulated and transformed to the world coordinates, consistent with the training data, and used as input to the final LSTM models to predict AT force. The predicted AT forces for each task were compared to the training data for visual evaluation, and R² was calculated between the predicted curves and directly measured AT force reported in the literature for walking at 1.1 m/s [8], and for running at 3.1 m/s and CMJ [9].
Authorized licensed use limited to the terms of the applicable license agreement with IEEE. Restrictions apply.

TABLE I HYPERPARAMETERS USED FOR DIFFERENT TASK-SPECIFIC MODELS
Fig. 2. Comparison of AT force prediction with NMSK modeling pipeline for different motor tasks. AT force estimates from a variety of literature sources are also displayed for walking, running, and CMJ based on the use of fiber optic (Finni et al. [8]), tendon tapper (Keuler et al. [47]), and buckle-type transducer (Komi et al. [9]). The bias in each case was calculated as the prediction minus the ground truth. Shaded time-series areas indicate 1 standard deviation from the mean. Green horizontal bars: significant difference between LSTM predictions and NMSK modeling outputs using 1D-SPM. SLL: single leg landing; CMJ: countermovement jump; SLHR: single leg heel rise; AT: Achilles tendon; NMSK: neuromusculoskeletal; LSTM: long short-term memory neural network; R²: coefficient of determination; nRMSE: normalized root mean square error.
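The triangulation of 2D keypoints from two calibrated smartphone cameras, mentioned in the in-field testing procedure, can be sketched with linear (direct linear transformation) triangulation. The paper does not specify which triangulation method was used, so this is an assumed, commonly used approach; the camera matrices below are hypothetical, and lens distortion is ignored.

```python
import numpy as np

def triangulate_point(P1, P2, uv1, uv2):
    """Linear (DLT) triangulation of one 3D point from two views.

    P1, P2: 3x4 camera projection matrices.
    uv1, uv2: pixel coordinates of the same keypoint in each view.
    """
    A = np.array([
        uv1[0] * P1[2] - P1[0],
        uv1[1] * P1[2] - P1[1],
        uv2[0] * P2[2] - P2[0],
        uv2[1] * P2[2] - P2[1],
    ])
    # The homogeneous solution is the right singular vector associated
    # with the smallest singular value of A.
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]

def project(P, point):
    """Project a 3D point to pixel coordinates with projection matrix P."""
    x = P @ np.append(point, 1.0)
    return x[:2] / x[2]

# Two hypothetical cameras sharing the same intrinsics; the second camera
# is displaced by 1 m along the x axis of the first.
K = np.array([[1000.0, 0.0, 540.0],
              [0.0, 1000.0, 960.0],
              [0.0, 0.0, 1.0]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])

point = np.array([0.2, -0.1, 4.0])
estimate = triangulate_point(P1, P2, project(P1, point), project(P2, point))
```

With noise-free projections the estimate recovers the original point; with real OpenPose detections, the reconstructed keypoints inherit the 2D detection noise from both views.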

III. RESULTS
The performance of the task-specific LSTM models for time-series AT force prediction varied according to the task. The 1D-SPM testing indicated that the developed models underestimated the peak force region during walking and running tasks, as well as the middle stage (from 40% to 86%) for the SLHR task (Figure 2). R² values for the correspondence between NMSK estimates of AT force from the present study and those reported in the literature for walking, running, and jumping [8], [9], [47] ranged from 0.86 to 0.94. The subject-level comparison revealed that the walking task achieved the most accurate individual predictions, showing lower nRMSE and higher R² at both the overall and individual levels compared to other tasks (Figure 3). However, poor AT force prediction for specific participants occurred in CMJ and SLL, with nRMSE above 0.3 and R² below 0.7. When testing the real-world pose estimation data for the new participant, each task-specific model was able to generate plausible AT force predictions, which aligned well with the training outputs (Figure 4) and with direct measurements of AT force reported in the literature (R² of 0.71±0.13 for walking [8], 0.68±0.11 for running, and 0.81±0.03 for CMJ [9]).

IV. DISCUSSION
Our study demonstrated the feasibility of using computer vision to predict tissue biomechanics outside the laboratory. While accurate muscle force estimation currently requires comprehensive NMSK model personalization and calibration [14], [56], these processes are not easily implementable outside the laboratory. We trained LSTM models using data obtained from a personalized EMG-informed NMSK modeling pipeline. After training, the LSTM models required only keypoints and anthropometric information as input, which are readily available through computer vision approaches. The developed LSTM models could predict AT force for specific tasks across all subjects (RMSE ≤ 526 N, nRMSE ≤ 0.21, R² ≥ 0.8). Additionally, the trained LSTM models showed good generalizability to real-world pose estimation data, predicting physiologically plausible time-series of AT force when using smartphone videos as input. Overall, the proposed setup is low cost and could be rapidly deployed outside the laboratory to enhance training and rehabilitation after AT injury or disease.
The AT force is an internal biomechanics variable that cannot be directly measured in the clinical environment; yet, task-specific LSTM predictions of AT force demonstrated excellent tracking of ground truth (i.e., NMSK model) data when assessed via LOSOCV (Figure 2). For walking, the LSTM prediction accuracy (RMSE = 189±62 N; nRMSE = 0.11±0.03, R² = 0.92±0.04) was comparable with prior studies that investigated other internal biomechanics variables, such as hip and knee contact forces, via recurrent (nRMSE = 0.11, r = 0.93∼0.91) [57] and convolutional neural networks (r = 0.84∼0.90) [31]. However, the performance reported in these prior studies was achieved by including GRFs and EMGs as input features, thereby limiting translation outside the laboratory environment. At the individual level, when assessing AT force predictions in different motor tasks, walking consistently exhibited the most accurate predictions and CMJ and SLL the least accurate (Figure 3). One possible explanation for the lower accuracy during certain tasks could be the limited data employed in the training of the LSTM network, which could have resulted in some participants (e.g., participants 04, 13, 15, and 16, Figure 3) being underrepresented in the training dataset, despite data augmentation being applied to increase variation of the pose data [30]. Furthermore, the variation in muscle activation patterns (e.g., co-contraction of tibialis anterior and gastrocnemii) can significantly influence muscle force generation [44]. Muscle activations were not captured in our LSTM models, and kinematic information alone may not be sufficient to achieve accurate predictions for all motor tasks.
Using standard video data collected outdoors from two tripod-mounted smartphones as input, our models predicted physiologically plausible AT force. The time-series closely aligned with the training data in timing and magnitude (Figure 4), and accounted for 68∼81% of the variance in directly measured AT forces reported in the literature for walking, running, and CMJ [8], [9], [47]. These findings demonstrated the viability of the proposed approach and the future possibility of translation to clinical settings or the sporting field. A degree of degradation (i.e., non-smooth curves) was, however, observed in the AT force predictions relative to the corresponding data from the training dataset. The putative cause of degradation might be related to differences between the noise distribution of the synthesized and experimental pose estimation data. Indeed, while employing Gaussian noise to synthesize realistic keypoint data could be an acceptable initial assumption [58], the real-world noise might be distributed differently. As such, better strategies to apply noise to synthesized pose estimation data might be required in the future to improve the training of LSTM models. Pose estimation has been implemented to aid clinical assessments by estimating external biomechanics variables, such as the maximum trunk angle of sit-to-stand or temporal-spatial gait parameters [59], [60]. Additionally, pose estimation has been combined with neural networks to reduce the dependency on instrumentation when collecting gait data, for example predicting GRF and peak knee joint moments without force plates [25], [26]. Similarly, our proposed approach to AT force estimation could be applied to remote monitoring of rehabilitation progression and to rapid screening without any additional hardware. Critically, the real-time capabilities of LSTM models could also be used in personalized biofeedback rehabilitation paradigms to target ranges of AT force magnitude and rate [6], [34], promoting desired tissue adaptations [34].
The current study demonstrated the feasibility of predicting the time-series AT force based on low-cost 3D pose estimation data outside the laboratory. To the best of our knowledge, this is the first study to demonstrate that internal tissue loading can be predicted from real-world pose estimation data. Although the developed models produced physiologically plausible outputs for the in-field pose data, the AT force predictions still require validation through laboratory-based experiments. It would be desirable to validate AT force predictions using an approach similar to that used for joint contact forces, wherein comprehensive datasets of direct measurements enabled validation of computational models [61]. However, no dataset is currently available that combines invasive measurements of AT force with non-invasive motion capture, making such validation infeasible. Our NMSK modeling pipeline was previously shown to produce physiologically plausible estimates of joint contact forces [13], [15], [56], [62], muscle forces [63], [64], and AT forces [12]; as such, we considered the NMSK model estimates as the best available proxy for ground truth in the present study. Additional confidence in our AT force estimates from the NMSK model came from the strong correspondence with AT forces for walking, running, and jumping reported in the literature [8], [9], [47] (R² = 0.86∼0.94). Nonetheless, given these limitations, we suggest caution when interpreting the presented results or when employing similar neural networks for clinical applications. Future work should focus on assessing whether model predictions are sufficiently accurate to detect clinically meaningful changes during and following training. There is also a need to validate these task-specific LSTM models against a concurrently measured ground truth outside the laboratory. Additionally, a task classifier to enable real-time selection of the most appropriate task-specific LSTM model would facilitate clinical applications. To ease translation outside the laboratory, the current study incorporated only kinematic and anthropometric information as model input. Using EMG as an additional LSTM model input in future studies may improve model accuracy and generalizability. Finally, the current models only work effectively for videos with a high sample frequency (i.e., ∼250 Hz); nonetheless, high-speed cameras are common in modern smartphones.

V. CONCLUSION
We demonstrated that time-series AT force could be predicted using smartphone video data. This proof of concept could enable rapid assessment of AT biomechanics in large cohorts, for clinic and remote rehabilitation, and realize the development of mechanobiology-inspired exercise programs to drive optimal tissue adaptation.

Fig. 1. Overview of the workflow for model training and validation of AT force using synthesized pose estimation data and further testing using real-world pose estimation data (NMSK: neuromusculoskeletal; GRF: ground reaction force; EMG: electromyography; AT: Achilles tendon; MOCAP: motion capture system; US: ultrasound; LSTM: long short-term memory neural network).

Fig. 3. The AT force prediction metrics for 16 participants across different motor tasks (the number of trials for each task is shown at the bottom of the figure; the first participant did not complete the CMJ task. CMJ: countermovement jump; SLL: single leg landing; SLHR: single leg heel rise; NMSK: neuromusculoskeletal; LSTM: long short-term memory neural network; R²: coefficient of determination; nRMSE: normalized root mean square error).