Continuous Motion Intention Prediction Using sEMG for Upper-Limb Rehabilitation: A Systematic Review of Model-Based and Model-Free Approaches

Upper limb functional impairments persisting after stroke significantly affect patients’ quality of life. Precise adjustment of robotic assistance levels based on patients’ motion intentions using sEMG signals is crucial for active rehabilitation. This paper systematically reviews studies from the past decade on continuous prediction of motion intention for upper limb single joints and multi-joint combinations using Model-Based (MB) and Model-Free (MF) approaches, based on 186 relevant studies screened from six major electronic databases. The findings indicate ongoing challenges in terms of subject composition, algorithm robustness and generalization, and algorithm feasibility for practical applications. Moreover, they suggest integrating the strengths of both MB and MF approaches to improve existing algorithms. Therefore, future research should further explore personalized MB-MF combination methods incorporating deep learning, attention mechanisms, muscle synergy features, motor unit features, and closed-loop feedback to achieve precise, real-time, and long-duration prediction of multi-joint complex movements, while further refining the transfer learning strategy for rapid algorithm deployment across days and subjects. Overall, this review summarizes the current research status, significant findings, and challenges, aiming to inspire future research on predicting upper limb motion intentions based on sEMG.

significantly affecting their quality of life. Hence, expedited upper limb rehabilitation is of paramount importance. A study on rehabilitation robotics-assisted therapy [2] has shown that passive movements assisted by rehabilitation robots alone do not improve patients' motor function. Instead, the training intensity and the patients' active participation are crucial factors in improving rehabilitation outcomes, rather than the mere use of the robot itself. This aligns with theories of neural plasticity, where active patient participation is crucial for inducing neural plasticity and improving rehabilitation efficiency. Furthermore, research [3] has also indicated that the motor function improvements provided by current rehabilitation robots are limited, highlighting the need for more effective assist-as-needed (AAN) control strategies beyond traditional impedance and admittance control, aiming to maximize training intensity while ensuring active patient participation. Thus, precisely adjusting the robotic assistance level according to the patients' motion intentions is pivotal for achieving active rehabilitation.
Studies [4], [5], and [6] categorized the signals used to infer patients' motion intention into biological and non-biological signals. Biological signals include Electroencephalography (EEG), Electromyography (EMG), Force Myography (FMG), and Mechanomyography (MMG), while non-biological signals consist of video, Inertial Measurement Unit (IMU), and force sensors. However, non-biological signals have inherent time delays and cannot predict motion intentions when the patient's limb is static, which makes them unsuitable for stroke and amputee patients. In contrast, EMG precedes the resulting movement by an Electromechanical Delay (EMD) of 50-100 ms, which reduces prediction latency, and it is more stable and less susceptible to interference in practical applications than EEG. Moreover, EMG sensors are more portable and easier to wear than EEG sensors. Therefore, predicting patients' motion intentions using surface EMG (sEMG) is a highly promising approach.
Current sEMG-based motion intention prediction research can be categorized into discrete classification and continuous regression. However, as discussed in the review [6], only 11.6% of studies from 1996-2017 focused on continuous regression, and the first review on sEMG-based continuous motion intention estimation was not published until 2019 [4]. Moreover, according to the prediction methods used in previous studies, continuous regression can be divided into Model-Based (MB) and Model-Free (MF) approaches. Hence, given the current research landscape and the fact that rehabilitation robots are continuously controlled, this review will focus on studies using MB and MF methods for continuous upper limb motion intention (i.e., joint kinematics and dynamics) prediction.
As illustrated in Figure 1, MF approaches mainly encompass Machine Learning (ML) and Deep Learning (DL) methods. The difference between ML and DL is that ML requires manual feature extraction and selection from preprocessed sEMG signals. In contrast, DL can automatically extract advanced features from sEMG and utilize the neural network's potent fitting capacity to approximate the highly nonlinear relationship between features and motion intentions, thereby avoiding the reliance on empirically chosen optimal feature sets that ML requires [5]. Although end-to-end MF methods are convenient to train and quick to deploy, their inherently 'black box' nature may overlook the physiological causal relationships between input and output data, so they can struggle to generalize beyond the training data and risk overfitting [6], [7], [8]. In contrast, MB methods with inherent physiological causality can convert sEMG signals to muscle-tendon forces according to the neural-physiological mechanisms of muscle activation and contraction dynamics, before predicting joint kinematics and dynamics from joint torques using musculoskeletal (MSK) geometry and Newtonian motion equations [6]. Additionally, most MB studies employed Hill-type MSK models with series elastic (SE), contractile element (CE), parallel elastic (PE), and viscoelastic (VE) components. However, due to physiological differences among patients, the generic proportional Hill model derived from extensive cadaveric specimens can lead to significant prediction errors. Therefore, parameter optimization for subject-specific MSK models in the final stage of MB methods is necessary to achieve precise predictions [9], [10].
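To make the MB chain described above more concrete (normalized sEMG excitation, then muscle activation, then muscle-tendon force), a minimal Python sketch is given here. It is an illustration under stated assumptions, not the formulation of any reviewed study: it assumes a rigid tendon, a commonly used exponential activation mapping with a hypothetical shape factor, a Gaussian force-length curve, and a Hill hyperbola restricted to shortening. All function names and parameter values are hypothetical.

```python
import math

def neural_to_activation(u, shape=-2.0):
    """Map normalized excitation u in [0, 1] to activation in [0, 1].
    Uses a(u) = (exp(A*u) - 1) / (exp(A) - 1); A is an assumed shape factor."""
    u = min(max(u, 0.0), 1.0)
    return (math.exp(shape * u) - 1.0) / (math.exp(shape) - 1.0)

def force_length(l_norm, width=0.45):
    """Gaussian active force-length curve; peak at optimal fiber length (l_norm = 1)."""
    return math.exp(-((l_norm - 1.0) / width) ** 2)

def force_velocity(v_norm, curvature=0.25):
    """Hill hyperbola for shortening only.
    v_norm = shortening velocity / maximum velocity, clamped to [0, 1]."""
    v = min(max(v_norm, 0.0), 1.0)
    return (1.0 - v) / (1.0 + v / curvature)

def passive_force(l_norm, k_pe=4.0, strain_max=0.6):
    """Exponential parallel-elastic (PE) force, active only beyond optimal length."""
    if l_norm <= 1.0:
        return 0.0
    return (math.exp(k_pe * (l_norm - 1.0) / strain_max) - 1.0) / (math.exp(k_pe) - 1.0)

def hill_force(u, l_norm, v_norm, f_max=1000.0):
    """Rigid-tendon Hill-type force: contractile element (CE) plus parallel element (PE)."""
    a = neural_to_activation(u)
    active = a * force_length(l_norm) * force_velocity(v_norm)
    return f_max * (active + passive_force(l_norm))
```

In a subject-specific model, quantities such as `f_max`, the optimal fiber length, and the shape factor would be the parameters tuned per subject, which is the optimization step the paragraph above refers to.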
In summary, the main contributions of this review towards achieving more effective active rehabilitation are threefold: 1) Comprehensively collect and screen all MB and MF studies in the past decade from six major databases and illustrate the current research landscape. 2) Review and identify the research results, progress, and corresponding limitations of MB and MF studies based on different single joints and multi-joint combinations. 3) Conclude and analyze the key findings, challenges, and opportunities from MB and MF studies to determine future research directions.
The remainder of this paper is structured as follows: Section II introduces the literature collection methodology and selection criteria, summarizing the screening results. Section III provides a comprehensive review of the collected MB and MF studies, organized by single joint and multi-joint combination. Section IV discusses the key findings, challenges, and future research directions identified from MB and MF studies over the years. Finally, Section V concludes this review.

II. METHODS

A. Search Strategy
To conduct a comprehensive systematic review of the studies based on MB and MF methods, this review initially employed search keywords to query six major electronic databases, namely PubMed, Web of Science, Scopus, IEEE Xplore, ScienceDirect, and SpringerLink. According to initial search results, it was observed that the volume of relevant studies began to increase from 2012. Therefore, the review's search span was set to cover the past decade, collecting all pertinent articles published from January 2013 to June 2023 within these databases, following the PRISMA guidelines. Additionally, relevant literature from Google Scholar was also selectively included to ensure a systematic and exhaustive search outcome. The following keywords were utilized for the literature search:
((EMG OR sEMG) AND (Continuous) AND (Prediction OR Estimation) AND (Shoulder OR Elbow OR Wrist OR Hand OR Finger OR Upper-Limb))

B. Inclusion and Exclusion Criteria
Initially, since this review exclusively focused on studies that involve continuous regression instead of discrete classification, research on pattern recognition, classification, and piecewise discretization was excluded from consideration. Additionally, only articles published in English and accessible to the author were considered for inclusion.
Subsequently, this review concentrated solely on MB and MF studies driven by sEMG or intramuscular EMG (iEMG) signals, thereby excluding studies that exclusively utilize other motion intention signals for prediction without sEMG or iEMG signals. Excluded signal sources encompass non-biological signals, such as IMU, Kinect cameras, Electrical Impedance Tomography (EIT), ultrasonic sensors, and
Finally, this review was dedicated to upper limb joints, thereby excluding all other joints, such as lower limb joints. Additionally, the MSK models discussed in reviewed studies were restricted to macroscopic physiological models of muscles, tendons, and bones, excluding, for instance, the finite element models based on the stress-strain relationships amongst muscle tissues. Furthermore, the review of MSK models was limited to human models without addressing other biological species.

C. Study Selection Results
As depicted in Figure 2(a), 674 relevant publications were identified using the specified search keywords in the selected databases, which included 33 supplemental publications from Google Scholar. Following the application of inclusion and exclusion criteria, 186 publications were ultimately included in this review, with 36 and 150 studies based on MB and MF approaches, respectively.
Figures 2(b) and 2(c) show the summary and categorization of the 186 selected publications. Figure 2(b) indicates a gradual increase in the number of publications related to continuous prediction of upper limb motion intentions from 2013 to 2020, followed by exponential growth beginning in 2021. This trend suggests that, unlike discrete classification in pattern recognition, the continuous prediction of upper limb motion intentions is an emerging research field that has garnered extensive attention in the past three years. Figure 2(c) compares the current research status of each upper limb single joint and multi-joint combination, revealing that most studies focus on the elbow, wrist, and hand joints, as well as the hand-wrist and elbow-shoulder combinations. In contrast, there are only a few MF studies concerning the shoulder joint, the wrist-elbow joint combination, and the entire upper limb, and no MB studies on these.

III. RESULTS
Before conducting a comprehensive review of MB and MF studies based on each single joint and multi-joint combination, Figure 3 below illustrates the distribution of methodologies applied in both MB and MF research, as well as the percentages of offline/online prediction, subjects/databases, and different subject attributes in these studies. It is observed that approaches in MB studies can be classified into six categories. In addition to the Hill model, the integrative approaches combining MB with MF are also of particular interest. Regarding the MF research encompassing eleven approaches, besides the highly regarded traditional neural network and DL model, approaches based on attention mechanisms, muscle synergy (MS) features, and motor unit (MU) neural features have also shown promising research potential, meriting further in-depth investigation.
A further subdivision within the predominant categories of traditional neural network and the DL model is warranted to provide a comprehensive review of studies employing the MF approaches. The collected 37 studies based on traditional neural networks can be further divided into three categories: Feedforward Neural Networks (FNNs: 25 studies), Traditional Recurrent Neural Networks (T-RNNs: 11 studies), and Spiking Neural Network (SNN: 1 study). The FNNs include Artificial Neural Network (ANN), Multilayer Perceptron (MLP),
Regarding the predictive content of the collected research, the following movements were predicted for each upper limb joint under various angular and force ranges, velocities, loads, and durations: internal/external rotation of the shoulder joint, as well as the adduction/abduction and flexion/extension in both vertical and horizontal planes; flexion/extension of the elbow joint in the vertical and horizontal planes; flexion/extension, ulnar/radial deviation, and pronation/supination of the wrist joint; the independent and simultaneous flexion/extension of the Metacarpophalangeal (MCP), Proximal Interphalangeal (PIP), and Distal Interphalangeal (DIP) joints, encompassing both single-finger and multi-finger movements, along with grasp tasks based on different grip strengths and different object sizes and shapes. Furthermore, the studies also conducted experiments based on static isometric contraction under different intensities, compound synergistic movements, and mirrored movements, as well as the simultaneous prediction of joint kinematics and dynamics. Additionally, over one-third of these studies utilized the public NinaPro dataset, along with other public datasets (e.g., putEMG-Force [11], Biopatrec [12], and KIN-MUS UJI [13]) for the development of prediction algorithms.

A. MB Approaches
1) Shoulder Joints: Compared to other upper limb joints, the shoulder joint is less studied due to its anatomical complexity, diverse multi-degree of freedom (DoF) movement patterns, and the challenges posed by sEMG signal acquisition. Study [14] integrated a muscle activation model with EMD and the Extreme Learning Machine (ELM) to achieve real-time prediction with a low latency of 32 ms at random movement speeds.
2) Elbow Joints: Both studies [15] and [16] utilized the simplified Hill model containing only CE. Specifically, [16] combined the Hill model with a state-switching model for real-time prediction. However, the state-switching model exhibits time delays, and the robustness of the simplified model is poor for small changes in joint angles.
Apart from the study [17], which used a complete Hill model encompassing CE, PE, and SE, studies [18], [19], [20], [21], [22], [23] employed the rigid-tendon Hill model containing only the CE and PE. However, because the rigid-tendon Hill model neglects muscle stiffness variation, it yields substantial torque prediction errors. As a refinement, [20] employed a Hill model optimized via Genetic Algorithm (GA) and the Short-Range Stiffness (SRS) model based on torque balance equations to predict joint angles and time-varying stiffness concurrently. Research [21] employed GA to optimize the Hill model and enhanced the model with additional physiological parameters, achieving robustness across various movement loads. Regarding other optimization algorithms, [23] optimized the Hill model using nonlinear least squares.
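The idea of tuning model parameters to a subject with a Genetic Algorithm can be sketched as follows. This is a toy illustration, not the procedure of [20] or [21]: the two-parameter `torque_model` is a hypothetical stand-in for a full Hill model, and the population size, elitism ratio, and mutation scale are assumed values chosen only for readability.

```python
import random

def torque_model(activation, params):
    """Toy stand-in for an MSK torque prediction: gain * activation ** exponent."""
    gain, exponent = params
    return gain * activation ** exponent

def cost(params, activations, torques):
    """Mean squared error between predicted and measured torque."""
    errors = [(torque_model(a, params) - t) ** 2 for a, t in zip(activations, torques)]
    return sum(errors) / len(errors)

def genetic_fit(activations, torques, bounds, pop_size=30, generations=40, seed=0):
    """Minimal GA with elitism, uniform crossover, and Gaussian mutation.
    Returns the best parameter vector and the best cost seen in each generation."""
    rng = random.Random(seed)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=lambda p: cost(p, activations, torques))
        history.append(cost(pop[0], activations, torques))
        elite = pop[: pop_size // 5]          # best individuals survive unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            parent_a, parent_b = rng.sample(elite, 2)
            child = []
            for (lo, hi), gene_a, gene_b in zip(bounds, parent_a, parent_b):
                gene = gene_a if rng.random() < 0.5 else gene_b  # uniform crossover
                gene += rng.gauss(0.0, 0.05 * (hi - lo))         # Gaussian mutation
                child.append(min(max(gene, lo), hi))             # keep within bounds
            children.append(child)
        pop = elite + children
    best = min(pop, key=lambda p: cost(p, activations, torques))
    return best, history
```

Because the elite individuals are carried over unchanged, the best cost in `history` can never increase from one generation to the next. A gradient-based nonlinear least-squares solver, as used in [23], would replace the evolutionary loop with iterative local updates.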
Studies [15], [17], [18], [22] integrated the MB and MF approaches. Specifically, [15] employed polynomials to approximate the relationship between muscle-tendon force and elbow angle, replacing the MSK geometric analysis and motion equations in the MB approach. Moreover, [18] developed a State Space (SS) model optimized with Extended Kalman Filter (EKF) by merging the Hill model with time-domain (TD) features for real-time closed-loop estimation. Additionally, [17] compared the capabilities of the Hill model and BPNN in predicting elbow isometric contraction force, concluding that BPNN provides superior predictive performance due to the linear relationship between joint forces and sEMG, but it lacks model interpretability. Hence, combining MB and MF methods offers complementary advantages. For instance, [22] trained muscle activation optimization factors using the RBFNN, reducing the conversion bias between sEMG and muscle activation, thereby adapting to individual variances. The results indicated that this hybrid MB-MF approach outperformed standalone MB or MF methods, further enhancing the accuracy of joint torque predictions.
3) Wrist Joints: Studies [24], [25], [26], [27] employed the rigid-tendon Hill model. Specifically, [24] and [25] balanced the Hill model's predictive accuracy and computational complexity through sensitivity analysis and GA optimization, addressing the oversimplified model that tends to overestimate parameters and thus neglect subject specificity. Moreover, based on the mirrored bilateral motion experiment, [25] showed that there is no statistical difference between the performance of this method on the ipsilateral side and the contralateral side. Additionally, since the supinator is a deep-seated muscle that is challenging to measure directly via sEMG, studies [26] and [27] employed Non-negative Matrix Factorization (NMF)-based virtual MS co-activation to replace the muscle activation of the pronator and supinator, subsequently inputting them into the Hill model for prediction. Results indicated that this approach outperformed the linear regression (LR) and ANN based on TD and MS features, exhibiting robustness across various upper limb postures.
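For readers unfamiliar with NMF-based synergy extraction, a minimal sketch using the classical multiplicative update rules is shown here. It is a generic illustration rather than the specific pipeline of [26] or [27]: the input is assumed to be a non-negative matrix of sEMG envelopes (channels by samples), and the function names, iteration count, and random initialization are hypothetical choices.

```python
import random

def matmul(a, b):
    """Plain-Python matrix product of two lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(r) for r in zip(*m)]

def nmf(v, rank, iters=200, seed=0, eps=1e-9):
    """Factor non-negative V (channels x samples) as W (channels x rank) @ H (rank x samples).
    Columns of W act as synergy weightings; rows of H act as synergy activations."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + eps for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        # Multiplicative update for H: H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)] for i in range(rank)]
        # Multiplicative update for W: W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)] for i in range(n)]
    return w, h

def frobenius_error(v, w, h):
    """Reconstruction error ||V - W H||_F."""
    wh = matmul(w, h)
    return sum((a - b) ** 2 for ra, rb in zip(v, wh) for a, b in zip(ra, rb)) ** 0.5
```

The rows of `h` would play the role of the synergy co-activations that, in the approach described above, substitute for the activations of muscles that cannot be recorded directly.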
Studies [28], [29], [30] integrated the MB and MF approaches. Specifically, [28] utilized the BPNN to identify distinct motion phases based on muscle activation, subsequently employing the MSK model optimized by Bayesian LR for low-latency real-time joint force prediction. The results highlighted the superiority of the Bayesian LR over the GA in simplicity and efficiency. Additionally, [29] introduced physical MSK knowledge as a soft constraint added to the CNN model's loss function. This method's predictive performance was not only superior to Support Vector Regression (SVR), ELM, ML-ELM, and CNN but also had a simpler architecture than traditional CNN, requiring less training data and converging faster. Moreover, [30] furthered [29]'s work by sharing the pre-trained CNN parameters and updating only the fully connected layer for transfer learning. Results showcased that this method of sharing CNN's advanced sEMG feature extraction knowledge not only had an excellent convergence rate and generalization but also required minimal individual data for rapid transfer learning.
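One generic way to express a physics-based soft constraint in a loss function is sketched here. The exact formulation used in [29] is not reproduced in this review, so the form below (a data term plus a weighted penalty on disagreement with an MSK-model prediction) is an assumption, as are the function names and the weight value.

```python
def mse(pred, target):
    """Mean squared error between two equally long sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def physics_informed_loss(pred, measured, msk_pred, weight=0.1):
    """Data-fitting term plus a soft physics penalty.

    pred     : network outputs (e.g., joint torque)
    measured : ground-truth labels
    msk_pred : the same quantity computed by a musculoskeletal model (assumed available)
    weight   : hypothetical trade-off factor; 0 recovers a purely data-driven loss
    """
    data_term = mse(pred, measured)
    physics_term = mse(pred, msk_pred)
    return data_term + weight * physics_term
```

Because the physics term is a penalty rather than a hard constraint, the network may deviate from the MSK prediction where the data support it, which is what "soft" refers to in the text above.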
4) Hand Joints: Similar to research [17], the study [31] employed the filtered sEMG's Root Mean Square (RMS) signal as input and predicted using the same complete Hill model. Studies [32], [33], [34] compared the performance of muscle activation models and TD features combined with FNN, single-output Gaussian Process Regression (GPR), and multi-output GPR, respectively. The results indicated superior performance using muscle activation models with EMD over the TD features. Moreover, due to its ability to effectively model the inherent correlation between joints, multi-output GPR outperformed FNN and single-output GPR, underscoring the significance of considering MS features.
Unlike most studies employing the Hill model, study [35] was the only study that leveraged the Huxley model for real-time predictions. It reduced the computational complexity of the high-dimensional Huxley model using the spectral method, Galerkin method, and balanced truncation method, further employing Particle Swarm Optimization (PSO) for parameter optimization. As a result, this approach balanced the prediction performance and computational time of the Huxley model and showed excellent generalizability across days.
Regarding the MSK model built on MS features and MU neural features, study [41] suggested that the Synergistic Linear Regression Model (SLRM) based on Hierarchical Alternating Least Squares (HALS) and LR slightly surpassed traditional MSK approaches. Studies [42] and [43] integrated MS features with MSK models, where [42] modeled MS features extracted through NMF-HP with L2 regularization constraint (NMF-HP-L2) into the GO-optimized Hill model, achieving superior predictive performance and stability over both Hill and NMF-Hill models. Research [43] constructed the MSK model using MS features extracted from independent components obtained via Adaptive Mixture Independent Component Analysis (AMICA) with NMF, resulting in better predictive performance than traditional MSK models, LR, and SVR. Additionally, echoing findings from studies [44], [45], [46], [47], [48], [49], study [50] input MU discharge frequency extracted from HD-sEMG and FastICA into the GO-optimized Hill model. The outcomes highlighted that the MU-Hill model had significantly improved accuracy and robustness over Hill models based on TD features.
Regarding reinforcement learning (RL) based MSK models, studies [51], [52], [53] utilized the DDPG algorithm under the Actor-Critic framework and Proximal Policy Optimization (PPO), enabling multi-agents to compute joint angles using MSK's forward dynamics model based on the joint torque predicted by agents. The resulting RL-MSK model showed comparable accuracy to the Hill model but outperformed MLP and NARX, with strong robustness against movement speed variations. However, this method requires about 8 hours of training and does not generalize to untrained new data.
6) Elbow-Shoulder Joints: Studies [54] and [55] have constructed closed-loop SS models grounded on the Hill model and forward dynamics. Specifically, [54] integrated the complete Hill model with NARX state equations and BPNN measurement equations, then employed the Unscented Kalman Filter (UKF) for real-time prediction and closed-loop estimation, resulting in superior outcomes compared with open-loop estimation using solely NARX and BPNN. Research [55] combined the rigid-tendon Hill model with the fused TD features of sEMG and MMG to build the Unscented Particle Filter (UPF)-optimized SS model, outperforming the BPNN, SVR, and GRNN and significantly reducing the demand for training data volume.
b) Other methods: Study [57] utilized the capability of Fast Orthogonal Search (FOS) for rapid fitting and system identification of non-linear models. Additionally, this study considered the coupling effect between the output force and joint angles during shoulder activity to enhance prediction accuracy. This coupling effect mirrors the force-velocity and force-length relationships between muscle fibers and muscle force in the Hill model, suggesting integration of MF methods with the Hill model to further boost predictive performance.
b) Machine learning: Study [60] utilized tree-based Hierarchical Projection Regression (HPR) algorithms and incremental learning for real-time prediction of elbow joint angles under different loads. Study [61] employed Random Forests (RF) for the prediction and automatic selection of significant time-delayed features. Similarly, [62] incorporated Gray Feature Weighted Support Vector Machines (GFWSVM) to assign weights to sEMG features based on their significance.
c) Traditional neural networks: Studies [63], [64], [65], [66] leveraged extracted TD features to predict joint angles using ANN, WNN, and GRNN, respectively, demonstrating that WNN outperformed both Support Vector Machines (SVM) and RBFNN. Moreover, [63] highlighted the superior performance of RBFNN in predicting both joint angles and angular velocities compared to MLP. Numerous studies, including [67], [68], [69], [70], and [71], employed BPNN for prediction. Specifically, [67] showed that inter-subject variability significantly impacts the predictive performance by comparing generic and personalized models. Similarly, [71] evidenced greater inter-subject variability than intra-subject variability by comparing TD features and subject-invariant features extracted using Maximum Independent Domain Adaptation (MIDA). Consequently, [68] incorporated the GA feature selection to eliminate the inter-subject redundant and low-correlation features, thereby enhancing the inter-subject generalizability. Lastly, [72] introduced the SNN, which emulates biological neuronal spiking mechanisms and membrane potential variations, achieving predictive accuracy comparable to LSTM.
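Since many of the studies above feed time-domain (TD) features into their regressors, a minimal sketch of four widely used TD features computed over sliding windows is given here. The feature set, window parameters, and function names are illustrative assumptions; individual studies use different feature sets and window lengths.

```python
import math

def td_features(window, zc_threshold=0.0):
    """Common time-domain sEMG features for one analysis window.

    mav: mean absolute value      rms: root mean square
    wl : waveform length          zc : zero crossings above an amplitude threshold
    """
    n = len(window)
    pairs = list(zip(window, window[1:]))
    mav = sum(abs(x) for x in window) / n
    rms = math.sqrt(sum(x * x for x in window) / n)
    wl = sum(abs(b - a) for a, b in pairs)
    zc = sum(1 for a, b in pairs if a * b < 0 and abs(a - b) > zc_threshold)
    return {"mav": mav, "rms": rms, "wl": wl, "zc": zc}

def sliding_windows(signal, size, step):
    """Split a 1-D signal into overlapping windows of `size` samples, advancing by `step`."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, step)]
```

A feature matrix for one channel would then be `[td_features(w) for w in sliding_windows(signal, size, step)]`, with one row per window.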

B. MF Approaches
As for the time-series neural networks, studies [73] and [74] employed the TDNN for prediction based on TD features. On the other hand, studies [75] and [76] integrated the NARX model with the MLP, ElmanNN, and the Adaptive Neuro-Fuzzy Inference System (ANFIS) from [77] for predictions based on TD features, ultimately proving the superior predictive performance of the ANFIS-NARX model.
d) Deep learning: Studies [78] and [79] employed LSTM based on TD features for precise joint angle predictions. Study [80] showed the superiority of CNN-LSTM over individual CNN and LSTM models, emphasizing the importance of establishing long-term contextual dependencies among extracted advanced features.
e) Muscle synergy features: Study [81] fed MS features extracted through Multivariate Curve Resolution-Alternating Least Squares (MCR-ALS) into Bi-LSTM for prediction. The results indicated that MS features demonstrate greater robustness to sEMG variations than traditional TD features. Furthermore, MS features extracted by MCR-ALS maintained a higher intra-subject and inter-subject correlation across a five-day experimental period than those extracted using Principal Component Analysis (PCA) and NMF.
f) Other methods: Studies [82] and [83] applied Kalman Filters (KF) to achieve real-time predictions of joint angles and torques under varying loads and motion speeds. Studies [84] and [85] employed nonlinear system identification methods. Specifically, [84] validated the superiority of the Parallel Cascade Identification (PCI) model over the FOS algorithm in [57], and [85] presented the robustness of the Hammerstein-Wiener model coupled with WNN for predictions across days, although it still requires individual calibration and its predictive performance can be influenced by motion loads.
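As a simple illustration of the Kalman filtering principle mentioned above, the following sketch filters a scalar sequence under a random-walk state model. It is far simpler than the multi-dimensional filters used in [82] and [83]; the noise variances `q` and `r`, the initial values, and the function name are assumed for illustration only.

```python
def kalman_1d(measurements, q=1e-3, r=1e-1, x0=0.0, p0=1.0):
    """Scalar Kalman filter with a random-walk state model.

    q: process noise variance, r: measurement noise variance (both assumed values)
    x0, p0: initial state estimate and its variance
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        p = p + q              # predict: state unchanged, uncertainty grows
        k = p / (p + r)        # Kalman gain
        x = x + k * (z - x)    # correct the estimate with the measurement residual
        p = (1.0 - k) * p      # updated estimate variance
        estimates.append(x)
    return estimates
```

A larger ratio `r / q` yields smoother but slower-responding estimates, which is the trade-off to consider when such a filter is used for real-time prediction or post-processing.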
3) Wrist Joints: a) Linear models: Study [86] utilized ridge regression based on TD features for predictions and demonstrated that the Least Absolute Shrinkage and Selection Operator (LASSO) is an effective method for HD-sEMG channel selection. Additionally, study [87] indicated that although ANN outperformed LR and NMF in offline experiments, their online performances had no significant differences.
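Ridge regression, as referred to above, has the closed-form solution w = (XᵀX + λI)⁻¹Xᵀy. A minimal pure-Python sketch follows; the helper names are hypothetical, no intercept term is fitted, and a practical implementation would normally rely on a numerical library instead of hand-written elimination.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]   # augmented matrix
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[pivot] = m[pivot], m[c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= factor * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):                      # back substitution
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x

def ridge_fit(features, targets, lam=1.0):
    """Closed-form ridge regression weights: (X^T X + lam * I)^-1 X^T y (no intercept)."""
    d = len(features[0])
    xtx = [[sum(row[i] * row[j] for row in features) + (lam if i == j else 0.0)
            for j in range(d)] for i in range(d)]
    xty = [sum(row[i] * t for row, t in zip(features, targets)) for i in range(d)]
    return solve(xtx, xty)
```

Increasing `lam` shrinks the weights toward zero. LASSO replaces the squared penalty with an absolute-value penalty, which drives some weights exactly to zero and is what makes it usable for channel selection.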
b) Machine learning: Studies [88] and [89] input time-delayed TD features into Least Squares Support Vector Machine (LSSVM) and RF for predictions, respectively, achieving performance superior to that of SVM and BPNN. This again highlights the importance of considering EMD.
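Time-delayed features of the kind described above can be built by concatenating the current feature vector with earlier ones. The sketch below is a generic delay embedding; the choice of delays and the function name are assumptions, and the delays actually used in [88] and [89] are not specified in this review.

```python
def delay_embed(features, delays):
    """Concatenate feature vectors from several past time steps.

    features: list of per-window feature vectors (oldest first)
    delays  : non-negative integer lags in windows, e.g. [0, 1, 2]
    Returns one concatenated vector per time step that has all lags available.
    """
    max_delay = max(delays)
    return [
        [value for d in delays for value in features[t - d]]
        for t in range(max_delay, len(features))
    ]
```

With a window step of, say, 25 ms (an assumed value), lags of two to four windows would cover the 50-100 ms EMD range cited earlier, letting the regressor relate past muscle activity to the current movement.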
c) Traditional neural networks: In research [90], TD and FD features were fed into a GA-optimized ELM for prediction, achieving better results than GRNN. However, ELM still exhibits some instability. Studies [91] and [92] utilized ANN and BPNN based on TD features, respectively. Notably, [91] found that using sEMG signals based on large-area muscle activity superimposition outperformed non-directional iEMG signals targeting local muscle information, suggesting using HD-sEMG sensors to collect more comprehensive muscle activity data. Additionally, the study [93] also employed BPNN but incorporated both FD and time-frequency domain (TFD) features as inputs, further applying KF for post-processing.
d) Deep learning: Studies [94] and [95] showed that, compared to PCA, Deep Neural Networks based on Stacked AEs (SAE-DNN) can extract more representative sEMG features, hence achieving better predictions than LR and SVR.
Both studies [96] and [97] employed CNN. Specifically, [96] achieved superior predictive performance using raw TD sEMG images and Fast Fourier Transform (FFT) based FD images compared to six manually feature-engineered machine learning models, namely LR, SVR, RF, GPR, and MLP. Meanwhile, [97] indicated that CNN based on TD images outperformed ANN founded on Histogram of Oriented Gradients (HOG) features. Moreover, PCA was employed in this study to show the higher correlation of CNN-extracted spatial features with actual joint torques over pixel variation-focused HOG features and empirically-based manual features.
However, the generalization across days and subjects of CNN-LSTM still needs to be improved. This was further validated by [100], which utilized LSTM-AE to evaluate and quantify the domain shift in CNN-LSTM for the task across days based on the reconstruction error of CNN features, enabling model performance degradation monitoring and timely model recalibration. To further enhance generalization across subjects, [101] proposed a transfer learning method employing a dual-stream CNN to extract domain-invariant features from both source and target domain data, subsequently adjusting CNN weights via regression loss, Maximum Mean Discrepancy (MMD) loss, and regression contrastive loss.
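The MMD term mentioned above measures how far apart two feature distributions are. A minimal sketch of the biased squared-MMD estimate with a Gaussian kernel follows; the kernel bandwidth `sigma` and the function names are assumptions, and the way [101] combines this term with its other losses is not reproduced here.

```python
import math

def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel between two feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def mmd2(source, target, sigma=1.0):
    """Biased estimate of the squared Maximum Mean Discrepancy.

    source, target: lists of feature vectors from the two domains
    (e.g., features of a training subject and of a new subject).
    """
    def mean_kernel(a, b):
        total = sum(gaussian_kernel(x, y, sigma) for x in a for y in b)
        return total / (len(a) * len(b))

    return mean_kernel(source, source) + mean_kernel(target, target) \
        - 2.0 * mean_kernel(source, target)
```

Minimizing this quantity over the feature extractor's weights pushes source-domain and target-domain features toward the same distribution, which is the sense in which the extracted features become domain-invariant.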
e) Motor unit neural features: Study [102] applied the convolutional blind source separation to input HD-sEMG's Decomposed Spike Count (DSC) features and residual sEMG TD features into LR for prediction. Results indicated that MU features offer more significant improvements for amputees compared to TD features. Nonetheless, the DSC feature overlooks spatial information and interactions among MUs. Therefore, [103] first convolved the Motor Unit Spike Train (MUST) and Motor Unit Action Potential (MUAP) obtained from convolution kernel compensation (CKC) decomposition to reconstruct MU images, which were then fed into CNN for predictions, significantly outperforming the DSC feature-based LR, SVR, and ANN. Research [104] and [105] also utilized the CKC for MU decomposition and LR for predictions, with [104] further refining the MU pool to identify dominant MUs. Findings revealed that the MU twitch force model proposed in [105] outperformed both the DSC model and the MU discharge frequency model based on the Cumulative Spike Train (CST), MUST, and PCA.
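To illustrate how spike-based MU features can be formed once decomposition has produced binary spike trains, a minimal sketch is given here. The decomposition itself (e.g., CKC) is not shown, and the windowed count below is a generic approximation that may differ from the exact DSC definition in [102]; all names are hypothetical.

```python
def cumulative_spike_train(spike_trains):
    """Sum binary spike trains across motor units, sample by sample.

    spike_trains: list of equally long 0/1 sequences, one per decomposed motor unit.
    """
    return [sum(column) for column in zip(*spike_trains)]

def windowed_spike_counts(spike_train, size, step):
    """Count spikes in sliding windows of `size` samples, advancing by `step`."""
    return [
        sum(spike_train[i:i + size])
        for i in range(0, len(spike_train) - size + 1, step)
    ]
```

Counting spikes per window discards which unit fired and where it is located on the electrode grid, which is the loss of spatial information and MU interactions that the text above attributes to the DSC feature.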
f) Other methods: Study [106] applied the Gaussian Mixture Regression (GMR) statistical model to the symmetric positive-definite matrix manifolds, achieving better prediction performance than GMR in Euclidean space. Study [107] input time-delayed TD features into the Kernel Recursive Least Squares Tracker (KRLS-T) as an online non-linear adaptive filter for predictions, significantly outperforming ANN and Kernel Ridge Regression (KRR) because it combines non-linear kernel regression with the benefits of online adaptive estimation and a lower computational cost.

4) Hand Joints:
a) State space model: Studies [108] and [109] applied the SS model based on the N4SID parameter identification method to predict finger joint angles under different static wrist postures during the mirrored bilateral movement. Additionally, the study [110] utilized Recursive Least Squares (RLS) for SS model parameter estimation and the KF for post-processing, ultimately outperforming MLP, NARX, and LDA models.
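The Recursive Least Squares principle referred to above can be sketched for a linear-in-parameters model as follows. This is the textbook RLS recursion, not the specific SS-model estimator of [110]; the forgetting factor `lam`, the initialization constant `delta`, and the function name are assumed.

```python
def rls_fit(features, targets, lam=1.0, delta=1000.0):
    """Recursive Least Squares for y ~ w . x, processing one sample at a time.

    lam  : forgetting factor (1.0 means no forgetting)
    delta: initial scale of the inverse correlation matrix P (large = weak prior)
    """
    d = len(features[0])
    w = [0.0] * d
    P = [[delta if i == j else 0.0 for j in range(d)] for i in range(d)]
    for x, t in zip(features, targets):
        Px = [sum(P[i][j] * x[j] for j in range(d)) for i in range(d)]
        denom = lam + sum(x[i] * Px[i] for i in range(d))
        gain = [v / denom for v in Px]                       # gain vector
        err = t - sum(w[i] * x[i] for i in range(d))         # a priori prediction error
        w = [w[i] + gain[i] * err for i in range(d)]
        # P stays symmetric, so x^T P equals Px transposed.
        P = [[(P[i][j] - gain[i] * Px[j]) / lam for j in range(d)] for i in range(d)]
    return w
```

Each update costs on the order of d² operations and needs no stored history, which is why recursive estimators of this kind suit online parameter estimation.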
b) Linear and non-linear models: Study [111] implemented LR based on spatial filtering features for prediction, and study [112] merged ridge regression, an extension of LR, with incremental learning to achieve real-time predictions for grasping tasks across days. However, study [113] found that non-linear KRR outperformed the LR across tasks, emphasizing the importance of considering the non-linear relationship between sEMG and motion intentions. Studies [114] and [115] further compared linear and non-linear approaches, specifically the linear Vector Autoregressive Moving Average model with Exogenous inputs (VARMAX) against the non-linear Gaussian Process (GP). While their performances were comparable, GP's non-linearity could model more complex motion intentions. However, the computational cost of GP increases as the training data grow. Therefore, improvements can be inspired by the Sparse Pseudo-input Gaussian Process (SPGP) regression model in the study [116] or by extracting MS features via Gaussian Process Latent Variable Model (GPLVM) as in the study [117].
Authorized licensed use limited to the terms of the applicable license agreement with IEEE. Restrictions apply.
CNNs: According to [124], 1D-CNN exhibited superior real-time prediction performance for finger force compared to 2D-CNN and LR due to its ability to learn deeper advanced features while reducing data dimensions and avoiding redundant spatial information. [125] suggested that using TD and FD feature images as inputs to 2D-CNN can further reduce noise and improve predictive accuracy compared to raw sEMG images. Research [126] and [127] demonstrated that 3D-CNN can learn deeper muscle anatomy, MS, and motion velocity features from multiple electrode perspectives, enabling the prediction of untrained random new movements. Moreover, [131] and [132] noted that AlexNet performs better than ResNet, LSTM, and GRU. Additionally, [133] showed that TCN outperformed LSTM, and that the TCN model size can be reduced to 70.9 Kb with 4.76 ms latency using int8 quantization. Lastly, [134] indicated that sEMG signals contain atomic segments highly correlated with movement during specific time frames, suggesting the CNN kernel size should approximate or slightly exceed atomic segment length to fully extract sEMG features while minimizing network parameters. Therefore, the LS-TCN, whose kernel sizes are similar to the atomic segment length, addressed the feature extraction limitations of TCN and attained better performance than TCN, RNN, and SPGP.
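To make the kernel-size discussion concrete, the sketch below converts a segment duration into a kernel length in samples and checks how many past samples a small stack of causal dilated convolutions can see. The segment duration, sampling rate, kernel size, and dilations are hypothetical values, and the receptive field is a related quantity used here for illustration rather than the design rule of [134].

```python
import numpy as np

def causal_dilated_conv(x, w, dilation):
    """y[t] = sum_j w[j] * x[t - j * dilation], zero-padded on the left."""
    k = len(w)
    pad = (k - 1) * dilation
    xp = np.concatenate([np.zeros(pad), x])
    y = np.zeros(len(x))
    for j in range(k):
        start = pad - j * dilation
        y += w[j] * xp[start:start + len(x)]
    return y

def receptive_field(kernel_size, dilations):
    # One convolution per layer: each layer adds (kernel_size - 1) * dilation samples
    return 1 + (kernel_size - 1) * sum(dilations)

# Hypothetical atomic segment: 20 ms at a 2000 Hz sampling rate
segment_ms, fs_hz = 20.0, 2000.0
segment_samples = int(np.ceil(segment_ms * fs_hz / 1000.0))

# Empirical check: push an impulse through the stack and count non-zero outputs
kernel_size, dilations = 3, [1, 2, 4]
y = np.zeros(64)
y[0] = 1.0
for d in dilations:
    y = causal_dilated_conv(y, np.ones(kernel_size), d)
span = int(np.count_nonzero(y))

print(segment_samples, span, receptive_field(kernel_size, dilations))
```

With these assumed settings the stack covers fewer samples than the assumed segment, so a larger kernel or more dilated layers would be needed to span it.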
CNN-RNN Hybrids: [140]'s CNN-LSTM combines the advantages of CNN and RNN, and similar to [30], it can employ transfer learning by merely retraining the fully connected layer. To address LSTM's drawback of flattening multi-dimensional inputs into 1D vectors, which leads to spatial information loss when processing spatiotemporal data, [141]'s LE-ConvMN replaces the LSTM's fully connected layer with a 2D-CNN. Moreover, by progressively reducing the dimensions of long-exposure sEMG samples, it can extract high-dimensional spatiotemporal features across multiple electrode channels. Consequently, LE-ConvMN outperformed SPGP and LSTM in prediction and generalization across subjects and joints. However, due to its lengthy training time and high computational cost, LE-ConvMN is not suitable for real-time applications.
e) Muscle synergy features: Study [143] utilized the Common Spatial Pattern (CSP) algorithm frequently used in EEG analysis. The results show that, because CSP tends to learn the differential features between samples, it can differentiate the fingertip force signals of the highly correlated index and middle fingers in real time more effectively than NMF, without interference from inter-finger signal crosstalk. To enhance NMF's predictive performance, study [144] introduced the Hadamard product into NMF (NMF-HP), reducing erroneous estimations of non-active finger activations. Results indicated that NMF-HP can provide more accurate real-time estimations for simultaneously activated fingers than LR, CSP, and NMF. Moreover, for the dimensionality reduction and reconstruction of sEMG, studies [145] and [146] utilized Partial Least Squares Regression (PLSR), an extension of PCA, and NMF, respectively. Notably, study [146] successfully reconstructed the original sEMG data using only three MSs by iteratively optimizing the NMF activation coefficient matrix.
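The NMF-based synergy extraction mentioned above can be sketched with the standard multiplicative update rules for the Frobenius-norm objective. This is a generic NMF on synthetic non-negative envelopes, not the NMF-HP variant of [144] or the procedure of [146]; the channel count, synergy count, and the variance-accounted-for (VAF) measure are assumptions for illustration.

```python
import numpy as np

def nmf(V, n_synergies, n_iter=500, eps=1e-9, seed=0):
    """Factorise V (channels x samples) as W @ H with W, H >= 0."""
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, n_synergies)) + eps   # synergy weights per channel
    H = rng.random((n_synergies, n)) + eps   # activation coefficients over time
    for _ in range(n_iter):
        # Multiplicative updates keep both factors non-negative
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def vaf(V, W, H):
    # Fraction of (uncentred) signal energy explained by the reconstruction
    return 1.0 - np.sum((V - W @ H) ** 2) / np.sum(V ** 2)

# Synthetic envelopes: 8 channels generated from 3 hypothetical synergies
rng = np.random.default_rng(1)
W_true = rng.random((8, 3))
H_true = rng.random((3, 400))
V = W_true @ H_true

W, H = nmf(V, n_synergies=3)
score = vaf(V, W, H)
print(f"VAF with 3 synergies: {score:.4f}")
```

In practice the number of synergies is often chosen by increasing it until a reconstruction measure such as VAF stops improving noticeably.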
f) Motor unit neural features: Study [111] emphasized that using monopolar electrode arrays and more sEMG channels can reduce prediction errors, again demonstrating the advantages of using HD-sEMG. Therefore, studies [44], [45], [46], [47], [48], [49] extracted the sum of discharge rates of MUs related to targeted fingers and tasks from HD-sEMG using FastICA and MU pool refinement and then input them to LR, enabling simultaneous prediction of joint angles and fingertip forces under various wrist postures. Studies [128], [129], [130] first estimated the overall discharge rates of FastICA-decomposed MUs by using parallel CNN based on FFT spectrograms and RMS TD images of HD-sEMG, and then input them to LR for real-time finger force and joint angle prediction. However, this approach still relied on the accuracy of FastICA.
g) Attention mechanisms: The potential of attention mechanisms was implied early in [11], where the gradient boosting machine (LightGBM) model could iteratively omit insignificant features, achieving better performance than LR, MLP, SVR, and CNN. Subsequent studies [142], [147], [148], [149] further introduced attention mechanisms. They incorporated self-attention or multi-head attention modules into MLP, CNN, and ConvGRU (the combination of GRU and 1D-CNN), achieving significantly higher accuracy and generalizability than LSTM, GRU, TCN, and SPGP, with shorter training times than GRU and LSTM. Furthermore, studies [150], [151], [152]
h) Other methods: Study [153] employed KF combined with TD features for prediction. Although this method exhibited some generalizability across subjects, its predictive capability largely depends on the accurate representation of model parameters for the target system. Study [119] employed Gene Expression Programming (GEP) based on GA and genetic programming, which outperformed BPNN. Study [154] utilized a logarithmic regression model, achieving real-time estimation of grasping forces under transient and steady states while being robust to sEMG drifts and instantaneous variations. Moreover, contrary to most methods based on steady-state sEMG, study [155] employed LR with elastic net regularization for accurate real-time prediction of grip force using a single transient sEMG activation, showing promising results even for amputees. Lastly, study [156] applied the energy conservation and transfer theory, stating that kinetic and potential energy within each finger dynamically interconvert and distribute within a given muscle activation level, but the total energy across all fingers remains constant. It initially extracted MS features with ICA, then deduced each finger's energy under the extreme conditions of complete fixation and free movements, and finally employed ANN to learn the real-time mapping between MS features and finger energy. Although this reduced computation costs and demands on training samples, prediction errors increased with energy growth, and finger flexion predictions outperformed extension.
b) Machine learning: Studies [158], [159], [160] deployed SVR based on TD, FD, and TFD features for joint angle and grip force predictions. Study [161] utilized TD features alongside Gradient Boosted Regression Trees (GBRT) built on cascaded decision trees to predict joint angles with the generalizability to untrained new data.
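Since TD features serve as inputs to many of the regressors above, a minimal sliding-window extraction can be sketched as follows. The four features shown (mean absolute value, root mean square, waveform length, zero crossings) are commonly used TD descriptors; the window length, step, and threshold are assumptions, and the cited studies may use different feature sets.

```python
import numpy as np

def td_features(window, zc_threshold=0.0):
    """Return [MAV, RMS, WL, ZC] for one single-channel sEMG window."""
    mav = np.mean(np.abs(window))                  # mean absolute value
    rms = np.sqrt(np.mean(window ** 2))            # root mean square
    step_sizes = np.abs(np.diff(window))
    wl = np.sum(step_sizes)                        # waveform length
    sign_change = window[:-1] * window[1:] < 0
    zc = int(np.sum(sign_change & (step_sizes > zc_threshold)))  # zero crossings
    return np.array([mav, rms, wl, zc], dtype=float)

def sliding_features(signal, win, step):
    """Stack TD features over overlapping windows (rows = windows)."""
    starts = range(0, len(signal) - win + 1, step)
    return np.stack([td_features(signal[s:s + win]) for s in starts])

# Toy alternating signal so the expected feature values are easy to verify by hand
signal = np.array([1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0])
feats = sliding_features(signal, win=4, step=2)
print(feats)
```

The resulting feature matrix (one row per window) could then be passed to a regressor such as SVR or GBRT.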
c) Traditional neural networks: Studies [122] and [162] indicated that TDNN could also be applied to amputees and found that RBFNN outperformed FNN, CFNN, and GRNN. However, RBFNN's prediction errors for finger grasping movements exceeded those for wrist joints.
d) Deep learning: Study [163] achieved low-latency real-time prediction with the Channel-wise-CNN model, where each kernel corresponded to an sEMG channel, and enabled transfer learning by updating only the fully connected layers. LSTM-based [164] found that using all sEMG channels outperformed one-to-one mapping. The Deep Kalman Filtering Network (DKFN) in [165] extracted advanced features via CNN and trained KF parameters using LSTM, outperforming CNN and CNN-LSTM. However, its performance was still limited by sEMG sequence lengths. The Temporal Convolution (TC) model in [12] utilized 1D-CNN and PCA for advanced feature extraction, AE for unsupervised learning of MS features, and finally RNN for real-time mapping between MS and motion intentions. This method outperformed instantaneous mixture models in MS reconstruction and predictive performance and showed generalizability to untrained new data. However, the correlation across subjects of AE-extracted MS features was low, and the suppression of inactive joint activations was limited, suggesting improvements via NMF-HP-L2 as in [42].
f) Other methods: Study [167] harnessed the statistical model based on GMM and the Hidden Markov Model (GMM-HMM), using the Viterbi algorithm and model pruning to compute state probabilities and establish long-term memory, outperforming LSTM and GRU in both accuracy and computation time.
6) Wrist-Elbow Joints: Three studies on the wrist-elbow joint combination were all conducted based on the MF method and involved LSTM. Study [168] demonstrated that LSTM is more suitable than the GA-optimized BPNN for simultaneously predicting multi-joint movement. Study [169] established that LSTM, employing correlation-based feature selection and PSO optimization, outperformed BPNN and required less training time. Lastly, study [170] integrated CNN-LSTM with self-attention and KF (Attention-CNN-LSTM-KF), achieving superior prediction performance over CNN, CNN-LSTM, Attention-CNN, CNN-KF, and CNN-LSTM-KF, further emphasizing the benefits of employing attention mechanisms and KF post-processing.

7) Elbow-Shoulder Joints:
a) Traditional neural networks: Studies [171] and [172] utilized PCA and ICA, respectively, to extract MS features with MLP and ANN for prediction, demonstrating superior source muscle activity separation using ICA compared to PCA. Studies [173] and [174] utilized TD features and BPNN for predictions, with [174] introducing an AE before BPNN input to extract advanced features through unsupervised learning. As for studies [175] and [176] based on RNN, [175] employed TFD features for prediction with GA-optimized ElmanNN, while [176] utilized time-delayed TD features with RFNN to enhance robustness to movement speed variations. However, although GA-ElmanNN outperformed both ElmanNN and GA-BPNN, GA increased the computational cost for real-time prediction.
b) Deep learning: Study [177] applied Squeeze-Excitation Network (SE-Net) prior to using the TCN, where SE-Net can increase the weight of features that dominate muscle movement and TCN can overcome LSTM's lengthy training and gradient explosion issues, outperforming both BPNN and LSTM. Studies [178], [179], [180] employed LSTM and Bi-LSTM, with [178] demonstrating the superior multi-joint predictive capability of LSTM over MLP, although the temporal variability of sEMG caused LSTM's accuracy to decline over time. Both [179] and [180] verified that Bi-LSTM not only outperformed MLP, CNN, LSTM, and GRU but also effectively addressed the issues of asynchrony and tremors between sEMG and joint angles caused by muscle deformation, while generalizing to untrained new data. Studies [181] and [182] respectively input TFD features and neural activation into CNN-LSTM for prediction, outperforming SVR and CNN, yet requiring improvement for generalization across days. Lastly, study [183] replaced the CNN in CNN-LSTM with Short-Connection AE (SCA), achieving better performance and generalization than MLP and CNN. This is attributed to AE's ability to extract redundant information of sEMG across different target movements, which is removed by short connections, allowing SCA to extract motion-specific information akin to MS features.
c) Other methods: Similar to SCA, study [184] initially employed the correlation-based redundancy segmentation to remove redundant multi-joint sEMG before using the same SS model and KF post-processing for real-time prediction as in [54]. Although it outperformed BPNN and NARX, its threshold selection method for redundancy segmentation has yet to be refined. Additionally, study [185] solely utilized the multi-parameter combined KF based on least squares estimation for prediction, reaffirming KF's robustness for random complex movements and its excellence as a post-processing method.
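KF post-processing of a predicted joint angle can be illustrated with a scalar Kalman filter under a random-walk state model. This is a minimal sketch and not the multi-parameter KF of [185] or the SS-KF pipeline of [54]; the process noise `q`, measurement noise `r`, and the synthetic trajectory are assumptions.

```python
import numpy as np

def kalman_filter_1d(z, q=1e-3, r=4e-2):
    """Filter a noisy sequence z assuming x[t] = x[t-1] + w, z[t] = x[t] + v."""
    x, p = z[0], 1.0          # initial state estimate and its variance
    out = np.empty(len(z))
    for t, zt in enumerate(z):
        p = p + q             # predict: uncertainty grows by process noise
        k = p / (p + r)       # Kalman gain
        x = x + k * (zt - x)  # update with the measurement residual
        p = (1.0 - k) * p     # posterior variance
        out[t] = x
    return out

# Synthetic example: a slow sinusoidal joint angle with noisy raw predictions
rng = np.random.default_rng(0)
t = np.linspace(0.0, 2.0 * np.pi, 1000)
true_angle = np.sin(t)
raw_pred = true_angle + 0.2 * rng.normal(size=t.size)
smoothed = kalman_filter_1d(raw_pred)

rmse_raw = np.sqrt(np.mean((raw_pred - true_angle) ** 2))
rmse_smooth = np.sqrt(np.mean((smoothed - true_angle) ** 2))
print(f"raw RMSE: {rmse_raw:.3f}, filtered RMSE: {rmse_smooth:.3f}")
```

A larger `q` tracks fast movements more closely at the cost of passing more noise, while a larger `r` smooths more but adds lag.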
8) (Hand), Wrist, Elbow, and Shoulder Joints: Most current rehabilitation robots focus on assisting the independent movement of selected upper limb joints, rarely providing coordinated multi-joint training for the entire upper limb.
Study [186] indicated that comprehensive multi-joint coordination training for the entire upper limb brings better rehabilitation outcomes for stroke patients compared to single-joint training. However, only nine studies based on MF methods predicted all upper limb joint combinations.
a) Traditional neural networks: Study [187] utilized TD features with GRNN to predict grasping and pushing-pulling forces under specific conditions. Study [188] found that NARX outperformed TDNN for prediction. Study [189] indicated that ElmanNN outperformed BPNN, reiterating the importance of considering contextual relationships.
b) Deep learning: Studies [190] and [191] noted that the open-loop LSTM had minor prediction errors due to uncertainties in the LSTM modeling process (e.g., number of hidden neurons and dataset sizes) and physiological effects like joint damping. Thus, they combined LSTM with Zeroing Neural Network (L-ZNN) and Noise-Tolerant ZNN (L-NTZNN) to construct error functions using ZNN closed-loop feedback to eliminate errors. Ultimately, L-NTZNN outperformed L-ZNN, L-GNN, LSTM, and GPR. Additionally, studies [192] and [193] employed two Bi-LSTMs for prediction and transfer learning, with the first as the shared network and new data only training the second personalized Bi-LSTM and the fully connected layer.
c) Other methods: Study [186] input muscle activations to the N4SID-optimized SS model for real-time prediction, while study [194] input TD features into three parallel linear-nonlinear cascaded regression decoders for low-latency real-time prediction. However, although these decoders operated rapidly, their nonlinear static functions still had limitations, suggesting further improvements through neural networks.

IV. DISCUSSION
This discussion section first summarizes key findings from research conducted over the past decade on motion intention prediction for upper limb single joints and multi-joint combinations. Subsequently, it critically examines current research limitations and challenges in this field, thereby proposing clear future research directions for upper limb motion intention prediction algorithms. Additionally, Table I below highlights those studies among the 186 adopted papers that were identified by the author as having significant referential value for future research.
A. Significant Findings
1) Advantages of HD-sEMG Sensors and Multi-Sensor Fusion: Studies [91], [111], [126], [127] indicated that standard sEMG sensors, apart from having significantly lower spatial resolution compared to HD-sEMG sensors, are also less efficient in capturing comprehensive and high-quality muscle activation information. HD-sEMG, with its broader electrode coverage, can acquire data from multiple electrode perspectives, thereby reducing the prediction errors. It also substantially mitigates the impacts of electrode placement errors, motion artifacts, and electrode displacement. Therefore, HD-sEMG exhibits inherent advantages over sEMG in extracting MU features and improving predictive performance. Regarding multi-sensor fusion, as discussed in studies [21], [55], [79], [114], [140], [151], [160], integrating sEMG sensors with EEG, IMU, FMG, and MMG sensors can further enhance the predictive performance and robustness, especially in scenarios of isometric contractions and under external force interference.
2) Closed-Loop Feedback: In contrast to open-loop models, studies [18], [54], [55], [190], [191] demonstrated the superiority of closed-loop models. These include constructing error functions using closed-loop feedback based on ZNN and NTZNN or employing KF post-processing based on prior knowledge and probability distribution, such as EKF and UKF, to eliminate cumulative errors inherent in open-loop models while enabling safer and more cautious control strategies. Additionally, MS and MU features based on redundant sEMG information and noise are also effective for closed-loop correction.
3) Attention Mechanisms: Studies [11], [62], [177] using GFWSVM, LightGBM, and SE-Net have already proven the significance of adjusting key feature weight distributions. In the author's view, sEMG signals during movement can be regarded as text or video, where each segmented sEMG time window represents a word in text or a frame in video, which suggests employing Natural Language Processing (NLP) algorithms to establish their intrinsic relationships. Hence, attention mechanisms, which can adaptively capture the contextual dependencies between local and global features to enhance the weight of key features and have parallel computing capabilities, have been extensively studied recently. As indicated in Table I, besides integrating attention mechanisms with DL networks, attention-based models like Transformer and its variant BERT are recent research trends that can be further expanded.
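Following the analogy of sEMG time windows as words or frames, a single-head scaled dot-product self-attention over a sequence of window feature vectors can be sketched as below. The projection matrices are random stand-ins for learned parameters, and the sequence length and dimensions are assumptions; this is not the architecture of any specific cited study.

```python
import numpy as np

def softmax(a, axis=-1):
    a = a - a.max(axis=axis, keepdims=True)   # subtract max for numerical stability
    e = np.exp(a)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """X: (n_windows, n_features). Returns attended features and attention weights."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[1])    # similarity of every window pair
    A = softmax(scores, axis=-1)              # each row sums to 1
    return A @ V, A

rng = np.random.default_rng(0)
n_windows, n_features, d_model = 10, 6, 4
X = rng.normal(size=(n_windows, n_features))  # one feature vector per sEMG window
Wq = rng.normal(size=(n_features, d_model))
Wk = rng.normal(size=(n_features, d_model))
Wv = rng.normal(size=(n_features, d_model))

out, attn = self_attention(X, Wq, Wk, Wv)
print(out.shape, attn.shape)
```

All windows are processed in a single matrix product, which reflects the parallel computing capability noted above; each row of `attn` shows how strongly one window draws on every other window.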
4) Muscle Synergy Features: Considering the individual variability and time-varying nature of sEMG signals, difficulties in deep muscle sEMG signal collection, as well as the sEMG coupling and crosstalk between active and passive muscles during movements, extracting MS features that are highly correlated with each type of movement is of great significance. Studies [19] and [164] affirm this, showing that even non-participating synergistic muscles can provide valuable contextual information, and separately extracting MS features for different joints can further improve the prediction

TABLE I STUDIES WITH SIGNIFICANT REFERENTIAL VALUE FROM THE 186 ADOPTED PAPERS
accuracy [41]. Moreover, studies [32], [33], [34], [146] proved that MS features not only outperform traditional TD features but also improve performance for complex multi-joint movement prediction and are robust against the variations across days and electrode displacement. Additionally, studies [131], [132], [183] indicated that since sEMG signals consist of common components, individual differences, and random noise, MS analysis can eliminate redundant individual differences and noise as well as extract highly correlated inter-subject sEMG features to enhance the generalizability across subjects. As indicated in Table I, currently outstanding MS feature extraction methods include MCR-ALS [81] and NMF-HP-L2 [166].
5) Motor Unit Neural Features: MU signals provide a more direct reflection of neural information transmitted from the brain to muscles than sEMG signals formed by the superposition of MUAPs. This has been substantiated by studies [44], [45], [46], [47], [48], [49], [102], [128], [129], [130], which demonstrated that MU neural features, such as the DSC features, MU discharge rates, and MU images, are not only unaffected by movement speeds and contain additional information not captured by TD features, but are also more robust against various forearm postures, sEMG crosstalk, and electrode and muscle fiber displacement. Furthermore, combining MU decomposition methods like FastICA and AMICA with HD-sEMG can more effectively separate mixed signals from superficial and deep muscles. Additionally, given the more direct relationship between sEMG and MU activities compared to joint kinematics and dynamics, directly predicting MU activities using DL-based methods surpasses the traditional MU decomposition and TD feature-based methods in performance, computational efficiency, generalizability across subjects and fingers, and the robustness of long-duration prediction [128].
6) Integration of MB-MF Methods: Studies [24] and [27] indicated that since MF methods ignore the physiological relationships among the muscle activation, muscle-tendon force, joint torque, and joint motion, as well as lack model interpretability, they may fail to predict new movements not covered in training datasets and risk overfitting, which reduces robustness. Although biomechanics-based MB methods can explicitly define the exact relationships between sEMG and motions, overly complex MB models with numerous physiological parameters are not conducive to real-time applications, while oversimplified MB models that do not consider individual differences may also increase prediction errors. Therefore, combining the strengths of MB and MF methods so that they complement each other is proposed. Examples demonstrated in Table I include using RBFNN to train the muscle activation optimization factors [22], because specific muscle activation models are suited to certain actions [19]; adding physiological MSK constraints to CNN loss functions [29]; and combining the Hill model with MF methods to construct SS models [54], [55].
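The idea of adding physiological constraints to a network loss can be sketched as a weighted sum of a data term and a physics residual. The example below uses a toy single-degree-of-freedom equation of motion (inertia times angular acceleration plus damping times angular velocity equals joint torque) as a stand-in physics model; the equation, the parameter values, and the torque signal `tau_msk` are assumptions for illustration and do not reproduce the constraints of [29].

```python
import numpy as np

def data_loss(q_pred, q_meas):
    # Mean squared error between predicted and measured joint angles
    return float(np.mean((q_pred - q_meas) ** 2))

def physics_loss(q_pred, tau_msk, dt, inertia, damping):
    # Central finite differences on interior samples
    dq = (q_pred[2:] - q_pred[:-2]) / (2.0 * dt)
    ddq = (q_pred[2:] - 2.0 * q_pred[1:-1] + q_pred[:-2]) / dt ** 2
    # Residual of the toy equation of motion: I * ddq + b * dq - tau = 0
    residual = inertia * ddq + damping * dq - tau_msk[1:-1]
    return float(np.mean(residual ** 2))

def total_loss(q_pred, q_meas, tau_msk, dt, inertia, damping, weight):
    # `weight` balances the data term against the physiological term
    return data_loss(q_pred, q_meas) + weight * physics_loss(
        q_pred, tau_msk, dt, inertia, damping)

dt, inertia, damping = 0.01, 1.0, 0.5
t = np.arange(0.0, 1.0, dt)
q_meas = t ** 2                                        # hypothetical measured angle
tau_msk = inertia * 2.0 + damping * 2.0 * t            # torque consistent with q_meas

q_good = q_meas.copy()                                 # physically consistent prediction
q_bad = q_meas + 0.05 * np.sin(20.0 * t)               # prediction violating the dynamics

loss_good = total_loss(q_good, q_meas, tau_msk, dt, inertia, damping, weight=0.1)
loss_bad = total_loss(q_bad, q_meas, tau_msk, dt, inertia, damping, weight=0.1)
print(loss_good, loss_bad)
```

In a trainable model the same combined objective would be minimized with automatic differentiation, and the weight would need tuning so that neither term dominates.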
7) Transfer Learning: Considering the inherent physiological and muscle control strategy differences among subjects, inter-subject variability significantly exceeds the intra-subject variability [67], [71]. Moreover, based on the fact that extracting highly correlated inter-subject features from multi-subject
training can improve model generalizability, and current MB and MF methods' lengthy training time precludes the rapid deployment across subjects [68], the capability for effective and precise transfer learning becomes crucial. Current transfer learning strategies include retraining only the fully connected layers in CNN, Bi-LSTM, and CNN-LSTM networks for parameter sharing [30], [140], [192], adjusting CNN weights based on domain-invariant features and loss functions in dual-stream CNN [101], as well as the subject adversarial knowledge (SAK) strategy in [152].
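The strategy of retraining only the final layer can be sketched with a frozen feature extractor and a linear head refitted on a small calibration set from a new subject. Here a fixed random tanh layer stands in for pretrained convolutional or recurrent layers, and both subjects' data are synthetic; all names, sizes, and mappings are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
W_frozen = 0.3 * rng.normal(size=(8, 16))    # stand-in for pretrained, frozen layers

def features(X):
    # Frozen feature extractor: never updated during transfer
    return np.tanh(X @ W_frozen)

def fit_head(Phi, y, lam=1e-3):
    # Ridge least-squares fit of the final linear layer only
    return np.linalg.solve(Phi.T @ Phi + lam * np.eye(Phi.shape[1]), Phi.T @ y)

# Source subject: plenty of data, used to fit the original head
X_src = rng.normal(size=(300, 8))
y_src = X_src[:, 0] - 0.5 * X_src[:, 1]
head_src = fit_head(features(X_src), y_src)

# New subject: a small calibration set with a different input-output mapping
X_new = rng.normal(size=(40, 8))
y_new = 1.5 * X_new[:, 0] + 0.3 * X_new[:, 2]

W_before = W_frozen.copy()
head_new = fit_head(features(X_new), y_new)   # only the head is refitted

mse_before = np.mean((features(X_new) @ head_src - y_new) ** 2)
mse_after = np.mean((features(X_new) @ head_new - y_new) ** 2)
print(f"calibration MSE before: {mse_before:.3f}, after: {mse_after:.3f}")
```

The errors shown are measured on the calibration data itself; a held-out set from the new subject would be needed to judge how well the refitted head generalizes.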

B. Current Challenges
1) Experimental Protocols Supplement: As illustrated in Figure 3, over three-quarters of studies relied on self-organized experiments rather than public databases. Besides, approximately 90% of the experiments recruited fewer than 10 subjects, and on average, fewer than 15% of subjects were disabled. Moreover, the proportion of male subjects was much higher than that of females. Additionally, since most studies employed the Ninapro database, which is predominantly composed of simple and highly controlled movements, more complex ADL-based databases, such as the KIN-MUS UJI [13] and Biopatrec [12] databases, should be employed to more comprehensively test the developed predictive algorithms.
Regarding subjects, aside from recruiting more participants, studies [79], [81], [194] highlighted that motor control impairments and aberrant muscle activation patterns caused by neurological injuries can lead to tremors and unsmooth movements in stroke patients, potentially degrading prediction performance. Therefore, future experiments should also involve more stroke patients to assess the practicality of algorithms. Furthermore, because gender differences in sEMG signals lead to better overall prediction performance in males [136], future experiments should balance the gender ratio to test the algorithm's generalizability across genders. Additionally, study [177] suggested recruiting subjects with diverse ages, heights, weights, and occupations to maximize physiological variability coverage for enhancing the model's generalizability.
As for the experimental content, research [15], [16], [18], [22], [26], [54], [59], [178] indicated the need to include additional robustness and generalizability tests in future experiments. Robustness tests include sEMG signal crosstalk, drift, and electrode displacement; isometric contractions and external force disturbances; error effects caused by different upper limb postures and non-target joint movements; variations in movement speed and load; more complex random movements; muscle fatigue and skin sweating; as well as the model's long-duration predictive capability. Moreover, for rapid changes in movement speed, studies [61] and [62] suggested improvement through adaptive sliding windows. As for generalizability tests, since multi-joint training can provide greater therapeutic benefits for stroke patients compared to single-joint training [186], and since current research has demonstrated poorer predictive performance for multi-joint combinations than for single joints, it is essential to improve the algorithm's multi-joint predictive performance and generalizability across joints, days, and subjects.
Concerning the joint distribution of studies and the proportion of MB/MF methods employed, as illustrated in Figure 2(c), there is a significant scarcity of studies predicting motion intentions for the shoulder joint, wrist-elbow joint combination, and the entire upper limb. Additionally, among all 186 adopted studies, only approximately 20% have involved MB methods. Therefore, future research should not only further explore the aforementioned shoulder joint and joint combinations but also intensify efforts toward MB research to enhance the potential for discovering superior upper limb motion intention prediction algorithms, even extending to lower limb prediction algorithms.
Regarding the algorithm feasibility in practical applications, since offline prediction performance is not directly correlated to the real-time prediction capability [87], and it is currently challenging to distinguish between the subjects' adaptability and the genuine contribution of the algorithm during real-time prediction [37], greater emphasis should be placed on the real-time predictive performance of future algorithms. However, as Figure 3 illustrates, over three-quarters of studies involved only offline analysis without real-time validation. In addition, since the predictive performance of algorithms can also be influenced by motor noise from exoskeletons worn by subjects [188] and the contact forces between robots and patients in human-robot collaboration [69], more rigorous robustness tests in practical human-machine interaction environments are necessary.
2) MB and MF Methods: Regarding MB methods, it is crucial to further balance the real-time predictive performance of MSK models with the complexity of model parameters while also considering the changes in muscle stiffness. Therefore, sensitivity analysis could be conducted to identify the significance of each MSK parameter, such as the tendon length proportion factor and tendon length in the Hill model, which significantly impact the predictive performance, in contrast to the pennation angle [24]. MSK parameter optimization methods also need enhancement; for example, GA has a lengthy optimization process and a tendency to converge to local optima. Therefore, incremental and online learning could also be employed for real-time updating of MSK model parameters, beyond developing new optimization algorithms superior to GA, GO, Simulated Annealing (SA), and PSO. Regarding MF methods, since the Bi-LSTM and LSTM models, which cannot be trained in parallel, have lengthy training times, and the Transformer model, which can be trained in parallel, has high computational costs, the Hill model with inherent causality could replace them to establish more robust contextual relationships. In addition, current MB-MF methods can be improved, such as by upgrading the CNN in [29] to CNN-LSTM and adding more Hill model-based physiological constraints to the loss function while considering the weight distribution between CNN loss and physiological loss.
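One generic way to update model parameters online, sample by sample, is recursive least squares (RLS) with a forgetting factor. The sketch below applies it to a model that is linear in its parameters; real MSK models such as the Hill model are non-linear, so this is only an assumed simplification, and the parameter values and regressors are synthetic.

```python
import numpy as np

class RecursiveLeastSquares:
    """Online estimator for y = phi @ theta with exponential forgetting."""

    def __init__(self, n_params, forgetting=0.99, delta=100.0):
        self.theta = np.zeros(n_params)      # current parameter estimate
        self.P = delta * np.eye(n_params)    # inverse-correlation matrix (large = uncertain)
        self.lam = forgetting                # values below 1 discount old samples

    def update(self, phi, y):
        P_phi = self.P @ phi
        gain = P_phi / (self.lam + phi @ P_phi)
        self.theta = self.theta + gain * (y - phi @ self.theta)
        self.P = (self.P - np.outer(gain, P_phi)) / self.lam
        return self.theta

# Synthetic stream: three hypothetical parameters observed through noisy outputs
rng = np.random.default_rng(0)
theta_true = np.array([0.8, -0.3, 1.2])
rls = RecursiveLeastSquares(n_params=3)
for _ in range(500):
    phi = rng.normal(size=3)                           # regressor for this sample
    y = phi @ theta_true + 0.01 * rng.normal()         # noisy measurement
    rls.update(phi, y)

print(np.round(rls.theta, 3))
```

Each update costs a few small matrix-vector products, so the estimate can be refreshed at every new sample without re-running a batch optimizer.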
3) Muscle Synergy Features: The MS feature extraction methods used in current studies require further improvements. In addition to effectively suppressing erroneous estimations for inactive joint activations, enhancing the correlations across days and subjects of MS features is essential. Additionally, the coordinated movement caused by mechanical coupling between joint-tendon structures and skin should be considered [144], and real-time tests should be conducted in multi-joint complex movements with potential coupling among DoFs [166]. Furthermore, the challenges posed to MS extraction by different MU activation patterns during concentric and eccentric contractions need to be addressed [84].
4) Motor Unit Neural Features: Current MU decomposition and MU pool refinement still require improvements. Studies [44], [45], [46], [47], [48], [49], [105] have indicated that utilizing all MUs may obscure key information in dominant MUs, leading to prediction errors. Therefore, it is crucial to first increase the number of decomposed MUs and then accurately identify the dominant MUs of joint movements. However, the process of MU decomposition is time-consuming. Although the pre-calculated MU separation matrices can reduce computation time, their performance degrades over time. Hence, exploring incremental and online learning for MU separation matrices is imperative. In addition, the similarity in movement patterns caused by extreme MU discharge rates also reduces predictive performance, necessitating the improvement of MU feature robustness in complex and extreme movements.
5) Transfer Learning: Current transfer learning approaches are still limited by computational costs and hardware reset during recalibration [101], and online Domain Adaptation (DA) methods still suffer from significant delays [71]. In addition, only a few studies have considered the impact of intra-subject variability on model generalizability. Future research could explore unsupervised transfer learning strategies based on MSK physical constraints, and employ unlabeled data collected during daily practices for online parameter optimization and online transfer learning [30].

V. CONCLUSION AND FUTURE WORK
This review has comprehensively surveyed the studies conducted over the past decade on the continuous prediction of motion intentions for upper limb single joints and multi-joint combinations, detailing the MB and MF methods used in these scenarios. It is evident that integrating the strengths of MB and MF methods for prediction represents the future research trend. Moreover, to inspire future research, this review discussed the seven significant findings from past studies and the five major challenges currently faced in this field. It suggested that beyond refining subject structure, experimental content, and feasibility of algorithms in practical applications, it is also essential to enhance the robustness and generalizability of algorithms based on the physiological nature of motion intention generation and transmission, particularly focusing on improving MS and MU neural feature extraction. Therefore, future research can focus on the following aspects (refer to Section IV and Table I): 1) Extracting MS features from the perspective of muscle anatomy and dominant MUs, and then integrating with attention mechanisms for feature weight adjustment. 2) Establishing the biomechanical contextual relationship among MS features by using personalized MSK models. 3) Implementing closed-loop feedback based on KFs, redundant sEMG information, and multi-sensor fusion. 4) Employing real-time parameter update mechanisms based on incremental learning and online learning.
Regarding the limitations of this review, it may have overlooked equally valuable publications from earlier periods or other literature databases. Additionally, it may have overlooked studies published in other languages and those currently under review.
Overall, in the author's view, establishing a robust, precise, real-time, low-latency, and long-duration mapping between sEMG features and motion intentions is fundamentally crucial for practical motion intention prediction. Furthermore, in addition to employing one-to-one and one-to-many transfer learning across subjects to reduce training costs, generic models developed from multi-subject training and even model libraries akin to large language models could be utilized in the many-to-one and many-to-many new subject personalization scenarios.

Fig. 1. Continuous motion intention prediction process for both MF and MB approaches.

Fig. 2. (a) Flowchart of the study selection process based on PRISMA strategy. (b) Number of publications per year for both MB and MF approaches over the past decade. (c) Number of MB and MF studies for each single joint and multi-joint combination.

Force-Sensitive Resistors (FSR), as well as other biological signals like EEG, functional Magnetic Resonance Imaging (fMRI), and Near-Infrared Spectroscopy (NIRS).

Fig. 3. Percentage distribution of different method types, offline/online predictions, subjects/databases, subject numbers, and subject attributes in both MB and MF studies.

Backpropagation Neural Network (BPNN), Extreme Learning Machine (ELM), Radial Basis Function Neural Network (RBFNN), Wavelet Neural Network (WNN), and Generalized Regression Neural Network (GRNN). The T-RNNs include Elman network (ElmanNN), Nonlinear AutoRegressive with eXogenous inputs (NARX) model, Time Delay Neural Network (TDNN), and Recurrent Fuzzy Neural Network (RFNN). Regarding the 44 studies focused on DL models, these can be further delineated into four categories: Convolutional Neural Networks (CNNs: 12 studies), Advanced Recurrent Neural Networks (A-RNNs: 20 studies), convolutional-recurrent hybrid networks (10 studies), and Autoencoders (AEs: 2 studies), with the A-RNNs category comprising Long Short-Term Memory (LSTM), Bidirectional LSTM (Bi-LSTM), and Gated Recurrent Unit (GRU) networks.

Regarding the predictive content of the collected research, the following movements were predicted for each upper limb joint under various angular and force ranges, velocities, loads, and durations: internal/external rotation of the shoulder joint, as well as the adduction/abduction and flexion/extension in both vertical and horizontal planes; flexion/extension of the elbow joint in the vertical and horizontal planes; flexion/extension, ulnar/radial deviation, and pronation/supination of the wrist joint; the independent and simultaneous flexion/extension of the Metacarpophalangeal (MCP), Proximal Interphalangeal (PIP), and Distal Interphalangeal (DIP) joints, encompassing both single-finger and multiple-finger movements, along with grasp tasks based on different grip strengths and different object sizes and shapes. Furthermore, the studies also conducted experiments based on static isometric contraction under different intensities, compound synergistic movements, and mirrored movements, as well as the simultaneous prediction of joint kinematics and dynamics. Additionally, over one-third of these studies utilized the public NinaPro dataset, along with other public datasets (e.g., putEMG-Force [11], Biopatrec [12], and KIN-MUS UJI [13]) for the development of prediction algorithms.

1) Shoulder Joints: a) Traditional neural networks: Study [56] fed extracted TD and frequency-domain (FD) features into the TDNN with short-term memory capabilities for prediction.

2) Elbow Joints: a) Linear models: Research [58] integrated adaptive weighted peak sEMG signals with linear least squares to predict mirrored elbow movements. Study [59] demonstrated the superior predictive performance of the Autoregressive with Exogenous Input (ARX) model over Autoregressive Moving-Average with Exogenous Input (ARMAX), Autoregressive Integrated Moving-Average with Exogenous Input (ARIMAX), and SS models under various movement loads.
utilized attention-based Transformer models for further improvements, including the BERT, CNN-Transformer, and LSTA-Conv network based on the Long-Short Time Aggregation (LSTA) module. Specifically, BERT notably surpassed LSTM, TCN, and LE-ConvMN in prediction performance, training time, and generalization across subjects. CNN-Transformer was proven to outperform CNN-LSTM, Bi-LSTM, and Transformer. Finally, LSTA-Conv, incorporating the self-attention-based Transformer and multi-scale ResNet, outperformed RNN, LSTM, SPGP, and CNN-Attention.

5) Hand-Wrist Joints: a) Linear models: Study [157] combined TD features with linear system identification to compare four linear time-series models: ARX, Autoregressive exogenous Regularized (ARXR), Output Error (OE), and ARMAX. The findings indicate that the OE model demonstrated superior predictive capabilities.