Model Predictive Trajectory Tracking Control of Electro-Hydraulic Actuator in Legged Robot With Multi-Scale Online Estimator

This paper addresses the trajectory tracking problem for a constrained, highly dynamic electro-hydraulic actuator in the presence of time-varying parameters, high-frequency external load interference, measurement noise and unmeasurable states. An adaptive robust optimal control scheme is proposed for the electro-hydraulic actuator of a legged robot. The scheme is built on a linear time-varying model predictive controller (LTV-MPC) embedded with a multi-scale online estimator (MEKF). Operating on fast-varying and slow-varying time scales, the MEKF filters the measurable states, estimates the unmeasurable states, and estimates the time-varying parameters and the external load interference, all of which are integrated into the MPC model in real time. The LTV-MPC is a trajectory tracking controller designed as a constrained MPC with an approximate high-precision real-time model and a rapidly solved cost function, which guarantees that the input and output constraints are satisfied during receding-horizon optimal control. Finally, comparison experiments under a series of highly dynamic conditions show that the proposed controller has a simple design process, strong adaptive robustness and good trajectory tracking performance, which verifies the effectiveness of the control scheme.


I. INTRODUCTION
Electro-hydraulic actuators are extensively used in heavy-duty electromechanical systems and legged robots because of their high load capacity and large power density [1]-[7]. At present, the classical PD method is the most common choice for the position controller of electro-hydraulic actuators in legged robots. However, when parameter uncertainties and unknown load disturbances cannot be neglected, the system exhibits unexpected dynamic behavior and struggles to meet the performance requirements. In an electro-hydraulic actuator system, uncertainties are mostly caused by changes in the unknown viscous damping, the physical characteristics of the valve, the effective bulk modulus and the external load force [8]. Moreover, some parameters and the external load change significantly under different operating conditions [9]. To improve tracking performance, several mainstream model-based control schemes have appeared in the recent literature, such as adaptive robust control [10], [11], active disturbance rejection control [12], a method combining the Nussbaum function with adaptive control [13], adaptive backstepping control [14], sliding mode control [15] and an adaptive extended interference observer [16]. However, these schemes share some common problems. (1) When the system is disturbed by a large, high-frequency external load, adaptive methods suffer from high-gain feedback. When the system parameters change, a controller based on an extended state observer (ESO) shows poor estimation performance and requires complicated parameter tuning. (2) In full-state feedback, the impact of measurement noise is rarely considered or filtered. Practice shows that measurement noise has become the core obstacle to high tracking performance in some cases. The measurement signals are usually polluted by heavy noise, which seriously affects the control performance. These schemes are designed under a noise-free assumption; in practice, only various low-pass filters are used to mitigate the effect of noise, which causes severe phase lag in the high-frequency range. (3) All of these schemes are based on full-state feedback, which means that velocity and pressure signals are required in addition to the displacement signal. When the actuator cannot carry all of these sensors because of cost and/or structural size restrictions, such model-based control methods are difficult to realize directly. (4) Their control laws are fixed after design and are not related to the physical limits of the actuator, which makes the transient tracking performance unsatisfactory in practice.
In recent years, many studies have considered the above issues. In [17], [21], Yao and Gu et al. used an ESO to estimate the unmeasurable states of the electro-hydraulic system for model-based controllers. Yao et al. [18] proposed a desired compensation adaptive approach to alleviate the noise: the adaptive compensation and the regressor in the controller depend only on the desired trajectory and the online parameter estimates. Hence, the effect of measurement noise is reduced and high control performance can be expected.
The associate editor coordinating the review of this manuscript and approving it for publication was Jianyong Yao.
In legged robot applications, the leg performs a large swing motion at a frequency of 3.3 Hz (a high frequency). Meanwhile, the external load force on the actuator changes significantly, from −1500 N to 2000 N. In addition, system parameters such as the equivalent flow coefficient, the viscosity coefficient and the effective bulk modulus vary strongly with time. The system parameters and the external load force are collectively referred to as parameters in this paper. Furthermore, since the robot design sets strict upper limits on the structural size and weight of the actuator, only displacement, driving force and acceleration sensors of very small volume and weight can be fitted. As a result, velocity and pressures are not measured, and the measurable state (displacement) and signal (driving force) contain noise. All in all, the control scheme for the actuator must provide suitable tracking performance by (a) effectively overcoming the time-varying parameters, external load interference and measurement noise, (b) solving the problem that some states are not measurable, (c) constraining the control input and system output within their corresponding ranges, and (d) providing an optimal control action that can be computed in real time. Therefore, this actuator needs a more robust and optimal control scheme to meet requirements (a)-(d).
Over the last decades, Model Predictive Control (MPC) has been reported to provide strong interference attenuation for complex dynamic systems [19], so it is well suited to the constrained tracking control of actuators. Therefore, this paper focuses on the MPC scheme, which has advantages that the other methods mentioned above lack, namely constrained receding horizon optimization and dynamic optimal tracking control. Because of these advantages, MPC has recently begun to be applied in electro-hydraulic servo systems. On the premise of full-state feedback, Yuan et al. [20] developed a hybrid controller with proportional integral control (PIC) and model predictive control (MPC) for an electro-hydraulic servo system. In that work, PIC first acts as an inner-loop controller that tunes the nonlinear system into a new, approximately linear system, and MPC then acts as an outer-loop controller that improves the dynamic and static performance of this new system. However, the model parameters are estimated offline in this design, without considering the time-varying parameters and the dynamic external load interference. Moreover, the model used in the MPC is fixed and is not related to the current state trajectory. Zad et al. [21] proposed a robust model predictive controller for a direct drive hydraulic position servo system in the presence of unknown dynamics and uncertain nonlinearities. However, no disturbance estimation mechanism is considered in the design, which limits the tracking performance when the controller relies entirely on the fixed MPC model. In fact, as a model-based method, MPC relies on an accurate model to predict the future evolution of the state variables and generate the optimal control sequence, so unavoidable time-varying parameters and large external interference degrade the accuracy of MPC controllers. In [22], Gu and Yao et al. developed a method combining an extended state observer (ESO) with unconstrained MPC based on a prior model, using state and large disturbance estimation to improve tracking performance. However, tuning the ESO parameters is complex and takes time and experience, and once the prior model parameters used in the ESO change, the estimated states and disturbance show large errors.
Inspired by the above studies, this paper proposes an adaptive robust MPC trajectory tracking control scheme for constrained, highly dynamic actuators in legged robots, which aims to achieve the following objectives.
(1) Overcoming the time-varying system parameters, the large and severe external load interference and the measurement noise.
(2) Estimating the unmeasurable states.
(3) Making the control input and system output satisfy the constraints.
(4) Reducing the design and calculation complexity of the control scheme.
The developed control scheme consists of two parts. The first part is a multi-scale online estimator (MEKF) of states and parameters with fast and slow time scales, which uses only the displacement, acceleration and force data collected in real time and serves objectives (1), (2) and (4). This estimator is an optimization algorithm based on extended Kalman filtering. It requires only low accuracy in the initial parameters and has a mechanism for dynamically adjusting weights, which simplifies the system design process. The second part is a linear time-varying model predictive controller (LTV-MPC) with fast calculation capability and approximately high precision, which eliminates the system output error while satisfying the input and output constraints and serves objectives (3) and (4). To show the merits of the proposed control scheme (MEKF-MPC), four controllers are selected for comparison: a PD controller, an ADRC controller, an MPC controller based on a prior model, and an MPC with an ESO estimator. Their control performance is analyzed and compared in four respects: disturbance estimation, tracking accuracy, tracking delay and anti-disturbance capability. In addition, the estimation performance of the ESO and the MEKF is compared. The comparison tests under highly dynamic working conditions show that only the proposed MEKF-MPC scheme simultaneously suppresses the parameter variation, the high-frequency large-load interference and the noise, while also simplifying the system design and achieving high-precision trajectory tracking control that satisfies the input and output constraints.
The remainder of this paper is organized as follows. Section II describes the single-leg inverse kinematics and the actuator dynamic model and formulates the actuator control problem. Section III presents the overall framework and calculation process of the control scheme. Section IV introduces the multi-scale online estimator for states and parameters, and Section V designs the linear time-varying MPC controller for the actuator. Experiments for validation are reported in Section VI. Finally, Section VII summarizes the main conclusions and future work.

II. PROBLEM FORMULATION

A. ONE-LEG INVERSE KINEMATICS
With three joints driven by electro-hydraulic servo actuators, the kinematics of a single leg of the legged robot is shown in Fig. 1. A mathematical description of the kinematic chain is obtained with the D-H method. The coordinate system 00 is fixed at the geometric center of the body: the positive direction of the x_00 axis coincides with the robot's forward direction, the positive direction of the z_00 axis is opposite to the direction of gravity, and the y_00 axis conforms to the right-hand rule. The coordinate systems 0, 1, 2, 3 and f correspond to the side swing, hip and knee joints, calf end and foot end, respectively. The position transformation from the body coordinate system to the foot end coordinate system is given by Equation (1), where l_i (i = 0, 1, 2, 3) are the lengths of the links and θ_i (i = 1, 2, 3) are the joint angles. W_0 and H_0 are the relevant dimensions of the body, and R is the radius of the hemispherical foot end.
From (1), the commanded joint angle signals are computed as follows.
Set x_a = [x_a1, x_a2, x_a3]^T as the actuator displacement vector in the driver space. The conversion relationship between joint space θ and driver space x_a is as follows: where are the position dimensions of joint space. q_ij (i, j = 1, 2, 3) are the angles of joint space.

B. ELECTRO-HYDRAULIC ACTUATOR DYNAMICS
The actuator consists of the servo valve and the actuating cylinder. In the legged robot application, the servo valve works in the 3.3 Hz frequency band, far below its natural frequency of 120 Hz, so the servo valve is modeled as a proportional link and its higher-order dynamics are ignored. The actuator dynamics are then composed of the proportional model of the servo valve and the fourth-order model of the valve-controlled cylinder system. The state variables are set as x = [x_p, ẋ_p, P_1, P_2]^T, and the open-loop state equation (4) is obtained [10], [23], where x_p is the actuator output displacement, ẋ_p is the velocity, P_1 and P_2 are the two cavity pressures, K_d is the equivalent flow coefficient, u is the control signal for the valve, P_s and P_0 are the supply and return pressures, respectively, m is the mass of the actuator rod, B_p is the viscosity coefficient, A_p1 and A_p2 are the piston and rod areas, respectively, V_01 and V_02 are the pipeline volumes of the piston and rod cavities, respectively, L is the total actuator stroke, L_0 is the initial piston position, β_e is the effective bulk modulus, and c_ip is the internal leakage coefficient. Under normal operating conditions, the actuator is free from external leakage. F_L is the external load force on the piston. It includes the inertial force, Coriolis force, gravity, friction force of the rigid joint and interference force, all of which are highly dynamic as the cylinder displacement and joint angle change. In this paper, these forces are combined into a single external load force, which is likewise highly dynamic.
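Since the open-loop state equation is not reproduced in the text above, the following minimal sketch shows one commonly used form of a fourth-order valve-controlled asymmetric cylinder model written with the symbols just defined. The square-root orifice flow law, the sign conventions, the chamber volume expressions and most numeric values are assumptions made for illustration (only K_d, B_p and β_e are taken from the values quoted later in the experiments), so the sketch should not be read as the exact Equation (4) of this paper.

```python
import math
from dataclasses import dataclass


@dataclass
class ActuatorParams:
    # Values marked "placeholder" are illustrative assumptions, not from the paper.
    m: float = 2.0         # rod mass [kg] (placeholder)
    A_p1: float = 3.0e-4   # piston-side area [m^2] (placeholder)
    A_p2: float = 1.5e-4   # rod-side area [m^2] (placeholder)
    V_01: float = 1.0e-5   # piston-cavity pipeline volume [m^3] (placeholder)
    V_02: float = 1.0e-5   # rod-cavity pipeline volume [m^3] (placeholder)
    L: float = 0.10        # total stroke [m] (placeholder)
    L_0: float = 0.05      # initial piston position [m] (placeholder)
    P_s: float = 21.0e6    # supply pressure [Pa] (placeholder)
    P_0: float = 0.5e6     # return pressure [Pa] (placeholder)
    c_ip: float = 0.0      # internal leakage coefficient (placeholder)
    K_d: float = 5.6e-8    # equivalent flow coefficient [m^3/(s.V)]
    B_p: float = 1000.0    # viscosity coefficient [N.s/m]
    beta_e: float = 1.7e9  # effective bulk modulus [Pa] (1700 MPa)


def actuator_dynamics(x, u, F_L, p):
    """Time derivative of x = [x_p, v_p, P_1, P_2] for valve input u and load F_L."""
    x_p, v_p, P_1, P_2 = x
    # Chamber volumes change with the piston position (assumed geometry).
    V_1 = p.V_01 + p.A_p1 * (p.L_0 + x_p)
    V_2 = p.V_02 + p.A_p2 * (p.L - p.L_0 - x_p)
    # Assumed square-root orifice law; the active pressure drop depends on sign(u).
    if u >= 0.0:
        q_1 = p.K_d * u * math.sqrt(max(p.P_s - P_1, 0.0))
        q_2 = p.K_d * u * math.sqrt(max(P_2 - p.P_0, 0.0))
    else:
        q_1 = p.K_d * u * math.sqrt(max(P_1 - p.P_0, 0.0))
        q_2 = p.K_d * u * math.sqrt(max(p.P_s - P_2, 0.0))
    leak = p.c_ip * (P_1 - P_2)
    dv = (p.A_p1 * P_1 - p.A_p2 * P_2 - p.B_p * v_p - F_L) / p.m
    dP_1 = p.beta_e / V_1 * (q_1 - p.A_p1 * v_p - leak)
    dP_2 = p.beta_e / V_2 * (-q_2 + p.A_p2 * v_p + leak)
    return [v_p, dv, dP_1, dP_2]
```

In this form β_e and F_L enter the pressure and force equations directly, while K_d and B_p scale the valve flow and the damping, which is consistent with treating them as separate parameter groups in the estimator.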
Based on an analysis of the first-order trajectory sensitivity and the parameter change characteristics, this paper selects the fast-varying parameter set θ_fast = [β_e, F_L]^T and the slow-varying parameter set θ_slow = [K_d, B_p]^T [24]. Further, the fast-varying parameter set θ_fast = [β_e, F_L]^T and the states [x_p, ẋ_p, P_1, P_2]^T are estimated on the same fast scale, and the slow-varying parameter set θ_slow = [K_d, B_p]^T is estimated on the slow scale.
Remark 1: For the dynamic model of the actuator, the controller performance depends on the valid and accurate sensor measurements. This means that the presence of measurement noise will affect the performance of control system. Therefore, the control scheme should be able to overcome and suppress the measurement noise.
Remark 2: The model-based controller requires that all model states are measurable and fed back. If some states cannot be measured directly, the controller is difficult to implement, so the control scheme must be able to cope with unmeasurable states.
Remark 3: In the control process, the controller needs an accurate system model, but in fact, the model parameters are dynamically varying, so the control scheme must be able to estimate and apply the model parameters in real time.
Remark 4: Since the system control input and output have maximum and minimum limit constraints, the control system needs to be able to satisfy the constraint requirements to avoid problems such as inaccurate tracking due to input exceeding the limit and hardware damage due to output exceeding physical limits.
Remark 5: The control scheme should have fast calculation performance, which can guarantee a real-time solution.
Overall, the control goal of this paper is to design an adaptive robust optimal control scheme with the following performances.
(1) The highly dynamic actuator can track the desired trajectory with strong estimation performance against time-varying system parameters, large motion-dependent external load disturbance and measurement noise. (2) The control scheme can solve the problem that some states cannot be measured. (3) The control action u satisfies the input and output constraints. (4) The control scheme has fast real-time calculation performance.

III. ADAPTIVE ROBUST OPTIMAL CONTROL SCHEME FOR ELECTRO-HYDRAULIC ACTUATOR
Firstly, to handle the dynamic parameters (the time-varying system parameters and the large external load interference), the measurement noise and the unmeasured states, a multi-scale online estimator (MEKF) for states and parameters is designed to obtain the model parameters and states of the nonlinear system. Secondly, combining the state trajectory-based linearization method with the estimated states and parameters, a linear time-varying model predictive controller (LTV-MPC) is used to eliminate the tracking error with respect to the desired trajectory through optimal control. Its advantage is that, while the system input and output are constrained, the receding horizon optimization compensates for interference and deviations of unknown distribution. The MEKF and LTV-MPC are combined into the MEKF-MPC control scheme, and the detailed algorithms are explained in Sections IV and V. The scheme framework in Fig. 2 shows how the proposed control scheme is used. First, the planner and the inverse kinematics generate the desired trajectory of the actuator during the leg swing period. Second, the MEKF estimator estimates the system states and parameters. Third, the LTV-MPC eliminates the tracking error with respect to the desired trajectory.

IV. THE MULTI-SCALE ONLINE ESTIMATOR FOR STATES AND PARAMETERS
With a limited sensor configuration, the electro-hydraulic actuator poses three estimation problems: estimating the time-varying parameters, filtering the measurable states and estimating the unmeasurable states. Exploiting the fast and slow variation of the states and parameters, this paper proposes a multi-scale online estimator with a fast-varying time scale (composed of a fusion KF and a fast-varying time scale EKF) and a slow-varying time scale (composed of a slow-varying time scale EKF) to estimate the actuator states and parameters online in real time.

A. SYSTEM DESCRIPTION
Firstly, the two measured values are the acceleration and displacement signals of the axial movement of the piston rod (the two sensors act in the same direction). In this paper, the piston rod is treated as a particle, so its axial displacement, velocity and acceleration refer to the same centroid point, and its axial linear motion is described by the particle motion equation. Given the noisy real-time displacement and acceleration signals, the particle motion equation is used to estimate the particle velocity as the axial velocity of the piston rod and to filter the displacement measurement signal. The measured acceleration is taken as the input signal u_{1,k,l}, the measured displacement x_p is taken as the output signal, and the state variables are set as χ_1 = [x_p, ẋ_p]^T. The discrete-time state equation is then as follows.
In detailed form, χ_{1,k,l} is the state vector at time t_{k,l} = t_{k,0} + l × Δt (1 ≤ l ≤ L_z). The indices k and l denote the slow-varying and fast-varying time scales, respectively; L_z is the scale conversion limit, that is, one slow-varying time step equals L_z fast-varying time steps; and Δt is the calculation time interval. u_{1,k,l} is the measured acceleration at time t_{k,l}. ω_{1,k,l} and υ_{1,k,l} are the process and measurement noise, respectively, with covariance matrices Q^χ1 and R^χ1. F_1(χ_{1,k,l}, u_{1,k,l}) is the transition matrix.
Secondly, the state vector of the actuator dynamic equation (Equation (4)) is extended by two dimensions with the fast-varying parameter set θ_fast = [β_e, F_L]^T, which gives χ_2 = [x_p, ẋ_p, P_1, P_2, β_e, F_L]^T. To distinguish them from states and parameters in the usual sense, χ_2 is referred to as the generalized state vector and θ = [K_d, B_p]^T is referred to as the slow-varying parameter set. A multi-scale nonlinear discrete state-space model that includes the generalized states and the slow-varying parameters is then obtained as follows.
It is important to note that the displacement and velocity in the measurement matrix are state estimates based on the state equation (5). The reason for this choice is detailed in Section IV-B. In detailed form, χ_{2,k,l} is the generalized state vector at time t_{k,l}. ω_{2,k,l} and ρ_k are the process noise for the generalized states and the slow-varying parameters, respectively, with covariance matrices Q^χ2 and Q^θ. υ_{2,k,l} is the measurement noise, with covariance matrix R^χ2. Based on this definition of the electro-hydraulic actuator system, the goal is to estimate the generalized states χ_2 and the slow-varying parameters θ from the measurement data y_1 and y_2, which contain the noisy acceleration, displacement and driving force. The generalized states are the filtered displacement x_p, the velocity ẋ_p, the two cavity pressures P_1 and P_2, the effective bulk modulus β_e and the external load force F_L. The slow-varying parameters are the equivalent flow coefficient K_d and the viscosity coefficient B_p. The generalized states are on the fast-varying scale, and the slow-varying parameters are on the slow-varying scale.
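To make the role of the fusion KF concrete, the sketch below implements one predict/update cycle of a kinematic Kalman filter for χ_1 = [x_p, ẋ_p]^T in which the measured acceleration is the input and the measured displacement is the output. The constant-acceleration transition matrices, the function name `fusion_kf_step` and the tuning values in the usage note are assumptions chosen for illustration; the paper's own matrices for Equation (5) are not reproduced in the text.

```python
import numpy as np


def fusion_kf_step(chi, P, accel, disp, dt, Q, R):
    """One fusion-KF cycle: acceleration drives the model, displacement corrects it.

    chi  : prior state estimate [x_p, v_p]
    P    : 2x2 state covariance
    accel: measured acceleration over the step (input)
    disp : measured displacement at the end of the step (output)
    Q, R : process covariance (2x2) and measurement variance (scalar)
    """
    # Particle (constant-acceleration) kinematics over one sample: assumed form.
    F = np.array([[1.0, dt], [0.0, 1.0]])
    G = np.array([0.5 * dt**2, dt])
    H = np.array([1.0, 0.0])  # only displacement is measured

    # Prediction with the measured acceleration as input.
    chi_prior = F @ chi + G * accel
    P_prior = F @ P @ F.T + Q

    # Correction with the displacement innovation.
    innovation = disp - H @ chi_prior
    S = H @ P_prior @ H + R
    K = P_prior @ H / S
    chi_post = chi_prior + K * innovation
    P_post = (np.eye(2) - np.outer(K, H)) @ P_prior
    return chi_post, P_post
```

Called once per fast-scale sample, the filter returns a filtered displacement and a velocity estimate without differentiating the noisy displacement signal, which is the reason given above for feeding these estimates, rather than raw measurements, to the fast-varying time scale EKF.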

B. A MULTI-SCALE ONLINE ESTIMATOR
The multi-scale online estimator combines a fast-varying time scale with a slow-varying time scale, using sensor data collected in real time. The fast-varying time scale consists of a fusion KF and a fast-varying time scale EKF, which estimate the generalized states; the slow-varying time scale consists of a slow-varying time scale EKF, which estimates the slow-varying parameters. The values of the slow-varying parameters remain unchanged over l = 0 ∼ (L_z − 1), that is, θ_k = θ_{k,0:L_z−1}. The two estimators estimate the generalized states and the slow-varying parameters stepwise and alternately, each taking the other's output as input. Moreover, both estimators use innovation from the same source. This coupled structure guarantees a stable closed-loop estimation of the final generalized states and slow-varying parameters. Because the state innovation is used, the algorithm adapts the state estimation by adjusting the model parameters while preserving the quality of the state estimate. The advantage of the proposed algorithm is that it fixes the two slow-varying parameters on the fast-varying time scale, which reduces the dimension of the generalized state that must be estimated simultaneously and improves the estimation convergence. The generalized state dimension of the proposed algorithm is six, whereas that of the EKF and DEKF algorithms is eight.
The overall framework of the algorithm is as follows. (1) On the fast-varying time scale, since no sensor measures the state ẋ_p in the actuator's state-space model (Equation (6)), the fusion KF uses the measured acceleration and displacement data to estimate the state vector χ_1 = [x_p, ẋ_p]^T based on Equation (5). The fast-varying time scale EKF then uses the measured information (also called the innovation, which comprises the measured driving force data and the estimated state vector χ_1 = [x_p, ẋ_p]^T) together with the slow-varying parameters θ from the slow-varying time scale to estimate the generalized states χ_2 = [x_p, ẋ_p, P_1, P_2, β_e, F_L]^T based on Equation (6). (2) On the slow-varying time scale, the slow-varying time scale EKF uses the same measured information and the generalized states χ_2 from the fast-varying time scale to estimate the slow-varying parameters θ = [K_d, B_p]^T.
The specific calculation steps of the proposed algorithm are summarized below and in the flowchart in Fig. 3.
(1) Step 1: Initialization. Set the initial parameters of the fusion KF, the fast-varying time scale EKF and the slow-varying time scale EKF, respectively.
where χ_{1,0,0}, P^χ1_{0,0}, Q^χ1 and R^χ1 are, respectively, the initial state vector, the initial value of the state estimation error covariance matrix, the process noise covariance and the measurement noise covariance of the fusion KF. χ_{2,0,0}, P^χ2_{0,0} and Q^χ2 are, respectively, the initial generalized state vector, the initial value of the state estimation error covariance matrix and the process noise covariance of the fast-varying time scale EKF. θ_0, P^θ_0 and Q^θ are, respectively, the initial slow-varying parameter set, the initial value of the parameter estimation error covariance matrix and the process noise covariance of the slow-varying time scale EKF. R^χ2 and R^θ are the measurement noise covariances. Since the same innovation is used, R^χ2 = R^θ is satisfied. dχ_{2,0,0}/dθ_{−1} ∈ R^{6×2} is set as a matrix with zero elements for the calculation of the parameter measurement matrix C^θ_k. When the estimation starts, the value at time (0) is converted to the value at time (k − 1), and the value at time (0, 0) is converted to the value at time (k − 1, l − 1).

(4) Step 4: The posterior estimation of the fusion KF and the fast-varying time scale EKF.
Pass χ_{1,k−1,l} to y_{2,k−1,l} in this step. This step also supplies the quantities needed for the calculation of the parameter measurement matrix C^θ_k.
(5) Step 5: Cycle calculation of the fast-varying time scale for l = 1 : L_z. When the cumulative count equals L_z, scale conversion is performed to activate the calculation of the slow-varying time scale, and the switch between the two scales is made at this point.
(6) Step 6: The posterior estimation of the slow-varying time scale EKF.
The multi-scale online estimation of the generalized states and slow-varying parameters at time k is now complete, and the algorithm is ready to enter the cycle at time k + 1 (end).
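The alternation between the two time scales can be sketched as a nested loop in which the slow-varying parameters are held fixed for L_z fast steps and are then updated once. The function `run_multiscale` and its callback signatures are hypothetical names introduced here for illustration; the actual KF and EKF update equations of Steps 1 to 6 are not reproduced, and the callbacks merely stand in for them.

```python
def run_multiscale(fast_step, slow_step, chi0, theta0, n_slow, L_z):
    """Skeleton of a two-time-scale estimation loop.

    fast_step(chi, theta, k, l) -> chi   : stands in for the fusion KF plus the
                                           fast-varying time scale EKF
    slow_step(theta, chi, k)    -> theta : stands in for the slow-varying time
                                           scale EKF
    """
    chi, theta = chi0, theta0
    history = []
    for k in range(n_slow):              # slow-varying time scale index k
        for l in range(1, L_z + 1):      # fast-varying time scale index l
            # theta is held constant during the whole fast cycle
            chi = fast_step(chi, theta, k, l)
        # scale conversion: the count reached L_z, so update the parameters once
        theta = slow_step(theta, chi, k)
        history.append((k, chi, theta))
    return chi, theta, history
```

With L_z = 1000 and a 10000 Hz estimator loop, as used later in the experiments, such a loop would refresh the slow-varying parameters ten times per second while the generalized states are refreshed at every sample.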

V. LTV-MPC CONTROLLER DESIGN
The MEKF estimator provides all the states and parameters, where the parameters include the time-varying system parameters and the external load interference. Therefore, the LTV-MPC controller can be designed directly from this information and the dynamic model.
First, the actuator dynamics is simplified into the following nonlinear dynamic system.
where f(·, ·) is the state transition function, x(t) is the N_x-dimensional state vector, u(t) is the N_u-dimensional control vector, and y(t) is the N_y-dimensional output vector. Since the dynamic model contains a large number of real-time states, according to [25], linearizing the system near the current working point in real time transforms it into a linear time-varying (LTV) system, which significantly reduces the amount of calculation compared with directly solving the nonlinear problem and greatly improves the control accuracy compared with a linear time-invariant (LTI) system. Based on the state trajectory linearization method and the zero-order hold discretization method, the nonlinear system (15) is transformed into the following linear time-varying discrete system.
where d_k = x̂(k + 1) − A_{k,t} x(k) − B_{k,t} u(k), and x̂(k + 1) is the one-step predicted state vector calculated from the current state x(k) and input u(k) according to the nonlinear dynamic model (15). The linear time-varying discrete model, which is suitable for convex model predictive control, is thus obtained by linearizing the nonlinear system at any state point. Replacing the control input in Equation (16), the control quantity u(t), with the control increment Δu(t) and applying the corresponding transformation gives the extended state-space expression (17).
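The role of the affine term d_k can be illustrated with a short sketch that linearizes a one-step nonlinear map at the current operating point and checks that the affine model reproduces the nonlinear prediction there. As a simplification, the sketch differentiates an already discretized map by central finite differences instead of applying zero-order hold to a continuous-time linearization as described above; the function name `linearize_discrete` and the pendulum-like test map are assumptions for illustration.

```python
import numpy as np


def linearize_discrete(f_d, x, u, eps=1e-6):
    """Return A, B, d with x(k+1) ~ A x(k) + B u(k) + d near the point (x, u).

    f_d(x, u) is a one-step nonlinear prediction map (assumed to be given).
    """
    x = np.asarray(x, dtype=float)
    u = np.asarray(u, dtype=float)
    nx, nu = x.size, u.size
    A = np.zeros((nx, nx))
    B = np.zeros((nx, nu))
    for i in range(nx):                      # Jacobian with respect to the state
        dx = np.zeros(nx)
        dx[i] = eps
        A[:, i] = (f_d(x + dx, u) - f_d(x - dx, u)) / (2.0 * eps)
    for j in range(nu):                      # Jacobian with respect to the input
        du = np.zeros(nu)
        du[j] = eps
        B[:, j] = (f_d(x, u + du) - f_d(x, u - du)) / (2.0 * eps)
    x_next = f_d(x, u)                       # one-step nonlinear prediction
    d = x_next - A @ x - B @ u               # affine term d_k
    return A, B, d


def example_map(x, u, dt=0.1):
    """Hypothetical forward-Euler pendulum-like map used only as a test system."""
    return np.array([x[0] + dt * x[1], x[1] + dt * (-np.sin(x[0]) + u[0])])
```

By construction, A x(k) + B u(k) + d equals the nonlinear one-step prediction at the linearization point, so the LTV model is exact there and only approximate away from it.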
If the state quantity x(k|t) and the control increment Δu(k|t) of the system at time k are known, the system output y(k + 1|t) at time k + 1 can be predicted by Equation (17) and, by iteration, so can the system output y(k + N_p|t) at time k + N_p. Let the prediction horizon be N_p and the control horizon be N_c, and assume that u(k + i|t) = u(k + N_c − 1|t) for i = N_c, ..., N_p − 1. The system output over the prediction horizon can then be calculated by the following equation.
In this equation, Y(t) and ΔU(t) stack the predicted outputs and the control increments over the horizons, and the accompanying coefficient matrices are assembled from the extended model (17). To track the desired trajectory, an objective function is designed to minimize the weighted trajectory deviation, the weighted control increment and the weighted control amount, subject to the dynamic constraint and the input and output constraints. Here y_ref(k + i|t), i = 1, ..., N_p, is the desired displacement trajectory, and Q, R, S, ρ are weight matrices. The first term in the objective function penalizes the deviation between the predicted output and the expected output over the prediction horizon, which reflects the controller's ability to track the desired trajectory quickly. The second term penalizes the control increment over the control horizon, which reflects the requirement that the control amount change smoothly. The third term penalizes the control amount over the control horizon, which reflects the requirement to limit the magnitude of the control value. The fourth term contains the relaxation factor ε: when no optimal solution exists within the control period, a suboptimal solution is used in place of the optimal one, which prevents the problem from becoming infeasible.
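The stacked output prediction used in the objective function can be sketched as follows, assuming that the linearized matrices A, B and the affine term d are frozen over the prediction horizon and that the extended state is ξ = [x; u(k − 1)]. The names `Psi`, `Theta`, `Gamma` and the function `build_prediction` are placeholders chosen for this sketch, since the paper's own symbols for the prediction matrices are not legible in the text.

```python
import numpy as np


def build_prediction(A, B, C, d, Np, Nc):
    """Return Psi, Theta, Gamma such that Y = Psi @ xi + Theta @ dU + Gamma.

    Y stacks y(k+1..k+Np), dU stacks the Nc control increments, and the
    increments beyond the control horizon are zero (u is held constant).
    """
    nx, nu = B.shape
    ny = C.shape[0]
    # Extended model: xi(k+1) = A_e xi(k) + B_e du(k) + d_e, y(k) = C_e xi(k)
    A_e = np.block([[A, B], [np.zeros((nu, nx)), np.eye(nu)]])
    B_e = np.vstack([B, np.eye(nu)])
    C_e = np.hstack([C, np.zeros((ny, nu))])
    d_e = np.concatenate([d, np.zeros(nu)])

    powers = [np.eye(nx + nu)]               # powers[i] = A_e ** i
    for _ in range(Np):
        powers.append(A_e @ powers[-1])

    Psi = np.vstack([C_e @ powers[i] for i in range(1, Np + 1)])
    Theta = np.zeros((Np * ny, Nc * nu))
    Gamma = np.zeros(Np * ny)
    for i in range(Np):                      # block row i predicts y(k+i+1)
        for j in range(min(i + 1, Nc)):
            Theta[i * ny:(i + 1) * ny, j * nu:(j + 1) * nu] = C_e @ powers[i - j] @ B_e
        Gamma[i * ny:(i + 1) * ny] = sum(C_e @ powers[i - j] @ d_e for j in range(i + 1))
    return Psi, Theta, Gamma
```

Because Y is affine in ΔU, substituting it into a quadratic cost yields a quadratic program in ΔU, which is what makes the LTV formulation fast to solve compared with a nonlinear program.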
Because of the constraints, the optimal solution no longer has an analytical form and must be computed numerically with mathematical programming methods. The objective function is therefore converted into a standard quadratic form, Equation (20). Given the system state x(k) at time k and the control amount u(k − 1) at the previous time, the optimal control increment sequence ΔU*(t) is obtained through optimization in the control cycle. The first element of this sequence is then applied to the system as the actual control increment, i.e.

u(t) = u(t − 1) + Δu*(t|t). (21)
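The receding-horizon step can be sketched as follows for a simplified cost that keeps only the tracking term and the control-increment term. The sketch solves the unconstrained quadratic problem in closed form and saturates the first move, which is a deliberate simplification: the constrained problem with the control-amount term and the relaxation factor would be passed to a quadratic programming solver instead. The function name `mpc_first_move` and the prediction form Y = Psi ξ + Theta ΔU + Gamma are assumptions made for illustration.

```python
import numpy as np


def mpc_first_move(Psi, Theta, Gamma, xi, Y_ref, Q, R, u_prev, u_min, u_max):
    """Minimize ||Y - Y_ref||_Q^2 + ||dU||_R^2 with Y = Psi xi + Theta dU + Gamma.

    Returns the applied control u(t) = u(t-1) + du*(t), clipped to the input
    bounds, together with the whole optimal increment sequence dU*.
    """
    free = Psi @ xi + Gamma                  # predicted output when dU = 0
    H = Theta.T @ Q @ Theta + R              # Hessian of the quadratic form
    g = Theta.T @ Q @ (free - Y_ref)         # gradient term
    dU = np.linalg.solve(H, -g)              # unconstrained minimizer
    nu = u_prev.size
    # Only the first increment is applied; the rest is discarded and the
    # optimization is repeated at the next control cycle.
    u = np.clip(u_prev + dU[:nu], u_min, u_max)
    return u, dU
```

Clipping only the first move does not reproduce the behavior of the constrained optimum, because the remaining increments are not re-optimized around the saturation; the sketch is meant only to show how the quadratic form and the first-element rule fit together.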
Remark 6: The stability proof for the LTV-MPC control (20)-(21) is lengthy and is not repeated here; the detailed proof is presented in [26, Section 4].

A. EXPERIMENTAL PLATFORM
To verify the performance of the proposed control scheme, a single-leg motion control experimental platform has been built, as shown in Fig. 4. The platform consists of the single-leg rigid body, the actuators and the control system; the electro-hydraulic actuators that drive the joint mechanism integrate acceleration, displacement and force sensors. The control system consists of a computer, a PC104 small board, an ARM controller, an amplifier and a 16-bit A/D converter. The QNX operating system runs on the controller, where the control and estimation algorithms are implemented.
Since the control and estimation algorithm resembles a double-loop structure, to ensure stability the MEKF estimator runs in the fast loop at a frequency of 10000 Hz and the LTV-MPC controller runs in the slow loop at a frequency of 1000 Hz. The estimator's parameter settings and estimation results are presented in detail in [24] and are not repeated here.
The implementation of control scheme is the same as that described in Section III. Because the external load interference of hip actuator is the most severe due to the high dynamic movement of large and small links, this experiment mainly discusses the displacement tracking performance of hip actuator.

B. EXPERIMENTAL RESULTS
In order to verify the performance of the proposed control scheme, two groups of experiments have been designed to compare estimator performance and trajectory tracking performance. In the trajectory tracking experiment, two highly dynamic desired trajectories with frequencies of 3.3 Hz and 6 Hz are tested.

Experiment 1 (The Estimator Performance):
This group of experiments compares the performance of the conventional extended state observer (ESO) and the MEKF estimator. The ESO uses the nominal nonlinear model of the actuator system and only the control input and the displacement output feedback to estimate the four states and the disturbance outside the nominal model (five states in total). The MEKF uses the control input together with the displacement, acceleration and driving force measurements to simultaneously estimate the four states and four parameters of the system. Because the two estimators estimate different sets of quantities, this experiment compares their state estimates to evaluate estimation ability. The criterion is how well the displacement and driving force estimates track the centerline of the measured values. The estimated driving force of the ESO is calculated by $F = A_{p1} p_1 - A_{p2} p_2$. The evaluation indices are the mean square error (MSE), maximum error (ME) and average error (AE). Both estimators are tested under the 3.3 Hz highly dynamic desired trajectory. To keep the closed loop stable and make the comparison fair, the LTV-MPC controller is used as the controller in both cases. The ESO gains are given by $\beta = [5\omega_0, 10\omega_0^2, 10\omega_0^3, 5\omega_0^4, \omega_0^5/10]^T$ with $\omega_0 = 6$. The model parameter settings are shown in Table 1, with $K_d = 5.6\times10^{-8}$ m$^3$/(s·V), $B_p = 1000$ N·s/m and $\beta_e = 1700$ MPa. In addition, the external load force interference and internal dynamic changes are all estimated together as an external disturbance.
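As a quick numerical check of the ESO gain setting, the sketch below evaluates the gain vector for the stated bandwidth. It assumes that the gain pattern reads as $[5\omega_0, 10\omega_0^2, 10\omega_0^3, 5\omega_0^4, \omega_0^5/10]^T$ with $\omega_0 = 6$; the function name `eso_gains` is hypothetical.

```python
def eso_gains(omega0):
    """Gain vector [5w, 10w^2, 10w^3, 5w^4, w^5/10] for bandwidth w.

    The pattern is read from the text; applying the subscript-0 bandwidth
    to every entry is an assumption of this sketch.
    """
    w = float(omega0)
    return [5 * w, 10 * w**2, 10 * w**3, 5 * w**4, w**5 / 10]

beta = eso_gains(6)   # omega_0 = 6 as stated in the text
# beta -> [30.0, 360.0, 2160.0, 6480.0, 777.6]
```

The first four entries follow the binomial pattern commonly used for bandwidth-parameterized observers, while the last entry is scaled by 1/10 as given in the text.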
Since the conventional ESO cannot filter noisy signals, the displacement measurement is low-pass filtered before being fed into the ESO. The initial parameter settings of the MEKF are as follows: (6,2), $L_z = 1000$. Owing to its adaptive gain capability, the MEKF estimator does not require accurate initial parameter settings. The estimation comparison curves are obtained for the states $x = [x_p, \dot{x}_p, P_1, P_2]^T$ (Figs. 5 and 7) and the driving force (Fig. 6). The quantitative performance indices of displacement and driving force for the two estimators are listed in Table 2.
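The three evaluation indices can be computed from an estimated signal and a reference signal as sketched below. The text names the indices but does not give their formulas, so this sketch assumes that ME is the maximum absolute error and AE is the mean absolute error; the function name `error_indices` and the sample values are hypothetical.

```python
import numpy as np

def error_indices(estimate, reference):
    """Return MSE, ME and AE of the error between two equal-length signals.

    Assumed definitions (not given in the text):
      MSE = mean of squared error
      ME  = maximum absolute error
      AE  = mean absolute error
    """
    e = np.asarray(estimate, dtype=float) - np.asarray(reference, dtype=float)
    return {
        "MSE": float(np.mean(e**2)),
        "ME": float(np.max(np.abs(e))),
        "AE": float(np.mean(np.abs(e))),
    }

# Tiny made-up example: errors are [0, 1, -2]
idx = error_indices([1.0, 2.0, 2.0], [1.0, 1.0, 4.0])
```

For the estimator comparison, the reference would be the centerline of the measured values, which is the criterion stated in the text.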
Performance evaluation: According to Fig. 5, Fig. 7(a) and the index values, the displacement and velocity estimates of the MEKF and the ESO are almost the same, and the displacement estimates smoothly track the centerline of the measured values. Fig. 6 shows clearly that the driving force estimated by the MEKF smoothly tracks the centerline of the measured value, whereas that of the ESO fails to track the centerline from 6.5 s onward, mainly because of fluctuations and inaccuracies in the estimates of states $P_1$ and $P_2$ (Fig. 7(b), (c)). In addition, the quantitative indices of the MEKF for the driving force are smaller than those of the ESO: the MSE is only 23.8% of that of the ESO, the ME is 66.4% and the AE is 20.7%. Therefore, the MEKF estimates the actual state of the actuator accurately. Since the states and parameters in the MEKF are estimated in a coupled closed loop, this also indirectly indicates that the parameters (system parameters and external interference) are accurately estimated.
The advantage of the ESO is that it uses only the displacement measurement to achieve a relatively close estimate of the state and external interference, which is of great significance for applications where the weight and volume of the actuator are tightly limited. The MEKF uses the measurement signals of the displacement, acceleration and force sensors to estimate the states and parameters more accurately than the ESO, and it has better noise filtering and adaptive gain capabilities, which makes the estimator easier to apply. Most importantly, it provides more timely and accurate system states and a more accurate model for LTV-MPC, laying a solid foundation for high-precision control.

Experiment 2 (Trajectory Tracking Experiment):
The tracking performance of five controllers is compared to verify the effectiveness of the proposed control scheme. The five controllers are as follows.
a) PD controller: The closed-loop feedback uses the actuator displacement and velocity estimates obtained by the MEKF estimator in this paper. The PD gains are $k_p = 9000$ and $k_d = 100$.
b) ADRC controller: The control structure of the active disturbance rejection controller (ADRC) is divided into three parts: tracking differentiator, extended state observer and control law [27], [28]. The ESO uses a fourth-order nominal model (Equation (4)) and expands the external disturbance as the fifth state to obtain a fifth-order form. Its parameter settings are the same as in Experiment 1.
The remaining three controllers are c) the MPC controller based on the nominal model, d) the ESO-MPC controller and e) the proposed MEKF-MPC controller.
In order to comprehensively evaluate the adaptive robust tracking performance of the proposed control scheme, the five controllers are tested under two highly dynamic desired trajectories of 3.3 Hz and 6 Hz. The evaluation indices are the mean square error (MSE), maximum error (ME), average error (AE) and delay time (DT).
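The delay time index is not defined in this section. One possible way to estimate it from recorded data is to find the shift that best aligns the actual displacement with the desired trajectory by cross-correlation. The sketch below illustrates that idea only; it is an assumption and not necessarily the method used for the reported results. The function name `delay_time` and the synthetic signals are hypothetical.

```python
import numpy as np

def delay_time(desired, actual, dt):
    """Estimate how far `actual` lags `desired` using cross-correlation.

    Returns (lag_in_samples, lag_in_seconds). A positive lag means the
    actual signal is delayed relative to the desired one.
    """
    r = np.asarray(desired, dtype=float)
    y = np.asarray(actual, dtype=float)
    r = r - r.mean()
    y = y - y.mean()
    corr = np.correlate(y, r, mode="full")       # corr[k] ~ sum_n y[n+k] r[n]
    lags = np.arange(-(len(r) - 1), len(y))      # lag value for each entry
    lag = int(lags[np.argmax(corr)])
    return lag, lag * dt

# Synthetic check: a random signal delayed by 3 samples at 1 kHz sampling
rng = np.random.default_rng(0)
desired = rng.standard_normal(2000)
actual = np.concatenate([np.zeros(3), desired[:-3]])
lag_samples, lag_seconds = delay_time(desired, actual, dt=1e-3)
```

For periodic trajectories the correlation has several peaks one period apart, so in practice the search would be restricted to lags shorter than half a period.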
Remark: In order to evaluate the controllers (a, c, e) fairly, the initial parameter settings of the MEKF estimator are kept the same for all of them. Except for the parameter sets $\theta_{fast} = [\beta_e, F_L]^T$ and $\theta_{slow} = [K_d, B_p]^T$, the other parameters are set as shown in Table 1.
Case 1 (Tracking Effect Under 3.3 Hz Desired Trajectory): The actuator's desired displacement trajectory has a total cycle of 0.6 s, divided into two parts. The first part performs a high-frequency movement of 3.3 Hz in the first 0.3 s, and the second part performs a gentle low-frequency movement in the last 0.3 s. The tracking accuracy is therefore evaluated mainly in the high-frequency part, which is also where the peak error occurs.
With time-varying parameters and external load disturbance acting on the system, the disturbance estimate of the ESO is shown in Fig. 8. The parameters $\theta_{fast} = [\beta_e, F_L]^T$ and $\theta_{slow} = [K_d, B_p]^T$ estimated by the MEKF are shown in Fig. 9, the displacement tracking errors of the five controllers are shown in Fig. 10, and the tracking curves at the last peak error are shown in Fig. 11, which gives the tracking details in the 9.74-9.75 s interval. In the 9.6-9.9 s interval, the curve shows a high-frequency characteristic of 3.3 Hz; in the 9.7-9.8 s interval in particular it changes drastically, which leads to the peak error. The quantitative indices of the displacement tracking error throughout the control process are listed in Table 3.
Performance evaluation:
(a) Disturbance and parameter estimation: As can be seen from Figs. 8 and 9, the disturbance estimate of the ESO and the parameter estimates of the MEKF both converge stably within a certain range. Combined with the performance evaluation results of Experiment 1, the MEKF parameter estimates reflect the time-varying system parameters and the large external load disturbance well. In addition, it can be seen that the external disturbance changes drastically within a short time.
(b) Tracking accuracy: As can be seen from Fig. 10 and Table 3, the maximum error is 1.5 mm for the PD controller, 1 mm for ADRC, 1.2 mm for MPC, 0.8 mm for ESO-MPC and 0.7 mm for MEKF-MPC. Together with the MSE and AE data, this shows that the tracking accuracy of the proposed control scheme (MEKF-MPC) is superior to that of the other four controllers.
(c) Tracking delay: As can be seen from Fig. 11 and Table 3, the delay index of MEKF-MPC is 1.2 ms, the smallest among the five controllers, indicating that the proposed scheme has a high response speed and tracking ability.
(d) Anti-interference ability: This can be seen from the change of the peak error over the intervals in Fig. 10. Since the PD controller has weak disturbance compensation, its peak error changes noticeably during the movement once a large disturbance appears. Even without accurate estimates of the time-varying parameters and external load interference, the peak error fluctuation of MPC is still smaller than that of PD, indicating that the model predictive controller has a certain robustness. However, its peak error still shows a small fluctuation, indicating that MPC based on the nominal model has limited anti-disturbance capability and resists strong disturbances less well than the three control schemes with a disturbance estimator. These three schemes (ADRC, ESO-MPC and MEKF-MPC) show flat peak errors over the entire interval. MEKF-MPC has the smallest peak errors, most of them around 0.6 mm, while those of ESO-MPC are 0.7 mm and those of ADRC are 1 mm. This shows that the control scheme proposed in this paper adapts better to time-varying parameters and external load interference.
The evaluation results from the above four aspects show that the proposed MEKF-MPC controller has better adaptive robustness and tracking performance than the other four controllers.
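To put the Case 1 maximum errors quoted in item (b) into relative terms, the short calculation below computes how much smaller the MEKF-MPC maximum error is than that of each other controller. It uses only the five values stated in the text (reading the 0.8 mm entry as belonging to ESO-MPC); the helper name `reduction_percent` is hypothetical, and the percentages are derived here rather than reported in the paper.

```python
# Maximum tracking errors in mm, as quoted in item (b) of Case 1
max_error_mm = {"PD": 1.5, "ADRC": 1.0, "MPC": 1.2,
                "ESO-MPC": 0.8, "MEKF-MPC": 0.7}

def reduction_percent(baseline, proposed):
    """Relative reduction of `proposed` with respect to `baseline`, in %."""
    return 100.0 * (baseline - proposed) / baseline

proposed = max_error_mm["MEKF-MPC"]
reductions = {
    name: round(reduction_percent(err, proposed), 1)
    for name, err in max_error_mm.items()
    if name != "MEKF-MPC"
}
# reductions -> {"PD": 53.3, "ADRC": 30.0, "MPC": 41.7, "ESO-MPC": 12.5}
```

The gain over ESO-MPC is the smallest, which is consistent with the observation in item (d) that both estimator-based MPC schemes show flat peak errors.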
Case 2 (Tracking Effect Under 6 Hz Desired Trajectory): In order to further verify the adaptive robust performance of the proposed control scheme at a higher frequency, the operating-space trajectory of the swing leg is re-planned in Case 2, and the desired trajectory frequency of the actuator is increased to 6 Hz. The controller parameters are the same as those in Case 1. The displacement tracking errors are shown in Fig. 12, and the tracking curves at the penultimate peak error are shown in Fig. 13. The quantitative indices of the displacement tracking error throughout the control process are listed in Table 4. These results show that, even under higher dynamics and without any modification of the control parameters, the proposed MEKF-MPC control scheme keeps the maximum tracking error within 0.9 mm, and its quantitative tracking indices are significantly better than those of the other four controllers.
In summary, the good performance of MEKF-MPC is mainly due to three aspects. First, owing to its noise filtering and adaptive gain capabilities, the MEKF estimator achieves accurate real-time estimation of the time-varying parameters and the large external load interference, which simplifies the system design process and makes the model used in LTV-MPC more accurate. Second, the MEKF estimator filters the noise of the measurable states and estimates the unmeasurable states, which meets the need of LTV-MPC for full-state feedback. Third, with fast calculation, receding horizon optimization and prediction capabilities, LTV-MPC has high approximation accuracy and strong robustness, and additionally enforces the constraints on input and output. Therefore, constrained high-precision trajectory tracking is achieved under the limited sensor hardware of the actuator, and the proposed MEKF-MPC controller responds with strong adaptive robustness to time-varying parameters and highly dynamic, large external load disturbances.

VII. CONCLUSIONS
In this paper, a linear time-varying model predictive control scheme with a multi-scale online estimator has been proposed for the electro-hydraulic actuator with limited sensor hardware in legged robots. The scheme accounts for time-varying parameters, highly dynamic external load interference during motion, measurement noise and the estimation of unmeasured states, and it achieves fast, high-accuracy trajectory tracking with the input and output kept within their constraints. Experimental results show that the proposed control structure has a simple design process and good adaptive robustness and trajectory tracking performance in highly dynamic motion.
In the future, we will gradually apply this control scheme to a legged robot to achieve highly dynamic and stable motion of the overall robot.