Dual-Loop Robust Attitude Control for an Aerodynamic System With Unknown Dynamic Model: Algorithm and Experimental Validation

This paper proposes two dual-loop nonlinear robust control strategies and implements them on an open-loop unstable two-degree-of-freedom (2-DOF) helicopter system with unmodeled dynamics and uncertainties. The inner feedback loop is a nominal controller realized by an existing “intelligent” proportional-derivative controller (iPD), while the outer feedback loop serves as a compensation loop. We study two different forms of the outer loop: a model-free sliding mode compensator (MFSMC) and a model-free data-driven compensator (MFDDC). Combining the shared inner loop with either outer loop yields two model-free robust control strategies, i.e., iPD-MFSMC and iPD-MFDDC. Both approaches are validated experimentally on the attitude tracking control of a 2-DOF laboratory helicopter, where the control objective is to have the helicopter attitudes, i.e., the pitch and yaw motions, track specified trajectories. To demonstrate the utility of the two control approaches, we compare them with a linear quadratic regulator (LQR), optimal feedback linearization control (OFLC), and iPD alone. The comparison of the simulation and experimental results shows that the dual-loop robust control approaches are promising for controlling systems with unknown dynamical models.


I. INTRODUCTION
Flight control of unmanned aerial vehicles (UAVs) has been a popular topic in recent years. Small helicopters can be considered a special type of UAV with multi-variable, inherently unstable, and strongly coupled nonlinear dynamics. They have been frequently studied as experimental platforms for control design. In this paper, we conduct two different model-free robust control experiments for attitude control of a laboratory helicopter and compare the proposed controllers with model-based control methods.
The associate editor coordinating the review of this manuscript and approving it for publication was Sun Junwei.
A large number of papers on helicopter control can be retrieved from various databases. From the perspective of whether the controller design depends on the dynamic model of the system, control design methods can be roughly divided into two categories: model-based and model-free. Generally, a model-based control method requires the designer to establish a mathematical model that reflects the laws of helicopter motion, often in the form of differential or difference equations. Many control methods fall into this category, such as LQR [1]-[4], feedback linearization control [5]-[7], backstepping control [8]-[10], sliding mode control [11]-[13], and model predictive control [14]-[16]. When the helicopter works near the hovering point, a linear model can be built for controller design for the sake of simplicity; the LQR design is an example. To enhance the flight performance of the helicopter in a complex environment or during maneuvering, a more accurate nonlinear system model is needed for controller design. For a specific control method, the control performance is positively correlated with the accuracy of the model. However, constructing an accurate helicopter dynamics model is quite expensive, and the designed controller may still not yield the optimal set of parameters in model-based design. For these reasons, model-free control design methods have been favored by scholars.
In recent years, researchers have become increasingly interested in controller design and optimization based on experimental data. As a result, fewer requirements are placed on the system model, or the model is not needed at all. Numerous model-free controller design and tuning techniques have been proposed based on classical control theory, such as simultaneous perturbation stochastic approximation (SPSA) [17], iterative feedback tuning (IFT) [18], virtual reference feedback tuning (VRFT) [19], model-free adaptive control (MFAC) [20], model-free control (MFC) [21], and model-free iterative learning control (MFILC) [22]. These techniques are general data-driven or data-based control methods with extensive theoretical research and applications in different fields [23]-[26]. Different from other model-free control approaches, MFC works with an ultra-local model that is continuously updated according to the input-output behavior, which allows it to deal with unmodeled dynamics, system uncertainties, and external disturbances that could have a crucial effect on system stability. It usually works together with a P/PD/PI/PID controller, and the combination is known as an i-P/PD/PI/PID controller [21]. Without knowing the exact dynamic model of the system, MFC only needs general knowledge about the system dynamics to determine and tune the controller parameters. It should be emphasized that this control method has already been successfully applied in many practical case studies [27]-[30]. The key to the success of MFC is to continuously update the unknown ultra-local system dynamics with input-output data while generating control signals.
However, many objective factors, such as derivative estimation error, control signal lag, and environmental noise, always introduce a certain difference between the estimated and real values of the system dynamics. They all affect the control performance of the system and may even lead to instability. In this paper, we first take a PD controller based on MFC as a nominal controller (iPD) to stabilize the 2-DOF helicopter system. Then, two different compensation strategies are introduced into the iPD-controlled system to improve the attitude control performance of the helicopter.
The first compensator derives from a nonlinear sliding mode control (SMC) design. SMC is an easily understandable nonlinear control technique with the advantage of robustness against system uncertainties, parameter variations, and external disturbances [31]-[34]. To decrease the influence of derivative estimation error, control signal lag, and environmental noise on MFC, a sliding mode controller can be used to design compensators that are embedded into the MFC and enhance its control performance [35]. Wang et al. [36] design a model-free sliding mode controller (MFSMC) by combining MFC with a sliding mode controller and apply it to the attitude control of a quadrotor. In [37], two model-free sliding mode control system structures are designed and validated by a set of real-time experimental results on a nonlinear laboratory twin-rotor aerodynamic system. In this paper, the model-free sliding mode compensation technique is investigated and used to improve the performance of the nominal controller. The combination of the iPD and the MFSMC forms the first dual-loop control structure (i.e., iPD-MFSMC) studied in this paper.
The second model-free compensator in this paper is a data-driven offline learning technique based on integral reinforcement learning. Reinforcement learning (RL) is known in the control community as adaptive (or approximate) dynamic programming (ADP). RL can be divided into model-based RL and model-free RL according to how much knowledge of the system dynamics is available. Within model-based RL, the online synchronous policy iteration (PI) algorithm [38] updates the actor and critic neural networks (NNs) [39] based on full knowledge of the system dynamics. The integral RL (IRL) technique [40], [41] is a data-based RL method that uses only partial knowledge of the system dynamics. By combining IRL with an off-policy scheme, offline iterative learning is used to control partially unknown systems [42]. Within model-free RL, Zhang et al. [43] make a significant contribution to optimal robust tracking control, which provides a solid foundation for using RL in the optimal tracking control of unknown general nonlinear systems. The method has two main advantages: 1) only input-output data are required instead of an exact system model, and 2) the tracking error converges to zero asymptotically in an optimal way. Based on the IRL technique and the off-policy scheme [44], data-driven RL has been used in uncertain systems [45], zero-sum games [46], [47], nonzero-sum games [48], [49], $H_\infty$ control [50], etc. With online measurement and off-policy learning, Zhang et al. [48] solved the continuous-time unknown nonzero-sum game with partially constrained inputs by a model-free ADP algorithm. However, this method makes sense only if the right-hand side of the system $\dot{x} = f(x) + g_1(x)u_1 + g_2(x)u_2$ is Lipschitz continuous on a compact set $\Omega \subseteq \mathbb{R}^n$ containing the origin and the system is stabilizable on $\Omega$. Unfortunately, few real-time applications or practical studies apply this control method to open-loop unstable systems.
In this paper, we extend this method to open-loop unstable dynamic systems by introducing a dual-loop feedback control strategy. The model-free iPD serves as the inner loop that first stabilizes the initial system; a model-free data-driven compensator (MFDDC) is then embedded into the iPD-controlled system. The combination of iPD and MFDDC exploits their respective advantages, broadens their application scope, and improves their control effects. This combined control approach (i.e., iPD-MFDDC) is the second dual-loop control structure studied in this paper.
Both dual-loop feedback control strategies are validated on the attitude tracking control of an initially unstable aerodynamic system. Furthermore, this paper offers a thorough discussion of the experimental results through a cross-comparison of different control approaches. The contributions of this study are as follows:
(1) We propose two dual-loop model-free control strategies and apply them to the attitude tracking control of an unstable aerodynamic system.
(2) We carry out a thorough experimental comparison between the dual-loop control approaches and model-based approaches. It is shown that the model-free controller, with or without the proposed compensator, performs better than the model-based controllers. Adding a compensator to iPD improves the tracking control performance further. Besides, iPD-MFDDC has superior compensation performance to iPD-MFSMC.
This paper is organized as follows. Section II introduces the helicopter setup and reviews a model-free control strategy, i.e., iPD. The MFSMC and MFDDC are designed in Sections III and IV, respectively. The simulation and experimental studies are presented in Section V. Finally, Section VI concludes this paper.

II. THE HELICOPTER SYSTEM AND MODEL-FREE CONTROL DESIGN
The Quanser 2-DOF laboratory helicopter shown in Figure 1 consists of a helicopter model mounted on a fixed base with two propellers that are driven by DC motors. The front propeller controls the elevation of the helicopter, i.e., the pitch angle, and the back propeller controls the side-to-side motion of the helicopter, i.e., the yaw angle. The pitch angle θ and yaw angle ψ are measured through two high-resolution encoders. The pitch encoder and motor signals are transmitted via a slip ring, which eliminates the possibility of wires tangling and allows the helicopter to rotate freely through 360 degrees about the yaw axis.

A. MODELING OF THE HELICOPTER
To establish an accurate model, many factors, such as the thrust force produced by the rotation of the propellers, the counter torque produced by the rotation, and the gyroscopic effect, are usually taken into consideration. For convenience, however, the helicopter body and the propellers are usually assumed to be rigid, and the complex rotor aerodynamics and their interaction with the helicopter fuselage are generally simplified. The mathematical model of the 2-DOF helicopter is obtained based on the following conventions [51]:
1) The helicopter is horizontal when the pitch angle $\theta = 0$;
2) The pitch angle increases positively, $\theta(t) > 0$, when the nose moves upwards and the body moves in the counter-clockwise direction;
3) The yaw angle increases positively, $\psi(t) > 0$, when the body rotates in the counter-clockwise direction;
4) Pitch increases, $\theta > 0$, when the pitch thrust force is positive, $F_p > 0$;
5) Yaw increases, $\psi > 0$, when the yaw thrust force is positive, $F_y > 0$.
With the Euler-Lagrange method, the equations of the pitch and yaw motions with the servo motor voltages as inputs are given by Equation (1) [51], where $V_\theta$ and $V_\psi$ are the voltage inputs to the motors of the propellers acting on the pitch and yaw, respectively. To account for the extra unmodeled dynamics and system uncertainties, we write the system model in state-space form, where $x_1 = \theta$, $x_2 = \psi$, $x_3 = \dot{\theta}$ and $x_4 = \dot{\psi}$ represent the pitch angle, yaw angle, pitch angular velocity, and yaw angular velocity, respectively. $f_i(x)$, $g_{ij}$ and $u_i$ ($i = 1, 2$; $j = 1, 2$) represent the system drift dynamics, input factors, and input voltages, respectively. In their expressions, $\Delta f_i$ ($i = 1, 2$) represents the unmodeled dynamics and uncertainties in the pitch and yaw channels, respectively. The unmodeled dynamics and uncertainties may include the approximation of propeller viscous damping forces and aerodynamic forces, measurement errors of system parameters, external disturbances, and other factors not mentioned here. $\delta_{ij}$ ($i = 1, 2$; $j = 1, 2$) denote the uncertainties in the experimental measurements of the motor voltage-torque constant and the propeller torque-thrust constant. However, it is difficult or almost impossible to determine the exact expressions of $\Delta f_i$ and $\delta_{ij}$ in real applications.
Without an accurate dynamic model of the helicopter system, it will be quite difficult to implement the control task according to the traditional model-based optimal control methods. However, building an accurate mathematical description of unmodeled dynamics, system uncertainties, and external disturbances is usually difficult and expensive, even impossible. The conflict between the need to build an accurate mathematical model and its high cost drives our research on model-free control in this paper. We take the initiative to bypass the step of establishing a precise mathematical model of the helicopter and adopt a model-free control method to complete its attitude control task.

B. A MODEL-FREE CONTROL APPROACH
By equivalent conversion, the system (3) can be approximated over a short time window by the ultra-local model (4), where $h_i$ ($i = 1, 2$) is a continuously updated quantity that captures all the unknown nonlinearities and uncertainties in the input-output behavior of the system. Since the ultra-local model is valid only for a short time window, it must be updated at each sampling instant. Over the time interval $[k\Delta t, (k+1)\Delta t]$, the value of $h_i$ is updated from the measurements of $\alpha_i u_i$ and $y_i$ according to Equation (5), where $\hat{h}_i(k)$ is the estimated value of $h_i$ at the time point $k\Delta t$. It will be used for the computation of the control input $u_i(k)$ later. The notation $[\ddot{y}_i(k)]_{est}$ denotes the estimated value of the second-order derivative of the output $y_i$ at the time point $k\Delta t$.
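For orientation, the ultra-local model and its sample-wise estimate can be written as below. This is a sketch that assumes the paper follows the standard second-order MFC formulation of [21]; the equation numbers (4) and (5) are inferred from the later references in the text, and the constant $\alpha_i$ is assumed to be a designer-chosen input gain.

```latex
\begin{align}
  \ddot{y}_i(t) &= h_i(t) + \alpha_i u_i(t), \qquad i = 1, 2, \tag{4}\\
  \hat{h}_i(k) &= [\ddot{y}_i(k)]_{est} - \alpha_i u_i(k-1). \tag{5}
\end{align}
```

In this form, everything the designer does not model is lumped into $h_i$, and the estimate reuses the previous control input because $u_i(k)$ is not yet known when $\hat{h}_i(k)$ is computed.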
In this study, the first and second derivatives are estimated by low-pass filters (LPFs), specified by their transfer functions, to attenuate the noise in the signals, where $\omega_{cf} = 20\pi$ and $\zeta_f = 0.85$ are the cutoff frequency and the damping ratio of the low-pass filter, respectively. Besides, to get the second derivative of a time sequence of the measured output, we can also take the first derivative twice. The notation $u_i(k-1)$ denotes the control input at the previous sampling time point. Generally, the model-free control input can be written as Equation (8), where $\ddot{y}_{di}$ ($i = 1, 2$) is the second-order derivative of the desired output and $u_{ci}$ ($i = 1, 2$) is a feedback controller used to stabilize the ultra-local system. Substituting Equation (8) into (4), and assuming that $h_i$ is well approximated by $\hat{h}_i$ in Equation (5), we obtain the error dynamics (9), where $e_i = y_i - y_{di}$ ($i = 1, 2$) is the output error. With an iPD control strategy, $u_{ci} = k_{pi} e_i + k_{di}\dot{e}_i$, $i = 1, 2$, the model-free controller (8) becomes the iPD law (10).
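The iPD update described above can be sketched for a single channel as follows. This is a minimal illustration, not the authors' implementation: it assumes the sign convention u = (-h_hat + yd_ddot - kp*e - kd*e_dot) / alpha with e = y - yd, and it replaces the paper's low-pass-filtered derivatives with plain finite differences. The function name `ipd_step` and all numerical values are hypothetical.

```python
def ipd_step(y, y_prev, y_prev2, u_prev, yd, yd_dot, yd_ddot, alpha, kp, kd, dt):
    """One iPD update for a single channel (illustrative sketch)."""
    # Finite-difference derivative estimates (stand-ins for the low-pass filters).
    y_dot = (y - y_prev) / dt
    y_ddot = (y - 2.0 * y_prev + y_prev2) / dt ** 2
    # Ultra-local estimate: h_hat(k) = [y_ddot(k)]_est - alpha * u(k-1).
    h_hat = y_ddot - alpha * u_prev
    # Tracking error and its derivative, with e = y - yd.
    e = y - yd
    e_dot = y_dot - yd_dot
    # Assumed iPD law: cancel h_hat, feed forward yd_ddot, add PD feedback.
    u = (-h_hat + yd_ddot - kp * e - kd * e_dot) / alpha
    return u, h_hat


# Hypothetical samples taken from y(t) = t^2 with dt = 0.5, so y_ddot = 2.
u, h_hat = ipd_step(y=1.0, y_prev=0.25, y_prev2=0.0, u_prev=0.5,
                    yd=1.0, yd_dot=1.5, yd_ddot=0.0,
                    alpha=2.0, kp=10.0, kd=5.0, dt=0.5)
print(u, h_hat)
```

Because the tracking error is zero in this toy call, the control input only cancels the estimated lumped dynamics.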

III. A MODEL-FREE SLIDING MODE COMPENSATOR
The key to the success of model-free control lies in updating the unknown system dynamics in real time through Equation (5). However, many factors, such as derivative estimation error, control signal lag, and environmental noise, always introduce a certain deviation between the estimated and real values of the system dynamics.
In this section, a sliding mode compensator is developed to compensate for the estimation errors of iPD.

A. SLIDING MODE COMPENSATION APPROACH
An augmented sliding mode compensator, $u_{smc}$, is designed and added to the model-free controller (10), which gives the augmented control law (11). Substituting Equation (5) into (11), the closed-loop control system can be described by the state-space equations (12) by introducing the new state variables $z_1 = e_1$, $z_2 = e_2$, $z_3 = \dot{e}_1$ and $z_4 = \dot{e}_2$, where $\Delta_i$ ($i = 1, 2$) represents the unknown estimation errors. Because the state equations in (12) are decoupled, two sliding mode compensators can be designed separately.
Define a pair of sliding surfaces, $s_1$ and $s_2$, for the pitch motion and yaw motion, respectively. Taking the pitch motion as an example, we consider a Lyapunov function $V_1$. According to Lyapunov stability theory, $\dot{V}_1 < 0$ is required, so the design of the pitch-motion sliding mode compensator needs to satisfy the reaching and existence condition (15). On the sliding surface, we impose $s_1 = 0$ and $\dot{s}_1 = 0$, from which an equivalent control law can be solved. Since the estimation error $\Delta_1$ is unknown and cannot be measured accurately, the equivalent controller cannot be used to control the system directly. In an actual implementation, $\Delta_1$ is usually replaced by a hypothetical value function $\hat{\Delta}_1$, and an additional switching term $u_{smc,sw1} = -\frac{K_1}{\lambda_1 \alpha_1}\,\mathrm{sign}(s_1)$, with $K_1 > 0$ a user-defined parameter, is added to keep the system running stably along the sliding surface. The final sliding mode compensation law is given by Equation (18), where the estimation error and the hypothetical value function satisfy $|\Delta_1 - \hat{\Delta}_1| \le E_1$, with $E_1 > 0$ a known constant. Normally, the discontinuous switching law in Equation (18) may cause chattering. To tackle this issue, we follow the traditional practice of replacing the sign function in the switching law with a saturation function, where $0 < \varphi_1 < 1$ is the boundary layer thickness of the saturation function. The block diagram of the proposed control structure for the model-free control with the sliding mode compensator is shown in Figure 2.
FIGURE 2. The block diagram of the iPD-MFSMC approach. The iPD acts as the inner-loop controller that is used to stabilize the initial system, while the MFSMC serves as the outer-loop compensator, which is used to compensate the inner-loop controlled system and finish the attitude tracking task.
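The boundary-layer idea above can be illustrated with a short sketch. It assumes that the switching gain is scaled by 1/(lambda * alpha) and that the saturation function acts on s/phi; both are assumptions about the form of the law, and the function names `sat` and `switching_term` are hypothetical.

```python
def sat(x):
    """Saturation function: linear inside [-1, 1], clipped to +/-1 outside."""
    return max(-1.0, min(1.0, x))


def switching_term(s, K, lam, alpha, phi):
    """Smoothed switching term -K / (lam * alpha) * sat(s / phi).

    Inside the boundary layer (|s| <= phi) the term is proportional to s,
    which avoids the chattering caused by a hard sign(s) switch.
    """
    return -K / (lam * alpha) * sat(s / phi)


# Hypothetical values: inside the layer the output scales with s,
# outside it saturates at -K / (lam * alpha) * sign(s).
print(switching_term(0.05, K=2.0, lam=1.0, alpha=4.0, phi=0.1))
print(switching_term(0.80, K=2.0, lam=1.0, alpha=4.0, phi=0.1))
```

The boundary layer thickness trades chattering against tracking precision: a larger phi gives a smoother input but a wider band around the sliding surface.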

B. STABILITY PROOF
The stability of the model-free controller with the proposed sliding mode compensator is discussed for two ranges of the sliding variable $s_1$.
Case 1: $|s_1| \le \varphi_1$. Substituting Equations (18) and (19) into Equation (16), the existence and reaching condition (15) reduces to a lower bound on the gain: if $K_1 > \lambda_1 E_1 \varphi_1 / |s_1|$, the reaching and existence condition (15) is guaranteed.
Case 2: $|s_1| > \varphi_1$. Substituting Equations (18) and (19) into Equation (16), condition (15) again reduces to a lower bound on the gain: if $K_1 > \lambda_1 E_1$, the reaching and existence condition (15) is still guaranteed. When the parameter $K_1$ is selected as $\max\{\lambda_1 E_1 \varphi_1 / |s_1|,\ \lambda_1 E_1\} + \eta_1$, with $\eta_1 > 0$ a positive constant, the Lyapunov stability condition is satisfied, i.e., $\dot{V}_1 = s_1 \dot{s}_1 \le -\eta_1 |s_1|$. The sliding mode compensation law added to the model-free control (10) for the pitch motion follows accordingly, and the compensation law for the yaw motion is obtained in the same way, where $E_2 > 0$ is a known constant. It is worth noting that the gains $K_1$ and $K_2$ depend on the sliding variables $s_1$ and $s_2$, respectively. In practice, the value of each sliding variable must therefore be calculated at each sampling instant. Refer to Section V-B for the constant parameters used in this section.
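The gain selection from the two cases can be sketched as a small helper. This is an illustration under stated assumptions: the gain is read as max(lam*E*phi/|s|, lam*E) + eta, and the floor `s_min` on |s| is an implementation guard added here to avoid division by zero, not something stated in the paper. The function name `smc_gain` is hypothetical.

```python
def smc_gain(s, lam, E, phi, eta, s_min=1e-6):
    """State-dependent switching gain covering both boundary-layer cases.

    s     : current value of the sliding variable
    lam   : sliding surface parameter
    E     : known upper bound on the estimation error mismatch
    phi   : boundary layer thickness
    eta   : positive margin
    s_min : floor on |s| to keep the gain finite (assumption of this sketch)
    """
    s_abs = max(abs(s), s_min)
    return max(lam * E * phi / s_abs, lam * E) + eta


# Outside the boundary layer the bound lam * E dominates,
# inside it the term lam * E * phi / |s| takes over.
print(smc_gain(0.5, lam=2.0, E=2.0, phi=0.1, eta=0.5))
print(smc_gain(0.05, lam=2.0, E=2.0, phi=0.1, eta=0.5))
```

Since the gain grows as |s| shrinks inside the boundary layer, it has to be recomputed at every sampling instant, which matches the remark in the text.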

IV. A DATA-DRIVEN COMPENSATOR
In this section, a model-free data-driven compensator (MFDDC) that serves as the outer-loop control strategy is added to the iPD control. The MFDDC is implemented by an actor-critic neural network (NN), which learns the optimal value function and the optimal compensation policy simultaneously. The dual-loop feedback control structure (iPD-MFDDC) is illustrated in Figure 3. Recalling the helicopter model with unmodeled dynamics and system uncertainty described in Equation (2), substituting the model-free control law (10) into it, and considering the data-driven compensator $u_{ddc}$, we can write the controlled system in the form (25) together with its value function, where $L(x, u) = \frac{1}{2}(x^T Q x + u_{ddc}^T R\, u_{ddc})$ is the Lagrange function, with $Q \ge 0$ a positive semidefinite symmetric matrix and $R > 0$ a positive definite symmetric matrix. The optimal control can be solved from the Hamilton-Jacobi-Bellman (HJB) equation for the system (25), where $x^*$ is the optimal state at time $t$. Let $u^k_{ddc}$ and $V^k(x)$ denote the control input and value function at the $k$th iteration, and let $u_{ddc}$ denote an admissible control at the $(k+1)$th iteration step; the resulting relations lead to Equations (28) and (31). According to integral reinforcement learning [48], [52], integrating both sides of (31) from $t$ to $t + \Delta t$ yields the updating rule (32). With this updating rule, the unknown value function $V^{k+1}$ and the compensation law $u^k_{ddc}$ no longer depend on the system model. They both converge to the optimal ones, $V^*$ and $u^*_{ddc}$, simultaneously [48]. For implementation purposes, the optimal value function $V^*$ and control policy $u^*_{ddc}$ can be approximated through a critic neural network and an actor neural network, respectively.
The approximate solutions of (32) based on the actor-critic NN can be written in terms of basis functions, where $\varphi_V : \mathbb{R}^n \to \mathbb{R}^{K_V}$ and $\varphi_u : \mathbb{R}^n \to \mathbb{R}^{K_u}$ are linearly independent basis function vectors, and $\hat{w}_{V,k+1} \in \mathbb{R}^{K_V}$ and $\hat{w}_{u,k+1} \in \mathbb{R}^{K_u \times m}$ are the estimates of the unknown coefficient vector and matrix, with $K_V$ and $K_u$ the numbers of hidden neurons. It is known that as $K_V \to \infty$ and $K_u \to \infty$, the approximate solutions $\hat{V}(x)$ and $\hat{u}_{ddc}(t)$ converge to the true solutions $V(x)$ and $u_{ddc}(t)$, respectively. For the special case of the 2-DOF helicopter system, with four states and two inputs, the parameters satisfy $n = 4$ and $m = 2$.
VOLUME 8, 2020
Define a time sequence $t_j = j\Delta t$ with $j = 0, 1, \ldots, q$ over a large interval. The residual error of the critic NN, $e^{k+1}_j$, can be written in a compact form by introducing the Kronecker product $\otimes$, where $\hat{W}_{k+1} \in \mathbb{R}^{\bar{K}}$ is the estimated weighting vector with $\bar{K} = K_V + mK_u$, and $\mathrm{vec}(\cdot)$ denotes the vectorization of a matrix formed by stacking its columns into a single column vector. Besides, the iteration index is $k \in \{0, 1, \ldots\}$, the time sequence index is $j \in \{0, 1, \ldots, q\}$, and $\rho_j$ and $\pi_j$ are defined accordingly. Based on the least-squares (LS) principle, the estimated weighting vector $\hat{W}_{k+1}$ is determined by minimizing the sum of the squared residuals $(e^{k+1}_j)^2$. For the solution to exist, the inverse of the matrix $P^T(\hat{W}_k)P(\hat{W}_k)$ must exist, i.e., the matrix $P(\hat{W}_k)$ must have full column rank. In general, the number of data points should satisfy $q \ge \mathrm{rank}(P(\hat{W}_k))$. The termination condition of the updating rule is set as $\|\hat{W}_{k+1} - \hat{W}_k\| \le \varepsilon$, where $\varepsilon$ is a very small positive number. Combining the data-driven compensator with the model-free controller, we obtain the final control input, where $\hat{w}^*_u$ is the optimal gain parameter trained with the input and output data of the system. The model-free control law with a data-driven compensator is completely independent of the system model and depends only on the input and output data of the system.
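The least-squares step and the termination test described above can be sketched as follows. This is a minimal pure-Python illustration that assumes the residual is linear in the weights, e_j = rho_j . W - pi_j, so that the matrix P stacks the rows rho_j; the helper names (`solve_linear`, `least_squares`, `converged`) and the toy data are hypothetical and not taken from the paper.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("P^T P is singular: the data are not rich enough")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        acc = M[i][n] - sum(M[i][c] * x[c] for c in range(i + 1, n))
        x[i] = acc / M[i][i]
    return x


def least_squares(P, pi):
    """Return W minimizing sum_j (P[j] . W - pi[j])^2 via the normal equations."""
    cols = len(P[0])
    PtP = [[sum(row[a] * row[b] for row in P) for b in range(cols)]
           for a in range(cols)]
    Ptpi = [sum(row[a] * t for row, t in zip(P, pi)) for a in range(cols)]
    return solve_linear(PtP, Ptpi)


def converged(W_new, W_old, eps=1e-8):
    """Termination test ||W_new - W_old|| <= eps (Euclidean norm)."""
    return sum((a - b) ** 2 for a, b in zip(W_new, W_old)) ** 0.5 <= eps


# Toy data generated from the hypothetical weights W = [2, -3].
P = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, 2.0]]
pi = [2.0, -3.0, -1.0, -4.0]
W = least_squares(P, pi)
print(W)
```

In the actual algorithm the matrix P depends on the previous weights, so this solve is repeated with refreshed data terms until `converged` returns true. For larger problems, a QR-based solver would be numerically preferable to the normal equations.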

V. RESULTS AND DISCUSSIONS
We first review two model-based control approaches that are compared with the proposed model-free control strategies. Then, a simulation is carried out to show the robustness properties of the model-free controller with different types of compensation mechanisms. Finally, we design three experiments to demonstrate the compensation effect of the two designed compensators.

A. MODEL-BASED CONTROL FOR COMPARISON
By neglecting the unmodeled dynamics, system uncertainties, and external disturbances in Equation (2), the 2-DOF helicopter system can be described with the system parameters listed in Table 1. The linear quadratic regulator (LQR) control [51] and an optimal feedback linearization control (OFLC) [7] are taken as two comparison baselines to evaluate the control performance of the two model-free control methods above. It is worth noting that the two model-based control methods are based on the state equation (2) with the unmodeled dynamics, system uncertainties, and external disturbances ignored.

1) LQR DESIGN
According to [51], the pitch angle $\theta$ is regulated by a proportional-integral-derivative (PID) controller with a feed-forward term, while the yaw angle $\psi$ is regulated by a PID controller without a feed-forward term. The nonlinear feed-forward term in the pitch angle control compensates the gravitational torque $\tau_g = m_{heli}\, g\, l_{cm} \cos\theta$ in Equation (1), where $\theta_d$ is the desired pitch angle and $k_{ff} = 1.0$ is the feed-forward control gain. The PID feedback control [u 1 ,

2) OPTIMAL FEEDBACK LINEARIZATION CONTROL (OFLC)
Let $y = [y_1, y_2]^T = [\theta(t), \psi(t)]^T$ be the system output and $y_d = [\theta_d(t), \psi_d(t)]^T$ be the desired trajectories for the outputs. Neglecting the unmodeled dynamics and system uncertainties in Equation (2) and extracting the last two rows of this equation, we obtain the input-output dynamics of $\ddot{y}$. Referring to the results in [7], the optimal feedback linearization controller (OFLC) is designed, where $\dot{y}_e$ is the numerical estimate of the derivative of the output signal.
Before the simulation and experiment, we first give the parameters of each controller and the basis for their selection. The parameters of the iPD controller (10) are selected first, and the control parameters used in iPD-MFSMC are then chosen on top of the iPD settings. Furthermore, for convenience, we assume that the estimation error is zero, i.e., $\hat{\Delta}_1 = \hat{\Delta}_2 = 0$, and after multiple attempts, we set the estimation upper bounds as $E_1 = 2$ and $E_2 = 3$.
The implementation of iPD-MFDDC requires training an optimal compensator coefficient $w^*_u$ using the input-output data of the helicopter system. In this study, we artificially choose a set of probing excitation signals $u_p(t) = [u_{p,\theta}, u_{p,\psi}]^T$ with the following form: $u_{p,\theta} = \sin(0.4t) + 2\sin(1.6t)$ and $u_{p,\psi} = \sin(0.5t) + \sin(1.9t) + \sin(9.1t)$.
The sampling time of each experiment is set as $\Delta t = 0.005$ s. When the data generation experiment runs for $T_{run} = 20$ s, we accumulate 4001 pairs of input-output data in total. The input data consist of the probing signal $u_p(t)$ and the iPD control signal $u_{mfc}(t)$, while the output data are composed of the system output $y$, the output derivative estimate $[\dot{y}]_{est}$, and the integral of the system output signal $\int_0^t y\, d\tau$. The input data and the corresponding output data are shown in Figures 4 and 5, respectively. Without loss of generality, we select a narrow time window [12 s, 15.5 s] that contains 700 pairs of input and output data from the whole database for updating the compensator gain $w_u$.
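The sample counts quoted above can be checked with a short sketch that generates the probing signal on the sampling grid. The component assignment of the two sinusoid sums to the pitch and yaw channels follows the order in which they are listed, and treating the training window as the half-open interval [12 s, 15.5 s) is an assumption made here so that it contains exactly 700 samples. The names `probing_signal`, `DT` and `T_RUN` are hypothetical.

```python
import math

DT = 0.005     # sampling time in seconds (from the text)
T_RUN = 20.0   # duration of the data-generation experiment in seconds


def probing_signal(t):
    """Probing excitation [u_p_theta, u_p_psi] at time t."""
    u_theta = math.sin(0.4 * t) + 2.0 * math.sin(1.6 * t)
    u_psi = math.sin(0.5 * t) + math.sin(1.9 * t) + math.sin(9.1 * t)
    return u_theta, u_psi


# Samples at t = 0, DT, ..., T_RUN, both ends included.
n_samples = round(T_RUN / DT) + 1
times = [k * DT for k in range(n_samples)]
signal = [probing_signal(t) for t in times]

# Training window, indexed by sample number to avoid floating-point edge cases.
k_start = round(12.0 / DT)
k_stop = round(15.5 / DT)
window = signal[k_start:k_stop]

print(n_samples, len(window))
```

Mixing several incommensurate frequencies in the probing signal is the usual way to keep the regression data sufficiently rich for the least-squares update.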
To approximate the optimal solutions of the value function and control policy with the actor-critic NN, we define the complete basis function vectors in terms of the state; the quadratic basis reads $[\theta^2, \psi^2, \dot{\theta}^2, \dot{\psi}^2, \theta\psi, \theta\dot{\theta}, \theta\dot{\psi}, \psi\dot{\theta}, \psi\dot{\psi}, \dot{\theta}\dot{\psi}]^T$. (55) The initial weights $w^0_V$ and $w^0_u$ of the two NNs are both initialized to zero. In order to get an optimal solution with sufficiently high precision, we set the iteration termination condition of the LS updating rule as $\|\hat{W}_{k+1} - \hat{W}_k\| < 10^{-8}$. After 13 iterations, the LS updating procedure terminates. We take the weight matrices $w^{13}_V$ and $w^{13}_u$ obtained from the 13th iteration as the gains of the optimal value function and the optimal compensator policy, respectively.
FIGURE 5. The measured output data and its derivative estimation and integral value. Only the data located in the time window [12 s, 15.5 s] is used to train the data-driven compensator.
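The quadratic basis in (55) lists the four squared states followed by all pairwise products. A minimal sketch that reproduces this ordering for a state x = [theta, psi, theta_dot, psi_dot] is given below; the function name `quadratic_basis` is hypothetical.

```python
from itertools import combinations


def quadratic_basis(x):
    """Quadratic basis: squares first, then pairwise products in index order."""
    squares = [xi * xi for xi in x]
    cross = [x[i] * x[j] for i, j in combinations(range(len(x)), 2)]
    return squares + cross


# For a 4-dimensional state this gives 4 squares + 6 cross terms = 10 entries.
phi = quadratic_basis([1.0, 2.0, 3.0, 4.0])
print(phi)
```

With four states the basis has ten entries, which matches the number of terms listed in (55).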

C. SIMULATION RESULTS AND DISCUSSION
This section presents the simulation of the step response for the helicopter control based on five different control strategies. By comparing the simulation results, we study the robustness of the different control methods against unmodeled dynamics, system uncertainties, and external disturbances. We artificially set up two scenarios for comparison:
(1) Assume that the system model in Equation (2) is completely known and accurate, which means that the system has no unmodeled dynamics, system uncertainties, or external disturbances, i.e., $\Delta f_i = 0$, $\delta_{ij} = 0$. This scenario serves as a baseline that aims to illustrate that all five control approaches are effective.
(2) Assume that the system model has a certain degree of unmodeled dynamics, system uncertainties, and external disturbances, i.e., $\Delta f_i \neq 0$, $\delta_{ij} \neq 0$. In this scenario, we artificially add uncertainty and random noise to the system and take the result as the real accurate model of the system. For convenience, we assume that the uncertainty $\delta_{ij}$ can be written as a percentage of the corresponding term in the original system. For example, we set $\delta_{11} = -0.153 g_{11}$, which means that the nominal $g_{11}$ in Equation (2) reduced by 15.3% is taken as the exact value of this term. Similarly, we add uncertainty to the other terms in turn, i.e., $\delta_{12} = 0.165 g_{12}$, $\delta_{21} = 0.137 g_{21}$ and $\delta_{22} = -0.185 g_{22}$. Different from the handling of $\delta_{ij}$, we set $\Delta f_i$ as a series of random numbers with a specific mean value and variance to represent the uncertainty and external disturbance. In this study, we set $\Delta f_1 = \mathrm{rand}(0.14, 0.27)$ and $\Delta f_2 = \mathrm{rand}(0.31, 0.23)$, which means that $\Delta f_1$ and $\Delta f_2$ are two random sequences with mean values 0.14 and 0.31 and variances 0.27 and 0.23, respectively. Besides, we add two Gaussian white noise signals with zero mean and variance 0.0141 to the state variables $x_1$ and $x_2$, respectively, to simulate the sensor noise.
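The way scenario (2) perturbs the nominal model can be sketched as follows. The relative uncertainties are taken from the text; the nominal gain values in the usage lines are placeholders, and drawing the disturbance from a Gaussian distribution is an assumption, since the text only specifies a mean and a variance. The names `perturb_gains` and `disturbance` are hypothetical.

```python
import random

# Relative uncertainties delta_ij / g_ij as stated for scenario (2).
REL_UNCERTAINTY = {"g11": -0.153, "g12": 0.165, "g21": 0.137, "g22": -0.185}


def perturb_gains(nominal):
    """Return the 'true' input gains g_ij + delta_ij from the nominal g_ij."""
    return {name: g * (1.0 + REL_UNCERTAINTY[name]) for name, g in nominal.items()}


def disturbance(mean, variance, rng):
    """One disturbance sample with the given mean and variance (Gaussian assumed)."""
    return rng.gauss(mean, variance ** 0.5)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
nominal = {"g11": 1.0, "g12": 1.0, "g21": 1.0, "g22": 1.0}  # placeholder values
true_gains = perturb_gains(nominal)
df1 = disturbance(0.14, 0.27, rng)  # pitch-channel disturbance sample
df2 = disturbance(0.31, 0.23, rng)  # yaw-channel disturbance sample
print(true_gains, df1, df2)
```

Note that `random.gauss` expects a standard deviation, so the variance is converted with a square root before sampling.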
Since LQR and OFLC are model-based methods that require a completely known system model, we design them in scenario (1). In contrast, iPD, iPD-MFSMC, and iPD-MFDDC are model-free methods, and we design them in scenario (2). After finishing the design of the five controllers, we test them in both scenarios (1) and (2). Figures 6 and 7 show the output and input signals of the step responses, respectively. Observing the output signals in scenario (1) alone, we can see that each control approach has good tracking control performance. It means that if we have an accurate mathematical model, all five control methods can realize the control task well. However, when we add unmodeled dynamics, system uncertainties, and external disturbances to the helicopter system, the tracking control performance of LQR and OFLC becomes worse. Meanwhile, the control performance of iPD, iPD-MFSMC, and iPD-MFDDC remains satisfactory, which means that the three model-free control strategies are quite robust against uncertainties and external disturbances. The simulation results shown in this section imply that the proposed model-free control strategies are more robust than the model-based ones.

D. EXPERIMENTAL RESULTS AND DISCUSSION
The control inputs of the 2-DOF laboratory helicopter are applied through two servo motors. To protect the system hardware from damage, the input voltages of the two servo motors should be limited to finite intervals. The pitch control voltage of the UPM-2405 DC motor is bounded by the amplifier outputs $V_{p,max} = 24$ V and $V_{p,min} = -24$ V. The yaw control voltage of the UPM-1503 DC motor is bounded by $V_{y,max} = 15$ V and $V_{y,min} = -15$ V.
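The voltage limits above amount to clamping each commanded input before it is sent to the amplifier. A minimal sketch, with hypothetical function names, is:

```python
V_PITCH_MAX = 24.0  # pitch motor voltage limit in volts (from the text)
V_YAW_MAX = 15.0    # yaw motor voltage limit in volts (from the text)


def clamp(value, limit):
    """Clip value to the symmetric interval [-limit, limit]."""
    return max(-limit, min(limit, value))


def saturate_inputs(v_pitch, v_yaw):
    """Apply the amplifier limits to the pitch and yaw voltage commands."""
    return clamp(v_pitch, V_PITCH_MAX), clamp(v_yaw, V_YAW_MAX)


print(saturate_inputs(30.0, -20.0))
print(saturate_inputs(5.0, 3.0))
```

Commands inside the limits pass through unchanged, so the clamp only acts during large transients.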
We demonstrate the effectiveness of the proposed control algorithms through three tracking control scenarios:
Tracking a Circle: The reference signals θ_d and ψ_d together form a circular trajectory with center (−10, −20) and radius 20,
Tracking a Square: The reference signals θ_d and ψ_d together form a square trajectory with center (−10, −20) and side length 40,
Tracking Complex Trajectories: The reference signals consist of the summation of different harmonic signals,
To clearly evaluate the performance difference between model-based and model-free control methods, we define a statistical indicator, where e_θ = θ − θ_d and e_ψ = ψ − ψ_d are the tracking errors of the pitch and yaw motions. Besides, we use the average integral absolute error function to quantitatively discuss the control performance of each control method, where T = 80 s is the total running time of each experiment.
Figure 8 shows the experimental output of the first scenario. We can see that the OFLC design performs better than the LQR design. Compared to the model-based control approaches (i.e., red and blue lines in each subgraph), the model-free control methods (i.e., green lines in each subgraph) perform better because their outputs are closer to the desired trajectory.
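The circular and square references and an error measure of this kind can be sketched as follows. This is an illustration under stated assumptions: the traversal period of 40 s, the starting points, and the direction of travel are hypothetical, the mapping of the two returned coordinates onto θ_d and ψ_d is not fixed here, and `average_iae` assumes the common form (1/T)∫|e(t)|dt evaluated with a rectangle rule, which may differ from the exact definition used in the paper.

```python
import math

def circle_ref(t, center=(-10.0, -20.0), radius=20.0, period=40.0):
    """Point on the circular reference at time t (period is a hypothetical value)."""
    w = 2.0 * math.pi / period
    return (center[0] + radius * math.cos(w * t),
            center[1] + radius * math.sin(w * t))

def square_ref(t, center=(-10.0, -20.0), side=40.0, period=40.0):
    """Point on the square reference at time t, traversed at constant speed."""
    h = side / 2.0
    corners = [(-h, -h), (h, -h), (h, h), (-h, h)]
    s = (t % period) / period * 4.0   # position along the perimeter, in edges
    edge = min(int(s), 3)             # index of the current edge
    frac = s - edge                   # fraction of the current edge travelled
    x0, y0 = corners[edge]
    x1, y1 = corners[(edge + 1) % 4]
    return (center[0] + x0 + frac * (x1 - x0),
            center[1] + y0 + frac * (y1 - y0))

def average_iae(errors, dt):
    """Assumed form (1/T) * integral of |e(t)| dt, using a rectangle rule."""
    total_time = len(errors) * dt
    return sum(abs(e) for e in errors) * dt / total_time

# Example: error samples of a hypothetical run, sampled every 0.1 s.
example_aiae = average_iae([1.0, -1.0, 2.0, -2.0], dt=0.1)
```

Under this assumed definition, the measure reduces to the mean absolute tracking error over the run, so a smaller value indicates better tracking.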
The values of (62) for each control method are shown in Figure 8(d). Since the three model-free methods produce similar outputs, we calculate their average (green line in Figure 8(d)) and compare it with the model-based methods. This graph quantitatively shows that the model-free methods are superior to the model-based methods. To investigate which of the three model-free control methods works best, we calculate the average integral absolute error of each method and present the results in Figure 9. This again indicates that the three model-free controls significantly improve the control performance compared to LQR and OFLC. Besides, iPD-MFSMC and iPD-MFDDC slightly improve the control performance compared to iPD. According to the quantitative calculation, iPD-MFSMC improves on iPD by 5.41%, while iPD-MFDDC improves on iPD by 9.48%.
Figure 10 shows the experimental output of the second scenario. The results are similar to those of the first scenario. Figure 10(a)-(c) qualitatively show that the model-free control performs better than the model-based control, and Figure 10(d) shows the same quantitatively. Figure 11 shows the statistical results of the average integral absolute error for each control method. We can see that the model-free control performance is significantly better than that of the model-based control. iPD-MFSMC and iPD-MFDDC perform about 4.93% and 5.27% better than iPD, respectively.
Figures 12 and 13 show the experimental outputs of the third scenario. Subfigures (a)-(c) qualitatively indicate that the model-free control has smaller tracking errors than the model-based control. Subfigure (d) quantitatively shows that model-free control is superior to model-based control. Figure 14 shows the statistical results of the average integral absolute error for each control method. We can also see that iPD-MFSMC and iPD-MFDDC perform about 5.58% and 9.68% better than iPD, respectively.
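For reference, percentages of this kind are typically obtained as a relative reduction of the error measure with respect to the iPD baseline. The sketch below assumes that definition; the function name is hypothetical and the numerical values are placeholders, not measurements from the experiments.

```python
def relative_improvement(baseline_error, new_error):
    """Relative reduction of an error measure, in percent of the baseline.

    Assumed definition: 100 * (J_baseline - J_new) / J_baseline.
    """
    return 100.0 * (baseline_error - new_error) / baseline_error

# Placeholder error values (NOT experimental data): a drop from 1.0 to 0.9
# corresponds to a 10% improvement over the baseline.
improvement = relative_improvement(1.0, 0.9)
```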

VI. CONCLUSION
This paper investigates two model-free compensators, the MFSMC and the MFDDC, which are used to compensate for a nominal iPD controller. The compensated nonlinear controller is a dual-loop robust model-free controller with iPD as the inner loop and the compensator as the outer loop. The proposed dual-loop model-free control algorithms are validated by simulations and experiments on a nonlinear laboratory helicopter setup. The cross-comparisons between the two robust model-free control methods and two model-based control methods show that: 1) the tracking performance of the dual-loop robust model-free control approaches is superior to that of the model-based methods, 2) with a compensator (MFSMC or MFDDC) embedded in the inner-loop controller, the tracking performance is improved significantly, and 3) the MFDDC compensator performs slightly better than the MFSMC compensator. The results in this paper indicate that the proposed dual-loop robust model-free control approaches are quite promising for control tasks involving systems with unmodeled dynamics and uncertainties, even with unknown dynamical models. This is a very favorable property for potential practical applications.