Robust Adaptive Control of Maximum Power Point Tracking for Wind Power System

A novel data-driven robust approximate optimal maximum power point tracking (MPPT) control method is proposed for the wind power generation system using the adaptive dynamic programming (ADP) algorithm. First, a data-driven model is established with a recurrent neural network (NN) to reconstruct the wind power system dynamics from available input-output data. Then, based on the obtained data-driven model, the ADP algorithm is used to design the approximate optimal tracking controller, which consists of a steady-state controller and an optimal feedback controller. Furthermore, a robustifying term is developed to compensate for the NN approximation errors introduced by implementing the ADP method. Based on the Lyapunov approach, the stability of the designed model and controller is proved, showing that the proposed controller guarantees that the system power asymptotically tracks the maximum power. Finally, the simulation results demonstrate that the control method stabilizes the tip speed ratio near the optimal value when the wind speed is lower than the rated wind speed. Moreover, the tracking response of the proposed method is fast, which enhances the stability and robustness of the system.


I. INTRODUCTION
With the rapid development of the social economy, traditional energy sources are gradually being depleted. Therefore, the development of sustainable renewable energy is urgent. As a widely distributed, clean, and renewable energy source, wind energy has received worldwide attention and has promising application prospects. To protect the environment and slow down global energy depletion, the installed capacity of wind power continues to grow worldwide [1].
The wind power generation system cannot convert all the captured wind energy into electrical energy. The wind energy utilization coefficient is an index of the conversion efficiency of wind energy into electrical energy. This coefficient varies with wind speed, so the maximum power point of the power output also changes with wind speed. Because of the randomness of wind speed and the nonlinearity of the wind power generation system, its output characteristics are greatly affected by the external environment, and it is difficult to capture the maximum power point. Since the conversion efficiency of the wind power generation system is low and the equipment is expensive, it is necessary to improve the utilization rate of wind energy and increase economic benefits through maximum power point tracking (MPPT) control [2].
The associate editor coordinating the review of this manuscript and approving it for publication was Christopher H. T. Lee.
There are many control strategies to achieve MPPT, mainly including the tip speed ratio method, the power curve control method, the three-point comparison method, and the hill-climbing method [3]. Researchers at home and abroad have also proposed a large number of control methods for the nonlinear characteristics of wind power generation systems. In [4], the researchers present a fuzzy PID compound controller, which combines fuzzy control and traditional PID control to achieve the maximum power output of the generator, but its control accuracy is not high. Reference [5] proposed a fuzzy neural network controller that combines a neural network with fuzzy logic control. It uses indirect vector control and reactive power control to regulate the power transmission between the machine and the power grid, but the control rules of fuzzy logic are complex, and the design process relies on experience. For the case where an accurate wind turbine power curve is difficult to obtain, a method that combines sliding mode control with a hill-climbing search algorithm is proposed in [6]. This method can track the optimal power curve without measuring wind speed, but it complicates the modeling process and places high demands on parameter selection and tuning.
Adaptive dynamic programming (ADP) is an effective intelligent control algorithm, which plays an important role in searching for optimal control solutions [7]. Compared with the intelligent control methods mentioned above, adaptive control [8] enables the controller to automatically adjust its learning process when the environment or the system changes, which is very useful for a wind power generation system subject to many uncertain disturbances. As an effective tool for approximating input-output mappings, neural networks have been applied in many nonlinear control systems [9]-[11]. The neural network can adjust its weights on-line adaptively without pre-training, which can guarantee the stability and performance of the closed-loop system. From the above analysis, the application of adaptive neural network control to uncertain nonlinear dynamic systems such as wind power generation systems is significant. Therefore, in this paper, we propose a data-driven robust adaptive control (DRAC) method for maximum power tracking of the wind power generation system. The method changes the wind energy utilization coefficient by controlling the rotation speed of the wind turbine and keeps the coefficient at its maximum value, so as to guarantee that the system asymptotically tracks the maximum power when the wind speed is below the rated wind speed. Then, a robust compensation term is added to improve the robustness of the system to disturbances, reduce the uncertainty of on-line estimation, and speed up the convergence of the system.
A. RESEARCH CONTRIBUTIONS
1) Based on the nonlinear equations of wind power generation and the adaptive dynamic programming algorithm, combined with a recurrent neural network (RNN), we propose a data-driven robust adaptive control method for maximum power tracking of the wind power generation system. The construction of the data-driven model only requires the available input-output data of the system, which reduces the requirements for knowledge of the system dynamics.
2) We combine adaptive dynamic programming, the actor-critic structure [12], and neural networks to design the adaptive controller. The adaptive dynamic programming algorithm is used to approximate the optimal feedback control input so that the control input error converges to a small range, which demonstrates the tracking effectiveness of the controller.
3) We design a robust compensation term to make the control input error converge to zero, reduce the influence of uncertain disturbance terms on the tracking effect of the wind power generation system, and enhance the robustness and performance of the controller.

B. PAPER STRUCTURE
The rest of this paper is organized as follows. In Section II, some preliminary knowledge and preparation for the subsequent controller design are introduced. In Section III, we introduce the construction of the data-driven model of the wind power generation system and its stability proof. The design of the robust adaptive controller and its stability analysis are introduced in detail in Section IV. In Section V, the simulation experiment in MATLAB is given. Finally, the conclusion and prospects are provided in Section VI.

II. PROBLEM FORMULATION AND PRELIMINARIES
In this section, we briefly introduce the wind power system model, the principle of maximum power tracking control strategy based on the tip speed ratio method, and the related knowledge involved in the control method proposed in this paper.

A. WIND POWER GENERATION SYSTEM MODEL AND MAXIMUM POWER TRACKING CONTROL
According to the aerodynamic characteristics of the wind power generation system [13], [14], the wind energy captured by the wind turbine is the active power output $P_m$ of the pneumatic machinery, which is defined as $P_m = \frac{1}{2}\rho\pi R^2 v^3 C_p(\lambda, \beta)$ (1), where $\rho$, $R$, $v$, and $C_p$ denote the air density, blade radius, wind speed, and wind energy utilization coefficient, respectively. $\lambda$ is the tip speed ratio, i.e., the ratio of the blade tip circumferential speed to the wind speed, defined by (2) as $\lambda = \omega R / v$, where $\omega$ is the angular velocity of the wind rotor. $C_p$ is a nonlinear function of the tip speed ratio and the pitch angle, and its mathematical model can be obtained by numerical approximation [15], [16]. Its empirical expression is given in (3)-(4), where $\beta$ is the pitch angle and (4) defines an intermediate variable. The curve of the wind energy utilization coefficient $C_p(\lambda, \beta)$ obtained from (3)-(4) is shown in Fig. 1.
VOLUME 8, 2020
According to (1), when the wind speed is stable, the active power obtained by the wind turbine is proportional to the wind energy utilization coefficient $C_p$. The higher the wind energy utilization coefficient is, the higher the wind energy conversion efficiency and the greater the power of the wind turbine. Further observation of Fig. 1 shows that, within a certain range, the smaller the pitch angle, the greater the wind energy utilization coefficient; when the pitch angle is fixed, there is an optimal tip speed ratio $\lambda_{opt}$ that maximizes the wind energy utilization coefficient. Therefore, when the wind speed is less than the rated wind speed, the pitch angle is fixed at $\beta = 0$. By adjusting the wind turbine speed to the optimal angular speed $\omega_{opt}$, $\lambda$ reaches the optimal tip speed ratio $\lambda_{opt}$ and the wind energy utilization coefficient reaches its maximum $C_{p\max}$, which yields the maximum output power. In this way, the maximum power point tracking (MPPT) control of the wind power generation system can be realized. The schematic diagram of the control strategy principle is shown in Fig. 2. If the transmission damping between the wind turbine and the generator is ignored, the simplified dynamic equation of the wind power transmission system is given by (5) [17],
where $J_r$, $J_g$, $N$, and $T_e$ denote the inertia of the rotor, the inertia of the generator, the transmission ratio, and the electromagnetic torque of the generator, respectively; $T_a$ is the torque of the pneumatic machinery, and its expression can be obtained by combining (1) and (2), as shown in (6).
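The step from (1) and (2) to the torque expression can be written out as follows. This is a sketch of the derivation, assuming that (1) has the form $P_m = \frac{1}{2}\rho\pi R^2 v^3 C_p(\lambda,\beta)$ (consistent with the expression for $P_{opt}$ used in the simulation section) and that the aerodynamic torque is the captured power divided by the rotor speed; it may differ in presentation from the paper's own (6).

```latex
\begin{align}
T_a &= \frac{P_m}{\omega}
     = \frac{\tfrac{1}{2}\rho\pi R^{2} v^{3} C_p(\lambda,\beta)}{\omega}, \\
\lambda &= \frac{\omega R}{v}
  \;\Longrightarrow\; \omega = \frac{\lambda v}{R}, \\
T_a &= \frac{1}{2}\rho\pi R^{3} v^{2}\,\frac{C_p(\lambda,\beta)}{\lambda}.
\end{align}
```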
In this paper, the wind power generation system is simulated based on (1)-(6) to generate the data used for the data-driven model.
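The relations used by the tip speed ratio method can be sketched numerically as follows. This is a minimal Python sketch, assuming (1) has the form $P_m = \frac{1}{2}\rho\pi R^2 v^3 C_p$ and (2) is $\lambda = \omega R / v$. The function names and all numerical values (`RHO`, `RADIUS`, `LAMBDA_OPT`, `CP_MAX`) are hypothetical illustrations, not the parameters of Table 2.

```python
import math


def tip_speed_ratio(omega, radius, wind_speed):
    """Tip speed ratio: blade tip speed (omega * R) divided by wind speed v."""
    return omega * radius / wind_speed


def mechanical_power(rho, radius, wind_speed, cp):
    """Captured power P_m = 0.5 * rho * pi * R^2 * v^3 * C_p (assumed form of (1))."""
    return 0.5 * rho * math.pi * radius ** 2 * wind_speed ** 3 * cp


def optimal_rotor_speed(lambda_opt, radius, wind_speed):
    """Rotor speed that holds the tip speed ratio at lambda_opt."""
    return lambda_opt * wind_speed / radius


# Hypothetical illustrative values (not taken from the paper).
RHO = 1.225        # air density, kg/m^3
RADIUS = 20.0      # blade radius, m
LAMBDA_OPT = 8.1   # assumed optimal tip speed ratio at beta = 0
CP_MAX = 0.48      # assumed maximum utilization coefficient

v = 7.0  # wind speed, m/s
omega_opt = optimal_rotor_speed(LAMBDA_OPT, RADIUS, v)
p_opt = mechanical_power(RHO, RADIUS, v, CP_MAX)
print(f"omega_opt = {omega_opt:.3f} rad/s, P_opt = {p_opt / 1e3:.1f} kW")
```

Because the captured power scales with the cube of the wind speed, the reference speed `omega_opt` has to be recomputed whenever the wind speed changes, which is what the tracking controller follows.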

B. RECURRENT NEURAL NETWORK
A recurrent neural network (RNN) is a type of neural network that takes sequence data as input and establishes full connections between the neurons of adjacent layers [18]. An RNN has the characteristics of memory, parameter sharing, and Turing completeness, so it has certain advantages in learning the nonlinear features of sequences [19]. Therefore, this paper uses an RNN model to learn the nonlinear characteristics of the wind power generation system and to construct its data-driven model. $X_t = \{X_0, X_1, \ldots, X_\tau\}$ is the given time series of length $\tau$. The evolution direction of this sequence is called the "timestep". $h_t$ is the system state of the recurrent neural network, and $s_t$ is the internal state, which is related to the system state,
where $U$ and $W$ are the weights of the neural network, $b$ is the bias, and $\sigma$ represents the activation function. The output $o_t$ of the current hidden layer is defined in terms of the weight coefficients $V$ and $c$. The weights can be updated iteratively with the gradient descent algorithm until the required parameters are obtained. However, because of the activation function and the large number of parameters, vanishing and exploding gradients are prone to occur in the learning process of recurrent neural networks, so corresponding measures must be taken to address these problems [18], [19].
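The recurrence described above can be illustrated with a short forward pass. This is a minimal sketch, assuming the common textbook form $s_t = U X_t + W h_{t-1} + b$, $h_t = \sigma(s_t)$, $o_t = V h_t + c$ with $\sigma = \tanh$; the paper's own equations may differ in detail, and the dimensions and random weights below are hypothetical.

```python
import numpy as np


def rnn_forward(xs, U, W, b, V, c, h0=None):
    """Run a simple RNN over a sequence and return all outputs and the last state."""
    h = np.zeros(W.shape[0]) if h0 is None else h0
    outputs = []
    for x in xs:
        s = U @ x + W @ h + b       # internal state s_t
        h = np.tanh(s)              # system (hidden) state h_t
        outputs.append(V @ h + c)   # output o_t of the current hidden layer
    return np.array(outputs), h


# Hypothetical sizes and weights, for illustration only.
rng = np.random.default_rng(0)
n_in, n_hidden, n_out, n_steps = 2, 4, 1, 5
U = rng.normal(scale=0.5, size=(n_hidden, n_in))
W = rng.normal(scale=0.5, size=(n_hidden, n_hidden))
b = np.zeros(n_hidden)
V = rng.normal(scale=0.5, size=(n_out, n_hidden))
c = np.zeros(n_out)

xs = rng.normal(size=(n_steps, n_in))
outputs, h_last = rnn_forward(xs, U, W, b, V, c)
print(outputs.shape, h_last.shape)
```

The same weights `U`, `W`, and `V` are reused at every timestep, which is the parameter sharing mentioned in the text.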

C. ADAPTIVE DYNAMIC PROGRAMMING
Adaptive dynamic programming (ADP) is a novel approach for approximately solving nonlinear optimal control problems [20], [21]. It effectively overcomes the difficulty of solving the Hamilton-Jacobi-Bellman (HJB) equation and the "curse of dimensionality" in dynamic programming (DP) [22], [23]. The HJB equation is a partial differential equation whose solution is a real-valued function that gives the minimum value of the performance function of a specific dynamic system. Taking a continuous-time system as an example, the HJB equation corresponding to the system $F[x(t), u(t)]$ is derived as shown in (10)-(12) [24].
Here, the utility function depends on the system state $x(t)$ and the control input $u(t)$; $\dot{V}^*$ is the derivative of $V^*(x(t), t)$ with respect to time $t$, and $\nabla V^*$ is the gradient of $V^*$ with respect to the state $x$. According to Bellman's principle of optimality, the goal of optimal control is to find an admissible control $u(t)$ that minimizes the performance function and satisfies (12). Equations (10)-(12) show that the HJB equation is difficult to solve. Therefore, the ADP algorithm uses a function approximation structure to approximate the solution of the HJB equation, and then uses offline iteration or online updating to obtain the approximate optimal control solution.
Dual heuristic dynamic programming (DHP) [25], one of the basic structures of ADP, is shown in Fig. 4. In Fig. 4, a control network maps the system state to the control input and outputs the control input; the output of the critic network is used to estimate the gradient of the performance function $V(x(t), t)$. In this paper, the ADP algorithm is applied to the MPPT control strategy of the wind power generation system. The difference in the angular speed of the rotor is used as the input, and the generator torque control value is the output. The specific steps are given in Algorithm 1.

III. DATA-DRIVEN MODEL OF THE WIND POWER GENERATION SYSTEM
This section discusses the establishment of a data-driven model of the wind power generation system based on a recurrent neural network. The existing state values and control values of the system are used as the input-output data of the neural network. The neural network uses these data for learning and training, which yields the trained model. Finally, its stability is analyzed.

Algorithm 1 Dual Heuristic Dynamic Programming
Initialize the critic network $V(x, u \mid \theta^V)$ and the control network $\mu(x \mid \theta^\mu)$ with weights $\theta^V$ and $\theta^\mu$.
for each iteration do
    Select action $u(t)$ according to the current policy
    Execute action $u(t)$ and observe the new state
    Update the critic by minimizing the critic loss
    Update the control by minimizing the loss $E_a = \frac{1}{2}\|u(t) - \arg\min_u y_i\|^2$
end for

A. ESTABLISHMENT OF DATA-DRIVEN MODEL
First, we rewrite the simplified dynamic equation (5) of the wind power transmission system in the standard affine form (13), where $u = T_e$ is the input of the network, and $G(\omega, v)$ and $g$ are given by (14) and (15). Then, the data-driven model of the wind power generation system is constructed by on-line approximation based on a recurrent neural network, and (13) is rewritten in the RNN form (16) [26], where $\varepsilon(t)$ is a bounded neural network approximation error term; $A^*$, $B^*$, $A_u^*$, and $B_u^*$ are the unknown ideal weights; and $g$ is a constant, which can be obtained from (15).
The activation function $f(\cdot) = \tanh(\cdot)$ in (16) is a monotonically increasing function that satisfies (17) for any $x, y \in \mathbb{R}$ with $x \ge y$ and for $k > 0$. Based on (16), the data-driven model of the wind power generation system can be constructed as (18), with $d(t)$ defined in (19), where $\hat{\omega}(t)$ is the estimate of $\omega(t)$; $\hat{A}(t)$, $\hat{B}(t)$, $\hat{A}_u(t)$, and $\hat{B}_u(t)$ are the estimates of the unknown ideal weights at time $t$; $e_m(t) = \omega(t) - \hat{\omega}(t)$ is the modeling error of $\omega(t)$; $S \in \mathbb{R}$ is a given design term; $\hat{\kappa}(t) \in \mathbb{R}$ is an additional tunable parameter, whose role is to increase the flexibility of the neurons and improve the fitting ability; and $\eta > 1$ is a constant.
The following assumption is made [27]. Assumption 1: The upper bound of $\varepsilon(t)$ is defined by a function of $e_m(t)$ in the form (20), where $\kappa^*$ is a bounded constant target value.

B. STABILITY ANALYSIS OF DATA-DRIVEN MODEL
From (16) and (18), the dynamic equation of the modeling error $e_m(t)$ is given by (21). Theorem 1: Update the weights and parameters of the RNN according to the rules in (22)-(26). Then, as $t \to \infty$, the modeling error $e_m(t)$ converges to zero [28].
The update rules of the RNN weights and the additional tunable parameter $\hat{\kappa}$ are formulated in (22)-(26), where $\gamma_i$ is a positive number, $i = 1, 2, 3, 4, 5$. To prevent gradient explosion during the weight update, the weight matrix is normalized after each iteration.
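The normalization step mentioned above can be sketched as follows. The paper does not state which normalization is used, so this example assumes a simple norm rescaling (the weights are scaled back whenever their norm exceeds a bound); the function name `normalize_weights` and the bound `max_norm` are hypothetical.

```python
import numpy as np


def normalize_weights(w, max_norm=1.0):
    """Rescale w so that its Euclidean (Frobenius) norm does not exceed max_norm.

    Assumption: norm rescaling is used here as one plausible way to keep the
    weights bounded after each update; it is not taken from the paper.
    """
    norm = np.linalg.norm(w)
    if norm > max_norm:
        return w * (max_norm / norm)
    return w


w_after_update = np.array([3.0, 4.0])   # hypothetical weights with norm 5
w_bounded = normalize_weights(w_after_update, max_norm=1.0)
print(w_bounded, np.linalg.norm(w_bounded))
```

Weights that are already inside the bound are returned unchanged, so the step only intervenes when an update would otherwise let the weights grow without limit.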
Proof: According to the Lyapunov theorem, the Lyapunov function $J(t)$ is constructed as in (27). Then, the time derivative of $J(t)$ along the trajectory of $e_m(t)$ in (21) is calculated as (28). From (17), we can get (29). According to Assumption 1, we have (30). Therefore, (28) can be rewritten as (31). Computing the time derivative of $J_2(t)$ gives (32). Combining (31) with (32), we obtain (33), where $F = A^* + \frac{1}{2}B^{*2} + 1 + 3\kappa^* + k^2 + S$. $S$ is selected to make $F < 0$. Therefore, it can be concluded that $\dot{J}(t) < 0$. Since $J(t) > 0$, it follows from the second method of Lyapunov that $e_m(t) \to 0$ as $t \to \infty$.
This completes the proof. According to Theorem 1, since $e_m(t) \to 0$ as $t \to \infty$, $\hat{A}(t)$, $\hat{B}(t)$, $\hat{A}_u(t)$, and $\hat{B}_u(t)$ all tend to constant matrices, which are denoted as $A$, $B$, $A_u$, and $B_u$, respectively.
Consequently, the data-driven model of the wind power system can be obtained as (34). The design of the adaptive controller will be based on the data-driven model (34).

IV. DESIGN OF ADAPTIVE CONTROLLER AND ROBUST TERM
A. DESIGN AND ANALYSIS
According to the dynamic characteristics of the wind power generation system, when the wind speed is below the rated wind speed, $\beta$ can be fixed at 0° to obtain $C_{p\max}$. $\omega$ can be adjusted by controlling $T_e$, so that it tracks the optimal angular velocity of the wind rotor $\omega_{opt}$ and achieves maximum power point tracking (MPPT) control. $\omega_{opt}$ can be regarded as the desired angular velocity of the wind rotor $\omega_d$ in MPPT control, which is defined in (35). Combining (34) and (35), assume that the desired trajectory $\omega_d(t)$ has the form (36), where $u_d(t)$ is the desired control input, which corresponds to $T_e$.
By subtracting (36) from (34), we obtain the dynamic equation of the error system (37). It can be seen that the controller $u(t)$ is composed of two parts, namely, the steady-state controller $u_d(t)$ and the error feedback controller $u_e(t)$.
The objective of the optimal feedback controller design is to find the optimal control action for the current state and to stabilize the state tracking error dynamics. In the following, for convenience of description, $e(t)$, $u_d(t)$, $u_e(t)$, $u(t)$, and $f_e(t)$ are abbreviated as $e$, $u_d$, $u_e$, $u$, and $f_e$, respectively.
According to optimal control theory and (10), the infinite horizon performance function of the wind power generation system is transformed into $V(e) = \int_t^\infty r(e(\tau), u_e(\tau))\,d\tau$, where $r(e, u_e) = Qe^2 + R_c u_e^2$ is the utility function, and $R_c$ and $Q$ are positive constants. The optimal feedback controller seeks the optimal admissible error feedback control $u_e^*$ that stabilizes the state tracking error dynamics of the rotor angular velocity and minimizes the performance function $V(e)$ [29]. Then, the Hamiltonian function is defined, where $V_e^* = \partial V^*(e)/\partial e$ and $V^*(e)$ is the minimum value of $V(e)$.
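To make the role of the utility function and of $V_e^*$ concrete, the optimal control problem can be solved in closed form in a simplified special case. The sketch below assumes linear scalar error dynamics $\dot{e} = a e + g u_e$ (a simplification of the paper's nonlinear error system) with utility $Qe^2 + R_c u_e^2$. Under that assumption, $V^*(e) = p e^2$ and $u_e^* = -\frac{1}{2} R_c^{-1} g V_e^*$, where $p$ is the positive root of $Q + 2ap - g^2 p^2 / R_c = 0$. The numerical values are hypothetical.

```python
import math


def scalar_lq_solution(a, g, q, r_c):
    """Positive root p of q + 2*a*p - (g**2 / r_c) * p**2 = 0 and the feedback gain."""
    p = r_c * (a + math.sqrt(a * a + g * g * q / r_c)) / (g * g)
    gain = g * p / r_c   # u_e* = -(1/2) * g * V_e / r_c = -gain * e
    return p, gain


def hjb_residual(a, g, q, r_c, p, e):
    """Left-hand side of the HJB equation for V(e) = p * e**2 at the optimal control."""
    v_e = 2.0 * p * e
    u_star = -0.5 * g * v_e / r_c
    return q * e * e + r_c * u_star * u_star + v_e * (a * e + g * u_star)


# Hypothetical values for illustration only.
a, g, q, r_c = -0.5, 2.0, 1.0, 0.1
p, gain = scalar_lq_solution(a, g, q, r_c)
print(f"p = {p:.4f}, gain = {gain:.4f}, residual = {hjb_residual(a, g, q, r_c, p, 1.3):.2e}")
```

In the nonlinear case treated in the paper, no such closed form is available, which is why the critic and action networks are used to approximate $V(e)$ and $u_e^*$.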
The steady-state controller $u_d$ can be obtained by solving (36). In summary, the complete optimal control input is $u^* = u_d + u_e^*$. Based on the above analysis, the ADP algorithm is applied to the design of the optimal feedback controller. The specific data flow of the controller using the actor-critic structure is shown in Fig. 5. The action network approximates $u_e^*$ and the critic network approximates the performance index $V(e)$. The main steps are as follows:
1) Calculate the optimal angular velocity of the wind rotor $\omega_{opt}(t)$, which is $\omega_d(t)$. Then take the difference between $\omega_d(t)$ and $\omega(t)$ to obtain the error $e(t)$;
2) The action network takes $e(t)$ as input and calculates $u_e(t)$ as its output;
3) The critic network takes $e(t)$ as input and calculates the performance function value $V(e)$ and its derivative $V_e$ with respect to $e$;
4) The critic network uses $r(e, u_e)$ and $u_e(t)$ for learning and training, and iteratively updates the weight parameters of the critic network, $V(e)$, and $V_e$;
5) As the critic network is updated, the action network uses the updated $V_e$ obtained in step 4 for learning and training, and iteratively updates the weight parameters of the action network and $u_e(t)$;
6) When the objective function values of the action network and the critic network fall below the threshold or the number of iterations reaches the maximum, the final $u_e(t)$ is output. The steady-state control input $u_d(t)$ and the robust compensation term $u_r(t)$ are added to obtain the overall control input $u(t)$.
Repeat steps 1 to 6.
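The way the overall control input is assembled in steps 1 to 6 can be sketched as a single control step. This is an illustrative outline only: the three callables (`steady_state`, `feedback`, `robust`) are hypothetical stand-ins for the steady-state controller, the trained action network, and the robust compensation term, and the sign convention $e = \omega - \omega_d$ is an assumption.

```python
def control_step(omega, wind_speed, lambda_opt, radius,
                 steady_state, feedback, robust):
    """Compute the tracking error and the overall input u = u_d + u_e + u_r."""
    omega_d = lambda_opt * wind_speed / radius   # step 1: desired rotor speed
    e = omega - omega_d                          # tracking error (assumed sign)
    u_d = steady_state(omega_d, wind_speed)      # steady-state control input
    u_e = feedback(e)                            # step 2: action network output
    u_r = robust(e)                              # robust compensation term
    return e, u_d + u_e + u_r                    # step 6: overall control input


# Hypothetical placeholder controllers, for illustration only.
e, u = control_step(
    omega=2.5, wind_speed=7.0, lambda_opt=8.0, radius=20.0,
    steady_state=lambda omega_d, v: 10.0,
    feedback=lambda err: -4.0 * err,
    robust=lambda err: -0.5 * err,
)
print(f"e = {e:.3f}, u = {u:.3f}")
```

In the actual method, `feedback` would be the action network whose weights are updated together with the critic network in steps 3 to 5.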
The optimal feedback controller is designed using the ADP method, which is implemented with a critic network and an action network. The design of the two neural networks is given in parts B and C.

B. CRITIC NETWORK DESIGN
In the design of the critic network, we use a neural network to approximate the performance index $V(e)$ as $V(e) = W_c^T\phi_c(e) + \varepsilon_c$, where $W_c$ is the unknown ideal constant weight vector of the critic network, $\phi_c(e): \mathbb{R} \to \mathbb{R}^{N_c}$ is the critic network activation function vector, $N_c$ is the number of neurons in the hidden layer, and $\varepsilon_c$ is the bounded approximation error of the critic network. The derivative of the performance index function $V(e)$ with respect to $e$ is $V_e = \nabla\phi_c^T W_c + \nabla\varepsilon_c$, where $\nabla\phi_c = \partial\phi_c(e)/\partial e$ and $\nabla\varepsilon_c = \partial\varepsilon_c/\partial e$. Define the estimate of $W_c$ as $\hat{W}_c$; then the estimate of $V(e)$ is $\hat{V}(e) = \hat{W}_c^T\phi_c(e)$. Combining with (39), the approximate Hamiltonian function can be derived, and the objective function of the critic network is then defined. The weight update law for the critic network is a gradient descent algorithm, in which $l_c > 0$ is the adaptive learning rate of the critic network, $\sigma_c = \sigma/(\sigma^T\sigma + 1)$, and $\sigma = \nabla\phi_c (e + B f_e + g u_e)$. From the definition of $\sigma_c$, there exists a positive constant $\sigma_{cM} > 1$ such that $\|\sigma_c\| \le \sigma_{cM}$.
Here, $\varepsilon_{HJB}$ is the residual error due to the neural network approximation. Define the weight estimation error of the critic network as $\tilde{W}_c = \hat{W}_c - W_c$. Rewriting (46) by using (47) then gives the dynamics of $\tilde{W}_c$.
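The critic update can be illustrated with one discretized gradient step. This is a sketch under stated assumptions, not the paper's exact law: it assumes the approximate Hamiltonian residual has the form $e_c = r(e, u_e) + \hat{W}_c^T\sigma$, that the continuous-time law is $\dot{\hat{W}}_c = -l_c \sigma_c e_c$ with $\sigma_c = \sigma/(\sigma^T\sigma + 1)$, and that it is integrated with an Euler step of size `dt`. All numerical values are hypothetical.

```python
import numpy as np


def critic_update(w_hat, sigma, utility, l_c, dt):
    """One Euler step of a normalized gradient descent law for the critic weights."""
    e_c = utility + w_hat @ sigma               # approximate Hamiltonian residual
    sigma_c = sigma / (sigma @ sigma + 1.0)     # normalized regressor
    w_next = w_hat - dt * l_c * sigma_c * e_c   # descend on E_c = 0.5 * e_c**2
    return w_next, e_c


# Hypothetical values for illustration only.
w_hat = np.zeros(3)
sigma = np.array([1.0, -2.0, 0.5])
utility = 2.0

w_next, e_before = critic_update(w_hat, sigma, utility, l_c=0.5, dt=0.1)
_, e_after = critic_update(w_next, sigma, utility, l_c=0.5, dt=0.1)
print(f"residual before = {e_before:.4f}, after = {e_after:.4f}")
```

The normalization by $\sigma^T\sigma + 1$ keeps the effective step bounded regardless of the size of $\sigma$, so the residual shrinks by a factor between 0 and 1 at each step when `dt * l_c` is at most 1.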

C. ACTION NETWORK DESIGN
Similar to the design of the critic network, the action network is used to approximate $u_e$, which is defined in (49) as $u_e = W_a^T\phi_a(e) + \varepsilon_a$, where $W_a$ is the unknown ideal constant weight vector of the action network, $\phi_a(e): \mathbb{R} \to \mathbb{R}^{N_a}$ is the action network activation function vector, $N_a$ is the number of neurons in the hidden layer, and $\varepsilon_a$ is the bounded approximation error of the action network. Define the estimate of $W_a$ as $\hat{W}_a$; the actual output can then be expressed as $\hat{u}_e = \hat{W}_a^T\phi_a(e)$ (50). The feedback error signal used for tuning the action NN is defined as the difference between the optimal feedback control input $u_e^*$ and the actual output of the action network $\hat{u}_e$, combined with the control input that minimizes (43). The objective function $E_a(\hat{W}_a)$ of the action network is then defined. The ultimate objective of the action network is to drive $E_a(\hat{W}_a)$ to zero, and the weight update rules of the action network are defined on this basis. The weight update law for the action network is a gradient descent algorithm, given by (53), where $l_a > 0$ is the adaptive learning rate of the action network. Combining (40), (41), and (49), we obtain (54). Define the weight estimation error of the action network as $\tilde{W}_a = \hat{W}_a - W_a$. Combining (53) with (54), we obtain (55), where $\varepsilon_{ca} = -(\varepsilon_a + R_c^{-1} g \nabla\varepsilon_c / 2)$. It is important to note that, to ensure that $\hat{W}_a$ and $\hat{W}_c$ converge to $W_a$ and $W_c$, respectively, the tracking error must satisfy the persistent excitation condition [29]. Further, the persistent excitation condition ensures $\|\sigma_c\| \ge \sigma_{cm}$ and $\|\phi_a\| \ge \phi_{am}$, with $\sigma_{cm}$ and $\phi_{am}$ being positive constants.
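The action network update can be sketched in the same way as the critic update. This example assumes that the error signal is $e_a = \hat{W}_a^T\phi_a + \frac{1}{2}R_c^{-1} g \nabla\phi_c^T\hat{W}_c$, i.e., the action output minus the control implied by the current critic estimate, and that the law $\dot{\hat{W}}_a = -l_a \phi_a e_a$ is integrated with an Euler step. These forms are common in actor-critic ADP designs but are assumptions here, and all numerical values are hypothetical.

```python
import numpy as np


def action_update(w_a_hat, phi_a, w_c_hat, grad_phi_c, g, r_c, l_a, dt):
    """One Euler step of a gradient descent law for the action network weights."""
    u_hat = w_a_hat @ phi_a                                   # action network output
    u_from_critic = -0.5 * g * (grad_phi_c @ w_c_hat) / r_c   # control implied by critic
    e_a = u_hat - u_from_critic                               # action error signal
    w_next = w_a_hat - dt * l_a * phi_a * e_a                 # descend on 0.5 * e_a**2
    return w_next, e_a


# Hypothetical values for illustration only.
phi_a = np.array([0.5, -0.2])
w_a_hat = np.zeros(2)
w_c_hat = np.array([1.0, 0.5, 0.0])
grad_phi_c = np.array([2.0, 1.0, 0.0])

w_a_next, e_a_before = action_update(w_a_hat, phi_a, w_c_hat, grad_phi_c,
                                     g=2.0, r_c=0.1, l_a=1.0, dt=0.1)
_, e_a_after = action_update(w_a_next, phi_a, w_c_hat, grad_phi_c,
                             g=2.0, r_c=0.1, l_a=1.0, dt=0.1)
print(f"action error before = {e_a_before:.4f}, after = {e_a_after:.4f}")
```

With the critic weights held fixed, each step moves the action output toward the control implied by the critic, which is the coupling between the two networks described in steps 4 and 5 of the design procedure.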

D. CONTROLLER STABILITY ANALYSIS
After the design and analysis of the controller, the final overall control input is written as (56). Substituting (50) into (37), (37) is rewritten as (57). Combining with (49), and subtracting and adding $W_a^T\phi_a(e)$ in (57), (57) is rewritten as (58). In the following, the stability analysis is performed. First, the following assumptions are made [28], [30].
Assumption 2:
1) $W_c$ and $W_a$ are upper bounded so that $\|W_c\| \le W_{cM}$ and $\|W_a\| \le W_{aM}$, respectively;
2) $\varepsilon_c$ and $\varepsilon_a$ are upper bounded so that $\|\varepsilon_c\| \le \varepsilon_{cM}$ and $\|\varepsilon_a\| \le \varepsilon_{aM}$, respectively;
3) $\phi_c(\cdot)$ and $\phi_a(\cdot)$ are upper bounded so that $\|\phi_c(\cdot)\| \le \phi_{cM}$ and $\|\phi_a(\cdot)\| \le \phi_{aM}$, respectively;
4) The gradients of the critic network approximation error and of the activation function vector are upper bounded so that $\|\nabla\varepsilon_c\| \le \varepsilon_{cM}$ and $\|\nabla\phi_c\| \le \phi_{dM}$, respectively, and the residual error is upper bounded so that $\|\varepsilon_{HJB}\| \le \varepsilon_{HJBM}$.
Based on the current problem settings and controller design, the above assumptions can be reasonably satisfied. With these assumptions, the stability can be proved by choosing a Lyapunov function. The stability proof of the controller with the robust term is similar to the stability proof in Section III. Choosing the same Lyapunov function and using (68) and (69), we obtain (72). With the same parameters selected in (72) as in (60), we have $\dot{L}(t) \le 0$. Because the variables on the right-hand side of (71) are bounded, $\dot{e}$ is also bounded. From (72), we obtain (73). Integrating both sides of (73) and rearranging according to the mean value theorem, we obtain (74); since the right side of (71) is bounded, $e \in L_2$. According to Barbalat's lemma [31], we have $\lim_{t\to\infty} e = 0$. Similarly, we can prove that $\lim_{t\to\infty} \tilde{W}_c = 0$, $\lim_{t\to\infty} \tilde{W}_a = 0$, and $\|u_{ad} - u^*\| \le \varepsilon_{aM}$.

V. SIMULATION EXPERIMENT
A. ENVIRONMENT AND PARAMETERS
The simulation environment of the experiment is shown in Table 1. We simulate the wind power generation system in MATLAB and record the system state data and control data during the simulation [32]. The data are then preprocessed, and the preprocessed data are used as the input-output data for learning the data-driven model. Finally, we obtain the completed data-driven model of the wind power generation system. This paper uses the parameters in Table 2 to simulate the wind power generation system. Based on the data-driven model, we use the tip speed ratio method as the MPPT control strategy for the wind power generation system. Then, a simulation experiment is carried out on the DRAC control method.

B. EXPERIMENT
The simulation uses the four-component combined wind speed model to simulate the actual wind speed variation [33], [34]; its parameters are listed in Table 3.
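A four-component wind speed signal of this kind can be sketched as the sum of a basic wind, a gust, a gradient (ramp) wind, and a random wind. The component shapes below follow a commonly used formulation (a one-minus-cosine gust and a linear ramp) and are assumptions; the function name, the choice to let the ramp return to zero after its window, and all parameter values are hypothetical and are not the values of Table 3.

```python
import math
import random


def combined_wind_speed(t, base=7.0,
                        gust_max=2.0, gust_start=5.0, gust_len=10.0,
                        ramp_max=1.5, ramp_start=20.0, ramp_end=30.0,
                        noise_std=0.2, rng=None):
    """Wind speed at time t as basic wind + gust + gradient wind + random wind."""
    gust = 0.0
    if gust_start <= t <= gust_start + gust_len:
        # One-minus-cosine gust that peaks at gust_max in the middle of the window.
        phase = 2.0 * math.pi * (t - gust_start) / gust_len
        gust = 0.5 * gust_max * (1.0 - math.cos(phase))

    ramp = 0.0
    if ramp_start <= t <= ramp_end:
        # Linear ramp from 0 to ramp_max over the window (assumed to vanish afterwards).
        ramp = ramp_max * (t - ramp_start) / (ramp_end - ramp_start)

    # Random wind: zero-mean Gaussian noise, omitted when no generator is given.
    noise = rng.gauss(0.0, noise_std) if rng is not None else 0.0
    return base + gust + ramp + noise


rng = random.Random(0)
profile = [combined_wind_speed(0.5 * i, rng=rng) for i in range(80)]
print(f"min = {min(profile):.2f} m/s, max = {max(profile):.2f} m/s")
```

Passing `rng=None` removes the random component, which makes the deterministic parts of the profile easy to inspect separately.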
We use the parameter values in Table 3 to simulate the wind speed of the wind farm, and the variation curve of the simulated wind speed signal is shown in Fig. 6. It covers basic wind, gust, gradient wind, and random wind, and it conforms to the distribution law of actual wind speed. In the actual combined wind speed environment, the maximum power tracking effect of the DRAC control method is compared with the traditional PID control method and the particle swarm optimization PID (PSOPID) control method [35]. The control input of PID and PSOPID is the same as that of DRAC, namely the electromagnetic torque $T_e$ of the generator. The fitness function $Q_{PSO}$ of PSOPID is defined in (75); when $Q_{PSO} \le 0.5$, the algorithm stops iterating. The tracking error curve of the rotor angular velocity under DRAC in a sinusoidal wind speed environment with an average wind speed of 7 m/s is shown in Fig. 7. Fig. 7 shows that the tracking error converges and that the convergence is fast, which is consistent with the controller stability proof. The angular velocity tracking error curves of DRAC, PID, and PSOPID under the actual combined wind speed environment are compared in Fig. 8. Figs. 7 and 8 show that the DRAC control method has a similar error convergence rate in the sinusoidal wind speed and the combined actual wind speed, and that convergence does not become more difficult with the increased randomness and variability of the wind speed. This indicates that the control effect and tracking response speed of this control method have no obvious relationship with the variation pattern of the wind speed, and that the method has strong robustness.
Fig. 8 shows that although PSOPID has a smaller variance in the tracking error of the rotor angular velocity than the traditional PID, the tracking error of DRAC is much smaller than that of both PID and PSOPID.
The maximum power tracking curves of the different algorithms are shown in Fig. 9. To show the curves more clearly, Fig. 9 is partially enlarged in Fig. 10. Figs. 9 and 10 show that the power curve of the DRAC control method almost coincides with the maximum power curve, which indicates that its tracking effect is the best. At the turning points where the wind speed changes suddenly, the maximum power tracking error of DRAC is small, indicating that the control response is fast. Compared with DRAC, the tracking effect of PSOPID is poorer, and its control response is slow, especially at the turning points, but its tracking effect is better than that of the traditional PID method. From the above analysis, all three methods can effectively track the maximum power, but the DRAC control method proposed in this paper has the smallest error and a fast tracking response, is relatively more stable, and tracks the maximum power better.
With the rapid growth of computing power, the algorithmic complexity of a control algorithm is no longer the primary consideration; the control performance of the algorithm is the primary concern. Therefore, the aerodynamic efficiency performance index $\xi_{aero}$ is used to compare the performance of the DRAC method with other control methods [14]. $\xi_{aero}$ reflects the effect of tracking the maximum power, and its expression is shown in (76), where $P_{opt} = \frac{1}{2}\rho\pi R^2 v^3 C_{p\max}$.
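The index can be computed from sampled power curves as in the following sketch. It assumes the commonly used definition of aerodynamic efficiency, namely the ratio of the captured energy to the optimal energy over the simulation horizon, expressed as a percentage; the exact form of (76) may differ. The sample data are hypothetical.

```python
def trapezoid(samples, dt):
    """Trapezoidal integral of uniformly sampled values with step dt."""
    return dt * (sum(samples) - 0.5 * (samples[0] + samples[-1]))


def aerodynamic_efficiency(p_captured, p_optimal, dt):
    """xi_aero in percent: captured energy divided by optimal energy (assumed form)."""
    return 100.0 * trapezoid(p_captured, dt) / trapezoid(p_optimal, dt)


# Hypothetical sampled power curves in watts, sampled every 0.1 s.
p_opt_samples = [100e3] * 11
p_m_samples = [95e3] * 11
xi_aero = aerodynamic_efficiency(p_m_samples, p_opt_samples, dt=0.1)
print(f"xi_aero = {xi_aero:.1f} %")
```

A value close to 100% means that the captured power stays close to $P_{opt}$ over the whole run, which is how the methods are compared in Table 4.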
The data in Table 4 show that the maximum power tracking effect of DRAC is improved by nearly 13% compared with PID, by nearly 1.5% compared with PSOPID, by more than 10% compared with NSSFE, and by about 7% compared with NDSFE. DRAC and NNC have similar effects, and their $\xi_{aero}$ values are both close to 100%, which indicates that their wind energy capture efficiency is high. This is conducive to improving wind energy utilization and to effectively tracking the maximum power.
The simulation experiments in this section lead to the following conclusion: the control method proposed in this paper achieves the desired effect in an environment with randomly varying wind speed, has strong robustness, and tracks the maximum power better.

VI. CONCLUSION
In this paper, based on the nonlinear equations of the wind power generation system and the adaptive dynamic programming algorithm, combined with a recurrent neural network, a data-driven robust adaptive control (DRAC) method for maximum power point tracking of the wind power generation system is proposed. Simulation results show that DRAC has a better tracking effect and improves the efficiency of wind energy conversion. At the same time, because it is driven by a data model, this method only requires the input-output data of the wind power generation system, and the requirements for knowledge of the internal dynamics are low.
However, DRAC has limitations and is currently only applicable to wind speeds below the rated wind speed. In future research, we will try to combine the latest machine learning techniques and propose an MPPT control method that can be applied above the rated wind speed. We will also address the shortcomings of this method, which requires a sizeable amount of computation and a long time to train the neural network. We will consider more wind power systems and integrate machine learning with wind power systems to improve control performance.