Ultra-Short-Term Wind Power Prediction by Salp Swarm Algorithm-Based Optimizing Extreme Learning Machine

Wind power generation accounts for an increasing proportion of the power grid, so efficient and accurate real-time wind power prediction is particularly important for grid-connected wind power. In view of the strong randomness and fluctuation of wind and the difficulty of predicting wind power, an ultra-short-term wind power prediction model based on the Salp Swarm Algorithm-Extreme Learning Machine (SSA-ELM) is proposed. The multi-input sample set is composed of historical wind speed, temperature, wind direction, atmospheric pressure and other climatic factors that are highly correlated with wind power, and the network parameters are determined in the training process. To improve the adaptability and accuracy of the prediction model, the input weight matrix and hidden layer biases of the Extreme Learning Machine (ELM) are optimized through the exploration and exploitation of the Salp Swarm Algorithm in the iterative process. Finally, a simulation experiment is conducted with the actual data of a wind farm in Henan Province. Comparison with the traditional Extreme Learning Machine, Particle Swarm Optimization Extreme Learning Machine (PSO-ELM) and Back Propagation (BP) neural network models shows that the new method avoids falling into local extrema and has faster convergence and higher prediction accuracy.


I. INTRODUCTION
Wind power, which is environmentally friendly and abundant in nature, plays an important role in the development of human society. Since wind is formed by the uneven flow of air, its inherent randomness and volatility cause large fluctuations in the output power of a wind farm when it is connected to the grid. If the wind power is accurately predicted, the impact of wind power fluctuation on the power system can be reduced, which helps to realize real-time balance of the power system and avoid large-scale blackouts [1]. Literature [31] is a wind power grid-connected dispatching management standard issued by the China Energy Administration, according to which real-time forecasting requires a grid-connected wind farm to report, every 15 minutes, wind power prediction data together with wind speed and other meteorological data for the coming 15 min to 4 h. To meet this requirement, the wind farm must adopt effective ultra-short-term wind power prediction. The literature shows that the prediction error gradually increases with the prediction horizon. For example, the margin of error for current wind power forecasts at wind farms is usually 25% to 40%, sometimes more [34], [35]. To avoid sharp rises or falls of wind power in a short time, which bring sudden harm to grid-connected dispatching, and to increase the power system's ability to accept wind power, improving the accuracy of ultra-short-term wind power prediction is particularly important.
The associate editor coordinating the review of this manuscript and approving it for publication was Ravindra Singh.
At present, a lot of research has been conducted on ultra-short-term wind power prediction around the world. In early years, Single-hidden Layer Feedforward Networks (SLFNs) [2], the wavelet decomposition algorithm [3], [4], empirical mode decomposition [5] and other algorithms combined with traditional neural networks (such as the BP neural network and the Support Vector Machine) were used to process and predict wind power time series data. However, most of these methods have complex parameters and weak generalization ability, so it is difficult to obtain the optimal prediction effect. In recent years, many optimization algorithms have been widely applied to short-term and ultra-short-term wind power prediction, among which Genetic Algorithm (GA) [6] and Particle Swarm Optimization (PSO) [7], [8] are typical. Both optimize the neural network parameters and avoid a complicated network setup process, but both suffer from over-fitting and easily fall into local optimal solutions. In recent literature, scholars have proposed various solutions to the problems common to optimization algorithms. For example, the historical data of wind farms are processed, and the processed data are taken as the input of the neural network to improve the prediction accuracy [9], [10]. In [11], the Least Squares Support Vector Machine (LSSVM) model, which has strong generalization and learning ability, is combined with Variational Mode Decomposition (VMD) to decompose the wind speed series and with the Bat Algorithm (BA) to optimize parameters, so that wind speed subseries of different frequencies can be predicted. As another example, the neural network's own parameters are improved and optimized, and the extracted characteristic values of wind power are used as the training set to improve the prediction ability [12].
The Extreme Learning Machine (ELM) is a kind of single hidden layer feedforward neural network, which has the advantages of strong generalization ability, simple structure and easy training. However, during training, ELM over-fits when there are too many parameters, and it is easily disturbed by outliers in the sample data, which affects the prediction accuracy of the model. In [13], the authors proposed to use the Regularized Extreme Learning Machine (RELM) to predict short-term wind speed. Compared with ELM, RELM considers the structural error while solving the least squares error, effectively avoiding the over-fitting problem caused by an excessive number of hidden layer nodes and improving the prediction accuracy. In [14], the dynamic inertia weight method is used to improve the Bat Algorithm (BA), which is then used to optimize the Kernel Extreme Learning Machine (KELM) to enhance data processing, speed up convergence, and avoid the random selection of parameter nodes in ELM. Literature [15] uses Principal Component Analysis (PCA) to screen the wind power data, eliminating redundant components, and then uses ELM to predict wind power from the processed data to verify its effectiveness.
The Salp Swarm Algorithm (SSA) is a heuristic algorithm developed in recent years, inspired by the swarming behavior of salps in the ocean [16]. Since its introduction, SSA has proven effective in various applications. In [17], SSA was applied to the feature selection problem: a transfer function was used to convert SSA into a binary version in order to maximize the classification accuracy and extract the optimal feature set. Literature [18] applied SSA in electrical engineering, using it to optimize the sizing of a Complementary Metal-Oxide-Semiconductor (CMOS) differential amplifier and comparator circuit. The experimental results show that the SSA-based CMOS analog integrated circuit design has better performance.
This paper applies SSA to ELM and proposes an SSA-ELM based ultra-short-term wind power prediction model. First, SSA is used to optimize the input connection weights and hidden layer biases of the single hidden layer feedforward neural network of ELM, which improves the generalization ability of ELM and avoids its over-fitting problem. Second, based on the SSA-ELM network, an ultra-short-term wind power prediction model is established and trained with the historical data of a wind farm. Finally, the proposed prediction model is verified to be highly accurate by comparison with the traditional Extreme Learning Machine, the Particle Swarm Optimization Extreme Learning Machine (PSO-ELM) and the BP neural network model.

II. EXTREME LEARNING MACHINE AND SALP SWARM ALGORITHM
A. EXTREME LEARNING MACHINE
The Extreme Learning Machine (ELM) is an algorithm proposed by Huang for Single-hidden Layer Feedforward Networks (SLFNs) [19]. Unlike a traditional neural network, ELM randomly assigns the connection weights between the input layer and the hidden layer, and analytically determines the connection weights between the hidden layer and the output layer to obtain a globally optimal solution. The network structure of ELM is shown in Fig. 1.
Given training samples $(x_j, t_j)$ with inputs $x_j \in R^n$ and targets $t_j \in R^m$, let $N$ be the quantity of hidden layer neurons and $g(x)$ be the activation function. The standard SLFNs output function can then be expressed as

$$o_j = \sum_{i=1}^{N} \beta_i \, g(\omega_i \cdot x_j + b_i), \quad j = 1, \ldots, N \tag{1}$$

where $b_i$ is the bias value of the i-th hidden neuron, $\omega_i = [\omega_{i1}, \omega_{i2}, \ldots, \omega_{in}]^T$ is the weight vector that connects the i-th hidden neuron with the input neurons, and $\beta_i = [\beta_{i1}, \beta_{i2}, \ldots, \beta_{im}]^T$ is the weight vector connecting the i-th hidden neuron with the output neurons [21]. When the quantity of neurons in the hidden layer is equal to that of samples in the training set, for randomly selected $\omega$ and $b$, the SLFNs with $N$ hidden neurons and activation function $g(x)$ can approximate the training samples with zero error, i.e.,

$$\sum_{j=1}^{N} \| o_j - t_j \| = 0 \tag{2}$$

Then the ELM output function with $N$ random samples can be obtained as

$$\sum_{i=1}^{N} \beta_i \, g(\omega_i \cdot x_j + b_i) = t_j, \quad j = 1, \ldots, N \tag{3}$$

which can be rewritten as

$$H\beta = T \tag{4}$$

where $H$ is the hidden layer output matrix of ELM, which can be expressed as

$$H = \begin{bmatrix} g(\omega_1 \cdot x_1 + b_1) & \cdots & g(\omega_N \cdot x_1 + b_N) \\ \vdots & \ddots & \vdots \\ g(\omega_1 \cdot x_N + b_1) & \cdots & g(\omega_N \cdot x_N + b_N) \end{bmatrix} \tag{5}$$

Therefore, the learning process is converted into finding the least squares solution of the linear system (4), which means that the connection weight $\beta$ between the hidden layer and the output layer can be obtained by solving

$$\min_{\beta} \| H\beta - T \| \tag{6}$$

whose solution can be expressed as

$$\hat{\beta} = H^{+} T \tag{7}$$

where $H^{+}$ is the Moore-Penrose pseudo inverse of the hidden layer output matrix $H$. To sum up, the main steps of the ELM learning algorithm are as follows:
1) Determine the quantity of neurons in the hidden layer. According to the empirical formula mentioned in the literature and the corresponding test data [1], [36], the quantity of hidden layer neurons in this paper is set to 30.
2) Randomly select the connection weight $\omega$ between the input layer and the hidden layer and the bias value $b$ of the hidden layer neurons.
3) Select an infinitely differentiable function as the activation function of the hidden layer, and then calculate the hidden layer output matrix $H$ and its Moore-Penrose pseudo inverse $H^{+}$.
4) Calculate the output layer weights $\hat{\beta}$.
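The four ELM steps can be illustrated with a minimal Python sketch. It is not the MATLAB implementation used in this paper: the sigmoid activation, the uniform range [-1, 1] for the random weights, the synthetic data, and the names `elm_train` and `elm_predict` are all assumptions made for illustration.

```python
import numpy as np

def sigmoid(z):
    """Infinitely differentiable activation g(x) (assumed choice)."""
    return 1.0 / (1.0 + np.exp(-z))

def elm_train(X, T, n_hidden=30, rng=None):
    """Train an ELM: random input weights and biases, analytic output weights."""
    rng = np.random.default_rng(rng)
    n_inputs = X.shape[1]
    # Step 2: random input weights and hidden biases (range is an assumption).
    W = rng.uniform(-1.0, 1.0, size=(n_inputs, n_hidden))
    b = rng.uniform(-1.0, 1.0, size=n_hidden)
    # Step 3: hidden layer output matrix H.
    H = sigmoid(X @ W + b)
    # Step 4: least squares output weights via the Moore-Penrose pseudo inverse.
    beta = np.linalg.pinv(H) @ T
    return W, b, beta

def elm_predict(X, W, b, beta):
    """Evaluate the trained network on new inputs."""
    return sigmoid(X @ W + b) @ beta

# Synthetic stand-in for a four-feature input set (not wind farm data).
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(200, 4))
T = np.sin(X.sum(axis=1, keepdims=True))
W, b, beta = elm_train(X, T, n_hidden=30, rng=1)
train_rmse = float(np.sqrt(np.mean((elm_predict(X, W, b, beta) - T) ** 2)))
```

Because only the output weights are solved for, training reduces to one pseudo-inverse computation, which is why ELM trains quickly.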
ELM has the advantages of fast learning speed and good generalization performance. It is suitable not only for regression and fitting problems, but also for classification, pattern recognition and other fields. Meanwhile, researchers at home and abroad have continuously proposed optimization methods and strategies for ELM, so that its performance has been greatly improved and its scope of application has steadily widened.

B. SALP SWARM ALGORITHM
Salps are small, gelatinous, open-ocean chordates with translucent bodies. The salps that live in groups in the ocean are only about the size of a human thumb. Billions of them gather to form a chain structure, the salp chain, and move by sucking in seawater from the front and expelling it from the back. Researchers believe that salps move and forage through this chain structure.
The Salp Swarm Algorithm (SSA) is a heuristic algorithm proposed by Mirjalili et al. in 2017 based on the group behavior of salps in nature. The group behavior of salps differs from that of wolves, fish and birds, which keep a ''group'' distribution in which one leader guides the group and the other individuals obey the leader absolutely, constantly updating their positions as the leader moves. Salps are instead distributed in a ''chain'' mode, where the leader is at the front of the chain and the followers follow each other closely for chain food capture and movement [16].
Because SSA is iterative, it generates a set of random salp individuals within the bounds of the problem and evolves them over the iterations. In the mathematical model of the salp swarm algorithm, all salps update their positions after the leader and followers of the salp chain have been identified. The leader moves in the direction of the food source (F), and the followers follow each other in the direction of the leader. Fig. 2 shows the salp chain.
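Suppose that the population of salps $X$ consists of $N$ agents with $d$ dimensions. Hence, it can be expressed as follows:

$$X = \begin{bmatrix} x_1^1 & x_2^1 & \cdots & x_d^1 \\ x_1^2 & x_2^2 & \cdots & x_d^2 \\ \vdots & \vdots & \ddots & \vdots \\ x_1^N & x_2^N & \cdots & x_d^N \end{bmatrix} \tag{8}$$

In SSA, the position of the leader changes with the food source (F), and can be calculated by

$$x_j^1 = \begin{cases} F_j + c_1\left((ub_j - lb_j)c_2 + lb_j\right), & c_3 \ge 0.5 \\ F_j - c_1\left((ub_j - lb_j)c_2 + lb_j\right), & c_3 < 0.5 \end{cases} \tag{9}$$

where $x_j^1$ stands for the position of the leader in the j-th dimension, which changes only according to the position of the food source, $F_j$ is the position of the food source in the j-th dimension, and $ub_j$ and $lb_j$ are the upper and lower bounds of the j-th dimension respectively. $c_2$ and $c_3$ are random values in the interval [0,1], which determine the step size and whether the leader's next position in the j-th dimension moves toward positive or negative infinity (the switching threshold of 0.5 on $c_3$ is the value commonly used in SSA implementations). $c_1$ is the main parameter for balancing exploration and exploitation in SSA, and its expression is [21]:

$$c_1 = 2e^{-\left(\frac{4k}{K_{max}}\right)^2} \tag{10}$$

where $k$ is the current iteration number and $K_{max}$ is the maximum number of iterations. As the number of iterations increases, the value of $c_1$ decreases. Therefore, the initial stage of the optimization emphasizes diversification (exploration), while the final stage emphasizes refinement around the food source (exploitation).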
The position update of followers can be calculated by using the following equation (Newton's law of motion):

$$x_j^i = \frac{1}{2} a t^2 + v_0 t \tag{11}$$

where $i \ge 2$, $x_j^i$ is the position of the i-th follower in the j-th dimension, $t$ is time, $v_0$ is the initial velocity and $a$ is the acceleration. Since time in the optimization corresponds to iterations, the difference between successive iterations is 1, and with $v_0 = 0$ the equation can be expressed as follows:

$$x_j^i = \frac{1}{2}\left(x_j^i + x_j^{i-1}\right) \tag{12}$$

First, all the salps are randomly initialized in the search space and evaluated, and the most suitable salp is selected as the food source F, giving the chain shown in Fig. 2. Second, the variable $c_1$ is adjusted by equation (10), and the positions of the leader and followers are updated by equations (9) and (12). Finally, all steps except initialization are repeated until the maximum number of iterations is reached or the food is captured, that is, until the global optimal solution is obtained.
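The procedure above can be sketched in a few lines of Python. This is an illustrative sketch rather than the code used in this paper: it assumes a single leader (the first salp, as in Section II), clamps positions to the bounds after each move, uses 0.5 as the threshold on c3, and is demonstrated on a simple sphere function. The names `salp_swarm_minimize` and `sphere` are hypothetical.

```python
import math
import random

def salp_swarm_minimize(objective, lb, ub, n_salps=30, max_iter=200, seed=0):
    """Minimize `objective` over the box [lb, ub] with a basic SSA loop."""
    rng = random.Random(seed)
    dim = len(lb)
    # Random initial population inside the bounds.
    salps = [[rng.uniform(lb[j], ub[j]) for j in range(dim)]
             for _ in range(n_salps)]
    fitness = [objective(s) for s in salps]
    best = min(range(n_salps), key=fitness.__getitem__)
    food, food_fit = list(salps[best]), fitness[best]  # food source F

    for k in range(1, max_iter + 1):
        # Eq. (10): c1 decays as the iteration count grows.
        c1 = 2.0 * math.exp(-((4.0 * k / max_iter) ** 2))
        for i in range(n_salps):
            for j in range(dim):
                if i == 0:
                    # Eq. (9): the leader moves around the food source.
                    c2, c3 = rng.random(), rng.random()
                    step = c1 * ((ub[j] - lb[j]) * c2 + lb[j])
                    salps[i][j] = food[j] + step if c3 >= 0.5 else food[j] - step
                else:
                    # Eq. (12): a follower moves halfway to its predecessor.
                    salps[i][j] = 0.5 * (salps[i][j] + salps[i - 1][j])
                # Keep the salp inside the bounds (assumed handling).
                salps[i][j] = min(max(salps[i][j], lb[j]), ub[j])
            fit = objective(salps[i])
            if fit < food_fit:
                food, food_fit = list(salps[i]), fit
    return food, food_fit

def sphere(x):
    """Toy objective with its minimum of 0 at the origin."""
    return sum(v * v for v in x)

best_pos, best_fit = salp_swarm_minimize(sphere, [-5.0, -5.0], [5.0, 5.0])
```

The food source is only replaced when a salp finds a strictly better fitness, so the best value never worsens from one iteration to the next.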

III. SALP SWARM OPTIMIZATION EXTREME LEARNING MACHINE
The output layer weight matrix of the ELM is obtained from the pseudo inverse of the hidden layer output matrix, and an excessive number of nodes in the hidden layer can lead to over-fitting. Moreover, the input weights and the hidden layer bias values of the ELM are randomly generated, so different random draws may yield different output layer weight matrices. The salp swarm algorithm optimizes the input weights and hidden layer bias values of the ELM to avoid the deviation caused by their random selection. The global optimal solution is then obtained through continuous updating and optimization.
This section describes the design and training process of the SSA-ELM algorithm, mainly following [21]. In SSA-ELM, the SSA operators are used to optimize the ELM network, and each salp's position represents a candidate ELM network. To achieve this representation, each salp stores the network parameters to be optimized, i.e. the connection weights ω between the input layer and the hidden layer and the bias values b of the hidden layer neurons. Therefore, the length of each salp can be calculated as L = I × N + N, where I is the quantity of input variables and N is the quantity of hidden layer neurons. The structural design of SSA-ELM is shown in Fig. 3.
3) [...] is the location of food source F. Half of the remaining salps are leaders and half are followers.
4) Update the location. Salp chain leaders and followers are updated according to equations (9) and (12), and the corresponding fitness value of each salp is updated. The fitness value obtained in this iteration is compared with the optimal fitness value obtained in the previous iteration, and the global optimal fitness value is updated, that is, the position of the food source F is updated.
5) Repeat steps 3 and 4 until the global optimal solution is output.
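The salp encoding of length L = I × N + N can be made concrete with a short Python sketch. It assumes that the weights are stored first and the biases last, that the activation is a sigmoid, and that the fitness is the training RMSE; the names `salp_length`, `decode_salp` and `elm_fitness` are hypothetical and do not come from [21].

```python
import numpy as np

def salp_length(n_inputs, n_hidden):
    """L = I * N + N: all input weights plus one bias per hidden neuron."""
    return n_inputs * n_hidden + n_hidden

def decode_salp(position, n_inputs, n_hidden):
    """Split a flat salp position into the ELM input weights and biases."""
    position = np.asarray(position, dtype=float)
    if position.size != salp_length(n_inputs, n_hidden):
        raise ValueError("salp length does not match I * N + N")
    split = n_inputs * n_hidden
    W = position[:split].reshape(n_inputs, n_hidden)  # assumed layout
    b = position[split:]
    return W, b

def elm_fitness(position, X, T, n_hidden):
    """Fitness of one salp: training RMSE of the ELM it encodes."""
    W, b = decode_salp(position, X.shape[1], n_hidden)
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))  # sigmoid hidden layer
    beta = np.linalg.pinv(H) @ T             # analytic output weights
    return float(np.sqrt(np.mean((H @ beta - T) ** 2)))

# With four input variables and 30 hidden neurons each salp has 150 entries.
n_params = salp_length(4, 30)
```

A fitness function of this form is what an SSA loop would minimize, with each salp evaluated by decoding it into a network and scoring that network on the training set.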

IV. ULTRA-SHORT TERM WIND POWER PREDICTION MODEL
A. TYPES OF GRAPHICS
To verify the feasibility of the SSA-ELM described above, this section combines the historical data of actual wind farms and extracts the main influencing factors according to the literature and formula (13):

$$P = \frac{1}{2} \rho A V^3 C_P \tag{13}$$

where P is the wind power output power (kW), ρ is the air density (kg/m^3), A is the area swept by the wind wheel (m^2), V is the wind speed (m/s), and C_P is the power coefficient.
According to the formula of output power, the main parameters that affect the wind power are the air density, the area swept by the wind turbine, the wind speed and the power coefficient of the wind turbine. Wind power output is proportional to the air density ρ, which is closely related to atmospheric pressure and temperature. Therefore, the influence of atmospheric pressure and temperature should be considered in the analysis of wind power. The output of wind power is also proportional to the cube of the wind speed V, so the wind speed has a great influence on the output. Wind direction also has a considerable impact on the use of wind energy.
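The cubic dependence on wind speed can be checked with a short Python calculation. The turbine values below (air density 1.225 kg/m^3, swept area 5000 m^2, power coefficient 0.4) are illustrative assumptions, not parameters of the Henan wind farm, and the function name `wind_power_kw` is hypothetical. With SI inputs the product is in watts, so it is divided by 1000 to report kW.

```python
def wind_power_kw(rho, area, v, cp):
    """P = 0.5 * rho * A * V^3 * C_P, converted from W to kW."""
    return 0.5 * rho * area * v ** 3 * cp / 1000.0

# Same turbine at two wind speeds (illustrative values only).
p_low = wind_power_kw(rho=1.225, area=5000.0, v=6.0, cp=0.4)    # about 264.6 kW
p_high = wind_power_kw(rho=1.225, area=5000.0, v=12.0, cp=0.4)

# Doubling the wind speed multiplies the output by 2^3 = 8.
ratio = p_high / p_low
```

This sensitivity is the reason wind speed dominates the input features, while temperature and pressure enter only through the air density.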
Based on the above discussion, this section selects the wind speed, temperature, wind direction, atmospheric pressure and other factors that have a great influence on the power of the wind farm as the input variables of the neural network to predict the ultra-short-term wind power.
The specific steps are as follows:
1) The historical data of a wind farm in Henan Province are selected as experimental data. According to the definition in Section II, a set of N data of wind speed, temperature, wind direction, atmospheric pressure and other factors over a period of time are randomly extracted as input variables, i.e. X = [x_11, x_12, ..., x_1N, x_21, ..., x_2N, x_31, ..., x_3N, x_41, ..., x_4N, ...], and the output variables are the wind power corresponding to each moment, i.e. Y = [y_1, y_2, ..., y_N].
2) The input and output variables of the model are divided into a training set and a prediction set, and are normalized for preprocessing.
3) According to the SSA-ELM optimization steps in [21], combined with the historical data of actual wind farms, the training set samples are used for network training. The data to be predicted are then input into the trained network, which outputs the corresponding wind power prediction results.
4) To determine the validity of the forecast results, the normalized wind power data obtained from the prediction are denormalized, and the validity is assessed according to the evaluation indexes in the ''wind power forecast function specification'' issued by State Grid.
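Steps 2 and 4 can be sketched as follows. The normalization method is not specified here, so the sketch assumes min-max scaling to [0, 1] with the minimum and maximum taken from the training data; the function names and the sample power values are hypothetical.

```python
def fit_minmax(values):
    """Return (min, max) of the training data used for scaling."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot scale a constant series")
    return lo, hi

def normalize(values, lo, hi):
    """Map raw values to [0, 1] using the training-set range."""
    return [(v - lo) / (hi - lo) for v in values]

def denormalize(scaled, lo, hi):
    """Invert normalize(): map network outputs back to physical units."""
    return [s * (hi - lo) + lo for s in scaled]

# Illustrative power values in kW (not measured data).
power_train = [120.0, 480.0, 950.0, 1500.0, 300.0]
lo, hi = fit_minmax(power_train)
scaled = normalize(power_train, lo, hi)
restored = denormalize(scaled, lo, hi)
```

Reusing the training-set range for the prediction set keeps the two sets on the same scale, so predictions can be mapped back to kW before the error indexes are computed.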

B. THE EXAMPLE ANALYSIS
In this paper, the 2015 historical data of a wind farm in Henan Province are selected, with the wind speed, wind direction, temperature, atmospheric pressure and other data sampled every 10 minutes at the turbine height. BP, ELM, PSO-ELM and SSA-ELM models are established, and the errors of their prediction results are compared and analyzed from various angles. Among them, the BP network has double hidden layers with fifteen neurons in each layer. PSO-ELM has double hidden layers with thirty neurons, the maximum velocity is 0.5, the minimum error is 0.00001, and the maximum and minimum inertia weights are 0.9 and 0.3 respectively. All algorithms are implemented in MATLAB.

1) ERROR EVALUATION INDEX
The following four error evaluation criteria are used to analyze the feasibility and effectiveness of each model, namely, Mean Absolute Percentage Error (MAPE), Root Mean Square Error (RMSE), Mean Absolute Error (MAE) and the determination coefficient (R^2). In these indices, n is the quantity of prediction samples, P_Mi is the actual power at the i-th moment and \bar{P}_M is the average of the actual power, while P_Pi is the predicted power at the i-th moment and \bar{P}_P is the average of the predicted power.
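A minimal Python sketch of the four indices in their standard textbook forms is given below. It assumes no normalization by installed capacity, reports MAPE in percent, and takes R^2 as the squared correlation between actual and predicted power, a form suggested by the two mean values defined above; these choices are assumptions, and the sample values are illustrative.

```python
import math

def mae(actual, predicted):
    """Mean Absolute Error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root Mean Square Error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mape(actual, predicted):
    """Mean Absolute Percentage Error in percent; actual values must be nonzero."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

def r2(actual, predicted):
    """Determination coefficient as the squared correlation (assumed form)."""
    mean_a = sum(actual) / len(actual)
    mean_p = sum(predicted) / len(predicted)
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov ** 2 / (var_a * var_p)

# Illustrative values only.
actual = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]
scores = {
    "MAE": mae(actual, predicted),
    "RMSE": rmse(actual, predicted),
    "MAPE": mape(actual, predicted),
    "R2": r2(actual, predicted),
}
```

Lower MAE, RMSE and MAPE indicate smaller errors, while an R^2 closer to 1 indicates a better fit.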

2) ITERATIONS AND OBJECTIVE FUNCTION
In this section, the convergence of the SSA-ELM algorithm is studied empirically. The objective function of the neural network is the RMSE. Historical data are randomly selected, and wind speed, wind direction, temperature, atmospheric pressure and other major factors are taken as input features for network training. As can be seen from Fig. 4, the objective function changes with the iteration number. When the number of iterations is 50-150, the objective function is large; from 150 to 250 iterations, the objective function drops sharply and starts to stabilize. During iterations 250-300, the objective function no longer fluctuates and converges steadily. Therefore, when the SSA-ELM, PSO-ELM, ELM and BP prediction models are established, the number of iterations is set to 300.

3) SAMPLE SIZE AND ERROR ANALYSIS
Training the neural network with the historical data of a wind farm gives it the ability to predict wind power. In addition to the neural network parameters, the input sample of the network is also one of the main factors influencing its learning and prediction ability. Therefore, once the types of input features are determined, it is particularly important to study the influence of sample size on the error. The annual historical data of a wind farm in Henan Province can be divided into four seasons: spring, summer, autumn and winter. Historical data from the four seasons are randomly selected, with one sample point every 10 minutes. The sample sizes are set to 894 and 1788 respectively. Among them, 894 is the small sample size (750 training samples and 144 test samples, representing 24 hours), and 1788 is the large sample size (1500 training samples and 288 test samples, representing 48 hours). Fig. 5 and Fig. 6 compare the prediction results of the four prediction models for the wind farm in winter with the two sample sizes respectively. According to the results, histograms are drawn comparing the prediction errors of the SSA-ELM, PSO-ELM, ELM and BP models for the 894 and 1788 samples in January, April, July and October. It can be seen from Figs. 7, 8 and 9 that, compared with the small sample size used for training, verification and testing, the larger sample size leads to a slight increase in the MAE, MAPE and RMSE produced by the neural network during prediction. This anomaly is mainly due to the strong randomness of wind power generation: the larger the sample size, the greater the randomness of wind power included in the prediction, and hence the larger the error.
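The two sample sizes and their test horizons follow directly from the 10-minute sampling interval, as the short Python sketch below shows. The split by position (training samples first, test samples last) is an assumption for illustration, and the function names are hypothetical.

```python
SAMPLE_INTERVAL_MIN = 10  # one sample point every 10 minutes

def split_samples(samples, n_train):
    """Use the first n_train points for training and the rest for testing."""
    return samples[:n_train], samples[n_train:]

def horizon_hours(n_test, interval_min=SAMPLE_INTERVAL_MIN):
    """Length of the test window in hours."""
    return n_test * interval_min / 60

# Placeholder index lists standing in for the 894- and 1788-point samples.
small_train, small_test = split_samples(list(range(894)), 750)
large_train, large_test = split_samples(list(range(1788)), 1500)

small_hours = horizon_hours(len(small_test))  # 144 points
large_hours = horizon_hours(len(large_test))  # 288 points
```

The 144 and 288 test points therefore correspond to 24-hour and 48-hour test windows respectively.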

4) ROLLING ERROR ANALYSIS
According to article 6 of the second chapter of Literature [31], real-time prediction requires wind farms connected to the grid to report, on a rolling basis every 15 minutes, wind power forecast data together with real-time wind speed and other meteorological data for the next 15 min to 4 hours. To report the wind power prediction data in real time, this paper adopts ultra-short-term prediction and randomly selects the historical data of January 2015 in Henan Province as training samples (one sample point every 10 min) to test the wind power on January 18 (Fig. 10). The 4-hour period stipulated in Literature [31] is used as the time scale for calculating the rolling prediction. The predicted values of the SSA-ELM, PSO-ELM, ELM and BP models and the actual values in the period of 10-240 min are selected to calculate the error, and the error window is then rolled forward successively up to the period of 250-480 min.
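The rolling calculation can be sketched in Python as a sliding 4-hour window over the 10-minute series. The step of one sample between successive windows is an assumption, since the rolling step is not stated explicitly, and the name `rolling_mae` and the test series are hypothetical.

```python
def rolling_mae(actual, predicted, window=24, step=1):
    """MAE over successive windows; 24 points of 10 min cover 4 hours."""
    errors = []
    for start in range(0, len(actual) - window + 1, step):
        a = actual[start:start + window]
        p = predicted[start:start + window]
        errors.append(sum(abs(x - y) for x, y in zip(a, p)) / window)
    return errors

# 48 points cover 10-480 min; every prediction is offset by 2 for illustration.
actual = [float(i) for i in range(48)]
predicted = [a + 2.0 for a in actual]
errors = rolling_mae(actual, predicted)
```

With 48 points the first window covers 10-240 min and the last covers 250-480 min, and the same loop applies to RMSE or MAPE by swapping the error formula.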
The results shown in Fig. 10 (b) are obtained by training the four algorithms over the same time period. The SSA-ELM prediction model tracks the actual power better than the ELM and BP models. Compared with the PSO-ELM model, both models predict very well in the first 100 min, with only a small error relative to the actual power. However, as the prediction time increases, the wind power fluctuates at high frequency for a period of time, and the prediction deviation of the BP and ELM models becomes large. The PSO-ELM model has basically the same fluctuation range as the actual value, but cannot track the actual power correctly. Only the SSA-ELM model still has good prediction accuracy in this case. Therefore, rolling prediction is adopted to calculate the prediction errors of each model at different times. It can be seen from Fig. 11 (a) and (c) that the MAE and RMSE of the SSA-ELM prediction model are relatively small, whereas Fig. 11 (b) does not clearly show which model has a slightly lower MAPE at sample points 1-13. Combined with Table 2, it can be clearly seen that the error of SSA-ELM is smaller. Therefore, comparing the rolling error data of the four models, the SSA-ELM prediction model performs better in real-time power prediction.

5) CASE ERROR ANALYSIS
From the historical data, 750 data points in each of January, April, July and October, representing the four seasons, were extracted as training samples, and the trained network was used to predict the ultra-short-term wind power within one day after the prediction point. For each model, 144 prediction samples from January 18, April 18, July 18 and October 18 are used to compare the prediction curves, as shown in Fig. 10 (a) and Figs. 12-14.
The average temperature in the winter of 2015 in Henan Province ranged from −3 °C to 6 °C, and the wind force varied greatly. According to the 0-100 min section in Fig. 10 (a), SSA-ELM and PSO-ELM have better tracking ability than ELM and BP when the power fluctuation range is large. Combined with the MAPE index in Table 4, the prediction error of SSA-ELM is smaller. In spring, the wind is stronger and changes quickly, so the power is high and fluctuates frequently, as shown in Fig. 12. Although SSA-ELM does not track the actual sample output at every moment, the R^2 data in Table 4 show that its determination coefficient is closer to 1 than those of the other models, which indicates that the SSA-ELM network has better fitting performance. When the summer temperature is high and the wind speed changes slowly (as shown in Fig. 13), the actual sample output changes steadily in the 40-80 min segment. All four prediction models can keep up with the variation trend of the samples, but SSA-ELM tracks and predicts better. In late autumn, the wind in Henan Province fluctuates. In Fig. 14, the actual value suddenly becomes small or large. SSA-ELM adapts better to the abrupt changes of wind power and tracks it effectively in a timely manner.

6) TIME COMPLEXITY
Time complexity is a key problem in the application of neural networks to wind power prediction. Because the prediction results of a neural network are random, the historical data of January are used for 20 runs of wind power prediction training to obtain the running time of the four models, and the mean results are analyzed. As shown in Table 5, the training time of the ELM prediction model is the shortest, followed by the BP neural network; the PSO-ELM prediction model has the longest training time, and the training time of SSA-ELM is moderate among the four. Since the input weights and hidden layer bias values of ELM are generated randomly, only the output weights of the hidden layer need to be calculated, so its average training time is the shortest. The BP neural network propagates the predicted output error backward to calculate the output error of the preceding layer, and continuously reduces the error through iteration, so its training time is longer than that of ELM. The training time of the PSO-ELM model is very long because it needs to search for the optimum through continuous iteration in the disordered particle swarm. The SSA-ELM prediction model optimizes the input weights and hidden layer bias values of ELM during training through the SSA algorithm to ensure its prediction accuracy, so its average training time is longer than that of the ELM and BP models. However, compared with PSO-ELM, which also optimizes the weights and bias values of ELM, the training time of SSA-ELM is much shorter. The comparison of time complexity again verifies that SSA-ELM performs better when predicting wind power.

C. THE PRACTICAL APPLICATION
In the example analysis above, the advantages and disadvantages of BP, ELM, PSO-ELM and SSA-ELM in terms of time complexity and error were discussed by changing the number of iterations and the sample size. The results showed that, compared with the other three models, SSA-ELM had higher prediction accuracy and a stronger ability to track the actual wind power. To further prove the effectiveness of this method in grid-connected scheduling, a practical case is considered; according to the requirements of real-time scheduling, the ultra-short-term prediction time scale is generally less than 4 hours. The wind power of a wind farm in Henan was predicted from 0:00 to 4:00 on May 17, 2019, and the prediction results of BP, ELM, PSO-ELM and SSA-ELM were compared and analyzed. As shown in Fig. 15, there is a deviation between the predicted value and the actual value for all four models, but the comparison shows that SSA-ELM and PSO-ELM have a strong ability to follow the actual wind power, with a small prediction deviation. At 50-150 minutes, the wind power rose and dropped suddenly. In this interval, SSA-ELM had higher prediction accuracy than PSO-ELM and a stronger ability to track the sudden change of wind power. Although PSO-ELM predicts better than BP and ELM, it still cannot accurately predict and track the sudden change of wind power, and a certain gap remains between its predicted value and the actual value. Moreover, the MAE of SSA-ELM is only 3%, compared with 8% for PSO-ELM, 19% for BP and 17% for ELM, which shows that SSA-ELM has higher accuracy.

V. CONCLUSION
In order to realize real-time wind power dispatching, reduce the damage to the grid caused by random changes of wind power, and strengthen the emergency measures taken by the dispatching agency against sudden wind power instability during grid connection, the main task of this paper is to improve the accuracy of ultra-short-term wind power prediction. In this paper, the Salp Swarm Algorithm (SSA) is used to optimize the Extreme Learning Machine (ELM), which avoids the over-fitting of the ELM and improves its generalization ability. After preprocessing the historical data of a wind farm in Henan, a classic BP prediction model, an ELM prediction model, a Particle Swarm Optimization Extreme Learning Machine (PSO-ELM) prediction model and the SSA-ELM prediction model are established and compared through simulations. The results verify that the SSA-ELM model is superior to the other models in prediction. The work fills a gap in the application of SSA-ELM to wind power prediction and thus has good engineering research value.
Dr. Zhang is a member of the Teaching Steering Committee of automation undergraduate specialty in Henan, the Deputy Secretary-General of the Henan Graphics and Image Society, and a Project Evaluation Expert of the National Natural Science Foundation.