Multistep-Ahead Prediction of Ocean SSTA Based on Hybrid Empirical Mode Decomposition and Gated Recurrent Unit Model

The prediction of sea surface temperature anomalies (SSTA) is vital to the study of marine ecosystems and global climate. The SSTA can be accurately forecasted one step ahead by numerical and statistical methods. However, multistep-ahead forecasting of SSTA is highly challenging because of the nonlinearity and nonstationarity of SSTA and the lag problem of prediction. Therefore, in this article, a multistep-ahead SSTA forecasting method based on a hybrid empirical mode decomposition (EMD) and gated recurrent unit (GRU) model is proposed, considering that EMD has the advantage of reducing data complexity and GRU is good at long-term prediction. First, the EMD algorithm is applied to obtain several intrinsic mode functions that are more stationary than the original data. Then, GRU is used for multistep forecasting, in which three multistep forecasting strategies (recursive, direct, and multioutput) are compared. The proposed hybrid model is validated by multistep forecasting of the monthly average SSTA in the Niño3.4 region. The experimental results show that using the reconstruction error as part of the prediction effectively improves the accuracy of the EEMD-GRU model, which outperforms GRU combined with other EMD algorithms. Compared with traditional models, the EEMD-GRU model better predicts future multimonth trends of SSTA and effectively alleviates the prediction-lag problem of the traditional models. Finally, this model is applied to the Niño3.4 regional SSTA prediction, and the results show that it can provide a reference for ocean research.


I. INTRODUCTION
Sea surface temperature anomalies (SSTA) have profound effects on marine ecosystems and climate change [1]. In recent years, as ocean temperatures continue to rise, extreme weather has become more frequent [2]. Therefore, many scholars have studied SSTA forecasting as a way to forecast ocean weather [3]. However, the nonlinear and nonstationary characteristics of SSTA pose a great challenge to its prediction [4].
At present, the main SSTA forecasting methods are numerical methods and statistical methods. Chiodi [5] considered the laws of ocean and atmospheric dynamics and used an ocean general circulation model to accurately predict the development of SSTA. Although numerical forecasting is the main SSTA forecasting method at this stage, it suffers from problems such as a large amount of computation and sensitivity to initial conditions [6]. Statistical forecasting methods, when the sample size is large, can establish a forecast model in the statistical sense without considering the physical laws of the forecast object [7]. Statistical methods used for SSTA forecasting include support vector machines [8], the autoregressive integrated moving average (ARIMA) [9], regression models [10], and neural networks [11], [12], [13]. However, the nonlinearity and nonstationarity of the data reduce prediction accuracy. Therefore, signal processing methods, such as wavelet analysis [14] and EMD methods [15], are used to decrease the data complexity and improve forecasting effects.
Among statistical methods, neural networks have strong nonlinear expression capability and are now widely used in the prediction of complex time-series data [16]. Wu et al. [17] used the back propagation (BP) method to establish a nonlinear SSTA forecasting model that improves prediction accuracy compared to traditional linear models. In recent years, the long short-term memory (LSTM) neural network has been widely used in SSTA prediction [18], [19], [20]. The recurrent neural network (RNN) was proposed by Elman [21] and showed unique advantages for time-series prediction. However, RNNs suffer from vanishing and exploding gradients, leaving them with only short-term memory [22]. To address this problem, Schmidhuber and Hochreiter [23] improved the RNN structure in 1997 and proposed the LSTM, but it has the disadvantage of slow, time-consuming training. Chung et al. [24] proposed the GRU neural network; compared to the LSTM, the GRU has a simpler structure, which greatly improves training efficiency. Su et al. [25] used a bidirectional LSTM (Bi-LSTM) for predicting SSTA and showed that the Bi-LSTM is more advantageous in time-series modeling. Sun et al. [26] used Internet of Things technology and a temporal convolutional network to predict the monthly SST. Zhang et al. [27] applied the GRU neural network to predict SSTA and showed that the GRU performs better than the LSTM in terms of stability and accuracy.
Among signal processing methods, empirical mode decomposition (EMD) has been widely used in the data processing and prediction of climate phenomena such as the ocean, meteorology, and rainfall [28], [29], [30], [31]. These findings suggest that EMD and its improved variants can effectively improve prediction accuracy. EMD was proposed by Huang et al. [32] to process and analyze signals; it can extract the physical characteristics of the data and thereby reduce the complexity of nonlinear time series. However, EMD suffers from mode aliasing. Wu and Huang [33] proposed the ensemble empirical mode decomposition (EEMD) method, which adds white noise to the original signal and effectively solves the mode aliasing problem. However, EEMD decomposition is computationally intensive, and the added white noise cannot be completely neutralized. Yeh et al. [34] proposed the complementary ensemble empirical mode decomposition (CEEMD) method, which adds complementary white noise to the original signal and effectively reduces the reconstruction error caused by the noise. Torres et al. [35] proposed CEEMD with adaptive noise (CEEMDAN) to obtain components with smaller error and stronger physical meaning. Combinations of signal processing and neural network methods have already been applied to temperature prediction. Zhang et al. [36] used the EEMD-LSTM model to forecast land surface temperature and showed that the hybrid model is superior to a single neural network. Shao et al. [37] combined the multivariate empirical orthogonal function, CEEMD, and a multilayer perceptron to forecast SST, considering the correlation of different variables in the real marine environment. Wu et al. [38] used the CEEMD-BP model to forecast SSTA and showed that CEEMD-BP outperforms EEMD-BP, but this conclusion is not absolute.
Ocean forecasting studies often require data for multiple months into the future, yet most previous studies perform one-step forecasting. Multistep forecasting of ocean data is rarely studied because it often suffers from low prediction accuracy and lag, and the complexity of SSTA data poses a great challenge to multistep prediction [39]. Niu et al. [40] built an optimal feature selection and artificial neural network model for multistep wind prediction. Wu et al. [41] used a decomposition-ANFIS model to predict wave conditions multiple steps ahead. These studies demonstrated that the prediction error grows rapidly as the number of prediction steps increases; moreover, the prediction results of these models have lag problems. Therefore, the main purpose of this article is to build a multistep-ahead model for SSTA prediction and alleviate the multistep forecasting lag of previous methods.
Considering the ability of empirical mode decomposition to reduce the nonlinear characteristics of data and the high prediction accuracy of deep neural networks, this article builds a hybrid EEMD-GRU model for predicting SSTA. The contributions of this article are as follows: 1) The EMD algorithm is used to reduce the complexity of the data, and a hybrid model of EEMD and GRU is established. 2) An SSTA multistep-ahead prediction model is built on the hybrid EEMD-GRU model. 3) The reconstruction error of the EEMD decomposition is used as an additional component to be predicted, which effectively improves the prediction accuracy. The experiments show that one-step-ahead forecasting with EEMD-GRU performs better than GRU combined with other EMD algorithms, and that the hybrid EEMD-GRU model outperforms single traditional models and alleviates the lag problem of traditional multistep-ahead prediction.
The rest of this article is organized as follows. Section II gives the methodology for SSTA multistep prediction, where the hybrid EMD and GRU model is explained. Section III describes the data and study area, and introduces relevant parameters and evaluation metrics of the experiment. Section IV analyses and discusses the different EMD decomposition results, one-step forecasting results, and multistep forecasting results. Finally, Section V concludes this article.

II. METHODOLOGY
The nonstationarity of time-series data makes prediction extremely difficult, so this article combines signal processing methods and neural network models to build a multistep prediction model. This article first uses empirical mode decomposition to reduce the complexity of the data and then uses a neural network prediction method to achieve multistep prediction.

A. Empirical Mode Decomposition
Compared with other signal processing methods, EMD overcomes the lack of self-adaptation of fixed basis functions. The various improved EMD algorithms (EEMD, CEEMD, CEEMDAN) reduce the reconstruction error of the decomposition by adding different forms of white noise to the signal [33], [34], [35], so only the principle of EMD decomposition is presented here. Fig. 1 illustrates the EMD decomposition process [32], with the specific steps as follows:
Step 1: Locate the local maxima and minima of the original signal x(t) and interpolate them to obtain the upper and lower envelopes. Compute the mean of the two envelopes to obtain the mean envelope a(t). Subtract the mean envelope from the signal to obtain the intermediate signal c_ik(t).
Step 2: Check whether c_ik(t) satisfies the two conditions of an IMF: 1) the number of extreme points and the number of zero crossings differ by at most one; 2) the upper and lower envelopes are locally symmetric about the time axis. If the conditions are satisfied, c_ik(t) is an IMF; otherwise, repeat Step 1 on c_ik(t).
Step 3: Subtract the extracted IMF from the signal and repeat Steps 1 and 2 on the residual until the residual is a monotonic function, which ends the EMD decomposition.
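The sifting procedure above can be sketched in a few lines of numpy. This is a simplified illustration, not the authors' implementation: it uses linear interpolation for the envelopes (real EMD uses cubic splines) and a fixed number of sifting passes instead of the IMF stopping criteria.

```python
import numpy as np

def sift_once(x):
    """One sifting pass: subtract the mean of the upper and lower envelopes.
    Linear interpolation stands in for the cubic splines of real EMD."""
    t = np.arange(len(x))
    maxima = np.where((x[1:-1] > x[:-2]) & (x[1:-1] > x[2:]))[0] + 1
    minima = np.where((x[1:-1] < x[:-2]) & (x[1:-1] < x[2:]))[0] + 1
    if len(maxima) < 2 or len(minima) < 2:
        return None                           # too few extrema: (near) monotonic
    upper = np.interp(t, maxima, x[maxima])   # upper envelope
    lower = np.interp(t, minima, x[minima])   # lower envelope
    return x - (upper + lower) / 2.0          # Step 1: remove mean envelope

def emd(x, max_imfs=6, sift_iters=8):
    """Decompose x into IMFs plus a monotonic residual (Steps 1-3)."""
    residual, imfs = np.asarray(x, float).copy(), []
    for _ in range(max_imfs):
        c = sift_once(residual)
        if c is None:                         # residual is monotonic: stop
            break
        for _ in range(sift_iters - 1):       # fixed sifting count for simplicity
            s = sift_once(c)
            if s is None:
                break
            c = s
        imfs.append(c)
        residual = residual - c
    return imfs, residual
```

By construction the components sum back to the original signal exactly, which is the property the reconstruction-error analysis later in the article relies on.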

B. GRU Neural Network
GRU and LSTM neural networks are both variants of the recurrent neural network (RNN), and both alleviate the gradient problems of RNNs. The GRU has been widely used in the prediction of nonlinear data. It has only two gate structures, an update gate and a reset gate (see Fig. 2), one gate fewer than the LSTM structure, which can effectively prevent overfitting.
The GRU network structure includes four main layers: the input, hidden, fully connected, and output layers (see Fig. 2). The hidden layer (GRU cell) is the one with the memory function. In the GRU, r_t is the reset gate; the larger its value, the less information from the previous moment is ignored. It is calculated as follows [24]:

r_t = σ(W_r · [h_{t−1}, x_t])    (1)

where W_r represents the weight of the reset gate, x_t is the input, σ is the sigmoid function, and h_{t−1} is the output of the previous moment. z_t is the update gate; the larger its value, the more information is retained from the previous moment:

z_t = σ(W_z · [h_{t−1}, x_t])    (2)

where W_z is the weight of the update gate. The activation h_t of the GRU network at time t is the linear interpolation between the activation h_{t−1} at the previous moment and the candidate activation h̃_t:

h_t = z_t ⊙ h_{t−1} + (1 − z_t) ⊙ h̃_t    (3)

where ⊙ represents the element-wise multiplication of matrices, and the candidate activation h̃_t is calculated as follows:

h̃_t = tanh(W · [r_t ⊙ h_{t−1}, x_t])    (4)

where W is the weight matrix corresponding to h̃_t and tanh is the hyperbolic tangent function.
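A single forward step of the GRU cell can be written directly from (1)-(4). This minimal numpy sketch omits bias terms for brevity and follows the update-gate convention stated in the text (large z_t keeps more of the previous state); it is illustrative, not a trainable implementation.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_cell(x_t, h_prev, Wr, Wz, W):
    """One GRU step following (1)-(4); biases omitted for brevity."""
    v = np.concatenate([h_prev, x_t])
    r = sigmoid(Wr @ v)                                      # (1) reset gate
    z = sigmoid(Wz @ v)                                      # (2) update gate
    h_cand = np.tanh(W @ np.concatenate([r * h_prev, x_t]))  # (4) candidate
    return z * h_prev + (1.0 - z) * h_cand                   # (3) new state
```

Because h_t is a convex combination of the previous state and a tanh output, the hidden state remains bounded in (−1, 1), which is what gives the GRU its stable long-range memory.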

C. One-Step-Ahead Prediction Model
For the one-step-ahead prediction model, signal processing and neural network methods are combined for SSTA prediction. Fig. 3 gives the forecasting steps of the EMD-GRU model applied to SSTA prediction [33]. First, the SSTA time-series data are decomposed by EMD to obtain intrinsic mode functions (IMFs); then each IMF component is forecasted separately by a GRU neural network; finally, the predictions of the components are reconstructed to obtain the final SSTA prediction.
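The decompose-predict-reconstruct scheme is easy to state in code. In this sketch a simple least-squares autoregression stands in for the per-component GRU (training a GRU is out of scope here); only the pipeline shape, forecasting each component separately and summing the results, mirrors the article's method.

```python
import numpy as np

def ar_forecast_one_step(series, lag=12):
    """Stand-in for the per-component GRU: a linear AR(lag) model fit by
    least squares, predicting one step ahead (illustrative only)."""
    X = np.array([series[i:i + lag] for i in range(len(series) - lag)])
    y = series[lag:]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return float(series[-lag:] @ coef)

def decompose_predict_sum(components):
    """EMD-GRU scheme: forecast each IMF (and the residual) separately,
    then sum the component forecasts into the final SSTA forecast."""
    return sum(ar_forecast_one_step(c) for c in components)
```

The point of the decomposition is visible here: each component (a slow trend, a smooth oscillation) is far easier for the per-component model to fit than the raw nonstationary series.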

D. Multistep-Ahead Prediction Model
In marine atmospheric studies, especially El Niño and La Niña predictions, data for multiple months into the future are required, and one-step predictions cannot meet the needs of the study, so multistep prediction results are essential. Multistep prediction refers to using historical data to forecast values several steps into the future. Taieb et al. [39] reviewed time-series multistep prediction strategies; the current methods are the recursive, direct, and multioutput strategies:

ŷ_{t+1} = g(y_t, …, y_{t−d+1}) + w    (5)

[ŷ_{t+1}, …, ŷ_{t+h}] = G(y_t, …, y_{t−d+1}) + w    (6)

ŷ_{t+k} = g_k(y_t, …, y_{t−d+1}) + w,  k = 1, …, h    (7)

In (5)-(7), g and G represent the prediction models, i.e., the functional relationship between the known and forecasted data, d is the number of function input variables, and w denotes the error term.
The recursive strategy uses the output of the last one-step forecast as the input of the next forecast, as shown in (5), but it suffers from error accumulation because predictions are used to make subsequent predictions. The multioutput strategy is a direct many-to-many mapping that predicts multiple future steps at one time (6). It is a unique advantage of deep neural networks, but it demands many input features, and the accuracy of its results is difficult to guarantee.
The direct strategy builds h models to predict the data 1 to h months ahead, respectively (7). Its advantage is that it does not rely on the prediction of the previous step, which overcomes error accumulation and yields relatively high accuracy for short-term prediction. However, since the h models are learned independently, the dependencies between successive predictions are ignored, which affects the prediction accuracy to some extent.
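The contrast between the recursive strategy (one model fed its own outputs) and the direct strategy (h independently trained models) can be sketched with the same simple least-squares regressor standing in for the GRU; the strategies themselves follow (5) and (7), while the regressor is an illustrative assumption.

```python
import numpy as np

def windows(series, n_in, horizon):
    """Inputs of n_in values with the target 'horizon' steps ahead."""
    X = np.array([series[i:i + n_in]
                  for i in range(len(series) - n_in - horizon + 1)])
    return X, series[n_in + horizon - 1:]

def fit_linear(X, y):
    return np.linalg.lstsq(X, y, rcond=None)[0]

def recursive_forecast(series, n_in, h):
    """Recursive strategy (5): one 1-step model fed its own outputs."""
    coef = fit_linear(*windows(series, n_in, 1))
    buf, preds = list(series[-n_in:]), []
    for _ in range(h):
        p = float(np.dot(buf, coef))
        preds.append(p)
        buf = buf[1:] + [p]              # feed the prediction back in
    return preds

def direct_forecast(series, n_in, h):
    """Direct strategy (7): h independently trained models, one per lead."""
    return [float(np.dot(series[-n_in:],
                         fit_linear(*windows(series, n_in, k))))
            for k in range(1, h + 1)]
```

On smooth periodic data both strategies do well; on noisy data the recursive forecasts degrade faster because each step inherits the previous step's error, which is the accumulation problem discussed above.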

III. DATA AND EXPERIMENTAL SETUP
A. Study Area and Dataset
El Niño (La Niña) is defined as a SSTA of +0.5°C (−0.5°C) above (below) the mean in the Niño3.4 region [see Fig. 4] for five consecutive months. The SSTA refers to the magnitude of the deviation from the mean temperature of previous years. The SSTA data in this article are derived from the extended reconstructed sea surface temperature dataset, with a resolution of 2° × 2°, from the international integrated ocean-atmosphere dataset, which can be used for long-term global and basin-wide studies [42]. Fig. 4(a) shows the global SSTA distribution in July 2021, which reveals anomalously low SST in the east-central Pacific. Since many kinds of extreme weather are closely connected with the SSTA in the equatorial central-east Pacific, the Niño3.4 region (170°W to 120°W, 5°S to 5°N) is selected as the experimental area in this article.
From Fig. 4(c), it can be concluded that the SSTA series is nonstationary, nonlinear, and stochastic. Its large range of fluctuation makes prediction extremely difficult, and the SSTA near the equator is more nonlinear and harder to predict than in other regions. The figure also shows that the three periods with the highest SSTA peaks are 1982-1983, 1997-1998, and 2015-2016, consistent with strong El Niño years, which indicates that the SSTA can reflect the El Niño phenomenon to some extent. If the trend of the SSTA in the coming months can be predicted, it will have important implications for forecasting El Niño-Southern Oscillation events.

B. Data Processing and Network Training Strategies
A diagram of the multistep prediction model training and prediction is given in Fig. 5. The first part of the SSTA data is the training set, the latter part is used to test prediction accuracy, and finally the last N data are used to forecast the SSTA for the next h months. The network parameters are set as follows: the SSTA data are processed with min-max normalization, the input is 12 consecutive months of data, the target output is the month following the input, the number of hidden layers is 2, the number of neurons is 10, the number of training iterations is 100, the loss is the mean squared error, and the optimizer is Adam. Fig. 6 illustrates the training and prediction process of the direct strategy [40]. In the training process, a total of h models are trained, where the blue boxes represent the input data and the red, grey, and yellow boxes represent the output data of the different models; Model h represents the model predicting h steps ahead. In the testing process, a total of K − N − h test samples are applied to each of the h models, where the blue boxes represent the input data and the grey boxes in red font are the predicted data. The prediction results of all models are aggregated and used to assess the accuracy of the predictions.
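The preprocessing described above, min-max normalization and 12-month sliding windows with a 1-month-ahead target, can be sketched as follows; the function names are illustrative, not from the article.

```python
import numpy as np

def minmax_scale(x):
    """Min-max normalization to [0, 1]; keep (lo, hi) to invert predictions."""
    lo, hi = float(np.min(x)), float(np.max(x))
    return (x - lo) / (hi - lo), (lo, hi)

def minmax_invert(x_scaled, lo, hi):
    """Map normalized values back to the original SSTA scale."""
    return x_scaled * (hi - lo) + lo

def monthly_windows(series, n_in=12):
    """12 consecutive months as input, the following month as the target."""
    X = np.array([series[i:i + n_in] for i in range(len(series) - n_in)])
    return X, series[n_in:]
```

Keeping the (lo, hi) pair is important in practice: the network is trained and evaluated on normalized data, but the reported errors in °C require inverting the scaling on the predictions.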

C. Evaluation Metrics
To quantitatively evaluate and compare the predictive ability of different models, the mean absolute error (MAE), maximum error (MRE), root mean square error (RMSE), and concordance index (CI) are selected as evaluation indicators, with the following formulas:

MAE = (1/n) Σ_{i=1}^{n} |x_i − y_i|    (8)

MRE = max_i |x_i − y_i|    (9)

RMSE = sqrt((1/n) Σ_{i=1}^{n} (x_i − y_i)^2)    (10)

CI = 1 − Σ_{i=1}^{n} (x_i − y_i)^2 / Σ_{i=1}^{n} (|y_i − x̄| + |x_i − x̄|)^2    (11)

where x_i and y_i are the true and predicted values, respectively, x̄ and ȳ are the averages of the true and predicted values, respectively, and n is the number of samples.
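The four metrics are straightforward to compute. In this sketch the concordance index is assumed to take the Willmott index-of-agreement form, which matches the stated behavior (values in [0, 1], 1 for a perfect forecast); the article's exact definition may differ.

```python
import numpy as np

def mae(x, y):
    """Mean absolute error."""
    return float(np.mean(np.abs(x - y)))

def mre(x, y):
    """Maximum absolute error over the test set."""
    return float(np.max(np.abs(x - y)))

def rmse(x, y):
    """Root mean square error."""
    return float(np.sqrt(np.mean((x - y) ** 2)))

def ci(x, y):
    """Willmott-style index of agreement in [0, 1] (assumed form)."""
    xbar = np.mean(x)
    denom = np.sum((np.abs(y - xbar) + np.abs(x - xbar)) ** 2)
    return float(1.0 - np.sum((x - y) ** 2) / denom)
```

MAE and RMSE are in °C, so they are directly comparable to the error figures quoted later (e.g., maximum errors near 2°C), while CI is dimensionless.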

IV. RESULTS AND DISCUSSION
A. Decomposition Results
The decomposition effects of different EMD algorithms are discussed in this section. Fig. 7 gives the results of decomposing the SSTA at the 0°N, 150°W location with the different EMD algorithms. From Fig. 7, it can be concluded that the final IMF components differ between the EMD algorithms. For all of them, the first IMF component has a high frequency and exhibits a certain nonsmoothness, and the nonsmoothness of IMF_i decreases as i increases. The last few IMF components have a certain periodicity and relatively regular fluctuations.
In addition, it can be concluded from the figure that the amplitude of the second component of the EEMD decomposition is smaller than in the other EMD decompositions. A small amplitude can reduce the final prediction error to some extent, because the larger the fluctuation of a component, the larger its prediction error. In Fig. 7, RE represents the reconstruction errors of the different EMD decompositions, where the values for EMD, CEEMD, and CEEMDAN are at the 10^−15 level and those for EEMD are at the 10^−2 level. The quality of an EMD algorithm's decomposition can be reflected by the magnitude of the final reconstruction error, which is calculated as follows:

RE(t) = x(t) − Σ_{i=1}^{n} IMF_i(t)    (12)

where x(t) is the original signal and IMF_i is the ith decomposed component (the final residual is counted among the components). The reconstruction error results of the different EMD algorithms are given in Table I: the errors of EMD, CEEMD, and CEEMDAN are close to 0, while that of EEMD is close to 0.02. The reconstruction error of EEMD is larger than those of the other EMD algorithms, mainly because the white noise added to the original signal cannot be completely canceled by the EEMD decomposition. However, this error sequence can itself be used as part of the prediction, as shown below. Table II shows the GRU prediction errors for the SSTA data at the 0°N, 150°W location after the different EMD decompositions. For every EMD algorithm, the prediction errors of the first two components are large, mainly because of their high frequency. The predicted mean squared deviations of the first component are, in descending order of magnitude, EEMD, CEEMDAN, CEEMD, and EMD.
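Computing the reconstruction error and folding it back in as an extra component to forecast takes only a few lines; the helper names here are illustrative, not from the article.

```python
import numpy as np

def reconstruction_error(x, imfs, residual):
    """RE(t) = x(t) - sum_i IMF_i(t) - r(t): the part of the signal that the
    decomposition fails to recover. Non-negligible for EEMD."""
    return x - np.sum(imfs, axis=0) - residual

def components_with_re(x, imfs, residual):
    """Treat the RE sequence as one more component to forecast, so the
    reconstructed prediction also accounts for the EEMD noise remnant."""
    return imfs + [residual, reconstruction_error(x, imfs, residual)]
```

With the RE sequence included, the component list sums back to the original signal exactly, so no part of the signal is silently dropped from the final reconstructed forecast.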
The prediction indicators of the EEMD components are mostly better than those of CEEMD and CEEMDAN. After the original EMD decomposition, although components IMF3-IMF6 are predicted better than the corresponding EEMD components, the errors of components IMF1-IMF2 are much larger than those of EEMD. Therefore, prediction based on the EEMD decomposition is better than prediction based on the other EMD decompositions.

B. One-Step-Ahead Prediction Results
It is also found experimentally that a better EMD decomposition does not necessarily yield higher prediction accuracy when combined with the GRU model. The prediction effectiveness of the GRU is closely related to the amplitude and frequency of the components produced by the different EMD algorithms: the higher the frequency, the stronger the nonlinearity, and the larger the GRU prediction error. The variance of the data can reflect the magnitude of this nonlinearity to some extent, though not completely. Table III gives the variances of the first two components of the different EMD decompositions, from which it can be concluded that the first two components of EEMD have smaller variances than those of the other EMD algorithms. This indicates, to some extent, that the nonlinearity of the EEMD decomposition is weaker and the GRU's ability to predict it is better. In addition, the reconstruction error of the EEMD decomposition is large, and ignoring its effect would necessarily reduce the final prediction quality. Therefore, the sequence composed of the reconstruction errors is also used as part of the final prediction. Since the EMD, CEEMD, and CEEMDAN reconstruction errors are close to zero, their effect need not be considered. Table IV gives the final EEMD-GRU predictions at six different locations with and without considering the reconstruction error. The number of ensemble trials with added noise in EEMD is set to 100; due to the zero-mean property of white noise, the larger this number, the smaller the reconstruction error of the EEMD decomposition, but the higher the computational cost. From the table, it can be concluded that the larger the mean reconstruction error, the greater its impact on the final prediction, and using it as part of the prediction can greatly improve the accuracy of the EEMD-GRU model.
To further validate that the EEMD-GRU model outperforms other EMD algorithms combined with GRU, 492 months of SSTA data from 1980 to 2020 are selected for six other locations. The prediction results (RMSE and CI) of different EMD and GRU combination models for different locations are given in Table V. It shows that the EEMD-GRU model outperforms the other models for SSTA prediction in six different locations. Moreover, the effectiveness of the prediction is closely connected to the data variance: the larger the data variance (the less smooth the data), the larger the mean squared error of the prediction, and the lower the consistency index.

C. Results Analysis of Multistep Prediction Strategies
To compare the effects of the different multistep-ahead forecasting strategies, the SSTA data from 1980 to 2014 are used as the training set to calculate forecasting errors for each month from 2015 to 2020 at lead times of 1-6. One advantage of neural networks is that they can produce multiple outputs (the multioutput strategy), meaning a single model can be trained to forecast a run of future data at one time. However, the prediction error of the multioutput strategy is large for a single SSTA time series [see Fig. 8(a)]: the mean error is about 0.76°C when the output dimension is 6, and the maximum error reaches 2°C, which is due to the few features of single-point data. The ARIMA model is commonly used for multistep forecasting, but its prediction error is too large, with a maximum error also close to 2°C [see Fig. 8(b)]. Fig. 8(c) shows the recursive strategy forecasting results at different lead times, from which it can be concluded that the prediction error is small one step ahead, but from the second step onward the accuracy decreases and the maximum error reaches 4°C. This is mainly due to error accumulation and the absence of an error feedback mechanism: if the error in one step is large, the error in the next step diverges further, and the final prediction may tend toward a constant value at large lead times. The direct strategy shares a common problem with the previous strategies, namely that the error gradually increases with the number of steps. However, it overcomes error accumulation, its prediction is more stable, and its error is smaller than that of the previous strategies [see Fig. 8(d)]. Therefore, this article uses the direct strategy for the multistep prediction of SSTA.

D. Error Analysis of Different Model Predictions
This section compares the prediction effects of single models against the combined model, and of the different EMD algorithms combined with GRU. Fig. 9 shows the error metrics (mean squared error, maximum error, average error, and consistency index) for the single models (GRU, LSTM, BP, ARIMA) and the combined model (EEMD-GRU) at 1-6 steps ahead. From the figure, it can be concluded that the error of every model increases with lead time, but the error metrics of the combined model are much better than those of the single models. The mean squared error of the combined model at 6 steps ahead is smaller than that of BP at 1 step ahead. The consistency indices of the combined model are all above 0.9, while the minimum values of the single models are all around 0.5. This indicates that the forecasting ability of the combined model outperforms that of the single models. Fig. 10 compares the multistep-ahead forecasting results of the different EMD algorithms combined with GRU, for the SSTA sequence at the 0°N, 150°W location. When the number of prediction steps is 6, the mean squared error and average error of the EEMD-GRU model are about 0.1 lower, and its consistency index about 0.05 higher, than those of the other EMD algorithms combined with GRU. Therefore, the EEMD-GRU model predicts better than the other combined models, which further verifies the earlier conclusion: a lower reconstruction error of an improved EMD algorithm does not guarantee better prediction, which also relies on the characteristics of the data itself.

E. Lag Analysis of Different Model Predictions
Multistep prediction models often suffer from lag, so the prediction lags of the different models are investigated here. Fig. 11 gives the 1-6-step-ahead forecasting results for the different models, from which the lags of the combined model and the single models are compared. From Fig. 11(a), it can be concluded that when the EEMD-GRU model is used for multistep forecasting, the accuracy is highest one step ahead and gradually decreases as the number of steps increases. Due to the EEMD decomposition, the predicted trend curve is relatively smooth, resulting in larger errors at the crests and troughs. However, the model forecasts the future trend of the data well and avoids the prediction lag of the traditional models. Fig. 11(b) shows the multistep forecasting results of the single GRU model, from which it can be concluded that the lag phenomenon becomes more serious as the number of prediction steps increases. In terms of prediction error and degree of lag, the LSTM model is slightly inferior to the GRU model [see Fig. 11(c)], and the ARIMA multistep prediction has the most severe lag, with the predicted values for the next few months being essentially constant [see Fig. 11(d)]. Therefore, for SSTA multistep forecasting, the single GRU, LSTM, and ARIMA models have serious lag problems, which can be effectively mitigated by using a hybrid EMD and GRU model.

F. Model Application
To validate the hybrid model in practical application, the EEMD-GRU model is applied to multistep prediction of the Niño3.4 index (the average SSTA over the Niño3.4 region). Fig. 12(a) and (b) shows the prediction results for the 2015 El Niño and 2020 La Niña events, respectively. The 12-month Niño3.4 index (light blue area) is used to forecast the change of the Niño3.4 index (red and blue areas) over the next 6 months. From the figure, it can be concluded that the last three steps have larger prediction errors, but the first three steps are highly accurate, and the predicted trend of the Niño3.4 index is consistent with the real trend. The average Niño3.4 index is close to 1.5°C in the second half of 2015 and close to −1°C at the end of 2020. According to the definitions of El Niño and La Niña, the prediction results indicate that a strong El Niño formed in the second half of 2015 and a moderate La Niña formed from the end of 2020 to the beginning of 2021, both consistent with reality. Therefore, although the prediction errors of the later steps of the EEMD-GRU model are slightly large, the trend of the Niño3.4 index can be predicted many months in advance, providing a reference for the analysis and forecasting of extreme weather.

V. CONCLUSION
In this article, a multistep-ahead prediction model based on hybrid EMD and GRU was proposed to predict SSTA data multiple months in advance. SSTA data for the Niño3.4 region from January 1980 to December 2020 were used as the case study. The main findings of this article are as follows: 1) The improved EMD algorithm was used to reduce the nonstationarity of the SSTA. Four statistical evaluation metrics (MAE, MRE, RMSE, and CI) were used to assess the performance of the different EMD algorithms for decomposition-based forecasting. It was concluded that the reconstruction error of the EEMD decomposition has a significant impact on the prediction, that using it as part of the prediction effectively improves the accuracy, and that the resulting prediction is better than that of the other EMD combination models.
2) The prediction errors of the recursive, direct, and multioutput multistep forecasting strategies were compared, and it was concluded that the direct strategy has the highest prediction accuracy for SSTA, though it still suffers from error growth with increasing lead time. 3) Comparing the forecasting accuracy of the hybrid EEMD-GRU model with that of the single GRU, LSTM, and ARIMA models, it was concluded that the combined model forecasts better, predicts the SSTA development trend further in advance, and alleviates the multistep-ahead forecasting lag of the traditional models. 4) The EEMD-GRU model was applied to the Niño3.4 regional SSTA prediction, and the results show that its forecasts describe the regional SSTA change process well. In follow-up studies, the focus will be on overcoming the growth of error with increasing forecast lead time. Because SSTA data have few features, it may be difficult to further improve multistep prediction accuracy with statistical methods alone. Future research can consider ocean dynamics and combine numerical with statistical methods to build multistep prediction models.