An Efficient Oceanic Eddy Identification Method With XBT Data Using Transformer

Oceanic mesoscale eddies are relatively small, short-lived circulation patterns that are approximately in geostrophic balance. Eddies are omnipresent and can be characterized by dynamic sea level anomalies and temperature anomalies, which has made the Sea Level Anomaly (SLA) the mainstream basis for eddy identification. Unfortunately, nearly 90% of the dynamic sea level anomalies caused by oceanic eddies cannot be observed because of the insufficient resolution of satellite altimeters. Combining in situ Expendable Bathythermograph (XBT) profile data with sea surface temperature data calibrated by the altimeter, this article proposes a deep neural network to identify subsurface oceanic eddies and invert the corresponding sea surface eddy properties. First, the eddies identified by SLA are purified to match the corresponding vertical profile dataset. Then, a neural network with a self-attention mechanism is constructed by combining the eddy vertical profile structure with temporal and spatial characteristics and external features to identify the eddy effectively. Furthermore, the eddy properties, including radius, amplitude, and energy, are inverted from XBT profile and SST features. Finally, the experimental results show that the accuracy of eddy classification can reach 98.22%, which demonstrates that vertical profiles can be used to classify eddies effectively. Subsequent reclassification of the profiles outside altimeter-identified eddies recaptured about 36% of them as eddies. The authenticity of the newly identified eddies is supported by statistical model validation as well as validation against the sea surface temperature anomaly. These results indicate that subsurface eddy identification can be implemented from vertical profiles with deep learning.


I. INTRODUCTION
OCEANIC eddies are a common and complex seawater flow phenomenon in the ocean [1], [2]. The term refers generally to rotating motions of seawater whose scale is smaller than that of Rossby waves and which are governed by the quasi-geostrophic conservation equation [3]. The oceanic eddy structure plays an indispensable role in transferring nutrients, organic salts, heat, and energy in the ocean [4]. Eddies also drive part of the kinetic energy of the ocean circulation, so they have a significant influence on the direction and intensity of ocean currents [5] and affect marine biological productivity [6]. Specifically, the upwelling caused by mesoscale eddies brings nutrients from the lower layer of the ocean to the euphotic layer, resulting in plankton blooms that raise marine primary productivity. Consequently, the study of oceanic eddies has always been an active research topic in modern oceanography.
Traditional mesoscale eddy detection methods depend on the physical properties of the eddy [7]. The Okubo-Weiss (OW) parameter method defines the region where the OW parameter is less than a certain threshold as an eddy [2]; it enabled the automatic extraction of oceanic eddies and has been widely used. Nevertheless, the OW method requires experts to set the threshold manually, which is time-consuming and laborious, and it generalizes poorly, so it cannot be applied to all sea areas. At the same time, deep learning has been explored in a preliminary way for eddy identification and tracking [8], [9], for example by using an enhanced multiscale convolutional network to identify eddies from SSH [10], but deep learning methods are not yet mature in this field. Eddies can also be identified by automatic algorithms applied to gridded maps of the Sea Level Anomaly (SLA) [11]. However, since the SLA data are limited in spatial resolution and small eddies cannot be well distinguished, the SLA grid products can only capture about 10% of the total number of eddies [12], [13]. Two aspects follow from this. First, altimeter gridded surface products capture eddies mainly through the sea surface anomalies that the eddies cause [14], so to improve the identification ability, multisource data of oceanic eddies should be integrated [15]. Second, numerical computing ability is advancing rapidly, so it is urgent to study data-driven eddy identification methods.
For multisource data of oceanic eddies, submarine in situ observation data should be combined with surface data to improve the identification ability [16]. The main reason is that submarine data can complement weak sea surface features, thus capturing eddies that have specific profile characteristics but do not cause fluctuations in sea surface features [17], [18]. Therefore, by analyzing the characteristic attributes of the deep-sea area [19], it is expected that oceanic eddies that cannot be identified by traditional physical methods or remote sensing methods, especially submesoscale eddies, can be identified. With the increasing maturity of the Expendable Bathythermograph (XBT), a large number of ocean profile data have been accumulated [20]. By calibrating the XBT profiles with the altimeter eddy dataset, the vertical structure of eddies with different polarities can be analyzed. Fig. 1 shows the global distribution of anticyclonic eddies (AE) and cyclonic eddies (CE) in 2018. The data surveyed by XBT can be employed as the original dataset and, after a series of operations such as purification and screening, used as the dataset for oceanic eddy identification.
For data-driven eddy identification methods, deep learning has been demonstrating preliminary advantages [21], [22]. Since AlexNet was proposed [23], various deep neural networks have emerged and performed strongly on problems in computer vision and natural language processing (NLP). Their end-to-end nature makes it unnecessary for experts to define many thresholds, which has helped deep learning develop rapidly and spread widely. At the same time, with the continuous improvement of remote sensing technology, remote sensing satellites can capture more abundant ocean images as training data for deep learning to identify oceanic eddies. However, the use of the convolutional neural network (CNN) to identify oceanic eddies has certain limitations. On the one hand, CNN usually solves tasks in the field of computer vision, and it is difficult to apply to the pure numerical data observed by satellite altimeters. On the other hand, the local perception of CNN in the image domain is not fully applicable to such data analysis. These problems can be addressed using recurrent neural networks (RNN) and their variants, which are usually applied to traditional NLP tasks. The self-attention mechanism was later proposed in the Transformer model [24], [25], [26] and has advantages in capturing internal correlations. As research has developed, the self-attention mechanism has been extended from NLP tasks to categorical recognition tasks and has achieved excellent results in the tradeoff between speed and accuracy [27]. Consequently, prior research indicates that a deep neural network model with a self-attention module can also be employed to realize the identification and inversion of oceanic eddies.
The specific objective of this study is to construct a deep learning method for oceanic eddy identification and inversion with remote sensing and in situ observation data [28]. An excellent oceanic eddy identification network should be composed of a feature extraction module and a high-performance classification module [29], [30]. The feature extraction module is responsible for extracting the detailed information in the shallow layers of the network and the semantic information in the deep layers to obtain the feature map, which is then fed into the high-performance classification module to obtain the final classification result [21]. Although the above methods can achieve effective classification results, further problems must be considered at the same time [31]. As mentioned earlier, traditional methods can only capture about 10% of the eddies, and many eddies cannot be identified by the traditional SLA algorithm. Therefore, an excellent oceanic eddy classification network should not only classify the existing eddies with high accuracy but also capture and classify the eddies that are missed by traditional algorithms, thereby supplementing the identification ability of the altimeter. In addition, the authenticity of the eddies newly identified from XBT profiles needs to be demonstrated through their distribution pattern and vertical characteristics, and their corresponding properties need to be inverted [32].
The rest of this article is organized as follows. Section II describes the original data. Section III introduces the original data processing method and describes the proposed neural network framework. In Section IV, the experimental results are obtained and analyzed. Finally, Section V concludes the article.

II. DATASETS
Since the network model built by deep learning requires end-to-end data, an excellent dataset is essential, which will directly affect the performance and accuracy of the network model. With the continuous development of ocean observation satellites, not only ocean surface data such as sea surface temperature (SST), sea surface height (SSH), sea surface salinity (SSS), and chlorophyll concentration are measured [33] but also ocean submarine data are constantly accumulating, which provide rich data for our research [34]. We use SLA data, vertical profile data, and sea surface temperature anomaly (SSTA) data.

A. SLA Data
The SLA data adopted in this article are the delayed-time products distributed by Archiving, Validation, and Interpretation of Satellite Oceanographic (AVISO) data, merged from the T/P, Jason-1, Jason-2, Jason-3, and ENVISAT missions [35]. This study uses 17 years of SLA data, from 2002 to 2018, with a 0.25° × 0.25° spatial resolution and daily temporal resolution as the original data [36]. However, the original data observed by the satellite altimeter cannot be used directly for eddy identification, so they are processed with the four-step eddy identification scheme proposed by Liu et al. [37]. Through this identification scheme, a complete and comprehensive oceanic eddy dataset is created, which is available at http://data.casearth.cn/ (Data ID: XDA19090202) [38], [39].

B. Vertical Profiles Data
The vertical profile data used in this study is XBT data, which is one of the best sources of altimeter in situ profiles. The quality control and processing of vertical profile data come from the Coriolis Center. To ensure the high-quality data required by the neural network, only the profiles marked as "good" or "possibly good" will be downloaded.
XBT is a single-use measurement sensor that is mainly used to measure the seawater temperature profile quickly during ship navigation [20]. The vertical profile data in this article are available from https://www.noaa.gov/. Since eddies affect various properties in the ocean, especially temperature, the temperature anomaly (TA) in the vertical profile data is selected as the characteristic of the eddy vertical structure for eddy identification [17]. The profile anomaly is calculated from XBT measurements in the following steps. First, additional data filtering is applied: a profile is retained only if its first measurement is shallower than 10 m, its last measurement is deeper than 700 m, and it has at least 30 valid data points within the depth range of 0-700 m. Then, these high-quality profiles from the global ocean are linearly interpolated at an interval of 1 m to obtain the final data [35].
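The screening and interpolation steps above can be sketched as follows. This is a minimal illustration with NumPy rather than the authors' processing code: the function name `interpolate_profile` and its arguments are hypothetical, and grid points shallower than the first measurement are filled with the first measured value, which is an assumption of this sketch (it is simply how `numpy.interp` treats points outside the data range).

```python
import numpy as np

def interpolate_profile(depth_m, temp_c, top=0.0, bottom=700.0, min_points=30):
    """Screen one XBT profile and resample it onto a 1 m grid.

    Returns the temperature on the grid top..bottom (inclusive), or None if
    the profile fails the screening rules described in the text.
    """
    depth = np.asarray(depth_m, dtype=float)
    temp = np.asarray(temp_c, dtype=float)
    finite = np.isfinite(depth) & np.isfinite(temp)
    depth, temp = depth[finite], temp[finite]
    if depth.size == 0:
        return None
    order = np.argsort(depth)
    depth, temp = depth[order], temp[order]

    # Rule 1: first measurement shallower than 10 m, last deeper than 700 m.
    if depth[0] >= 10.0 or depth[-1] <= bottom:
        return None
    # Rule 2: at least `min_points` valid points inside the 0-700 m range.
    in_range = (depth >= top) & (depth <= bottom)
    if in_range.sum() < min_points:
        return None

    # Linear interpolation at 1 m spacing. np.interp holds the end values
    # constant outside the measured range (assumption of this sketch).
    grid = np.arange(top, bottom + 1.0, 1.0)
    return np.interp(grid, depth, temp)
```

A profile that passes the screening is returned as a 701-point array (0 m to 700 m), from which the 20-700 m segment used later can be sliced.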

C. SSTA Data
SST is a physical quantity that represents the thermal state of seawater, usually the temperature value of the sea surface [40]. However, owing to the influence of seasons, latitude, and ocean currents, the SST varies greatly between locations in the ocean [41], so it cannot be used directly as verification data for new eddies. SSTA is the difference between the SST at a given location and time and the normal temperature at that location. The remote sensing system has collected and released a global daily SSTA dataset with 0.25° × 0.25° resolution. Therefore, the SSTA dataset was selected for secondary validation of the newly identified eddies to increase the identification robustness [42].

A. Eddy Dataset Preprocessing
The vertical profile data are labeled by spatio-temporal calibration against the SLA eddy dataset. According to its rotation direction, an eddy is classified as AE or CE, and a profile receives the corresponding label when it lies within the effective boundary of an eddy identified from SLA. Profiles that are not inside the effective boundary of any SLA-identified eddy are defined as outside eddies (OE). Since vertical profile data are measured by ships during navigation, they are mostly distributed along shipping routes. The geographical distribution of all eddies is shown in Fig. 3(a). Numerous eddies are located in the positions marked with red boxes, namely the North Atlantic Ocean, the Arabian Sea, the Kuroshio Extension, and the Hawaiian Islands. In addition, comparing Fig. 3(b) and (c), it can be observed that the sum of the numbers of AE and CE is less than the number of OE. From the data in Fig. 2, the sum of AE and CE accounts for one-third of the total number, and the remaining two-thirds are OE. However, this does not reflect the real situation: an OE determined by SLA may be a real eddy that the satellite altimeter simply fails to recognize. In other words, whether a profile is a true eddy or a noneddy (NE) cannot be determined by satellite altimeters alone; its vertical structural features should also be considered.
The theoretical basis for purifying the dataset is that AE should have positive TA and CE should have negative TA. However, it is normal for some AE classified by SLA to have negative TA and for some CE to have positive TA; these are referred to as cold anticyclonic eddies (CAE) and warm cyclonic eddies (WCE), respectively, and collectively as abnormal eddies [43]. We eliminated these eddies before classification and identification to ensure the quality of the dataset. In addition, to prevent extreme weather such as rainstorms and typhoons, as well as ships traveling on the sea surface, from affecting the vertical structure of the eddy profile, the near-surface depth is set to 20 m [35], [36]. However, it has not yet been determined which profiles are NE. It is generally assumed that profiles that are outside the effective eddy boundary identified at the existing altimeter resolution and that have weaker vertical structure signals are NE [43]. Therefore, the dataset is processed as follows. First, the 20-700 m vertical profile data are summed, and the AE (CE) profiles with positive (negative) TA are selected, which eliminates the abovementioned abnormal eddies. Then, the gradient values of all 20-700 m TA profiles from 2002 to 2018 are summed and sorted, and the 30% of the profiles with the weakest vertical structure signal are chosen as NE. Furthermore, the vertical profile data have a very high resolution, while the SLA data are generated by linear interpolation at a resolution of 0.25° × 0.25°, which introduces certain deviations and anomalies, so some abnormal data and data with high similarity are eliminated. After the above screening, 27485 AE, 25375 CE, and 53451 NE profiles were selected as the final eddy classification and recognition dataset.
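The purification and NE selection rules described above can be illustrated with a short NumPy sketch. It is not the authors' code: the function names are hypothetical, and measuring the "vertical structure signal" as the summed absolute vertical gradient is an assumption made here so that "weakest" has a well-defined ordering.

```python
import numpy as np

def purify_mask(ta, labels):
    """Keep AE profiles with positive summed TA and CE profiles with negative summed TA.

    ta:     array of shape (k, n_depth), TA over 20-700 m
    labels: length-k sequence of "AE" or "CE" from the SLA eddy dataset
    Returns a boolean mask; False marks abnormal eddies (CAE / WCE).
    """
    ta = np.asarray(ta, dtype=float)
    labels = np.asarray(labels)
    total = ta.sum(axis=1)
    return ((labels == "AE") & (total > 0)) | ((labels == "CE") & (total < 0))

def noneddy_mask(ta, fraction=0.3):
    """Flag the `fraction` of profiles with the weakest vertical structure.

    Signal strength is taken as the summed absolute vertical gradient
    (assumption of this sketch, see the lead-in).
    """
    ta = np.asarray(ta, dtype=float)
    strength = np.abs(np.gradient(ta, axis=1)).sum(axis=1)
    n_pick = int(round(fraction * len(strength)))
    mask = np.zeros(len(strength), dtype=bool)
    mask[np.argsort(strength)[:n_pick]] = True
    return mask
```

Applying `purify_mask` to the AE and CE profiles and `noneddy_mask` to the candidate NE profiles would yield the three classes used for training, before the additional removal of anomalous and near-duplicate data mentioned in the text.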

B. EDTR: Eddy Transformer
With a neural network, eddy identification can be cast as a three-class classification problem. The data used in this experiment are vertical profile data, and the k vertical profiles are represented as a dataset $\chi = \{(x_i, y_i)\}_{i=1}^{k}$, where $x_i$ is the i-th vertical profile and $y_i$ is its ground truth as observed by the satellite altimeter. Since eddy detection and classification is a three-class problem, $C = \{0, 1, 2\}$ is set as the label set, so that the ground truth of each vertical profile is one of AE, CE, and NE. Finally, the cross-entropy function is used as the objective function to reflect the accuracy of the probabilistic classification.
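As a concrete illustration of the objective, the following sketch evaluates the cross-entropy loss for the three labels AE, CE, and NE. It uses NumPy for readability and is not the training code of the paper; the mapping of the integers 0, 1, 2 to AE, CE, NE follows the order in which the text lists them and is otherwise an assumption.

```python
import numpy as np

# Assumed mapping, following the order in the text: C = {0, 1, 2}.
CLASSES = {0: "AE", 1: "CE", 2: "NE"}

def softmax(logits):
    """Row-wise softmax with the usual max-shift for numerical stability."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(logits, labels):
    """Mean negative log-probability assigned to the true class.

    logits: array of shape (k, 3), raw network outputs
    labels: length-k integer array with values in {0, 1, 2}
    """
    probs = softmax(np.asarray(logits, dtype=float))
    labels = np.asarray(labels, dtype=int)
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.log(picked).mean())
```

For uninformative logits (all zeros) the loss equals ln 3, the value expected from a uniform guess over three classes, and it falls toward 0 as the network assigns more probability to the correct label.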
Based on the Vision Transformer (ViT) framework [44], the deep neural model for eddy identification, namely the eddy transformer (EDTR), is constructed. It consists of the embedding block, a module of X identical encoder blocks (EB_X), and the output block. The layer normalization (LN) block, the multihead self-attention (MSA) module, and local residual connections constitute the EB_X module, which ensures sufficient feature extraction and effectively reduces information loss. The EDTR model adopts ViT as its basic framework because ViT uses a self-attention mechanism to handle long sequences and pays more attention to the connections within them. The vertical profile data can also be regarded as a long sequence, so the EDTR model can be used to analyze the relationship between profile data at different depths. The model was built and run on a server with an Intel(R) Core(TM) i7-11700K @ 3.60 GHz, 64 GB of memory, an NVIDIA GeForce RTX 3090, and Windows 10. Python is used as the core programming language, and the neural network framework is built with the GPU version of PyTorch 1.7.1.
The eddy profile can be expressed as $x_i$, a D-dimensional vertical feature vector. Four 20-700 m TA profiles are used as the input of the EDTR model, and the output is the corresponding eddy type. Fig. 4 shows the main framework of EDTR in detail.
The embedding block includes a reshaping layer and a linear layer. Although the input of the standard ViT model is a 1-D sequence like the eddy profile, the input must be reshaped into a feature map X to reduce the length of the input sequence, where B is the batch size and N is the length of the input sequence. Then, the feature map X is projected to the P dimension through a trainable linear layer for feature extraction. A new feature map $X \in \mathbb{R}^{B \times N \times P}$ is thus obtained through the embedding block.
To realize the final classification, the embedding block is followed by concatenation of the class token, which is a learnable 3-D feature vector $X_{\mathrm{class}} \in \mathbb{R}^{B \times 1 \times P}$. In addition, the TA profile has an order relationship like a sequence, so $\mathrm{PE} \in \mathbb{R}^{B \times (N+1) \times P}$ is the module that preserves position information, like the traditional positional encoding. However, whereas the traditional positional encoding is calculated with sine and cosine functions, the PE block uses a learnable positional encoding.
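The shape bookkeeping of the embedding block, class token, and learnable positional encoding can be traced with a small NumPy sketch. All sizes below (D = 680, N = 40, P = 64) are hypothetical values chosen only so that D is divisible by N; the text does not state them. Splitting each profile into N equal segments is likewise an assumption in the spirit of ViT patching, and the randomly drawn arrays stand in for parameters that would be learned.

```python
import numpy as np

def embed(profiles, n_tokens, p_dim, rng):
    """Reshape, project, prepend a class token, and add positional encoding.

    profiles: array of shape (B, D); returns an array of shape (B, n_tokens + 1, p_dim).
    Weights are drawn at random here; in a trained model they are learned.
    """
    b, d = profiles.shape
    if d % n_tokens != 0:
        raise ValueError("D must be divisible by the number of tokens")
    seg = d // n_tokens

    x = profiles.reshape(b, n_tokens, seg)              # reshaping layer: (B, N, D/N)
    w = rng.normal(0.0, 0.02, size=(seg, p_dim))        # linear layer weights
    x = x @ w                                           # feature map X: (B, N, P)

    cls = rng.normal(0.0, 0.02, size=(1, 1, p_dim))     # learnable class token
    cls = np.broadcast_to(cls, (b, 1, p_dim))           # X_class: (B, 1, P)
    x = np.concatenate([cls, x], axis=1)                # (B, N + 1, P)

    pos = rng.normal(0.0, 0.02, size=(1, n_tokens + 1, p_dim))  # learnable PE
    return x + pos                                      # PE broadcast over the batch

rng = np.random.default_rng(0)
batch = rng.normal(size=(8, 680))        # 8 hypothetical TA profiles
tokens = embed(batch, n_tokens=40, p_dim=64, rng=rng)
```

The positional encoding is stored once and broadcast over the batch, which gives the same result as an explicit B × (N+1) × P tensor with identical entries for every sample.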
Next is the EB_X module, a stack of X identical encoder blocks (EB), each of which contains two residual structures. One of the residual structures includes the LN and MSA modules. MSA was first proposed by Vaswani et al. in 2017 as a core module of machine translation models, replacing the traditional RNN in handling long sequences [24]. The LN layer and the MLP module form the other residual structure. The MLP module is a serial structure consisting of two linear layers, two dropout layers, and a GELU nonlinear activation function. Since the size of the feature map does not change through an EB, several EBs can be stacked for feature extraction.
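A minimal NumPy sketch of one encoder block is given below to make the two residual structures explicit. It is a simplified forward pass and not the EDTR implementation: dropout is omitted (as at inference time), LN has no learnable scale or shift, biases are left out, and the sizes and the helper `init_params` are hypothetical.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def gelu(x):
    # Widely used tanh approximation of GELU.
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x ** 3)))

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def multihead_self_attention(x, p, n_heads):
    b, n, dim = x.shape
    h = dim // n_heads

    def split(t):  # (B, N, P) -> (B, heads, N, P / heads)
        return t.reshape(b, n, n_heads, h).transpose(0, 2, 1, 3)

    q, k, v = split(x @ p["wq"]), split(x @ p["wk"]), split(x @ p["wv"])
    scores = q @ k.transpose(0, 1, 3, 2) / np.sqrt(h)   # scaled dot product
    out = softmax(scores) @ v                           # (B, heads, N, h)
    out = out.transpose(0, 2, 1, 3).reshape(b, n, dim)  # merge heads
    return out @ p["wo"]

def encoder_block(x, p, n_heads):
    # Residual structure 1: LN -> MSA, added back to the input.
    x = x + multihead_self_attention(layer_norm(x), p, n_heads)
    # Residual structure 2: LN -> MLP (linear, GELU, linear), added back.
    return x + gelu(layer_norm(x) @ p["w1"]) @ p["w2"]

def init_params(dim, hidden, rng):
    shapes = {"wq": (dim, dim), "wk": (dim, dim), "wv": (dim, dim),
              "wo": (dim, dim), "w1": (dim, hidden), "w2": (hidden, dim)}
    return {name: rng.normal(0.0, 0.02, size=s) for name, s in shapes.items()}

rng = np.random.default_rng(0)
x = rng.normal(size=(2, 41, 64))                 # (B, N + 1, P), hypothetical sizes
for _ in range(6):                               # stack of six blocks
    x = encoder_block(x, init_params(64, 256, rng), n_heads=8)
```

Because every block maps a B × (N+1) × P tensor to a tensor of the same shape, the loop can stack any number of blocks, which is what allows the comparison of 1, 2, 4, 6, and 12 blocks reported later.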
The output block includes an LN layer, the ECT module, and a linear layer. The ECT is the feature extraction layer, which slices out the part of the feature map that corresponds to the class token. Since the class token is a learnable 3-D feature vector, it has learned enough features for eddy classification. The ST-A modules, which carry temporal and spatial characteristics and external features, are then concatenated to further improve the classification performance. Finally, the classification results of AE, CE, and NE are obtained through the linear layer and softmax.
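The output stage can be sketched in the same style. The encoding of the spatio-temporal features below (month, latitude, and longitude scaled to roughly unit range) is purely hypothetical, since the text does not specify how the ST-A features are represented; the weights are random stand-ins for learned parameters.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def output_block(tokens, st_features, w, b):
    """LN -> slice class token (ECT) -> concatenate ST-A features -> linear -> softmax.

    tokens:      (B, N + 1, P) output of the encoder stack
    st_features: (B, F) temporal/spatial/external features (hypothetical encoding)
    w, b:        linear layer of shape (P + F, 3) and (3,)
    Returns class probabilities of shape (B, 3) for AE, CE, NE.
    """
    cls = layer_norm(tokens)[:, 0, :]                 # class token only
    fused = np.concatenate([cls, st_features], axis=1)
    return softmax(fused @ w + b)

rng = np.random.default_rng(0)
tokens = rng.normal(size=(4, 41, 64))
month = np.array([1, 4, 7, 10]) / 12.0
lat = np.array([35.0, -30.0, 10.0, 42.0]) / 90.0
lon = np.array([140.0, -60.0, 65.0, -150.0]) / 180.0
st = np.stack([month, lat, lon], axis=1)              # (B, 3)
probs = output_block(tokens, st, rng.normal(0.0, 0.02, size=(67, 3)), np.zeros(3))
pred = probs.argmax(axis=1)                           # 0 = AE, 1 = CE, 2 = NE (assumed order)
```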

A. Vertical Structure Characteristics of Eddy
Before proceeding with eddy identification and classification, it is important to understand the vertical characteristics of the different eddy types. Fig. 5 shows the vertical profile data calibrated by the altimeter. The thick line in Fig. 5(a) is the global average of TA for all eddy profiles over the 17 years. The thin line is a randomly selected eddy vertical profile without smoothing. Although it is noisy, it shows the vertical structure characteristics of a general eddy. Fig. 5(b) shows the annual average vertical profile data from 2002 to 2018 and the average vertical profile data for all years. The TA of AE at 20 m below the sea surface is about 0.6 °C, while the TA of CE is slightly weaker, at about −0.4 °C. At 700 m below the sea surface, AE fluctuates in the range of 0.65 °C to 1.45 °C, and CE stays within the range of −0.6 °C to −0.5 °C, with little fluctuation. Fig. 5(c) shows the monthly average vertical profile data over all years. The eddy profile differs somewhat between months, so the month is integrated into the network as a temporal feature to improve the eddy classification accuracy. Fig. 5(d) shows the quarterly average vertical profile data over all years. AE and CE have smaller TA in winter and larger TA in summer and autumn, which shows that changes in the external environment also affect the eddy profile. Looking at Fig. 5 as a whole, AE has a stronger TA than CE at 20 m and 700 m below the sea surface, but the TA reaches its maximum at 80-120 m. These comparisons show that the month has a considerable impact on the eddy profile. Although the season also has a certain influence on the eddy profile, the month already reflects the season to a certain extent. Therefore, to improve the operating efficiency of the model, only the month is added as the temporal feature of the eddy to improve the classification accuracy.
Because eddies are affected by ocean currents, climatology, and the Earth's rotation, spatial features cannot be ignored as eddy attributes. Fig. 6 summarizes vertical profile data at different geographic locations. Fig. 6(a) and (b) represent the average vertical structure of TA for every 10° of latitude in the northern and southern hemispheres, respectively. The overall eddy intensity in the northern hemisphere is higher than that in the southern hemisphere, there is a stronger eddy core at 40-200 m at low latitudes, and the intensity of the eddy core gradually decreases with increasing latitude. Fig. 6(c) and (d) show the average vertical structure of TA for every 30° of longitude in the eastern and western hemispheres. Since there are more eddies in the western hemisphere than in the eastern hemisphere, the eddy profile in the western hemisphere has more obvious vertical structure characteristics, while the eddy profile in the eastern hemisphere contains noise and burr signals. Compared with the temporal differences in Fig. 5, the different geographic locations in Fig. 6 have a greater impact on the vertical characteristics of the eddy. Therefore, the geographic location of the eddy is also input into the model as a spatial feature.

B. Result of EDTR Experiment
This experiment divides the data into three parts: 1) training, 2) validation, and 3) test datasets, in which the training dataset accounts for 60% and the validation and test datasets account for 20% each. The model is trained with a mini-batch size of 64, uses cross-entropy as the loss function, and is optimized with stochastic gradient descent (SGD). The learning rate is adjusted with the ReduceLROnPlateau strategy, starting from an initial learning rate of 1e-3; specifically, the reduction factor is 0.1 and the patience is set to 5 epochs. Finally, all trainable parameters are randomly initialized to ensure that the gradient does not drop to 0.
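The data split and the plateau-based learning-rate rule can be illustrated with standard-library Python. The split function and its seed are hypothetical, and `PlateauLR` is a deliberately simplified stand-in that only mirrors the idea of the ReduceLROnPlateau strategy (multiply the learning rate by the factor once the monitored loss has failed to improve for more than `patience` epochs); it does not reproduce every option of the PyTorch scheduler.

```python
import random

def split_indices(n, train=0.6, val=0.2, seed=0):
    """Shuffle indices 0..n-1 and split them 60% / 20% / 20%."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = int(round(n * train))
    n_val = int(round(n * val))
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

class PlateauLR:
    """Simplified plateau rule: lr <- lr * factor after too many stale epochs."""

    def __init__(self, lr=1e-3, factor=0.1, patience=5):
        self.lr = lr
        self.factor = factor
        self.patience = patience
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        if val_loss < self.best:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        if self.bad_epochs > self.patience:
            self.lr *= self.factor
            self.bad_epochs = 0
        return self.lr

# Total number of profiles reported in the text: 27485 AE + 25375 CE + 53451 NE.
train_idx, val_idx, test_idx = split_indices(27485 + 25375 + 53451)
```

With these settings the learning rate stays at 1e-3 while the validation loss keeps improving and drops to 1e-4 only after the loss has stalled for six consecutive epochs.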
The confusion matrix is a standard format for expressing accuracy evaluation in classification tasks. The accuracy of the eddy classification and the precision, recall, and specificity of each eddy type are calculated from the confusion matrix to quantify the classification results. The specific calculation formulas are as follows:

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \qquad \mathrm{Precision} = \frac{TP}{TP + FP},$$

$$\mathrm{Recall} = \frac{TP}{TP + FN}, \qquad \mathrm{Specificity} = \frac{TN}{TN + FP},$$

where TP, TN, FP, and FN are the numbers of true positives, true negatives, false positives, and false negatives, respectively.

Fig. 7 details the accuracy and loss of the different models on the training dataset. Res34, MobV2, and EffV1 denote eddy classification models that use ResNet34, MobileNetV2, and EfficientNetV1 as the backbone, respectively [45], [46], [47], [48], while EB*1, EB*2, EB*4, EB*6, and EB*12 indicate that the EDTR model uses the corresponding number of EB blocks. Looking at Fig. 7 as a whole, Res34 and MobV2 have poor accuracy, reaching only 87%, and the corresponding loss declines slowly. In contrast, the accuracy of EDTR and EffV1 keeps improving and gradually stabilizes after 30 epochs. At the end of training, the accuracy reaches about 98%, and the corresponding loss decreases rapidly in the first 10 epochs. This means that the classification of eddies by the EDTR and EffV1 models is approaching saturation, so further performance testing on the test dataset is required.
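The confusion-matrix metrics described above can be computed per class in a one-versus-rest manner, as in the following standard-library sketch. The function names are hypothetical, the label order 0 = AE, 1 = CE, 2 = NE is assumed, and a ratio with a zero denominator is reported as 0.0 by convention in this sketch.

```python
def confusion_matrix(y_true, y_pred, n_classes=3):
    """m[t][p] counts samples with true class t predicted as class p."""
    m = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        m[t][p] += 1
    return m

def _ratio(num, den):
    return num / den if den else 0.0

def accuracy(m):
    total = sum(sum(row) for row in m)
    return _ratio(sum(m[i][i] for i in range(len(m))), total)

def class_metrics(m, c):
    """Precision, recall, and specificity of class c (one-versus-rest)."""
    total = sum(sum(row) for row in m)
    tp = m[c][c]
    fn = sum(m[c]) - tp                    # true c, predicted otherwise
    fp = sum(row[c] for row in m) - tp     # predicted c, truly otherwise
    tn = total - tp - fn - fp
    return {
        "precision": _ratio(tp, tp + fp),
        "recall": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
    }

# Tiny worked example with six profiles.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred)
ae = class_metrics(cm, 0)
```

In this example four of the six profiles are classified correctly, and the AE class has one true positive, one false negative, and one false positive, so its precision and recall are both 0.5 and its specificity is 0.75.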
To further evaluate the performance of each model, Table I summarizes the accuracy and running time of each model on the test dataset. Analysis of Table I shows that as the number of EB blocks in the EDTR model increases, the running time gradually increases, but the accuracy does not continuously improve. For the Res34 and MobV2 models, the accuracy is only 87%. Finally, although the accuracy of EffV1 can reach 97.6%, the running time is too long to achieve the real-time detection of eddies in subsequent experiments. Table II summarizes the quantitative evaluation of different models on the test dataset and lists the precision, recall, and specificity of each type of eddy in detail. In conclusion, both the EDTR and EffV1 models can realize the classification of eddies; in particular, the EDTR model using six EB blocks has an accuracy of 98.22%. Finally, considering both accuracy and running time, we select the EDTR model using six EB blocks.

TABLE I. COMPARISON OF ACCURACY AND RUNNING TIME OF DIFFERENT EB BLOCKS ON THE TEST DATASET

TABLE II. PERFORMANCE COMPARISON OF DIFFERENT EB BLOCKS ON THE TEST DATASET

C. Identification of New Eddies
The EDTR model can realize end-to-end eddy identification without expert-defined thresholds, which has important implications in oceanography. The main objective is to use submarine in situ observational data to identify new eddies that do not produce surface signatures. Therefore, the OE vertical profile dataset, which consists of profiles that could not be identified by the altimeter but may be eddies, is input into the EDTR model. From 2002 to 2018, there were 202655 OE vertical profiles. Since only the vertical profile data are available and there is no ground truth, it is impossible to judge directly whether a newly identified eddy is an actual eddy. Hence, two methods are used to verify the authenticity of the new eddies. The first compares the distribution pattern of the newly identified eddy profiles with that of the real eddy profiles calibrated by the satellite altimeter. If the two have the same distribution pattern, a new eddy has been identified with high probability. The second is SSTA verification, which proceeds as follows: the newly identified eddies with the same time and location as the actual eddies are screened out, and the sum of the average 20-200 m TA values of the two types of eddies is calculated. If this sum has the same trend as the SSTA at the same time and position, the newly identified eddy is considered real.

1) Validation of Distribution Pattern:
We first counted the number of eddies identified by the altimeter (called Alt-eddy) and the new eddies identified by the EDTR model (called EDTR-eddy). The specific statistical proportions are shown in Fig. 8(a). Through the EDTR method, 18.04% AE (called EDTR-AE) and 17.17% CE (called EDTR-CE) were identified. We then calculated the vertical profile averages of AE and CE for EDTR-eddy and Alt-eddy, respectively. As shown in Fig. 8(b), the TA of Alt-CE is slightly stronger than that of EDTR-CE, which indicates that the altimeter can identify CE with a more obvious TA but is not sensitive to CE with a smaller TA. The TA of EDTR-AE is mildly smaller than that of Alt-AE from 20 to 480 m, and EDTR-AE has a larger TA below 480 m. On the whole, Alt-eddy and EDTR-eddy have similar vertical profile structures. We further analyzed the vertical profile structure of EDTR-eddy according to latitude. Fig. 9 summarizes the global EDTR-eddy vertical profile, and the average vertical structure for every 10° of latitude in the northern and southern hemispheres is shown in Fig. 9(a) and (b), respectively. Fig. 9(a) conforms to the vertical profile structure distribution of Alt-eddy in the northern hemisphere. In Fig. 9(b), EDTR-CE also agrees with Alt-CE, but EDTR-AE differs somewhat from Alt-AE. The TA of EDTR-AE in Fig. 9(b) is small at 20-200 m and increases gradually below 200 m, which is especially obvious from 20°S to 40°S. This can explain the larger TA of EDTR-AE in Fig. 8(b). The main reason is that XBT data are measured by ships along their routes, and fewer eddies are identified by the altimeter from 20°S to 40°S because there are fewer routes there. Moreover, the SLA grid can only capture eddies with stronger signals on the sea surface, whereas from 20°S to 40°S the sea surface signals are weaker while the submarine TA signals are stronger. Therefore, through the XBT vertical profile structure, eddies that show obvious TA changes below the sea surface but are not identified by the altimeter are identified.
Finally, to facilitate the analysis of the geographic location of EDTR-eddy, Fig. 10(a) shows the global distribution of the newly identified eddies in each 1° × 1° grid cell, and Fig. 10(b) and (c) show the distributions of EDTR-AE and EDTR-CE, respectively. Their geographic distribution is highly similar to that of the eddies identified by the altimeter in Fig. 3(b). It can be concluded that the eddy vertical structure reflects the eddy signal intensity and that TA can be used as an important feature of the eddy vertical structure.
2) SSTA Validation of EDTR-Eddy: The altimeter can identify only some eddies, and the new eddies identified by the EDTR method cannot be verified by the altimeter. The preceding verification by distribution pattern is not sufficient on its own to demonstrate authenticity, so the EDTR-eddies must also be verified independently, which is difficult. Considering that an eddy affects the vertical seawater temperature, and that the eddy density also affects the seawater temperature, the SSTA dataset is used for independent verification of the EDTR-eddies. The SSTA dataset covers the global ocean at 0.25° × 0.25° spatial resolution and daily temporal resolution, which is sufficient to verify the authenticity of the EDTR-eddies.
The specific method of independent verification is as follows. First, select Alt-eddies and EDTR-eddies at the same time and place (within the same 0.25° × 0.25° cell) as sampling points, and calculate the 20-200 m average TA of the Alt-eddy and of the EDTR-eddy, respectively. Then, find the SSTA value at the corresponding time and place in the SSTA dataset, and plot two curves over all sampling points: the SSTA values and the sum of the two TA values. If the two curves have similar trends, the newly identified eddies are considered to exist. The validation relies on similar trends rather than complete coincidence, since ocean temperature is affected not only by eddies but also by ocean currents, climatology, and submarine volcanic eruptions. Fig. 11 shows the trend over all global sampling points in 2017, in which the red line is the sum of the 20-200 m average TA of the Alt-eddy and the EDTR-eddy at each sampling point, and the blue line is the SSTA value at the corresponding sampling point. The two curves in Fig. 11 follow the same trend without coinciding, which is reasonable: the red curve is built from the average TA over 20-200 m, whereas SSTA is the temperature anomaly at the sea surface. Because the eddy acts as a whole, changes in subsurface temperature inevitably affect the sea surface temperature.
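The trend comparison above is visual. One hedged way to make it quantitative is a Pearson correlation between the two curves. The sketch below is an assumption for illustration only: the paper does not report using this statistic, and the function name `trend_similarity` and the synthetic series are hypothetical.

```python
import numpy as np

def trend_similarity(ta_sum, ssta):
    """Pearson correlation between the summed 20-200 m TA and the SSTA
    over all sampling points, ignoring points where either value is missing."""
    ta = np.asarray(ta_sum, dtype=float)
    ss = np.asarray(ssta, dtype=float)
    valid = np.isfinite(ta) & np.isfinite(ss)
    return float(np.corrcoef(ta[valid], ss[valid])[0, 1])

# Synthetic stand-ins for the two curves (not real data): the SSTA follows
# the subsurface signal with a smaller amplitude plus unrelated noise.
rng = np.random.default_rng(0)
ta_sum = np.sin(np.linspace(0.0, 6.0, 200))
ssta = 0.5 * ta_sum + 0.1 * rng.standard_normal(200)
r = trend_similarity(ta_sum, ssta)
```

A correlation close to 1 indicates similar trends even when the amplitudes differ, which matches the idea of requiring similar trends rather than complete coincidence.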
To further verify the authenticity of the new eddies identified by the EDTR method, Fig. 12 shows the eddies newly identified on January 1, 2017, overlaid on the SSTA map for the same date. The vast majority of EDTR-AE lie in positive SSTA regions and the vast majority of EDTR-CE lie in negative SSTA regions, because an AE is traditionally a warm-core eddy and a CE is a cold-core eddy. However, some new eddies have the opposite polarity, which is also plausible since the existence of such abnormal eddies has been confirmed [43], [49].
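The polarity check described above can be summarized as a single fraction. The following is a minimal sketch under stated assumptions: the function name `polarity_agreement` and the sample values are hypothetical, and the SSTA is assumed to have already been sampled at each eddy location.

```python
import numpy as np

def polarity_agreement(is_anticyclonic, ssta):
    """Fraction of eddies whose SSTA sign matches the usual polarity:
    positive SSTA for AE (warm core), negative SSTA for CE (cold core)."""
    is_ae = np.asarray(is_anticyclonic, dtype=bool)
    ssta = np.asarray(ssta, dtype=float)
    agrees = np.where(is_ae, ssta > 0, ssta < 0)
    return float(agrees.mean())

# Hypothetical eddies: three AE and two CE with their local SSTA values.
is_ae = [True, True, False, False, True]
ssta_at_eddy = [0.4, 0.1, -0.3, 0.2, -0.1]
frac = polarity_agreement(is_ae, ssta_at_eddy)
```

Eddies that do not agree are not necessarily errors; as noted above, they may correspond to abnormal eddies.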

D. Inversion of New Eddy
So far, we have used the EDTR model and the vertical profile structure to identify eddies missed by the altimeter, and we have verified the authenticity of the new eddies through their distribution pattern and SSTA. However, an eddy is not an isolated vertical profile point but a coherent structure that has a radius, produces a sea surface amplitude, and carries a certain amount of energy. Because the eddy properties are inherently correlated with the vertical structure, we invert the eddy properties from the vertical profile structure and SST.
Radius is the most basic property of an eddy and is strongly correlated with its other properties. We therefore first invert the radius from the vertical structure and the SST at the corresponding position, and then use the radius as a known quantity to invert the other properties. The inputs to the inversion are depth, u depth, d depth, and sst, where depth is the depth of the maximum absolute value of TA; u depth is the position of the maximum gradient above depth; d depth is the position of the maximum gradient below depth; and sst is a set comprising the SST at the location of the profile and the SSTs at the eight adjacent locations. The inversion function F is a machine learning model; we use linear regression, Bayesian regression, elastic net regression, SVR, and Gradient Boosting Regressor. Finally, the results of the individual models are averaged to obtain the final result, that is, the properties of the eddy. Following this process, we inverted the properties of the eddies newly identified in January 2017 from 20° S to 40° N and 40° E to 160° E. Although machine learning models can produce a point estimate, eddy properties are affected not only by vertical profiles and SST but also by ocean currents, climatology, submarine volcanic eruptions, and ship navigation. Therefore, the inversion results are reported as intervals, and the specific inversion results are shown in Fig. 13.
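The two-step averaged inversion can be sketched with scikit-learn as follows. This is an illustration under stated assumptions, not the authors' implementation: the hyperparameters, the synthetic data, the helper names `make_models` and `fit_predict_all`, and the use of the spread across models as the reported interval are all assumptions (the text does not say how the intervals are obtained).

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import BayesianRidge, ElasticNet, LinearRegression
from sklearn.svm import SVR

def make_models():
    """The five regressor families named in the text, with assumed settings."""
    return [LinearRegression(), BayesianRidge(), ElasticNet(alpha=0.1),
            SVR(), GradientBoostingRegressor(random_state=0)]

def fit_predict_all(X_train, y_train, X_new):
    """Fit each model and return predictions of shape (n_models, n_new)."""
    return np.vstack([m.fit(X_train, y_train).predict(X_new)
                      for m in make_models()])

# Synthetic stand-in data: 12 features per profile
# (depth, u_depth, d_depth, and 9 SST values: the profile cell plus 8 neighbors).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 12))
radius = 80.0 + 10.0 * X[:, 0] + rng.normal(scale=2.0, size=200)
amplitude = 0.05 * radius + rng.normal(scale=0.5, size=200)
X_train, X_new = X[:150], X[150:]

# Step 1: invert the radius and average the five model outputs.
r_all = fit_predict_all(X_train, radius[:150], X_new)
r_hat = r_all.mean(axis=0)
r_lo, r_hi = r_all.min(axis=0), r_all.max(axis=0)  # assumed interval: model spread

# Step 2: treat the radius as known and invert another property (here amplitude).
X_train2 = np.column_stack([X_train, radius[:150]])
X_new2 = np.column_stack([X_new, r_hat])
a_hat = fit_predict_all(X_train2, amplitude[:150], X_new2).mean(axis=0)
```

In practice the features would usually be standardized before fitting, since SVR and elastic net are sensitive to feature and target scale; that step is omitted here to keep the sketch short.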

V. CONCLUSION
This article proposes a deep neural network named EDTR to identify global eddies missed by satellite altimeters from 2002 to 2018 through vertical profile data. The main conclusions of this study are summarized as follows.
1) Because the vertical structure of an eddy as a whole affects sea surface anomalies, the eddy profiles, built from satellite altimeter observations and vertical profile data, are divided into training, validation, and test datasets to train the network. The final eddy identification accuracy is 98.22%, which indicates the feasibility of identifying eddies from their vertical profile structure. After that, 18.57% new AE and 17.41% new CE were identified in the OE dataset.
2) The newly identified eddies are verified by their distribution pattern and by their consistency with SSTA. In addition, their properties, including radius, amplitude, and energy, were inverted.
3) The results demonstrate that artificial intelligence can be applied to oceanic eddy identification, which allows oceanographers to study higher-level eddy problems rather than stopping at low-level eddy identification.
Finally, although the eddy properties have been inverted from the vertical profile structure, the inversion still needs to be improved. The next step is to further improve the accuracy of the inversion and to verify the authenticity of the inverted properties.