Data Augmentation for Imbalanced HRRP Recognition Using Deep Convolutional Generative Adversarial Network

In radar high-resolution range profile (HRRP) recognition, the recognition accuracy declines when the training samples in some classes (majority classes) greatly outnumber those in other classes (minority classes). To alleviate this imbalanced problem, an HRRP data augmentation framework is proposed. A one-dimensional (1-D) deep convolutional generative adversarial network (DCGAN) is developed to generate artificial HRRPs. The fidelity of the generated HRRPs is evaluated subjectively in the raw data domain and quantitatively by the similarity in the feature domain. The experimental results show that the generated data are similar to the true HRRPs and demonstrate that the proposed framework outperforms the state-of-the-art oversampling methods when handling the imbalanced problem.


I. INTRODUCTION
Recent years have witnessed a rise of wideband radar in various applications, from perimeter surveillance to drone detection [1]-[4], where a strong demand exists for automatic target recognition (ATR) due to complicated operational environments. Among the different available ATR methods, the high-resolution range profile (HRRP) is considered to be the fundamental approach. The formation of an HRRP is quite simple and does not require relative motion. The profile represents the distribution of target scattering centers along the line-of-sight (LOS) and provides abundant structural features [5]-[8]. In addition, the computational complexity is relatively low, which enables real-time processing [9]-[13].
Although fruitful achievements have been made, HRRP recognition is still a non-trivial task. One difficulty, known as the imbalanced problem, exists when the training samples of some classes (majority classes) greatly outnumber those of other classes (minority classes) [14]-[17]. This inequality leads to adverse effects on the final recognition performance. It has been observed that most existing classifiers, especially data-driven models, favor the majority classes, because their goal is to achieve high overall accuracy on the training data. Consequently, a large drop in the recognition rate for minority classes is inevitable [18]-[20]. Therefore, dealing with the imbalanced problem is an essential step when developing a practical ATR system.
The associate editor coordinating the review of this manuscript and approving it for publication was Seung-Hyun Kong.
The imbalanced problem has been widely discussed in the machine learning field. The existing methods can be divided into two categories: algorithm-level and data-level. The algorithm-level methods modify conventional classifiers to improve their accuracy on minority classes without degrading the performance on the majority classes [21], [22]. Typical methods include cost-sensitive learning [23] and transfer learning [24], [25]. The data-level methods balance the training data by either oversampling the minority classes or undersampling the majority classes [26]-[28]. In the oversampling approach, data from the minority class are duplicated or interpolated until the imbalance is eliminated. The undersampling approach removes some of the samples from the majority classes to equalize the number of training samples in each class. Because undersampling might eliminate useful samples, oversampling is generally preferred in practical applications.
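The two data-level strategies described above can be sketched as follows. This is a minimal illustration rather than the procedure used in the experiments: the function names, the toy data, and the interpolation rule (a random convex combination of two samples from the same class, a simplified SMOTE-like scheme without nearest-neighbor search) are assumptions made for this example.

```python
import numpy as np

def random_undersample(X, y, rng):
    """Randomly drop samples until every class matches the smallest class."""
    classes, counts = np.unique(y, return_counts=True)
    n_min = counts.min()
    keep = np.concatenate([
        rng.choice(np.flatnonzero(y == c), size=n_min, replace=False)
        for c in classes
    ])
    return X[keep], y[keep]

def random_oversample(X, y, rng, interpolate=False):
    """Grow every class to the size of the largest class.

    interpolate=False duplicates existing samples; interpolate=True creates
    points on the segment between two random same-class samples (assumed,
    simplified interpolation rule).
    """
    classes, counts = np.unique(y, return_counts=True)
    n_max = counts.max()
    X_parts, y_parts = [X], [y]
    for c, n in zip(classes, counts):
        n_new = n_max - n
        if n_new == 0:
            continue
        Xc = X[y == c]
        new = Xc[rng.integers(0, n, size=n_new)]
        if interpolate:
            other = Xc[rng.integers(0, n, size=n_new)]
            t = rng.random((n_new, 1))
            new = new + t * (other - new)
        X_parts.append(new)
        y_parts.append(np.full(n_new, c))
    return np.concatenate(X_parts), np.concatenate(y_parts)

# Toy data: 100 majority samples and 20 minority samples with 8 features each.
rng = np.random.default_rng(0)
X = rng.random((120, 8))
y = np.array([0] * 100 + [1] * 20)
X_over, y_over = random_oversample(X, y, rng, interpolate=True)
X_under, y_under = random_undersample(X, y, rng)
```

Undersampling leaves only 40 of the 120 toy samples, which illustrates why oversampling is usually preferred when the minority class is small.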
In HRRP recognition, oversampling can also be realized by data synthesis in either the echo domain or the HRRP domain. In the former, the raw radar echo is first synthesized from the given transmitted signal and the target scattering field based on its three-dimensional electromagnetic model [29]-[31]. Then, the HRRP is obtained by echo processing. This method depends on the accuracy of the electromagnetic model and carries a heavy computing burden. In the latter, the HRRP is directly synthesized by convolving the radar impulse response with the target scattering field, which is usually approximated by the scattering center model. Because target structures are complicated, it is difficult to determine the positions and electromagnetic parameters of the scattering centers, which restricts the accuracy of the results.
In recent years, generative models have shown great potential in generating new realistic-looking samples [32], [33]. The generation process is modeled as a mapping from a certain 'latent space' to the data space. The deep generative network (DGN) introduces a deep neural network to form the mapping and has achieved state-of-the-art performance in various applications [34]. Among different network structures, the generative adversarial network (GAN) has received considerable attention. A GAN consists of a generator that fits the data mapping and a discriminator that estimates the probability of a sample coming from the data space. Training a GAN involves an adversarial process in which the generator and the discriminator are updated alternately.
In this paper, we propose an HRRP generation method based on GAN. There are two key issues. One is to construct a GAN model that achieves good convergence; the other is to evaluate the generated data appropriately. Good convergence leads to an optimal generator with high probability. However, in practice, it has often been observed that the min-max training process does not always lead to convergence, mainly because of the high parameter freedom and lack of architectural constraints [35]. Evaluation is conducted to determine whether a generated sample is similar to the true data. The existing evaluation metrics are largely designed for two-dimensional (2-D) images; thus, a one-dimensional (1-D) sample evaluation method is needed.
To address the convergence issue, we propose a one-dimensional deep convolutional GAN (1-D DCGAN), which uses a convolutional neural network (CNN) for both the generator and the discriminator. Furthermore, the 1-D DCGAN introduces strided convolution (SC) and fractionally strided convolution (FSC) operations to stabilize the training process and achieve good convergence.
To evaluate the 1-D generated HRRP data, our method includes both raw data and feature domain evaluations. The former is performed directly on the HRRP data through visual examinations. The latter extracts a set of features and compares the feature distributions between the generated and true data.
The remainder of this paper is organized as follows. Section II introduces the background regarding HRRP and GAN. Section III describes the proposed HRRP generation framework in detail. Section IV reports the results of experiments on true data, including the evaluation of the generated samples and the classification after data augmentation. Section V concludes this paper.

A. RADAR TARGET HRRP RECOGNITION
When the radar range resolution is smaller than the target size, the HRRP can be approximated as the amplitude of the coherent summation of the complex returns from scattering centers, as shown in Fig. 1. Suppose the transmitted signal is $s(t)e^{j2\pi f_c t}$, where $s(t)$ stands for the signal envelope, and $f_c$ denotes the carrier frequency. The complex returns in the $l$-th range cell in the baseband can be represented as:

$$x_l(t) = \sum_{i=1}^{K_l} \sigma_{li}\, s\!\left(t - \frac{2(R_l + r_{li})}{c}\right) e^{-j\frac{4\pi}{\lambda}(R_l + r_{li})} \tag{1}$$

where $K_l$ represents the number of scattering centers in the $l$-th range cell, $\sigma_{li}$ is the intensity of the $i$-th scattering center in the $l$-th range cell, $c$ stands for the velocity of light, $R_l$ is the radial range between the radar and the $l$-th range cell, $r_{li}$ stands for the range from the $l$-th range cell to the $i$-th scattering center, and $\lambda$ denotes the wavelength. When $s(t)$ is a rectangular pulse signal with unit intensity, it can be omitted.
The scattering centers in the $l$-th range cell share an initial phase $\phi_l = -(4\pi/\lambda)R_l$. Then, the HRRP can be approximated as follows:

$$x_l \approx e^{j\phi_l} \sum_{i=1}^{K_l} \sigma_{li}\, e^{j\phi_{li}} \tag{2}$$

where $\phi_{li} = -(4\pi/\lambda) r_{li}$. The amplitude of the $l$-th range cell $x_l$ is

$$|x_l| = \left| \sum_{i=1}^{K_l} \sigma_{li}\, e^{j\phi_{li}} \right| \tag{3}$$

VOLUME 8, 2020
The HRRP recognition consists of two procedures: feature extraction and classification. Based on the differences among feature extraction methods, the existing recognition methods can be divided into two categories: shallow and deep methods.
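As a rough numerical illustration of the scattering-center model, the sketch below computes the amplitude of each range cell as the magnitude of the coherent sum of its scattering centers, with the envelope $s(t)$ omitted (rectangular pulse of unit intensity). The function name, the list-of-arrays input layout and the toy values are assumptions made for this example.

```python
import numpy as np

def hrrp_amplitude(sigma, r, wavelength):
    """Amplitude of each range cell from its scattering centers.

    sigma[l] and r[l] hold the intensities and the in-cell ranges r_li of the
    scattering centers in range cell l (hypothetical input layout).
    """
    amplitude = np.empty(len(sigma))
    for l, (s_l, r_l) in enumerate(zip(sigma, r)):
        # Phase of each scattering center: phi_li = -(4*pi/lambda) * r_li
        phi = -4.0 * np.pi * np.asarray(r_l, dtype=float) / wavelength
        # The common phase of the cell has unit modulus, so it drops out.
        amplitude[l] = np.abs(np.sum(np.asarray(s_l, dtype=float) * np.exp(1j * phi)))
    return amplitude

wavelength = 0.03  # assumed example value, in meters
sigma = [[2.0], [1.0, 1.0], []]
r = [[0.0], [0.0, wavelength / 4.0], []]
x = hrrp_amplitude(sigma, r, wavelength)
```

In the second cell the two equal scattering centers are a quarter wavelength apart, so their two-way phases differ by $\pi$ and they cancel. This sensitivity of the amplitude to sub-wavelength geometry is one reason why HRRPs vary strongly with target aspect.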
In the shallow methods, both the features and the classifier are manually designed. The most commonly used features include geometric features, energy features and transform features. Geometric features reflect target shape information, such as the length and the number of scattering centers [36]. Energy features show the power distribution of the target, such as the energy distribution and the envelope entropy. Transform features describe the properties of the HRRP in a transform domain, such as the spectra and the micro-Doppler [37], [38]. Various classifiers can be used for target classification, such as decision tree, support vector machine (SVM) [39], Bayes classifier [40]-[42] and ensemble learning. A decision tree [43] is a flowchart-like predictive model whose internal nodes represent individual decision rules. SVM is a classifier that builds decision boundaries based on geometric distance. The Bayes classifier determines the categories of samples by estimating the maximum posterior probability, and ensemble learning involves some combination of different basic classifiers. Among the various classifiers, the SVM is the most commonly used model. Hand-designed methods rely heavily on expert experience and entail a high manual workload.
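To make the three feature families concrete, the following sketch extracts one example of each from a single HRRP. The exact definitions are assumptions chosen for illustration (a relative amplitude threshold for the target length, Shannon entropy of the sum-normalized profile, and the magnitude of the real FFT); they are not necessarily the definitions used in the cited works.

```python
import numpy as np

def target_length(x, rel_threshold=0.2, resolution=0.12):
    """Geometric feature: extent between the first and last strong cell, in meters."""
    x = np.asarray(x, dtype=float)
    strong = np.flatnonzero(x >= rel_threshold * x.max())
    return (strong[-1] - strong[0] + 1) * resolution

def envelope_entropy(x, eps=1e-12):
    """Energy feature: entropy of the amplitude profile normalized to sum to one."""
    p = np.asarray(x, dtype=float)
    p = p / p.sum()
    return float(-np.sum(p * np.log(p + eps)))

def magnitude_spectrum(x):
    """Transform feature: magnitude of the one-sided spectrum of the profile."""
    return np.abs(np.fft.rfft(np.asarray(x, dtype=float)))

# Toy profile with two equal scattering centers in cells 2 and 5.
x = np.zeros(10)
x[2] = 1.0
x[5] = 1.0
features = np.concatenate([[target_length(x), envelope_entropy(x)], magnitude_spectrum(x)])
```

The resulting feature vector could then be passed to any of the classifiers listed above, for example an SVM.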
In the deep methods, the concept of deep learning is adopted to automatically extract features and construct the classifier using a deep neural network (DNN). Various DNN architectures have been applied to HRRP recognition [44]-[47], including autoencoder (AE) [45], CNN and recurrent neural network (RNN) [44] models. An AE extracts latent features by minimizing the recovery loss; the AE-extracted features are then used to discriminate the target class with the help of the previously mentioned classifiers. A CNN is a multiple-layer classifier that introduces the convolutional operator, and it can capture detailed features from the initial convolution layers and global features from the final layer [46]. RNN models have received attention due to their advantages in extracting features from sequential data [44], and an HRRP can be regarded as a time series that can be input into the network. CNNs are the most widely used methods for recognition tasks. The automatic feature extraction process reduces the difficulty of developing a recognition system. However, deep learning methods increase the required amount of training data [47].

B. GENERAL GAN PRINCIPLES
The generative model is a useful data representation framework, and it is often applied to generate artificial data. It introduces the concept of a 'latent space'. The data distribution is represented as a mapping from the 'latent space' to the data space. In recent years, deep networks have been employed to form the mapping, yielding the DGN methods. The success of DGNs is due to their flexibility, which results from the large number of learnable parameters. The GAN is one of the most widely used DGNs; it introduces an adversarial learning mechanism to fit the mapping that generates artificial data.
Fig. 2 shows the framework of a GAN, which has two main components: a generator and a discriminator. The generator $G(\cdot)$ is used to fit the mapping from the 'latent space' $Z$ to the data space. The discriminator $D(\cdot)$ is used to distinguish whether the data comes from the generated data space or the true data space. A GAN is based on a joint optimization of the generator and discriminator, which act as players in a game. The training of the generator is aimed at generating samples whose distribution is close to that of the true examples, and the training of the discriminator is aimed at distinguishing between the generated samples and the true examples. The value function of this game can be formulated as follows:

$$\min_G \max_D V(D, G) = E_{x \sim p_{data}}[\log D(x)] + E_{z \sim p_z}[\log(1 - D(G(z)))] \tag{4}$$

where $E_{x \sim p}[\cdot]$ denotes the expectation of $x$ with distribution $p$, $p_{data}$ is the distribution of the true data, and $p_z$ is the distribution of the latent variable $z \in Z$.
GAN training is accomplished through an alternating iterative approach that updates the generator and discriminator alternately until a convergence criterion is met. The objective function of the generator can be written as follows:

$$\min_G \; E_{z \sim p_z}[\log(1 - D(G(z)))] \tag{5}$$

and the objective function of the discriminator can be written as

$$\max_D \; E_{x \sim p_{data}}[\log D(x)] + E_{z \sim p_z}[\log(1 - D(G(z)))] \tag{6}$$

The training of the generator and the discriminator is performed using the gradient descent algorithm (the discriminator ascends $V$, which is equivalent to descending $-V$). The parameters are updated using a learning rate $\eta$:

$$\theta_g \leftarrow \theta_g - \eta \nabla_{\theta_g} V(D, G) \tag{7}$$

$$\theta_d \leftarrow \theta_d + \eta \nabla_{\theta_d} V(D, G) \tag{8}$$

where $\theta_g$ and $\theta_d$ indicate the parameters of the generator and the discriminator, and $\nabla V$ represents the gradient of the function $V$.
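The value function and the alternating parameter updates can be illustrated with a few lines of NumPy. This is only a sketch under stated assumptions: the function names are hypothetical, the expectations are replaced by sample means over discriminator outputs, and the sign convention assumes the standard formulation in which the discriminator maximizes $V$ and the generator minimizes it.

```python
import numpy as np

def value_function(d_real, d_fake, eps=1e-12):
    """Sample estimate of V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))].

    d_real: discriminator outputs on true samples, values in (0, 1)
    d_fake: discriminator outputs on generated samples, values in (0, 1)
    """
    d_real = np.clip(np.asarray(d_real, dtype=float), eps, 1.0 - eps)
    d_fake = np.clip(np.asarray(d_fake, dtype=float), eps, 1.0 - eps)
    return float(np.mean(np.log(d_real)) + np.mean(np.log(1.0 - d_fake)))

def generator_step(theta_g, grad_g, eta):
    """The generator descends V (assumed sign convention)."""
    return theta_g - eta * grad_g

def discriminator_step(theta_d, grad_d, eta):
    """The discriminator ascends V (assumed sign convention)."""
    return theta_d + eta * grad_d

# A discriminator that cannot tell the two sources apart outputs 0.5 everywhere.
v_undecided = value_function([0.5, 0.5], [0.5, 0.5])
# A confident and correct discriminator pushes V towards its upper bound of 0.
v_confident = value_function([0.99, 0.98], [0.01, 0.02])
```

When the discriminator outputs 0.5 for every sample, the value function equals $-2\log 2$, which is the value reached when the generated distribution matches the true distribution.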

III. PROPOSED HRRP GENERATION FRAMEWORK
The proposed HRRP generation framework is composed of three procedures: training data preprocessing, generation model building and generated sample evaluation. Training data preprocessing extracts the target section to avoid influence from noise and clutter, and normalizes the amplitude to stabilize the training. This procedure consists of target section segmentation (to acquire a precise target location), padding (to ensure that all samples have the same length), and normalization (to bring all samples within the same amplitude interval).
Generation model building is intended to construct an HRRP generation model with good convergence. We employ the DCGAN architecture with operation constraints to ensure good convergence. Furthermore, all operations are modified to 1-D form to adapt to the data structure of the HRRP.
Generated sample evaluation is applied to estimate the similarity between the generated samples and true HRRPs. The evaluation is important not only for assessing the quality of the generated samples but also for providing a guideline for data selection. The evaluation is still a challenge due to the absence of meaningful evaluation metrics. In this paper, the generated samples are evaluated in both the raw data domain and the feature domain.

A. TRAINING DATA PREPROCESSING
Radar echo data contain information not only from the target but also from noise and clutter. If the noise, the clutter and the target are all used as input to the generative model, the model has difficulty converging. What is worse, the model can learn the noise and clutter characteristics. Thus, acquiring an effective target section is an important procedure. The preprocessing is applied to provide a normalized target section for training the GAN, and it stabilizes the training procedure by discarding the redundant noise and clutter. The preprocessing consists of target section segmentation, padding and normalization.
Target segmentation is used to detect the start and end of the target section. The proposed target segmentation method includes two steps: detection and combination. The detection step finds the strong scattering centers, and the combination step clusters the strong scattering centers together.
During the detection process, the amplitude of the HRRP is compared with a threshold to find strong scattering centers. Constant false alarm rate (CFAR) detection is the most widely used detection algorithm. In this paper, cell-averaging CFAR (CA-CFAR) detection is used to determine the threshold adaptively. The threshold of the $l$-th point can be expressed as follows:

$$T_l = \frac{k}{N_T} \sum_{i \in \Omega_l} x_i \tag{9}$$

where $\Omega_l$ denotes the reference section around the $l$-th point, $N_T$ is the length of the reference section, $k$ stands for a predetermined constant, and $x_i$ represents the amplitude of the $i$-th range cell.
The combination is based on a distance criterion. When the distance between adjacent strong scattering centers is below a threshold, those scattering centers are considered to belong to the same target. The start and end points are obtained by traversing all strong scattering centers.
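The detection and combination steps can be sketched as below. Several details are assumptions made for this example and are not specified in the text: the reference section is split evenly on both sides of the cell under test, a few guard cells are excluded, fewer reference cells are used at the profile edges, and when several clusters are found the one with the largest summed amplitude is taken as the target. All parameter values are hypothetical.

```python
import numpy as np

def ca_cfar_threshold(x, n_ref=16, k=3.0, n_guard=2):
    """Threshold T_l = k * (mean amplitude of the reference cells around cell l)."""
    x = np.asarray(x, dtype=float)
    half = n_ref // 2
    thresholds = np.empty_like(x)
    for l in range(len(x)):
        left = x[max(0, l - n_guard - half): max(0, l - n_guard)]
        right = x[l + n_guard + 1: l + n_guard + 1 + half]
        thresholds[l] = k * np.concatenate([left, right]).mean()
    return thresholds

def segment_target(x, n_ref=16, k=3.0, n_guard=2, max_gap=8):
    """Return (start, end) indices of the target section, or None if nothing is detected."""
    x = np.asarray(x, dtype=float)
    # Detection: cells whose amplitude exceeds the adaptive threshold.
    hits = np.flatnonzero(x > ca_cfar_threshold(x, n_ref, k, n_guard))
    if hits.size == 0:
        return None
    # Combination: neighboring detections closer than max_gap form one cluster.
    breaks = np.flatnonzero(np.diff(hits) > max_gap)
    clusters = np.split(hits, breaks + 1)
    # Assumed rule: keep the cluster with the largest summed amplitude.
    best = max(clusters, key=lambda c: x[c].sum())
    return int(best[0]), int(best[-1])

# Toy profile: flat noise floor, a target with three strong centers, one isolated spike.
x = np.full(100, 0.1)
x[[40, 44, 50]] = [1.0, 0.8, 0.9]
x[90] = 0.5
start, end = segment_target(x)
```

In the toy profile the isolated spike at cell 90 is detected but falls into its own cluster, so it does not stretch the target section.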
In padding, the target section is padded to a fixed length to satisfy the input format of the subsequent GAN. The noise amplitude is used to pad the section equally at the front and at the back.
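In normalization, each HRRP is normalized independently according to its maximum amplitude, which can be expressed as:

$$\bar{x}_l = \frac{x_l}{\max_i x_i} \tag{10}$$

where $x_l$ is the amplitude of the $l$-th range cell and $\bar{x}_l$ is the corresponding normalized amplitude.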
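A minimal sketch of the padding and normalization steps is given below. It assumes that the padding value is a constant noise-level amplitude and that each profile is divided by its own maximum amplitude; both choices, as well as the function name and the example values, are assumptions made for illustration.

```python
import numpy as np

def pad_and_normalize(target, length, noise_level):
    """Pad a target section to a fixed length, then scale it to a peak of one."""
    target = np.asarray(target, dtype=float)
    if target.size > length:
        raise ValueError("target section is longer than the fixed input length")
    total = length - target.size
    front = total // 2          # equal padding on both sides;
    back = total - front        # the back gets the extra cell if total is odd
    padded = np.pad(target, (front, back), mode="constant",
                    constant_values=noise_level)
    return padded / padded.max()  # assumes at least one positive amplitude

section = [0.5, 2.0, 1.0]
sample = pad_and_normalize(section, length=8, noise_level=0.1)
```

Normalizing each HRRP independently removes the dependence on absolute echo power, so that profiles collected at different ranges share the same amplitude interval.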

B. GENERATION MODEL BUILDING
To alleviate the convergence problem, the proposed HRRP generation is accomplished with a DCGAN by introducing a series of constraints. The generator and discriminator of the DCGAN are constructed with convolutional operators. In particular, the DCGAN introduces the SC and FSC operators, which noticeably improve convergence. The HRRP generation architecture, which uses a 1-D DCGAN, is shown in Fig. 3. The training dataset is the amplitude of the HRRPs. Because an HRRP is a 1-D real vector, the convolutional operators in the generator and discriminator are implemented with 1-D operators. In this work, 1-D SC and FSC are applied.
As shown in Fig. 4(a), the filters $F_1, F_2, \ldots, F_m$ in SC consist of vectors sharing the same weights $f_1, f_2, \cdots, f_{L_k}$:

$$F_i(n) = \sum_{k=1}^{L_k} f_k\, \delta\big(n - (i-1)S - k\big) \tag{11}$$

where $n$ represents the $n$-th element in the vector, $L_k$ denotes the length of the filter, and $\delta(\cdot)$ denotes the unit sample function. The filter slides across the inputs with a stride of $S$. The output of the SC operator can be expressed as follows:

$$x_n = \sum_{i} F_n(i)\, y_i = \sum_{k=1}^{L_k} f_k\, y_{(n-1)S+k} \tag{12}$$

where $x_n$ represents the $n$-th element in the output vector, and $y_i$ represents the amplitude in the $i$-th range cell. The overall output of the SC operator is

$$X = [x_1, x_2, \ldots, x_m] \tag{13}$$

As shown in Fig. 4(b), the filters in FSC also share the same weights, and the products of the inputs and the filter are arranged with the stride $S$. The output is the summation of the products. The FSC operator can be expressed as follows:

$$Y_n = \sum_{i=1}^{m} x_i\, F_i(n) \tag{14}$$

where $x_i$ represents the $i$-th input element, $L_k$ is the length of the filter, $F_i$ indicates the $i$-th filter, $S$ denotes the stride of the FSC, and $Y_n$ is the $n$-th output element. The final output of the FSC operator is

$$Y = [Y_1, Y_2, \ldots, Y_{(m-1)S+L_k}] \tag{15}$$
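The two operators can be illustrated with a short NumPy sketch. It assumes no padding, a single channel, and the cross-correlation convention commonly used in deep learning frameworks (the filter is not flipped); the function names are hypothetical and the sketch is not the network implementation used in the experiments.

```python
import numpy as np

def strided_conv1d(y, f, stride):
    """SC: slide the shared filter f over the input y in steps of `stride`."""
    y = np.asarray(y, dtype=float)
    f = np.asarray(f, dtype=float)
    n_out = (len(y) - len(f)) // stride + 1
    return np.array([np.dot(f, y[n * stride: n * stride + len(f)])
                     for n in range(n_out)])

def frac_strided_conv1d(x, f, stride):
    """FSC: each input scales a copy of f placed `stride` cells apart; overlaps are summed."""
    x = np.asarray(x, dtype=float)
    f = np.asarray(f, dtype=float)
    out = np.zeros((len(x) - 1) * stride + len(f))
    for m, x_m in enumerate(x):
        out[m * stride: m * stride + len(f)] += x_m * f
    return out

# SC shortens the vector (8 cells -> 3), FSC lengthens it (2 cells -> 5).
down = strided_conv1d(np.arange(1.0, 9.0), [1.0, 0.0, -1.0], stride=2)
up = frac_strided_conv1d([1.0, 2.0], [1.0, 1.0, 1.0], stride=2)
```

With a stride larger than one, SC reduces the vector length and can therefore replace pooling in the discriminator, whereas FSC increases the length and lets the generator expand a short latent vector into a full-length HRRP.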

C. GENERATED SAMPLES EVALUATION
The evaluation is used to assess the similarity between the generated samples and the true HRRPs, and it provides a principle for selecting samples for the subsequent data augmentation. The evaluation of the generated samples is a challenge due to the absence of standard, meaningful evaluation metrics. In this paper, the generated samples are evaluated at both the raw HRRP level and the feature level.
At the raw HRRP level, the object to be evaluated is a single sample. The similarity between the generated samples and true examples is determined based on expert experience. Fig. 5 shows a true HRRP and two generated samples, where the horizontal axis denotes the range cell and the vertical axis denotes the amplitude. These generated samples can be evaluated based on the number of scattering centers, the amplitude distribution, and the distance between scattering centers. As shown in Fig. 5, generated sample 1 is similar to the true HRRP because it shares the same scattering center locations and similar amplitudes, while generated sample 2 differs obviously from the true HRRP.
The evaluation at the feature level is undertaken to assess the similarity of the feature distributions. This procedure can be divided into single-feature and multiple-feature evaluations.
The single-feature evaluation assesses the similarity of the statistical distribution. In this paper, the histogram is used to describe the statistical distribution, and the goodness of fit is used to evaluate the similarity. Here, a two-sample Kolmogorov-Smirnov test (KS-test) is used to test whether the feature distributions of the generated samples and the true examples differ [48].
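The two-sample KS-test can be sketched directly from its definition: the statistic is the largest gap between the two empirical cumulative distribution functions. The decision rule below uses the large-sample approximation of the critical value, which is an assumption of this sketch; library routines such as `scipy.stats.ks_2samp` report a p-value instead. The function name and the toy data are illustrative.

```python
import numpy as np

def ks_two_sample(a, b, alpha=0.05):
    """Return (D, critical value, reject) for the two-sample KS-test."""
    a = np.sort(np.asarray(a, dtype=float))
    b = np.sort(np.asarray(b, dtype=float))
    grid = np.concatenate([a, b])
    # Empirical CDFs of both samples evaluated at every observed value.
    cdf_a = np.searchsorted(a, grid, side="right") / a.size
    cdf_b = np.searchsorted(b, grid, side="right") / b.size
    d = float(np.max(np.abs(cdf_a - cdf_b)))
    # Large-sample critical value: c(alpha) * sqrt((n + m) / (n * m)).
    c_alpha = np.sqrt(-0.5 * np.log(alpha / 2.0))
    d_crit = float(c_alpha * np.sqrt((a.size + b.size) / (a.size * b.size)))
    return d, d_crit, bool(d > d_crit)

rng = np.random.default_rng(0)
true_feature = rng.normal(0.0, 1.0, size=500)
shifted_feature = rng.normal(3.0, 1.0, size=400)
d, d_crit, reject = ks_two_sample(true_feature, shifted_feature)
```

If the statistic stays below the critical value, the hypothesis that the generated and true features follow the same distribution is not rejected at the chosen significance level.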
In the multiple-feature evaluation, dimensionality reduction is applied because evaluation in a high-dimensional space is difficult. After the dimensionality reduction, the distance between generated samples and true HRRPs can be assessed visually in two- or three-dimensional space. In this paper, we use the t-SNE algorithm, which is based on joint probability matching, for dimensionality reduction.

A. EXPERIMENT SETUP
Real HRRP samples were acquired from a radar with a synthetic bandwidth of 1,250 MHz and a resolution of 0.12 m. The dataset consists of 6 classes of vehicles, including 5 majority classes and 1 minority class, as shown in Table 1. The HRRPs in each class were collected in different scenarios and from different aspects. The 1-D DCGAN was trained with the samples in the minority class. In the experiment, Gaussian noise was used as the input of the generator. The number of training epochs for the 1-D DCGAN was set to 200, and the batch size was 256. The optimizer was stochastic gradient descent with adaptive moment estimation (Adam), and the learning rate was 0.0005. Early termination was applied to halt the training process before overfitting occurred.
The generated samples are expected to be similar to the training HRRPs, and the samples are used to improve the target recognition performance. In this section, the quality of the generated samples is evaluated using the following two methods.
(a) The generated samples are subjectively compared with true samples at the raw HRRP level.
(b) Three types of features are extracted from both the generated samples and true HRRPs. The generated samples are then visually compared with the original samples using t-SNE in the feature domain.
After compensating the original training dataset to achieve different minority ratios (the ratio between the numbers of samples in the minority and majority classes), the validation accuracy using SVM and CNN is evaluated.

B. RAW HRRP LEVEL EVALUATION
Fig. 6 shows the generated HRRPs after different epochs. The sample generated after 10 epochs looks similar to random noise. However, the sample generated after 20 epochs has a strong scattering point, and after 50 epochs the generated sample has acquired the target contour. Finally, after 100 epochs, the generated sample is enriched and has greater detail. The results are consistent with the generator loss function shown in Fig. 7. After more than 100 epochs, ...
Fig. 8 shows comparison instances that are randomly selected from the true and generated datasets. The generated sample has three strong scatterers and one weak scatterer, the same characteristics as the true sample, and both the amplitude and the location of each scatterer are very close to those of the true sample. From a subjective viewpoint, the generated sample is similar to the true sample.

C. FEATURE LEVEL EVALUATION
In addition to the subjective visual judgment of the generated HRRPs, statistical histograms of the probability density and the features are used to evaluate the quality of the generated HRRPs.
Before evaluation, preprocessing is required because noise is present. The target section of each HRRP is acquired first. Then, normalization is applied according to the maximum amplitude. Because the numbers of generated HRRPs and true HRRPs are different, we use the normalized histogram to represent the amplitude distribution. As shown in Fig. 9, the amplitude distribution of the generated HRRPs after 10 epochs looks like random clutter. As the 1-D DCGAN is trained for more epochs, the amplitude distribution of the generated HRRPs becomes more similar to that of the true HRRPs. As shown in Fig. 10, the statistical histograms of the generated and true HRRPs share a similar shape. The HRRP amplitudes are mainly distributed between 0.04 and 0.16; the histogram decreases as the amplitude increases, and a few points are distributed around 1.
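A normalized histogram of this kind can be computed as sketched below, so that datasets of different sizes become directly comparable. The number of bins, the amplitude range and the function name are assumptions made for this example.

```python
import numpy as np

def normalized_histogram(hrrps, bins=25, value_range=(0.0, 1.0)):
    """Fraction of range cells that fall into each amplitude bin."""
    amplitudes = np.concatenate([np.ravel(h) for h in hrrps])
    counts, edges = np.histogram(amplitudes, bins=bins, range=value_range)
    return counts / counts.sum(), edges

# Two toy datasets of different sizes with the same amplitude statistics.
true_set = [[0.1, 0.1, 1.0]] * 10
generated_set = [[0.1, 0.1, 1.0]] * 3
h_true, edges = normalized_histogram(true_set)
h_generated, _ = normalized_histogram(generated_set)
```

Because each histogram sums to one, the two toy datasets yield identical histograms although one contains more than three times as many profiles as the other.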
The goodness of fit is also tested to evaluate the similarity between the generated HRRPs and true HRRPs. The KS-test is a goodness-of-fit method based on the empirical cumulative distribution function at a given confidence level. Table 2 shows that the amplitude statistics of the generated HRRPs and the true HRRPs are consistent with the same distribution at the 95% confidence level.
Three types of features (geometric features, scattering features, and transform domain features) are extracted from both the true and generated samples. The results indicate that the differences among these features give a preliminary indication of the distance between the generated HRRPs and the true HRRPs. As shown in Fig. 11, the target length of the generated HRRPs lies primarily in the range between 2 and 2.6, and the normalized difference of the generated HRRPs is largely similar to the normalized difference of the true HRRPs. The KS-test is also used to evaluate the similarity between the features of the generated HRRPs and true HRRPs. Table 3 shows that there are no significant differences between the features of the generated HRRPs and true HRRPs at the 95% confidence level.
Because feature vectors are high-dimensional data that are difficult to visualize, the t-SNE algorithm is applied to reduce the dimensionality to enable visualization. Fig. 12 shows the distribution of the true HRRPs in the minority class and the generated samples. According to Fig. 12, in the feature domain, the area occupied by the generated samples is close to that of the true samples.

D. RECOGNITION ACCURACY UNDER DIFFERENT IMBALANCED RATIOS
In this experiment, class 6 is the minority class, and we partitioned the original training set into imbalanced sample sets with different imbalance ratios (i.e., 0.2, 0.3 and 0.4). All the HRRPs in the testing set were then used for evaluation.
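Constructing such an imbalanced training set can be sketched as follows. The text does not state which majority-class size the ratio refers to, so this sketch assumes that the minority class is subsampled to the given ratio times the size of the largest majority class; the function name and the toy data are also assumptions.

```python
import numpy as np

def make_imbalanced(X, y, minority_class, ratio, rng):
    """Keep all majority samples and subsample the minority class."""
    y = np.asarray(y)
    majority_sizes = [np.sum(y == c) for c in np.unique(y) if c != minority_class]
    n_ref = max(majority_sizes)  # assumed reference: largest majority class
    idx_min = np.flatnonzero(y == minority_class)
    n_keep = min(idx_min.size, int(round(ratio * n_ref)))
    keep_min = rng.choice(idx_min, size=n_keep, replace=False)
    keep = np.sort(np.concatenate([np.flatnonzero(y != minority_class), keep_min]))
    return X[keep], y[keep]

# Toy data: three classes with 50 samples each; class 2 plays the minority role.
rng = np.random.default_rng(0)
X = rng.random((150, 4))
y = np.repeat([0, 1, 2], 50)
subsets = {ratio: make_imbalanced(X, y, 2, ratio, rng) for ratio in (0.2, 0.3, 0.4)}
```

Each entry of `subsets` holds the features and labels of one imbalanced training set, while the test set is left untouched.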
The recognition accuracy was then validated using a CNN. A variant of LeNet-5 consisting of 2 convolution layers, 2 pooling layers and 2 fully connected layers was used. The sizes of the convolution kernels in both convolution layers were 1 × 5. Fig. 14 and Fig. 15 show confusion matrices generated under different imbalance ratios. When the imbalance ratio is 0.2, the recognition rate of the minority class falls as low as 0.01, and large numbers of the samples in class 6 are misclassified as class 1 and class 5. However, as the imbalance ratio increases, the recognition rate of the minority class improves.
The dataset with an imbalance ratio of 0.2 is used as the training dataset of the 1-D DCGAN. The generated samples are used to expand the imbalanced training dataset, and the number of minority class samples is gradually expanded to equal the number of majority class samples. The recognition accuracy and relative improvements of our method under different imbalance ratios are listed in Table 4, which shows that the recognition rate reaches approximately 0.98 when the dataset is expanded to complete balance. Note that only the accuracy and the F1-score (which is the harmonic mean of precision and recall) are shown in Table 5 and Table 6. When only the original samples are used, the accuracy and F1-score are low, but both scores increase as the compensation ratio increases, finally reaching a high level when the minority ratio exceeds 0.5.
Due to their simplicity and robustness, data-level balancing methods are widely used, among which the synthetic minority oversampling technique (SMOTE) is preferred [49], [50]. Table 7 shows the improvement in recognition using different data-level methods. The 1-D DCGAN improves the recognition rate under different imbalance ratios. The performance of the SMOTE algorithm is similar to that of the 1-D DCGAN; however, the SMOTE algorithm is typically used for feature generation.
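The accuracy and F1-score can be derived from a confusion matrix as sketched below. The sketch assumes that rows index the true class and columns the predicted class; the function name and the example matrix are illustrative and are not taken from the reported results.

```python
import numpy as np

def per_class_metrics(conf):
    """Return (accuracy, precision, recall, f1) from a confusion matrix.

    conf[i, j] is the number of class-i samples predicted as class j
    (assumed layout).
    """
    conf = np.asarray(conf, dtype=float)
    tp = np.diag(conf)

    def safe_div(num, den):
        # Classes without predictions or samples get a score of zero.
        return np.divide(num, den, out=np.zeros_like(num), where=den > 0)

    recall = safe_div(tp, conf.sum(axis=1))      # per-class recognition rate
    precision = safe_div(tp, conf.sum(axis=0))
    f1 = safe_div(2.0 * precision * recall, precision + recall)
    accuracy = float(tp.sum() / conf.sum())
    return accuracy, precision, recall, f1

example = [[8, 2],
           [1, 9]]
accuracy, precision, recall, f1 = per_class_metrics(example)
```

The per-class recall corresponds to the recognition rate of that class, which is why a minority class can have a very low recognition rate even when the overall accuracy remains high.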

V. CONCLUSION
In this paper, we propose a novel radar HRRP data augmentation method based on a 1-D DCGAN to alleviate the imbalanced problem. In the proposed method, the 1-D DCGAN is used to generate HRRP samples for the minority class. Then, the generated HRRP samples are used to compensate the minority class in the imbalanced dataset. The generated HRRP samples are similar to the original HRRP samples in the HRRP domain. The statistical histograms of features and the t-SNE dimensionality reduction results show that the generated HRRPs are also close to the true HRRPs in the feature domain. As the imbalanced dataset becomes increasingly compensated with generated samples, the recognition accuracies with both SVM and CNN improve. We also evaluate other resampling methods, including random undersampling (RUS), random oversampling (ROS) and SMOTE, and the proposed method performs best.
... Department of Electrical Engineering and the Department of Biomedical Engineering, University at Buffalo, The State University of New York, USA. Since 2014, he has been a Lecturer with the School of Information and Electronics, Beijing Institute of Technology, where he has also been an Associate Professor, since June 2019. His research interests include high-resolution radar systems, signal processing, and radar automatic target recognition.
CHENG HU (Senior Member, IEEE) was born in Hunan, China. He received the B.S. degree in electronic engineering from the National University of Defense Technology, in July 2003, and the Ph.D. degree in target detection and recognition from the Beijing Institute of Technology, in July 2009. He was a Visiting Research Associate with the University of Birmingham for a period of 15 months from 2006 to 2007. Since September 2009, he has been with the School of Information and Electronics, Beijing Institute of Technology. He has also been a Full Professor since 2014. He has published over 60 SCI-indexed journal articles and over 100 conference papers. His main research interests include new concept synthetic aperture radar imaging, biological detection radar systems, and signal processing.