SAR Target Image Generation Method Using Azimuth-Controllable Generative Adversarial Network

Sufficient synthetic aperture radar (SAR) target images are very important for SAR research. However, available SAR target images are often limited in practice, which hinders the progress of SAR applications. In this article, we propose an azimuth-controllable generative adversarial network to generate precise SAR target images with an intermediate azimuth between two given SAR images' azimuths. This network mainly contains three parts: 1) generator, 2) discriminator, and 3) predictor. Through the proposed network structure, the generator can extract and fuse the optimal target features from two input SAR target images to generate a SAR target image. Then, a similarity discriminator and an azimuth predictor are designed. The similarity discriminator differentiates the generated SAR target images from the real SAR images to ensure the accuracy of the generated images, while the azimuth predictor measures the difference in azimuth between the generated and the desired images to ensure the azimuth controllability of the generated images. Therefore, the proposed network can generate precise SAR images whose azimuths are controlled well by the inputs of the deep network. It can thus generate target images at different azimuths, which alleviates the small-sample problem to some degree and benefits research on SAR images. Extensive experimental results show the superiority of the proposed method in azimuth controllability and accuracy of SAR target image generation.

For the development of the theory and technology of SAR, research has been carried out in many fields of SAR, such as SAR image despeckling [6], super-resolution [7], target detection, classification, recognition, and multi-sensor image fusion [8], [9]. All this research is driven by SAR data, among which SAR target images are the most important. For example, in the field of SAR image despeckling [10], SAR images are necessary for research on SAR speckle characteristics and on despeckling algorithms, both traditional and deep learning based [11]. SAR multi-sensor image fusion [12]-[14] and pixel image fusion for SAR and optical images [15] both require high-quality SAR images to better fuse scene information and interpret SAR scene images [16]. As the most representative example, in the field of SAR automatic target recognition (ATR) [17]-[22], a large quantity of SAR target images is necessary for acquiring target features, improving the recognition rate, and promoting the practical application of SAR ATR [23].
However, in practice, SAR images are scarce, and SAR image acquisition is difficult and resource-intensive. Even when some SAR images are available, they are likely to have been obtained under different imaging conditions, such as band, platform, and azimuth, so they cannot contain enough information about the scene or target for SAR research. The insufficiency of SAR image data, or the lack of scene or target characteristics in SAR images, has become a great obstacle in almost all SAR fields and hinders the progress of SAR applications [24].
To solve this problem, much research has been carried out in recent years [25]-[31]. There are mainly three ways of acquiring SAR target images: measured data collection, electromagnetic simulation, and sample augmentation [32], [33]. First, measured data collection can obtain SAR target images under different actual scenarios with different platforms. These acquired data are the most authentic and effective. However, this kind of acquisition consumes massive human, material, and time resources, and the number of SAR target images acquired in each experiment is often limited. As a result, it is not a cost-effective way to obtain enough SAR data for applications.
arXiv:2308.05489v1 [eess.IV] 10 Aug 2023
Through 3-D modelling of the target and electromagnetic calculation imaging, electromagnetic simulation is comparatively accurate, although not as accurate as real SAR data [34]. The quality of the simulated SAR target images depends on how precise the 3-D models and electromagnetic calculation methods are. However, the more accurate the 3-D model and electromagnetic calculation method are, the greater the computation and the slower the computing process, which can consume a massive amount of time. Besides, whenever a single radar parameter changes, the electromagnetic simulation needs to start from scratch, without using prior knowledge from existing simulation data.
As a result, although electromagnetic simulation is an alternative approach, it is also not an effective way to solve the shortage of radar data. Finally, sample augmentation, mainly employed in the field of SAR ATR, increases the diversity of the SAR samples and avoids overfitting of the classifier through operations such as translation, rotation, and noise addition [35]. Luo et al. proposed a synthetic minority class data method for improving imbalanced SAR target recognition using the generative adversarial network (GAN) [36]. However, these methods augment the data only from the viewpoint of image processing, and many augmented images do not conform to the laws of radar imaging and do not contain new information. Therefore, they cannot essentially increase the intrinsic information of the target.
In recent years, as deep learning has been applied in signal and image processing and has demonstrated superior performance, many scholars have focused on SAR image generation and have proposed several deep learning methods with outstanding results [37]-[40]. For example, Guo et al. [41] proposed a conditional generative adversarial net (CGAN) with a clutter normalization method to ease model collapse during the generation of SAR images. Cui et al. [42] proposed a deep convolutional GAN (DCGAN) to generate SAR images with random azimuths and employed an azimuth discriminator to select the generated images whose azimuths are close to specific angles. Jiang et al. [43] proposed Gabor deep convolutional neural networks (G-DCNNs), a data augmentation method that uses Gabor filters in DCNNs for SAR ATR. It overcame the severe overfitting caused by limited SAR image training data when applying DCNNs. Zheng et al. [44] proposed a multidiscriminator GAN with label smoothing regularization to generate SAR target images with unclear types. Du et al. [45] proposed a multiconstraint GAN (MCGAN) to generate high-quality multicategory SAR images and address the problem of poor image quality. Mao et al. [46] combined constrained naive generative adversarial networks (CN-GAN) with least squares generative adversarial networks and image-to-image translation to address the problems of low signal-to-clutter-noise ratio, model instability, and excessive freedom of the output that appear in conventional GANs. Saha et al. [47] employed a transfer learning framework using a cycle-consistent generative adversarial network (CycleGAN) trained for the suboptimal task of transcoding SAR images into optical images. These existing deep learning SAR image generators have greatly promoted research on SAR image acquisition.
However, most current SAR image generation methods merely augment the SAR dataset and generate images from random noise, which means that they can generate abundant SAR images but cannot control their azimuths. When the azimuth distribution of the SAR dataset is not balanced, the SAR images generated by current methods are concentrated around certain azimuths. In practice, the features of the target in a SAR image change with the azimuth of the image. A lack of azimuths in the SAR image dataset is therefore equivalent to a lack of target features, which can negatively influence recognition results and other research [48]. Consequently, azimuth-controllable generation of SAR target images is beneficial and necessary for improving target recognition and other research, and there is still a gap between the practical demands and the current methods. Therefore, we propose an azimuth-controllable generative adversarial network in this paper, which can generate precise SAR images with an intermediate azimuth between two given SAR images' azimuths. The azimuth-controllable SAR target image generation method mainly consists of two parts: the generator and the discriminator. First, through the proposed topological structure of the dual-input parallel SAR image input block, the generator can extract the target features from the two input SAR images with different azimuths. Second, the topological structure of the similarity discriminator and the azimuth predictor is designed. The similarity discriminator can distinguish the generated SAR images from the real SAR images, and the azimuth predictor can obtain the distance between the generated image's azimuth and the desired image's azimuth.
Then, the proposed azimuth-controllable SAR target image generation network can generate precise SAR images from two input SAR images, and the azimuths of the generated SAR images can be controlled by the azimuths of the two input SAR images. These generated SAR images at different azimuths can help to solve the small-sample problem to some degree and benefit research on target characteristics in SAR images. The main contributions compared with available works are the following:
(1) We propose an azimuth-controllable GAN framework, which constructs an adversarial game between the generator and the similarity discriminator together with the azimuth predictor, learns the distribution of features and azimuths from two input SAR images with known azimuths, and generates precise SAR images with an intermediate azimuth between the two adjacent azimuths. Besides, the corresponding data formation method enables the azimuth-controllable GAN to gradually learn the low-dimensional manifold of SAR image azimuth.
(2) We propose a generator with a special topological structure, together with an azimuth predictor and a similarity discriminator. This generator can generate precise SAR images with azimuth controllability, and its special structure and SAR image inputs relieve model collapse and stabilize training. The azimuth predictor and the similarity discriminator provide accurate optimization information through their structures and modules and through an adversarial mechanism, in terms of azimuth and similarity respectively. Furthermore, to make the training more stable, the loss of the azimuth predictor uses the mean square error instead of the logarithmic error, and the loss of the similarity discriminator uses the Wasserstein distance to relieve model collapse.
(3) In the experiments, the generated SAR images not only achieve high similarity to real images, both qualitatively and quantitatively, but also greatly improve the recognition performance on a small dataset.
The remainder of this paper is organized as follows. An overview of the azimuth-controllable SAR target image generation network is presented in Section II. Section III evaluates the performance of the proposed SAR target image generation network with experiments, and Section IV gives a brief conclusion.

II. PROPOSED AZIMUTH-CONTROLLABLE GENERATIVE ADVERSARIAL NETWORK
In this section, the proposed azimuth-controllable SAR target image generation network is presented. First, we describe the framework of the network. Then, the specific structures of the generator, discriminator, and predictor in the proposed network are presented.

A. Framework of Proposed Network
According to manifold learning theory, data in a high-dimensional space are often mapped from a low-dimensional manifold, and so are the targets in SAR images. The targets in SAR images over continuous azimuths can be mapped onto a low-dimensional manifold [49], which means that the distribution of the target in SAR images is stable and learnable. Therefore, we propose the azimuth-controllable SAR target image generation network, which, through a specific architecture and input form, gradually learns the low-dimensional manifold of the target over continuous azimuths and generates precise SAR target images at a specific azimuth. The general process by which the proposed network gradually learns the low-dimensional manifold can be described as follows.
Given two SAR target images, $I_1$ and $I_2$, with adjacent azimuths $\theta_1$ and $\theta_2$, as the input of the generator $G$, a fake SAR target image $I_g$ with azimuth $\theta_1 < \theta_g < \theta_2$ is generated by $G$. In the initial state, however, the fake SAR target image $I_g$ shares little information with the real SAR target image $I_r$ at azimuth $\theta_g$. Then, the two images, fake $I_g$ and real $I_r$, are input to the discriminator and predictor, $D_o$ and $D_a$. The similarity discriminator $D_o$ determines whether each image is real or fake, and the azimuth predictor $D_a$ predicts the azimuths of the two images. Through adversarial training, the generator and discriminator are both optimized [50]. Finally, the SAR target image $I_g$ with azimuth $\theta_1 < \theta_g < \theta_2$ is generated with high quality. Having outlined how the proposed network gradually learns the low-dimensional manifold, we next present the framework of the generator, discriminator, and predictor to show how each part of the network completes its function.
As shown in Fig. 1, the framework of the proposed azimuth-controllable SAR target image generation network consists of two parts, a generator and a discriminator. For the generator, a specific topological architecture with two parallel input blocks is designed to extract the target features from the two input SAR images with adjacent azimuths, $\theta_1$ and $\theta_2$. Besides, cascaded residual blocks are adopted in the architecture to preserve and adaptively fuse the extracted target features along the pipeline [51]. For the discriminator, a similarity discriminator $D_o$ and an azimuth predictor $D_a$ are designed. The similarity discriminator distinguishes the generated SAR images from the real SAR images and provides the optimization direction for the image distribution. The azimuth predictor $D_a$ estimates the azimuth of SAR target images accurately and provides the optimization direction for the azimuth, which leads to the azimuth controllability of the generated SAR images. In Fig. 1, the "fake SAR target image" refers to an image generated during training that is not yet good enough and is judged fake by the discriminator. The "generated SAR target image" refers to the image produced after sufficient training, when its features and azimuth are fairly accurate; these are the azimuth-controllable images we want to obtain.
Through the input form and the architecture of the generator, discriminator, and predictor, the proposed azimuth-controllable SAR target image generation network achieves a steadier training process than other GAN architectures for SAR image generation and generates more precise SAR target images with azimuth controllability.

B. Specific Implementation of Generator, Discriminator and Predictor
To generate the SAR target images with the controllable azimuth from the input SAR images with the adjacent azimuths, the structures and losses of the generator, discriminator, and predictor are designed as shown in Fig.2.
To extract the azimuth-specific information and learn the manifold of the target over continuous azimuths, a special topological structure is proposed with three parts: two parallel input blocks $B_{pi1}(\cdot)$ and $B_{pi2}(\cdot)$, an information-fusing block $B_{if}(\cdot)$, and a mapping block $B_m(\cdot)$. The two SAR images $I_1$ and $I_2$ are input to the two parallel input blocks $B_{pi1}(\cdot)$ and $B_{pi2}(\cdot)$ separately, and the target features are learned separately from the two input images with adjacent azimuths by
$$l_{pi1} = B_{pi1}(I_1), \qquad l_{pi2} = B_{pi2}(I_2)$$
where $l_{pi1}$ and $l_{pi2}$ denote the target features under the different azimuths of the two inputs. Then the information-fusing block $B_{if}(\cdot)$ is employed to adaptively obtain and retain the target information of the SAR target image with azimuth $\theta_1 < \theta_g < \theta_2$ by mapping the information between $l_{pi1}$ and $l_{pi2}$, which can be defined as
$$R_{interp} = B_{if}(l_{pi1}, l_{pi2})$$
where $R_{interp}$ denotes the mapped information of the SAR target image with azimuth $\theta_1 < \theta_g < \theta_2$. Finally, combining the target information, the mapping block $B_m(\cdot)$ maps the information from the high-dimensional space into the 2-D image space to generate the final precise SAR target image. Through this topological structure, the generator obtains the capability of mapping the two input SAR images $I_1$ and $I_2$ with azimuths $\theta_1$ and $\theta_2$ to the SAR image $I_g$ with azimuth $\theta_1 < \theta_g < \theta_2$.
To maximize the retention of target information, the residual block is adopted in all three blocks, and batch normalization [52] is adopted in each layer.
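The data flow of the generator described above can be illustrated with a toy sketch. This is not the network of Fig. 2: dense layers stand in for the convolutional layers, batch normalization is omitted, and the class name `ToyGenerator`, the 8 × 8 image size, and the feature width are all hypothetical choices made only to show how the two parallel input blocks, the information-fusing block, and the mapping block compose, each with a residual connection.

```python
import numpy as np

rng = np.random.default_rng(0)

def residual_block(x, w):
    """y = x + relu(x @ w): the skip connection keeps the input features."""
    return x + np.maximum(x @ w, 0.0)

def make_weights(dim):
    # Small random weights; a trained network would learn these instead.
    return rng.normal(scale=0.01, size=(dim, dim))

class ToyGenerator:
    """Hypothetical stand-in for G: (B_pi1, B_pi2) -> B_if -> B_m."""

    def __init__(self, size=8, feat=32):
        self.size = size
        self.w_in1 = rng.normal(scale=0.1, size=(size * size, feat))  # B_pi1
        self.w_in2 = rng.normal(scale=0.1, size=(size * size, feat))  # B_pi2
        self.w_res1 = make_weights(feat)
        self.w_res2 = make_weights(feat)
        self.w_fuse = rng.normal(scale=0.1, size=(2 * feat, feat))    # B_if
        self.w_res_f = make_weights(feat)
        self.w_map = rng.normal(scale=0.1, size=(feat, size * size))  # B_m

    def __call__(self, img1, img2):
        # Parallel input blocks: one feature vector per input azimuth.
        l_pi1 = residual_block(img1.reshape(1, -1) @ self.w_in1, self.w_res1)
        l_pi2 = residual_block(img2.reshape(1, -1) @ self.w_in2, self.w_res2)
        # Information-fusing block: combine both feature vectors.
        fused = np.concatenate([l_pi1, l_pi2], axis=1) @ self.w_fuse
        r_interp = residual_block(fused, self.w_res_f)
        # Mapping block: project back to the 2-D image space.
        return (r_interp @ self.w_map).reshape(self.size, self.size)

g = ToyGenerator()
i_1 = rng.random((8, 8))   # stand-in for the image at azimuth theta_1
i_2 = rng.random((8, 8))   # stand-in for the image at azimuth theta_2
i_g = g(i_1, i_2)          # stand-in for the intermediate-azimuth image
```

With untrained random weights the output is meaningless as an image; the sketch only shows that two separately encoded inputs are fused into one representation before being mapped back to image size.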
Then, the similarity discriminator and azimuth predictor are designed with similar structures. They both use strided convolutions instead of pooling layers, which allows the network to learn its own spatial downsampling, and leaky rectified linear unit (LReLU) activation [53] for all layers. Meanwhile, batch normalization is adopted and all dense layers are removed for deeper architectures. For the similarity discriminator, the output of the last convolutional layer is flattened and then fed into a vector multiplication. After the parameters of the similarity discriminator are updated, weight clipping is applied so that the objective function corresponds to the earth-mover (EM) distance [54]. For the azimuth predictor, considering its sensitivity to the azimuth, deformable convolution [55] is adopted. Deformable convolution can capture the target contour and direction by changing the shapes of the convolution kernels [55]. The output of the last convolutional layer of the azimuth predictor is flattened without activation, as it is just a vector multiplication.
As described above, the training process of the proposed azimuth-controllable SAR target image generation network is a competition among the generator, discriminator, and predictor. More formally, the total value function of the proposed network consists of two functions, $F_1$ and $F_2$. In these functions, $\mathbb{E}$ denotes the expectation operator, $Eu$ denotes the Euclidean distance, $I_r$ denotes the input real SAR images of the discriminator and predictor, $I_1$ and $I_2$ denote the two input real SAR images of the generator, $P_{data}(I)$ denotes the distribution of the real SAR images, and $\theta_g$ is the azimuth of the input real SAR images of the predictor. The first two terms of $F_1$ ensure that the generated SAR images can be accurately distinguished by the discriminator. The third term of $F_1$ drives the generator to generate images with the expected azimuth. $F_2$ is an azimuth loss for the azimuth predictor, which is intended to minimize the azimuth distance between the real SAR images and the generated SAR images.
Therefore, the loss of the similarity discriminator is defined with the same notation: $\mathbb{E}$ denotes the expectation operator, $I_r$ denotes the input real SAR images of the discriminator and predictor, $I_1$ and $I_2$ denote the two input real SAR images of the generator, and $P_{data}(I)$ denotes the distribution of the real SAR images. For the azimuth predictor, the mean square error is used. The loss of the azimuth predictor is defined over $I_r$, the input real SAR images of the discriminator and predictor, and $\theta_g$, the azimuth of the input real SAR images of the predictor.
As for the loss function of the generator, the Wasserstein-1 distance is used to replace the Jensen-Shannon divergence of the traditional GAN. While the proposed azimuth-controllable SAR target image generation network is training, the generator, discriminator, and predictor are updated alternately. Therefore, as the discriminator and predictor learn to recognize the images and predict the azimuth more accurately, the generator makes the generated SAR images closer to the real images in similarity and azimuth.
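The three losses described above can be sketched numerically as follows. This is only an illustration under stated assumptions: the critic loss follows the standard Wasserstein GAN form, the azimuth terms use the mean square error, and the weight `lam` between the similarity and azimuth terms of the generator is hypothetical. The exact expressions and weights used by the authors may differ.

```python
import numpy as np

def critic_loss(d_real, d_fake):
    """Standard WGAN critic loss (assumed form).

    d_real: outputs of D_o on real images I_r.
    d_fake: outputs of D_o on generated images G(I_1, I_2).
    Minimizing it widens the gap between real and fake scores.
    """
    return float(np.mean(d_fake) - np.mean(d_real))

def predictor_loss(pred_azimuth_real, true_azimuth):
    """Mean square error between D_a(I_r) and the labelled azimuth theta_g."""
    diff = np.asarray(pred_azimuth_real, dtype=float) - np.asarray(true_azimuth, dtype=float)
    return float(np.mean(diff ** 2))

def generator_loss(d_fake, pred_azimuth_fake, desired_azimuth, lam=1.0):
    """Wasserstein term plus an azimuth term.

    `lam` is a hypothetical weighting factor, not taken from the paper.
    """
    diff = np.asarray(pred_azimuth_fake, dtype=float) - np.asarray(desired_azimuth, dtype=float)
    return float(-np.mean(d_fake) + lam * np.mean(diff ** 2))
```

For instance, with critic scores of 1 and 3 on two real images and 0 and 1 on two generated images, `critic_loss` evaluates to 0.5 - 2.0 = -1.5; a more negative value means the critic separates the two sets better.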
The exact steps of the data formation can be described as follows. For each target type, suppose the SAR images are $\{x_1, \ldots, x_n\}$. An image $x_k$ with azimuth $\theta_k$ is taken as one half of the generator input. With a set azimuth interval $\delta$, the other half is set by finding an image $x_{kf}$ with the azimuth closest to $\theta_k + \delta$ in the sub-dataset $\{x_{k+1}, \ldots, x_n\}$. The input of the generator is then $\{x_k, x_{kf}\}$, and the input of the discriminator is set by finding images $x_{kd}$ with azimuths between those of $x_k$ and $x_{kf}$. At this point, one combination of the inputs to the generator and discriminator is $\{x_k, x_{kf}\}$ and $\{x_{kd}\}$. The next combination then starts in the subset without $\{x_k, x_{kd}, x_{kf}\}$.
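The data formation steps can be sketched as a greedy pairing over the azimuths of one target type. The function name `form_combinations` is hypothetical, and several details are assumptions rather than the authors' procedure: images are first sorted by azimuth, the discriminator images are taken to be those with azimuths strictly between the two generator inputs, an image with no intermediate real image is simply skipped, and wrap-around at 360° is ignored.

```python
def form_combinations(azimuths, delta):
    """Greedy pairing for one target type.

    azimuths: azimuths in degrees, one per image; list indices identify images.
    delta: the set azimuth interval in degrees.
    Returns a list of (k, kf, kd_list) index tuples, where x_k and x_kf feed
    the generator and the images in kd_list feed the discriminator.
    """
    remaining = sorted(range(len(azimuths)), key=lambda i: azimuths[i])
    combos = []
    while len(remaining) >= 3:
        k = remaining[0]
        later = remaining[1:]
        target = azimuths[k] + delta
        # x_kf: the remaining image whose azimuth is closest to theta_k + delta.
        kf = min(later, key=lambda i: abs(azimuths[i] - target))
        # x_kd: real images lying between the two generator inputs (assumed).
        kd = [i for i in later if azimuths[k] < azimuths[i] < azimuths[kf]]
        if kd:
            combos.append((k, kf, kd))
            used = {k, kf, *kd}
        else:
            used = {k}  # no intermediate real image: skip x_k (assumption)
        remaining = [i for i in remaining if i not in used]
    return combos
```

For azimuths `[0, 2, 5, 7, 10, 12]` and `delta = 5`, this yields `[(0, 2, [1]), (3, 5, [4])]`: the images at 0° and 5° form the generator input with the 2° image as the real reference, and likewise 7° and 12° with the 10° image.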
Through the network design above and the training process, the parameters of the network are optimized. The proposed azimuth-controllable SAR target image generation network can then generate precise SAR images and control the azimuth of the generated SAR images.

III. EXPERIMENTS AND RESULTS
In this section, the performance of the proposed azimuth-controllable SAR target image generation network is evaluated. The Moving and Stationary Target Acquisition and Recognition (MSTAR) dataset is used to evaluate the whole network, and the training and testing image data are first introduced in detail. Then, the ability of the network to generate stable and precise SAR target images is evaluated with different azimuth intervals, and the quality of the generated SAR target images is presented as well. Finally, as an evaluation of one application of the generated SAR target images, the improvement and comparison of SAR ATR under both standard operating conditions (SOC) and extended operating conditions (EOC) are presented [23].

A. Dataset and Configuration
The experimental dataset used to evaluate our proposed azimuth-controllable SAR target image generation network is collected from the MSTAR program. This dataset was released by the U.S. Defense Advanced Research Projects Agency and the Air Force Research Laboratory. The dataset was collected using the Sandia National Laboratory STARLOS sensor platform. As a benchmark dataset for SAR ATR performance assessment, it contains a significant quantity of SAR images of ten different classes of ground targets (tank: T62, T72; rocket launcher: 2S1; truck: ZIL131; armored personnel carrier: BTR70, BTR60, BRDM2, BMP2; air defense unit: ZSU23/4; and bulldozer: D7), which are captured as 1-ft resolution X-band SAR images with full azimuth coverage (in the range of 0° to 360°). These SAR images are collected under varying operating conditions, such as different aspect angles, depression angles, and serial numbers. The SAR images and corresponding optical images of the targets at similar aspect angles are depicted in Fig. 3 [23].
Fig. 3. SAR images and corresponding optical images of targets at similar aspect angles.
On the basis of the proposed azimuth-controllable GAN, a specific implementation is employed to evaluate the proposed framework, which is presented in Fig. 2. The size of the input SAR images is 88 × 88, and the stride of every convolutional layer in the generator is 1 × 1. Other hyperparameters of our network instances are shown in Fig. 2. During training, the weight clipping range is −0.01 to 0.01 for the discriminator. To balance the adversarial game among the generator, discriminator, and predictor, the discriminator and predictor are optimized 25 times for each optimization of the generator.
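The update schedule stated above (25 discriminator and predictor updates per generator update, with discriminator weights clipped to [−0.01, 0.01]) can be sketched framework-independently. The three `update_*` callables are hypothetical hooks standing in for the actual optimizer steps, which are not given in the text; clipping right after each discriminator update follows the description in Section II-B.

```python
import numpy as np

N_CRITIC = 25   # discriminator/predictor updates per generator update (from the text)
CLIP = 0.01     # weight-clipping bound for the similarity discriminator (from the text)

def clip_weights(weights, bound=CLIP):
    """Clamp every discriminator parameter array to [-bound, bound]."""
    return [np.clip(w, -bound, bound) for w in weights]

def train(num_generator_steps, update_discriminator, update_predictor,
          update_generator, d_weights):
    """Alternate the updates of the three sub-networks.

    update_discriminator(weights) -> new weights   (hypothetical hook)
    update_predictor()                              (hypothetical hook)
    update_generator()                              (hypothetical hook)
    """
    for _ in range(num_generator_steps):
        for _ in range(N_CRITIC):
            d_weights = update_discriminator(d_weights)
            d_weights = clip_weights(d_weights)  # clip after each discriminator update
            update_predictor()
        update_generator()
    return d_weights
```

Training the discriminator far more often than the generator is a common choice with weight-clipped Wasserstein critics, since the generator's gradient is only informative when the critic is close to optimal.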
The proposed method is tested and evaluated on a computer with an Intel Core i7-9700K CPU at 3.6 GHz, a GeForce GTX 1080 Ti GPU, and two 16 GB memory modules. The proposed method is implemented using the open-source TensorFlow framework.

B. Evaluation of Generation Ability of Proposed Network
In this subsection, after introducing the dataset configuration, the generation ability of the proposed azimuth-controllable SAR target image generation network is evaluated from qualitative and quantitative aspects. To show the visual similarity, selected generated images of the ten targets are presented alongside real images with close azimuths. For the quantitative similarity, image similarity metrics are employed.

1) Dataset Configuration:
The SAR images at 17° depression angle are used as the training and testing dataset for evaluating the generation ability. The original number of SAR images in the MSTAR dataset is listed in Table I. The original dataset is divided into training and testing datasets at a ratio of 1:1, whose sizes are also listed in Table I.
As described above in Section II-B, the input combinations are formed separately for azimuth intervals of 5°, 10°, 15°, and 20°. After the azimuth interval is set to a certain value and the MSTAR dataset is divided into training and testing datasets, both datasets go through the data selection described above to obtain the training and testing combinations listed in Table II. To provide enough training data for the proposed GAN, the data listed in Table II are augmented 10 times by randomly sampling ten 88 × 88 SAR image chips from each original 128 × 128 SAR image, which ensures that the target remains complete []. Finally, the training and testing data are used to evaluate the performance of the azimuth-controllable SAR target image generation network.
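The tenfold augmentation can be sketched as random cropping. The function name `random_chips` and the uniform sampling of crop offsets are assumptions; the text only states that ten 88 × 88 chips are sampled from each 128 × 128 image. Under this sketch, every chip contains the central 48 × 48 region of the original image, so the target stays complete as long as it lies within that region.

```python
import numpy as np

def random_chips(image, chip=88, count=10, seed=None):
    """Sample `count` random chip x chip crops from a 2-D image.

    For a 128 x 128 input and chip=88 the offsets range over 0..40, so each
    crop always covers rows and columns 40..87 of the original image.
    """
    rng = np.random.default_rng(seed)
    h, w = image.shape
    if h < chip or w < chip:
        raise ValueError("image is smaller than the requested chip size")
    chips = []
    for _ in range(count):
        r = int(rng.integers(0, h - chip + 1))  # top-left row offset
        c = int(rng.integers(0, w - chip + 1))  # top-left column offset
        chips.append(image[r:r + chip, c:c + chip])
    return np.stack(chips)
```

Calling `random_chips(img, seed=0)` on a 128 × 128 array returns an array of shape `(10, 88, 88)`.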
2) Evaluation in Visual Similarity: To evaluate the generation capability of the proposed azimuth-controllable SAR target image generation network, generated SAR images of the 10 targets are randomly chosen at azimuths ranging from 0° to 360° and presented in Fig. 4 together with the input real SAR images of the generator. Besides, as the size of the training and testing datasets decreases, the generated SAR images are presented together to evaluate the robustness under a limited training dataset.
As shown in Fig. 4, comparing the generated SAR images with the two inputs of the generator shows that the azimuths of the generated SAR images are intermediate between the azimuths of the input SAR images. When the target type varies, the azimuths of the generated SAR images remain in the desired range. Besides, when the azimuth interval is 5°, 10°, 15°, and 20°, respectively, the azimuths of the generated SAR images still stay in the desired azimuth range. In conclusion, the proposed network provides azimuth controllability of the generated SAR images.
From Fig. 5, by comparing with the real SAR images of close azimuth, it is clear that the generated SAR images capture the geometric features and morphological structures of the real SAR images accurately. When the target type and azimuth vary, the generated SAR images still maintain high quality. Furthermore, comparing the generated images with the real ones from top to bottom shows that the generated SAR images preserve enough similarity to the real images, although the image details fade a little as the azimuth interval increases. As the azimuth interval in the dataset increases, the size of the dataset declines; the proposed azimuth-controllable SAR target image generation network nevertheless shows resilience to the smaller datasets and still generates accurate SAR images stably.
We also carried out experiments on the upper limit of the training azimuth interval that still yields high-quality SAR images. From Fig. 5, when the azimuth interval increases from 5° to 20°, the quality of the generated SAR images decreases noticeably, and at an interval of 20° the generated SAR images start to show obvious over-smoothing. Moreover, SAR images generated with an azimuth interval of 30° are shown in Fig. 6 and Fig. 7. The distributions of these generated SAR images are not similar to those of the real images. Therefore, according to the experimental results, the limit lies between 20° and 30°.
In conclusion, the proposed azimuth-controllable SAR target image generation network can generate accurate SAR images with precise geometric features and morphological structures and with azimuth controllability when the target type, azimuth, and azimuth interval differ.
3) Evaluation in Quantitative Similarity: To evaluate the generated SAR images more objectively, three common metrics of image similarity are employed: mean square error (MSE) [56], structural similarity (SSIM) [57], and mean structural similarity (MSSIM) [58]. MSE is a direct distance between the real SAR images and the generated SAR images. SSIM focuses on the whole image, while MSSIM focuses more on the local details in the image. They can be calculated as follows.
$$\mathrm{MSE} = \frac{1}{mn}\sum_{i=1}^{m}\sum_{j=1}^{n}\left[x(i,j) - y(i,j)\right]^2$$
where $y$ denotes the generated SAR image, $x$ denotes the real SAR image with close azimuth, and $m$ and $n$ are the length and width of the SAR images. The smaller the MSE, the higher the similarity between the real and generated SAR images.

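$$\mathrm{SSIM} = \frac{(2\mu_x\mu_y + c_1)(2\sigma_{xy} + c_2)}{(\mu_x^2 + \mu_y^2 + c_1)(\sigma_x^2 + \sigma_y^2 + c_2)}$$
where $\mu_x$ and $\sigma_x$ denote the mean value and the standard deviation of the real image $x$, $\mu_y$ and $\sigma_y$ denote those of the generated image $y$, $\sigma_{xy}$ is the covariance of $x$ and $y$, and $c_1$ and $c_2$ are two constants related to the dynamic range of the pixel values. SSIM ranges from -1 to 1, where 1 indicates perfect similarity.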
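The MSE and the whole-image SSIM can be computed directly from these definitions, as in the following sketch. The constants `k1 = 0.01` and `k2 = 0.03`, with $c_1 = (k_1 L)^2$ and $c_2 = (k_2 L)^2$ for dynamic range $L$, are the values commonly used with SSIM and are an assumption here; the text only states that the constants depend on the dynamic range.

```python
import numpy as np

def mse(x, y):
    """Mean square error between a real image x and a generated image y."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    return float(np.mean((x - y) ** 2))

def ssim_global(x, y, dynamic_range=255.0, k1=0.01, k2=0.03):
    """SSIM computed once over the whole image (no sliding window).

    k1 and k2 are the commonly used defaults, assumed rather than sourced.
    """
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    c1 = (k1 * dynamic_range) ** 2
    c2 = (k2 * dynamic_range) ** 2
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return float(numerator / denominator)
```

Comparing an image with itself gives an MSE of 0 and an SSIM of 1, which is a quick sanity check before applying the metrics to generated images normalized to [0, 255].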
MSSIM divides the SAR images into $N$ blocks with a sliding window, then calculates the weighted mean, variance, and covariance of each block with weights $w_{i,j}$, where $\sum_i \sum_j w_{i,j} = 1$, to obtain the SSIM of each block. Finally, the average of the SSIM values of all the blocks is taken as MSSIM:
$$\mathrm{MSSIM} = \frac{1}{N}\sum_{k=1}^{N}\mathrm{SSIM}(x_k, y_k)$$
where $y_k$ denotes the $k$th block of a generated SAR image and $x_k$ denotes the $k$th block of a real SAR image. As with SSIM, higher is better. Before the three metrics are calculated, the generated images and real images are normalized to the range [0, 255].
Fig. 4. Ten randomly chosen generated SAR images and the input real SAR images of the generator. In each of the ten combinations of three images in each subfigure, the left two images are the input real SAR images of the generator and the right one is the generated image. (a) Generated SAR images and the input real SAR images with 5° azimuth interval, (b) with 10° azimuth interval, (c) with 15° azimuth interval, and (d) with 20° azimuth interval.
The value of SSIM decreases from 0.73, while the value of MSSIM remains stable above 0.99. In short, according to the three metrics, the generated SAR images are of high quality and precise in similarity. To further verify the effectiveness of the method, we compared the images generated by our method with those of other excellent generation methods, such as WGAN [54], DCGAN [59], and SARGAN [41]. The comparison of MSE, SSIM, and MSSIM is listed in Table IV. The results summarized in Table IV show that our model achieves better scores than the others, which indicates that the SAR images generated by our method are more similar to the real SAR images.
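The block-wise MSSIM computation can be sketched as follows. Uniform weights within each window, the 8 × 8 window, the stride of 4, and the constants `k1` and `k2` are all assumptions made for illustration; the weighting $w_{i,j}$ and window size used in the experiments are not stated.

```python
import numpy as np

def _ssim_block(x, y, c1, c2):
    """SSIM of one pair of blocks with uniform weights w_ij = 1 / (block size)."""
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    return ((2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))

def mssim(x, y, window=8, stride=4, dynamic_range=255.0, k1=0.01, k2=0.03):
    """Mean of the block-wise SSIM values over a sliding window.

    x: real image, y: generated image, both 2-D arrays of the same shape.
    window, stride, k1 and k2 are assumed values, not taken from the paper.
    """
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    if x.shape != y.shape:
        raise ValueError("images must have the same shape")
    h, w = x.shape
    if h < window or w < window:
        raise ValueError("image is smaller than the window")
    c1 = (k1 * dynamic_range) ** 2
    c2 = (k2 * dynamic_range) ** 2
    scores = []
    for r in range(0, h - window + 1, stride):
        for c in range(0, w - window + 1, stride):
            scores.append(_ssim_block(x[r:r + window, c:c + window],
                                      y[r:r + window, c:c + window], c1, c2))
    return float(np.mean(scores))
```

Because every block contributes equally to the mean, local mismatches lower MSSIM even when global statistics of the two images agree, which is why it is described as focusing on local details.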
The visual and numerical results above show that the proposed azimuth-controllable SAR target image generation network can generate precise, high-quality SAR images between two adjacent azimuths while preserving the geometry and local details when the target type and azimuth vary. When the azimuth interval of the training/testing dataset increases and the size of the training/testing dataset declines, the proposed network can still generate precise SAR images based on the real SAR images with azimuth controllability.
To further evaluate the practical value of the generated images, recognition experiments are presented below under the reduced training/testing datasets.

C. Evaluation of SAR ATR Performance Improvements
In this subsection, the improvement in recognition performance brought by the generated SAR images is evaluated under the standard operating condition (SOC) and the extended operating condition (EOC) separately. SOC means that the serial numbers and target configurations of the training and testing sets are the same, while the aspects and depression angles differ. We employ the recognition network proposed in [60], denoted as A-ConvNets. This network is composed of five convolution layers and three max-pooling layers, and it has achieved high recognition performance on the full MSTAR dataset. Therefore, it is employed to evaluate the recognition performance without and with the SAR images generated by our proposed network.
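As a rough illustration of how a network with five convolution layers and three max-pooling layers can map an image chip to per-class scores, the sketch below traces the feature-map size through such a stack. Only the layer counts come from the text; the kernel sizes, channel widths, layer ordering, and absence of padding are assumptions for illustration, and [60] should be consulted for the actual configuration. The 88 × 88 input follows the chip size used for training in the SOC experiment.

```python
# Hypothetical layer list: (kind, kernel, stride, channels). Only the numbers of
# convolution and max-pooling layers come from the text; the kernel sizes and
# channel widths are assumed for illustration.
LAYERS = [
    ("conv", 5, 1, 16), ("pool", 2, 2, None),
    ("conv", 5, 1, 32), ("pool", 2, 2, None),
    ("conv", 6, 1, 64), ("pool", 2, 2, None),
    ("conv", 5, 1, 128),
    ("conv", 3, 1, 10),  # 10 output channels, one per target class
]


def trace_shapes(size, layers=LAYERS):
    """Return the spatial size after each unpadded convolution or pooling layer."""
    sizes = []
    for _kind, kernel, stride, _channels in layers:
        size = (size - kernel) // stride + 1
        sizes.append(size)
    return sizes
```

Under these assumptions, `trace_shapes(88)` ends at a 1 × 1 map with ten channels, so the final convolution can act directly as the classifier without a fully connected layer.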
1) Improvement of Recognition Results under SOC: In the SOC experiment, the training dataset is captured at a 17° depression angle and the testing dataset at a 15° depression angle. The entire training and testing dataset is listed in Table V. The sizes of the training and testing datasets change with the azimuth interval. Therefore, for clarity, the training and testing datasets used to evaluate the recognition performance are summarized by azimuth interval in Table VI. To provide enough training data for the CNN, the data listed in Table VI are augmented 10 times by randomly sampling ten 88 × 88 SAR image chips from each original 128 × 128 SAR image, which keeps the target complete [60].
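The chip-sampling augmentation described above can be sketched as follows. This is a minimal illustration under the assumption that crop offsets are drawn uniformly; the function name `random_chips` and its parameters are hypothetical, and images are represented as plain lists of rows.

```python
import random


def random_chips(image, chip=88, count=10, rng=None):
    """Sample `count` random chip x chip crops from a 2-D image (list of rows)."""
    rng = rng or random.Random()
    height, width = len(image), len(image[0])
    if chip > height or chip > width:
        raise ValueError("chip size exceeds image size")
    chips = []
    for _ in range(count):
        top = rng.randint(0, height - chip)   # randint bounds are inclusive
        left = rng.randint(0, width - chip)
        chips.append([row[left:left + chip] for row in image[top:top + chip]])
    return chips
```

With a 128 × 128 image and 88 × 88 chips, the offsets range from 0 to 40, so every chip contains the central 48 × 48 region of the original image. For a target centered in the image, this is one way to see why the target is not cut off by the random sampling.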
The primitive recognition results, for which the training dataset is only the reduced dataset without the generated SAR images, are presented in Fig. 8 with blue bars. The evolved recognition results, for which the training dataset combines the reduced dataset and the generated SAR images, are presented in Fig. 8 with red bars. The targets in the histograms of Fig. 8 are ordered as BMP2-9563, BTR70-c71, T72-132, BTR60-7532, 2S1-b01, BRDM2-E71, D7-92, T62-A51, ZIL131-E12, and ZSU234-d08.
As shown in Fig. 8(a), the overall recognition rate is improved from 96.22% to 98.22% at the 5° azimuth interval. For most target types, the recognition rates are improved by at least 2.00-3.00%, especially for 2S1, which improves from 85.23% to 94.66%. In Fig. 8(b), the overall recognition rate is improved from 95.05% to 97.74%. As with the 5° azimuth interval, the recognition rates of most target types are clearly improved. In addition, the recognition rates of BMP2 and 2S1, the two classes that most limit the overall recognition rate, are improved from 91.53% to 95.29% and from 83.02% to 90.64%, respectively. In Fig. 8(c), the overall recognition rate is improved from 93.88% to 97.14%. Although the recognition rates of BMP2, 2S1, and T62 are still not high enough, the other targets obtain an obvious improvement. Finally, in Fig. 8(d), the overall recognition rate is improved from 92.53% to 96.44%. The primitive recognition rates of BMP2, T62, and T72 are lower than 90.00%, and that of 2S1 is lower than 80.00%. After the generated SAR images are employed, the recognition rates of BMP2, T62, T72, and 2S1 are improved to 93.33%, 92.68%, 97.71%, and 89.96%, respectively, corresponding to improvements of 8.59%, 4.25%, 13.56%, and 8.88%. In conclusion, these comparisons demonstrate that employing the generated SAR images increases the stable distinguishable features available for recognition and clearly improves the recognition rates under different azimuth intervals.
As shown by the blue bars of the primitive results in Fig. 8, the overall recognition rate decreases gradually from 96.22% to 92.53%. It is therefore clear that the recognition rate decreases as the azimuth interval increases and the size of the training dataset declines. From the improvements at the different azimuth intervals, 2.00% at 5°, 2.69% at 10°, 3.26% at 15°, and 3.91% at 20°, it can be concluded that the generated SAR images contribute more recognition information, and hence a larger improvement, as the size of the dataset declines. All the comparisons above demonstrate that the recognition results under SOC can be improved by employing the generated images and that the stable distinguishable features for recognition are increased among different targets, which demonstrates the superiority of the proposed azimuth-controllable SAR target image generation network.
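The per-interval improvements quoted above follow directly from the overall SOC recognition rates reported for Fig. 8. The short check below recomputes them; the variable names are illustrative only.

```python
# Overall SOC recognition rates (%) quoted in the text, keyed by azimuth
# interval in degrees: (primitive, evolved).
soc_rates = {
    5: (96.22, 98.22),
    10: (95.05, 97.74),
    15: (93.88, 97.14),
    20: (92.53, 96.44),
}

# Absolute gain in percentage points, rounded to two decimals.
gains = {interval: round(evolved - primitive, 2)
         for interval, (primitive, evolved) in soc_rates.items()}

for interval, gain in gains.items():
    print(f"{interval:>2} deg interval: +{gain:.2f} percentage points")
```

The gains grow monotonically with the azimuth interval, which matches the observation that the generated images help more as the real training set shrinks.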
We also carried out comparison experiments with different numbers of real images. The generated SAR images are produced at the 5° azimuth interval, and the real images are chosen from the real images that were used to generate them. The recognition performances are as follows. In Fig. 9, the red bars denote training samples that contain both real SAR images and generated images, called evolved, and the powder-blue bars denote training samples that contain only real SAR images, called primitive. As shown on the x-axis of Fig. 9, the proportion of training samples is set to 80.00%, 50.00%, 30.00%, and 20.00%. To keep the number of training samples the same for evolved and primitive, the evolved training samples are augmented 10 times and the primitive training samples are augmented 20 times. The results in Fig. 9 show that the evolved recognition performances under SOC are clearly higher than those obtained with real SAR images only. From 100.00% to 20.00% of the training samples, the primitive recognition rates decrease markedly, whereas the evolved performances are robust to the reduced training samples. With 20.00% of the training samples, the primitive network nearly fails to recognize the targets, whereas the evolved result still stays around 89.00%. It is clear that the generated SAR images can promote higher recognition performance.
2) Improvement of Recognition Results under EOC: In practical applications of SAR ATR, the recognition operation faces many limitations, such as variations in depression angle and target type. Therefore, evaluation under EOC is quite important. In this section, the SAR images generated by the proposed azimuth-controllable SAR target image generation network are assessed under variations in depression angle, target configuration, and target version, denoted as EOC-D, EOC-C, and EOC-V, respectively. A change in depression angle can severely degrade recognition performance.
First, the performance of the generated SAR images is assessed. Limited by the fact that the MSTAR dataset only contains four targets ... Fig. 10. In Fig. 10, the primitive recognition rates of EOC-D decrease gradually from 90.22% to 87.52% as the azimuth interval increases from 5° to 20°. Besides, the recognition rates are improved by 2.00-5.00% from the primitive to the evolved.
The recognition performances under variations of target configuration and version (EOC-C and EOC-V) are also evaluated. The training datasets for EOC-C and EOC-V include four targets (BMP-2, BRDM-2, BTR-70, and T-72) at a 17° depression angle, as listed in Table V. The numbers of training samples of the four targets are listed in Table IX and are augmented by the same method as in SOC. The testing datasets for EOC-C and EOC-V are listed in Table X and Table XI, respectively. As shown in Table X, two different serial types of BMP2 and five different serial types of T72, captured at 17° and 15° depression angles, are employed to evaluate the recognition performance under variations of target configuration (EOC-C). As shown in Table XI, four different serial types of T72, captured at 17° and 15° depression angles, are used to evaluate the recognition performance under variations of target version (EOC-V).
The recognition performances of EOC-C and EOC-V under different azimuth intervals are shown in Fig. 11 and Fig. 12, respectively.
Comparing the EOC-C performance in Fig. 11 both across azimuth intervals and between the primitive and evolved results shows that the evolved performance is slightly better than the primitive one. Although the improvement is small when the azimuth interval is as small as 5° or 10°, it reaches around 3.00% when the azimuth interval increases to 15° or 20°. In short, the generated SAR images are still useful for recognition under EOC-C.
Fig. 12 demonstrates that the overall recognition rates of EOC-V under different azimuth intervals are improved by 2.00-4.00% by employing the generated SAR images. As the azimuth interval increases, the improvement in the overall recognition rate decreases from 4.00% to 2.00%, which may result from sensitivity to the azimuth interval. Although the improvement is limited at large azimuth intervals, the recognition rates are clearly improved from 94.20% to 98.07% at the 5° interval and from 93.45% to 97.36% at the 10° interval. In conclusion, the generated SAR images are beneficial for EOC-V.
The four experimental results of SOC, EOC-D, EOC-C, and EOC-V show that the generated SAR images benefit recognition performance in every case. This demonstrates that the proposed azimuth-controllable SAR target image generation method is capable of learning the precise distribution of SAR images over continuous azimuth and reconstructing images from two adjacent images, which has great prospects in SAR ATR applications.

D. Comparison with Other Augmentation Methods
In this section, the proposed method is compared with other augmentation methods in terms of recognition performance. DNN1 [61], DNN2 [62], and CNN+matrix [63] use simple augmentation methods such as cropping, rotation, and shifting. WGAN-GP focuses on image data augmentation to generate new samples for SAR ATR [64]. DCGAN is employed to reduce the negative impact of incorrectly labeled samples in SAR ATR [59]. MGAN targets semi-supervised SAR ATR and aims to improve the performance of SAR ATR under a limited training dataset [65]. SSDTL employs a variety of unlabeled samples to train a GAN [66]. IGAN achieves semi-supervised generation and recognition simultaneously [67]. DNN2(PoseSy) denotes the recognition performance with pose-synthesis augmentation, Multiscale [68] employs random rotation and flipping, and Weakly [69] employs random rotation in the recognition process. The recognition performances under SOC are listed in Table XII. In Table XII, the numbers of labeled SAR images used for training the recognition networks are given as ranges, and the exact number of labeled training samples for each method is given in parentheses after its recognition rate.
Table XII shows that our proposed method outperforms the others with a limited sample size under SOC. In particular, when the total number of training samples is only 214, our method can generate effective SAR images for recognition and still achieves a recognition rate of 96.44%, a significant improvement over the recognition performance of the other methods when their training samples number around 200. Therefore, it can be concluded that our proposed algorithm is superior to the other augmentation methods for SAR image generation or augmentation.

IV. CONCLUSION
In this paper, an azimuth-controllable SAR target image generation network is proposed to address the problem of insufficient SAR target images, and its effectiveness is verified by the experimental results. Through its specific topological structure, the generator extracts and optimally fuses the target features to produce precise SAR target images, guided by the similarity and azimuth-distance information provided by the similarity discriminator and the azimuth predictor. The proposed network is thus capable of generating precise SAR images from two input SAR images, and the azimuths of the generated SAR images can be controlled by the azimuths of the two given SAR images.
Extensive experiments have been carried out on the MSTAR dataset, and the results clearly show not only that the images generated by the proposed network are similar to the real SAR images, but also that the azimuth of the generated SAR images is controllable. Besides, the generated SAR images can greatly benefit the performance of SAR ATR, especially in small-sample situations. Applied to SAR datasets with different imaging conditions and research demands, the proposed azimuth-controllable SAR target image generation method can contribute to the practical development of most SAR research.