Simulation of GPR B-Scan Data Based on Dense Generative Adversarial Network

Urban subsurface infrastructures, e.g., pipelines and roads, are aging as modern cities expand. Thanks to its capability for nondestructive detection, ground penetrating radar (GPR) has been widely applied to the detection of underground objects and disasters, and GPR B-scan images are usually interpreted manually. The high subjectivity and uncertainty of manual interpretation inevitably lead to missed detections. Meanwhile, the shortage of labeled images greatly impedes the automation of GPR-based underground disaster detection. Many data simulation techniques, e.g., forward modeling, have been used to augment images for training; however, the generated forward images are not similar enough to real B-scan data, which makes recognition a challenging task. To address this problem, we proposed a novel B-scan image simulation method based on a generative adversarial network to generate synthetic images for training detection networks. Our network uses DenseNet as the backbone network of the generator to extract image features and adds a weighted total variation regularization term to the loss function of the network. The comparison and ablation experiments verified that our network could generate simulation images with high similarity to real GPR B-scan images. We believe that this work contributes to the intelligent processing and analysis of GPR data and improves the efficiency of underground disaster detection.


I. INTRODUCTION
Urbanization is essentially a population migration from rural to urban areas, which has brought us towns, cities, and even supercities over the last few hundred years. In this process, urban infrastructure gradually ages, and service beyond its intended lifetime is very common. Conventional detection techniques, however, are expensive as routine inspection tools and impractical in modern cities because of heavy traffic. New techniques and equipment with high efficiency and low labor cost are urgently needed for the development of modern cities. As a new technology for the detection of underground objects and disasters, ground penetrating radar (GPR), with advantages such as nondestructiveness and high efficiency, has been widely applied in various fields to detect pavement structures [1] or underground objects, e.g., underground pipe networks [2], buried explosion hazards [3], [4], and underground disasters [5]. GPR emits high-frequency electromagnetic waves into the ground, receives echoes that vary with the characteristics of the underground medium, and forms two-dimensional images called GPR B-scan images. By manually interpreting these B-scan images, one can recognize underground objects and potential disasters in a nondestructive manner. However, manual recognition is subjective and uncertain and carries a high risk of missing potential disasters. Deep learning, a popular technology in academia and industry, provides a potential solution for processing, interpreting, and analyzing GPR data. It still faces a barrier, however, i.e., the shortage of labeled B-scan images, which greatly limits the performance of deep learning algorithms and consequently impedes the research and application of artificial intelligence algorithms for GPR data.
Forward modeling is a common technique for generating synthetic images from the parameter configuration of the radar and the media, and it has also become a tool for researchers studying the recognition of underground disasters. Giannopoulos [6] developed the open-source software Gprmax, which simulates electromagnetic wave propagation in fields with different media. Aydin et al. [7] trained a convolutional network on simulated GPR images generated by Gprmax to detect buried objects. Aydin et al. [8] also used Gprmax to simulate the electromagnetic waveforms of wires buried in soils with different moisture levels and realized wire detection based on transfer and multitask learning. Pham et al. [9] detected underground objects by using Faster-RCNN [10] trained on a mixed dataset containing synthetic images generated by Gprmax and real GPR images. Real GPR B-scan data are influenced by many factors, e.g., underground environment, electromagnetic conditions, and atmosphere, but forward models such as Gprmax cannot completely model the influence of these factors, and the generated forward images are much simpler than, and not similar enough to, the real ones, as shown in Fig. 1. This poses a great challenge for training detection networks and limits the application of deep learning to GPR images.
Fig. 1. GPR B-scan data and its simulation images of disasters. The first column is the real GPR B-scan images, and the last two columns are synthetic images generated by the forward modeling algorithm and by ours, respectively.
As a popular kind of deep neural network, the generative adversarial network (GAN) [11] has opened a new direction for many intelligent applications. Hammami et al. [12] augmented a computed tomography (CT) dataset by using cycle-GAN [13] and trained a deep neural network on the dataset for detecting multiple organs. Guo et al. [14] used GAN to generate synthetic aperture radar (SAR) images that retain the original characteristics of real images. Sun et al. [15] proposed a multisensor fusion and explicit semantic preserving-based deep Hashing method named MsEspH to deal with the discrepancies between VHR and SAR images. Lin et al. [16] proposed MARTA-GAN based on DCGAN to realize scene generation of remote sensing images. Gao et al. [17] proposed a semisupervised method based on DCGAN for recognizing objects in SAR images. Lebedev et al. [18] and Peng et al. [19] made improvements based on CGAN and the original GAN, respectively, and applied them to change detection in remote sensing images. Jiang et al. [20] proposed EEGAN to enhance the edges of remote sensing images, and Yu et al. [21] proposed CDGAN to realize superresolution reconstruction of remote sensing images. Considering the promising capability of GAN for generating images, Veal et al. [22] used GAN to simulate GPR B-scan images of buried explosive targets. Akçali and Erden [23] first employed Gprmax to generate synthetic GPR images, which are fed to DCGAN to simulate GPR B-scan images; the data are then mixed with real GPR B-scan data to train a detection network for buried targets.
To improve the quality of synthetic images for augmenting the GPR dataset, we proposed a deep GAN called Dense-GAN by combining DenseNet [24] with GAN. The proposed network generates synthetic GPR images more similar to the real ones than forward modeling does. The main advantages of our network are as follows.
1) Introducing GAN to simulate GPR B-scan data greatly improves the simulation quality.
2) Using DenseNet instead of the ResNet found in various GANs greatly improves the ability to express details in the B-scan image.
3) Adding a weighted-TV (w-TV) term to the loss function helps preserve the edges of synthetic GPR images.
The associated qualitative and quantitative experiments verified that our network could efficiently generate synthetic images more similar to the real ones.
The remainder of this article is organized as follows. Section II briefly reviews the works closely related to ours. In Section III, we describe the proposed Dense-GAN, including the network architecture and loss function. Section IV provides qualitative and quantitative experiments to show the performance of our network. Finally, Section V concludes the article.

II. RELATED WORKS
A typical GAN consists of a generator G and a discriminator D and converges when the two reach a Nash equilibrium. G captures the distribution of images and generates forged images; D, a binary classifier, estimates the probability that an image is real rather than forged. Training a GAN is essentially a min-max game, and its loss function [11] is defined as
$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim P_{data}(x)}[\log D(x)] + \mathbb{E}_{z \sim P_z(z)}[\log(1 - D(G(z)))] \tag{1}$$
where $x$ denotes the real images; $P_{data}$ is the distribution of the real images; $z$ is the random noise image from which the forged image $G(z)$ is generated; $P_z$ is the distribution of $z$; and $\mathbb{E}_{z \sim P_z(z)}$ is the expectation over it. The training of GAN is realized in an alternating manner as follows.
1) With G fixed, train the discriminator D for one or more epochs to maximize $V(D, G)$.
2) With D fixed, train the generator G for one or more epochs to minimize $V(D, G)$.
When the network converges, D cannot distinguish the forged images from the real ones, and the forged images generated by G are similar enough to the real ones.
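The value function of the min-max game can be estimated from batches of discriminator outputs. The following is a minimal NumPy sketch, assuming the standard form from [11]; the function name `gan_value` and the sample probabilities are illustrative and are not taken from the experiments in this article.

```python
import numpy as np

def gan_value(d_real, d_fake, eps=1e-12):
    """Monte Carlo estimate of V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))].

    d_real: discriminator outputs D(x) on real images, values in (0, 1).
    d_fake: discriminator outputs D(G(z)) on forged images, values in (0, 1).
    """
    # Clip to avoid log(0) when the discriminator saturates.
    d_real = np.clip(np.asarray(d_real, dtype=float), eps, 1.0 - eps)
    d_fake = np.clip(np.asarray(d_fake, dtype=float), eps, 1.0 - eps)
    return float(np.mean(np.log(d_real)) + np.mean(np.log(1.0 - d_fake)))

# A discriminator that separates real from forged images pushes V toward 0,
# which is what step 1) of the alternating scheme (maximizing over D) aims at.
v_good_d = gan_value(d_real=[0.99, 0.98], d_fake=[0.01, 0.02])

# When D cannot tell the two apart it outputs 0.5 everywhere,
# so V = log(0.5) + log(0.5) = -log(4).
v_equilibrium = gan_value(d_real=[0.5, 0.5], d_fake=[0.5, 0.5])
```

In an actual training loop, the discriminator update would ascend this quantity and the generator update would descend it, each with the other network held fixed.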
As a convolutional neural network (CNN) with additional dense connections between layers, DenseNet improves performance through feature fusion and bypass connections instead of increasing the number of layers or neurons. In DenseNet, every pair of layers is connected, so the input of each layer is the union of the outputs of all previous layers in the network. The output of the lth layer can be formulated as
$$x_l = H_l([x_0, x_1, \ldots, x_{l-1}]) \tag{2}$$
where $x_l$ is the output of the lth layer; $[x_0, x_1, \ldots, x_{l-1}]$ denotes the concatenation of the feature maps of all preceding layers; and $H_l(\cdot)$ is a nonlinear function. DenseNet combines features at different scales by concatenation instead of the summation operator used in ResNet.
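The dense connectivity pattern can be illustrated with a toy block in which each layer sees the concatenation of everything produced before it. This is only a sketch: the nonlinear function $H_l$ is replaced by a hypothetical random 1 × 1 convolution followed by ReLU, and the channel counts are arbitrary.

```python
import numpy as np

def dense_block(x0, num_layers, growth_rate, rng):
    """Toy dense block on a (channels, height, width) feature array."""
    features = [x0]
    for _ in range(num_layers):
        # Input of layer l is the concatenation [x_0, x_1, ..., x_{l-1}].
        stacked = np.concatenate(features, axis=0)
        # Hypothetical H_l: random 1x1 convolution followed by ReLU.
        weights = rng.standard_normal((growth_rate, stacked.shape[0]))
        x_l = np.maximum(np.einsum("oc,chw->ohw", weights, stacked), 0.0)
        features.append(x_l)
    # The block output keeps every intermediate feature map.
    return np.concatenate(features, axis=0)

rng = np.random.default_rng(0)
x0 = rng.standard_normal((16, 8, 8))
out = dense_block(x0, num_layers=4, growth_rate=12, rng=rng)
print(out.shape)  # (64, 8, 8): 16 input channels + 4 layers * 12 new channels
```

Because the inputs are concatenated rather than summed, the original feature maps remain available unchanged to every later layer.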

III. PROPOSED NETWORKS
A. Network Architecture
1) Generator: As a variant of DenseNet, DenseNet-BC has fewer parameters and is less prone to overfitting, so we take it as the backbone network of the generator for feature extraction. As shown in Fig. 2, DenseNet-BC is mainly composed of convolution layers, dense blocks, and transition layers. Each dense block consists of batch normalization (BN), ReLU, and convolution operations arranged as BN + ReLU + Conv 1×1 + BN + ReLU + Conv 3×3 to extract features. Transition layers adjust the scale of the feature maps so that multiple dense blocks can be cascaded, as shown in the left panel of Fig. 3, and each of them is realized by BN + ReLU + Conv 1×1 + average pooling, as shown in the right panel of Fig. 3.
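How dense blocks and transition layers change the size of the feature maps can be traced with simple bookkeeping. The sketch below is illustrative only: the growth rate, the number of layers per block, the compression factor of 0.5, and the 2 × 2 average pooling are assumptions borrowed from common DenseNet-BC settings in [24], not the configuration of our generator, and all function names are hypothetical.

```python
def dense_block_out_channels(in_channels, num_layers, growth_rate):
    # Each BN + ReLU + Conv 1x1 + BN + ReLU + Conv 3x3 layer appends
    # growth_rate new feature maps to what it received.
    return in_channels + num_layers * growth_rate

def transition(channels, size, compression=0.5):
    # Conv 1x1 compresses the channels; 2x2 average pooling halves the size.
    return int(channels * compression), size // 2

def trace_backbone(in_channels, size, block_layers, growth_rate):
    """Return (stage, channels, spatial size) after every block and transition."""
    trace = []
    channels = in_channels
    for index, num_layers in enumerate(block_layers):
        channels = dense_block_out_channels(channels, num_layers, growth_rate)
        trace.append(("dense", channels, size))
        # A transition layer sits between consecutive dense blocks.
        if index < len(block_layers) - 1:
            channels, size = transition(channels, size)
            trace.append(("transition", channels, size))
    return trace

trace = trace_backbone(in_channels=24, size=64, block_layers=[6, 12], growth_rate=12)
for stage in trace:
    print(stage)
```

The compression in the transition layers is what keeps the channel count, and hence the parameter count, from growing unchecked as blocks are cascaded.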
2) Discriminator: The discriminator in Dense-GAN has two parts, as shown in Fig. 4. The first part is composed of eight Conv-LeakyReLU combinations (one followed by seven more), where the size of the convolution kernels is 7 × 7; the convolution strides are 1, 2, 1, 2, 1, 2, 1, and 2, respectively; and the numbers of convolution kernels are 64, 64, 128, 128, 256, 256, 512, and 512, respectively. The last part of the discriminator consists of the activation functions LeakyReLU and Sigmoid. The output of the discriminator is the probability that the input image belongs to the real ones.
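The effect of the listed strides on the feature-map size can be checked with the usual convolution output formula. In this sketch, the 256 × 256 input size and the padding of 3 pixels (which keeps the size unchanged at stride 1 for a 7 × 7 kernel) are assumptions made for illustration; only the kernel size, strides, and kernel counts come from the description above.

```python
KERNEL = 7
STRIDES = [1, 2, 1, 2, 1, 2, 1, 2]
FILTERS = [64, 64, 128, 128, 256, 256, 512, 512]

def conv_out(size, kernel, stride, padding):
    # Standard output-size formula for a convolution along one dimension.
    return (size + 2 * padding - kernel) // stride + 1

def discriminator_shapes(height, width, padding=KERNEL // 2):
    """Return (channels, height, width) after each Conv-LeakyReLU combination."""
    shapes = []
    for filters, stride in zip(FILTERS, STRIDES):
        height = conv_out(height, KERNEL, stride, padding)
        width = conv_out(width, KERNEL, stride, padding)
        shapes.append((filters, height, width))
    return shapes

# Hypothetical 256 x 256 input patch.
shapes = discriminator_shapes(256, 256)
for shape in shapes:
    print(shape)
```

Under these assumptions, the four stride-2 convolutions reduce each spatial dimension by a factor of 16 before the final activation stage.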

B. Loss Function
The loss function of our network is designed based on the perceptual loss proposed by Ledig et al. [25] and has three terms: a VGG-based similarity term $E_g(G)$, an adversarial loss term, and a weighted TV (w-TV) regularization term $E_{wtv}$. $E_g(G)$ measures the similarity between the forged images and the real images by using a VGG network. The VGG network, proposed by Simonyan et al., is a deep CNN composed of convolution layers and pooling layers; it is used here to extract the features of real and forged GPR B-scan images, and the extracted features are then used to calculate $E_g(G)$. In this calculation, $f_{i,j}(\cdot)$ denotes the feature map of the jth convolution layer before the ith max pooling layer, and $W_{i,j}$ and $H_{i,j}$ are the dimensions of the feature map $f_{i,j}$. The adversarial loss term measures how similar the forged image is to the real image. Total variation (TV) regularization [26] is a common technique to keep the forged image smooth; however, it does not discriminate between the edges and the flat regions in images and therefore blurs the edges of forged images. Meanwhile, considering that edges and image details are crucial for recognizing underground objects and disasters in GPR B-scan images, the w-TV term weights the TV penalty by $g(\cdot, \cdot)$, which is essentially an edge indicator computed with $G_\sigma$, a Gaussian kernel function with standard deviation $\sigma$; $g(\cdot, \cdot)$ herein indicates the edges in real GPR B-scan images. In this way, the image edges in real GPR B-scan images are well retained in the forged images, while the flat regions remain smooth.
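A minimal NumPy sketch of how an edge-weighted TV penalty of this kind can be evaluated is given here. It assumes, as illustrative choices that the text does not confirm, that the edge indicator takes the common form $g = 1 / (1 + |\nabla (G_\sigma * I^{True})|^2)$ and that the penalty is the mean of $g \cdot |\nabla \hat{I}|^2$ over the forged image $\hat{I}$; all function names are hypothetical.

```python
import numpy as np

def gaussian_kernel1d(sigma):
    radius = max(1, int(3 * sigma))
    t = np.arange(-radius, radius + 1, dtype=float)
    kernel = np.exp(-(t ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()

def gaussian_smooth(img, sigma):
    """Separable Gaussian smoothing G_sigma * img with edge padding."""
    kernel = gaussian_kernel1d(sigma)
    padded = np.pad(img, len(kernel) // 2, mode="edge")
    rows = np.apply_along_axis(lambda r: np.convolve(r, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, kernel, mode="valid"), 0, rows)

def grad_sq(img):
    # Squared gradient magnitude from finite differences.
    gy, gx = np.gradient(img)
    return gx ** 2 + gy ** 2

def edge_indicator(real, sigma=1.0):
    # Assumed form: close to 1 in flat regions, close to 0 on strong edges.
    return 1.0 / (1.0 + grad_sq(gaussian_smooth(real, sigma)))

def tv_loss(forged):
    # Plain TV penalty: smooths edges and flat regions alike.
    return float(np.mean(grad_sq(forged)))

def wtv_loss(forged, real, sigma=1.0):
    # Weighted TV: the penalty is relaxed where the real image has edges.
    return float(np.mean(edge_indicator(real, sigma) * grad_sq(forged)))

# Synthetic stand-in for a B-scan patch: a vertical step edge.
real = np.zeros((16, 16))
real[:, 8:] = 10.0
g = edge_indicator(real)
plain = tv_loss(real)
weighted = wtv_loss(real, real)
```

On this toy patch, the weighted penalty is much smaller than the plain TV penalty because nearly all of the gradient energy sits on the edge, where $g$ is small; a flat forged image incurs no penalty under either term.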

IV. EXPERIMENTS
To verify the performance of our network, we compared it with GAN, DCGAN, and SRGAN, since the first two networks, i.e., GAN and DCGAN, were used in [22] and [23] to augment GPR data, and SRGAN [25] inspired our network, i.e., Dense-GAN. Considering that four underground disasters, i.e., void, cavity underneath pavement (CUP), loosely infilled void (LIV), and water-rich void (WRV), are defined in the standard for comprehensive detection and risk evaluation of underground disasters in urban areas, we compared the simulation performance on these four disasters. Limited by the difficulty of acquiring and labeling real GPR B-scan images, we have 1193 GPR B-scan images on hand, of which 751 (293 void, 272 CUP, 106 LIV, and 80 WRV) are selected as the training set and 442 (180 void, 202 CUP, 35 LIV, and 25 WRV) are used to test the simulation performance. Once training is finished, we use the trained generators to simulate the different B-scan images and then compare the networks on 442 forged B-scan images.
The structures of the four networks provided in Table I show that GAN basically uses fully connected (FC) layers and activation functions (i.e., tanh and Sigmoid) to build its generator and discriminator; DCGAN replaces the fully connected layers with a CNN because of its better performance in feature learning; SRGAN takes ResNet as the backbone of the generator to extract deeper features; and our network takes DenseNet as the backbone network of the generator to better fuse shallow and deep features.
In addition, the parameter scale, memory size, and computation cost of the four networks are provided in Table II. The statistics show that GAN is the lightest and most efficient of the four networks; both the number of parameters and the computation cost of DCGAN increase since a CNN is applied; SRGAN shows a dramatic increase both in memory size, since it has to store the output of the previous layer, and in computation cost, since more convolution operations are needed; our network takes DenseNet as the backbone network of the generator, which needs more parameters and more memory since DenseNet has to store more feature maps for fusion.

A. Comparison Experiments
To provide a visual evaluation, we randomly selected two forged images from each underground disaster, i.e., void, CUP, LIV, and WRV, respectively.
As shown in Fig. 5, the GPR B-scan images forged by GAN do not simulate the real ones well, particularly on the boundaries between different media. Those forged by DCGAN are smoother than the real ones and miss a great deal of wave detail that is important for disaster classification.
Fig. 5. Forged images generated by the four networks. The first column is the real GPR B-scan image, the second column is the forged image generated by GAN, the third column is the one generated by DCGAN, the fourth column is the one generated by SRGAN, and the fifth column is the one generated by our network, i.e., Dense-GAN. The first two rows are of void, the second two rows are of CUP, the third two rows are of LIV, and the last two rows are of WRV.
Compared with the previous two methods, SRGAN can simulate GPR data, but oversmoothing and missing details still appear in its forged images. Among these methods, our network, i.e., Dense-GAN, achieves the visual simulation result most similar to the real GPR data. In addition, we provide an objective comparison in terms of peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) on the forged B-scan images, shown in Table III. These quantities show that our network achieved the best PSNR and SSIM scores for each kind of disaster.
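For readers who wish to reproduce this kind of comparison, the two metrics can be sketched as follows. The PSNR follows its standard definition; the SSIM shown is a simplified single-window (global) variant with the customary constants, whereas library implementations average the index over local sliding windows, so its values are not comparable with those in Table III. The function names and the 8-bit data range are assumptions.

```python
import numpy as np

def psnr(real, forged, data_range=255.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    mse = np.mean((real.astype(float) - forged.astype(float)) ** 2)
    if mse == 0:
        return float("inf")
    return float(10.0 * np.log10(data_range ** 2 / mse))

def global_ssim(real, forged, data_range=255.0):
    """Simplified SSIM computed once over the whole image."""
    x = real.astype(float)
    y = forged.astype(float)
    c1 = (0.01 * data_range) ** 2  # stabilizes the luminance term
    c2 = (0.03 * data_range) ** 2  # stabilizes the contrast/structure term
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = np.mean((x - mx) * (y - my))
    numerator = (2.0 * mx * my + c1) * (2.0 * cov + c2)
    denominator = (mx ** 2 + my ** 2 + c1) * (vx + vy + c2)
    return float(numerator / denominator)

rng = np.random.default_rng(0)
real = rng.uniform(0.0, 200.0, size=(32, 32))   # stand-in for a real B-scan
forged = real + 1.0                              # uniform error of one gray level
noisy = real + rng.normal(0.0, 20.0, size=real.shape)

psnr_forged = psnr(real, forged)
ssim_same = global_ssim(real, real)
ssim_noisy = global_ssim(real, noisy)
```

Higher values of both metrics indicate a forged image closer to the real one; identical images give an infinite PSNR and an SSIM of 1.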

B. Ablation Experiments
To augment the GPR B-scan data and further improve the quality of the simulation images, we proposed Dense-GAN by introducing DenseNet-BC and a w-TV regularization term. Considering that these are the main contributions of our network, we provide ablation experiments to show the necessity and effectiveness of these two improvements.
Since ResNet is a backbone network widely used in various GANs, e.g., SRGAN, we compared it with DenseNet. To make the comparison, we replaced DenseNet with ResNet in our network to form a network that is applied to simulate GPR B-scan images of the four underground disasters. In Fig. 6, we found that as the number of epochs increases, the simulated disasters become more similar to the real ones; however, with DenseNet as the generator backbone, the forged images from our network achieve better quality, with clearer echo wave responses, in fewer epochs than those from the network with ResNet as the generator backbone. In addition, to verify whether our network can generate qualified forged images with fewer epochs, we also compared these two networks in terms of SSIM and PSNR at the same numbers of epochs. Fig. 7 shows that the PSNR and SSIM curves of forged images generated with DenseNet are all higher than those of ResNet after the same number of epochs.
To keep more image details in forged images, which are crucial for recognizing underground disasters, we designed a w-TV regularization term, i.e., (6), for the loss function of our network. To verify its effectiveness, we formed three loss functions: one by removing the w-TV term, i.e., $E_{wtv}$, from our loss function; one by replacing $E_{wtv}$ with a traditional TV regularization term, i.e., $E_{tv} = |\nabla G_{\theta_G}(I^{True})|^2$; and one by keeping our loss function unchanged. We then compared the forged images from the networks trained with these three loss functions. Fig. 8 shows that, for the four disasters, the forged images generated by the network with the w-TV regularization term are more similar to the real GPR B-scan images. This experiment suggests that the w-TV regularization term contributes to generating higher-quality simulation B-scan images with image details.
Fig. 8. Ablation experiment on the w-TV regularization term in the loss function. The first column is the real GPR B-scan image, the second column is the image patch in the red rectangle, the third column is the image patch generated by our network with the w-TV term removed from the loss function, the fourth column is the one generated by our network with the w-TV term replaced by the TV term, and the fifth column is the one generated by our network.
In summary, the comparison experiments suggest that our network generates simulation GPR B-scan images more similar to the real ones, and the ablation experiments illustrate the effectiveness of DenseNet for feature fusion and of the w-TV regularization term for keeping details in simulation images.

V. DISCUSSION
Thanks to its nondestructive nature, GPR is a suitable modern detection technique for underground disasters of urban infrastructure, e.g., roads and pipelines; however, the shortage of labeled GPR data makes it difficult to apply artificial intelligence methods, particularly deep neural networks, to automatic underground disaster detection. To augment GPR B-scan data for training networks, we proposed a network based on GAN by introducing DenseNet as the backbone of the generator and designing a w-TV regularization term for the loss function. The comparison results suggest that our network achieves promising simulation performance and produces high-quality GPR B-scan simulation images. The ablation experiments also verified the necessity of our improvements. In the future, we will focus on generating simulation B-scan images of subsurface disasters and objects and on using the simulation data mixed with real B-scan images to train various deep networks for the detection of underground disasters and buried objects.