Restoration of Single Sand-Dust Image Based on Style Transformation and Unsupervised Adversarial Learning

Since dust particles in the air scatter and absorb light, images captured in sand-dust weather mostly show low contrast, color deviation and blurriness, seriously affecting the reliability of visual tasks. Currently, pixel-level enhancement and prior-based methods are used to restore sand-dust images. However, these methods cannot accurately extract semantic information from the images due to the loss of information and the complexity of the scene depth, which may lead to color distortion and blurred textures in the restored image. We thus present a two-stage restoration method based on style transformation and an unsupervised sand-dust image restoration network (USDR-Net). In the first stage, the grayscale distribution compensation (GDC) method is used to transform the style of the sand-dust image. After transformation, color shift is eliminated and potential information is restored in the balanced image. In the second stage, USDR-Net first employs the dark channel prior and the transmission map enhancement network (TME-Network) to generate and refine the transmission map of the balanced image to improve the accuracy of scene depth. Then, it reconstructs a clear image with natural color and high contrast via adversarial learning with unpaired sand-dust and clear images. Extensive experimental results show that our method outperforms state-of-the-art algorithms in both qualitative and quantitative evaluations. The mean average precision on the object detection datasets increases from 16.79% to 68.82%.


I. INTRODUCTION
During sand-dust weather, a large number of dust particles with large radii are suspended in the air, absorbing and scattering the light reflected from objects before it reaches the imaging equipment. As shown in Figure 1, blue and green light is absorbed by dust particles much faster than red light. The absorption seriously attenuates the grayscale distribution in the blue and green channels of the RGB image, leading to overall color deviation and image distortion. Additionally, the imaging equipment receives sunlight scattered from the surfaces of the suspended particles. The scattered light adds noise to the image, causing loss of detail and reduction in contrast, and resulting in dark and blurry images. These problems seriously affect the reliability of outdoor vision-related applications, such as intelligent monitoring, autonomous navigation, vehicle tracking, and monitoring and control systems [1], [2], [3], [4]. Therefore, it is of great practical significance to study visibility restoration algorithms for sand-dust images, eliminate the influence of sand-dust weather on images, and improve the reliability of outdoor vision tasks.
At present, there are mainly two categories of image restoration methods for addressing color deviation and low contrast, namely image enhancement methods and image restoration methods based on physical models. Image enhancement methods improve the contrast and correct the color of images only by enhancing pixels. There are two main categories of image enhancement methods: spatial domain image enhancement and frequency domain transformation image enhancement. Spatial domain image enhancement directly adjusts pixel grayscale values. Frequency domain transformation enhancement transforms the image from the spatial domain into the frequency domain and sharpens it by digital filtering. Although these methods can improve contrast without regard to scene depth or atmospheric conditions, they have limited ability to remove dust from an image and suffer from local dust residues.
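As a concrete illustration of spatial-domain pixel enhancement, the sketch below applies classic histogram equalization to a single channel. This is a minimal NumPy version for illustration only; the function name and implementation details are ours, not drawn from any method discussed in this paper.

```python
import numpy as np

def histogram_equalize(channel: np.ndarray) -> np.ndarray:
    """Spatial-domain enhancement: remap gray levels so the cumulative
    histogram becomes approximately uniform, stretching contrast."""
    hist = np.bincount(channel.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]
    # classic HE lookup table: scale the shifted CDF to [0, 255]
    lut = np.clip(np.round((cdf - cdf_min) / (cdf[-1] - cdf_min) * 255),
                  0, 255).astype(np.uint8)
    return lut[channel]
```

Note that, as the text observes, such a remapping ignores scene depth and atmospheric conditions entirely: it stretches gray levels globally and cannot remove dust that varies with depth across the scene.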
In restoration methods based on physical models, sand-dust imaging mechanisms and the physical principles of light propagation are analyzed to establish an effective degradation model, in which the key parameters are derived from prior knowledge. Fog and dust can be effectively removed using these methods. However, such filters tend to over-enhance the contrast of the image, which may result in halos, color block distortion, and image distortion.
Deep learning has shown promise in underwater image restoration [5], [6], image dehazing [7], [8], and raindrop removal [9]. Instead of selecting parameters manually, network models can learn them automatically. For example, a convolutional neural network (CNN) can be applied to estimate the parameters of the atmospheric scattering model, or a generative adversarial network (GAN) can be used to restore sand-dust-degraded images. Texture details and color fidelity can be significantly improved using these methods. However, these methods have serious limitations. First, these models require paired label data for training, but it is difficult to obtain paired clear and dust images in real outdoor scenes. Second, due to the complexity of sand-dust image features, serious information loss, and limited domain knowledge, existing models cannot accurately and comprehensively extract semantic information from sand-dust images, resulting in poor restoration.
To address these problems, we propose a two-stage sand-dust image restoration algorithm. In the first stage, the sand-dust image style is transformed into a balanced image without color cast by compensating the grayscale distribution of the heavily attenuated blue and green channels. Then, in the second stage, the dark channel prior and an unsupervised adversarial network are combined to optimize the transmission map of the balanced image and reconstruct a clear image with real color and high contrast. Figure 2 shows the restoration result of a sand-dust image obtained using the proposed algorithm.
The main contributions of this paper are as follows:
1. A GDC method based on information loss constraints is proposed, which converts the style of a sand-dust image into a balanced image, eliminates color deviations, and recovers potential feature information.
2. For the first time, unsupervised adversarial learning is applied in the task of sand-dust image restoration, and more natural results are generated by training on a large number of real images.
3. Compared with the deep learning-based restoration methods trained on paired synthesized images, our method is trained on real images. Paired data of sand-dust and corresponding clear images is not needed, which eliminates the dependence on sand-dust ground-truth images.

II. RELATED WORK

A. IMAGE ENHANCEMENT METHODS BASED ON NONPHYSICAL MODELS
Classical algorithms with pixel-level enhancement for treating sand-dust images include histogram equalization (HE) [10], gray world assumptions [11] and gray edge assumptions [12]. Xu [13] proposed a tensor optimization model to enhance the edges and details of sand-dust images. Cheng et al. [14] proposed an optical compensation method to correct the color of dust-degraded images. To enhance the quality of dust-degraded images, Fu et al. [15] proposed a fusion algorithm that corrects images through gamma parameters. Based on spatial and transformed domains, Yan et al. [16] proposed a method for enhancing dust images. Yang et al. [17] combined depth estimation, color analysis, and visibility recovery modules to address color distortion in sand-dust images. Park et al. [18] proposed a pixel-adaptive color correction method to achieve color correction and improve the clarity of sand-dust images. Alameen [19] proposed a rapid processing method for dust-degraded images, using optimized fuzzy enhancement operators in the histogram statistics of the R, G, and B channels of the sand-dust image.

B. IMAGE RESTORATION METHODS BASED ON PHYSICAL MODELS
The atmospheric scattering model has been commonly used in computer vision applications, especially in processing images captured under bad weather conditions [2]. The Koschmieder model is defined by the following equation:

I(x) = J(x)t(x) + A(1 − t(x)), (1)

where x is the position of the pixel in the image, I(x) is the observed image, J(x) is the clear image, and A is the ambient light value in the atmosphere. t(x) is the transmittance, which represents the fraction of light rays that can pass through the haze medium to reach the imaging equipment. Assuming A is homogeneous,

t(x) = e^(−βd(x)), (2)

where β is the scattering coefficient of the atmosphere and d(x) is the scene depth.
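The degradation model can be sketched numerically. The snippet below synthesizes an observed image I from a clear image J under the Koschmieder model; it is a minimal illustration (variable names are ours), useful for checking the two limiting cases of zero depth and very large depth.

```python
import numpy as np

def koschmieder(J: np.ndarray, A: float, beta: float, d: np.ndarray) -> np.ndarray:
    """I(x) = J(x) t(x) + A (1 - t(x)), with t(x) = exp(-beta * d(x))."""
    t = np.exp(-beta * d)        # transmittance decays with scene depth
    return J * t + A * (1.0 - t)
```

At zero depth the transmittance is 1 and the observation equals the clear scene; as depth grows, the observation converges to the ambient light A, which is exactly why distant regions of sand-dust images look washed out.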
He et al. [8] proposed a method to remove fog based on dark channel prior knowledge. They estimated the atmospheric light and transmission for foggy images and applied an atmospheric scattering model to remove fog from the images. Peng et al. [20] estimated transmission by calculating the difference between the observed intensity and the ambient light in the sand-dust image. Based on an atmospheric scattering model and information loss constraints, Yu et al. [21] proposed an image restoration algorithm. To realize sand-dust image restoration, Kim et al. [22] calculated transmittance and atmospheric light based on image saturation. Shi et al. [23] transformed a sand-dust image into a haze image in the LAB space and restored sand-dust images by DCP fog removal and contrast enhancement. Cheng et al. [24] proposed an algorithm that combines white balance, gamma correction, and the saliency map to restore sandstorm-degraded images. Gao et al. [25] used the RBCP algorithm to estimate the atmospheric light and transmission map, and applied gray world theory to restore sand-dust images. Due to the complex degraded characteristics of sand-dust images, these methods cannot effectively extract the semantic components of an image; their feature extraction efficiency and generalization ability are low. To solve these problems, we present a two-stage restoration method based on style transformation and an unsupervised sand-dust image network to generate high-quality restored images.

III. PROPOSED METHOD
We decompose sand-dust image restoration into two stages: sand-dust image style transfer and image restoration based on unsupervised adversarial learning. Our motivation is twofold. First, sand-dust images have complex features, and the information loss is severe. As a result of inaccurate and incomplete semantic information extraction, direct processing of sand-dust images is likely to yield poor results. Therefore, it is necessary to perform style transfer preprocessing first to recover the lost information in the sand-dust images. Second, supervised learning methods need paired clear and sand-dust images, which are difficult to obtain in real-world outdoor scenes, whereas unsupervised methods do not require paired images and thus overcome this issue.
As shown in Figure 3, in the first stage, the GDC method is used to convert the style of the sand-dust image into a balanced image. In the second stage, USDR-Net adopts the DCP and the TME-Network to refine the transmission map of the balanced image and reconstructs the final clear images via unsupervised adversarial training. Here, TME-Network represents the refining network, D refers to the discriminator, J_rec is the reconstructed dehazed image via Eq. (13), and I_rec is the reconstructed hazy image via Eq. (1).

A. SAND-DUST IMAGE STYLE TRANSFER
Blue and green light is more readily absorbed by dust particles than red and orange light, resulting in significant gray-level attenuation of the blue and green channels in sand-dust images. To recover the lost details in the images, we propose a GDC algorithm based on information loss constraints, which transforms the grayscale distribution of a sand-dust image. Using the grayscale distribution of the unattenuated red channel as a reference, the algorithm iteratively compensates and adjusts the grayscale distributions of the attenuated channels within the range [0, 255], so that the grayscale distribution curves of the attenuated and unattenuated channels approximate each other. The algorithm works as follows.
(1) The sand-dust image is analyzed in the RGB space to collect the histogram statistics of the two-channel grayscale distribution.
(2) Samples are taken uniformly from the grayscale histogram of the red channel to generate a two-dimensional target point cloud P (Gray value and number of pixels with the same gray value); the target point set is denoted by P i ∈ P.
(3) Samples are taken uniformly from the grayscale histogram of the attenuated channels to generate a two-dimensional source point cloud Q (Gray value and number of pixels with the same gray value); the source point set is denoted by Q i ∈ Q.
(4) The average distance between the source and target point clouds is calculated as follows:

d(T_j) = (1/n) Σ_{i=1}^{n} ‖P_i − (Q_i + T_j)‖, (3)

where n is the number of sample points in the grayscale interval [0, 255] and T_j is the compensation amount, whose initial value is T_0 = 0.
(5) The grayscale distribution function of the attenuated channels is iteratively compensated by T_j units along the grayscale distribution direction of the red channel. The grayscale distribution after compensation is defined as:

Q_i(T_j) = Q_i + T_j. (4)

(6) The grayscale compensation amount is updated as:

T_{j+1} = T_j + 1. (5)

(7) Return to Step (4) and use the entire grayscale sampling space to calculate the average distances between the source point cloud and the target point cloud, {d(T_0), d(T_1), . . . , d(T_{n−1})}, as well as the information loss.
(8) The attenuated channels are compensated based on the average distance and the information loss constraint. When the information loss is lower than the threshold, T is selected as the compensation amount that minimizes the distance between the target and source point clouds. Otherwise, the iterative update is not performed, and T is selected as the compensation amount at which the information loss reaches the threshold.
where E_th is the pixel loss threshold. If E_loss is greater than the threshold, the pixel overflow and information loss are large, so the threshold bounds the number of pixels pushed outside the gray range during compensation. The final grayscale distribution is obtained by shifting the attenuated channel's distribution by the selected compensation amount T. The gray distribution of the red channel is used as the benchmark to compensate the blue and green channels, respectively. As shown in Fig. 4, the gray values of the blue and green channels are effectively recovered after style transformation, and the image's color appearance is significantly enhanced. The potential color information of the blue billboards and green bushes in Fig. 4(b) and the green canals in Fig. 4(d) has been recovered, which provides more details for subsequent image restoration.

B. UNSUPERVISED SAND-DUST IMAGE RESTORATION NETWORK (USDR-NET)
By embedding the DCP model into USDR-Net, we propose a hybrid framework that enables end-to-end training and inference. The proposed USDR-Net consists of the DCP module, the TME-Network, and a discriminator. DCP is used to generate the ambient light A, a preliminary restored image J_DCP, and a transmission map T_DCP. The TME-Network is used to enhance the transmission map. The discriminator is used for unsupervised training, distinguishing whether a generated sample is real or fake.

1) DCP PREPROCESSING
In most local areas of clear images, excluding the sky, at least one color channel has very low values for some pixels; the values may even be close to 0. For any input clear image J, its dark channel can be expressed as follows:

J_dark(x) = min_{y ∈ Ω(x)} min_{c ∈ {r,g,b}} J_c(y), (10)

where J_c(y) represents the intensity observed at pixel y of the clear image in color channel c, and Ω(x) represents a window centered on pixel x. The size of the window is set to 15 × 15 in this paper. Koschmieder's law is usually used to analyze image degradation, as shown in Equation (1). The goal of image restoration is to restore the clear scene J(x) from I(x). The transmission function t(x) and the atmospheric light A need to be estimated.
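The dark channel is simply a per-pixel channel minimum followed by a patch-wise minimum filter. A minimal, unoptimized sketch (the function name and edge-padding choice are ours):

```python
import numpy as np

def dark_channel(img: np.ndarray, patch: int = 15) -> np.ndarray:
    """J_dark(x): min over a patch Omega(x) of the per-pixel channel minimum.
    img is an H x W x 3 float array; a simple sliding-window implementation."""
    per_pixel_min = img.min(axis=2)        # inner min over color channels
    H, W = per_pixel_min.shape
    pad = patch // 2
    padded = np.pad(per_pixel_min, pad, mode='edge')
    out = np.empty_like(per_pixel_min)
    for i in range(H):                     # outer min over the window
        for j in range(W):
            out[i, j] = padded[i:i + patch, j:j + patch].min()
    return out
```

On a haze-free image where one channel is near zero everywhere, the dark channel is near zero, which is exactly the prior the restoration relies on.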
We apply the minimum operation to both sides of Koschmieder's law twice, after normalizing by the atmospheric light:

min_{y ∈ Ω(x)} min_c (I_c(y) / A_c) = t(x) min_{y ∈ Ω(x)} min_c (J_c(y) / A_c) + (1 − t(x)). (11)

According to the dark channel theory, J_dark(x) = 0, so the transmission calculation is given by

t(x) = 1 − min_{y ∈ Ω(x)} min_c (I_c(y) / A_c). (12)

We select the brightest 0.1% of the pixels in the dark channel I_dark(y) of the color-balanced image, and their values in the color channels of I(x) are averaged as A.
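The two estimates above can be sketched together: atmospheric light from the brightest 0.1% of dark-channel pixels, then the transmission of Eq. (12). This is an illustrative, unoptimized version (the function name, the transmission floor of 0.05, and the per-channel averaging details are our assumptions):

```python
import numpy as np

def estimate_airlight_transmission(I: np.ndarray, patch: int = 15,
                                   top: float = 0.001):
    """Estimate A from the brightest 0.1% of dark-channel pixels, then
    t(x) = 1 - min over the patch of min_c I_c(y) / A_c."""
    H, W = I.shape[:2]
    pad = patch // 2

    def patch_min(m):
        mp = np.pad(m, pad, mode='edge')
        return np.array([[mp[i:i + patch, j:j + patch].min() for j in range(W)]
                         for i in range(H)])

    dark = patch_min(I.min(axis=2))            # dark channel of the input
    k = max(1, int(top * H * W))
    idx = np.argsort(dark.ravel())[-k:]        # brightest dark-channel pixels
    A = I.reshape(-1, 3)[idx].mean(axis=0)     # per-channel atmospheric light
    t = 1.0 - patch_min((I / A).min(axis=2))   # Eq. (12)-style transmission
    return A, np.clip(t, 0.05, 1.0)            # floor avoids t == 0
```

On a pure-haze input (I equal to A everywhere) the estimated transmission collapses to the floor value, matching the intuition that nothing of the scene survives.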

2) UNSUPERVISED ADVERSARIAL LEARNING
Since the t(x) estimated by DCP is rough and of low precision, details and texture may be lost in the restored image. To improve the quality of the restored image, we use the unsupervised TME-Network to generate high-precision transmission maps (T_En). The network adopts the standard encoder-decoder structure [26], which includes two convolution layers with a stride of 2, nine residual blocks, and two upsampling layers with a stride of 1/2. Each convolution layer (except the first) is followed by batch normalization and ReLU. With the three components (T_En, A, and I_real), we generate the restored image J_restored by reformulating Koschmieder's law as:

J_restored(x) = (I_real(x) − A) / T_En(x) + A. (13)

To make the restored image closer to the clear image, we feed unpaired restored and clear images to the discriminator simultaneously. The discriminator consists of five convolution layers whose basic operations are convolution, batch normalization, and LeakyReLU activation. A sigmoid function is used in the last layer of the discriminator to normalize the probability values to [0, 1].
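Inverting the Koschmieder model as in Eq. (13) is a one-line operation once A and the transmission map are known. A minimal sketch (the transmission floor t0 and the output clipping are common safeguards we add, not part of the paper's statement):

```python
import numpy as np

def restore(I: np.ndarray, t: np.ndarray, A: np.ndarray,
            t0: float = 0.1) -> np.ndarray:
    """Invert the degradation model: J = (I - A) / max(t, t0) + A.
    t0 avoids division blow-up where the transmission is tiny."""
    t = np.clip(t, t0, 1.0)[..., None]   # broadcast over color channels
    return np.clip((I - A) / t + A, 0.0, 1.0)
```

A quick consistency check: synthesizing a hazy image from a known clear scene and then inverting with the same transmission recovers the clear scene exactly.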

3) TRAINING LOSSES
The training loss of USDR-Net includes three terms, i.e., the adversarial loss L_a, the reconstruction loss L_rec, and the total variation loss L_t.
(1) Adversarial loss. The adversarial loss is used to guide the network to reduce the difference in data distribution between the restored image and the clear image, and is defined as:

L_a = E[log D(J_clear)] + E[log(1 − D(J_restored))], (14)

where J_restored is the restored image and J_clear is the unpaired clear image.
(2) Reconstruction loss. We reconstruct the hazy input as I_rec using T_En, J_DCP, and A based on Eq. (1). The TME-Network is updated by minimizing the distance between I_real and I_rec:

L_rec = ‖I_real − I_rec‖_1. (15)
(3) Total variation loss. The total variation loss is applied to preserve structures and to force the unsupervised network to generate images that have the same statistical properties as clean images:

L_t = ‖k ∗ J_restored‖_1 + ‖v ∗ J_restored‖_1, (16)

where k and v represent the horizontal and vertical differential operation matrices, respectively, and ∗ denotes convolution.
(4) Overall loss function. The three losses are combined to train the proposed network:

L = αL_a + λL_rec + βL_t, (17)

where α, λ, and β are positive weights.
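The total variation term and the weighted combination can be sketched with simple finite differences; this is an illustrative NumPy version (function names are ours, and the default weights follow the values reported in the training details):

```python
import numpy as np

def total_variation(img: np.ndarray) -> float:
    """L_t: L1 norm of horizontal and vertical finite differences."""
    dh = np.abs(np.diff(img, axis=1)).sum()  # horizontal gradients
    dv = np.abs(np.diff(img, axis=0)).sum()  # vertical gradients
    return float(dh + dv)

def overall_loss(l_a: float, l_rec: float, l_t: float,
                 alpha: float = 0.02, lam: float = 0.1,
                 beta: float = 1.0) -> float:
    """L = alpha * L_a + lam * L_rec + beta * L_t."""
    return alpha * l_a + lam * l_rec + beta * l_t
```

A flat image has zero total variation, while any gradient contributes; penalizing this L1 norm suppresses spurious high-frequency artifacts without forbidding true edges.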

IV. EXPERIMENTAL RESULTS AND ANALYSIS
Datasets and network training details are presented in this section. In addition, to demonstrate the effectiveness in restoring sand-dust images, quantitative and qualitative comparisons between the proposed method and other state-of-the-art approaches are conducted.

A. DATASETS
As the whole framework is unsupervised, we collected 1500 real sand-dust images and 2300 unpaired clear images from the Internet to train the network. We adopted random cropping as a pre-processing step for data augmentation, which improves the robustness of our method.
To better illustrate the effectiveness of our method on real-world images, we collected 136 real-world sand-dust images (RWSDI) with different dust concentrations and no corresponding clear images as the test dataset. Three no-reference metrics are used to evaluate the quality of the restored images, namely the natural image quality evaluator (NIQE) [27], the number of visible edges of the enhanced images (e-score), and the visible edge gradient ratio (r-score) [28]. A lower NIQE value indicates better image quality. The e and r metrics are blind contrast indicators, and larger values indicate better image contrast.

B. TRAINING DETAILS
Our network is implemented in the TensorFlow framework. The model is trained and tested on an NVIDIA GeForce RTX 2080 Ti GPU. In the training process, we empirically set α = 0.02, β = 1, and λ = 0.1, and employ the Adam optimizer with a learning rate of 0.0002. All training samples are resized to 512 × 512, and the network is trained for 50 epochs.
C. COMPARISON WITH STATE-OF-THE-ART METHODS
1) QUANTITATIVE EVALUATION
According to the results in Table 1, our algorithm outperforms the other algorithms on the e metric, which shows that it can effectively recover the number of visible edges of the sand-dust image. The results on the r metric are also better than those of the other algorithms, which shows that our algorithm can effectively generate high-contrast images with more visible edges. In addition, the NIQE scores of the proposed algorithm are lower than those of the other algorithms, indicating that our method produces more natural enhanced images.

2) QUALITATIVE EVALUATION
To better illustrate the effectiveness of the proposed method, we tested it on various sand-dust images taken from real sand-dust environments.
As illustrated in Fig. 5, we qualitatively compare our results with those of different state-of-the-art methods on sand-dust images. Pixel-level enhancement methods such as GW [11], VRB [17], and CCH [18] can correct color deviation by compensating for attenuated color channels and matching histograms. However, these methods have a limited capability for dust removal. The images processed by GW [11] are still low-contrast and blurry, and the sharpness remains low. VRB [17] and CCH [18] improve the clarity of the images, but color deviation is only locally corrected. Prior-based methods such as DCP [8] and STME [22] are able to restore details of the scenes and objects, but the results suffer from over-enhancement. Images restored by DCP [8] are too dark, with color distortions in the sky area. STME [22] often fails to recover sand-dust images with severe blue artifacts. These algorithms are based on hand-crafted priors and thus have an inherent problem of overestimating transmission. Due to the complexity of sand-dust images, the inefficient feature extraction of Dehaze-GAN [7] results in dark images. In contrast, our results retain sharper contours with less color distortion and are visually closer to clear images.

3) VISUAL ANALYSIS OF CONTOURS
The edge is an important structural feature in an image, which often exists between the target and the background. Using the Sobel operator, we extract and visualize the edges of an image in both horizontal and vertical directions. As shown in Fig. 6, the white lines in the second-row images represent the detected edges. Due to the low contrast of sand-dust images, only a few edges can be detected. Compared with other methods, more edges can be extracted from the image restored by our method. The outlines of trees and basketball stands in the image are clearly visible.
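The Sobel-based edge visualization described above can be sketched directly; this minimal version (function and constant names are ours) computes horizontal and vertical responses, combines them into a gradient magnitude, and thresholds it into a binary map like the white lines in Fig. 6.

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T

def sobel_edges(gray: np.ndarray, thresh: float = 0.5) -> np.ndarray:
    """Binary edge map from horizontal and vertical Sobel responses."""
    H, W = gray.shape
    padded = np.pad(gray, 1, mode='edge')
    gx = np.zeros((H, W))
    gy = np.zeros((H, W))
    for i in range(H):
        for j in range(W):
            win = padded[i:i + 3, j:j + 3]
            gx[i, j] = (win * SOBEL_X).sum()  # horizontal gradient
            gy[i, j] = (win * SOBEL_Y).sum()  # vertical gradient
    mag = np.hypot(gx, gy)
    if mag.max() == 0:
        return np.zeros((H, W), dtype=np.uint8)
    return (mag > thresh * mag.max()).astype(np.uint8)
```

On a low-contrast input the gradient magnitudes shrink across the board, which is why few edges survive thresholding in raw sand-dust images and why edge counts rise after restoration.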

D. ABLATION STUDY

1) ANALYSIS OF IMAGE STYLE CONVERSION AND IMAGE RESTORATION NECESSITY AND EFFECTIVENESS
Our sand-dust image restoration framework includes two modules: the sand-dust image style conversion module and the image restoration module based on unsupervised adversarial learning. To verify the effectiveness of the framework, we conduct two ablation experiments: 1) a performance test after style conversion only (module 1); 2) a performance test after removing the image restoration module based on unsupervised adversarial learning (module 2). Table 2 compares the e, r, and NIQE scores of the full restoration framework and the above two ablation schemes on the RWSDI dataset. As shown, our method achieves higher scores than the others, proving the necessity and effectiveness of the framework.

2) ANALYSIS OF LOSS FUNCTIONS
Visual comparisons are shown in Fig. 7, from which we make the following observations: 1) solely using L_a tends to introduce unwanted structures that do not exist in the original scene, mainly due to the lack of the structural constraint of L_rec; 2) solely using L_rec tends to introduce undesirable artifacts into the restored images, mainly because L_rec cannot improve the result when L_a is absent; 3) the combination of L_a and L_rec suppresses the unwanted structures, which can be explained by the L_rec loss instructing USDR-Net to generate outputs similar to its inputs; 4) the combination of L_a, L_rec, and L_t significantly improves the quality of the restored image with respect to contour details, contrast, and color fidelity.

E. IMPROVEMENT OF OBJECT DETECTION ACCURACY BY IMAGE RESTORATION
To verify whether image restoration by the proposed method improves the performance of high-level computer vision tasks, we use the Faster R-CNN [29] detection model, pre-trained on the COCO datasets, to test on sand-dust images, restored images, and corresponding clear images. The average precision for each class and the mean average precision (mAP) are calculated to evaluate the restoration effect.
The statistical results in Table 3 show that the average precisions for cars, bicycles, pedestrians, buses, and birds in restored images increase by 50.32%, 53.5%, 56.51%, 52.09%, and 53.23%, respectively, compared with those in sand-dust images. The mean average precision increases from 16.79% to 68.82%. There is little difference in average precision and mean average precision between restored images and clear images. These results demonstrate that image restoration by the proposed algorithm is practical and effective for object detection.
According to Fig. 8, sand-dust can cause missed detections, inaccurate localizations, and unconfident category recognitions. Our algorithm greatly increases the number of detectable cars, pedestrians, and buses, improving the accuracy of object detection.

V. CONCLUSION
For sand-dust images with serious detail loss and complex features, we propose a restoration method based on style transfer and USDR-Net. First, the sand-dust images are preprocessed with style transformation to recover potential information while eliminating color deviation. Then, USDR-Net is trained on a large unpaired dataset to generate final clear images through adversarial learning. Extensive experimental results confirm the superiority and efficiency of our algorithm. Moreover, we also demonstrate the improvement that the restoration brings to object detection.