FastDerainNet: A Deep Learning Algorithm for Single Image Deraining

Existing neural network-based methods for de-raining single images produce unsatisfactory results when objects with sizes and shapes similar to those of rain streaks are present, owing to inefficient feature propagation. Furthermore, existing methods do not consider that the abundant extraneous information in rain-streaked images can interfere with the training process. To overcome these limitations, in this paper, we propose a deep residual learning algorithm called FastDerainNet for removing rain streaks from single images. We design a deep convolutional neural network architecture, based on a deep residual network, called the share-source residual module (SSRM), in which the origins of all shortcut connections are merged into a single point. To further improve de-raining performance, we adopt the SSRM as the parameter layers in FastDerainNet and use image decomposition to modify the loss function. Finally, we train FastDerainNet on a synthetic dataset. By learning the residual mapping between the detail layers of rainy and clean images, FastDerainNet reduces the mapping range and simplifies the training process. Experiments on both synthetic and real-world images demonstrate that the proposed method achieves superior de-raining performance, while preserving original details, in comparison with other state-of-the-art methods.


I. INTRODUCTION
Pictures taken in the rain tend to lose high-frequency information because of the presence of rain streaks; thus, their quality is degraded. However, most high-level computer vision tasks, such as image classification [1], [2], object detection [3], [4], and image segmentation [5], require high-quality images. Therefore, rain streak removal has become essential in the field of computer vision. Fig. 1 shows an example of single image rain streak removal.
In the last few years, numerous methods have been proposed for single image rain streak removal. These methods either employ models with different priors [6]-[8] or sparse dictionary learning [9], [10]. Although these models have achieved considerable progress in some cases, there is still substantial room for improvement.
The associate editor coordinating the review of this manuscript and approving it for publication was Md. Asikuzzaman.

Owing to their powerful feature representation [1], several deep learning models have been proposed for removing rain streaks [11]-[22]. However, most existing methods suffer from two significant limitations. First, some methods [14]-[16] directly apply deep convolutional neural networks to eliminate the complex influence of rain streaks. However, the actual results are unsatisfactory, as these methods do not take into consideration that extraneous information included in images can interfere with the training process. Second, when an object's shape and size are similar to those of rain streaks, such deep learning methods [17], [18] are unable to effectively propagate the features, which can generate over-smoothed results.
In view of the above issues and inspired by a residual network (ResNet) [23], [24] and a dense convolutional network (DenseNet) [25], we designed an SSRM specifically for de-raining. Then, we combined the SSRM with image processing domain knowledge using a structure similar to that of DerainNet [12], to create a deep residual learning algorithm called FastDerainNet, as shown in Fig. 2.
The main contributions of our work are summarized as follows: 1) We propose a modified residual architecture, the SSRM, in which all shortcut connections share the same starting point. This change from the original ResNet can accelerate convergence and help to obtain the de-rained result more rapidly. 2) We introduce FastDerainNet, which directly learns the nonlinear relationship between the detail layers of rainy and clean images. FastDerainNet decomposes the image and trains the SSRM on the high-frequency detail layer, thereby reducing the mapping range. This further improves de-raining performance. 3) Further, using real-world clean images as the ground truth, we synthesize a dataset to train our network. Experimental results demonstrate that the proposed method can effectively and efficiently perform rain removal.

II. METHODOLOGY
In this section, we describe and analyze the framework of the proposed FastDerainNet from four aspects: the SSRM architecture, image processing, negative residual mapping [13], and the loss function. Fig. 3 presents the ground truth image and rainy image of a flower, along with de-rained images produced by different components of the proposed algorithm, and by the proposed algorithm itself. We express the similarity between each image and the ground truth image using the structural similarity index (SSIM) [28], and use the peak signal-to-noise ratio (PSNR) [29] as another quality metric. These results, shown in Table 1, demonstrate the effect of each component of the proposed algorithm.

A. SSRM ARCHITECTURE
Owing to its shortcut connections [23], ResNet significantly increases image classification performance. However, it requires a large number of calculations because its identity mappings [24] are connected one by one. Thus, we introduce the SSRM to further improve the information flow between layers and accelerate convergence. Specifically, the SSRM uses a single shared source, identified in advance, to optimize the calculation of each residual block. As shown in Fig. 4, both ResNet and the SSRM with N layers include (N − 2)/2 residual blocks and shortcut connections when compared with a standard convolutional neural network (CNN). In ResNet, all residual blocks are assembled end-to-end; in other words, the shortcut connections do not strongly depend on each other, even though they are trained jointly. In the SSRM, by contrast, the integrity and adaptability of the foundation are strengthened by making all shortcut connections share the same origin. Hence, all residual blocks are different but closely linked.
Through these shortcut connections, both feedforward and feedback signals can be directly transported. Traditional feedforward architectures pass on information that needs to be preserved, whereas ResNet propagates information between layers by building additive identity mappings. In this sense, the SSRM preserves the above two characteristics. Thus, compared with a CNN, the SSRM improves the flow of information and strengthens feature propagation. When compared with ResNet, the SSRM can better ensure the integrity of propagated information, which can guarantee the effectiveness and superiority of our algorithm.
As shown in Fig. 4 (c), let $I_0$ denote the input, and let $I_n^{\mathrm{in}}$ and $I_n^{\mathrm{out}}$ respectively represent the input and output of the SSRM in the $n$-th layer. The structure of the SSRM can then be described as

$$I_1^{\mathrm{in}} = I_0, \qquad I_n^{\mathrm{in}} = I_{n-1}^{\mathrm{out}} \ (n > 1),$$
$$I_n^{\mathrm{out}} = \sigma\left(\mathrm{BN}\left(W_n * I_n^{\mathrm{in}} + b_n\right)\right) + I_0,$$

where n = 1, 2, . . . , (N − 2)/2 with the total number of layers N, σ(·) is the rectified linear unit (ReLU) [26], * is the convolution operator, W indicates weight parameters, b is the bias, and BN(·) refers to the batch normalization function [27].
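The share-source connectivity above can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: matrix multiplies stand in for the convolution-plus-batch-normalization of each block, and the function names (`ssrm_forward`, `resnet_forward`) are illustrative. The only point is the shortcut wiring: the SSRM always adds the shared source $I_0$, whereas ResNet adds the previous block's output.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def ssrm_forward(I0, weights, biases):
    """Simplified SSRM forward pass: every residual block adds the SAME
    shared source I0 (the module input). Matrix multiplies stand in for
    the conv + BN layers of the real network."""
    x = I0
    for W, b in zip(weights, biases):
        x = relu(W @ x + b) + I0  # shortcut always originates at I0
    return x

def resnet_forward(I0, weights, biases):
    """Plain ResNet-style chaining for comparison: each shortcut starts
    at the previous block's output, so blocks are stacked end-to-end."""
    x = I0
    for W, b in zip(weights, biases):
        x = relu(W @ x + b) + x  # shortcut from previous output
    return x
```

With zero weights the SSRM reduces to the identity mapping on $I_0$; with nonzero weights the two wirings diverge, which is precisely the structural difference Fig. 4 depicts.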

B. IMAGE PROCESSING
Based on the literature [12], the rainy image can be considered to consist of two parts, a base layer and a detail layer, which can be expressed mathematically as

$$R = R_{\mathrm{base}} + R_{\mathrm{detail}}, \qquad (7)$$

where R is the rainy image and the subscripts 'detail' and 'base' represent the detail and base layers, respectively. Because only rain streaks and partial object structures remain in R_detail, the detail layer is sparser than the rainy image. Thus, we train the SSRM on the detail layer instead of the image domain. Instead of randomly choosing filters, we follow the same principle as DerainNet [12] to reduce the solution space by compressing the mapping range. As shown in Fig. 2, we choose two filters to obtain the inputs used to estimate R_neg-mapping1. First, via a low-pass filter $f_{\mathrm{lp}}(\cdot)$, we obtain the optimized base layer

$$R^{*}_{\mathrm{base}} = f_{\mathrm{lp}}(R). \qquad (8)$$

Likewise, based on the known $R^{*}_{\mathrm{base}}$, we obtain $R^{0}_{\mathrm{detail}}$ via the corresponding high-pass operation

$$R^{0}_{\mathrm{detail}} = R - R^{*}_{\mathrm{base}}. \qquad (9)$$

Finally, according to Eqs. (7)-(9), we determine the optimal R_neg-mapping1, which greatly simplifies the training process.
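The decomposition in Eqs. (7)–(9) can be sketched as follows. This is a simplified NumPy illustration, not the paper's pipeline: a plain box filter stands in for the guided/bilateral low-pass filtering, and the function names are illustrative. The key property is that the base and detail layers sum back exactly to the rainy image.

```python
import numpy as np

def box_lowpass(img, k=5):
    """Simple k x k box-filter low-pass, standing in for the guided
    filter used in the paper to extract the base layer."""
    pad = k // 2
    padded = np.pad(img, pad, mode='edge')
    out = np.zeros_like(img, dtype=float)
    h, w = img.shape
    for i in range(h):
        for j in range(w):
            out[i, j] = padded[i:i + k, j:j + k].mean()
    return out

def decompose(R, k=5):
    """Split a rainy image R into base and detail layers (Eqs. (7)-(9)):
    the low-pass result is the base layer, and the residual R - R_base
    is the high-frequency detail layer that the SSRM is trained on."""
    R_base = box_lowpass(R, k)   # low-frequency base layer, Eq. (8)
    R_detail = R - R_base        # high-frequency detail layer, Eq. (9)
    return R_base, R_detail
```

By construction `R_base + R_detail` reconstructs `R` exactly, and the detail layer carries only the high-frequency content (rain streaks and fine structure), which is why training on it shrinks the solution space.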
To measure the similarity between two images, the structural similarity index (SSIM) [28] and peak signal-to-noise ratio (PSNR) [29] are used for quantitative evaluation.
In addition, we adopt two no-reference metrics, the blind/referenceless image spatial quality evaluator (BRISQUE) [38] and the integrated local naturalness image quality evaluator (IL-NIQE) [39], which measure visual quality in different ways. We decompose the image into a high-frequency part R_detail and a low-frequency part R_base using a bilateral filter [30] and a guided filter [31], respectively. These results are shown in Table 2.
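For reference, PSNR can be computed in a few lines; the sketch below is a minimal NumPy version for images scaled to [0, max_val] (SSIM involves local windowed statistics and is omitted here).

```python
import numpy as np

def psnr(clean, derained, max_val=1.0):
    """Peak signal-to-noise ratio in dB between a reference image and a
    de-rained estimate, both scaled to [0, max_val]."""
    mse = np.mean((clean.astype(float) - derained.astype(float)) ** 2)
    if mse == 0:
        return float('inf')  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)
```

For example, a uniform error of 0.1 on a [0, 1] image gives an MSE of 0.01 and hence a PSNR of 20 dB.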

C. NEGATIVE RESIDUAL MAPPING
When compared with the clean image C, the pixel values of the rainy image R are higher because rain streaks tend to appear white in images. Thus, most values of C − R tend to be negative; this is known as ''negative residual mapping.'' As shown in Fig. 2, two negative residual mappings, R_neg-mapping1 and R_neg-mapping2, are built to improve the efficiency of image decomposition. First, by learning the residual values between the output image C and the base layer R_base, we obtain the de-rained residual layer

$$R_{\mathrm{neg\text{-}mapping2}} = C - R_{\mathrm{base}}. \qquad (10)$$

Following this, by learning the residual between R_neg-mapping2 and the detail layer R_detail, we obtain the detail residual layer

$$R_{\mathrm{neg\text{-}mapping1}} = R_{\mathrm{neg\text{-}mapping2}} - R_{\mathrm{detail}}, \qquad (11)$$
$$R_{\mathrm{neg\text{-}mapping1}} = f_w^b(R_{\mathrm{detail}}), \qquad (12)$$

where R_neg-mapping1 represents the output of the SSRM, and $f_w^b(\cdot)$ denotes the SSRM mapping.
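Inverting Eqs. (10)–(11) shows how the final de-rained estimate is recomposed from the SSRM output. The sketch below is an illustration under that reading of the equations, with illustrative function and variable names: adding the detail layer back to the predicted detail residual recovers R_neg-mapping2, and adding the base layer then recovers the clean estimate.

```python
import numpy as np

def recover_clean(R_base, R_detail, neg_mapping1):
    """Recompose the de-rained estimate from the SSRM output.

    Per Eqs. (10)-(12), neg_mapping1 approximates C - R_base - R_detail
    = C - R, so adding back both layers of the rainy image recovers C.
    """
    neg_mapping2 = neg_mapping1 + R_detail   # invert Eq. (11)
    return neg_mapping2 + R_base             # invert Eq. (10)
```

If the SSRM output were exact (neg_mapping1 = C − R), the recomposition would reproduce the clean image C exactly, whatever base/detail split is used.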

D. OBJECTIVE FUNCTION
According to Eqs. (10)-(12), the objective function of FastDerainNet can be defined as the mean square error (MSE), which is formulated as

$$L = \frac{1}{M} \sum_{m=1}^{M} \left\| f_w^b\!\left(R_{\mathrm{detail}}^{(m)}\right) - \left(C^{(m)} - R^{(m)}\right) \right\|_F^2, \qquad (13)$$

where M is the total number of training samples, m indexes the images, and || · ||_F is the Frobenius norm.
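The objective can be sketched numerically as follows. This is a minimal NumPy illustration of the MSE form described above (predicted residual against the target C − R), assuming the 1/M normalization; it is not the training code.

```python
import numpy as np

def mse_loss(pred_residuals, clean_images, rainy_images):
    """MSE objective in the style of Eq. (13): squared Frobenius error
    between each predicted residual f(R_detail) and the target negative
    residual C - R, averaged over the M training samples."""
    M = len(pred_residuals)
    total = 0.0
    for pred, C, R in zip(pred_residuals, clean_images, rainy_images):
        total += np.sum((pred - (C - R)) ** 2)  # squared Frobenius norm
    return total / M
```

A perfect prediction (pred = C − R) yields zero loss; a uniform error of 1 on a 2×2 image yields a loss of 4.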

III. EXPERIMENT

A. DATASET
As it is difficult to obtain a ground truth corresponding to real-world rainy images, we collected 2000 clean images and synthesized 12 000 rainy images. For example, Fig. 5 (a) is used as the ground truth to generate six rainy images with three directions and two magnitudes. To improve the reliability of the model, we took several measures to strengthen our synthetic data. First, to create as many training samples as possible, we randomly selected 9000 images to generate 3 million 64 × 64 rainy/clean patch pairs, trained with a mini-batch size of 20. The remaining image pairs were used for validation and evaluation. Furthermore, we adopted partially synthetic data to optimize the dataset; specifically, we used real data as the initial samples and filled the remainder with synthetic data. Consequently, the trained model differs between executions because of the different composition of the dataset. Finally, by comparing the metrics, we found that the differences between the models are small, which indicates that our synthetic dataset generalizes to the real world.
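The patch extraction step can be sketched as below. This is an illustrative NumPy version (the function name and RNG seeding are assumptions, not the paper's code): aligned 64 × 64 crops are taken at the same random locations from the rainy image and its clean ground truth.

```python
import numpy as np

def sample_patch_pairs(rainy, clean, n_patches, size=64, rng=None):
    """Randomly crop aligned size x size patch pairs from a rainy image
    and its clean ground truth, as used to build the training set."""
    if rng is None:
        rng = np.random.default_rng(0)
    h, w = rainy.shape[:2]
    pairs = []
    for _ in range(n_patches):
        # same top-left corner for both images keeps the pair aligned
        i = rng.integers(0, h - size + 1)
        j = rng.integers(0, w - size + 1)
        pairs.append((rainy[i:i + size, j:j + size],
                      clean[i:i + size, j:j + size]))
    return pairs
```

Sampling many overlapping crops per image is what turns 9000 images into millions of training pairs.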

B. PARAMETER SETTINGS
We set the network depth to N = 18 and used stochastic gradient descent (SGD) [32] to minimize Eq. (13) with a weight decay of 10^−10 and a momentum of 0.9. The learning rate was initialized at 0.2 and multiplied by 0.998 every 100 iterations, with training terminating at 120 000 iterations. We set the filter sizes to 3×3 and the number of channels to 3; the filter numbers and more concrete configurations are presented in Table 3. We did not use pooling, owing to its possible adverse impact on the accuracy of the network, even though it could help with overfitting and dimensionality reduction.

C. RESULTS ON SYNTHETIC IMAGES
Fig. 6 shows visual comparisons for four synthetic rainy images; two regions of interest are selected to highlight the effects. Notably, DerainNet fails to remove the rain streaks in heavy rain, whereas joint rain detection and removal (JORDER) [13] and the density-aware single image de-raining multi-stream dense network (DID-MDN) [15] are able to remove most of the heavy rain streaks, while also generating obvious artifacts. However, both the recurrent squeeze-and-excitation context aggregation network (RESCAN) [22] and FastDerainNet efficiently distinguish most of the rain streaks and achieve better visual effects. To further illustrate this improvement, Table 4 lists the calculated metrics for each method, using PSNR and SSIM for quantitative evaluation. The results of the proposed method are both subjectively and objectively superior to those of the other methods evaluated.
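The learning-rate schedule from the parameter settings above (initial rate 0.2, multiplied by 0.998 every 100 iterations) can be written as a one-line step-decay function; the function name is illustrative.

```python
def learning_rate(iteration, base_lr=0.2, decay=0.998, step=100):
    """Step-decay schedule: the base rate is multiplied by `decay`
    once every `step` iterations (integer division counts the steps)."""
    return base_lr * decay ** (iteration // step)
```

At the final iteration 120 000, the rate has decayed by a factor of 0.998^1200, i.e. to roughly 9% of its initial value.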

D. RESULTS ON REAL-WORLD IMAGES
Since a ground truth corresponding to real-world rainy images was not available, it was impossible to quantitatively compare the results. As shown in Fig. 7, the results of JORDER still contain rain streaks, and those of DerainNet contain obvious unprocessed artifacts. Furthermore, DID-MDN tends to blur image details and generate over-smoothed images. In contrast, RESCAN and FastDerainNet produce far fewer artifacts, thereby preserving more textural detail. However, the results of RESCAN appear darker, which could negatively affect visual perception. The BRISQUE values are shown in Table 5. We recruited participants to rate each group of results. For each result, 30 participants were required to respond to each of the two questions shown in Fig. 8, using a Likert scale [40] ranging from poor to excellent. Results are shown in Fig. 8, where each subfigure shows five rating distributions for the methods on a specific question. The distributions across methods show that FastDerainNet is preferred by human observers; our method received more ''excellent'' and fewer ''poor'' ratings compared with the other methods. Table 6 compares the runtimes of the different state-of-the-art methods. All methods were implemented in MATLAB and PyCharm on a GPU, following the original settings of the released code. Note that the DID-MDN method [15] requires the input image size to be fixed at 512 × 512; therefore, we tested only 512 × 512-sized images for all methods. To balance the trade-off between performance and computational efficiency, we chose 16 parameter layers for the above experiments.

E. COMPUTATIONAL COMPLEXITY
It was observed that DerainNet [12] and RESCAN [22] consume a substantial amount of time because of a complex optimization process. The computation time of JORDER is limited by redundant network parameters [13]; regardless, we found it to be faster than RESCAN. The batch normalization layers in DerainNet [12] cause this method to consume more test time than other methods, although it uses fewer network parameters. The average running time and parameter number results show that FastDerainNet is an impressive method quantitatively, in spite of employing a light-weight framework.

IV. CONCLUSION
In this paper, we proposed a novel end-to-end network named FastDerainNet to remove rain streaks from single images. To enhance its effectiveness, we designed a novel residual architecture, called the SSRM. Compared with other CNNs, the SSRM shares more information between layers, thereby making more relevant information available. In addition, by applying the SSRM to the high-frequency components, FastDerainNet directly learns the residuals between rainy and clean images. Experiments on both synthetic and real-world rainy images demonstrated that our method achieves superior SSIM, PSNR, BRISQUE, and IL-NIQE performance compared with state-of-the-art methods.
Finally, it should be noted that, although we proposed FastDerainNet for de-raining problems, this method can be applied equally to other image-to-image transformation tasks, such as denoising [33], dehazing [34], and de-blurring [35].

ZHIYUAN TIAN was born in 1996. He is currently pursuing the M.S. degree with the Shanghai University of Engineering Science. His research interests include control systems, image processing, and machine learning.
YUANHONG REN received the B.S. degree in information and computing science from the Wuhan University of Science and Technology and the M.S. degree in computer application technology from the Wuhan University of Technology. Her research interests include image processing and intelligent theory and control.
WUNENG ZHOU received the M.S. degree from Central China Normal University, in 1982, and the Ph.D. degree in control science and engineering from Zhejiang University, in 2005. He is currently a Professor with Donghua University. His research interests include system theory and control theory, control engineering, and robust control theory and application.