Image Enhancement Algorithm Based on GAN Neural Network

Deep underwater color images suffer from low brightness, poor contrast, and loss of local details. To effectively enhance low-quality underwater images, this paper proposes an enhancement method based on a GAN (Generative Adversarial Network). The paper studies low-light image enhancement algorithms, aiming to improve the quality of low-light images and restore the original scene information of low-quality images, so as to obtain natural, clear images with complete details and structural information. To verify the effectiveness of the method, image databases such as DIARETDB0 and SID are used as the research objects, and the proposed algorithm is compared with multi-scale Retinex with color restoration and contrast-limited adaptive histogram equalization. The results show that the processed images are better than those of the other image enhancement methods in terms of color preservation, contrast enhancement, and image detail enhancement. The proposed method significantly improves the evaluation indicators used in the article.


I. INTRODUCTION
The 21st century is an era of information, and human demand for information is increasing day by day. Visual information accounts for a large proportion of this demand. An image is a storage medium for visual information, and the types of electronic imaging devices used to collect image information are increasing with the continuous development of electronic technology. The choice of electronic imaging equipment is often related to its application environment, and its hardware conditions also affect the imaging quality. For example, low-visibility weather such as rain and snow, or a turbid underwater environment, produces blurry images, color distortion, and other problems [1]-[5]. Although imaging equipment with better hardware can improve image quality to a certain extent, its adaptability is poor: the quality of the acquired images differs from one environment to another, so better hardware cannot fundamentally offset the impact of the environment [6]-[8]. It can be seen that blindly seeking to improve the hardware conditions of imaging devices cannot meet human demand for image quality [9]-[11].
The associate editor coordinating the review of this manuscript and approving it for publication was Zhen Ren.
The ocean accounts for more than 70% of the total area of the earth, and the ocean system plays an important role in the global ecosystem. At present, the world's major maritime powers are implementing and accelerating strategies for using marine resources and space [12]-[15]. General image processing applications are designed for images taken under normal illumination [16]. Low-illumination images greatly reduce the performance of image processing applications such as image classification, target recognition, and image understanding and analysis. For example, a road monitoring system detects road conditions through image transmission [17]-[19]. Li et al. used a neural network to generate a High-Dynamic Range (HDR) image of the input image; by fusing the HDR image and the original image, the information lost between stacks was used to learn and update the weights of the network, and the trained network was then used to enhance the image brightness. The image enhanced by this algorithm retains the brightness characteristics of the original image while suppressing the blurring of image boundaries [20]. Xiao et al. used a fully convolutional network to learn the weighted histogram of the input image: lighting is randomly added to the input image to simulate uneven illumination, and the network then learns the potentially bad lighting information of the pixels to construct a better weighted histogram, so as to effectively enhance the poor-contrast areas of the image while retaining its color and detail information [21]. In practical applications, the low-illuminance images obtained when the lighting of a road section is poor cannot be recognized by the human eye and may not even be processed successfully by the application, so that traffic accidents and other problems are not detected in time [22].
Underwater images suffer from three main problems: 1) light of different wavelengths is attenuated to different degrees when propagating in water, which gives underwater images a serious color deviation, mainly toward blue-green; 2) scattering by the water body gives underwater images low contrast and blurred texture; 3) underwater impurities and suspended matter introduce noise into the images. In addition, low-light images often appear in medical imaging, military reconnaissance, and other scenes.
To ensure that collected low-light images can be used effectively, research on low-light image enhancement has become one of the research hotspots in the field of computer vision. Low-light image enhancement aims to restore the scene information of the original low-quality image by improving the global and local contrast of the image, denoising, adjusting the image background and edges, and so on, so as to obtain clear, natural images with more complete details and structural information. Improving the quality of low-illuminance images is not only a daily demand but also facilitates subsequent intelligent analysis in computer vision, so studying how to improve the quality of low-illuminance images more effectively is of great significance [23]. This paper analyzes the problems of existing image enhancement methods and proposes a low-quality underwater image enhancement method based on an improved GAN neural network. The low-light image enhancement algorithm studied here aims to improve the quality of low-light images and restore their original scene information, so as to obtain natural, clear images with complete details and structural information.

II. THEORETICAL BASIS
A. METHOD BASED ON IMAGE FUSION
Image fusion is an important technique in digital image processing. It collects information from different images of the same scene and then reorganizes and fuses the extracted information as required to obtain a single image; image quality is improved through the complementation and enhancement of the information from multiple images [24]. In the field of low-illuminance image enhancement, image fusion mainly fuses multi-exposure images. Multi-exposure image fusion extracts and fuses detailed information from a series of sub-images of the same scene with different exposure levels to obtain a fused image of better quality [25]-[28]. Ying et al. proposed an exposure fusion model and an enhancement algorithm to provide accurate contrast enhancement. This method first uses illumination estimation to design a weight matrix for image fusion, then uses the camera response model to find the best exposure ratio so that the synthetic image is well exposed in the regions where the original image is underexposed, and finally fuses the input image and the synthetic image according to the weight matrix to obtain the enhanced result. In the same year, Zhao et al. [29] proposed a new enhancement method using camera response characteristics. First, the camera response model is obtained by studying the relationship between two images with different exposure levels; then the exposure ratio map is estimated with the help of illumination estimation; finally, the pixels of the low-illuminance image are adjusted using the corresponding camera response model and the estimated exposure map. Cai et al. [30] established a large-scale multi-exposure image data set, which contains low-contrast images with different exposure levels and their corresponding high-quality reference images.

B. METHOD BASED ON LEARNING
In recent years, deep learning has made great progress in high-level computer vision tasks such as image recognition, target detection, and image understanding, which has triggered a research boom [31]-[35]. The representative model in deep learning is the Convolutional Neural Network (CNN) [36]-[38]. Its three structural features, namely local receptive fields, weight sharing, and pooling, reduce both the number of training parameters and the training difficulty [39]. A CNN is also robust to distortions such as scaling, rotation, and translation, so the extracted deep features generalize better [40]. Therefore, some researchers have applied convolutional neural networks to low-light image enhancement. Lore et al. [41] proposed a stacked sparse denoising autoencoder that learns from a synthetic data set of darkened, noise-added images to adaptively perform low-light enhancement and noise reduction. Guo et al., building on the MSR algorithm, used different Gaussian convolution kernels to simulate multi-scale Retinex [42] and proposed a low-light image enhancement network (MSRNet) based on CNN and Retinex theory. The network uses a CNN to perform a four-scale logarithmic transformation on low-contrast regions, then uses a residual CNN to refine the boundary regions of the probability map, and finally applies color contrast weights to obtain the enhanced result. Zhen et al. proposed a CNN-based low-light image enhancement method that uses the inception module and residual learning to design a special convolution module; the use of multi-scale feature maps avoids the vanishing-gradient problem, and the method adaptively improves the brightness and contrast of the image [43]. Sun et al. [44] collected a low-light data set (LOL) containing low/normal-light image pairs and proposed an enhancement network based on Retinex theory learned on this data set.

C. BASIC THEORY OF UNDERWATER IMAGE PROCESSING
1) ATTENUATION OF LIGHT IN WATER
In an underwater environment, light of different wavelengths is attenuated to different degrees. Red light has the longest wavelength and the weakest penetrating power in water, so it disappears first. Blue light has the shortest wavelength and the strongest penetrating power, so it travels the farthest in water. This wavelength-dependent light propagation is the main reason for the color deviation of underwater images [44]. The selective attenuation of light propagating in water is illustrated by the radiance transmittance diagram for the optical types of seawater in Fig.1. If the background light is assumed to be known, the difference between the red channel and the blue-green channel can be used to estimate the media transmittance map [45]. The attenuation difference between the channels is obtained by comparing the maximum value of the red channel with the maximum value of the blue-green channel. In this comparison, x is the pixel, and the quantities involved are the difference between the maximum value of the red channel and the maximum value of the blue-green channel, the pixel value of the red channel, and the pixel value of the blue-green channel. The media transmittance of the red channel is then estimated, and from equation (3) the red channel attenuation A_r(x) can be obtained.
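The channel comparison described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the two maxima are taken over a small square patch around the pixel x and that pixel values lie in [0, 1]. The function name `channel_difference`, the patch `radius`, and the image layout are hypothetical and are not taken from the paper, whose exact equations are not reproduced here.

```python
def channel_difference(image, x, y, radius=1):
    """Difference between the red maximum and the blue-green maximum near (x, y).

    image: list of rows, each row a list of (r, g, b) tuples with values in [0, 1].
    radius: half-width of the square patch (an assumption, not from the paper).
    """
    height, width = len(image), len(image[0])
    max_red = 0.0
    max_blue_green = 0.0
    # Scan the patch, clipping it at the image borders.
    for j in range(max(0, y - radius), min(height, y + radius + 1)):
        for i in range(max(0, x - radius), min(width, x + radius + 1)):
            r, g, b = image[j][i]
            max_red = max(max_red, r)
            max_blue_green = max(max_blue_green, g, b)
    return max_red - max_blue_green


# A bluish 2 x 2 test image: red is strongly attenuated, blue dominates.
bluish = [[(0.2, 0.5, 0.7), (0.2, 0.5, 0.7)],
          [(0.2, 0.5, 0.7), (0.2, 0.5, 0.7)]]
print(channel_difference(bluish, 0, 0))
```

A strongly negative value indicates that the red channel has been attenuated far more than the blue-green channels, which is the situation the text associates with long propagation distances in water.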

2) PRINCIPLES OF OCEAN IMAGING
The components contained in sea water make it non-uniform, which affects the light received by the camera when an underwater imaging system is shooting. Therefore, the quality of the imaging results in this environment is often very poor, the details of the image cannot be seen directly, and the images are of little help to research. Images taken in an underwater environment usually have two main problems, namely image blur and color distortion. The main reason for the blurring of underwater images is that, when light is transmitted in water, water molecules and suspended particles scatter the light [46]. Color distortion is caused by the different attenuation rates of light in water, which make the resulting images mostly blue-green. Therefore, only through in-depth analysis of the causes of these two problems can a suitable underwater image enhancement algorithm be developed; this analysis is the first step of the research.

D. GAN
Goodfellow et al. proposed the GAN [47]. A GAN is a method of modeling a data distribution and consists of a generator and a discriminator. The goal of the generator is to generate an image that can deceive the discriminator, and the goal of the discriminator is to distinguish real images from fake ones as well as possible. Following the idea of a zero-sum game, the generator and the discriminator play against each other continuously, so that the generator learns the distribution of the real data. Because a GAN does not pre-model the data, GAN-based methods are too unconstrained and poorly controllable, as shown in Fig.2. Therefore, the literature [48] proposed a conditionally constrained GAN, namely DE-GAN. DE-GAN is an improvement on the GAN that guides the direction of generation by incorporating condition information. Owing to the added condition information, DE-GAN training is more controllable than that of the original GAN, and the model has a stronger learning ability. In the overall objective function of DE-GAN, G is the generator and D is the discriminator; I is the initial image, J is the real image, and G(I) is the fake image generated by the generator. By optimizing this objective function, DE-GAN learns the mapping from the initial image I to the real image J, that is, G: I → J. The process framework of DE-GAN is shown in Fig.3. In this article, the initial image I is a synthetic underwater image, the real image J is the corresponding Ground Truth, and the generated image G(I) is the enhanced underwater image. The overall objective is split into an objective function for the discriminator part and an objective function for the generator part. The input of the discriminator D is the two image pairs (I, J) and (I, G(I)); D is trained to distinguish these two image pairs, and its output is a scalar in the range [0, 1]. This value reflects the probability that the input is a real image.
36768 VOLUME 10, 2022
When the value is greater than 0.5, the input is considered real, and when it is less than 0.5, the input is considered fake. In the actual training process, G and D are not trained at the same time but alternately, which forms a mini-max game. Training ends when Nash equilibrium is reached, that is, when the output probability of D is 0.5 and D can no longer tell whether an image is real or fake. At this point, the image G(I) generated by G is most similar to the real image J. In the actual test process, only the trained G is used for underwater image enhancement, and D does not participate in the test.
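For orientation, the objective of a standard conditional GAN, written with the symbols defined above, is sketched below. This assumes that DE-GAN follows the usual conditional formulation, in which D scores an image together with the initial image I; the exact expression used by the authors may differ.

```latex
\min_{G}\,\max_{D}\;
\mathbb{E}_{I,J}\big[\log D(I, J)\big]
+ \mathbb{E}_{I}\big[\log\big(1 - D(I, G(I))\big)\big]
```

Under this assumed form, D maximizes both terms by scoring real pairs near 1 and generated pairs near 0, while G minimizes the second term by pushing D(I, G(I)) toward 1.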

E. SID DATA SET
SID dataset: Chen et al. [49] constructed a low-light/extremely-low-light image set called the SID (See-in-the-Dark) data set by adjusting the exposure duration in a low-light environment. It contains 5094 very-low-illuminance images obtained through short exposure, and each image has a corresponding low-illuminance image acquired by long exposure. The SID data set contains 424 indoor and outdoor scenes.
In each scene, images with different levels of illuminance are obtained by using short exposures of different durations. DIARETDB0 dataset: contains 1467 images with 14095 images. Five scenes were captured through 10 cameras, including the standard 200 training and test slices: 100 randomly selected identities were used as test data, 100 were used for evaluation, and the other 1268 images were used as training samples [50]. However, most images in real life are stored in formats such as jpg or png, whereas the images in DIARETDB0 are stored as raw camera data. If a network model trained on the DIARETDB0 dataset is applied to jpg or png images, the enhancement effect is poor. Therefore, this paper does not use the DIARETDB0 data set to train and test the proposed network model.

III. METHOD OF THIS ARTICLE
To solve the problem that the paired training sets required by supervised underwater image enhancement algorithms are difficult to obtain, an underwater image enhancement algorithm with an improved GAN model is proposed. On top of the global discriminator used by the traditional GAN, a local discriminator is introduced, which effectively avoids local distortion. The generated image and the real image are input to the global discriminator, generated image blocks and real image blocks are input to the local discriminator, and the two discriminators jointly determine whether the generated image is real or fake. If the generated image is recognized as a real image, it is output directly as the enhanced underwater image; otherwise the discrimination result is fed back to the generator, which continues to generate images to deceive the two discriminators until they can no longer distinguish the generated image from the real image. The discriminator network D uses the PatchGAN method, which calculates a probability for each N × N patch of the image and then averages these probabilities as the overall output. This is usually implemented with convolutional layers: each element of the final output matrix corresponds to a relatively large receptive field, that is, a patch, in the original image. This method speeds up model convergence and computation. The PatchGAN structure is shown in Fig.4.
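The two PatchGAN ideas in this paragraph, averaging per-patch probabilities and relating one output element to a patch of the input, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names are hypothetical, and the layer stack in `example_layers` is an assumed configuration, since the paper does not list kernel sizes or strides.

```python
def patch_gan_score(patch_probabilities):
    """Average the per-patch probabilities of a PatchGAN output map."""
    values = [p for row in patch_probabilities for p in row]
    if not values:
        raise ValueError("the output map must contain at least one patch")
    return sum(values) / len(values)


def receptive_field(layers):
    """Receptive field (in input pixels) of a single output element.

    layers: list of (kernel_size, stride) tuples, ordered from input to output.
    """
    field = 1
    # Walk backwards from the output, growing the field layer by layer.
    for kernel, stride in reversed(layers):
        field = (field - 1) * stride + kernel
    return field


# Hypothetical layer stack (not from the paper): three stride-2 and two
# stride-1 convolutions, all with 4 x 4 kernels.
example_layers = [(4, 2), (4, 2), (4, 2), (4, 1), (4, 1)]
print(receptive_field(example_layers))
print(patch_gan_score([[0.2, 0.4], [0.6, 0.8]]))
```

With the assumed stack, each output element covers a 70 × 70 region of the input, which shows how a small output matrix can still judge fairly large patches.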

A. OBJECTIVE FUNCTION
Since the GAN was proposed, the original GAN has always had problems such as difficult training and vanishing or exploding gradients, which leave the generator insufficiently trained. To stabilize training, many improved versions have modified the GAN loss function, such as LSGAN, WGAN, and WGAN-GP [51], among which the WGAN-GP loss shows the best image generation performance. Therefore, this article adopts the WGAN-GP loss, modified to DE-GAN's setting, as the adversarial loss. In this loss, G is the generator and D is the discriminator; I and J are the underwater image and the corresponding Ground Truth, respectively; $\hat{I}$ is sampled along the straight line between J and the image G(I) generated by the generator; and $\lambda_{GP}$ is the weighting factor. Compared with the traditional GAN loss function, the WGAN-GP loss does not take the logarithm. It uses the Wasserstein distance instead of the JS divergence to measure the distance between the probability distribution of the real data and that of the generated data. The advantage of the Wasserstein distance is that it still reflects the distance between the two distributions even if their supports do not overlap or overlap very little; in this case the JS divergence is a constant, that is, the gradient vanishes. At the same time, the WGAN-GP loss introduces an additional gradient penalty term $\lambda_{GP}\,\mathbb{E}_{\hat{I}}\big[(\lVert \nabla_{\hat{I}} D(\hat{I}) \rVert_2 - 1)^2\big]$ to realize the k-Lipschitz constraint on the discriminator, that is, to limit the gradient of the discriminator to a certain range, which effectively avoids vanishing and exploding gradients and accelerates convergence. The traditional losses (L1 or L2) measure the difference from the pixel perspective.
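The gradient penalty term can be made concrete with a small sketch. To keep it free of any autodiff library, it assumes a linear critic D(x) = w · x, whose gradient with respect to x is simply w; a real discriminator would need automatic differentiation instead. The unconditional form of the loss, the default `lambda_gp=10.0`, and all function names are assumptions made for illustration, not the paper's implementation.

```python
import math
import random


def interpolate(real, fake, eps):
    """Sample a point on the straight line between a real and a generated image."""
    return [eps * r + (1.0 - eps) * f for r, f in zip(real, fake)]


def linear_critic(weights, image):
    """Hypothetical linear critic D(x) = w . x (stands in for a real network)."""
    return sum(w * v for w, v in zip(weights, image))


def linear_critic_gradient(weights, image):
    """Gradient of w . x with respect to x; for a linear critic it is w everywhere."""
    return list(weights)


def gradient_penalty(weights, x_hat, lambda_gp=10.0):
    """lambda_gp * (||grad D(x_hat)||_2 - 1)^2 for a single interpolated sample."""
    grad = linear_critic_gradient(weights, x_hat)
    grad_norm = math.sqrt(sum(g * g for g in grad))
    return lambda_gp * (grad_norm - 1.0) ** 2


def critic_loss(weights, real, fake, lambda_gp=10.0, eps=None):
    """Critic loss for one real/fake pair: D(fake) - D(real) + gradient penalty."""
    if eps is None:
        eps = random.random()
    x_hat = interpolate(real, fake, eps)
    wasserstein_term = linear_critic(weights, fake) - linear_critic(weights, real)
    return wasserstein_term + gradient_penalty(weights, x_hat, lambda_gp)


print(gradient_penalty([3.0, 4.0], [0.0, 0.0]))
print(critic_loss([0.6, 0.8], [1.0, 1.0], [0.0, 0.0], eps=0.5))
```

The penalty is zero exactly when the gradient norm equals 1, which is how the term pulls the critic toward the Lipschitz constraint discussed above.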
The literature has pointed out that adding an L1 or L2 loss to the objective function makes G learn to sample from a space that is globally similar in the L1 or L2 sense, which is helpful for image-to-image translation tasks. The L2 loss can better reconstruct the high-frequency information of the image, but it retains artifacts. The L1 loss can remove artifacts, at the cost of less clear edge reconstruction, but it cannot effectively capture the high-frequency information of the image. For example, literature [13] adds only the L1 loss to the objective function, and literature [12] adds only the L2 loss. The results are analyzed in detail. Therefore, to better obtain the high-frequency and low-frequency information of the image without producing artifacts, this paper introduces both L1 and L2 losses into the objective function. Combining the above losses gives the final objective function of the network in this paper.
The global discriminator judges the authenticity of the entire image and can only achieve global enhancement of the image. When the input image has a local area that needs to be enhanced differently from other parts, the global discriminator cannot enhance that local area specifically. Therefore, to enhance the local areas of the image adaptively, a global-local discriminator structure is adopted; that is, a local discriminator is added to the global discriminator, as shown in Fig. 3(b). The local discriminator randomly selects local small blocks from the enhanced image and the real image and distinguishes whether they come from the real image or the generated image. The global-local discriminator structure ensures that the local areas of the enhanced image look more real and natural, which is essential for avoiding local distortion.
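A combined pixel loss of the kind described here can be sketched as a weighted sum of the mean absolute error and the mean squared error. The weights `w1` and `w2` and the function names are hypothetical; the paper's actual weighting is not reproduced here.

```python
def l1_loss(generated, target):
    """Mean absolute error between two flattened images."""
    return sum(abs(g - t) for g, t in zip(generated, target)) / len(target)


def l2_loss(generated, target):
    """Mean squared error between two flattened images."""
    return sum((g - t) ** 2 for g, t in zip(generated, target)) / len(target)


def pixel_loss(generated, target, w1=1.0, w2=1.0):
    """Weighted sum of L1 and L2 losses; w1 and w2 are illustrative weights."""
    return w1 * l1_loss(generated, target) + w2 * l2_loss(generated, target)


generated = [0.0, 1.0]
ground_truth = [0.5, 0.5]
print(pixel_loss(generated, ground_truth))
```

Because the L2 term squares the error, it penalizes large deviations more strongly than the L1 term does, so the two terms respond differently to the same residual.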
For the global discriminator, Alexia [52] proposed a relative discriminator. Its structure is divided into two parts: one estimates the probability that the real data is more realistic than the generated data, and the other estimates the probability that the generated data is less realistic than the real data. In the ideal state, the former approaches 1 and the latter approaches 0. In the standard function of the relative discriminator, C is the discriminator network, $x_r$ and $x_f$ are samples of the real image and the generated image, $P_r$ is the probability distribution of the real image, $P_f$ is the probability distribution of the generated image, E is the expected value, and σ is the Sigmoid activation function. The final loss functions of the global discriminator D and generator G use the loss function proposed by Mao et al. [18]. For the local discriminator, 6 small blocks of size 32 × 32 are randomly cropped from each generated image and each real image. In the loss functions of the local discriminator D and generator G, $P_{rp}$ is the probability distribution of the real image blocks, and $P_{fp}$ is the probability distribution of the generated image blocks. The global discriminator works as follows: the image passes through five down-sampling operations followed by activation function processing. Each down-sampling operation consists of one convolution layer and one batch normalization layer and uses the Leaky ReLU activation function. After the fifth down-sampling operation, the final result is output through the Sigmoid activation function. The local discriminator works as follows: the image block passes through four down-sampling operations and then the Sigmoid activation function to discriminate the local image. Each of its down-sampling operations is the same as that of the global discriminator.
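The two parts of the relative discriminator can be illustrated with a short sketch. It assumes the commonly stated relativistic average form, in which each raw score C(x) is compared with the mean score of the opposite class before the Sigmoid is applied; the paper's own equations are not reproduced here, and the function names are hypothetical.

```python
import math


def sigmoid(z):
    """Logistic Sigmoid activation."""
    return 1.0 / (1.0 + math.exp(-z))


def relativistic_outputs(real_scores, fake_scores):
    """Relativistic average discriminator outputs (assumed formulation).

    real_scores: raw outputs C(x_r) for a batch of real images.
    fake_scores: raw outputs C(x_f) for a batch of generated images.
    Returns (d_real, d_fake): d_real should approach 1 and d_fake should
    approach 0 for a well-trained discriminator.
    """
    mean_real = sum(real_scores) / len(real_scores)
    mean_fake = sum(fake_scores) / len(fake_scores)
    d_real = [sigmoid(c - mean_fake) for c in real_scores]
    d_fake = [sigmoid(c - mean_real) for c in fake_scores]
    return d_real, d_fake


d_real, d_fake = relativistic_outputs([2.0, 2.0], [-2.0, -2.0])
print(d_real, d_fake)
```

When real and generated scores are indistinguishable, both outputs sit at 0.5, matching the equilibrium condition described for the standard GAN.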

C. GENERATOR
This article mainly uses the DE-GAN network structure. To preserve the original information of the picture as much as possible, the generator G uses the U-Net network instead of a traditional CNN. Its structure is similar to an encoder-decoder, but unlike a traditional encoder-decoder, the U-Net network uses skip connections; the structure is shown in Fig.4. The U-Net network combines low-level and high-level feature map information, which effectively retains more image details. The generator G contains 15 layers in total, of which the first 8 are convolutional layers and the last 7 are deconvolutional layers. The first half of G can be regarded as an encoder: it performs convolutional down-sampling on the input underwater image to extract features, applies batch normalization (BN) to the extracted features, and uses the Leaky-ReLU function as the activation function. The second half of G is the corresponding decoder: it uses a deconvolution network to up-sample the low-dimensional features back to the original size to obtain a clear underwater image, achieving end-to-end learning. To prevent over-fitting, BN and Dropout are applied to the features of each deconvolution layer, the ReLU function is used as the activation, and the tanh function is applied at the end. The network model of the generator G in this paper is shown in Fig.5, and the network feature flow chart is shown in Fig.6.

D. ADFF MODULE
Although the U-Net structure is an effective and widely used choice for underwater image enhancement, it still has considerable room for improvement, because it lacks cross-scale connections and cannot effectively integrate multi-level features. Inspired by the adaptive feature fusion in the literature [53], this paper proposes the ADFF module and applies it to the U-Net architecture of the generator. Unlike the fusion method in the literature [54], the ADFF module focuses on the features of every level extracted so far and, by adaptively fusing the features of all current levels, provides rich information for extracting or restoring the features of the next level, which helps the image reconstruction process. Its core is to use dense connections to bring the features of all previous levels to the current level and feed them into the ADFF module, which adaptively learns a spatial weight for the features of each level so that important features receive more weight in the fusion. In the generator, the ADFF module structure at the n-th (n = 2, 3, 4, 5) level of the encoding module is shown in Fig.7, and the adaptive dense feature fusion is defined as follows:

$$i^{n}_{xy} = \alpha^{1}_{xy} \cdot i^{1\to n}_{xy} + \cdots + \alpha^{n-1}_{xy} \cdot i^{(n-1)\to n}_{xy} + \alpha^{n}_{xy} \cdot \hat{i}^{n}_{xy} \quad (17)$$

Here $i^{n}$ represents the output feature map fused by the ADFF module at the n-th level of the encoder, and $i^{n}_{xy}$ represents the feature vector at position (x, y) on the output feature map $i^{n}$. $\hat{i}^{n}$ (with $\hat{i}^{1} = i^{1}$) denotes the latent feature map that has not been fused by the ADFF module at the n-th level of the encoder, and $\hat{i}^{n}_{xy}$ is the feature vector at position (x, y) on $\hat{i}^{n}$. $i^{l\to n}$ denotes the feature map after rescaling from the l-th level (l = 1, ..., n−1) of the encoder to the n-th level, and $i^{l\to n}_{xy}$ is the feature vector at position (x, y) on $i^{l\to n}$. $\alpha^{l}_{xy}$ represents the spatial weight of the rescaled feature map; these weights are all learned adaptively by the network. Each $\alpha^{l}_{xy}$ is a simple scalar variable shared among all channels of the feature map, with $\alpha^{1}_{xy} + \cdots + \alpha^{n}_{xy} = 1$ and $\alpha^{1}_{xy}, \ldots, \alpha^{n}_{xy} \in [0, 1]$, defined as:

$$\alpha^{l}_{xy} = \frac{e^{\lambda^{l}_{\alpha_{xy}}}}{e^{\lambda^{1}_{\alpha_{xy}}} + \cdots + e^{\lambda^{n-1}_{\alpha_{xy}}} + e^{\lambda^{n}_{\alpha_{xy}}}} \quad (18)$$

Here the weights $\alpha^{1}_{xy}, \ldots, \alpha^{n}_{xy}$ are defined by the softmax function with $\lambda^{1}_{\alpha_{xy}}, \ldots, \lambda^{n}_{\alpha_{xy}}$ as control parameters. This article uses 1 × 1 convolutional layers to compute the scalar maps $\lambda^{1}_{\alpha}, \ldots, \lambda^{n}_{\alpha}$ from $i^{1\to n}, \ldots, i^{(n-1)\to n}$ and $\hat{i}^{n}$. With this module, the important information in the features of all previous levels can be summarized adaptively at each level of the generator. In this paper, a convolutional layer with a stride of 2 is used for down-sampling, and a deconvolutional layer with a stride of 2 is used for up-sampling. In the generator, the ADFF module in the decoding module has the same structure as the ADFF module in the encoding module, except that the down-sampling operation is replaced with an up-sampling operation. The specific details are shown in Fig.8.
FIGURE 8. Network architecture of ADFF module at the n-th level of the encoder of the generator.
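The fusion in equations (17) and (18) can be sketched for a single spatial position (x, y). The code below is a minimal illustration rather than the authors' implementation: it takes the control parameters λ as given numbers instead of computing them with 1 × 1 convolutions, and the function names are hypothetical.

```python
import math


def softmax(logits):
    """Softmax over the control parameters, as in equation (18)."""
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_at_position(features, logits):
    """Fuse n feature vectors at one position, as in equation (17).

    features: n feature vectors (one per level, already rescaled to level n),
              each a list of channel values; the last one is the unfused
              latent feature of the current level.
    logits:   n scalar control parameters, one per level.
    Returns (fused_vector, alphas); each alpha is shared by all channels.
    """
    alphas = softmax(logits)
    channels = len(features[0])
    fused = [sum(a * f[c] for a, f in zip(alphas, features))
             for c in range(channels)]
    return fused, alphas


fused, alphas = fuse_at_position([[1.0, 3.0], [3.0, 5.0]], [0.0, 0.0])
print(fused, alphas)
```

Equal control parameters give equal weights, so the fused vector is the plain average; a larger λ for one level shifts the weight toward that level while the weights still sum to 1.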

E. STRUCTURAL SIMILARITY LOSS
Structural similarity [56] defines structural information from the perspective of image composition, reflecting the structural attributes of objects. The mean is used to estimate brightness, the standard deviation to estimate contrast, and the covariance to estimate the degree of structural similarity. The structural similarity can be expressed as

$$\mathrm{SSIM}(p) = \frac{2\mu_x\mu_y + C_1}{\mu_x^2 + \mu_y^2 + C_1} \cdot \frac{2\sigma_{xy} + C_2}{\sigma_x^2 + \sigma_y^2 + C_2}$$

where p is the central pixel of an image block, x is an underwater image block of size 11 × 11, y is a generated image block of size 11 × 11, $\mu_x$ and $\sigma_x$ are the mean and standard deviation of x, $\mu_y$ and $\sigma_y$ are the mean and standard deviation of y, $\sigma_{xy}$ is the covariance of x and y, and $C_1 = 0.02$, $C_2 = 0.03$. Given the structural similarity, a global structural similarity loss is defined between the input underwater image and the generated image. For the local discriminator, a local structural similarity loss is defined between the local small blocks randomly cropped from the underwater image and the corresponding local small blocks of the generated image. The overall loss function of the network model combines these losses with the generator losses using the weights λ1, λ2, λ3, and λ4, which are set according to the training data and experimental results; L denotes a loss and G the generated image. Heuristic experiments on the training set found that the structural similarity loss is as important as the generator loss, so the sum of all structural similarity losses and the sum of all generator losses are given the same weight ratio. At the same time, to avoid the situation in which the local loss is too high and the global effect is not ideal, the weights of the local structural similarity loss and the local generator loss are appropriately reduced to balance the global and local losses.
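The structural similarity of one pair of image blocks can be sketched from the quantities named above: means, standard deviations, covariance, and the constants C1 = 0.02 and C2 = 0.03 stated in the text. The sketch assumes the standard SSIM form with uniform (unweighted) statistics over the block, and it assumes that the loss is taken as 1 − SSIM, which is a common choice but is not confirmed by the text; the function names are hypothetical.

```python
def ssim(x, y, c1=0.02, c2=0.03):
    """Structural similarity of two equally sized, flattened image blocks."""
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((v - mu_x) ** 2 for v in x) / n   # sigma_x squared
    var_y = sum((v - mu_y) ** 2 for v in y) / n   # sigma_y squared
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    luminance = (2 * mu_x * mu_y + c1) / (mu_x ** 2 + mu_y ** 2 + c1)
    contrast_structure = (2 * cov_xy + c2) / (var_x + var_y + c2)
    return luminance * contrast_structure


def ssim_loss(x, y):
    """Assumed loss form: 1 - SSIM, so identical blocks give a loss of 0."""
    return 1.0 - ssim(x, y)


block = [0.1, 0.5, 0.9, 0.5]
print(ssim(block, block))
print(ssim_loss([0.0, 0.0, 1.0, 1.0], [1.0, 1.0, 0.0, 0.0]))
```

In practice the blocks would be 11 × 11 windows slid over the image (or the 32 × 32 local crops for the local loss), with the per-window values averaged.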

IV. EXPERIMENT ANALYSIS
A. EXPERIMENTAL EVALUATION AND INDICATORS
The training set includes 3 800 underwater images and 3 800 clear land images. These images are from the data set provided by Li et al. [57]. All training images are resized to 256 × 256. The Adam optimizer is used, the learning rate is set to 0.0001, and the batch size is 16. The network is implemented in the PyTorch framework and trained on an NVIDIA 1080Ti GPU. The algorithm in this paper is compared with four classic underwater image enhancement algorithms and analyzed in terms of subjective evaluation and objective indicators. The test set consists of 3 600 underwater images, taken from the Internet and from the data set provided by Islam [58].

1) SUBJECTIVE EVALUATION ON THE SID DATASET
The comparison results of the algorithm in this paper and the four existing classic underwater image enhancement algorithms are shown in Fig.4. For the first compared algorithm, the processing effect is not ideal: the degradation in the distant view still exists, and some images still show color oversaturation. That algorithm also introduces too many parameters, which makes it less robust. The images processed by the algorithm of literature [59] are yellowish overall, and the local visual effect is not natural enough. The images processed by the algorithm of literature [60] contain obvious noise, are dark overall, and have low local definition. The images processed by the algorithm of literature [61] have local color casts that are not removed, and the local contrast is low. These four algorithms only enhance the image globally and cannot enhance local areas specifically, which results in poor overall visual effects. Compared with these four algorithms, the algorithm in this paper corrects the color shift of the image and realizes specific enhancement of local areas; the overall brightness is improved, and the visual effect is clearer and more natural. To highlight that the algorithm in this paper effectively retains more image details, the images enhanced by the algorithm in this paper and by the four classic underwater image enhancement algorithms in the literature [62], [63], [64], and [65] are locally magnified. The comparison result of the magnification on the SID dataset is shown in Fig.9. After local magnification, the image processed by the algorithm in this paper shows a clearer texture structure than the original underwater image and the images enhanced by the four classic algorithms. The algorithm in this paper can repair more details of the image and has a better image enhancement effect.

2) OBJECTIVE INDICATORS ON THE DIARETDB DATASET
Four underwater image quality evaluation indicators are used: the underwater image quality measure (UIQM) [66], the underwater image contrast measure (UIConM), the underwater color image quality evaluation metric (UCIQE) [67], and information entropy (Entropy). UIQM and UCIQE are currently the two widely recognized comprehensive indicators of underwater image quality. UIQM objectively evaluates the enhanced image from three aspects: the colorfulness measure (UICM), the sharpness measure (UISM), and the contrast measure (UIConM). The higher the value of UIQM, the better the overall quality of the image, and the higher the value of UIConM, the better the contrast of the image. UCIQE comprehensively evaluates the enhanced underwater image from three aspects: chroma, saturation, and contrast. The higher the value, the better the overall effect of the enhanced image. Information entropy measures the amount of information contained in the image: the higher the value, the more information the image contains and the richer its content. Fig.10 shows the visualization results of our algorithm compared with traditional methods on the DIARETDB dataset.
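Of the four indicators, information entropy is the simplest to state explicitly. A minimal sketch is given below, assuming the standard Shannon entropy H = -Σ p_i log2 p_i over the gray-level histogram of a single-channel image; whether the evaluation in this paper computes entropy on gray levels or per color channel is not stated, so the function name `shannon_entropy` and the flat-list input format are assumptions of this example.

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy in bits of a flat sequence of gray levels."""
    total = len(pixels)
    if total == 0:
        raise ValueError("pixels must not be empty")
    counts = Counter(pixels)
    # p * log2(1 / p) is the same term as -p * log2(p).
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Toy inputs (not data from the paper):
uniform_patch = [128] * 8            # a single gray level carries no information
two_level_patch = [0, 255] * 4       # two equally likely levels
four_level_patch = [0, 85, 170, 255] * 2

print(shannon_entropy(uniform_patch))
print(shannon_entropy(two_level_patch))
print(shannon_entropy(four_level_patch))
```

For an 8-bit image the value is bounded above by 8 bits, reached only when all 256 gray levels are equally frequent.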

B. EXPERIMENTAL VERIFICATION AND ANALYSIS
1) EXPERIMENTAL VERIFICATION
To reduce random error, the objective evaluation results of the 3 600 test images are averaged, and the average is reported to a precision of 0.0001. The comparison with each algorithm is shown in Table 1. The algorithm in this paper scores higher than the other algorithms on the comprehensive measure UIQM, the contrast measure UIConM, and the information measure Entropy, indicating that its enhanced images have a better overall effect, higher contrast, and richer content details. The algorithm in this paper achieves the second-best result on the UCIQE index. Combined with the UIQM results, this is attributed to the saturation of the enhanced underwater images; however, higher saturation does not necessarily mean higher image quality, and when the saturation is too high the image looks unrealistic and unnatural. In general, the objective indicators show that the algorithm in this paper performs better in contrast, comprehensive effect, and information content, and that its images are clearer. Considering both the subjective evaluation and the objective indicators, the algorithm in this paper effectively removes color cast, yields higher contrast, restores more image details and content information, improves image brightness, and produces a more natural and clear visual effect. Overall, it is superior to the other algorithms in both subjective vision and objective indicators.
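The averaging step described above can be sketched as follows. The function name `average_scores`, the dictionary layout, and the numbers in the toy input are assumptions made for illustration only; they are not values from Table 1.

```python
from statistics import fmean


def average_scores(per_image_scores, ndigits=4):
    """Average each metric over all test images and round to `ndigits` decimals."""
    if not per_image_scores:
        raise ValueError("at least one image is required")
    metric_names = per_image_scores[0].keys()
    return {
        name: round(fmean(scores[name] for scores in per_image_scores), ndigits)
        for name in metric_names
    }


# Made-up per-image scores, one dictionary per test image.
toy_scores = [
    {"UIQM": 3.0, "Entropy": 7.0},
    {"UIQM": 3.5, "Entropy": 7.5},
]
print(average_scores(toy_scores))
```

Rounding is applied once, after averaging, so that per-image rounding errors do not accumulate over the test set.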

2) ABLATION RESEARCH AND ANALYSIS
To demonstrate the effectiveness of each component of the model in this paper, a series of ablation experiments were carried out, comparing the following generator structures: 1) the basic U-Net; 2) a U-Net-based generator built from residual blocks without the ADFF module (Ours-woADFF); 3) the generator obtained by adding the ADFF module to Ours-woADFF (the algorithm in this paper). For a fair comparison, all of the above models are trained on the synthetic data set with the same parameters as the algorithm in this paper. The qualitative and quantitative results are shown in Table 2 and Fig.10.
Comparing the U-Net and Ours-woADFF structures in Table 2 shows that a generator composed of residual blocks improves performance, and that the residual blocks improve the learning ability of the network. Comparing the structure in this paper with the Ours-woADFF structure shows that adaptive dense feature fusion is very effective. The ADFF module adaptively retains and accumulates the important features of all previous levels. The qualitative results in Fig.11 are consistent with the quantitative results in Table 2. The residual block helps to extract underwater image features and improves the color correction ability. The ADFF module contributes to feature fusion, which prompts the network to use key and effective information to generate clearer images. All in all, the structure in this paper gives the best underwater image enhancement performance in terms of the SSIM and PSNR indicators and the visual effect, which also demonstrates the effectiveness of the network model in this paper.
Table 3 shows the objective indicators of the enhancement effect for the contrast enhancement algorithms FGF and Patch-net and for the improved algorithm. It can be seen that the method proposed in this paper effectively improves the AG (Average Gradient), EN (Entropy) and EME (Extended Maximum Entropy) of the image.
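To make the idea of adaptively weighting features from several levels more concrete, the sketch below fuses feature maps with per-pixel softmax weights. It is only an illustration under strong assumptions: the exact ADFF formulation is not reproduced here, the importance scores would in practice be produced by learned layers rather than passed in by hand, and the function names `softmax` and `fuse_levels` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a short list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_levels(features, scores):
    """Fuse same-sized feature maps with per-pixel importance weights.

    features, scores: one H x W grid (nested lists) per level.
    At each pixel the scores of all levels are normalised with softmax
    and used as weights for a weighted sum of the level features.
    """
    height, width = len(features[0]), len(features[0][0])
    fused = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            weights = softmax([level[y][x] for level in scores])
            fused[y][x] = sum(w * f[y][x] for w, f in zip(weights, features))
    return fused


# Toy 1 x 2 feature maps from two levels (made-up numbers).
level_a = [[1.0, 2.0]]
level_b = [[3.0, 4.0]]
equal_scores = [[[0.0, 0.0]], [[0.0, 0.0]]]
print(fuse_levels([level_a, level_b], equal_scores))
```

With equal scores the fusion reduces to a plain average of the levels; a large score for one level at a given pixel makes the output follow that level there, which is the sense in which the weighting is spatially adaptive.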

3) MODEL SIMPLIFICATION TEST
This paper uses a model simplification test to evaluate the proposed structures, referred to here as components: the k-path feedforward structure (MF), the stage feature fusion unit (SFF), and the pixel encoding module (PE). The network depth is 22 layers, and the test set is the SID dataset. Fig.12 shows the changes over the entire training process, from the beginning of training to convergence. The black line represents the VDSR structure; the blue line represents the use of the multi-path connection. From Fig.12 it can be seen that the PSNR value fluctuates greatly during training and the network does not converge easily. The pixel encoding structure stabilizes the convergence of the network to a certain extent.
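For reference, the PSNR tracked in Fig.12 follows the standard definition PSNR = 10 log10(MAX^2 / MSE). The sketch below assumes 8-bit images flattened to lists, so MAX = 255; the function name `psnr` and the input format are assumptions of this example rather than the evaluation code used in this paper.

```python
import math


def psnr(reference, enhanced, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized flat images."""
    if len(reference) != len(enhanced) or not reference:
        raise ValueError("images must be non-empty and of equal size")
    mse = sum((r - e) ** 2 for r, e in zip(reference, enhanced)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy 2 x 2 images flattened to lists (made-up values).
reference = [10, 20, 30, 40]
off_by_one = [11, 21, 31, 41]
print(psnr(reference, off_by_one))
print(psnr(reference, reference))
```

Because PSNR depends on the logarithm of the mean squared error, small changes in the error of an already accurate model produce visible swings in the curve, which is one reason PSNR traces can look noisy during training.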

4) MODEL LOSS CURVE
The parameter settings of the network structure in this paper are: variance 0.1, mean 0, dropout 0.5, bias 0.1, and initial learning rate 0.01. Fig.13 shows the convergence curves of the validation loss and the training loss of the method in this paper when it is trained on the data set. It can be seen that the method in this paper converges quickly on the data set. When the overall number of iterations reaches about 20, the validation loss and training loss reach a low level, indicating that the network model is well trained and can effectively perform target detection.

V. CONCLUSION
This paper proposes an effective adaptive feature fusion method for underwater image enhancement. First, a generator based on the U-Net structure is constructed; second, the proposed ADFF module is embedded in the generator. This module adaptively learns the spatial importance weights of the current and previous level features, so that the important features at different levels are fused effectively. Compared with other underwater image enhancement algorithms, the algorithm in this paper is significantly ahead in both subjective and objective evaluations, achieving the best results, and can restore underwater images with fine details and natural colors. A comparison with enhancement algorithms proposed in recent years, in terms of subjective vision and objective parameters, shows that the algorithm better simulates the lighting environment of the real scene when improving image brightness, and further enriches the detail information of the image, improving image quality. The algorithm is mainly suitable for enhancing images with uneven illumination intensity and cannot restore the details of overexposed areas well, which greatly limits its universality. The next stage of work will focus on improving detail restoration in image enhancement.