DEDU: Dual-Enhancing Dense-UNet for Low-Light Image Enhancement and Denoising

In this paper, we propose an innovative image enhancement algorithm called “Dual-Enhancing Dense-UNet (DEDU)” that simultaneously performs brightness enhancement and noise reduction. The model is based on a Convolutional Neural Network (CNN) and incorporates techniques such as Decoupled Fully Connected (DFC) attention, skip connections, shortcuts, Cross-Stage-Partial (CSP) blocks, and dense blocks to address both the brightness-enhancement and noise-removal aspects of image enhancement. This dual approach offers a new solution for restoring and improving high-quality images, presenting new opportunities in computer vision and image processing. Our experimental results substantiate the superior performance of the proposed algorithm: it achieves a Peak Signal-to-Noise Ratio (PSNR) of 19.17, a Structural Similarity Index (SSIM) of 0.71, a Learned Perceptual Image Patch Similarity (LPIPS) of 0.30, a Mean Absolute Error (MAE) of 0.09, and 0.696G Multiply-Accumulate (MAC) operations. These results demonstrate the algorithm’s remarkable image-quality-enhancement capabilities and its considerable efficiency advantage over existing methods.


I. INTRODUCTION
Image enhancement is a cornerstone of computer vision and image processing, playing a crucial role in generating high-quality visuals and extracting essential information. Traditionally, the emphasis in image enhancement has revolved around two key aspects: improving image brightness and reducing noise. These tasks were often addressed separately, with dedicated models for each, resulting in a cumbersome processing workflow. Our model, designed to address both brightness enhancement and noise reduction simultaneously, demonstrates significantly more efficient operation compared to the traditional approach [1]. This paper introduces the 'Dual-Enhancing-Dense-UNet' algorithm, an approach that overcomes these challenges by integrating image brightness enhancement and noise reduction into a unified, efficient process. Crafted for computational efficiency, this consolidated methodology aims to preserve or even enhance the quality of the resultant enhanced images.
Harnessing skip connections, our algorithm mitigates information loss and manages network complexity, effectively reducing the risk of overfitting. Strategically employed shortcut connections preserve the integrity of the original image data while contributing to the generation of superior images. Additionally, we leverage DFC attention to further elevate the effectiveness of image enhancement [3]. DFC fine-tunes interactions among feature maps in deep neural networks, yielding superior performance in the image enhancement process. The combination of these techniques significantly augments image enhancement performance, rendering the algorithm a valuable tool across a wide range of application domains.
The primary motivation behind this research is to provide potent image enhancement solutions capable of operating seamlessly under diverse real-world environmental conditions. Whether enhancing images captured in dimly lit settings or those affected by substantial noise, our algorithm aims to offer a versatile and adaptable solution. These technological advancements have the potential to empower a wide spectrum of applications, including but not limited to medical imaging, autonomous driving systems, security surveillance, and the mining industry [2]. In the following sections, Section II reviews the existing literature and related works that contextualize our research. Section III presents our proposed model, providing a detailed exposition of its architecture and functionality. Section IV focuses on the verification of our model's effectiveness, outlining the methodologies employed for rigorous evaluation. Section V provides a discussion, analyzing the implications and findings derived from our model's performance. Finally, we summarize our study and draw conclusions in the Conclusion section.

II. RELATED WORKS

A. HISTOGRAM EQUALIZATION-BASED METHODS
Traditional methods often relied on histogram equalization techniques for image enhancement. These approaches aimed to redistribute pixel intensities to achieve a more uniform histogram, improving overall brightness and contrast [4]. While effective in certain scenarios, they struggled to handle complex issues such as noise reduction and adaptability to diverse environmental conditions.

B. RETINEX MODEL-BASED METHODS
Retinex model-based methods, inspired by human vision, address image enhancement by decomposing images into illumination and reflectance components. These methods, including Single-Scale Retinex (SSR) and Multi-Scale Retinex (MSR) [5], have shown promise in simultaneously enhancing brightness and reducing noise. However, challenges persist in adapting these models to real-world scenarios with varying lighting conditions.

C. GAN-BASED METHODS
Recent advancements in Generative Adversarial Networks (GANs) have spurred interest in using adversarial training for image enhancement [6]. GAN-based methods leverage a generator-discriminator framework to produce realistic and enhanced images. While successful in some applications, GAN-based approaches often face challenges related to training stability and fine-tuning for specific tasks.

III. PROPOSED MODEL: DUAL-ENHANCING-DENSE-UNET
Our image enhancement algorithm, ''DEDU,'' unites various technical components and architectural innovations to optimize the efficacy of image enhancement. In the following sections, we introduce the fundamental elements and technical details that underpin this algorithm, all of which have been developed from the ground up.

A. ARCHITECTURE DESIGN
The 'DEDU' architecture, depicted in Figure 1, is derived from the renowned U-Net architecture, a cornerstone of image processing that has inspired many frameworks in the field [7]. It enhances the smooth flow of information through the use of skip connections, gracefully traversing layers of varying depths, starting from the input image, to perform sophisticated image enhancement tasks. The journey begins with the stem block, which employs 2D convolutions with kernel sizes of 1 and 3 and strides of 1 and 2, followed by direct downsampling using MaxPool2d with a kernel size of 2 and a stride of 2. The main blocks of encoder levels 1, 2, and 3 consist of dense blocks with DFC attention, CSP, MaxPool2d, Mish, and BatchNorm2d, as shown in Figure 2. Decoder levels 1, 2, and 3 mirror the encoder but use ConvTranspose2d with a kernel size and stride of 2 for upsampling instead of MaxPool2d. The DFC attention's kernel sizes were set to 1 and 5 with a stride of 1, and both kernel sizes in the CSP block were set to 1.
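As an illustration of the stem block described above, here is a minimal PyTorch sketch. The channel widths are assumptions for illustration only; the text specifies just the kernel sizes, strides, and the pooling step:

```python
import torch
import torch.nn as nn

class StemBlock(nn.Module):
    """Sketch of the DEDU stem block: a 1x1 conv (stride 1), a 3x3 conv
    (stride 2), then MaxPool2d with kernel size 2 and stride 2.
    Channel counts (3 -> 32) are illustrative assumptions."""
    def __init__(self, in_ch=3, out_ch=32):
        super().__init__()
        self.conv1 = nn.Conv2d(in_ch, out_ch, kernel_size=1, stride=1)
        self.conv3 = nn.Conv2d(out_ch, out_ch, kernel_size=3, stride=2, padding=1)
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)

    def forward(self, x):
        return self.pool(self.conv3(self.conv1(x)))

x = torch.randn(1, 3, 64, 64)
print(StemBlock()(x).shape)  # torch.Size([1, 32, 16, 16])
```

Together, the strided convolution and the pooling layer reduce spatial resolution by a factor of four before the encoder levels begin.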
The transpose block starts with a bottleneck composed of a Conv2d with a kernel size of 1 and a Conv2d with a kernel size of 5. A significant departure from the traditional U-Net structure is the handling of the encoder-decoder links: concatenation is replaced with skip connections that use 1 × 1 convolutions and associated weights to carry information collected from intermediate layers to their corresponding counterparts in the upper levels [8]. This mechanism preserves both high-level features and the intricate details crucial for enhancement. Please refer to Table 1 for a concise model summary.
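The weighted skip connection that replaces concatenation can be sketched as follows. The exact weighting scheme is an assumption inferred from the description of 1 × 1 convolutions with associated weights:

```python
import torch
import torch.nn as nn

class WeightedSkip(nn.Module):
    """Illustrative sketch: encoder features pass through a learnable 1x1
    convolution and are added to (not concatenated with) the decoder
    features. The precise weighting in DEDU is not fully specified, so
    this additive form is an assumption."""
    def __init__(self, channels):
        super().__init__()
        self.proj = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, decoder_feat, encoder_feat):
        return decoder_feat + self.proj(encoder_feat)

d = torch.randn(2, 8, 16, 16)
e = torch.randn(2, 8, 16, 16)
print(WeightedSkip(8)(d, e).shape)  # torch.Size([2, 8, 16, 16])
```

Because addition keeps the channel count unchanged, this avoids the channel doubling that concatenation-based U-Net skips introduce.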

TABLE 1. Model summary:
The number of channels is halved and doubled as data passes through the down- and up-transpose layers.
We employed the ReLU and Mish activation functions, the L1 loss function, and a gradual warmup scheduler [9] for training. The hyperparameters were set to 1000 epochs, an initial learning rate of 1e-4, a weight decay of 1e-4, and a batch size of 16. The learning rate starts at 1e-4 and stabilizes around epoch 800 during training.
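The training setup above can be sketched in PyTorch as follows. The optimizer choice (Adam) and the single-layer placeholder model are assumptions, since the text specifies only the loss, learning rate, weight decay, batch size, and epoch count; the gradual warmup scheduler is omitted here:

```python
import torch
import torch.nn as nn

# Placeholder model standing in for DEDU (assumption for illustration).
model = nn.Conv2d(3, 3, kernel_size=3, padding=1)
criterion = nn.L1Loss()  # L1 loss, as specified in the text
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4, weight_decay=1e-4)

# One training step on a synthetic batch of size 16.
low = torch.rand(16, 3, 64, 64)   # noisy low-light inputs
high = torch.rand(16, 3, 64, 64)  # normal-light targets

optimizer.zero_grad()
loss = criterion(model(low), high)
loss.backward()
optimizer.step()
print(loss.item() >= 0.0)  # True
```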

B. DENSE BLOCK CONSTRUCTION
The construction of dense blocks forms the cornerstone of the algorithm's performance [10]. It integrates advanced technical elements such as modified DFC attention, shortcut connections, and CSP techniques [3], [11]. The dense block arranges the DFC attentions akin to DenseNet, incorporating depthwise convolution in shortcuts and reducing model parameters through hidden channels. The combination of the main block and the DFC dense block, consisting of 1 × 1 convolutions and four layers of CSP, constitutes the dense block. The biggest difference from traditional DFC attention is that the spatial structure is preserved by using Conv2d rather than a Linear layer in the fully connected part. Leveraging these components, the dense block optimizes interactions between layers, significantly enhancing feature extraction across the entire image enhancement process. This ultimately raises image quality and contributes substantially to taming the noise levels that often plague images. For a deeper understanding, see the architectural representations of the DFC attention and DFC dense block in Figure 3 and Figure 4.
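A minimal sketch of the convolutional DFC attention described above: a 1 × 1 convolution followed by horizontal (1 × 5) and vertical (5 × 1) depthwise convolutions producing a sigmoid attention map. The exact channel handling and any downsampling are assumptions based on the text and on GhostNetV2-style DFC attention:

```python
import torch
import torch.nn as nn

class DFCAttention(nn.Module):
    """Sketch of DFC attention with Conv2d (not Linear) layers, so the
    spatial structure is preserved. Kernel sizes 1 and 5 with stride 1
    follow the text; the decomposition into horizontal then vertical
    depthwise convolutions is an assumption."""
    def __init__(self, channels):
        super().__init__()
        self.reduce = nn.Conv2d(channels, channels, kernel_size=1)
        self.horizontal = nn.Conv2d(channels, channels, kernel_size=(1, 5),
                                    padding=(0, 2), groups=channels)
        self.vertical = nn.Conv2d(channels, channels, kernel_size=(5, 1),
                                  padding=(2, 0), groups=channels)

    def forward(self, x):
        attn = torch.sigmoid(self.vertical(self.horizontal(self.reduce(x))))
        return x * attn  # reweight features with the attention map
```

Decomposing the attention into two one-dimensional depthwise passes captures long-range context along each axis at far lower cost than full self-attention.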

C. ACTIVATION FUNCTIONS
A pivotal facet of our algorithm lies in the judicious selection of activation functions. We employ ReLU and Mish [12], [13]. These activation functions infuse invaluable non-linearity into the network, equipping the model with the capacity to learn the intricate patterns and features essential for image enhancement.
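For reference, Mish is defined as x · tanh(softplus(x)); a one-line implementation (PyTorch also ships this as `torch.nn.Mish`):

```python
import torch
import torch.nn.functional as F

def mish(x):
    """Mish activation [12]: x * tanh(softplus(x)).
    Smooth and non-monotonic, unlike ReLU."""
    return x * torch.tanh(F.softplus(x))

print(mish(torch.tensor(0.0)))  # tensor(0.)
```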

D. LOSS FUNCTION
Throughout the training phase, we use the L1 loss function [14]. L1 loss minimizes the absolute difference between the ground-truth enhanced image and its predicted counterpart. This choice guides the model to learn in a direction that converges the output towards an optimal similarity with the reference image while manifesting the desired enhancement effects.
E. NORMALIZATION

To fortify the overall stability of the image enhancement network, we introduce Batch Normalization (BatchNorm2d) [15]. BatchNorm2d standardizes the inputs of each layer, stabilizing the training process and expediting convergence. This normalization technique plays a pivotal role in sustaining consistent and robust performance.
In summation, ''DEDU'' orchestrates image enhancement tasks through the harmonious fusion of a myriad of technical components and architectural innovations. This synergy empowers the algorithm to produce high-quality images and extract indispensable information, even under adverse conditions such as low light or heavy noise. We anticipate that this versatile algorithm will find resonance across diverse application domains, leaving an indelible mark on the realm of image enhancement.

IV. VERIFICATION OF MODEL EFFECTIVENESS
In this section, we present the results of our experiments aimed at assessing the effectiveness of our proposed model. We conducted these experiments using the Low-Light (LOL) dataset to evaluate its performance on brightness enhancement and on simultaneous brightness enhancement and noise reduction [16]. For the evaluation, we employed various performance metrics, including PSNR, SSIM, LPIPS, MAE, and MAC. The computational resources used were a GTX 3050 GPU and an Intel i7-11700 CPU. All model implementations and experiments were conducted in Python with the PyTorch framework.

A. DATASET
The dataset utilized in this study is the LOL dataset, an openly available dataset designed for research on image enhancement and noise reduction under low-light conditions, providing suitable image samples for such studies. The dataset contains 500 images, each 600 pixels wide and 400 pixels high, and can be downloaded from the LOL dataset source.
The LOL dataset was obtained from https://daooshee.github.io/BMVC2018website/ and is specifically designed for handling images under low-light conditions. Data collection followed the guidelines provided by the LOL dataset source. The dataset contains low- and normal-light image pairs. To introduce noise, we added Gaussian random noise with a noise level of 10. We then labeled noisy low-light images as 'lown' and normal-light images as 'high'. The dataset was divided into training, validation, and test sets in an 8:1:1 ratio to ensure robust model evaluation.
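The noise injection and split described above can be sketched as follows, assuming 8-bit images (so a noise level of 10 is a standard deviation of 10 on the 0-255 scale) and a seeded random split; both assumptions are for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def add_gaussian_noise(img, sigma=10):
    """Add Gaussian random noise at noise level 10 (assumed to be the
    standard deviation on 0-255 images), clipped to the valid range."""
    noisy = img.astype(np.float64) + rng.normal(0, sigma, img.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)

# 8:1:1 train/val/test split over the 500 image indices.
indices = rng.permutation(500)
train, val, test = indices[:400], indices[400:450], indices[450:]
print(len(train), len(val), len(test))  # 400 50 50
```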

B. PERFORMANCE IMPROVEMENT
The DEDU incorporates three key feature elements: DFC attention, CSP, and skip connections. We compared the metrics of versions in which each of these elements was individually removed against the original version. First, the performance difference between the model without DFC and the original version revealed a contribution of 0.9 from DFC. This underscores the significance of DFC and provides insight into how it contributes to image enhancement. DFC uses fully connected layers for attention-map generation, simplifying global information capture compared to self-attention. In DFC, the feature map is computed in the vertical and horizontal directions, reducing overall complexity and improving PSNR. Shared transformation weights enable an efficient implementation with concurrent reductions in MAC operations and parameter counts [3].
Second, the comparison between the model without CSP and the original version showed a notable improvement of 3.64, highlighting the importance of CSP. CSP aims to reduce the heavy inference computation caused by duplicate gradient information. To achieve this, it divides the feature map of the base layer into two parts and fuses them through cross-stage fusion. This process introduces significant differences in the correlation of gradient information, and applying CSP to the main block reduces computational load while increasing performance [11].
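The split-and-fuse idea behind CSP can be sketched as follows; the inner computation and fusion layer are placeholder assumptions, since the text describes only the two-way split and cross-stage fusion:

```python
import torch
import torch.nn as nn

class CSPBlock(nn.Module):
    """Sketch of a cross-stage-partial block: the input feature map is
    split into two halves along the channel axis; one half passes
    through the main computation, the other bypasses it, and the two
    are fused with a 1x1 convolution. The inner Conv-BN-Mish stack is
    an illustrative assumption."""
    def __init__(self, channels):
        super().__init__()
        half = channels // 2
        self.main = nn.Sequential(
            nn.Conv2d(half, half, kernel_size=3, padding=1),
            nn.BatchNorm2d(half),
            nn.Mish(),
        )
        self.fuse = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, x):
        a, b = torch.chunk(x, 2, dim=1)           # split into two parts
        return self.fuse(torch.cat([self.main(a), b], dim=1))  # cross-stage fusion
```

Only half the channels traverse the heavy path, which is why CSP cuts inference computation while preserving gradient diversity.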
Lastly, the performance difference between the model without skip connections and the original version indicated a substantial impact, with a 0.94 enhancement attributed to skip connections. This emphasizes their essential role in image enhancement. Within the U-Net architecture, skip connections merge the weights associated with downsampling and upsampling. This integration mitigates gradient loss and significantly enhances the overall performance of the model. However, this performance boost comes with a trade-off: computational cost increases due to the additional operations required to manage and combine the skip connections.
Applying skip connections to the model resulted in a slight increase in MACs and parameters. However, we successfully offset this increase by employing the CSP and DFC techniques. The results are presented in Table 2.

TABLE 2. Ablation study of key feature elements in DEDU.
Through these experimental comparisons, we gained clear insights into the significance of each feature element and their respective contributions to enhancing the model's performance.

C. MODEL COMPARISON
We evaluated the experimental results on various aspects of the dataset, employing several performance metrics for comparison with different baseline models. The baselines were Retinex [17], Zero-DCE [18], Zero-DCE++ [19], RUAS [20], Uformer [21], HEP [22], HWMNet [23], IAT [24], LLFlow [25], RetinexFormer [26], and WaveNet [27]. In cases where pre-trained models were unavailable, we trained the comparison models from scratch on custom datasets. On the LOL dataset, our proposed model demonstrated superior performance across all metrics, securing the top position in both SSIM and LPIPS and attaining 3rd and 2nd positions in PSNR and MAE, respectively. Moreover, our model exhibited notable efficiency, with significantly lower MAC and parameter counts than other top-performing models. For detailed insights, please refer to Tables 3 and 4. These results validate the efficacy and potential of our model in enhancing low-light images in the LOL dataset.

D. PERFORMANCE METRICS EXPLANATION
For clarity, here are explanations of the performance metrics used:
➢ PSNR measures image quality.
➢ SSIM assesses structural similarity between images.
➢ LPIPS quantifies perceptual image similarity.
➢ MAE indicates the absolute difference between actual and predicted images.
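PSNR and MAE can be computed directly from pixel values, as sketched below for 8-bit images (the 255 peak value is an assumption about the image format); SSIM and LPIPS are typically computed with dedicated libraries such as scikit-image and the `lpips` package:

```python
import numpy as np

def psnr(ref, pred, max_val=255.0):
    """Peak Signal-to-Noise Ratio in dB: 10 * log10(MAX^2 / MSE)."""
    mse = np.mean((ref.astype(np.float64) - pred.astype(np.float64)) ** 2)
    return 10.0 * np.log10(max_val ** 2 / mse)

def mae(ref, pred):
    """Mean Absolute Error between reference and predicted images."""
    return np.mean(np.abs(ref.astype(np.float64) - pred.astype(np.float64)))

ref = np.zeros((8, 8))
pred = np.full((8, 8), 10.0)
print(round(psnr(ref, pred), 2), mae(ref, pred))  # 28.13 10.0
```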

E. MODEL EFFICIENCY
Efficiency was also a focus of our evaluation: in Table 4 we compare MAC operations, model parameters, and inference times across models. Our model ranked third lowest in MAC computations and fifth highest in parameter count. The parameter count rises because brightness enhancement and noise reduction are executed concurrently, which increases both MAC operations and parameters. Nonetheless, our model remains remarkably efficient: its MAC count is approximately 1300 times smaller than that of the model with the highest MAC count, while its PSNR is about 5 points higher than that of the model with the smallest MAC count.
In comparison to models providing similar performance, our model significantly outperforms in terms of reduced MAC operations while maintaining comparable results.

V. DISCUSSION
Our model effectively addresses the challenging task of simultaneously removing noise and enhancing brightness in images. This dual functionality does introduce higher computational complexity compared to some models focused on single tasks. However, it offers the significant advantage of handling both tasks simultaneously while maintaining superior performance compared to other models. This versatility makes our model a robust solution for real-world scenarios demanding complex image enhancement.
Despite this success, it is essential to examine the unresolved aspects, providing a more in-depth discussion of the challenges stemming from the complexity of the network structure and the associated parameter increase. This discussion aims to transparently convey the scope and intricacies of our research to the readership.

A. REASONS FOR PERFORMANCE IMPROVEMENT
Several key factors contribute to the outstanding performance of our model. Firstly, our novel convolutional DFC attention mechanism employs innovative computational techniques to efficiently extract essential image features. DFC attention utilizes diverse kernel sizes and depthwise convolutions, optimizing feature extraction while minimizing computational cost. Additionally, we employed techniques such as partitioning fully connected layers both vertically and horizontally to reduce computational complexity. As a result, our model operates with greater efficiency, ultimately achieving enhanced performance [28].

B. FUTURE WORKS
In our future research directions, we intend to further enhance the efficiency and learning speed of the model. We plan to explore methods for fine-tuning the combination of complex computational techniques and optimizing the model's complexity by prudently integrating parallel and serial operations. Furthermore, our approach of concurrently addressing brightness enhancement and noise reduction aims to maximize the effectiveness of both tasks. To achieve this, we will develop lightweight models tailored for individual brightness enhancement and noise reduction tasks, creating a pipeline that seamlessly integrates these processes to generate the final result [29]. This approach is anticipated to enhance the synergy between the two tasks while boosting overall model efficiency. Additionally, we recognize the importance of ongoing research in model lightweighting to expand the practical applicability of our approach in real-world environments.

VI. CONCLUSION
In summary, our study introduces a groundbreaking model designed to concurrently enhance image quality and reduce noise, deviating from traditional methodologies. By incorporating DFC attention, shortcut techniques, and the CSP methodology, our model exhibits exceptional performance, as supported by objective evaluation metrics. The experimental results affirm the model's effectiveness in extracting image features and accomplishing both brightness enhancement and noise reduction simultaneously, presenting a paradigm shift in these domains.
However, it is crucial to acknowledge the inherent complexities of our model, stemming from its intricate structure and the resultant abundance of parameters. Furthermore, the comparative analysis indicates that our model's performance lags behind dedicated single-task brightness enhancement models. This underscores the necessity for further research to optimize its efficiency. In light of these findings, we intend to explore avenues for maximizing efficiency and streamlining training in future investigations. Additionally, we aim to develop a dedicated pipeline for independently training brightness enhancement and noise reduction tasks, aiming to enhance both efficiency and performance [30], [31].
Our study pioneers a novel approach to image enhancement and noise removal, opening new avenues for improved real-world applications. Ongoing research commitments are geared towards refining our models and technologies to contribute meaningfully to the continuous evolution of this field.
Despite the success of our proposed model, it is imperative to address the challenges associated with the network's complexity and the proliferation of parameters. Resolving these challenges necessitates dedicated efforts in model optimization and the exploration of lightweight model structures. Furthermore, the discussion has shed light on potential applications of our algorithm in computational intelligence (CI) domains, including medical imaging, autonomous driving systems, security surveillance, and beyond. This forward-looking perspective underscores the broader impact and relevance of our research in diverse CI application fields.
In conclusion, our study not only advances the state-of-the-art in image enhancement but also lays the foundation for future research aimed at overcoming the identified limitations and expanding the practical applications of our innovative model.

FIGURE 1. The architecture of DEDU. The skip connection uses a weighted 1 × 1 convolution.

FIGURE 2. The main block and transpose blocks of DEDU: (a) main block, (b) down transpose, (c) up transpose.

FIGURE 4. DFC dense block with CSP in the main block.

FIGURE 5. Relationship between MAC operations and PSNR on the LOL dataset.

Figure 5 illustrates the relationship between Multiply-Accumulate (MAC) operations and PSNR on the LOL dataset, providing additional insight into model performance. MAC = Multiply + Accumulate (8), where Multiply is the number of multiplication operations (between inputs and weights) and Accumulate is the number of addition operations that sum the multiplication results into the final output.
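As a worked example of the MAC definition above, the MAC count of a single Conv2d layer can be computed analytically; the example layer sizes are illustrative, not taken from the paper:

```python
def conv2d_macs(in_ch, out_ch, k, out_h, out_w):
    """MAC count for one Conv2d layer: each output element requires
    in_ch * k * k multiply-accumulate operations, and there are
    out_ch * out_h * out_w output elements."""
    return in_ch * k * k * out_ch * out_h * out_w

# A 3x3 convolution, 32 -> 32 channels, on a 64 x 64 feature map:
print(conv2d_macs(32, 32, 3, 64, 64))  # 37748736 (about 37.7M MACs)
```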

FIGURE 6. Image comparison. Both images are from the LOL dataset.

TABLE 4. Comparison of MAC operations and model parameters of each architecture.