Dark Channel Based Multiframe Super-Resolution Reconstruction

Multiframe super-resolution (MFSR) can obtain a high-resolution image from a set of low-resolution images. The performance of super-resolution is affected by the image prior information. The current super-resolution algorithms typically use total variation prior and its improved version, restoring the image edges well. However, it will produce artifacts and stair effects in the smooth region of the image. Therefore, we propose a dark channel-based MFSR algorithm to achieve edge-preserving and noise-suppressing. Firstly, the total variation prior is used to ensure the edge-preserving ability of the algorithm. Secondly, the dark channel prior is added to suppress artifacts and stair effects. Finally, the weights of the prior terms are iteratively adapted to obtain the final high-resolution image. Experiments show that the proposed algorithm can achieve a better result in objective and subjective visual evaluations.


I. INTRODUCTION
Resolution is a basic feature of an image, and it is also the major factor limiting the scope of image application [1], [2]. Increasing the resolution of images can effectively expand the scope of image applications. Super-resolution (SR) can improve the resolution of images via mathematical methods without changing the hardware of imaging system [3]. This technology has great advantages in cost and it is widely used in remote sensing, medical imaging, and other fields [4]- [7].
The SR methods have been developed for decades. In 1984, Tsai and Huang proposed a frequency domain based multiframe super-resolution (MFSR) algorithm [8], which proved the theoretical feasibility of image SR. The algorithm reconstructed the high-resolution (HR) image by the existing sub-pixel displacement between the low-resolution (LR) images, and was successfully applied to the LANDSAT images. Kim et al. [9] improved the MFSR algorithm by analyzing the noise and blur characteristics of the images. Bose et al. [10] further optimized the MFSR algorithm by The associate editor coordinating the review of this manuscript and approving it for publication was Sudhakar Radhakrishnan .
analyzing the registration error in the reconstruction process, which expanded the application range of the algorithm. Rhee and Kang [11] used discrete cosine transform (DCT) instead of discrete Fourier transform (DFT) in the MFSR algorithm, which can effectively improve the computational efficiency of the algorithm. Because the MFSR algorithm based on the frequency domain was simple in principle, fast in calculation speed, and low in computing hardware requirements, it can be easily implemented in engineering. However, in this type of algorithms, only the global transformation motion could be handled, and the prior information could not be used.
In order to solve the aforementioned problems, a series of MFSR algorithms have been proposed. These algorithms use the sub-pixel shifts between multiple LR images to provide additional information for reconstructing HR images, including nonlinear interpolation method [12]- [14], iterative back projection (IBP) [14]- [17], projection onto convex sets (POCS) [18]- [20], maximum likelihood (ML) estimation [21], [22], maximum a posteriori probability (MAP) [23]- [27], adaptive filtering [28], [29] and other methods. The image MFSR problem is an ill-posed problem. By adding image prior information, the ill-posed inverse VOLUME 9, 2021 This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/ problem is constrained as a well-posed problem, so that the reconstruction result converges to obtain a stable solution. Generally, the reconstruction results of the spatial domain algorithm are better than the frequency domain algorithm, but the spatial domain algorithm is sensitive to image prior knowledge and registration information. The image prior information has a decisive effect on MFSR, and different prior information will lead to different reconstruction results. The earliest image prior is the Laplace prior [30], [31]. In this prior, the natural images are assumed to be smooth, and noise will lead to image smoothness characteristic loss. By imposing penalty constraints on the high-frequency components of the reconstructed image, the noise can be limited, and a smooth solution can be obtained. However, it will result in blurred edges and loss of texture information. In the Gaussian Markov Random Field (GMRF) prior [32], by analyzing the distribution of each image pixel and its neighboring pixels information, the details and texture can be distinguished. In the Huber Markov Random Field (HMRF) prior [33], the image is assumed to be block smoothed. Compared with GMRF prior, the HMRF prior uses the Huber function, instead of the smoothness measurement function, to measure the image spatial characteristics. Then, the details in the edges can be preserved, and the noise in the smooth regions can be suppressed.
The total variation prior (TV) [34]- [38] is widely used in image deblurring, image denoising, and image MFSR. However, this prior will produce stair effects in the smooth area of images under strong noise. Farsiu et al. [30] proposed a bilateral total variation (BTV) prior. In the BTV prior, by comparing the original image with the translated image, a larger weighted coefficient was assigned to the edge areas, and a smaller weighted coefficient was assigned to the smooth areas. Then a more robust result with more details can be achieved.
Since different image priors have different effects on image reconstruction, combining image priors became an effective way to improve the quality of reconstructed images. Chantas et al. [39] combined the TV prior and the Product of Expert (PoE) prior [40] to a new image prior, which was able to simultaneously enforce the properties on the image. Villena et al. [41] applied a combination of the sparse TV prior and l 1 prior, and the non-sparse simultaneous auto-regressive (SAR) prior to the MFSR. This prior could integrate the strong edge preservation of the sparse prior and the smoothness of the non-sparse prior. In [42], a spatially adaptive linear filter prior was proposed for the MFSR. In [43], a filter bank and l 1 norm based prior combination method was presented. However, these methods still produce artifacts and stair effects because the combined priors were both based on the gradient properties of the image.
The dark channel is introduced by He et al. [44] for single image dehazing, then Pan et al. [45] modify the prior that the dark channel of natural images is sparse instead of zero and enforce the sparsity for kernel estimation. Inspired by the work in [45]- [47], we note that the DC pixels are less sparse after the degradation process. Therefore, we introduce the DC prior into MFSR algorithm.
In order to overcome the aforementioned shortcomings, in this paper, we proposed a new algorithm to improve the quality of the reconstructed image, and the main contributions are as follows: 1) Introduce the dark channel (DC) prior to superresolution, in order to suppress the artifact and stair effects during the image edges recovering. 2) Propose a new MFSR algorithm by combining the DC and TV prior, which can effectively improve the MFSR effect. 3) Experiment results verify that the above algorithm can achieve better performance in both PSNR and SSIM. The rest of this paper is organized as follows. In Section II, we build the entire super resolution model. In Section III, we give the detailed description of the DC-TV based MFSR algorithm. Then, in Section IV, we represent the experiment results and evaluate the proposed algorithm in comparison with other representative benchmarks. Finally, we concluded this paper in Section V.

II. THE SUPER-RESOLUTION RECONSTRUCTION MODEL
The super-resolution reconstruction model is composed of image observation model, image noise model, image prior model, and image registration model.

A. OBSERVATION MODEL
The observation model is established to describe the relationship between HR images and LR images. An LR image with N = N 1 × N 2 pixels (where N 1 and N 2 are the row number and column number of the image )can be obtained from an HR image with P 2 N = PN 1 × PN 2 pixels(P is the up-sampling factor) through a series of operations such as rotation, displacement, blurring, down-sampling, and mixed noise. The corresponding observation model is represented by the Fig. 1. According to this observation model, a mathematical model can be established in matrix-vector form where s is the number of the LR images, I H ∈ R P 2 N ×1 and I L l ∈ R N × 1 are lexicographically ordered vectors used to represent HR and LR images respectively. W M l is the motion transformation matrix with size P 2 N × P 2 N , and the value is affected by the motion transformation parameters M l = {θ l , d hl , d vl }. K l is the blurring matrix with size P 2 N × P 2 N , and S is the down-sampling matrix with size N × P 2 N . n l represents the additive noise vector related to the lth LR image.

B. IMAGE NOISE MODEL
The noise model of the image corresponds to the degradation relationship between the LR images and the HR image. The relationship can be expressed as the conditional probability density function p(I L l | I H ) of the LR image. Assuming that the noise is additive white Gaussian noise (AWGN) with the variance η −1 l , according to (1), the noise model of the image can be expressed as Since the LR images are captured independently, the noises can be assumed to be statistical independence. Therefore, we can obtain the following expression where the I L = {I L 1 , I L 2 , . . . , I L s } is the set of LR images and M = {M 1 , M 2 , . . . , M s } is the set of registration parameters.

C. IMAGE PRIOR MODEL
Two image prior model is used in this paper, the TV prior model p(I H | η tv ) controlled by hyperparameter η tv and the DC prior model p(I H |η d ) controlled by hyperparameter η d .

1) TV PRIOR MODEL
The TV prior is widely used because of the property of edge preserving [48]. The mathematical expression of the image TV prior is where TV (I H ) has two different expressions, i.e. and In (5) and (6), hi (I ) = I (i) − I (r(i)) and vi (I ) = I (i) − I (b(i)) respectively represents the horizontal and vertical gradient of the image. Equation (5) is called the isotropic TV model and (6) is called the anisotropic TV model. We use the isotropic TV model in this paper, because compared to the anisotropic TV model, the isotropic TV model is more robust to image rotation, image reflection, and image position transformation, and the reconstruction effect of the isotropic TV model is also better than the anisotropic TV model [49].

2) DC PRIOR MODEL
The mathematical expression of the image DC prior model is where the D(I H ) represents the DC of the image. The standard form of DC is expressed by where i and j are image pixel indexes and N (i) is an image patch centered at i. Note that the nonlinear operation D(I i ) cannot be directly used in the subsequent solution. Hence, a linear operation D c which is equivalent to this nonlinear operation is applied in the following form The D c in (9) is a matrix with size P 2 N × P 2 N and the elements in ith row is acquired from the below formula where q is the position index of the minimum value in the given neighborhood N (i).

D. IMAGE REGISTRATION MODEL
The registration model aims to describe the motion transformation relation between the target images and the reference image. The registration parameters are generally obtained by using the first frame of the LR images as reference.
In contrast, we use the HR image as reference to obtain more accurate registration parameters. Using a 3-parameters motion model, the position relationship between the reference image and the lth warped image is where (x, y) are the coordinates of the reference image and (x l , y l ) are the coordinates of the warped image. M l = (θ l , d hl , d vl ) respectively correspond to the rotation angle, horizontal displacement, and vertical displacement in the motion transformation.
Since the coordinates (x l , y l ) are generally not integer values, the grid of the lth warped image should be calculated by resampling (see Fig. 2). By using the bilinear interpolation method to approximate the lth warped image, the value of the pixels can be calculated by the weighted sum of four adjacent points, that is where I se , I ne , I sw , I nw respectively represent four adjacent points in different directions around the target point. E is the identity matrix, D a x and D a y are diagonal matrices with a x , a y as diagonal elements.
[a x , a y ] T represent the distance

III. DC-TV BASED MFSR ALGORITHM
The MFSR is an inverse processing of the image degradation.
where the conditional distribution can be expressed as The mathematical derivation is given in this section, and the flowchart is shown in Fig. 3. Since the main contribution of the proposed algorithm is the combination of the DC and TV prior, we will first describe the image prior optimization steps as follows.

A. THE PRIOR OPTIMIZATION
A combined prior named DC-TV is proposed to improve the effect of the MFSR algorithm. The TV prior could preserve the image edges by distinguishing between gradient change areas and image smooth areas, but cannot get rid of the noise. In addition, the DC prior have the ability to suppress the noise by the sparse property of the l 0 norm. Thus, the DC-TV prior can effectively combined the edge preserving capability of the TV prior and noise suppressing capability of the DC prior. The expression of the DC-TV prior model is, where p(I H | η tv ) and p(I H |η d ) represent the TV prior and the DC prior, respectively.

1) TV PRIOR OPTIMIZATION
The TV prior equation (5) cannot be directly tackled in the proposed algorithm, so the majorization-minimization (MM) approach [50] are used. Considering the follow inequality where u > 0, z > 0, and the equation is established when u = z.
For the ith pixel of the image, we introduce a variable Thereby, Q tv (I H i , u i ) has a minimum value equal to ∇I H , i.e.
Hence, the TV prior model can be approximated as where the U is a diagonal matrix of size P 2 N × P 2 N with diagonal elements calculated

2) DC PRIOR OPTIMIZATION
In the DC prior equation, the half-quadratic splitting L 0 minimization approach [51] is proposed to tackle the L 0 norm. By introducing an auxiliary vector g, D(I H ) 0 can be approximated in the following form where η g is a penalty parameter. When η g tends to infinity, D(I H ) 0 and Q D (I H , g) are approximately equal. Calculate the auxiliary vector g by minimizing (23), that iŝ Thus, the solution of g iŝ This optimization updates g and D c iteratively by amplifying η g so that Q D (I H , g) approximately equal to D(I H ) 0 . Hence, The DC prior model can be approximated as

B. REGISTRATION ESTIMATION
The sub-pixel registration is an important part of MFSR. It is mainly obtained by interpolating between pixels. The denser these pixels of the HR image are, the more realistic the interpolation effect is, and the more accurate the registration result becomes. Assuming that the registration parameters obeys the multivariate Gaussian distribution [52], i.e., whereM l is the preliminary registration parameters and ζ l is a priori covariance matrix. After the MFSR image estimation I H is obtained, the parameter M l can be estimated by minimizing the following equationM which can be simplified by a logarithmic form, i.e., Since W M l I H is nonlinear with respect to M l , we perform a first-order Taylor expansion on W M l I H at the valueM l , i.e.

W M l I H
where JM l is the Jacobian matrix and it can be calculated by The three parts of (31) can be calculated by (11), (12) and (13), respectively. Hence, the final value [53] of JM l is in the form where we have with (ζ n+1 where n is the number of iterations and the initial ζ l = η l (SK JM l ) T SK JM l is calculated by the initial registration parametersM l .

C. ESTIMATION OF THE HYPERPARAMETERS
Generally, the distributions of hyperparameters {η l } and η tv are Gamma distributions [54], i.e. (39) where w denotes the hyperparameter. Meanwhile, a w > 0 and b w > 0 are the shape and scale parameters of the Gamma distributions, respectively. In our experiments, these parameters are settled with a w = 1 and b w = 0.1. Therefore, the hyperparameters {η l } can be calculated by the following equation Substituting (2) and (39) into (40), the estimate of η l iŝ The hyperparameter η tv can be calculated in the following equation,η tv = arg max Substituting (4) and (39) into (42), the estimate of η tv iŝ The η d is an empirical hyperparameter, and its value in our experiments is

D. ESTIMATION OF HR IMAGE
After obtaining the image registration parameters M, and the hyperparameters , the HR image I H can be estimated by the following equation Substituting the optimization expression (19) and (23) into (17), the MFSR solution formula can be simplified in the form where The reconstruction process of the proposed algorithm includes the registration parameters estimation, the prior term optimization, the hyperparameters estimation, and the high-resolution image estimation. The stable solution of the algorithm can be obtained by looping the reconstruction process, and the loop termination condition is where t is the iteration counter, and I H 0 is a zero vector. The pseudo code of the algorithm is given in Algorithm 1.
In Algorithm 1, the first loop describing the reconstruction process of the algorithm, and the second loop describing the DC prior optimization. η g is the parameter used during the DC prior optimization process, and the value of η g is between η init g = 0.001 and η max g = 8. η g = η init g . 8: while η g < η max g do 9: Estimate D c byÎ H and (10). 10: Estimate g byÎ H and (25).

IV. EXPERIMENTAL RESULTS
In the proposed DC-TV based MFSR algorithm, the TV prior and DC prior were combined to simultaneously enhance the edge preservation and stair effects suppression capability. To evaluate the performance of the proposed algorithm, multiple sets of simulated images and real data are used in the experiment. The experimental results on simulated images can be evaluated by using objective evaluation criteria, such as Peak Signal to Noise Ratio (PSNR) and Structural Similarity (SSIM). The experimental results on real data can intuitively reveal the advantages of the proposed algorithm.

A. EXPERIMENTS WITH SIMULATED IMAGE
Four images (Fig. 4) are employed in the simulation experiments. Each original image is utilized to create 5 synthetic images by image degradation model according to (1). The degradation process includes the following steps.
• Warp images with rotation and translation. The rotation angles are and the translations are pixels, respectively.
• Down-sample images with a factor of 2.
• Add independent identically distributed Gaussian noise. Five different SNR levels (5dB, 15dB, 25dB, 35dB and 45dB) are selected for our experiments. For the evaluation of our proposed algorithm, we compare it with 1) bicubic interpolation, 2) the variational MFSR method with SAR prior, 3) the variational MFSR method with TV prior, 4) the nonstationary prior combination method [42] with the filter combination NF3, and 5) the l 1 norm based prior combination method [43] with the filter combination NF2. The bicubic interpolation algorithm is simple and easy to implement, but the results are relatively poor. The MFSR method with SAR prior has robust noise suppression effects, while the MFSR method with TV prior have strong edge preservation capabilities. The combined prior NF3 and combined prior NF2 are the best selection in the MFSR method [42] and the MFSR method [43], respectively, which are denoted as NS_NF3 and NL1_NF2 to make a distinction.
When quantifying the gain of these priors, the PSNR and SSIM are exploited as our indicators. With different prior information and SNR levels, the PSNR of four reconstructed images are shown in Fig. 5, and the SSIM of four reconstructed images are shown in Fig. 6. Both of them are obtained by all the MFSR algorithms we need to compare with. As shown in Fig. 5, the proposed DC-TV prior outperforms other comparative methods at five different SNR levels. And in Fig. 6, it can be observed that the proposed DC-TV prior creates distinct advantages on both noise suppression and edge structure preservation.
To demonstrate the performance in the case of low SNR, the experiment results of 5dB image SNR is shown in Fig. 7. Obviously, both the SAR and NS_NF3 prior algorithm performed better than the bicubic interpolation in Fig. 7, while the TV and NL1_NF2 prior algorithm are inferior to the bicubic interpolation. Specially, our proposed DC-TV prior algorithm significantly outperformed the others. It indicates that the SAR and NS_NF3 prior algorithm can effectively suppress noise and reconstruct images when noise becomes the main factor in image quality degradation. When the image SNR is 25dB, as shown in Fig. 8, the SAR, TV, NS_NF3, and NL1_NF2 prior algorithms achieved similar reconstruction effects, all of which are worse than the bilinear interpolation. In Fig. 8, the PSNR of SAR and NS_NF3 is worse than TV and NL1_NF2, while the SSIM of SAR and NS_NF3 is better than that of TV and NL1_NF2. Moreover, the performance of DC-TV prior algorithm obviously surpassed other algorithms.
When the image SNR is 45dB, as shown in Fig. 9, noise has less influence on the image quality. In this situation, most of the listed algorithms have achieved relatively good image reconstruction results, except for the bilinear interpolation. The performance of the DC-TV algorithm is still better than other comparative algorithms. It proved that the DC-TV algorithm could suppress noise without compromising the edge preserving capability of the TV prior.

B. EXPERIMENTS WITH REAL DATA
In addition to the aforementioned simulation experiments, two sets of real image sequences have been used to verify our proposed algorithm. These two sets were collected from http://www.soe.ucsc.edu/ milanfar/software/sr-datasets.html and are representative, widely used for experiments in many papers on the MFSR algorithms.
The first set contains 35 frames of real images about word, and the second set contains 20 ones about disk. Both of them approximately follow the global translational motion model. In our experiment, these two image sets were reconstructed with a magnification of 2, with the results shown in Fig. 10 and Fig. 11.
Since there are no original HR images corresponding to the real LR images, objective indicators such as PSNR and SSIM cannot be used for experimental evaluation anymore. Thus, subjective observation is employed to evaluate the reconstruction effect based on real data.
For the word image sequence in Fig. 10, the result of bicubic interpolation algorithm is too blurry to recognize. The image edges of the SAR prior are smooth but blurred. And that of the TV prior is seriously affected by the stair effects. In contract, the NS_NF3 algorithm can effectively suppress the stair effects in the smoothing area of the image, and the NL1_NF2 algorithm has clear edges with less stair effects. Moreover, the reconstructed image of our proposed algorithm has the best visual performance with clear edges and no stair effects.
Similarly, for the disk image sequence in Fig. 11, the reconstructed image of bicubic interpolation algorithm is still blurred, and the SAR prior is affected by artifacts. The results of both the TV and NL1_NF2 prior are affected by the stair effects. As a contrast, the NS_NF3 algorithm has the best noise suppression effect, but over-smooths the image which leads to the loss of edge information. As expected, the result  of our proposed algorithm has clear image edges, with less influence of artifacts and stair effects, and the visual effect is significantly better than other comparative algorithms.

V. CONCLUSION
In this paper, we introduced the DC prior to super-resolution. By analyzing the sparse characteristics of the image DC prior, this can be utilized to suppress the artifact and stair effects during the edges recovering. Based on this superiority, we proposed an MFSR algorithm by combining the DC and TV prior. The experiments with simulated and real images verified the effectiveness and robustness of our proposed algorithm. For different SNR levels, our algorithm have higher PSNR and SSIM values, especially in the case of low SNR. For real images, our proposed algorithm have the better visual effect. In the future work, we will optimize the algorithm to reduce its computational complexity, so that the DC prior can be extended to the hyperspectral image processing for better noise suppression.