A Novel De-Ghosting Image Fusion Technique for Multi-Exposure, Multi-Focus Images Using Guided Image Filtering

In this paper, a novel de-ghosting image fusion technique is presented, which enhances the quality of low dynamic range images using multi-level exposures taken with an ordinary camera and also removes ghosting artifacts. In the proposed algorithm, the source images, taken under different exposure settings, are first decomposed into base and detail layers using two-scale decomposition. The base and detail layers contain the large-scale and small-scale variations of the source images, respectively. The Laplacian-of-Gaussian filter is applied to the source images to obtain the edge information. Afterward, the saliency map of the edges is computed. To remove the ghosting artifacts, a weight matrix is calculated by applying the median filter to the histogram-equalized source images. The weight matrix is combined with the saliency map to generate more accurate weights. Separate weights for the base and detail layers are calculated using guided image filters. Finally, the base and detail layers are fused using their respective weights to generate a vivid and enhanced image without artifacts. The proposed technique is evaluated both qualitatively and quantitatively. A comparison with other state-of-the-art techniques in terms of Yang’s Metric ($Q_{Y}$), Quality Mutual Information ($Q_{MI}$), Gradient-based Fusion Metric ($Q_{G}$) and Chen Blum’s Metric ($Q_{CB}$) shows that the proposed technique outperforms existing techniques.


I. INTRODUCTION
The images captured by ordinary digital cameras do not contain the full detail of real-world scenes [1]. This is because the dynamic range of the real world is large, whereas the sensors deployed in ordinary cameras can capture only a small part of it [2]. The difference between the highest and lowest pixel values of an image is called its dynamic range. An image captured by an ordinary camera loses detail because underexposed areas appear dark due to low exposure, while other areas appear overly bright due to high exposure. Therefore, digital images do not look as realistic as the scene seen by the human eye, and much detail is lost. There are two approaches to overcome this significant difference between the High Dynamic Range (HDR) of the real world and the Low Dynamic Range (LDR) of digital images: the hardware-based approach and the software-based approach. In the hardware-based approach, cameras are equipped with sensors that have HDR imaging capabilities. However, the state-of-the-art CMOS or InGaAs sensors that can capture the HDR of the real world [3], [4] are not affordable for everyday users. The software-based approach, on the other hand, is far less expensive and can thus be considered a more practical solution.
The associate editor coordinating the review of this manuscript and approving it for publication was Inês Domingues.
In the software-based approach, multiple images taken by ordinary cameras under different exposure settings are fused to produce an HDR image that looks more realistic and contains more detail [5]. There are two image fusion methods: tone mapping (TM) and multi-exposure image fusion (MEF). The TM method consists of two essential steps: HDR reconstruction and tone mapping. As a first step, multiple LDR images of the same scene are combined to obtain an HDR image [6], [7]. However, the obtained image cannot be displayed directly on an ordinary camera or any other LDR device due to its high dynamic range. Therefore, tone mapping is applied to the HDR image to make it suitable for display. In contrast, the MEF technique is simple and does not require any processing before display, which makes it more suitable [8], [9]. We have implemented the MEF technique, in which multiple exposures of a scene are taken under different levels of brightness and the informative parts of each exposure are fused into a single resultant image.
Although MEF techniques are considered more reliable and efficient than their tone-mapping counterparts, they have their own limitations. In the case of static scenes, i.e., when there are no moving objects in the LDR images, MEF techniques work well. However, their results are not promising if the LDR images, taken under different exposure settings, contain moving objects. In such cases, the resultant HDR images contain shadows of the moving objects, called ghosting artifacts; removing these artifacts is addressed in this paper. There are other issues as well. For example, in practice, small displacements occur due to ripples or a handheld camera. However, this small camera movement can be handled either by using a tripod or by applying registration methods [10]-[13]; similarly, for object motion, many available methods require pixel-level and patch-level execution [14]-[18]. Therefore, camera movement is not considered in this research work.
The major contributions of this paper can be summarized as follows:
1) A single comprehensive method is developed that works on multi-focus, flash and multi-exposure images and also removes ghosting artifacts from the fused image.
2) The computational complexity of the image fusion process is reduced with the help of guided image filtering.
3) A novel method of initial weight construction is devised, using the Laplacian-of-Gaussian and a weight map that selects the maximum pixel among all images.
4) The resulting images are refined using the optimal radius and regularization parameter of the guided filter.
5) The proposed technique is evaluated through comparison with various other state-of-the-art techniques. The results show the superior performance of the proposed technique.
The rest of the paper is structured as follows. In Section II, we review the existing multi-exposure image fusion, tone mapping and de-ghosting methods. Section III describes the proposed fusion method in detail. In Section IV, we compare our experimental results with state-of-the-art methods and conduct further discussions. In Section V, we conclude the paper.

II. RELATED WORK
An important aim of digital photography is to reproduce the natural scene with good contrast, vibrant color and rich detail. Nevertheless, captured images are sometimes under-exposed or over-exposed due to poor lighting and the restricted dynamic range of imaging equipment. Such LDR images degrade the performance of numerous computer vision and image analysis algorithms. Thus, LDR-to-HDR enhancement is an important step towards improving the usefulness of captured images and making image detail richer and more visible. An extensive body of research on HDR imaging can be found in the literature, and a number of HDR imaging techniques have been proposed [19]-[24]. Nayar and Mitsunaga [11] proposed a method to produce an HDR image from different exposures after accounting for the global motion of the camera to register the images. It uses the Computed Response Function (CRF) to fuse multiple source images of different exposures into a single HDR radiance image. Although it is a good option for infrared detectors, this technique requires an error function to be defined for each pixel, which increases the computational complexity. Ward [10] proposed an alignment method that uses percentile threshold bitmaps to speed up image operations and to avoid the problems caused by the different exposure levels used in HDR photography. The cost of this method is linear in the number of pixels and is completely independent of the total translation. Global strategies are therefore very effective, but they are unreliable in the presence of independently moving objects while zooming or tiling. Chen et al. [13] address the transformation of multi-exposure images taken by an LDR device using key points (or feature points) obtained by applying the SIFT technique to the source images.
The content of the image does not affect its performance, and it also works well for under- and over-exposed images. However, this method fails when the camera motion is not centered and produces artifacts in the output image. Reinhard et al. [25] presented a technique to develop a tone reproduction operator with the help of local contrast measurement of the multi-exposure images and by constructing the luminance density of the HDR image. Their technique provides an effective way to compress the dynamic range while also reducing halo artifacts. Nevertheless, its quality is restricted by the circular surround. Kuang et al. [26] introduced an HDR image rendering model named iCAM06, which is based on a TM operator and works with the strained color image. Their architecture accounts for the rods in low-light conditions, enabling accurate simulation of the HDR scene as viewed by the user. However, this technique has difficulties with high-level luminance prediction and loses spatial acuity detail in darker areas. Shan [19] developed a TM operator that performs local linear modification over all source images using small overlapping windows. This technique helps to synthesize the resultant image from LDR images. A typical problem with TM methods such as this one is the halo artifact caused by contrast reversals.
MEF techniques provide an alternative way of producing informative and perceptually appealing HDR images, directly obtaining a fused image for LDR devices. The block-based MEF method of Goshtasby [20], also known as a region-based MEF technique, divides the image into non-overlapping blocks and selects the blocks that carry the most information. The optimal block size is determined by a gradient-ascent algorithm to obtain a highly informative fused image. If the color information within highlight areas is comparatively high, the scene information is preserved. However, if the block size is not sufficiently small, this technique may produce artifacts on scene boundaries. Wang and Zhang [27] presented a multi-exposure image fusion method that works on patch segmentation. They use super-pixel segmentation, after which each patch is decomposed into three components that are fused independently, based on human vision. Finally, a guided filter is used to optimize the results. Ma et al. [28] introduced an MEF method that works with low-resolution versions of the source images. They implemented a fully convolutional network, performed up-sampling through a guided filter, and obtained the fused image by applying a weighted average. This method works for static scenes. Zhu [29] presented an MEF method based on multi-modality, in which cartoon-texture image decomposition is used to preserve the structure of every source image. This technique preserves the details of the source images in the final resultant image. Mertens et al. [9] presented a persuasive weight-based multi-exposure image fusion method, which works at the pixel level by calculating and combining three quality measures, namely saturation, contrast and well-exposedness, for each exposure image. It is computationally efficient, and its blending is reliable in preventing seams as it incorporates object features.
However, the performance of these types of techniques can be unsatisfactory if the decomposition level is too small or too large. To overcome this issue, the proposed method uses a two-scale image decomposition technique. Song et al. [21] and Shen et al. [22] separately presented probabilistic model-based multi-exposure image fusion methods. The method in [22] works with two quality measures, contrast and color consistency, whereas [21] calculates the level of image luminance, after which the images are embedded by gradient. These techniques are suitable for preserving details when an HDR image has a very large contrast ratio. However, because of multiple iterations, they may over-smooth the resultant image. Gu et al. [23] fused the gradient field obtained from the structure tensor of the source images, based on multi-dimensional Riemannian geometry; the modified gradient field is then iteratively mean filtered twice and non-linearly compressed at multiple scales. No human interaction is needed during execution to reproduce the results. The multi-scale decomposition used in this technique slows down its performance; to overcome this issue, we use two-scale decomposition. Ocampo-Blandon and Gousseau [30] used the Poisson editing technique to handle the saturated parts of multi-exposure images, then used a reference image with many patches for fusion, and finally applied a non-local method to emphasize stacking. This technique is suitable when there is little difference in the saturated parts between the multi-exposure images, although it may produce artifacts on the edges of the fused image. Cai et al. [24] presented a different approach that operates on a single image, using an image contrast enhancement function whose results are comparable with multi-exposure image fusion. However, this method has limitations in the darkest and brightest areas of the image, and the large data-set is difficult to manage. It also does not consider moving objects.
Most current MEF methods assume that the scene is static across the captured exposures. However, when fusing images captured in dynamic scenes that involve moving objects, the above-mentioned methods can produce ghosting artifacts. Various solutions have been proposed. Sidibe et al. [31] introduced a pixel-level image fusion method that uses the pixel order relation to detect moving objects with high sensitivity. If there is no moving object at a pixel, then the pixel intensity is consistent with the exposure. After detecting the moving objects, it uses a static tool to obtain an artifact-free image, and it also works for small background motion. Zhang and Cham [15] detected the ghost region using a gradient-based approach. Two quality measures, consistency and visibility, are introduced, which are based on the change in the gradient across exposures. This technique has lower computational complexity and also works for flash and no-flash images. It may produce unsatisfactory results when the source images contain focused or unfocused images. Pece and Kautz [18] proposed a motion-region-based technique for removing ghosting artifacts called BMD (Bitmap Movement Detection). They first extract the contrast, well-exposedness and saturation of each source image and then detect moving objects by applying a median bitmap, which helps to impose a relation across the exposure images. However, it may produce some unrealistic effects in the fused image, as it ignores structural image information. An et al. [32] modified the weights of Mertens et al. [9] to present a patch-based correlation that reformulates the weight as a motion indicator using the photometric relation. The technique produces good texture detail and a fused image with enhanced color. However, it requires more computation and also produces artifacts in shadows.
Li and Kang [16] presented a method that collects three image features, namely brightness, local contrast and color dissimilarity, to construct the weight maps, which are then enhanced using recursive filters. To avoid moving objects in the fused image, this technique uses a motion recognition function, which selects a reference image using a median filter and obtains the histogram of the remaining static ones. However, it cannot preserve the details when several of the source images contain the brightest region.
Vanmali et al. [33] proposed an MEF method for dynamic scenes with four stages: weight map creation for the source images, detection of moving objects, weight map adjustment, and finally multi-exposure fusion via the adjusted weight maps. The method removes ghosting artifacts but is unable to preserve the texture detail of darker areas of the image and also loses color. To preserve texture detail and natural color, the proposed method uses the saliency map at the pixel level. Mertens et al. [9] came up with a method that uses contrast and saturation to fuse a multiple-exposure image sequence. This method gives good fusion efficiency, also for flash images, but it cannot handle object boundaries well, and some darker and brighter areas of the image lose detail. This issue is addressed by the method of Vonikakis et al. [34], which is based on illumination estimation: well-exposedness is calculated by a filtering method for illumination estimation, and the estimates are then combined with fuzzy membership functions to create the weight maps, from which a well-exposed fused image is generated. Ma et al. [14] presented a structural patch decomposition method in which each patch is decomposed into three separate components, namely signal structure, mean intensity and signal strength; each patch is fused independently and the desired patches are placed in the fused image. This method has a very low computational cost, as it does not require post-processing steps. It removes ghosting artifacts but loses detail in darker areas and produces blurred results. Lee et al. [35] worked with adaptive weights, using the relative pixel value and the global gradient to calculate a weight map for each exposure image and then obtaining the compared intensity between the global gradient and the source image. This method preserves low-luminance details better but often creates unwanted visual artifacts. Paul et al. [36] proposed an MEF method in the gradient domain for color images, which is based on the luminance of the source images, using the higher gradient magnitude at every intensity location and then obtaining the fused image via the Haar wavelet technique. However, this method does not work with flash images. The method proposed by Bavirisetti et al. [37] covers this issue by combining image information through multi-scale image decomposition, performing visual saliency detection on it, and finally constructing the weight map pixel by pixel at each scale. Where the fused image is excessively bright, the color saturation is normalized.
Many techniques exist to overcome the limitations of LDR images. Some of the above methods use a block-based technique that divides the image into non-overlapping blocks and selects the most informative ones. Such techniques preserve the scene information within highlighted areas and run faster. However, if the block size is not optimal, they produce artifacts on scene boundaries. Some pixel-level techniques use a weighted average, applying different filters and optimizing the weights for a better-quality image. The resultant fused images in such methods are vivid and reliable, as they incorporate object features. However, the performance of such techniques is not satisfactory when the level of decomposition is too high or too low. We use the pixel-level technique in this paper; to overcome the decomposition issue, a two-level decomposition method is implemented, and with the help of a guided filter it produces optimal results in good time.

III. PROPOSED METHOD
We propose a solution for multi-exposure and multi-focus images. The source images, taken under different exposure settings or with different focuses, are first decomposed into base and detail layers. Then, weights for highly structured and smooth areas, called the saliency map, are calculated using the Laplacian-of-Gaussian (LoG) filter. In the next step, which is especially well suited to de-ghosting, the color dissimilarity features are estimated. After being combined with the color dissimilarity features, the saliency map is refined using the guided image filter to obtain the weight maps associated with the base and detail layers. Finally, the base and detail layers, along with their respective weight maps, are fused to generate a more vivid, enhanced and detailed image without ghosting artifacts. The proposed technique is elaborated in the following subsections.

A. OVERVIEW OF GUIDED IMAGE FILTER
In previous years, edge-preserving filters [38], [39] have been a promising research topic in the image fusion field. A number of edge-preserving filters were used in early image fusion research, such as conventional weighted least squares [39], the bilateral filter [40], [41] and the joint bilateral filter [42]. However, these filters may suffer from the ''gradient reversal'' artifact, which occurs when a pixel has few similar pixels around it, so that the Gaussian weighted average becomes unstable. The Guided Filter (GF) proposed by He et al. [38] significantly improves computing time. The GF has a fast and non-approximate linear-time algorithm whose computational complexity is independent of the filtering kernel size [38]. The GF can be used in many other applications, such as image de-hazing [43], de-noising [44], removing snow/rain from images [45], soft matting and image enhancement.
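The statement that the cost is independent of the kernel size rests on the fact that a box (mean) filter, the building block of the GF, can be evaluated with a fixed number of operations per pixel. The sketch below is a minimal illustration in Python with NumPy, not the authors' implementation; the function name `box_mean` and the edge-replicating border handling are assumptions made here.

```python
import numpy as np

def box_mean(img, r):
    """Mean over a (2r+1) x (2r+1) window using an integral image.

    Borders are handled by edge replication (an assumption of this sketch).
    """
    img = np.asarray(img, dtype=np.float64)
    h, w = img.shape
    k = 2 * r + 1
    padded = np.pad(img, r, mode="edge")
    # Integral image with a leading row and column of zeros.
    ii = np.zeros((padded.shape[0] + 1, padded.shape[1] + 1))
    ii[1:, 1:] = padded.cumsum(axis=0).cumsum(axis=1)
    # Each window sum needs four lookups, whatever the radius is.
    total = (ii[k:k + h, k:k + w] - ii[:h, k:k + w]
             - ii[k:k + h, :w] + ii[:h, :w])
    return total / (k * k)
```

Because every output pixel is obtained from four entries of the integral image, increasing `r` changes the window that is averaged but not the amount of work per pixel.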

1) TWO-SCALE IMAGE DECOMPOSITION
Two-scale image decomposition is used to obtain the base and detail layers of the source images, as shown in Algorithm 1. The base layer is extracted by applying an average filter to each source image, which produces a smooth image. The detail layer, which represents the edges in the image, is obtained by subtracting the base layer from the source image. To extract the large-scale and small-scale variations in the source images [46], called the base ($\psi_b$) and detail ($\psi_d$) layers respectively, an average filter is applied as follows,
$$\psi_b^{q}(x, y) = \frac{1}{mn} \sum_{(s, t) \in S_{xy}} I_c^{q}(s, t), \qquad (1)$$
$$\psi_d^{q}(x, y) = I_c^{q}(x, y) - \psi_b^{q}(x, y), \qquad (2)$$
where $q \in [1, n]$, $c \in \{r, g, b\}$, and $S_{xy}$ represents the window of size $m \times n$ centered at $(x, y)$. In the proposed method, both $m$ and $n$ are chosen to be 31.
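The decomposition described above can be sketched in a few lines of Python with NumPy. This is an illustrative sketch rather than the paper's Algorithm 1: the function names, the H x W x 3 floating-point image layout and the edge-replicating border handling are assumptions; only the 31 x 31 window and the "base = average, detail = source minus base" rule come from the text.

```python
import numpy as np

def average_filter(channel, size=31):
    """Mean over a size x size window (size odd); borders are edge-replicated."""
    channel = np.asarray(channel, dtype=np.float64)
    h, w = channel.shape
    padded = np.pad(channel, size // 2, mode="edge")
    ii = np.zeros((padded.shape[0] + 1, padded.shape[1] + 1))
    ii[1:, 1:] = padded.cumsum(axis=0).cumsum(axis=1)
    total = (ii[size:size + h, size:size + w] - ii[:h, size:size + w]
             - ii[size:size + h, :w] + ii[:h, :w])
    return total / (size * size)

def two_scale_decompose(image, size=31):
    """Split an H x W x C image into a smooth base layer and a detail layer."""
    image = np.asarray(image, dtype=np.float64)
    base = np.stack([average_filter(image[..., c], size)
                     for c in range(image.shape[-1])], axis=-1)
    detail = image - base  # what the average filter removed
    return base, detail
```

By construction `base + detail` reproduces the source image exactly, so no information is lost before the weighting stage.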

2) INITIAL WEIGHT MAP CONSTRUCTION
In this subsection, the weights for edges are computed. First, a Gaussian filter is applied to obtain a smooth image so that artifacts are avoided. After that, a Laplacian filter is applied to the smooth image to extract the edges. As shown in Algorithm 2, the edges of the source images are taken as input and the weight maps are produced as output. The Remmet function normalizes the edge response to the range from zero to one. Each pixel value of an image is compared with the corresponding pixel of the other images; the maximum pixel value is set to 1 (the weighted pixel we are interested in) and the rest are set to 0. The Gaussian filter, with a kernel size of 5 × 5, is applied to the source images first because it removes noise and smooths the images.
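The procedure described in this subsection (Gaussian smoothing, Laplacian edge extraction, normalization, and a per-pixel maximum across the exposures) can be sketched as follows. This is a hedged illustration, not the paper's Algorithm 2. Several details are assumptions: the 4-neighbour Laplacian kernel, the use of the absolute Laplacian response, min-max scaling as a stand-in for the normalization step, edge-replicated borders, and all function names.

```python
import numpy as np

def convolve2d_same(img, kernel):
    """Plain 2-D convolution with edge-replicated borders, output same size."""
    img = np.asarray(img, dtype=np.float64)
    kh, kw = kernel.shape
    h, w = img.shape
    padded = np.pad(img, ((kh // 2, kh // 2), (kw // 2, kw // 2)), mode="edge")
    flipped = kernel[::-1, ::-1]
    out = np.zeros((h, w))
    for dy in range(kh):
        for dx in range(kw):
            out += flipped[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out

def initial_weight_maps(gray_images, sigma2=11.0):
    """Binary weight maps: 1 where an image has the strongest edge response."""
    ax = np.arange(-2, 3)                      # x, y in [-2, 2] -> 5 x 5 kernel
    xx, yy = np.meshgrid(ax, ax)
    gauss = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma2))
    gauss /= gauss.sum()
    # Assumed 3 x 3 Laplacian kernel (4-neighbour form).
    lap = np.array([[0, 1, 0], [1, -4, 1], [0, 1, 0]], dtype=np.float64)
    saliency = []
    for img in gray_images:
        smooth = convolve2d_same(img, gauss)
        edges = np.abs(convolve2d_same(smooth, lap))
        span = edges.max() - edges.min()
        # Min-max scaling to [0, 1]; a flat image yields all zeros.
        saliency.append((edges - edges.min()) / span if span > 0
                        else np.zeros_like(edges))
    winner = np.stack(saliency).argmax(axis=0)  # index of strongest image
    ids = np.arange(len(gray_images))[:, None, None]
    return (winner[None] == ids).astype(np.float64)
```

Each pixel receives weight 1 in exactly one exposure, so the maps sum to one at every location; ties are resolved in favour of the first image, which is a property of `argmax` rather than of the method.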

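The Gaussian smoothing is written as
$$G^{q}(x, y) = Ì^{q}(x, y) * f(x, y), \qquad (3)$$
where $Ì^{q}$ is the gray-scale version of $I_c^{q}$, and $f(x, y)$ is a low-pass Gaussian filter, which is defined as
$$f(x, y) = \frac{1}{2\pi\sigma^{2}} \exp\left(-\frac{x^{2} + y^{2}}{2\sigma^{2}}\right), \qquad (4)$$
where $x \in [-2, 2]$, $y \in [-2, 2]$, and $\sigma^{2} = 11$ is the smoothing parameter. Afterwards, the Laplacian filter of size 3 × 3 is applied to the smooth image $G^{q}(x, y)$ as follows,
$$L^{q}(x, y) = \nabla^{2} f * G^{q}(x, y), \qquad (5)$$
where $\nabla^{2} f$ is a symmetric 3 × 3 Laplacian kernel, which is used to localize the edges. The estimated weights give a good characterization at the coarse level. However, the weights are further refined by keeping, at each pixel, only the image with the maximum response,
$$W^{q}(x, y) = \begin{cases} 1, & L^{q}(x, y) = \max\left(L^{1}(x, y), \ldots, L^{n}(x, y)\right) \\ 0, & \text{otherwise.} \end{cases} \qquad (7)$$

Algorithm 2 (final steps)
    ...
    W_i(x, y) = mono
  end for
  return W^q

3) COLOR-DISSIMILARITY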
The images captured at different exposures and under different focus settings may contain moving objects. As a result, when these images are fused to make a more vivid and detailed image, the resultant image contains shadows of these moving objects called ghosting artifacts. Therefore, in devising an image fusion techniques, these moving objects should be removed. A number of techniques [47]- [50] have been proposed to cater this problem. However, the existing techniques have some limitations, such as time-consumption and require the users' input to select the reference image. To overcome all these shortcomings, a method called color dissimilarity [16] is utilized in this paper.
As shown in Algorithm 3, the reference image is obtained through the median filter, and the moving objects are acquired by subtracting it from the histogram-equalized images. This algorithm helps to remove ghosting artifacts. First, histogram equalization is applied to the source images $I_c^{q}$ to obtain the enhanced images $I_{c,h}^{q}$. Afterwards, to select the static background $I_c^{s}$, a median filter is applied to the source images separately. The moving objects appear less often than the intended objects; therefore, a median filter extracts the static background efficiently. The moving objects, in turn, are extracted by calculating the color dissimilarity, and the information on moving objects in the different color channels of the source images is then combined. The moving objects detected in (11) are further refined using erosion and dilation operations, where ⊕ and ⊖ represent the dilation and erosion operations, respectively. The disk-like structuring elements $s_1$ and $s_2$ are chosen with different radii, discussed in more detail in Section IV. The resultant $M^{q}$ contains the information on moving objects, and the process does not require the user's input in selecting the reference image.
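The steps of this subsection (histogram equalization, a median across the exposure stack to obtain the static background, and a dissimilarity measure against that background) can be sketched as follows. This is only an illustration under stated assumptions: the dissimilarity is taken here as the absolute difference, the channels are combined with a maximum, and the `threshold` value is a hypothetical choice; none of these are taken from the paper's equations. The erosion and dilation refinement is omitted.

```python
import numpy as np

def equalize(channel, bins=256):
    """Histogram-equalize one channel whose values lie in [0, 1]."""
    hist, edges = np.histogram(channel, bins=bins, range=(0.0, 1.0))
    cdf = hist.cumsum() / channel.size
    flat = np.interp(channel.ravel(), edges[1:], cdf)
    return flat.reshape(channel.shape)

def motion_maps(images, threshold=0.3):
    """images: (n, H, W, C) array in [0, 1]. Returns boolean maps (n, H, W)."""
    images = np.asarray(images, dtype=np.float64)
    eq = np.empty_like(images)
    for q in range(images.shape[0]):
        for c in range(images.shape[-1]):
            eq[q, ..., c] = equalize(images[q, ..., c])
    # Median across exposures: moving objects are outliers and drop out.
    static = np.median(eq, axis=0)
    dissimilarity = np.abs(eq - static)      # assumed dissimilarity measure
    combined = dissimilarity.max(axis=-1)    # assumed channel combination
    return combined > threshold
```

Histogram equalization is what makes differently exposed images comparable here: a monotone change in brightness leaves the equalized values almost unchanged, so only content that actually differs between exposures stands out against the median.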

4) WEIGHT ESTIMATION AND REFINEMENT
In this subsection, the weights related to edges and moving objects are merged in a way that keeps both types of information intact. The weights calculated in (7) and (12) contain information related to edges and moving objects, respectively.
In Algorithm 4, we use a guided filter, which helps to remove artifacts from the boundaries of the image. It preserves edge details using the guidance image, which is the input image.

Algorithm 4 Weight Map Optimization (final steps)
    ...
    mean_a = f_mean(a)
    mean_b = f_mean(b)
    R_i = mean_a × Ì_i + mean_b
  end for
  return R_B^q, R_D^q
  /* f_mean is the mean filter, ε is the regularization parameter and r is the radius */

To keep this information intact, the weights are combined into a single weight map $P^{q}$. The weight maps $P^{q}$ are noisy and not aligned with the edges, which causes artifacts in the final fused image. Spatial consistency [51], which means that if the color and brightness of two adjacent pixels are the same then their weights should be the same, is considered a possible solution. A well-known spatial consistency approach [52] calculates two terms, an energy function and a smoothness term, which contain the pixel saliencies and the edge-aligned weights, respectively. The weight function is then further optimized using a global minimization technique. However, spatial consistency techniques are considered inefficient. Therefore, in this paper, a better and more reliable method is used. To refine the weights, a guided filter is applied to each weight map $P^{q}$, with the corresponding input image $Ì^{q}$ working as the guidance image.
The weight maps for the base and detail layers are obtained with the parameter pairs (r1, ε1) and (r2, ε2), respectively, where r1 and r2 denote the local window radii and ε1 and ε2 the regularization parameters of the guided filter. The parameters are selected according to whether the weight map is calculated for the base or the detail layers. The motivation for the proposed weight construction technique is as follows. According to the GF [38], if the local variance at location (i, j) is small, the pixel lies in a flat area of the guidance image; the value of $a_k$ is then close to 0 and the filtering output equals $\bar{P}_x$, the average of the adjacent input pixels. If instead the local variance at location (i, j) is large, the pixel lies in an edge area of the image and $a_k$ is close to 1. In both cases, pixels with the same brightness and color receive the same weights, which is exactly the principle of spatial consistency. As stated before, if the base layer is spatially smooth, then its corresponding weights must also be spatially smooth; otherwise, artificial edges will appear. In contrast, sharp and edge-aligned weights are preferred for the detail layers, which can lose detail if the weights are over-smoothed. That is why a large filter size and a large blur degree are used to fuse the base layers, whereas a small filter size and a small blur degree are preferred for the detail layers.
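The behaviour of $a_k$ described above follows directly from the guided filter of He et al. [38], in which $a_k$ is the local covariance of guidance and input divided by the local variance of the guidance plus ε. The sketch below is a minimal gray-scale guided filter in Python with NumPy, given only to make that behaviour concrete; it is not the paper's Algorithm 4, and the helper `box_mean`, the border handling and the function names are assumptions.

```python
import numpy as np

def box_mean(img, r):
    """Mean over a (2r+1) x (2r+1) window, edge-replicated, via integral image."""
    img = np.asarray(img, dtype=np.float64)
    h, w = img.shape
    k = 2 * r + 1
    padded = np.pad(img, r, mode="edge")
    ii = np.zeros((padded.shape[0] + 1, padded.shape[1] + 1))
    ii[1:, 1:] = padded.cumsum(axis=0).cumsum(axis=1)
    total = (ii[k:k + h, k:k + w] - ii[:h, k:k + w]
             - ii[k:k + h, :w] + ii[:h, :w])
    return total / (k * k)

def guided_filter(guide, src, r, eps):
    """Filter `src` (e.g. a weight map) using `guide` as the guidance image."""
    guide = np.asarray(guide, dtype=np.float64)
    src = np.asarray(src, dtype=np.float64)
    mean_I = box_mean(guide, r)
    mean_p = box_mean(src, r)
    var_I = box_mean(guide * guide, r) - mean_I ** 2
    cov_Ip = box_mean(guide * src, r) - mean_I * mean_p
    a = cov_Ip / (var_I + eps)   # near 0 in flat areas, near 1 at strong edges
    b = mean_p - a * mean_I
    # Average the coefficients of all windows covering each pixel.
    return box_mean(a, r) * guide + box_mean(b, r)
```

With a flat guidance image the variance term vanishes, `a` goes to zero and the output reduces to a local average of the input, which is the smoothing wanted for base-layer weights. When the guidance varies strongly relative to `eps`, `a` approaches one and the output follows the guidance edges, which is what the detail-layer weights need. Concrete values of `r` and `eps` for the two cases are not stated in this section and are therefore left to the caller.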

5) TWO SCALE IMAGE RECONSTRUCTION
In this section, we present the two steps of image reconstruction. In the first step, a weighted averaging process fuses the base and detail layers of the different input images,
$$\bar{B} = \sum_{q=1}^{n} R_B^{q} \, \psi_b^{q}, \qquad \bar{D} = \sum_{q=1}^{n} R_D^{q} \, \psi_d^{q}.$$
In the second step, the fused base and detail layers are combined to obtain the output image.
As shown in Algorithm 5, the images are reconstructed by combining the base and detail layers into a weighted base layer and a weighted detail layer. The two layers are then added to obtain a fused image that preserves the details of the source images.

Algorithm 5 Image Reconstruction (final steps)
    ...
    O = B̄ + D̄
  return O
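The two reconstruction steps can be sketched as follows in Python with NumPy. This is an illustrative sketch, not the paper's Algorithm 5: the function name, the array layout and the per-pixel normalization of the weights (so that they sum to one across exposures) are assumptions made here to realize the "weighted averaging" described above.

```python
import numpy as np

def fuse_layers(bases, details, w_base, w_detail, tiny=1e-12):
    """Fuse n exposures.

    bases, details   : arrays of shape (n, H, W, C)
    w_base, w_detail : refined weight maps of shape (n, H, W)
    """
    # Assumed normalization: weights sum to one at every pixel.
    w_base = w_base / (w_base.sum(axis=0, keepdims=True) + tiny)
    w_detail = w_detail / (w_detail.sum(axis=0, keepdims=True) + tiny)
    # Step 1: weighted average of the base and of the detail layers.
    fused_base = (w_base[..., None] * bases).sum(axis=0)
    fused_detail = (w_detail[..., None] * details).sum(axis=0)
    # Step 2: add the two fused layers to obtain the output image.
    return fused_base + fused_detail
```

A quick sanity check of this design is that fusing identical exposures with any positive weights returns the original image, since the normalized weights cancel out.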

IV. EXPERIMENTAL RESULTS
In this section, the experiments and results that validate the performance of the proposed scheme are presented. In addition to the objective and subjective evaluation of the presented technique, it is compared with prominent state-of-the-art techniques. Since the proposed technique provides a solution for both static and dynamic scenes, the source images [24], [28], [53], [54] and the techniques for comparison [16], [55]-[58] are chosen accordingly. We compare our method with MEF and ghost-removal techniques that are specifically designed for dynamic or static scenes.

A. EXPERIMENTAL SETUP
To evaluate the performance of the proposed method, experiments are conducted on 20 different image data-sets [53]. The data-sets include sequences of images taken under different exposure settings, lighting conditions and focuses. The proposed method is compared with seven existing methods, i.e., [9], [14], [33]-[37], and aims to overcome their limitations both subjectively and objectively. In [33], the method removes ghosting artifacts but is unable to preserve the texture detail of the darker areas of the image or to produce vivid color. To preserve texture detail and natural color, our proposed method constructs the saliency map at the pixel level. The method proposed in [9] gives good fusion efficiency, also for flash images, but it cannot handle object boundaries, which our method addresses through weight optimization. In [34], the authors suggested reducing the pyramid layers, which can produce halos; we therefore propose to refine the weight maps using the weighted average filter. In [35], the suggested method preserves low-luminance details effectively but often creates unwanted visual artifacts. The method in [36] is based on the luminance of the source image, using the higher gradient magnitude at every intensity location. It works well for multi-focus images, but for multi-exposure images the results are unsatisfactory in darker areas, as can be seen in Figure 3f. The method in [37] works with multi-scale image decomposition and performs visual saliency detection on it. It usually smooths the high-frequency information of the image, and as a result it cannot detect the defocused or focused areas of the image. This can be examined further in Figure 11g, which shows another result of the same method with detection errors in the edge areas of the image. A guided image filter, however, can improve the quality of the resultant image by avoiding the block effect produced in the edge areas.
The proposed technique works for dynamic scenes, multi-exposure static scenes, multi-focus images and flash images.

B. SUBJECTIVE ASSESSMENT
The experiments are performed on 20 data sets comprising 96 source images; due to space constraints, we consider one or two data sets from each class for the subjective analysis. The proposed technique is compared with seven state-of-the-art methods. We start with a static scene to validate the results of the proposed method. Figure 2 shows the ''Mud-House'' sequence, in which some features that are visible in one exposure disappear in the others due to over- or underexposure. Therefore, the basic goal of composition is to preserve all features present in the exposure sequence and make them visible in one image. The experimental results of different methods on the ''Mud-House'' sequence are shown in Figure 3. Figure 3a, obtained by Vanmali et al.'s method, has an overall dark appearance inside the house, so the objects there are not visible. The results of Mertens's and S. Lee's methods (Figures 3b and 3e) preserve the detail to a great degree with better color saturation; however, closer observation reveals a shadow of the table caused by brightness. With V. Vonikakis's and Durga's methods, shown in Figures 3c and 3g respectively, details are lost due to the high brightness factor: in the middle of the fused image in Figure 3c and throughout the image in Figure 3g. Due to poor color saturation in the result of Kede's technique (Figure 3d), the visual quality of the fused image is degraded. Similarly, the method of Sujoy et al. (Figure 3f) works well in the brighter areas of the image but loses the detail inside the door because that area is dark. The result of the proposed method, shown in Figure 3h, preserves all details and has good visual quality. Figure 4 shows the image sequence of the ''Flash'' data set: a pair of images with flash light (Figure 4a) and without it (Figure 4b). The experimental results on the ''Flash'' data set are shown in Figure 5. As can be seen in Figure 5a (Vanmali et al.), the face area loses detail because the result is dark. The result of Mertens et al. (Figure 5b) is better than that of Vanmali et al., but due to the brightness on the leaves behind, the color in the face region is faded. In Figure 5c, colors are faded due to brightness on the left side of the image. In Figures 5d, 5f, and 5g, the white spot of the flash light appears, which should instead be removed in the fused image. The method in Figure 5e is applied equally to all areas of the image, but due to excessive brightness, the natural color quality is lost. Figure 5h shows the result of the proposed method, which overcomes all of the above shortcomings in the fused images. Figure 6 presents more results of the proposed method on sample sequences gathered from three sources [24], [28], [54]. Figure 6(a) shows five image exposures with low, medium, and high exposures, while Figure 6(b) shows the fused results of the exposures.
To validate the efficiency of the proposed method on dynamic scenes, we use the ''Forest'' data set, which has 4 different exposures of a dynamic scene with a moving man, as shown in Figure 7. The compared results suffer from severe ghosting artifacts. In contrast, the proposed method (Figure 8h) shows visually pleasing results with more information than the above results and is completely ghost-free. Figure 10 presents an example multi-focus color image data set named ''Children''. To evaluate multi-focus performance, input images with a combination of different focus settings are used. Figure 11 shows the fused images obtained from the multi-focus ''Children'' sequence. In the result of Vanmali et al. (Figure 11a), the portion of the image where the girl is standing is clear; however, the boy in the front has lost detail due to blurriness at his shirt collar. The result of Mertens et al. (Figure 11b) is better than that of Vanmali et al., but the area of the statue is blurred. In Figures 11c and 11d, the girl at the back with the statue loses detail because that area is blurred. In Figures 11e and 11f, the results are better to a great extent, but the white area of the statue is not clear. In Figure 11g, details are lost due to excessive brightness, which can be seen from the brightness of the boy's hair. In the result of the proposed method, shown in Figure 11h, the whole area at the front and back is in sharp focus and the visual quality is better.

C. OBJECTIVE ASSESSMENT
To assess the fusion results quantitatively, fusion metrics or visual quality techniques can be used. In most image processing methods, the resultant image can be judged by visual inspection, which is the best way of quality assessment [59]. There is no single standard evaluation metric that captures the quality of the resultant image, and no standard procedure for selecting a fusion metric for evaluation. Therefore, we apply several recent and traditional evaluation metrics: the structure-based metric $Q_{Y}$ [55], the mutual-information-based metric $Q_{MI}$ [56], the gradient-based metric $Q_{G}$ [58], and $Q_{CB}$ [57], [60], which is inspired by human perception and assesses how the fused image appears to a human observer.
1) $Q_{Y}$
The $Q_{Y}$ metric defined by Yang et al. [55] evaluates fusion on the basis of structural similarity. In its definition, the input images are denoted by A and B and the fused image by F; w denotes a local window, whose size is set to 7 × 7, and the function SSIM measures structural similarity.
The local weight $\lambda_{w}$ is computed from the variances of the input images A and B within the window w, denoted by $A_{w}$ and $B_{w}$. This metric measures how well the structural information of the input images is preserved. Table 1 shows the $Q_{Y}$ comparison of [9], [14], [33]-[37] with the proposed method on 4 different source image sequences. Values in bold are the highest scores and indicate better quality.
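For reference, Yang's metric is commonly written in the literature in the piecewise form below, which uses the same symbols as the description above (A, B, F, w, $\lambda_{w}$, SSIM). The threshold of 0.75 and the notation $s(\cdot)$ for the local variance are assumptions taken from the usual statement of the metric, not from this paper.

```latex
Q_{Y}(w) =
\begin{cases}
\lambda_{w}\,\mathrm{SSIM}(A, F \mid w) + (1 - \lambda_{w})\,\mathrm{SSIM}(B, F \mid w),
  & \mathrm{SSIM}(A, B \mid w) \ge 0.75,\\[4pt]
\max\{\mathrm{SSIM}(A, F \mid w),\ \mathrm{SSIM}(B, F \mid w)\},
  & \mathrm{SSIM}(A, B \mid w) < 0.75,
\end{cases}
\qquad
\lambda_{w} = \frac{s(A_{w})}{s(A_{w}) + s(B_{w})}
```

The overall score is then typically the average of $Q_{Y}(w)$ over all windows, so a value closer to 1 indicates that more structural information from the inputs is retained.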

2) $Q_{MI}$
The Quality Mutual Information ($Q_{MI}$) metric is based on information theory. The traditional MI metric [61] is unstable and can bias the measure toward the source image with the maximum entropy. The metric is therefore normalized by Turitsyna and Webb et al. [56]. In the normalized definition, the joint entropy between the fused image F and the input image A is denoted by H(A, F), and MI(B, F) is calculated in the same way as MI(A, F). This metric measures how much of the detail of the input images is retained in the fused image. Table 4 indicates that the proposed method performs well for static and dynamic scene images, as well as for multi-focus and flash images. It consistently yields better output than the other analyzed state-of-the-art methods.
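As an illustration, the normalized metric is usually stated as $Q_{MI} = 2\,[\,MI(A,F)/(H(A)+H(F)) + MI(B,F)/(H(B)+H(F))\,]$ with $MI(A,F) = H(A) + H(F) - H(A,F)$. The following is a minimal sketch of that form, assuming images are given as flattened sequences of integer gray levels; the function names are hypothetical, and it is not the evaluation code used for the reported tables.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy in bits of a sequence of hashable values."""
    total = len(values)
    counts = Counter(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def normalized_mi_term(src, fused):
    """MI(src, fused) divided by H(src) + H(fused); 0.0 if both are constant."""
    h_src = entropy(src)
    h_fused = entropy(fused)
    # Joint entropy H(src, fused) from the co-occurring gray-level pairs.
    h_joint = entropy(list(zip(src, fused)))
    mi = h_src + h_fused - h_joint
    denom = h_src + h_fused
    return mi / denom if denom > 0 else 0.0


def q_mi(a, b, f):
    """Normalized mutual-information fusion score for inputs a, b and fused f."""
    return 2.0 * (normalized_mi_term(a, f) + normalized_mi_term(b, f))
```

With this normalization, a fused image identical to both inputs scores 2.0, and a fused image statistically independent of both inputs scores 0.0. The sketch avoids external dependencies for clarity; a histogram-based array implementation would be preferable for full-size images.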

3) $Q_{CB}$
In this metric, Chen and Blum [57] employ a contrast sensitivity function. The filter is applied to the input and fused images, after which a local contrast map is computed for each image. The relationship between the source and fused images is then described by a preservation map. Finally, the overall quality is obtained using a saliency map. Table 2 shows the results, in which a larger value represents better contrast. According to this metric, the proposed method is superior to the other compared methods.

4) $Q_{G}$
This metric was presented by Xydeas and Petrovic [58]; its basic concept is to measure how much of the edge information of the source images is transferred to the fused image. It uses the Sobel operator to calculate the strength and orientation of the gradient at each pixel of the resultant image.
Table 3 shows the resulting statistics. According to the $Q_{G}$ results, the proposed fusion method delivers better performance in preserving edge details.
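The gradient strength and orientation that the metric builds on can be sketched as follows. This is only the Sobel step, not the full $Q_{G}$ computation; it assumes a grayscale image stored as a list of rows, leaves border pixels at zero, and uses `atan2` for the orientation, all of which are illustrative choices rather than details taken from [58].

```python
import math

# 3x3 Sobel kernels for horizontal (x) and vertical (y) intensity changes.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_gradient(img):
    """Return (strength, orientation) maps; border pixels are left at 0.0."""
    height, width = len(img), len(img[0])
    strength = [[0.0] * width for _ in range(height)]
    orientation = [[0.0] * width for _ in range(height)]
    for i in range(1, height - 1):
        for j in range(1, width - 1):
            # Correlate the 3x3 neighbourhood with each kernel.
            gx = sum(SOBEL_X[u][v] * img[i + u - 1][j + v - 1]
                     for u in range(3) for v in range(3))
            gy = sum(SOBEL_Y[u][v] * img[i + u - 1][j + v - 1]
                     for u in range(3) for v in range(3))
            strength[i][j] = math.hypot(gx, gy)     # gradient magnitude
            orientation[i][j] = math.atan2(gy, gx)  # gradient angle in radians
    return strength, orientation
```

For a vertical step edge, the strength peaks along the edge and the orientation is 0 radians, i.e., the gradient points across the edge.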

D. ANALYSIS OF FREE PARAMETER
In this section, results are obtained for different settings of the radius parameters r1 and r2 in order to select the parameter values that produce the best fused image. Each combination of parameters is analyzed both quantitatively and qualitatively. For the quantitative analysis, fusion quality is evaluated using the quality metrics described above. Table 5 shows the results for the different parameter settings; bold numbers represent the best results.
For the qualitative analysis, Figure 12 shows that the brightness of the image changes with every combination of parameters; the selected parameters give better-equalized brightness, as shown in Figure 12c. Table 6 lists the parameters used for the guided filter.

E. EXECUTION TIME COMPARISON
Besides fusion quality, computational cost is another important consideration for real applications. Table 7 presents a comparison of the average computational time with state-of-the-art methods on the static ''Mud-House'' sequence. All evaluations were run on a 2.3 GHz CPU with 4 GB of RAM using MATLAB R2017b.
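An average running time of this kind can be measured with a simple harness such as the one below. It is a generic Python sketch, not the MATLAB procedure used for Table 7; the function name `average_runtime` and the default of five repetitions are assumptions for illustration.

```python
import time


def average_runtime(func, *args, repeats=5):
    """Mean wall-clock time in seconds of func(*args) over `repeats` runs."""
    total = 0.0
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        total += time.perf_counter() - start
    return total / repeats
```

Averaging over several runs reduces the influence of transient system load on the reported time; each compared method would be passed in as `func` with the same input sequence.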

V. CONCLUSION AND FUTURE WORK
In this paper, we have presented a novel multi-exposure fusion method based on a guided filter and color dissimilarity, which allows it to generate weight functions comprehensively and efficiently. First, we use the guided image filter, a simple and reliable filtering technique, for its edge-preserving and smoothing properties. Second, we combine histogram equalization and the median filter to remove moving objects from the fused image.
The results of various experiments show that the proposed method is effective in preserving the details of the source images, fusing multi-exposure images, and eliminating moving objects. The proposed technique can be useful for various real-world applications, such as machine vision for object detection and medical imaging. In future work, we will extend the de-ghosting method to remove objects that move across multiple locations and to retain more information from the desired area of an image. We also aim to make the method faster so that, given its low computational complexity, it can be readily deployed on mobile devices.