An Enhanced Image Fusion Algorithm by Combined Histogram Equalization and Fast Gray Level Grouping Using Multi-Scale Decomposition and Gray-PCA

Image enhancement is a challenging task in image analysis, and it is particularly challenging in image fusion. Image fusion is the process of combining multiple images to produce a single high-quality output free of contrast variation, blurring, and noise. Many image fusion algorithms have been implemented, but their fused images suffer from variations in background contrast, uneven illumination, blurring, and noise. To overcome these issues, this paper proposes a new image fusion method that improves image contrast while preserving image details. The method combines histogram equalization with fast gray-level grouping to handle the problems mentioned, and it improves the overall fusion strategy with a principal component analysis technique that converts RGB images into a high-contrast gray-scale image as the final output. We have carried out experiments on common databases used by other researchers. The proposed method gives good subjective and objective performance compared with other state-of-the-art methods, and it can be used in different monitoring applications.


I. INTRODUCTION
Image fusion is the process of merging multiple input images into one output image that is of higher quality and more informative for human visual perception, robots, and other processing tasks than any of the input images [1]-[3]. In recent years, image fusion has become a promising research area and has attracted considerable interest in applications such as computer vision, face detection and recognition, medical diagnosis, and surveillance [1], [4]-[6].
The associate editor coordinating the review of this manuscript and approving it for publication was Hao Ji.
VOLUME 8, 2020. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
Images of different types, such as computed tomography (CT), magnetic resonance imaging (MRI), visible, and infrared images, as well as images taken by the same camera with different focal lengths, are suitable for fusion [7]. A single modality cannot capture enough information because of system limitations; for example, a camera with a limited focal length cannot keep all objects in focus. Combining images of the same scene taken with different focal lengths is known as multi-focus image fusion [8]-[11]. Similarly, CT and MRI images are used to diagnose many medical conditions such as strokes and tumors. However, CT images provide information only about bones, whereas MRI images give information about soft tissues, so a single-modality medical image does not provide sufficient information. Image fusion is therefore used to merge CT and MRI images to obtain information on both soft tissues and bones, which is known as medical image fusion [12]-[15]. Visible images (VI) capture image details when there is enough illumination, but their contrast is poor when brightness is insufficient. Infrared images (IR), on the other hand, cannot capture the real details of a scene because they reflect only its gray-scale [16], [17]. Consequently, image fusion is used to obtain complementary information from both images, which can be used in surveillance applications [1]-[3]. Image fusion is performed at three distinct processing levels: pixel level, feature level, and decision level [5], [18]-[21]. Pixel-level fusion is used by many researchers in various applications. It merges the pixels of the input images directly to obtain the final output image [1], [3].
Some researchers have also paid attention to feature-level image fusion, which deals with high-level processing tasks. It extracts features from the images and then fuses them using advanced fusion schemes such as region-based fusion [1], [2]. The purpose of feature-level fusion is to acquire the desired features from the input images instead of all features [1], [2]. The decision level is the highest of the three processing levels. It extracts all information from the images, and decisions are then taken according to specific criteria to fuse the extracted information [1]. This type of processing is widely used in biometrics, fingerprint verification, and face recognition.
The main goal of any image fusion algorithm is to increase the contrast of the source images so that most of the useful information is preserved without producing artifacts, and to apply a fusion strategy that is robust to adverse conditions. Although many image fusion algorithms have been designed, their fused images still suffer from variations in background contrast, uneven illumination, blurring, and noise. To overcome these shortcomings, this paper presents an enhanced image fusion algorithm that addresses these issues and achieves better results than existing state-of-the-art image fusion techniques.
The main contributions of this work are as follows:
i) An amalgamated histogram equalization and fast gray-level grouping (HEFGLG) technique is proposed that automatically enhances the contrast of an image. This method is computationally efficient and also produces better results.
ii) The non-subsampled contourlet transform (NSCT) is applied to the images obtained from HEFGLG. This decomposition method can efficiently capture the true geometrical structure of an image. Moreover, it is shift-invariant and fast to compute.
iii) A local-energy-based fusion strategy, which chooses between averaging and selection modes, is applied to the low-frequency images to preserve the energy information. A mean-gradient fusion strategy is applied to the high-frequency images to maintain the textures, edges, boundaries, and smooth contours.
iv) A principal component analysis (PCA) based multichannel color to single-channel gray conversion technique is implemented for dimension reduction and robust operation. This technique effectively preserves the discriminability between color and textures in the resultant image.
The rest of the paper is organized as follows. Section II reviews the literature on recent image fusion algorithms. Section III elaborates the proposed work. Experimental results and the evaluation of the algorithm are presented in Section IV, and the conclusion is stated in Section V.

II. THE RELATED WORK
Image fusion is a popular research area that has made remarkable progress in the last few decades. This section reviews recent state-of-the-art image fusion techniques.
PCA is applied in [22]; it reduces the dimensionality of the data and preserves energy information in the final fused image [18]. Nevertheless, this method suffers from spectral degradation, and as a result it cannot capture smooth edges, textures, and contours in the image. The discrete wavelet transform (DWT) was proposed by Yon et al. [23]; it preserves high-frequency information with fast computation, but it lacks shift invariance, which introduces artifacts and noise. Moreover, it has limited directional information and cannot capture essential features such as contours and edges. Mane et al. [13] introduced a hybrid of DWT and PCA to capture spatial as well as spectral information. The input images are decomposed by DWT, and PCA is then applied to the decomposed images. Although this method produces better fused results than DWT or PCA alone, the output image still has limited directional information. Moreover, this combined technique is shift variant, which introduces artifacts in the fused image, and it cannot capture smooth contours and edges. To address these issues, the contourlet transform is used in [19], [24]. It can capture smooth edges and contours and gives multi-directional information. However, this method uses down-sampling, which makes it shift variant and leads to the Gibbs effect, degrading the output image. The NSCT is discussed in [25]; it is fully shift-invariant and has flexible frequency selectivity with fast implementation, which addresses the above issues. However, poor illumination and blurring degrade the fused image, so it cannot preserve the detailed information of the input images. An edge-preserving-filter-based image fusion method is proposed by Tian et al. [4], combining a median-average filter, the discrete stationary wavelet transform (DSWT), and PCA. First, the median-average filter removes noise from each pixel.
Then, DSWT is applied to decompose the images, and finally PCA is used for data reduction. This method achieves better results, but the fused image still suffers from limited directional information and cannot capture smooth edges and contours. A hybrid of the non-subsampled shearlet transform (NSST), spatial frequency (SF), and a pulse coupled neural network (PCNN) is implemented in [26]. The NSST is used to decompose both images, and SF-PCNN is applied to the sub-band coefficients of the source images. This algorithm addresses shift variance and has better frequency selectivity. However, the decomposed low-frequency image is affected by poor contrast and sharpness, so the image loses some energy information, which affects the overall quality of the output image [27]. An improved morphology-hat transform (MT) based image fusion algorithm using the contourlet transform (CT) and PCA was designed by He Li et al. in [28]. This algorithm increases the brightness of the fused image, preserves the energy information, and captures smooth contours. However, the contourlet transform is shift variant because it uses down-sampling, and thus the Gibbs effect is introduced. In [29], the author has implemented a fully convolutional network for multi-focus image fusion. In this work, the pooling layers are changed into convolutional layers, so that all layers are convolutional. A better visual effect is obtained by this method. However, it is very time consuming, and it is hard to design the network structure and train on the data sets. Shuaiqi et al. [10] designed an image fusion algorithm by combining NSST and a residual network (ResNet). It first decomposes the input images by NSST; ResNet is then applied to the low-frequency images, and an enhanced gradient sum of Laplacian energy is applied to the high-frequency images. The fused image obtained by this method contains clearer information and better details.
However, this algorithm is time consuming. The author in [30] applies latent low-rank representation (LatLRR) to fuse infrared and visible images. However, artifacts are introduced in the edge areas due to the lack of spatial consistency in this method.

III. THE PROPOSED WORK
The existing image fusion algorithms have their pros and cons, and their strengths should be combined to enhance the quality of the fused image. There is a need for an image fusion algorithm that automatically adjusts the contrast of the images so that it preserves energy information, smooth edges, contours, and sufficient complementary information without introducing artifacts and noise. In addition, the fusion strategy should be robust to adverse conditions while still producing a high-quality fused image. Moreover, a robust data reduction technique should allow the fused image to preserve the discriminability between colors and textures. This paper presents an enhanced image fusion algorithm that meets these requirements and achieves better results than existing image fusion methods.
The proposed method includes several sequential stages, as depicted in Figure 1. The main goal of each stage is to enhance the quality of an image without introducing artifacts. Each stage is explained in detail in the following subsections.

A. COMBINED HEFGLG
It is challenging to enhance low-contrast images, whether natural, multi-focus, infrared and visible, or medical, whose gray-level intensity is very high at one position and very small in other parts of the image. To overcome this challenge, HEFGLG is used in this paper. It divides the histogram of a low-contrast image into two histograms according to the position of the highest-amplitude histogram component. Histogram equalization (HE) is used to increase contrast by equalizing the left segment of the histogram, and FGLG is applied to the histogram components on the right side [31].
The gray-level grouping (GLG) technique [32] first groups the histogram components of a low-contrast image into the desired number of groups based on predefined criteria. It then redistributes these groups of histogram components uniformly so that each group occupies a gray-scale segment of the same size as the other groups. Finally, all of the previously grouped gray levels are ungrouped. This whole process is time-consuming, increases complexity, and also produces a washed-out effect that degrades image quality. This paper employs FGLG, which uses a default value for the number of gray-level bins [33]. This makes it fully automated, as it requires neither the construction of the transformation function nor the calculation of the average distance between two pixels for each set of gray levels. We set 20 as the default number of gray-level bins, as can be seen on the right side of Figure 2, which reduces the time and the number of iterations. This not only makes the method computationally efficient but also produces better results. Figure 2 shows the schematic diagram of HEFGLG.
The histogram of an input image is a discrete function $h(i_k) = n_k$ with intensity levels in the range $[0, L-1]$, where $i_k$ is the $k$th intensity level and $n_k$ is the number of pixels in the input image with intensity $i_k$. The histogram is normalized by dividing all of its components by the total number of pixels $mn$ in the image, where $m$ and $n$ are the numbers of rows and columns. The normalized histogram is therefore $P(i_k) = n_k / mn$, $k = 0, 1, 2, \ldots, L-1$, where $P(i_k)$ is the probability of occurrence of $i_k$ in the source image. After the histogram is calculated for the source images, the procedure of HEFGLG is as follows:
(1) Find the position $P_{hist}$ of the highest-amplitude histogram component on the gray scale of the source image. If $P_{hist}$ lies on the left side but not in the first of the nonzero histogram components (NZHC), the original histogram is segmented into two histograms, the first from $0$ to $P_{hist}-1$ and the other from $P_{hist}$ to $L-1$. If $P_{hist}$ lies either in the right part or in the first component of the NZHC, the low contrast of the input image is simply improved using FGLG.
(2) Apply HE to the first histogram segment ($0$ to $P_{hist}-1$), and use FGLG for the other histogram segment ($P_{hist}$ to $L-1$).
(4) The piecewise transformation function is therefore calculated as: The piecewise transformation function is applied for reconstruction to obtain an enhanced, high-contrast image.
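The histogram steps of this procedure can be sketched in Python as follows. This is a minimal illustration, assuming an 8-bit gray image stored as a NumPy array. The function names are hypothetical, FGLG itself is not implemented, and mapping the equalized levels back into the same segment is one plausible choice rather than the transformation function of the paper.

```python
import numpy as np

def normalized_histogram(img, levels=256):
    """Return h(i_k) = n_k and P(i_k) = n_k / (m * n) for an integer image."""
    hist = np.bincount(img.ravel(), minlength=levels).astype(np.float64)
    return hist, hist / img.size

def split_position(hist):
    """Position P_hist of the highest-amplitude histogram component."""
    return int(np.argmax(hist))

def equalize_segment(img, lo, hi):
    """Histogram-equalize only the gray levels in [lo, hi].

    Assumption: the equalized levels are mapped back into [lo, hi],
    so pixels outside the segment are left untouched.
    """
    out = img.copy()
    mask = (img >= lo) & (img <= hi)
    if hi == lo or not mask.any():
        return out
    seg_hist = np.bincount(img[mask] - lo, minlength=hi - lo + 1)
    cdf = np.cumsum(seg_hist) / seg_hist.sum()
    mapping = np.round(lo + cdf * (hi - lo)).astype(img.dtype)
    out[mask] = mapping[img[mask] - lo]
    return out

def he_on_left_segment(img, levels=256):
    """Apply HE to [0, P_hist - 1]; the right segment is left for FGLG (not shown)."""
    hist, _ = normalized_histogram(img, levels)
    p_hist = split_position(hist)
    if p_hist == 0:
        return img.copy(), p_hist
    return equalize_segment(img, 0, p_hist - 1), p_hist
```

Here `he_on_left_segment` returns both the partially enhanced image and $P_{hist}$, so that a separate FGLG routine could process the levels from $P_{hist}$ to $L-1$.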

B. NSCT
The NSCT is a multiscale and multi-directional image decomposition method that combines non-subsampled pyramid filter banks (NSPFB) and non-subsampled directional filter banks (NSDFB) [34]. This decomposition can efficiently capture the true geometrical structure of an image, such as edges and contours, and it is fully shift-invariant.
The NSPFB decomposes the image by using two-channel non-subsampled 2D filter banks by avoiding down-sampling or up-sampling. Further, NSDFB is used to split band-pass sub-bands into different directions, as depicted in Figure 3.
The perfect reconstruction for NSPFB is obtained by: where $H_1(z) = 1 - H_0(z)$; $G_0(z)$ and $G_1(z)$ are low-pass and band-pass filters. The non-subsampled pyramids are built by iterating the non-subsampled filter banks to obtain a multiscale decomposition. All filters are up-sampled by 2 for the next level, so they still satisfy the perfect reconstruction condition. The equivalent filter for the $k$-th level of the cascaded non-subsampled pyramid is expressed as: Here $z^j$ stands for $[z_1^j, z_2^j]$. After decomposition by the NSPFB, the NSDFB, which is also shift-invariant [35], is applied. It is likewise an iterated two-channel non-subsampled filter bank, and it achieves better directional decomposition. The same process is repeated for the desired higher decomposition levels.
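The idea of a pyramid without down-sampling can be illustrated with a much simpler stand-in. The sketch below is not the NSCT filter bank: it assumes a separable $[1, 2, 1]/4$ low-pass kernel with circular borders, and it up-samples the kernel by inserting holes at each level (the "à trous" scheme). It shows only the two properties the text relies on, namely same-size sub-bands with perfect reconstruction, and shift invariance. All names are hypothetical.

```python
import numpy as np

def smooth_with_holes(img, level):
    """Separable [1, 2, 1]/4 low-pass filter whose taps are spaced
    2**level apart (circular borders)."""
    step = 2 ** level
    out = img
    for axis in (0, 1):
        out = (0.25 * np.roll(out, step, axis=axis)
               + 0.5 * out
               + 0.25 * np.roll(out, -step, axis=axis))
    return out

def nonsubsampled_pyramid(img, levels):
    """Return (band_pass_list, low_pass); every sub-band keeps the input size."""
    approx = img.astype(np.float64)
    bands = []
    for level in range(levels):
        smoother = smooth_with_holes(approx, level)
        bands.append(approx - smoother)   # detail lost by smoothing at this scale
        approx = smoother
    return bands, approx

def reconstruct(bands, low_pass):
    """Perfect reconstruction: the band-pass images and the low-pass image sum to the input."""
    return low_pass + sum(bands)
```

Because nothing is down-sampled, circularly shifting the input shifts every sub-band by the same amount, which is the shift-invariance property that down-sampled transforms lack.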

C. FUSION STRATEGY FOR LOW-FREQUENCY AND HIGH-FREQUENCY IMAGES
The quality of the final fused image is directly affected by the fusion strategy and the way it is applied to the images. At present, a weighted average fusion strategy is commonly used, but it cannot capture useful features such as energy information, textures, edges, and contours in the fused image.
In this paper, a local-energy-based fusion strategy is applied to the low-frequency images and a mean-gradient-based fusion strategy to the high-frequency images. The accuracy of fusing the low-pass images is further improved with a silence factor, which determines whether selection or averaging mode is applied. Meanwhile, the mean-gradient-based fusion strategy is applied to the high-pass images, where it can efficiently capture smooth edges, contours, and boundaries.

1) FUSION STRATEGY FOR LOW-FREQUENCY IMAGES
The low-frequency images mainly contain the energy information, and it is vital to preserve the contrast details of the input images. The simplest way is to apply an averaging fusion technique, but it cannot produce high-quality images. Therefore, a local-energy-based fusion strategy is applied to the NSCT coefficients, and two distinct combination modes, averaging and selection, are used to enhance the fusion accuracy.
First, the local energy $E_l(x, y)$ is computed over a window centered on the current coefficient in the coarse sub-band $C_J$, which is given as: Here $(x, y)$ represents the position of the current non-subsampled contourlet coefficient and $W_L$ is a template of size $3 \times 3$.
The silence factor $S_J$ is computed to decide whether averaging or selection mode is used in the fusion process. The silence factor $S_J$ is given by: Here $C_J^X(x, y)$, $X = A, B$, represents the low-pass non-subsampled contourlet coefficients of input image A or B, respectively.
The silence factor $S_J$ reflects the similarity between the low-pass sub-bands of the two input images. The value of $S_J^{AB}$ is compared with a threshold $T$. If $S_J^{AB} > T$, the input coefficients $c_J^A$ and $c_J^B$ are very similar, so averaging mode is applied and information is taken from both input images. Its equation is computed by: $c_J^F$ denotes the final fused coefficient at position $(x, y)$, and $\beta_A$ and $\beta_B$ are weights.
Here $\beta_{min} \in (0, 1)$ and $\beta_{min} + \beta_{max} = 1$. On the other hand, if $S_J^{AB} \le T$, the input coefficients $c_J^A$ and $c_J^B$ are dissimilar, and selection mode is applied. The coefficients with larger energy are selected, while the coefficients with smaller energy are discarded. The selection mode is computed by: Here $E_l^A(x, y)$ and $E_l^B(x, y)$ represent the local energy of images A and B.
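The averaging/selection rule can be sketched as follows. Since the explicit formulas are not reproduced here, the sketch makes several assumptions that should be checked against the original equations: the template $W_L$ is taken as a normalized $3 \times 3$ binomial window, the similarity measure is the normalized local cross-energy $2\sum W_L c^A c^B / (E_l^A + E_l^B)$, the weight is $\beta_{min} = \frac{1}{2} - \frac{1}{2}\frac{1 - S}{1 - T}$ with the larger weight given to the higher-energy coefficient, and $T = 0.75$. All function names are hypothetical.

```python
import numpy as np

# Assumed 3x3 template W_L (weights sum to 1); the text only states its size.
W_L = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]], dtype=np.float64) / 16.0

def local_sum(values, window=W_L):
    """Weighted sum of `values` over the 3x3 neighbourhood of every position."""
    padded = np.pad(values, 1, mode="edge")
    rows, cols = values.shape
    out = np.zeros((rows, cols), dtype=np.float64)
    for dy in range(3):
        for dx in range(3):
            out += window[dy, dx] * padded[dy:dy + rows, dx:dx + cols]
    return out

def fuse_low(c_a, c_b, threshold=0.75, eps=1e-12):
    """Fuse two low-pass sub-bands by averaging (similar) or selection (dissimilar)."""
    e_a = local_sum(c_a * c_a)                       # local energy of A
    e_b = local_sum(c_b * c_b)                       # local energy of B
    match = 2.0 * local_sum(c_a * c_b) / (e_a + e_b + eps)   # assumed similarity
    beta_min = 0.5 - 0.5 * (1.0 - match) / (1.0 - threshold)
    beta_max = 1.0 - beta_min
    a_stronger = e_a >= e_b
    strong = np.where(a_stronger, c_a, c_b)          # higher-energy coefficient
    weak = np.where(a_stronger, c_b, c_a)
    averaged = beta_max * strong + beta_min * weak   # averaging mode
    return np.where(match > threshold, averaged, strong)   # else selection mode
```

With identical inputs the similarity is close to 1, both weights are close to 0.5, and the fused sub-band equals the input; with one empty input the rule falls back to selecting the other.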

2) FUSION STRATEGY FOR HIGH-FREQUENCY IMAGES
The high-frequency images contain the textures, edges, contours, and object boundaries of the original image. This paper uses a mean-gradient fusion scheme for them.
The mean gradient ($G$) of a region ($R$) is computed first by: The size of $R$ is $M \times N$ ($M$ and $N$ are odd, $M \ge 3$, $N \ge 3$), and $I_x$ and $I_y$ are the first-order differences of $f(x, y)$ in the $x$ and $y$ directions.
After that, the fusion coefficient ($\beta$) of the two images is obtained from the mean gradients of both input images, which is given by: where $\beta_A$ and $\beta_B$ correspond to the mean gradients of the two images. Finally, the fused coefficients for high-frequency images are computed by:
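A minimal sketch of a mean-gradient weighting is given below. It rests on assumptions that are not confirmed by the text: the mean gradient is taken as the average of $\sqrt{(I_x^2 + I_y^2)/2}$, the region $R$ is the whole sub-band rather than a sliding $M \times N$ window, and the weights are the normalized mean gradients $\beta_A = G_A / (G_A + G_B)$ and $\beta_B = 1 - \beta_A$. The function names are hypothetical.

```python
import numpy as np

def mean_gradient(region):
    """Mean gradient of a region from first-order differences in x and y."""
    f = region.astype(np.float64)
    i_x = f[:-1, 1:] - f[:-1, :-1]    # difference along columns (x direction)
    i_y = f[1:, :-1] - f[:-1, :-1]    # difference along rows (y direction)
    return float(np.mean(np.sqrt((i_x ** 2 + i_y ** 2) / 2.0)))

def fuse_high(c_a, c_b, eps=1e-12):
    """Weight two high-pass sub-bands by their normalized mean gradients."""
    g_a = mean_gradient(c_a)
    g_b = mean_gradient(c_b)
    beta_a = g_a / (g_a + g_b + eps)  # assumed normalization
    beta_b = 1.0 - beta_a
    return beta_a * c_a + beta_b * c_b
```

Under these assumptions the sub-band with sharper detail receives the larger weight, so edges and textures from the better-focused or higher-contrast input dominate the fused coefficients.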

D. PRINCIPAL COMPONENT ANALYSIS (PCA) BASED GRAY CONVERSION METHOD
The next essential step is to apply a gray-PCA technique that reduces the amount of data and produces an enhanced single-channel gray output image with fast computation and robust operation [36]. Figure 4 shows the schematic diagram of the gray-PCA-based color-to-gray conversion process. The image to be processed must be in a luminance-chrominance representation; if it is in RGB color space, it is first converted. The first step is the formation of the vectorized color image ($I_{rgb} \in \mathbb{R}^{3 \times n}$) by stacking the three RGB color channels side by side. Subsequently, the zero-mean $YC_bC_r$ image ($I_{YC_bC_r} \in \mathbb{R}^{3}$, where $Y$ and $C_bC_r$ indicate the luminance and chrominance) is obtained by separating the luminance and chrominance using a transfer function $f(\cdot)$, given in [37].
After that, the eigenvalues $\lambda_1 \ge \lambda_2 \ge \lambda_3 \in \mathbb{R}^{1}$ and their normalized eigenvectors $v_1, v_2, v_3 \in \mathbb{R}^{3}$ are computed by applying principal component analysis. The resulting single-channel gray-scale image $I_{gray}$ is computed as a weighted linear combination of the three projections, with weights given by the percentages of their eigenvalues. As a result, the first subspace projection dominates the color-to-gray mapping because it has the largest eigenvalue, whereas the second and third subspace projections preserve the details of the color image in the resulting single-channel gray-scale image.
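The conversion steps can be sketched as follows. The sketch assumes that the transfer function $f(\cdot)$ is the ITU-R BT.601 RGB-to-$YC_bC_r$ matrix, that the projections are taken on the zero-mean data, and that the result is rescaled to $[0, 1]$; none of these choices is confirmed by the text. Because the sign of an eigenvector is arbitrary, each one is oriented toward increasing luminance here, which is also an assumption. The function name `pca_gray` is hypothetical.

```python
import numpy as np

# Assumed transfer function f(.): ITU-R BT.601 RGB -> YCbCr
# (constant offsets are dropped because the data are centred anyway).
RGB_TO_YCBCR = np.array([
    [0.299, 0.587, 0.114],
    [-0.168736, -0.331264, 0.5],
    [0.5, -0.418688, -0.081312],
])

def pca_gray(rgb):
    """Map an (H, W, 3) RGB image with values in [0, 1] to one gray channel in [0, 1]."""
    h, w, _ = rgb.shape
    i_rgb = rgb.reshape(-1, 3).T.astype(np.float64)       # vectorized image, 3 x n
    ycc = RGB_TO_YCBCR @ i_rgb
    ycc_centred = ycc - ycc.mean(axis=1, keepdims=True)   # zero-mean YCbCr
    cov = ycc_centred @ ycc_centred.T / ycc_centred.shape[1]
    eigvals, eigvecs = np.linalg.eigh(cov)                # ascending eigenvalues
    order = np.argsort(eigvals)[::-1]                     # lambda_1 >= lambda_2 >= lambda_3
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Eigenvector signs are arbitrary: orient each toward increasing luminance.
    signs = np.where(eigvecs[0, :] < 0, -1.0, 1.0)
    eigvecs = eigvecs * signs
    weights = eigvals / max(eigvals.sum(), 1e-12)         # eigenvalue percentages
    projections = eigvecs.T @ ycc_centred                 # three projections, 3 x n
    gray = weights @ projections                          # weighted linear combination
    span = gray.max() - gray.min()
    gray = (gray - gray.min()) / span if span > 0 else np.zeros_like(gray)
    return gray.reshape(h, w)
```

For an input whose three channels are identical, the chrominance rows vanish, the first eigenvector aligns with the luminance axis, and the output reduces to the rescaled luminance.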

IV. EXPERIMENTAL RESULTS, DATASETS AND EVALUATION OF ALGORITHM
We have taken ten pairs of input images, and two of these pairs are processed separately by each stage to clearly show the influence of each stage of the proposed method. The two pairs of input images are shown in Figure 5. HEFGLG is adopted to enhance the contrast of the images. It can be clearly seen in Figure 6 that the HEFGLG method enhances the contrast of the input images, which become more vivid and carry sufficient information. The enhanced images are then processed by NSCT with four multiscale decomposition levels. The NSCT is shift-invariant with flexible frequency selectivity, which yields a high-quality image without introducing artifacts. The images generated by NSCT are depicted in Figure 7. Finally, the gray-PCA method is applied to the images obtained from NSCT. The gray-PCA method not only reduces the dimensionality of the data with robust operation but also effectively preserves the discriminability between color and textures in the resulting image and produces a high-quality output image. The final fused image is depicted in Figure 8. The image quality improves at each stage, which shows the positive influence of each stage of the proposed method.

A. DATASETS
In this paper, we have used ten pairs of RGB images to compare the proposed method with state-of-the-art image fusion techniques. Many authors have used these datasets in image fusion research. For instance, the medical images, which contain registered CT and MRI, are publicly distributed by the Harvard Medical School at http://www.med.harvard.edu/AANLIB/home.html, and the McConnel Brain Imaging Centre of the Montreal Neurological Institute has distributed datasets at http://www.mouldy.bic.mni.mcgill.ca/brainweb. These datasets have been applied in several image fusion areas [38]-[41]. The datasets for infrared and visible imaging are taken from http://www.metapix.de/indexp.htm, http://ece.lehih.edu/SPCRL/IF/image_fusion.html, and http://web.media.mit.edu/~raskar/NPAR04/. These datasets have been used in several image fusion research areas [7], [39], [42]-[44]. The multi-focus datasets are available at https://dsp.etfbl.net/mif/; they are shared by Savic and have been used in different image fusion papers [45], [46]. All source images are RGB, but they are categorized into two scenarios: RGB images visualized as colored images and RGB images visualized as gray images. All final fused images are converted from three-channel RGB images to single-channel gray images.

1) SCENARIO ONE: RGB IMAGES VISUALIZED AS GRAY IMAGES
In this scenario, we use RGB source images that are visualized as gray images. Figures 9 to 14 show these source images and the resulting fused images.

2) SCENARIO TWO: RGB IMAGES VISUALIZED AS COLORED IMAGES
In this scenario, we use RGB source images that are visualized as colored images. Figures 15 to 18 show these source images and the resulting fused images.

B. EVALUATION OF ALGORITHM
Comparing the performance of the proposed algorithm with existing techniques is a challenging task [47]. The performance evaluation is divided into subjective and objective evaluation. We have compared the experimental results with the PCA [22], DWT [23], median-average based DSWT-PCA [4], non-subsampled shearlet transform based spatial frequency and pulse coupled neural network (NSST-SF-PCNN) [26], and morphology-hat transform based contourlet transform with principal component analysis (MT-CT-PCA) [28] image fusion techniques. The median-average based DSWT-PCA, NSST-SF-PCNN, and MT-CT-PCA methods are recent hybrid image fusion algorithms that perform better than the other techniques, but the proposed method achieves the best performance among these schemes.

1) SUBJECTIVE EVALUATION
Subjective evaluation assesses the quality of the final fused image according to human visual perception. It is the most widely used way to judge the difference between images by eye. Figures 9 to 18 show the subjective evaluation of the proposed method and the five state-of-the-art image fusion techniques mentioned above.
As can be seen in Figure 9, the fusion effects of the NSST-SF-PCNN, median-average, and DWT methods are almost the same. Their contrast is high, but the details of these fused images are not precise, and some useful information about soft tissues is missing. Compared with the NSST-SF-PCNN, median-average, and DWT fusion methods, PCA provides better information about soft tissues, but its fused image has a washed-out effect, and the resulting artifacts degrade the overall quality. In comparison with these fusion schemes, the MT-CT-PCA method has better contrast and improved information about soft tissues. However, the overall effect is still not satisfactory. The proposed method adjusts the contrast by using HEFGLG and provides precise information about soft tissues, highlighted by the red boxes in the image. The overall effect of the proposed method is much better than that of the aforementioned methods. Figure 10 shows the fused results for the multi-focus clock images. It can be clearly seen that the proposed method achieves much better fused results than the existing techniques. The brightness of the clock in the proposed method is consistent with that of the clock in the source images. Moreover, the red highlighted boxes show that the contours are precise and that the curves and edges of the left clock are smooth with bright contrast. Although the MT-CT-PCA fusion scheme also has better contrast, its curves and boundaries are not as stable as in the proposed method, as highlighted with red boxes in Figure 10. The NSST-SF-PCNN, median-average, and DWT methods perform almost alike: the contrast is not bright, the image is blurred, and the overall results are not satisfactory.
The fusion effect of MT-CT-PCA, NSST-SF-PCNN, and DWT in Figure 11 is not satisfactory. The contrast is very poor, and the objects in the image lack clarity. The median-average method produces better results than these techniques, and the three persons are visible. However, the gun and the car wheel are not visible in the median-average result, and the contrast of the fused image is not consistent with the input images. Figure 11 shows that the proposed method produces much better results. The three persons, the car wheel, and the gun are more vivid in the proposed method. Furthermore, the brightness of the three persons and the gun is consistent with the source images.
The fusion effect for the infrared and visible desert-car images is depicted in Figure 12. The fusion effects of MT-CT-PCA, NSST-SF-PCNN, DWT, and PCA are almost the same: the brightness is low, and the textures of the car are poor. The fusion effect of the median-average method is even weaker, and the background detail is not clear. In comparison with all of the above methods, the overall fusion performance of the proposed method is better: the contrast is higher, the texture information is much clearer, and the background detail is vivid.
In Figure 13, the fusion effect of PCA is poor, and its overall impact is deficient. The texture details and background scenery of the farmhouse are almost the same for NSST-SF-PCNN, median-average, and DWT, but the DWT and median-average results are blurred. From Figure 13, it can be observed that the background scenery of MT-CT-PCA and the proposed method is much clearer than that of the other fusion schemes. Moreover, the proposed method has even more brightness, better contrast, and richer edge profile information than MT-CT-PCA, which shows the superiority of our method over all of the aforementioned techniques.
The fusion effect for the IR and VI bunker images is shown in Figure 14. The contrast of the fused images for median-average and DWT is low in comparison with the MT-CT-PCA and NSST-SF-PCNN methods, but the textures of the bunker are more apparent. The performance of PCA is inferior: the brightness is so high that the overall effect is fuzzy. From Figure 14, it can be seen that the texture details of the proposed method are much more precise and that the information is more abundant than in the conventional methods. Moreover, the overall effect of the proposed method is the best among all of the algorithms.
It can be seen from Figure 15 that the fusion results for book-man-clocks are almost the same for the MT-CT-PCA and NSST-SF-PCNN methods, and their contrast is higher than that of median-average, DWT, and PCA. In comparison with the other techniques, the performance of DWT is inadequate for this fused image. The contrast of the proposed method is the best, and the book letters are even more vivid than in the aforementioned methods. Additionally, the wall clocks are clearer than in the other methods, and the fused image contains sufficient information. The overall quality of the proposed method is therefore much better, which shows its superiority over the other techniques.
It can be observed from Figure 16 that the fusion effects of median-average, DWT, and PCA look similar; their contrast and detail information are almost the same, and they produce satisfactory results. The proposed method, MT-CT-PCA, and NSST-SF-PCNN have much higher contrast and brightness than median-average, DWT, and PCA. It can also be seen in Figure 16 that the background contrast and the mountain are even more vivid in the proposed method than in the MT-CT-PCA and NSST-SF-PCNN fusion methods. Therefore, the overall performance of the proposed method is higher than that of the other conventional techniques, and its result has rich edge profile information.
The MT-CT-PCA and NSST-SF-PCNN fusion schemes have better contrast, and the letters are clearer than in median-average, DWT, and PCA, as illustrated in Figure 17. The fusion effects of median-average, DWT, and PCA look similar, with only slight differences; the letters on the books are not bright, and the contrast is reduced. The proposed method has much better results, and the letters are more vivid than in all of the aforementioned methods, as depicted in Figure 17. The proposed method has fewer artifacts and more information with well-adjusted contrast. Besides, it preserves edge profile information, texture details, and contours. The median-average method in Figure 18 for zoo-animal has more contrast than the MT-CT-PCA, NSST-SF-PCNN, DWT, and PCA techniques, but the overall image quality is almost the same. The proposed method for zoo-animal has better contrast and more vivid information than the other methods. It also retains the detailed information of both input images with very little information loss, and its edge profile, texture, and contour information is almost the same as in both input images. Therefore, the proposed method is capable of fusing more information and detail from the input images than the other methods.

2) OBJECTIVE EVALUATION
In this paper, we also conduct an objective analysis of the proposed method and state-of-the-art image fusion methods. This analysis is based on computable quality metrics. The selected objective evaluation measures are given below.

a: AVERAGE PIXEL INTENSITY (API)
The API measures the contrast index of the fused image as its mean pixel intensity. In its definition, f(i, j) represents the pixel intensity at position (i, j) and m × n is the image size.
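As an illustration, the API can be sketched in a few lines of Python. The sketch assumes the common definition, the sum of all pixel values f(i, j) divided by m × n; the function name and the list-of-rows image representation are hypothetical choices made for this example, not part of the proposed method.

```python
def average_pixel_intensity(image):
    """Mean of all pixel values f(i, j) of an m x n image.

    `image` is a list of m rows, each holding n numeric pixel values
    (an assumed representation for this sketch).
    """
    m = len(image)
    n = len(image[0])
    total = sum(sum(row) for row in image)
    return total / (m * n)


fused = [[10, 20, 30],
         [40, 50, 60]]
print(average_pixel_intensity(fused))  # 35.0
```

A brighter fused image yields a larger API value.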

b: STANDARD DEVIATION (SD)
The SD measures the spread of intensity values in the fused image. In its definition, f(i, j) is the pixel value of the image, F is the mean intensity, and m × n is the total number of pixels.
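A minimal Python sketch of the SD follows. It assumes the population form, the square root of the mean squared deviation of f(i, j) from the mean F over all m × n pixels; some implementations normalize by mn − 1 instead. The function name is hypothetical.

```python
import math


def standard_deviation(image):
    """Population standard deviation of the pixel values of an m x n image."""
    m, n = len(image), len(image[0])
    mean = sum(sum(row) for row in image) / (m * n)  # F in the text
    squared = sum((p - mean) ** 2 for row in image for p in row)
    return math.sqrt(squared / (m * n))


flat = [[5, 5], [5, 5]]         # no spread at all
striped = [[0, 10], [0, 10]]    # every pixel is 5 away from the mean
print(standard_deviation(flat))     # 0.0
print(standard_deviation(striped))  # 5.0
```

A uniform image has zero SD, while an image whose values are spread widely around the mean has a large SD, which is why SD is read as a contrast indicator.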

c: AVERAGE GRADIENT (AG)
The AG measures the degree of sharpness and clarity of the image. In its definition, f(i, j) is the pixel value of the image, and m and n are the numbers of rows and columns.
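The AG can be sketched as follows. This is one commonly used form, assumed here: at each pixel, the vertical and horizontal forward differences are combined as sqrt((dx² + dy²) / 2), and the result is averaged over the (m − 1)(n − 1) positions where both differences exist. Other variants normalize by mn instead, so absolute values may differ slightly from those in a given implementation. The function name is hypothetical.

```python
import math


def average_gradient(image):
    """Average gradient of an m x n image (requires m >= 2 and n >= 2)."""
    m, n = len(image), len(image[0])
    total = 0.0
    for i in range(m - 1):
        for j in range(n - 1):
            dx = image[i + 1][j] - image[i][j]  # difference to the next row
            dy = image[i][j + 1] - image[i][j]  # difference to the next column
            total += math.sqrt((dx * dx + dy * dy) / 2.0)
    return total / ((m - 1) * (n - 1))


ramp = [[0, 2, 4],
        [0, 2, 4],
        [0, 2, 4]]
print(average_gradient(ramp))  # sqrt(2), since dx = 0 and dy = 2 everywhere
```

Sharper images have larger local differences and therefore a larger AG.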

d: SPATIAL FREQUENCY (SF)
The SF measures the total information in a region of the image. It is obtained from the row frequency (RF) and the column frequency (CF).
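The following Python sketch illustrates the SF under its usual definition, which is assumed here: RF is the root mean square of the differences between horizontally adjacent pixels, CF is the same quantity for vertically adjacent pixels, both normalized by mn, and SF = sqrt(RF² + CF²). The function name is hypothetical.

```python
import math


def spatial_frequency(image):
    """Return (RF, CF, SF) for an m x n image given as a list of rows."""
    m, n = len(image), len(image[0])
    # Squared differences between neighbours within each row.
    row_sq = sum((image[i][j] - image[i][j - 1]) ** 2
                 for i in range(m) for j in range(1, n))
    # Squared differences between neighbours within each column.
    col_sq = sum((image[i][j] - image[i - 1][j]) ** 2
                 for i in range(1, m) for j in range(n))
    rf = math.sqrt(row_sq / (m * n))
    cf = math.sqrt(col_sq / (m * n))
    return rf, cf, math.sqrt(rf ** 2 + cf ** 2)


checkerboard = [[0, 1],
                [1, 0]]
rf, cf, sf = spatial_frequency(checkerboard)
print(rf, cf, sf)  # RF and CF are both sqrt(0.5); SF is approximately 1.0
```

An image with no variation gives SF = 0, and rapid changes along rows or columns raise RF or CF respectively.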

e: CROSS-CORRELATION (CC)
The CC measures the similarity between the fused image and both input images. In its definition, a(i, j) and b(i, j) represent input image 1 and input image 2, and A and B are their respective means. The closer the value of CC is to 1, the more similar the images are.
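A hedged Python sketch of the CC is given below. It assumes two things that should be checked against the exact formula in use: the similarity between two images is the normalized (Pearson) correlation of their mean-centred pixel values, and the score for a fused image is the average of its correlations with the two inputs. The function names are hypothetical.

```python
import math


def correlation(x, y):
    """Normalized correlation of two equally sized images (lists of rows).

    Undefined (division by zero) if either image is constant.
    """
    xs = [p for row in x for p in row]
    ys = [p for row in y for p in row]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    den = math.sqrt(sum((a - mean_x) ** 2 for a in xs)
                    * sum((b - mean_y) ** 2 for b in ys))
    return num / den


def cross_correlation(fused, image_a, image_b):
    """Assumed combination: mean of the fused image's correlation with each input."""
    return (correlation(image_a, fused) + correlation(image_b, fused)) / 2.0


a = [[1, 2], [3, 4]]
b = [[4, 3], [2, 1]]
fused = [[2, 4], [6, 8]]   # a scaled copy of a
print(correlation(a, fused))           # 1.0
print(cross_correlation(fused, a, b))  # 0.0, because b is anti-correlated
```

Because the measure is mean-centred and normalized, it is insensitive to global brightness and contrast scaling and responds only to how well the spatial structure matches.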
The values of the quality assessment parameters are presented in Table 1 and Table 2. Table 1 compares the parameters for the RGB images in Figures 9-14, which are visualized as gray images. Table 2 compares the parameters for the RGB images in Figures 15-18, which are visualized as colored images. The quality assessment parameters used in this paper are API, SD, AG, CC and SF. The higher the values of API, SD, AG, CC and SF, the better the quality of the fused image. In these tables, bold values mark the highest value for each quality parameter. Table 1 and Table 2 show that the proposed method achieves better results than the existing fusion techniques on the quality parameters overall. Although it attains lower values for some quality parameters, its overall performance is better, as can be seen in Tables 1 and 2. The fused image of the proposed method has better contrast and is more vivid, with higher energy information, better edges and textures, and smoother contours, and it shows minimal distortion and negligible information loss compared with the other fusion methods.

3) COMPUTATION TIME COMPARISON
The computational efficiency of the proposed method and the other state-of-the-art image fusion techniques is compared in Table 1 and Table 2. The computation time (t) experiments were performed in MATLAB 2016b on a machine with 8 GB RAM and a Core i5 3.20 GHz CPU. The last column of Table 1 and Table 2 shows that the proposed method takes much less time than MT-CT-PCA, NSST-SF-PCNN and median-average based DSWT-PCA. In addition, its qualitative and quantitative performance is better than that of these state-of-the-art techniques. The proposed method takes slightly more time than DWT and PCA, but it performs best in both the subjective and the objective evaluation, which demonstrates its superiority.

V. CONCLUSION
In this research, we proposed a novel method that combines HEFGLG with NSCT and a PCA-based conversion from a multi-channel image to a single-channel gray image. The experimental results substantiate that the proposed method combines the advantages of its sequential stages (HEFGLG, NSCT, improved fusion strategies, and gray-PCA). Together, these stages provide better contrast, preserve sufficient energy information, and achieve brighter textures, smoother contours, and better visual effects than existing state-of-the-art techniques. The strong fusion results of the proposed method can be observed in both the subjective and the objective evaluation, which demonstrates the superiority of the proposed work.
Many research questions remain open and should be considered in future work. One direction is to design a simpler image fusion algorithm that improves image quality and is applicable to real-time applications. Another direction is to use deep convolutional neural networks for real-time monitoring applications of image fusion, consuming less time while producing high-quality images.

He is currently a Postdoctoral Fellow with the South China University of Technology. His research interests include renewable energy integration, demand response, load management, power system optimization, and integrated energy system planning.
VOLUME 8, 2020