A Novel GA-Based Optimized Approach for Regional Multimodal Medical Image Fusion With Superpixel Segmentation

Most existing approaches to multimodal medical image fusion operate at the pixel level. However, pixel-based fusion tends to lose local and spatial information because the relationships between pixels are not adequately considered, which degrades the quality of the fusion results. To address this issue, this paper proposes a region-based multimodal medical image fusion framework built on superpixel segmentation and a post-processing optimization method. In this framework, the average image of the source medical images is first obtained by weighted averaging. To obtain homogeneous regions effectively while preserving image details, the fast linear spectral clustering (LSC) superpixel algorithm is applied to segment the average image and produce superpixel labels. For each region of the medical images, a log-Gabor filter (LGF) and the sum modified Laplacian (SML) are adopted to extract texture and contrast features that measure region importance. The most important regions are selected by comparison to generate the decision map. Moreover, to obtain a more accurate decision map, a new post-processing optimization method based on a genetic algorithm (GA) is introduced: a weighted strategy is applied to the extracted features, and the weighting factor is adaptively adjusted by the GA. The effectiveness of the proposed fusion method is validated by experiments on eight pairs of medical images from diverse modalities, and seven other mainstream medical image fusion methods are used for comparison. Qualitative and quantitative evaluations demonstrate that the proposed method achieves state-of-the-art performance on multimodal medical image fusion problems.


I. INTRODUCTION
As a fundamental and effective supplementary tool, medical images play an increasingly significant role in modern clinical diagnosis and treatment. However, due to the limitations of the imaging mechanism, medical images from a single modality usually cannot provide sufficient information to meet the requirements of complex diagnoses [1]. For instance, a computerized tomography (CT) image provides a clear visualization of dense structures such as bones and implants, but it is poor at presenting soft tissues. A magnetic resonance imaging (MRI) image provides high-resolution detail of soft tissues, but it is prone to artifacts when imaging bone structures [2]. Functional information on blood flow and metabolic changes can be captured by positron emission tomography (PET) and single-photon emission computed tomography (SPECT) images, but their spatial resolution is usually very low. Multimodal medical image fusion is an effective technique for solving this problem: it aims to generate fused images that combine the complementary information contained in medical images from different modalities.
The associate editor coordinating the review of this manuscript and approving it for publication was Wentao Fan.
VOLUME 9, 2021. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
Numerous methods have been proposed for medical image fusion, and they can be roughly divided into three levels: pixel-level, feature-level, and decision-level [3]. Pixel-level image fusion mainly includes two categories: spatial domain methods and transform domain methods. Spatial domain methods such as PCA [4], IHS [5], and averaging fusion select pixels from the source images to construct the final fused image. This kind of fusion method can completely preserve spatial information and reduce computational complexity [6]. However, such methods also introduce color distortion and suffer from contrast reduction, which are unacceptable for the fusion of medical images. In contrast to spatial domain methods, transform domain methods decompose the source images into high-frequency and low-frequency components by means of a transform. Multiscale-transform (MST)-based approaches are popular in medical image fusion because of their excellent feature extraction performance [7]. The transform, namely the decomposition of the image, is an important analytical tool that strongly affects both the extraction of information and the quality of the fusion results. Numerous transforms have been presented, including the contourlet transform (CT) [8], the Laplacian pyramid (LP) transform [9], the shearlet transform (ST) [10], and so on. Nevertheless, because subsampling halves the size of the subband image at each decomposition level, fusion methods based on these transforms are not shift-invariant. To address this problem, nonsubsampled schemes, including the nonsubsampled contourlet transform (NSCT) [11] and the nonsubsampled shearlet transform (NSST) [12], have been proposed. The nonsubsampled technique preserves the shift-invariance of the decomposition well, but the fusion strategies adopted are very simple (either an average strategy or a maximum strategy), which limits performance to some extent.
In recent years, more effective MST-based medical image fusion approaches have been proposed by developing more sophisticated fusion strategies. Sparse representation (SR) [13] and the pulse-coupled neural network (PCNN) [14] are two popular fusion strategies used in medical image fusion. SR-based fusion algorithms can accurately describe and reconstruct a signal by a linear combination of sparse coefficients. PCNN is a kind of neural network proposed by Eckhorn; it is derived from the cortical model and has the properties of global coupling and pulse synchronization [15]. To present the information of the source images completely, numerous algorithms have been proposed that combine the aforementioned transforms and fusion strategies. For example, Xia et al. [16] proposed a multimodal medical image fusion method that combines NSCT with SR; Yin et al. [17] proposed a medical image fusion method based on NSST and a parameter-adaptive PCNN model (NSST-PAPCNN); Zhu et al. [18] introduced a medical image fusion algorithm that uses cartoon-texture decomposition (CTD) and SR to merge the decomposed coefficients (CTD-SR); Li et al. [7] introduced a multimodal medical image fusion algorithm based on Laplacian redecomposition (LRD). Although these algorithms can extract more salient features from the source images, several drawbacks can also be identified [19]: (1) they are time-consuming; (2) they tend to decrease the contrast of images; (3) they are sensitive to misregistration and noise.
Studies [20]-[22] show that merging regions is more meaningful than merging pixels, because regional image processing is more in line with the human visual system and with computer vision tasks, as the relationships between pixels are sufficiently considered. Compared with pixel-based fusion algorithms, region-based fusion methods offer several advantages: they are more robust to noise, they better maintain the contrast of the source images, and they are more efficient because fewer processing units are involved [23]. Basically, a region-based algorithm consists of two steps: (1) segment the source images into regions; (2) select the most important regions according to their properties to construct the fused image. Accurate image segmentation therefore plays a key role in the performance of medical image fusion. A region-based fusion scheme was initially introduced by Lewis et al. in Durga et al. [24], where a dual-tree complex wavelet transform (DT-CWT) [25] is used to segment the source images, and the features of each region are extracted to fuse the images region by region. Garg et al. [26] presented a region-based medical image fusion algorithm that uses an evolution algorithm to segment medical images. Luo et al. [27] applied the watershed algorithm to segment images into regions for fusion. However, the fusion results of these methods suffer from varying degrees of artifacts because the segmentation algorithms adopted are not precise enough. To address this problem, Normalized cuts (Ncuts) are employed in medical image fusion algorithms [23], [28] to segment images and obtain homogeneous regions. However, this is time-consuming, as the conventional eigen-based Ncuts has high computational complexity. To get a better trade-off between efficiency and accuracy, Meher et al. [29] introduced an image fusion method that employs fuzzy c-means (FCM) clustering to segment the image. Similarly, Li et al. [30] proposed an image fusion algorithm that applies entropy rate (ER) superpixel segmentation and achieves good performance. Nevertheless, the regions segmented by FCM and ER are irregular in shape and size, which makes them unsuitable for feature extraction in medical image fusion.
Based on the above discussion, it is clear that segmentation is an important factor that determines the fusion results of region-based medical image fusion algorithms. In addition, the fusion strategy should be well designed to ensure the most important regions can be correctly selected to construct the final images. As we know, the fusion strategies adopted by pixel-level medical image fusion methods are not suitable for the region-based medical image fusion methods as they work at the pixel level and they are time-consuming as well. Focusing on these two problems, this paper presents a region-based multimodal medical image fusion method that utilizes an effective superpixel segmentation algorithm and a fusion strategy based on feature extraction and optimization algorithm. The main contributions can be summarized as follows.
1) A novel framework for regional image fusion based on fast linear spectral clustering (LSC) superpixel segmentation is presented in this paper (the general architecture is shown in Fig. 1). To the best of our knowledge, this is the first attempt to employ superpixels in medical image fusion. Unlike conventional region-based medical image fusion methods, the proposed LSC-based method preserves the local and spatial information of the source images and achieves better trade-offs among efficiency, accuracy, and fine structure.
2) We present a region-competition-based fusion strategy. The log-Gabor filter (LGF) and the sum modified Laplacian (SML) are modified to calculate the texture feature value and the contrast feature value of each region. The decision map is constructed by comparing these two values for each region. In this way, important information is preserved and redundant information is removed appropriately.
3) Within the framework, a new post-processing optimization method based on a genetic algorithm (GA) is proposed to optimize the fusion strategy by adaptively adjusting the weights of the features. This GA-based post-processing further improves the quality of the fusion results because the importance of the regions is measured more accurately.
The rest of this paper is organized as follows. Section II introduces related work on superpixel and optimization algorithms. Section III gives a detailed description of the proposed region-based medical image fusion method with superpixel segmentation and a genetic algorithm. Experimental results and performance evaluation are presented in Section IV. Finally, Section V draws the conclusion and discusses future work.

II. RELATED WORK

A. SEGMENTATION ALGORITHM
As mentioned in the introduction, accurate image segmentation determines the performance of region-based medical image fusion, because an incorrect partition usually leads to unexpected artifacts in the fusion results. Convolutional neural networks (CNNs) have been widely used in automatic medical image segmentation in recent years [31], [32]. For instance, Xx et al. [33] proposed a medical image segmentation method based on a dynamic adaptive residual network (DAR-net) to obtain accurate segmentation. Pan et al. [34] put forward an automated segmentation method for nuclei that works with sparse reconstruction and deep convolutional networks. However, these methods are designed to segment the region of interest in medical images (such as the lesion area), which is not suitable for regional image fusion, where homogeneous regions are needed. A superpixel can serve as an atomic unit for image processing, and it also offers a favorable trade-off between good performance and high efficiency [35]. Jia et al. [36] proposed an effective superpixel-based feature extraction method for hyperspectral image fusion, which improves the efficiency of image fusion because feature extraction is performed on superpixels instead of on individual pixels. Wu et al. [37] put forward a superpixel region extraction method for object detection to address redundant information and long search times in target searching. Zhang et al. [38] presented a superpixel-based edge detection method in which the centroids of clustering-based superpixel methods are updated to improve accuracy and robustness. To address the wrong selection of similar pixels in remote sensing image fusion, Wang et al. [39] applied a superpixel segmentation algorithm to ensure that pixels in the same block have similar properties. These studies show that superpixels can be used flexibly in many image processing applications to improve efficiency, accuracy, and robustness.
Although superpixel algorithms have not yet been employed in medical image fusion, they have been used in other fields of medical image processing. For instance, Achanta et al. [40] introduced simple linear iterative clustering (SLIC) superpixel segmentation, which was first applied in medical image segmentation and achieved excellent performance. Wang et al. [41] combined SLIC with the U-net architecture to segment the lesion area of tuberculosis. However, the simple and time-efficient SLIC fails to achieve a good trade-off between homogeneous regions and fine structures, which could lead to mis-segmentation and generate artifacts in the final fused medical image. In recent years, many superpixel algorithms have been proposed, including TURBO [42], ERS [43], SEEDS [44], and LSC [45]. LSC simply applies K-means clustering in a combined ten-dimensional color and coordinate space. Compared with other superpixel segmentation algorithms, LSC overcomes the shortcoming of SLIC while maintaining simplicity and efficiency. Therefore, LSC is adopted for segmentation in this paper.

FIGURE 2. The specific architecture of our proposed method in this paper. To begin with, the two source images are weighted and averaged to obtain the average image. Then, the average image is segmented by LSC to obtain the superpixel labels, and the labels are applied to the source images to ensure that they have the same regions. Subsequently, the log-Gabor filter (LGF) and the sum modified Laplacian (SML) are used to extract the texture feature and the contrast feature of each region, respectively. The important regions are selected by the region-competition-based fusion strategy. This process is optimized by a genetic algorithm to get the best decision map. Finally, the fused image is obtained according to the decision map.

B. OPTIMIZATION ALGORITHM
Intelligent optimization algorithms such as gray wolf optimization (GWO), modified central force optimization (MCFO), particle swarm optimization (PSO), and genetic algorithms (GA) have performed effectively in medical image fusion where some kind of parameter optimization is required. Asha et al. [46] proposed a medical image fusion method that adaptively adjusts the weights of features by using GWO to minimize the distance between the fused image and the source images. Liu et al. [47] proposed an effective image fusion method based on a simplified PCNN (S-PCNN), which uses the PSO algorithm to set the parameters of the PCNN. El-Hoseny et al. [48] introduced an optimal solution for medical image fusion that uses the MCFO technique to set parameters and improve the quality of the fused medical images. Xie and Qin [49] applied GA to optimize an objective function for obtaining a proper image. Compared with traditional optimization algorithms, intelligent optimization algorithms place looser requirements on the form of the objective function and pay more attention to the speed and efficiency of computation. In this paper, we employ GA to optimize the fusion process, as it has been reported to be the most accurate and the most stable among the optimization methods compared in [50].

III. PROPOSED METHOD
The architecture of our proposed fusion method is shown in Fig. 2. The proposed method is also applicable to the fusion of more than two images; here, the fusion of two medical images is taken as an example. First, the two source images are weighted and averaged to obtain the average image. Then, the average image is segmented by the LSC superpixel segmentation algorithm to obtain the superpixel labels. The labels are applied to both source images so that their regions remain consistent. Subsequently, the log-Gabor filter (LGF) and the sum modified Laplacian (SML) are used to extract the texture feature and the contrast feature of each region, respectively. The decision map is generated by comparing the values produced by the weighted strategy applied to these two features, and the weighting factor is iteratively adjusted by a genetic algorithm. Finally, the fused image is obtained according to the decision map. In the proposed fusion method, the average image is obtained by Eq. (1), where $I_1$ and $I_2$ denote the source images.
In medical image fusion, the accuracy and efficiency of the segmentation algorithm are of great importance to the performance of region-based image fusion. The linear spectral clustering (LSC) algorithm produces superpixels with high boundary adherence in linear time, which addresses both of these bottlenecks. For an M × N medical image I, we first map I to the CIELab color space. In the CIELab color space, the value of a pixel p = (x, y) is determined by the lightness l and the color components α and β. These three components are combined with the X-Y coordinates to obtain a five-dimensional vector (x, y, l, α, β) for each pixel. According to the methodology of LSC, in a well-designed ten-dimensional feature space, weighted K-means clustering can replace the complex operations of Normalized cuts when Eq. (2) is satisfied.
Here, each pixel p is assigned a weight d(p), D(p, q) denotes the similarity between two pixels p and q, and the map function maps each pixel to a higher-dimensional feature space to improve linear separability.
To measure the similarity between pixels, we first consider the widely used Euclidean distance. For any two pixels $p = (x_p, y_p, l_p, \alpha_p, \beta_p)$ and $q = (x_q, y_q, l_q, \alpha_q, \beta_q)$, their similarity is measured by Eq. (3), where $d_{xy}$ and $d_{lab}$ represent the Euclidean distance and the color difference between the two pixels, respectively. $N_{xy}$ and $N_{lab}$ are constants that balance the relative importance of color similarity and spatial proximity. A smaller value of $N_{xy}$ or $N_{lab}$ means that the corresponding feature is more important. Although Eq. (3) has a very specific physical meaning in measuring the similarity of pixels, it cannot be used in LSC because it does not satisfy Eq. (2). Both $d_{xy}$ and $d_{lab}$ have the form $1 - u^2$, $u \in [-1, 1]$, so Eq. (3) can be adapted to Eq. (4).
Furthermore, the Fourier series of $t(u)$ is closely approximated by its leading term, $\cos\left(\frac{\pi}{2} u\right)$. Accordingly, Eq. (4) is rewritten as Eq. (5).
Combining the above derivation, the map function is defined as Eq. (6).
In summary, the mapping function to a ten-dimensional feature space is given by Eq. (6). Each pixel of the medical images is mapped using Eq. (6), so that simple K-means clustering achieves segmentation performance equal to or even better than that of Normalized cuts. This reduces the risk of the artifacts that arise in pixel-level and conventional region-based image fusion methods.
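The two ingredients of this derivation can be checked numerically. The sketch below is a minimal illustration, not the authors' implementation: it assumes each of the five components has been rescaled to [0, 1], uses hypothetical balancing constants `c_s` and `c_c` in place of the constants of Eq. (3), and omits the per-pixel weight d(p). It shows that mapping each scalar component to a (cos, sin) pair yields a ten-dimensional vector whose inner product is a sum of cosine kernels, which is what lets K-means stand in for the kernel computation.

```python
import numpy as np

def lsc_feature_map(x, y, l, a, b, c_s=1.0, c_c=1.0):
    """Map (x, y, l, a, b), each scaled to [0, 1], to a 10-D feature vector.

    c_s and c_c are hypothetical spatial/color balancing constants.
    The per-pixel weight d(p) of the source is intentionally left out.
    """
    half_pi = np.pi / 2.0
    feats = []
    for value, scale in ((l, c_c), (a, c_c), (b, c_c), (x, c_s), (y, c_s)):
        value = np.asarray(value, dtype=float)
        feats.append(scale * np.cos(half_pi * value))
        feats.append(scale * np.sin(half_pi * value))
    return np.stack(feats, axis=-1)

def cosine_similarity_kernel(p, q, c_s=1.0, c_c=1.0):
    """Sum of cos(pi/2 * difference) terms over the five components."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    scales = np.array([c_s, c_s, c_c, c_c, c_c]) ** 2  # order: x, y, l, a, b
    return float(np.sum(scales * np.cos(np.pi / 2.0 * (p - q))))

# How closely does cos(pi/2 * u) follow 1 - u^2 on [-1, 1]?
u = np.linspace(-1.0, 1.0, 2001)
max_gap = float(np.max(np.abs((1.0 - u ** 2) - np.cos(np.pi / 2.0 * u))))
```

Because cos(a − b) = cos a · cos b + sin a · sin b, the dot product of two mapped pixels equals the cosine kernel exactly, while `max_gap` quantifies the approximation error introduced when 1 − u² is replaced by its leading Fourier term.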

B. FEATURES EXTRACTION

1) SUPERPIXEL-BASED LOG-GABOR FILTERS
Although the two-dimensional Gabor filter has good local properties in both the spatial and frequency domains, its even-symmetric filter produces a nonzero DC component when the bandwidth is greater than one octave. The Log-Gabor function, by contrast, is unrestricted in terms of bandwidth and has minimal spatial support. The Log-Gabor function is a Gaussian function on a logarithmic frequency scale; on a linear frequency scale it is expressed as Eq. (7), where $f_0$ is the filter center frequency and β determines the radial bandwidth.
According to frequency domain analysis, the two-dimensional Log-Gabor filter is a band-pass filter in a specific direction. For more comprehensive feature extraction, we use multi-channel Log-Gabor filters with different frequencies and directions to extract texture features. The specific steps are as follows.
i) The image is first filtered by the Log-Gabor filter. For each channel, u frequency scales and v directions are selected, and the features of the medical image are extracted using Eq. (8), where $LG_{uv}$ is the Log-Gabor filter, $I(N)$ is a medical image divided into N regions by superpixels, and $F_{uv}$ is the extracted feature.
ii) LSC superpixel segmentation divides the medical image into several regions of uniform size, and the feature of each region is calculated using Eq. (9), where (x, y) is the center coordinate of each region and $F_{uv}(x, y)$ is the final texture feature extracted by the Log-Gabor filter.
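The two steps above can be sketched as follows. This is a minimal illustration under several assumptions that go beyond the text: the radial term uses the commonly cited log-Gabor form exp(−ln²(f/f₀) / (2 ln²(β/f₀))), a Gaussian angular term selects the direction, the channel responses are combined by summing their magnitudes, and the per-region value is taken as the mean over the superpixel. The names `log_gabor`, `region_mean`, and `texture_feature`, and all default parameter values, are hypothetical.

```python
import numpy as np

def log_gabor(shape, f0, beta, theta0, sigma_theta=np.pi / 6):
    """Frequency-domain log-Gabor filter (assumed standard form)."""
    rows, cols = shape
    fy = np.fft.fftfreq(rows)[:, None]
    fx = np.fft.fftfreq(cols)[None, :]
    radius = np.sqrt(fx ** 2 + fy ** 2)
    radius[0, 0] = 1.0  # placeholder to avoid log(0); DC is zeroed below
    radial = np.exp(-np.log(radius / f0) ** 2 / (2.0 * np.log(beta / f0) ** 2))
    radial[0, 0] = 0.0  # a log-Gabor filter has no DC component
    theta = np.arctan2(fy, fx)
    # Wrapped angular distance to the filter direction theta0.
    dtheta = np.arctan2(np.sin(theta - theta0), np.cos(theta - theta0))
    angular = np.exp(-dtheta ** 2 / (2.0 * sigma_theta ** 2))
    return radial * angular

def region_mean(feature, labels):
    """Average a per-pixel feature over each superpixel label 0..K-1."""
    flat = labels.ravel()
    sums = np.bincount(flat, weights=feature.ravel())
    counts = np.bincount(flat)
    return sums / np.maximum(counts, 1)

def texture_feature(image, labels, freqs=(0.1, 0.2), n_orient=4):
    """Sum filter-response magnitudes over all channels, then pool per region."""
    spectrum = np.fft.fft2(np.asarray(image, dtype=float))
    energy = np.zeros(np.shape(image), dtype=float)
    for f0 in freqs:
        for k in range(n_orient):
            filt = log_gabor(energy.shape, f0, 0.55 * f0, k * np.pi / n_orient)
            energy += np.abs(np.fft.ifft2(spectrum * filt))
    return region_mean(energy, labels)
```

Here `labels` is an integer array of the same shape as the image, as produced by any superpixel segmentation. Filtering in the frequency domain keeps the sketch short; a spatial-domain convolution would serve equally well.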

2) SUPERPIXEL-BASED SUM MODIFIED LAPLACIAN
The sum modified Laplacian (SML) is a fundamental feature extraction operator widely used in image processing. In this section, we adapt SML to the superpixel level so that it can be used to calculate the contrast feature value of each region in the source image.
In [51], SML is calculated as the sum of the absolute values of the convolution of an image with modified Laplacian (ML) operators, whose discrete approximation is shown in Eq. (10). Here, h(i, j) is the Laplacian of the pixel and s denotes a variable spacing. SML can then be calculated by Eq. (11), where S is the parameter that determines the size of the window used to calculate ML, and (x, y) is the center of the window.
In order to calculate the value of SML for each superpixel, we modify Eq.(11) to Eq. (12).
where SSML denotes the SML value of the superpixel whose center is located at (x, y); M × N is the size of the medical image; p(i, j) denotes a pixel of the image; $C_{(x,y)}$ represents the set of pixels belonging to the superpixel whose center is located at (x, y); and I(·) is an indicator function that equals 1 when a pixel belongs to the superpixel and 0 otherwise. The contrast feature of every superpixel region can therefore be obtained from Eq. (12).
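A superpixel-level SML along these lines can be sketched as below. The sketch assumes the widely used discrete modified Laplacian, |2I(x, y) − I(x − s, y) − I(x + s, y)| + |2I(x, y) − I(x, y − s) − I(x, y + s)|, with edge-replicated borders, and sums it over the pixels of each superpixel as the indicator-function description suggests. Any thresholding or windowing used in Eqs. (10)-(12) is not modeled, and the function names are hypothetical.

```python
import numpy as np

def modified_laplacian(image, step=1):
    """Discrete modified Laplacian with spacing `step` (edge-replicated borders)."""
    img = np.pad(np.asarray(image, dtype=float), step, mode="edge")
    center = img[step:-step, step:-step]
    left = img[step:-step, :-2 * step]
    right = img[step:-step, 2 * step:]
    up = img[:-2 * step, step:-step]
    down = img[2 * step:, step:-step]
    return np.abs(2 * center - left - right) + np.abs(2 * center - up - down)

def superpixel_sml(image, labels, step=1):
    """Sum the modified Laplacian over the pixels of each superpixel label."""
    ml = modified_laplacian(image, step)
    # bincount with weights plays the role of the indicator-function sum.
    return np.bincount(labels.ravel(), weights=ml.ravel())
```

Since this is a plain sum, larger superpixels accumulate larger values; LSC's roughly uniform region size keeps the values comparable across regions.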

C. GENETIC ALGORITHM AND DECISION MAP
In this section, a weighted strategy and an optimization method are used to generate the decision map.
For each superpixel at the same location in two source medical images, the weighted strategy merges the texture and contrast features into a value named hybrid feature(HF). And we select superpixels with larger HF values to generate the decision map. The detail is shown in Eq.(13)- (14).
$$HF(x, y) = \beta \cdot F_{uv}(x, y) + (1 - \beta) \cdot SSML(x, y) \tag{13}$$

$$DM(x, y) = \begin{cases} 1, & \text{if } HF(x, y)_{I_1} \geq HF(x, y)_{I_2} \\ 0, & \text{if } HF(x, y)_{I_1} < HF(x, y)_{I_2} \end{cases} \tag{14}$$

Here, (x, y) denotes the superpixel whose center is located at (x, y); HF is the hybrid feature obtained as the weighted average of the texture feature $F_{uv}$ and the contrast feature SSML; and β is the factor that controls the weight of the features. $HF(x, y)_{I_1}$ and $HF(x, y)_{I_2}$ represent the HF values of the superpixel (x, y) in the source medical images $I_1$ and $I_2$, respectively, and DM is the decision map.
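Eqs. (13) and (14) translate almost directly into array operations. The following is a minimal sketch that assumes the per-superpixel texture and contrast values are stored as 1-D arrays indexed by superpixel label; the function names are hypothetical, and no rescaling of the two features is applied because the text does not state one.

```python
import numpy as np

def hybrid_feature(texture, contrast, beta):
    """Eq. (13): weighted combination of texture and contrast per superpixel."""
    return beta * np.asarray(texture, float) + (1.0 - beta) * np.asarray(contrast, float)

def decision_map(tex1, con1, tex2, con2, labels, beta):
    """Eq. (14): 1 where image 1 wins the region competition, else 0."""
    hf1 = hybrid_feature(tex1, con1, beta)
    hf2 = hybrid_feature(tex2, con2, beta)
    region_choice = (hf1 >= hf2).astype(np.uint8)  # one decision per superpixel
    return region_choice[labels]                   # expand to a per-pixel map
```

Indexing `region_choice` with the label image expands the per-region decision to every pixel of that region, so the resulting map has the same shape as the source images.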
In the proposed fusion framework, other features of the medical images that we consider important can be embedded in the fusion process flexibly. In this case, Eq.(13) can be rewritten as Eq. (15).
Here, F i (x, y) represents the value of feature i of the region centered at (x, y); β i is the corresponding weight of feature i; HF(x, y) means the weighted sum of all features in the region centered at (x, y). Similarly, all weights can be optimized by genetic algorithm. However, more features involved in fusion also pose problems. On the one hand, genetic algorithms need to spend more time to find the optimal solution for all the feature weights; on the other hand, too many features complicate the process of generating a decision map and increase the probability of region misselection. In this paper, appropriate quantities of image features are involved in the generation of the decision map. In this way, important information of medical images can be detected and maintained in the fusion result while low computational complexity is achieved in the algorithm.
In the weighted strategy described by Eq. (13), the weighting factor β determines whether the algorithm can select the superpixel blocks needed for fusion. Since different medical images have different salient features, it is necessary to find the value of β that maximizes the HF value of the superpixel blocks carrying important information. To address this problem, we adopt the genetic algorithm (GA), an adaptive probabilistic search technique based on the mechanisms of natural selection and natural genetics [52]. A traditional GA mainly consists of the following elements: chromosomes, population, generations, crossover probability, mutation probability, and fitness function. Chromosomes are the individuals in the population, which represent candidate solutions to the problem at hand. In each generation, chromosomes are first selected by a well-designed fitness function, and the selected chromosomes then undergo crossover and mutation with certain probabilities to form the next generation. Throughout the evolutionary process, the design of the fitness function plays an important role in obtaining the optimal solution. In this paper, the fitness function designed for the fusion problem is given by Eq. (16).
Here, the root mean square error(RMSE) is used to measure the similarity between the decision map and the source image, i is the individual within the population in each generation, N denotes the size of the image. The smaller the fitness function is, the more similar the decision map is to the source image, which means more information about the source image is contained in the decision map.
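The search for β can be sketched with a very small real-coded GA. This is an illustrative sketch rather than the authors' implementation: the population size, number of generations, and the crossover and mutation probabilities default to the values listed in Algorithm 1, but the tournament selection, arithmetic crossover, Gaussian mutation, and elitism used here are assumptions, and the fitness is passed in as a callable (lower is better) because Eq. (16) is not reproduced here. The helper `rmse` only shows the error measure named in the text.

```python
import numpy as np

def rmse(a, b):
    """Root mean square error between two arrays of equal shape."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.sqrt(np.mean((a - b) ** 2)))

def ga_minimize(fitness, generations=10, pop_size=10, p_c=0.5, p_m=0.1, seed=0):
    """Search beta in [0, 1] that minimizes `fitness` (hypothetical GA variant)."""
    rng = np.random.default_rng(seed)
    pop = rng.random(pop_size)  # each individual is one candidate beta
    best_beta, best_fit = None, np.inf
    for _ in range(generations):
        scores = np.array([fitness(b) for b in pop])
        i = int(np.argmin(scores))
        if scores[i] < best_fit:
            best_beta, best_fit = float(pop[i]), float(scores[i])
        # Binary tournament selection: the fitter of two random individuals survives.
        idx_a = rng.integers(0, pop_size, pop_size)
        idx_b = rng.integers(0, pop_size, pop_size)
        parents = np.where(scores[idx_a] <= scores[idx_b], pop[idx_a], pop[idx_b])
        # Arithmetic crossover between consecutive parents.
        children = parents.copy()
        for k in range(0, pop_size - 1, 2):
            if rng.random() < p_c:
                w = rng.random()
                children[k] = w * parents[k] + (1 - w) * parents[k + 1]
                children[k + 1] = w * parents[k + 1] + (1 - w) * parents[k]
        # Gaussian mutation, then clip back into the valid range.
        mutate = rng.random(pop_size) < p_m
        children[mutate] += rng.normal(0.0, 0.1, int(mutate.sum()))
        pop = np.clip(children, 0.0, 1.0)
        pop[0] = best_beta  # elitism: keep the best individual seen so far
    return best_beta, best_fit
```

In the fusion setting, `fitness` would build the decision map for a given β and return the RMSE-based score; a toy call such as `ga_minimize(lambda b: (b - 0.3) ** 2)` is enough to exercise the loop.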
Through the optimization performed by the genetic algorithm, the best weighting factor can be determined, so that Eq. (13) measures the importance of each region in the medical image as accurately as possible. Finally, the fused image is obtained from the decision map by Eq. (17).
where F represents the final fused image, DM is the decision map obtained by Eq. (14), and $I_1$ and $I_2$ denote the source images. Applying a genetic algorithm to optimize the weighting factor not only avoids the tedious and inaccurate manual setting of the weighting factor, but also further improves the accuracy of medical image fusion. More details of the proposed multimodal medical image fusion method are given in Algorithm 1.
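The final selection step can be sketched as below, assuming that Eq. (17) takes each pixel from $I_1$ where the decision map is 1 and from $I_2$ where it is 0, which is the usual form for a binary decision map. The function name `fuse` is hypothetical.

```python
import numpy as np

def fuse(dm, img1, img2):
    """Pick pixels from img1 where dm == 1 and from img2 elsewhere."""
    dm = np.asarray(dm, dtype=bool)
    img1, img2 = np.asarray(img1), np.asarray(img2)
    if img1.ndim == 3:
        dm = dm[..., None]  # broadcast the map over color channels
    return np.where(dm, img1, img2)
```

The same call works for grayscale pairs such as CT-MRI and for color pairs such as MRI-SPECT, since the map is broadcast across channels.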

Algorithm 1 The Proposed Medical Image Fusion Method
Input: a pair of source medical images I 1 and I 2
Output: The fused image F
1: Weighted averaging two source images to get an average map
2: Using linear spectral clustering to segment the average image and generate superpixels
3: Set generation G = 10, individual number N = 10, crossover probability P c = 0.5, mutation probability P m = 0.1
4: for generation g = 1 : G do
5:   for each superpixel number k = 1 : K do
6:     Calculate the texture feature and contrast feature with Eqs. (9) and (12)
7:     Calculate the HF value of the superpixels and compare to obtain decision map according to Eq. (14)
8:   end for
9:   Calculate the fitness F(i) for each individual in the population according to Eq. (16)
10:  for individual number n = 1 : N do
11:    Perform crossover operation on two individuals with crossover probability P c
12:    Perform mutation operation on two individuals with mutation probability P m
13:  end for
14:  Updating individual populations
15: end for
16: Fuse source medical images using decision map.

IV. EXPERIMENTAL RESULTS AND ANALYSIS
In this section, the experimental setup is introduced first. To verify its performance, the proposed method is then compared with seven state-of-the-art medical image fusion methods on eight pairs of multimodal medical images.

A. DATASET AND SETUPS OF EXPERIMENTS
Medical images from different modalities, including computerized tomography (CT), magnetic resonance imaging (MRI), positron emission tomography (PET), and single-photon emission computed tomography (SPECT), are used for the diagnosis of various diseases. In the experiments, eight pairs of multimodal medical images are used: three pairs of CT-MRI images, two pairs of MRI-SPECT images, and three pairs of MRI-PET images (see Fig. 3). The proposed method is compared with seven other state-of-the-art medical image fusion methods: LP-SR [53], CNN [47], CFL [54], NSST-PAPCNN [17], NSCT-PC-LLE [55], LRD [7], and NSCT-SR [56]. The resolution of all medical images in the experiments was set to 256 × 256, and all experiments were conducted in Matlab 2018a.

B. FUSION RESULTS
In the experiments, both the subjective quality and the objective metrics of the fusion results were evaluated.

Fig. 4 shows the fusion results of the three sets of CT and MRI images. The fusion results of LP-SR and NSCT-SR lose a large amount of energy information, which leads to a serious reduction in the contrast and intensity of many regions (see (a1), (a3), (g1), and (g3) in Fig. 4). The CFL and NSST-PAPCNN methods preserve the image energy well, but some redundant information from the CT image is not completely removed and some artifacts are produced (see (b2) and (c2) in Fig. 4). The CNN method and the proposed method perform better in this experiment: the edge details and salient features of the source medical images are well preserved in the fused images of both methods. Compared with the CNN method, however, the fusion results of the proposed method are more refined in detail (see (b1), (b2), (b3), (h1), (h2), and (h3) in Fig. 4).

Fig. 5 shows the fusion results of the two sets of MRI and SPECT images. In contrast to their results on CT and MRI images, LP-SR and NSCT-SR clearly fail to maintain good performance in color medical image fusion. Color artifacts appear in the fusion results of these two methods, and a large green area appears in the fused images (see (a1), (a2), (g1), and (g2) in Fig. 5). The LRD method is better at preserving the original colors of the images, but its fusion results are blurred at the boundaries and in many details. In addition, some information from the SPECT images is weakened or even lost after fusion (see Fig. 5 (f1) and (f2)). The fusion results of NSST-PAPCNN and NSCT-PC-LLE preserve most details of the source images, but energy appears to be lost and the contrast of the fused images decreases (see Fig. 5 (d1), (d2), (e1), and (e2)). The CNN and CFL methods generally perform well, but the fusion results of CNN still suffer from slight color distortion (see Fig. 5 (b1) and (b2)), and the fusion results of CFL lose some detailed information from the SPECT images (see Fig. 5 (c1) and (c2)). The proposed method performs well in retaining the brightness and contrast of the source medical images. In addition, the clarity of the tissue texture and the edges of the medical images are well preserved in the fusion results (see Fig. 5 (h1) and (h2)). However, the integration of important information is still not perfect.

Fig. 6 shows the fusion results of the three sets of MRI and PET images. The fusion results obtained by the LP-SR and NSCT-SR methods show a certain degree of distortion, and the contrast of the fused images is much lower because of the loss of energy. The white area of bone in the middle of the MRI image has chaotic colors (see Fig. 6 (a1) and (g1)). The other groups of images fused by the LP-SR and NSCT-SR methods have a similar issue (see Fig. 6 (a2), (a3), (g2), and (g3)). Such a situation is not conducive to clinical diagnosis. The CNN method suffers from slight contrast distortion. In addition, most of the outer boundary of the organ is lost in the fused image (see Fig. 6 (b1) and (b2)). The NSCT-PC-LLE and LRD methods perform well in preserving colors and edges, but the contrast of the LRD fusion results is relatively low compared with that of the NSCT-PC-LLE method, and the results of NSCT-PC-LLE also lose clarity to some extent (see Fig. 6 (e1), (e2), (e3), (f1), (f2), and (f3)). The CFL and NSST-PAPCNN methods perform better in this respect, but color distortion still exists in the fusion results of NSST-PAPCNN (see Fig. 6 (d1) and (d2)). The proposed method retains detailed information well, but the brightness is slightly enhanced (see Fig. 6 (h1), (h2), and (h3)).

1) QUALITATIVE COMPARISON
To further verify the performance of the proposed region-based fusion method, two representative examples (CT-MRI and MRI-SPECT) chosen from the eight groups of fusion images are shown in greater detail in Fig. 7. As can be seen from the enlarged regions with red borders, in group 1 (Fig. 7 (a1)-(h1)), the fusion results of LP-SR, CNN, CFL, NSST-PAPCNN, and LRD suffer from blurring and adhesion in the gap between the white regions. The fusion results of NSCT-PC-LLE and NSCT-SR are clear at the gap, but not dense enough. The proposed method reproduces the gap exactly, and the result is clear and compact. In group 2 (Fig. 7 (a2)-(h2)), the fusion results of the LP-SR, LRD, and NSCT-SR methods show different degrees of color artifacts. The CNN, CFL, NSST-PAPCNN, and NSCT-PC-LLE methods obtain good fusion results, and the proposed method enhances the boundary information.
After the above comparative analysis of subjective visual effects, we can see that the proposed method performs satisfactorily in all types of multimodal medical image fusion. It is also evident that the proposed method can precisely retain the important salient features of the source images without producing abnormal details.

2) QUANTITATIVE COMPARISON
In this paper, four objective metrics are applied to evaluate the fusion performance of the different methods: average gradient (AG), spatial frequency (SF), mutual information (MI), and the Petrovic metric Q^{AB/F}.
a) Average gradient (AG) [57] measures the grayscale differences near the edges and fine details of an image, and thus indicates how sharp the image is. Generally, the larger the value of AG, the better the fusion result.
b) Spatial frequency (SF) [58] measures the overall activity level of an image. The larger the value of SF, the richer the information contained in the fused image and the better the fusion method has performed.
c) Mutual information (MI) [59] measures the dependence between the source images and the fused image, and indicates how much information they share. The larger the value of MI, the more information about the source images the fused image contains.
[...] achieve the best values of MI; the only value that is not the best is very close to the best one (only 0.0558 smaller), and the average value of MI over the eight groups is still the largest. Table 4 shows the Q^{AB/F} values of the different fusion methods. It can be seen that the best score of the proposed method is achieved on the CT-MRI image set.
Because Q^{AB/F} is designed for evaluating pixel-level image fusion methods, it is less suitable for region-based methods, and thus not all of our fusion results score well. Nonetheless, the average Q^{AB/F} score of our method is still the best among all methods. Therefore, the four objective quantitative evaluations show that the proposed method achieves state-of-the-art performance and is even more effective than the mainstream fusion methods. This is attributed to the accurate segmentation of the LSC algorithm as well as the effective optimization of the decision map.
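The first three metrics can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the exact normalizations used in [57]-[59] are not reproduced in this section, so the forward-difference form of AG, the root-mean-square form of SF, and the histogram-based MI (in bits, for 8-bit images) below are common textbook variants, and all function names are hypothetical.

```python
import numpy as np

def average_gradient(img):
    """AG: mean of sqrt((gx^2 + gy^2) / 2) using forward differences."""
    img = np.asarray(img, dtype=np.float64)
    gx = img[:-1, 1:] - img[:-1, :-1]   # horizontal difference
    gy = img[1:, :-1] - img[:-1, :-1]   # vertical difference
    return float(np.mean(np.sqrt((gx ** 2 + gy ** 2) / 2.0)))

def spatial_frequency(img):
    """SF: sqrt(RF^2 + CF^2), with row/column frequencies as RMS differences."""
    img = np.asarray(img, dtype=np.float64)
    rf2 = np.mean((img[:, 1:] - img[:, :-1]) ** 2)  # row frequency squared
    cf2 = np.mean((img[1:, :] - img[:-1, :]) ** 2)  # column frequency squared
    return float(np.sqrt(rf2 + cf2))

def mutual_information(a, b, bins=256):
    """MI in bits between two equally sized 8-bit images (joint histogram)."""
    hist, _, _ = np.histogram2d(np.ravel(a), np.ravel(b), bins=bins,
                                range=[[0, 256], [0, 256]])
    pxy = hist / hist.sum()                 # joint distribution
    px = pxy.sum(axis=1, keepdims=True)     # marginal of a, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)     # marginal of b, shape (1, bins)
    nz = pxy > 0                            # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])))

def fusion_mi(src_a, src_b, fused, bins=256):
    """Fusion MI: information the fused image shares with each source, summed."""
    return (mutual_information(src_a, fused, bins)
            + mutual_information(src_b, fused, bins))
```

For all three functions a larger return value corresponds to the "better" direction described above; a constant image, for example, yields AG = SF = 0.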

3) TIME COMPLEXITY ANALYSIS
In this section, the computational efficiency of the different fusion methods is compared on the image sets "CT-MRI1", "MRI-SPECT1", and "MRI-PET1". All tests were run on a computer with a 1.99 GHz CPU and 8 GB of RAM. The running times are compared in Table 5, where the average value is the average running time of all fusion pairs in the experiment. As shown in Table 5, the computational cost of the LRD method is the highest, and CNN and NSCT-SR are also inefficient. The LP-SR method is the most efficient because the Laplacian pyramid it uses to decompose the source images is computationally cheap. In our method, the genetic algorithm is used to obtain better fusion results, but it reduces efficiency because it accounts for most of the running time. On the other hand, the linear computational complexity of the LSC algorithm partly offsets this cost.

C. DISCUSSIONS
In the proposed region-based medical image fusion method, the LSC algorithm is utilized to segment the source medical images. To verify that the overall performance of LSC is superior to that of other segmentation algorithms, a comparative experiment is conducted on three classical superpixel segmentation algorithms (Ncuts, Turbo, and SLIC) and LSC. CT and PET images are used for segmentation. Fig. 8 shows the segmentation results of the four superpixel methods. It can be seen that all methods produce uniformly sized superpixels, but Turbo and SLIC tend to produce superpixels that contain pixels of different colors (see Fig. 8 (b1), (b2), (c1), and (c2)), whereas the superpixels produced by Ncuts and LSC are homogeneous (see Fig. 8 (a1), (a2), (d1), and (d2)). In terms of objective criteria, edge intensity (EI) is used to measure the edge compactness of the segmentation results, and the running time of all segmentation algorithms is also recorded. As shown in Table 6 and Table 7, Ncuts produces the best superpixel edge compactness, and LSC is in second place. However, the computational cost of Ncuts is much higher than that of LSC (almost 10^2 times). SLIC has the shortest running time, but its segmentation quality is not as good as that of LSC. In summary, although LSC is not the best in every metric, it achieves the best trade-off among homogeneous regions, excellent structures, and low computational complexity.
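The visual notion of homogeneity used above can also be expressed numerically. The sketch below is not a metric from this paper: it assumes a grayscale image and a non-negative integer label map (one label per superpixel, as any of the four algorithms would produce) and reports the mean intensity variance inside the superpixels, so that lower values indicate more homogeneous regions. The function name `mean_region_variance` is hypothetical.

```python
import numpy as np

def mean_region_variance(image, labels):
    """Average, over superpixels, of the intensity variance inside each one.

    Assumption of this sketch: `labels` holds non-negative integers,
    one label per superpixel, with the same shape as `image`.
    """
    values = np.asarray(image, dtype=np.float64).ravel()
    labels = np.asarray(labels).ravel()
    n = int(labels.max()) + 1
    count = np.bincount(labels, minlength=n)                 # pixels per region
    safe = np.maximum(count, 1)                              # avoid division by zero
    mean = np.bincount(labels, weights=values, minlength=n) / safe
    mean_sq = np.bincount(labels, weights=values ** 2, minlength=n) / safe
    var = mean_sq - mean ** 2                                # E[x^2] - E[x]^2
    return float(var[count > 0].mean())                      # ignore unused labels
```

A label map whose boundaries follow the intensity edges gives a value of 0 on a piecewise-constant image, while a map that cuts across the edges gives a strictly positive value.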
For the proposed method to achieve the best fusion performance, appropriate settings of the superpixel number in LSC and the population size in GA are of great importance. For example, fewer superpixels mean larger regions, which increases the probability that a single region contains both required and unrequired information, and thereby the probability of region misselection in the fusion process. For the GA, a larger population size increases the likelihood that the algorithm converges to the optimal weights. Therefore, to further investigate the effects of these two parameters on image fusion, we conducted experiments on the eight pairs of medical images with different numbers of superpixels and different population sizes. The results are shown in Fig. 9 and Fig. 10.
In Fig. 9, the values of AG, SF, and MI initially increase as the number of superpixels grows, and the optimal number of superpixels is between 300 and 500. However, a higher number of superpixels does not always give a better fusion result: when the number of superpixels reaches 800 to 1000, the values of all three evaluation metrics decrease, partly because overly small regions destroy the structure of the information. Besides, too many regions lower the fusion efficiency, since the running time of fusion increases linearly with the number of superpixels. Fig. 10 shows the relationship between the values of AG, SF, and MI and the population size. Most of the fusion results improve as the population size increases, and the fusion result remains stable at most population sizes; the larger the population size, the greater the probability of converging to a better solution. However, the running time also increases with the population size. Therefore, population sizes between 10 and 20 are optimal.
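The interaction between the region-level decision map and the GA-tuned weighting factor can be sketched as follows. Everything specific here is an assumption made for illustration, not the authors' implementation: the feature maps stand in for precomputed, comparably scaled LGF (texture) and SML (contrast) responses, the region score is taken as w * texture + (1 - w) * contrast, and the GA uses truncation selection, arithmetic crossover, Gaussian mutation, and elitism. The fitness function is left to the caller because this section does not specify it.

```python
import numpy as np

def region_sums(feature, labels, n_regions):
    """Sum a per-pixel feature map inside each superpixel."""
    return np.bincount(labels.ravel(), weights=np.ravel(feature),
                       minlength=n_regions)

def fuse(src_a, src_b, feats_a, feats_b, labels, w):
    """Per superpixel, keep the source with the larger weighted feature score.

    feats_x = (texture_map, contrast_map); the score form
    w * texture + (1 - w) * contrast is an assumption of this sketch.
    """
    labels = np.asarray(labels)
    n = int(labels.max()) + 1
    score_a = (w * region_sums(feats_a[0], labels, n)
               + (1 - w) * region_sums(feats_a[1], labels, n))
    score_b = (w * region_sums(feats_b[0], labels, n)
               + (1 - w) * region_sums(feats_b[1], labels, n))
    decision = (score_a >= score_b)[labels]   # per-pixel boolean decision map
    return np.where(decision, src_a, src_b)

def ga_weight(fitness, pop_size=10, generations=20, sigma=0.1, seed=0):
    """Search w in [0, 1] that maximizes fitness(w) with a small real-coded GA."""
    rng = np.random.default_rng(seed)
    pop = rng.random(pop_size)                           # initial weights
    for _ in range(generations):
        fit = np.array([fitness(w) for w in pop])
        order = np.argsort(fit)[::-1]                    # best first
        parents = pop[order[:max(2, pop_size // 2)]]     # truncation selection
        i = rng.integers(0, len(parents), size=pop_size)
        j = rng.integers(0, len(parents), size=pop_size)
        alpha = rng.random(pop_size)
        children = alpha * parents[i] + (1 - alpha) * parents[j]  # crossover
        children += rng.normal(0.0, sigma, pop_size)     # Gaussian mutation
        children = np.clip(children, 0.0, 1.0)
        children[0] = parents[0]                         # elitism: keep the best
        pop = children
    fit = np.array([fitness(w) for w in pop])
    return float(pop[np.argmax(fit)])
```

A caller would pass something like `fitness = lambda w: quality(fuse(a, b, fa, fb, labels, w))`, where `quality` is whichever image-quality score is chosen (hypothetical here). Each generation costs `pop_size` fusions, which is consistent with the running time growing with the population size as reported above.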

V. CONCLUSION
In this paper, a novel regional multimodal medical image fusion method based on superpixel segmentation and a post-processing optimization method is proposed. For multimodal medical images, more homogeneous regions can be obtained by the LSC superpixel algorithm. Based on these regions, the Log-Gabor filter and the sum modified Laplacian are adopted to extract the texture feature and the contrast feature, respectively. Subsequently, a post-processing method based on the genetic algorithm is proposed for adaptively adjusting the effects of these two features. By comparing the important information of each superpixel, the final decision map is generated and the fused image is then obtained. Experiments are conducted on eight groups of multimodal medical images. Compared with seven mainstream fusion methods, the proposed method achieves better performance in both visual effects and objective evaluation because the segmentation of the medical images is more accurate and the detailed information is well preserved in the fusion results. In the future, a more accurate segmentation algorithm based on deep learning and a faster post-processing optimization method will be considered to further improve the performance of multimodal medical image fusion.