A Multi-Angle Hierarchical Differential Evolution Approach for Multimodal Optimization Problems

Multimodal optimization problems (MMOPs) are among the most common problems in engineering practice and require multiple optimal solutions to be located simultaneously. An efficient algorithm for solving MMOPs should balance the diversity and convergence of the population so that as many global optimal solutions as possible can be located. However, most existing algorithms are easily trapped in local peaks and cannot provide high-quality solutions. To better deal with MMOPs, this paper considers both the solution quality angle and the evolution stage angle and proposes a multi-angle hierarchical differential evolution (MaHDE) algorithm. Firstly, a fitness hierarchical mutation (FHM) strategy is designed to balance the exploration and exploitation ability of different individuals. In the FHM strategy, the individuals are divided into two levels (i.e., low-level and high-level) according to their solution quality in the current niche. Then, different guiding strategies are applied to the low-level and high-level individuals. Secondly, a directed global search (DGS) strategy is introduced for the low-level individuals in the late evolution stage, which improves the population diversity and gives these low-level individuals the opportunity to re-search for the global peaks. Thirdly, an elite local search (ELS) strategy is designed for the high-level individuals in the late evolution stage to refine their solution accuracy. Extensive experiments are conducted to verify the performance of MaHDE on the widely used MMOP test functions, i.e., CEC'2013. Experimental results show that MaHDE generally outperforms the compared state-of-the-art multimodal algorithms.


I. INTRODUCTION
Multimodal optimization problems (MMOPs), as one kind of challenging and interesting optimization problem, have attracted increasing attention in recent years [1]-[4]. MMOPs are an important problem area because they arise in many real-world applications [5], such as virtual camera composition problems [6], metabolic network modeling problems [7], laser pulse shaping problems [8], job scheduling problems [9], [10], and neural network problems [11]. These MMOPs require the algorithm to locate as many global peaks as possible and to refine the accuracy of the found solutions as far as possible, so that high-quality decisions can finally be made. In detail, an MMOP is a complex optimization problem that requires the algorithm not only to locate multiple global peaks simultaneously, but also to achieve a certain accuracy of the solutions on the global peaks. In fact, an algorithm for MMOPs faces the problem of how to improve the population diversity to locate more peaks while accelerating the convergence on each of the found peaks.
The associate editor coordinating the review of this manuscript and approving it for publication was Pavlos I. Lazaridis.
Different from the single optimization problems (SOPs), MMOPs have many local peaks and multiple global peaks. Most existing evolutionary algorithms (EAs) have been successfully used in solving SOPs [12]-[18]. Drawing on the success of EAs in SOPs, many advanced EAs have been designed for solving MMOPs, such as differential evolution (DE) [19]-[28], particle swarm optimization (PSO) [29], [30], genetic algorithm (GA) [31]-[34], and ant colony optimization [35]. These algorithms have all achieved great success when dealing with MMOPs. Among them, DE variants have shown their effectiveness and superiority [19]-[28]. However, existing DE-based multimodal optimization algorithms still have some limitations. Firstly, the DE/rand and DE/best strategies are widely used in many algorithms, but each has a weakness. The DE/rand strategy decreases the convergence speed because of its random search directions, while the DE/best strategy provides homogeneous guidance, which makes the population easily get trapped in local peaks. Therefore, different strategies should be used for different individuals according to their fitness quality. Secondly, traditional DEs may struggle to balance exploration and exploitation because they usually use fixed evolutionary operators during the whole process. This cannot efficiently satisfy the search requirements of different evolution stages.
Therefore, to solve the above difficulties, this paper enhances the DE algorithm from two angles. The first angle considers the different fitness quality of the individuals so as to configure them with different mutation strategies that satisfy their search requirements. The second angle considers the different evolution stages so as to carry out different evolutionary operators that satisfy the search requirements of each stage. Firstly, a fitness hierarchical mutation (FHM) strategy is proposed that divides the individuals into different levels based on their fitness quality, so that they can use different mutation strategies and avoid the weakness of a single mutation strategy. Secondly, from the angle of the evolution stage, a directed global search (DGS) strategy and an elite local search (ELS) strategy are proposed for the late evolution stage to enhance the population diversity for locating more peaks and to refine the accuracy of the good solutions found, respectively. In this way, a multi-angle hierarchical DE (MaHDE) is proposed, whose main differences from the common method for solving MMOPs are shown in Fig. 1. The frameworks of the common method and of MaHDE are shown in Fig. 1(a) and Fig. 1(b), respectively. The novelties and advantages of the proposed MaHDE algorithm lie in the following three aspects.
1) The FHM strategy is proposed to balance the exploration and exploitation ability of different individuals. To achieve this, FHM divides the individuals into two levels (i.e., low-level and high-level) according to their fitness quality in the current niche. The low-level individuals are guided by the best individual of a niche to move towards its nearest peak. In addition, the high-level individuals are guided by themselves to maintain their superiority. Moreover, the neighbors' perturbation and the global perturbation are added to the evolution of low-level and high-level individuals, respectively, thus helping the algorithm to avoid local peaks.
2) The DGS strategy is designed for the low-level individuals in the late evolution stage to increase the population diversity and provide a new chance to re-search for new global peaks. By expanding the search range of the low-level individuals, the DGS strategy helps them jump out of local peaks and further improves the population diversity.
3) The ELS strategy is proposed to refine the solution accuracy and save fitness evaluations. Concretely, the ELS strategy is only performed on the high-level individuals in the current niche, and it works via a Gaussian perturbation because of its narrow sampling space. In this way, the ELS strategy not only refines the solution accuracy but also saves fitness evaluations.
The rest of this paper is organized as follows. Section II reviews the DE algorithm and the related works on MMOPs. Section III presents the proposed MaHDE algorithm. The experimental study is shown in Section IV. Finally, the conclusions are given in Section V.

II. RELATED WORKS

A. DE
DE was first proposed by Storn and Price [36]. It works through mutation, crossover, and selection operators to imitate the process of biological evolution. Owing to its simplicity and efficiency, DE has been adopted by many researchers to solve complex problems [25], [37]-[42].

1) MUTATION
DE starts from a randomly generated initial population. The difference vector of two randomly selected individuals (i.e., $x_{g\_r2}$ and $x_{g\_r3}$) is weighted and added to a third individual (i.e., $x_{g\_r1}$) to produce a mutation vector ($v_i$). This operation is called mutation. The commonly used mutation strategies are given in (1) and (2), which are named DE/rand and DE/best, respectively.

$$v_i = x_{g\_r1} + F \cdot (x_{g\_r2} - x_{g\_r3}) \tag{1}$$

$$v_i = x_{best} + F \cdot (x_{g\_r1} - x_{g\_r2}) \tag{2}$$

where $i = 1, 2, \ldots, NP$, and NP is the population size. $x_{g\_r1}$, $x_{g\_r2}$, and $x_{g\_r3}$ are randomly selected from the population, with $g\_r1 \neq g\_r2 \neq g\_r3 \neq i$. $x_{best}$ is the best individual in the current generation. F is often referred to as the scaling factor, which controls the weight of the difference vector.
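The two mutation strategies described above can be sketched as follows. This is a minimal illustration in plain Python, not the authors' implementation; the function names `de_rand_1` and `de_best_1` and the list-of-lists population layout are assumptions made for the example, and a maximization problem is assumed when picking the best individual.

```python
import random


def de_rand_1(pop, i, F, rng=random):
    """DE/rand: v_i = x_r1 + F * (x_r2 - x_r3), with r1, r2, r3 and i all distinct."""
    candidates = [k for k in range(len(pop)) if k != i]
    r1, r2, r3 = rng.sample(candidates, 3)  # three distinct indices, none equal to i
    return [a + F * (b - c) for a, b, c in zip(pop[r1], pop[r2], pop[r3])]


def de_best_1(pop, fitness, i, F, rng=random):
    """DE/best (maximization assumed): v_i = x_best + F * (x_r1 - x_r2)."""
    best = max(range(len(pop)), key=lambda k: fitness[k])
    candidates = [k for k in range(len(pop)) if k != i]
    r1, r2 = rng.sample(candidates, 2)  # two distinct indices, none equal to i
    return [a + F * (b - c) for a, b, c in zip(pop[best], pop[r1], pop[r2])]
```

Because DE/best always uses the same base vector, every mutant in a generation is pulled towards one point, which is the homogeneous guidance discussed in the introduction.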

2) CROSSOVER
The new individual $u_i$ is generated by applying a binomial crossover operation to the individuals $x_i$ and $v_i$ as

$$u_{ij} = \begin{cases} v_{ij}, & \text{if } rand_j(0, 1) \le CR \text{ or } j = j_{rand} \\ x_{ij}, & \text{otherwise} \end{cases} \tag{3}$$

where $j = 1, 2, \ldots, D$, and D is the dimension of the search space. $j_{rand}$ is a number randomly selected from $\{1, 2, \ldots, D\}$. $rand_j(0, 1)$ is a random number between 0 and 1 that is drawn independently for each dimension. CR is the crossover probability, and the condition $j = j_{rand}$ ensures that at least one dimension of $u_i$ comes from $v_i$.
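A minimal sketch of the binomial crossover described above is given below. The function name `binomial_crossover` is an assumption for illustration, and dimensions are indexed from 0 rather than 1.

```python
import random


def binomial_crossover(x, v, CR, rng=random):
    """Build the trial vector u from the target x and the mutant v."""
    D = len(x)
    j_rand = rng.randrange(D)  # this dimension is always taken from v
    return [
        v[j] if (rng.random() <= CR or j == j_rand) else x[j]
        for j in range(D)
    ]
```

With CR close to 0 the trial vector differs from $x_i$ in little more than the forced dimension, whereas CR = 1 copies the mutant entirely.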

3) SELECTION
If the fitness value of $u_i$ (i.e., $f(u_i)$) is better than that of $x_i$, then $x_i$ is replaced by $u_i$ in the next generation; otherwise, $x_i$ remains. This process is called selection and is given in (4) for a maximization problem.

$$x_i = \begin{cases} u_i, & \text{if } f(u_i) > f(x_i) \\ x_i, & \text{otherwise} \end{cases} \tag{4}$$

Recently, another efficient selection method has been widely used in many studies, in which $u_i$ is compared with its nearest individual $x_p$ in the parental generation. If $f(u_i)$ is better than $f(x_p)$, then $x_p$ is replaced by $u_i$, which enters the next generation. This selection process is given in (5) and is adopted in this paper.

$$x_p = \begin{cases} u_i, & \text{if } f(u_i) > f(x_p) \\ x_p, & \text{otherwise} \end{cases} \tag{5}$$
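The difference between the two selection rules can be sketched as follows for a maximization problem. The function names and the in-place update of `pop` and `fitness` are assumptions made for the example, and Euclidean distance is assumed for finding the nearest parent.

```python
import math


def greedy_selection(x, f_x, u, f_u):
    """Classic DE selection: the trial u replaces its own parent x if it is better."""
    return (u, f_u) if f_u > f_x else (x, f_x)


def nearest_parent_selection(pop, fitness, u, f_u):
    """Crowding-style selection: u competes with its nearest parent x_p.

    Updates pop and fitness in place and returns the index p of that parent.
    """
    p = min(range(len(pop)), key=lambda k: math.dist(pop[k], u))
    if f_u > fitness[p]:
        pop[p], fitness[p] = list(u), f_u
    return p
```

Competing with the nearest parent rather than the own parent means that a good trial vector near one peak cannot displace an individual sitting on a different peak, which helps to preserve several peaks at once.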

B. RELATED WORKS ON MMOPs
Many studies have addressed MMOPs. To review them, we divide the algorithms into three categories according to their base algorithm: DE-based, PSO-based, and GA-based algorithms for MMOPs.

1) DE FOR MMOPs
The crowding DE (CDE) [19] and speciation DE (SDE) [20] are two typical methods used to solve MMOPs. CDE works by comparing the fitness of $u_i$ (generated by crossover) with that of the nearest individual in the parental generation. The better individual is then selected to enter the next generation. SDE locates multiple peaks by dividing the population into multiple independent subpopulations. However, both methods need extra parameters (i.e., the crowding size in CDE and the speciation radius in SDE), to which the performance of the algorithm is sensitive. To reduce the influence of parameters on the algorithm, Qu et al. [21] designed a neighborhood mutation strategy that borrows the neighborhood information, resulting in a neighborhood CDE (NCDE) and a neighborhood SDE (NSDE). Subsequently, Gao et al. [22] utilized a clustering technique to divide the population into several subpopulations and combined it with a self-adaptive parameter control technique to deal with MMOPs, resulting in Self-CCDE and Self-CSDE. By designing new mutation strategies, Biswas et al. [23] proposed an information sharing mechanism based on CDE and SDE, termed LoICDE and LoISDE. Meanwhile, a parent-centric normalized mutation strategy was also designed by Biswas et al. [24], resulting in PNPCDE.

2) PSO FOR MMOPs
To address the poor local search ability of PSO-based niching algorithms, Li [29] designed ring neighborhood topologies based on PSO, termed r2pso and r3pso. Subsequently, a distance-based locally informed particle swarm (LIPS) optimizer was proposed by Qu et al. [30], which forms stable niches by using the neighborhood information and thereby addresses the sensitivity to niching parameters in MMOPs. Parsopoulos et al. [43] proposed a sequential PSO niching technique that uses objective function stretching to solve MMOPs. Brits et al. [44] adopted an initial swarm to produce multiple sub-swarms by monitoring the fitness of particles. The number of particles in a sub-swarm can change by absorbing particles from the main swarm. Ren et al. [45] proposed a scatter learning PSO algorithm for MMOPs, which constructs an exemplar pool by collecting the high-quality solutions in the solution space.

3) GA FOR MMOPs
Many studies have used GA to solve MMOPs, as reviewed in [46]. Li et al. [31] divided the population into several species according to their similarity and formed a species conservation technique for evolving parallel subpopulations to solve MMOPs. Gan and Warwick [32] proposed a fitness sharing approach based on a dynamic niche clustering technique to solve MMOPs. Petrowski [33] proposed a method that shares the resources within subpopulations of individuals characterized by some similarities, and combined the best individuals of each subpopulation with GA to solve MMOPs. Bandaru and Deb [47] combined the power of dominance with traditional variable-space niching to solve MMOPs; the method is implemented within the NSGA-II framework and termed ANSGAII.
Recently, Lin et al. [48] designed a novel algorithm that focuses on the formulation, balance, and keypoint of species to balance exploration and exploitation in generating offspring. Wang et al. [49] proposed an automatic niching DE, in which affinity propagation clustering, which does not need sensitive clustering parameters, is used to divide the population. Zhao et al. [50] borrowed the idea of the local binary operator in image processing to help form the niches efficiently. Chen et al. [51] designed a distributed individual DE that treats each individual as a distributed niche to track peaks. These algorithms all use DE as the base algorithm to solve MMOPs and achieve great success. Therefore, this paper proposes a new algorithm based on DE, which designs different strategies to accommodate individuals of different fitness levels and the search requirements of different evolution stages for efficiently solving MMOPs.

III. MaHDE
In this section, the motivations of MaHDE are first introduced. Then, the FHM strategy is proposed to balance the exploration and exploitation ability of different individuals. Furthermore, the DGS strategy is proposed for the late evolution stage to increase the population diversity and provide a new opportunity for the low-level individuals to re-search for the global peaks. In addition, the ELS strategy is introduced to refine the solution accuracy of the high-level individuals. After that, the complete MaHDE algorithm is given. Finally, the complexity of MaHDE is analyzed.

A. MOTIVATIONS
Since MMOPs have multiple peaks, an efficient approach is to partition the population into several overlapping or independent niches (subpopulations), each of which aims to locate a peak. In this paper, overlapping niches are adopted to locate more peaks. How the information about the individuals in a niche (e.g., their fitness and the distances between them) is used has a key effect on the performance of the algorithm. From this perspective, there are several interesting observations. Fig. 2 shows different distributions of individuals in a niche. For ease of description, the red dotted circle represents a niche, the solid circles (e.g., A, B, and C) represent the individuals in the niche, and the red solid circle (e.g., A) represents the current individual. Fig. 2(a) illustrates whether A should be guided by B or C. By calculating the vertical distances from B and C to A (i.e., $BO_1$ and $CO_2$), we find that $CO_2$ is larger than $BO_1$. Moreover, comparing the fitness of B and C shows that f(C) is better than f(B). This shows that C is the better individual within the niche. Therefore, A is guided by C to pursue the global peaks quickly. Similarly, when A, B, and C are on the same peak, as in Fig. 2(b), the vertical distances from B and C to A (i.e., $BO_1$ and $CO_2$) can be calculated, and $CO_2$ is again larger than $BO_1$. Moreover, f(C) is better than f(B). Therefore, A is also guided by C in this situation to improve the search ability. Fig. 2(c) shows the case in which A and C are on the same peak, but A and B are on different peaks. From Fig. 2(c), we can find that $CO_2$ is larger than $BO_1$ and f(C) is better than f(B). Therefore, A is also guided by C in this situation to avoid local peaks.
From the above three situations, we can find that when f(A) is lower than the mean fitness ($\bar{f}$) of the current niche, using the better individuals (i.e., those with the larger vertical distance) to guide A helps to locate the global peaks quickly and to avoid local peaks. Motivated by these ideas, the individuals of a niche can be divided into two groups according to their fitness. Here, $\bar{f}$ is used as the boundary: the individuals worse than $\bar{f}$ are marked as low-level individuals, and the individuals better than $\bar{f}$ are marked as high-level individuals. The best individual of the niche is used as a leader to guide the low-level individuals, which makes them move in a promising direction and locate the nearest peak. In contrast, the high-level individuals are guided by themselves to maintain their superiority in the niche. Besides, the neighbors' perturbation and the global perturbation are added to the evolution of the low-level and high-level individuals, respectively, to avoid local peaks. Therefore, an FHM strategy is proposed to make the low-level individuals move towards the nearest peak and make the high-level individuals maintain the global peaks.
Besides, as the population evolves, most of the individuals converge to local or global peaks, leading to a loss of population diversity. Moreover, in the late evolution stage, the low-level individuals may be trapped in local peaks because of their poor fitness, while the high-level individuals may be close to the global peaks. Therefore, we perform the DGS strategy on the low-level individuals to improve the population diversity and give these individuals the chance to re-search for the global peaks. Besides, the ELS strategy is performed on the high-level individuals to refine their accuracy.
To sum up, the FHM strategy is proposed based on the fitness hierarchical technique, and the DGS strategy and the ELS strategy are performed on the low-level individuals and the high-level individuals in the late evolution stage, respectively. Therefore, a multi-angle (i.e., fitness quality and evolution stage) hierarchical differential evolution (MaHDE) is proposed for MMOPs in this paper.

B. FHM STRATEGY
Generally speaking, the different individuals of a niche all adopt a single mutation strategy (e.g., DE/rand or DE/best). However, the DE/rand strategy decreases the convergence speed because of its random search directions, and the DE/best strategy is easily trapped in local peaks because of its homogeneous guidance direction. To avoid this situation, the FHM strategy guides the evolution of the individuals adaptively by dividing the individuals of a niche into two levels (i.e., low-level and high-level) according to their fitness quality. On one hand, the mean fitness ($\bar{f}_1$) of the niche formed by the current individual $x_i$ is calculated to measure the level of the individual. If the fitness of $x_i$ (i.e., $f(x_i)$) is worse than or equal to $\bar{f}_1$ (i.e., $f(x_i) \le \bar{f}_1$), $x_i$ is marked as a low-level individual. To accelerate the convergence speed, the best individual in the niche ($x_{nbest}$) is used to guide $x_i$ to quickly locate the global optimum nearest to $x_i$. Besides, the neighbors' perturbation is added to the evolutionary process to avoid local peaks. On the other hand, if $f(x_i)$ is better than $\bar{f}_1$ (i.e., $f(x_i) > \bar{f}_1$), $x_i$ is a promising solution in this niche and is marked as a high-level individual. To maintain the superiority of the high-level individual in the niche, the individual itself is used as the base for the mutation. Besides, the global perturbation is added to the evolution to avoid local peaks. By combining these two-level mutation strategies, the FHM strategy can not only accelerate the convergence speed but also maintain the found solutions. The detailed process of the FHM strategy is as follows.
i) Find the K individuals nearest to the current individual $x_i$ to form the niche of $x_i$ and store them in S.
ii) Calculate the mean fitness ($\bar{f}_1$) of these K individuals (not including $x_i$ itself) as

$$\bar{f}_1 = \frac{1}{K} \sum_{x_k \in S} f(x_k) \tag{6}$$

iii) Select the corresponding mutation strategy according to the level of $x_i$ as in (7),
where $x_{nbest}$ is the best individual in the current niche. $x_{n\_r1}$ and $x_{n\_r2}$ are two individuals randomly selected from S, with $n\_r1 \neq n\_r2$. Besides, $x_{g\_r1}$ and $x_{g\_r2}$ are two individuals randomly selected from the population excluding S, with $g\_r1 \neq g\_r2 \neq i$ and $x_{g\_r1}, x_{g\_r2} \notin S$.
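The three steps above can be sketched as follows. The displayed form of (7) is not reproduced in this text, so the sketch assumes the form implied by the description: a low-level individual uses $x_{nbest}$ as the base with a difference of two niche members, and a high-level individual uses itself as the base with a difference of two individuals outside the niche. The function names `form_niche` and `fhm_mutation` are hypothetical, a maximization problem is assumed, and the sketch requires K >= 2 and at least two individuals outside the niche.

```python
import math
import random


def form_niche(pop, i, K):
    """Step i): indices of the K individuals nearest to x_i (Euclidean), excluding i."""
    others = [k for k in range(len(pop)) if k != i]
    others.sort(key=lambda k: math.dist(pop[k], pop[i]))
    return others[:K]


def fhm_mutation(pop, fitness, i, K, F, rng=random):
    """Steps ii) and iii): level-dependent mutation (assumed form of (7))."""
    S = form_niche(pop, i, K)
    mean_f1 = sum(fitness[k] for k in S) / len(S)  # mean fitness of the niche, as in (6)
    if fitness[i] <= mean_f1:
        # Low-level: guided by the niche best, perturbed by two niche members.
        base = pop[max(S, key=lambda k: fitness[k])]
        r1, r2 = rng.sample(S, 2)
    else:
        # High-level: guided by itself, perturbed by two individuals outside the niche.
        base = pop[i]
        outside = [k for k in range(len(pop)) if k != i and k not in S]
        r1, r2 = rng.sample(outside, 2)
    return [b + F * (p - q) for b, p, q in zip(base, pop[r1], pop[r2])]
```

Sorting all other individuals by distance costs more than a partial selection of the K nearest, but it keeps the sketch short; the distance computation itself is what dominates the per-generation cost.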

C. DGS STRATEGY
As the population evolves, most of the individuals gradually converge to local or global peaks, leading to a loss of population diversity in the late evolution stage. Because of this poor diversity, some peaks may not be located. It is therefore necessary to increase the population diversity in the late evolution stage. In the late evolution stage of MaHDE, we use two strategies, DGS and ELS, for the low-level individuals and the high-level individuals, respectively. Similar to the FHM strategy, we use the mean fitness of all the individuals of the niche to divide the individuals.
It should be noted that, because DGS/ELS is carried out after the FHM, crossover, and selection, as shown in Fig. 1(b), the fitness values of the individuals in the niche have changed, so the mean fitness is re-calculated and denoted as $\bar{f}_2$.
Herein, the DGS is described for an individual $x_i$ whose fitness is worse than $\bar{f}_2$ (i.e., $f(x_i) < \bar{f}_2$). Such an individual is regarded as a low-level individual, which has not located a global peak because of its poor fitness quality in the late evolution stage. Moreover, it may be trapped in a local peak because of stagnation in the late evolution stage. Continuing to evaluate such low-level individuals where they are would be of little value. Therefore, expanding the search range of these low-level individuals can help them jump out of local peaks. The DGS strategy is given in (8) and expands the search space of the low-level individual $x_i$ in the late evolution stage. The late evolution stage is defined as the stage in which $fe > \eta \times MaxFEs$, where fe is the current number of fitness evaluations, MaxFEs denotes the maximum number of fitness evaluations, and $\eta$ is a parameter that controls the evolutionary stages of the population. The impact of the value of $\eta$ on the algorithm will be investigated in Section IV-E. In (8), $x_{gbest}$ and $x_{gworst}$ are the best individual and the worst individual (measured by fitness) of the whole population in the current generation, and $f(x_{gbest})$ and $f(x_{gworst})$ are their fitness values. $x_{g\_r1}$ and $x_{g\_r2}$ are two individuals randomly selected from the population excluding S, with $g\_r1 \neq g\_r2 \neq i$ and $x_{g\_r1}, x_{g\_r2} \notin S$. In (8), the second part controls the search direction and gives a promising direction for exploring more global peaks. Therefore, this global search strategy can be considered directed. Note that the new individual $p\_x_i$ enters the next generation only when its fitness is better than that of the original individual $x_i$ (i.e., $f(p\_x_i) > f(x_i)$). In this way, the DGS strategy can improve the population diversity. Moreover, the DGS strategy provides the low-level individuals with the opportunity to re-search for the global peaks.
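The control flow around the DGS strategy (stage test, level test, and acceptance rule) can be sketched as follows. The displayed form of (8) is not reproduced in this text, so the candidate generator is passed in as a hypothetical callable `propose` rather than guessed; `is_late_stage`, `dgs_step`, and `evaluate` are likewise names assumed for illustration, and a maximization problem is assumed.

```python
def is_late_stage(fe, max_fes, eta):
    """The late evolution stage begins once fe > eta * MaxFEs."""
    return fe > eta * max_fes


def dgs_step(x, f_x, mean_f2, propose, evaluate):
    """One DGS attempt for x_i.

    Returns (individual, fitness, evaluations_used). High-level individuals
    (f_x >= mean_f2) are returned unchanged and cost no evaluation here.
    """
    if not f_x < mean_f2:
        return x, f_x, 0
    p_x = propose(x)      # stands in for the directed global search of (8)
    f_p = evaluate(p_x)   # one fitness evaluation is spent on the candidate
    if f_p > f_x:
        return p_x, f_p, 1  # the candidate survives only if strictly better
    return x, f_x, 1
```

Because the candidate replaces $x_i$ only when it is strictly better, a failed attempt costs one fitness evaluation but never degrades the population.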

D. ELS STRATEGY
Local search is a widely used method for refining the solution accuracy. However, if a local search is performed on every solution, fitness evaluations are wasted. Conversely, performing a local search only on the best individual may be useful for single optimization problems (i.e., problems in which only one global peak needs to be located), but it does not work for MMOPs. Since the global peaks generally lie around the better individuals in the late evolution stage, the local search strategy is executed only for the high-level individuals in the late evolution stage. Note that the high-level individuals (i.e., those with $f(x_i) > \bar{f}_2$) are obtained by comparing the fitness of the current individual $f(x_i)$ with the mean fitness ($\bar{f}_2$) of the niche formed by the current individual $x_i$. The ELS strategy is given in (9) and refines the accuracy of the high-level individuals in the late evolution stage.
In (9), $U_j$ and $L_j$ are the upper and lower boundaries of the search space. Besides, $N(\mu, \sigma)$ is the Gaussian distribution, where $\mu = 0$ and $\sigma$ is set to 1.0E−04. Note that the better of $x_i$ and $q\_x_i$ is selected into the next generation by comparing $f(x_i)$ and $f(q\_x_i)$.
Borrowing the narrow sampling space of the Gaussian distribution in (9) has two advantages. On one hand, the search range is kept around the high-level individuals, which avoids blind and ineffective search. On the other hand, selecting the better of $x_i$ and $q\_x_i$ to enter the next generation ensures that the found solutions are maintained throughout the evolutionary process. In this way, the ELS strategy not only refines the accuracy of the solutions but also avoids wasting fitness evaluations.
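A minimal sketch of an elite local search step is given below. The displayed form of (9) is not reproduced in this text, so the sketch assumes that each dimension is perturbed by a Gaussian sample scaled by the search range, i.e. $x_{ij} + (U_j - L_j) \cdot N(0, \sigma)$, with $\sigma$ read as a standard deviation; the clipping to the bounds and the names `els_candidate` and `els_step` are also assumptions, and a maximization problem is assumed.

```python
import random


def els_candidate(x, lower, upper, sigma=1.0e-4, rng=random):
    """Sample q_x near x (assumed form of (9)), clipped to [L_j, U_j]."""
    q = []
    for xj, lj, uj in zip(x, lower, upper):
        step = (uj - lj) * rng.gauss(0.0, sigma)  # narrow Gaussian step
        q.append(min(max(xj + step, lj), uj))
    return q


def els_step(x, f_x, lower, upper, evaluate, rng=random):
    """Keep the better of x and its local candidate; costs one evaluation."""
    q = els_candidate(x, lower, upper, rng=rng)
    f_q = evaluate(q)
    return (q, f_q) if f_q > f_x else (x, f_x)
```

With $\sigma$ = 1.0E−04 the candidate stays within a tiny fraction of the search range around $x_i$, so the step refines an existing solution rather than exploring.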

E. COMPLETE MaHDE ALGORITHM
MaHDE includes the FHM strategy, the DGS strategy, and the ELS strategy for dealing with MMOPs. The advantages of the FHM strategy are shown in the following aspects:
i) It can accelerate the convergence speed by guiding the low-level individuals to move towards the nearest peak.
ii) It can maintain the superiority of the high-level individuals through the autonomous guidance strategy.
The DGS strategy is designed for the low-level individuals in the late evolution stage and improves the population diversity by expanding the search range. Besides, the ELS strategy is performed on the high-level individuals in the late evolution stage and helps MaHDE refine the accuracy of the solutions by means of the Gaussian distribution. The detailed process of MaHDE is shown in Algorithm 1.
The MaHDE algorithm terminates when the maximum number of fitness evaluations (MaxFEs) is reached. Moreover, the algorithm also stops if it has located all the global peaks. The computational complexity of MaHDE is $O(NP^2 \times D)$. It should be noted that the computational complexity of most state-of-the-art multimodal algorithms is also $O(NP^2 \times D)$, because the distance between each pair of individuals has to be computed, which is the same as that of the proposed MaHDE.

IV. EXPERIMENTAL STUDIES
In this section, a comprehensive experimental analysis and comparison are carried out to verify the advantages of the MaHDE algorithm. Firstly, the test functions and experimental settings are introduced. Secondly, the results of MaHDE and some state-of-the-art algorithms are listed and analyzed. Thirdly, we compare MaHDE with two winners of CEC competitions at different accuracy levels. Then, the landscapes of typical problems are shown to visualize the evolution process. Finally, the impacts of the parameter settings are analyzed.

Algorithm 1: MaHDE
Begin
Randomly initialize the population with size NP and set fe = 0;
While fe < MaxFEs
    For i = 1 to NP
        Find the nearest (measured by Euclidean distance) K individuals of the current individual x_i and store them in S;
        Calculate the mean fitness of the individuals in S by (6);
        Perform the FHM strategy by (7);
        Perform the crossover strategy by (3);
        Perform the selection strategy by (5);
    End For
    If fe > η × MaxFEs
        For i = 1 to NP
            Calculate the mean fitness (f̄_2) of the current niche formed by x_i;
            If f(x_i) < f̄_2 /* The DGS strategy */
                Perform the DGS strategy by (8) and produce the new individual p_x_i;
                Evaluate the fitness of p_x_i;
                fe++;
                Compare the fitness of x_i and p_x_i and select the better one to enter the next generation;
            Else /* The ELS strategy */
                Calculate f(x_gbest) and f(x_gworst) in current generation;
                Perform the ELS strategy by (9) and produce q_x_i;
                Evaluate the fitness of q_x_i;
                fe++;
                Select the better one from x_i and q_x_i to enter the next generation;
            End If
        End For
    End If
End While
End

The test functions come from the CEC'2013 benchmark set, and their details can be found in [52]. In addition, two commonly used measures for MMOPs are adopted to evaluate the performance of MaHDE and the compared algorithms, namely the peak ratio (PR) and the success rate (SR). The definitions of these measures can also be found in [52].
In this paper, four accuracy levels (ε), namely ε = 1E−01, ε = 1E−02, ε = 1E−03, and ε = 1E−04, are adopted in the experiments. The results for ε = 1E−04 are mainly reported, as in [48]-[51]. Besides, the NP, MaxFEs, and K of MaHDE adopt the same settings as in [50].
The compared algorithms are divided into three groups according to their base algorithms, as reviewed in Section II-B. The first group uses DE as the base algorithm to deal with MMOPs (i.e., CDE, Self_CCDE, PNPCDE, LoICDE, NCDE, SDE, and NSDE). The second group uses PSO as the base algorithm (i.e., r2pso, r3pso, and LIPS). The third group uses GA as the base algorithm (i.e., ANSGAII). Here, the relevant parameters of the compared algorithms are set the same as in their original papers. Moreover, we ensure a fair comparison between the proposed method and the compared methods through the following three settings. Firstly, the proposed algorithm and the compared algorithms are all tested on the same multimodal functions in the CEC'2013 benchmark set. Secondly, the parameters of the compared algorithms are set to their original values, which have been fine-tuned by their authors. Thirdly, the NP and the MaxFEs used in solving each function are the same for the proposed algorithm and the compared algorithms.
Note that all results in this paper are the mean values obtained from 51 independent runs of the algorithm. The experiments are conducted on a PC with 8 Intel Core i5 CPUs, 8 GB of memory, and a 64-bit Windows 10 system.
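For reference, the two measures can be computed from per-run peak counts as sketched below, following the usual CEC'2013 definitions: PR is the total number of global peaks found over all runs divided by the number of known global peaks times the number of runs, and SR is the fraction of runs in which all global peaks are found. The sketch assumes that the number of peaks found in each run at a given accuracy level ε has already been counted; the function names are chosen for illustration.

```python
def peak_ratio(found_per_run, known_peaks):
    """PR: average fraction of the known global peaks found per run."""
    return sum(found_per_run) / (known_peaks * len(found_per_run))


def success_rate(found_per_run, known_peaks):
    """SR: fraction of runs in which every known global peak was found."""
    successes = sum(1 for n in found_per_run if n == known_peaks)
    return successes / len(found_per_run)
```

For example, four runs on a function with two global peaks that find 2, 2, 1, and 0 peaks give PR = 5/8 = 0.625 and SR = 2/4 = 0.5.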

B. COMPARISON WITH STATE-OF-THE-ART ALGORITHMS
In order to comprehensively analyze the performance of MaHDE, we compare the PR and SR results of MaHDE with the compared state-of-the-art algorithms. The results of PR and SR on F1-F20 with the accuracy ε = 1.0E-04 are shown in Table 1. The best PR values are bolded in all the algorithms. In addition, the Wilcoxon rank-sum test [53] is used to statistically evaluate the PR results of MaHDE and the compared algorithms in 51 runs, and the significance level is set as 0.05. The symbols ''+'', ''−'', and ''≈'' represent the MaHDE is significantly better, worse, and similar than the compared algorithms, respectively.
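The statistical comparison described above can be sketched with a two-sided Wilcoxon rank-sum test based on the normal approximation. This sketch uses only the standard library and applies neither a tie correction to the variance nor a continuity correction, so it is an illustration rather than the exact procedure of [53]; because PR samples often contain many tied values, a library routine with tie handling may be preferable in practice. The function names and the mapping to the ''+'', ''−'', and ''≈'' symbols are assumptions for the example.

```python
import math


def rank_sum_test(a, b):
    """Return (z, p) of a two-sided rank-sum test of sample a against sample b."""
    pooled = sorted((v, g) for g, sample in enumerate((a, b)) for v in sample)
    rank_sum_a = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the tied ranks i+1 .. j
        rank_sum_a += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n1, n2 = len(a), len(b)
    mean = n1 * (n1 + n2 + 1) / 2.0
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (rank_sum_a - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided p-value under N(0, 1)
    return z, p


def compare(pr_ours, pr_other, alpha=0.05):
    """'+' if ours is significantly better (larger PR), '-' if worse, '≈' otherwise."""
    z, p = rank_sum_test(pr_ours, pr_other)
    if p >= alpha:
        return "≈"
    return "+" if z > 0 else "-"
```

Here `pr_ours` and `pr_other` would each hold the 51 per-run PR values of the two algorithms on one test function.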
From Table 1, we can find that our MaHDE can locate all the global peaks on F1− F6 and F10 in each run. For F1−F5, the results of MaHDE are similar with CDE, Self_CCDE, PNPCDE, LoICDE, and NCDE. These algorithms can locate all the global peaks in each run except for CDE and LoICDE in F4. We can find that these algorithms are the improvements of CDE, which indicates that the principle of CDE has a good advantage for F1−F5. The r2pso and r3pso also perform better on F1−F5 except for F4. Because F4 is a nonlinear function, the global peaks are difficult to identify by the simple search strategy. For F6, only our MaHDE and LoICDE can locate all the global peaks in each run with all the compared algorithms.
MaHDE performs slightly worse than CDE, Self_CCDE, PNPCDE, and NCDE on F7. But our MaHDE can locate more than half of the global peaks on F7 (i.e., PR is 0.804). Moreover, the PR result of MaHDE on F7 is better than SDE, NSDE, r2pso, r3pso, LIPS, and ANSGAII. For F8, our MaHDE almost can locate all the global peaks due to its strong search ability. In addition, the PR result of MaHDE on F8 is better than all the compared algorithms except for Self_CCDE. MaHDE performs slightly worse on F9, which may be because that F9 has many global peaks (i.e., 216) and local peaks. However, the PR result of F9 is still better than SDE, NSDE, r2pso, r3pso, LIPS, and ANSGAII.
F11-F20 are the complex and composition functions, which are difficult to locate all the global peaks for the algorithms. The PR result of MaHDE on F11 is better than all the compared algorithms except for Self_CCDE and NCDE. Here, our MaHDE performs competitive with Self_CCDE and NCDE on F11, and they all locate most of the global peaks (i.e., the PR results of these algorithms all achieve above 0.5). For F12, F15, and F17−F20, our MaHDE performs best on all the compared algorithms. In particular, F18−F20 are the problems of more than 10 dimensions, and the search spaces of them are complex that the other algorithms have difficult to locate the peaks. However, the MaHDE algorithm can still locate at least 2 global peaks in each run, which shows the superiority of our MaHDE algorithm on high-dimensional problems.
Based on the experimental results, the following analysis is obtained. Firstly, from the fitness quality perspective, the FHM strategy assigns different roles to individuals of different levels in order to balance exploration and exploitation. By using different mutation strategies for the low-level and high-level individuals, the FHM strategy helps the low-level individuals move towards the nearest peak under the guidance of the best individual in the current niche, and helps the high-level individuals maintain their superiority through the autonomous guidance strategy. Secondly, from the evolution stage perspective, the DGS strategy is designed for the low-level individuals in the late evolution stage, which improves the population diversity and provides a new opportunity to explore more global peaks. Thirdly, the ELS strategy is designed for the high-level individuals; it uses a narrow Gaussian sampling space to refine the accuracy of solutions in the late evolution stage. Therefore, MaHDE obtains promising results in dealing with MMOPs.
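To make the ELS idea concrete, the following minimal Python sketch samples one candidate from a narrow Gaussian centered on a high-level individual and keeps it only if the fitness improves. The function name `elite_local_search`, the standard deviation `sigma`, and the greedy replacement rule are illustrative assumptions rather than the exact settings of MaHDE, and maximization is assumed.

```python
import random


def elite_local_search(elite, fitness, sigma=1e-3, rng=random):
    """One Gaussian local-search step around an elite individual.

    Assumptions (not taken from the paper): every dimension is perturbed
    independently with the same standard deviation `sigma`, and the
    candidate replaces the elite only if it is strictly better
    (maximization).
    """
    # Narrow sampling space: each coordinate is drawn from N(x_d, sigma^2).
    candidate = [rng.gauss(x, sigma) for x in elite]
    # Greedy replacement, so the solution accuracy never gets worse.
    return candidate if fitness(candidate) > fitness(elite) else elite


if __name__ == "__main__":
    # Hypothetical test function with a single peak at the origin.
    def sphere(x):
        return -sum(v * v for v in x)

    rng = random.Random(0)
    individual = [0.5, -0.3]
    for _ in range(200):
        individual = elite_local_search(individual, sphere, sigma=0.01, rng=rng)
    print(individual, sphere(individual))
```

Because the replacement is greedy, repeated calls can only keep or improve the fitness of the elite, which matches the refinement role that ELS plays in the late evolution stage.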

C. COMPARISON WITH WINNERS OF CEC COMPETITIONS
To further investigate the performance of MaHDE, we also compare MaHDE with the winners of the CEC'2013 and CEC'2015 competitions, which are the niching CMA-ES via nearest-better clustering (NEA2) [54] and the niching migratory multi-swarm optimizer (NMMSO) [55], respectively. Table 2 presents the PR and SR results of MaHDE, NEA2, and NMMSO on F1−F20 at accuracy levels of 1E−01, 1E−02, and 1E−03. The best results in Table 2 are marked in bold, and the last line, labeled ''#Best'', counts the number of best results obtained by each algorithm on F1−F20.
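For reference, the two metrics can be computed as in the short Python sketch below, which follows the definitions commonly used with the CEC'2013 benchmark: PR is the total number of global peaks found over all runs divided by the number of known global peaks times the number of runs, and SR is the fraction of runs in which all global peaks are found. The function names and the per-run counts in the example are hypothetical, and counting the peaks found in each run at a given accuracy level is assumed to have been done beforehand.

```python
def peak_ratio(found_per_run, num_known_peaks):
    """PR: peaks found over all runs / (known global peaks * number of runs)."""
    runs = len(found_per_run)
    return sum(found_per_run) / (num_known_peaks * runs)


def success_rate(found_per_run, num_known_peaks):
    """SR: fraction of runs in which every known global peak was found."""
    runs = len(found_per_run)
    successes = sum(1 for found in found_per_run if found == num_known_peaks)
    return successes / runs


if __name__ == "__main__":
    # Hypothetical counts for four runs on a problem with two global peaks.
    found = [2, 2, 1, 2]
    print(peak_ratio(found, 2))    # 7 of 8 peaks found
    print(success_rate(found, 2))  # 3 of 4 runs fully successful
```

With these made-up counts, PR is 7/8 and SR is 3/4, which shows that PR can remain high even when some runs miss a peak.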

D. THE LANDSCAPES OF MaHDE ON THE SELECTED FUNCTIONS IN DIFFERENT GENERATIONS
To better visualize the distribution of individuals during evolution, Fig. 3−Fig. 5 show the landscapes of MaHDE on several typical problems in different generations. Note that ''Gen'' denotes the current generation, and Gen = 0 means the initial distribution of the population. The problems can be divided into three groups based on their characteristics. The first group consists of the simple functions (i.e., F1 and F2), whose global peaks can be located in the early stage of evolution, as shown in Fig. 3. The second group consists of the problems that have many peaks (i.e., F6 and F7); locating all of their peaks may consume many fitness evaluations, as shown in Fig. 4. The third group consists of more complex functions (i.e., F11 and F12), whose global peaks are difficult to locate with high accuracy, as shown in Fig. 5. From Fig. 3, we can see that MaHDE locates all the global peaks in the early evolution stage and then continues until all individuals converge to the peaks. Fig. 4 shows the problems with more global peaks, which require the algorithm to maintain the found solutions during the evolution. From Fig. 4, we can see that MaHDE not only locates all the peak regions but also maintains the found solutions until the final generation. From Fig. 5, we can see that MaHDE still locates all the peak regions on these complex problems. This indicates that MaHDE not only has a strong global search ability but also can avoid local peaks. From Fig. 3−Fig. 5, we can also see that different individuals converge around different peaks as the generations increase, which indicates that MaHDE has a good convergence speed throughout the evolution.

E. IMPACTS OF PARAMETER SETTINGS
The parameter η controls the evolution stage of the population. In other words, η balances the exploration and exploitation abilities of the algorithm. A small η reduces the exploration ability of MaHDE, which causes some individuals to become trapped in local peaks. Conversely, a large η reduces the exploitation ability of MaHDE, so some peaks cannot be located. Therefore, the value of η is important for the performance of MaHDE. Table 3 shows the PR results of MaHDE on F1−F20 with different η.
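One simple way to picture the role of η is as a threshold on the consumed search budget, as in the Python sketch below. This is only an illustrative assumption: the function name `evolution_stage` is hypothetical, and the sketch takes η to be the fraction of the maximum number of fitness evaluations after which the population switches from the early stage to the late stage.

```python
def evolution_stage(fes_used, max_fes, eta=0.4):
    """Return 'early' or 'late' for the current point in the search.

    Assumption (not confirmed by the text above): the late stage starts
    once the fraction of consumed fitness evaluations reaches eta.
    """
    if not 0.0 <= eta <= 1.0:
        raise ValueError("eta must lie in [0, 1]")
    return "early" if fes_used < eta * max_fes else "late"


if __name__ == "__main__":
    # Hypothetical budget of 1000 fitness evaluations.
    for fes in (100, 500, 900):
        print(fes, evolution_stage(fes, 1000, eta=0.4))
```

Under this reading, a smaller η shortens the early stage and a larger η shortens the late stage, in which the DGS and ELS strategies are applied.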
As we can see from Table 3, on the first six functions (F1−F6), different η values make almost no difference: MaHDE locates all the global peaks in every run except when η is 0.7 or 0.9. For the problems F7−F9, MaHDE obtains the best results when η is 0.4. For F10, F13, F14, and F16, different η values also make almost no difference. However, as the problem dimension increases, the advantage of η = 0.4 becomes obvious. Therefore, we set η = 0.4 in this paper.
To further investigate the advantages of MaHDE with η = 0.4, we use the results of MaHDE with η = 0.4 as a baseline and perform the Wilcoxon rank-sum test with α = 0.05 against the PR results obtained with each of the other η values. The last three rows of Table 3 (i.e., '+', '−', and '≈') show where the PR results with η = 0.4 are significantly better than, significantly worse than, and similar to the PR results with the other η values, respectively. From Table 3, we can see that the PR results with η = 0.4 are significantly better than the results of all the other η values, which shows the better overall performance of MaHDE with η = 0.4.
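The significance labels can be reproduced in spirit with the Python sketch below, which implements a two-sided Wilcoxon rank-sum test using the normal approximation (average ranks for ties, no tie correction of the variance). The function names, the ASCII labels ('~' stands for '≈'), and the per-run PR values in the example are hypothetical and are not taken from Table 3.

```python
import math


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test via the normal approximation.

    Returns (z, p). A positive z means the values in x tend to be larger.
    """
    combined = sorted((v, g) for g, sample in enumerate((x, y)) for v in sample)
    ranks = [0.0] * len(combined)
    i = 0
    while i < len(combined):
        j = i
        # Extend j over the block of values tied with combined[i].
        while j + 1 < len(combined) and combined[j + 1][0] == combined[i][0]:
            j += 1
        for k in range(i, j + 1):
            ranks[k] = (i + j) / 2 + 1  # average 1-based rank of the block
        i = j + 1
    n1, n2 = len(x), len(y)
    w = sum(r for r, (_, g) in zip(ranks, combined) if g == 0)
    mean = n1 * (n1 + n2 + 1) / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p


def compare(base, other, alpha=0.05):
    """Label base against other: '+' better, '-' worse, '~' similar."""
    z, p = rank_sum_test(base, other)
    if p >= alpha:
        return "~"
    return "+" if z > 0 else "-"


if __name__ == "__main__":
    # Hypothetical per-run PR values for two parameter settings.
    pr_base = [0.90, 0.95, 1.00, 0.92, 0.97, 0.99, 0.93, 0.96]
    pr_other = [0.50, 0.55, 0.60, 0.52, 0.57, 0.59, 0.53, 0.56]
    print(compare(pr_base, pr_other))
```

Since PR is to be maximized, a significant result with larger baseline ranks is labeled '+'. For small samples or many ties, an exact test or a tie-corrected variance may be preferable to this approximation.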

V. CONCLUSION
This paper proposes the MaHDE algorithm to better deal with MMOPs. By considering both the fitness quality angle and the evolution stage angle, MaHDE not only accelerates convergence and preserves the found solutions around peaks, but also improves the population diversity for locating more peaks and helps refine the accuracy of the good solutions found. Concretely, three novel strategies are designed to ensure the performance of the algorithm. Firstly, the FHM strategy assigns different roles to individuals of different levels in order to balance exploration and exploitation. By using different mutation strategies for the low-level and high-level individuals, it makes the low-level individuals move towards the nearest peak under the guidance of the best individual in the current niche, and it lets the high-level individuals maintain their superiority through the autonomous guidance strategy. Secondly, the DGS strategy is designed for the low-level individuals in the late evolution stage, which improves the population diversity and gives the low-level individuals a new opportunity to re-search the global peaks. Thirdly, the ELS strategy is designed for the high-level individuals; it uses a narrow Gaussian sampling space to refine the accuracy of solutions in the late evolution stage. Based on these novel strategies, MaHDE achieves promising performance when compared with several state-of-the-art multimodal algorithms and two winners of the CEC competitions at different accuracy levels. The experimental results show the superiority of the proposed MaHDE when dealing with MMOPs.
Although MaHDE shows promising performance in dealing with MMOPs, it still has some limitations. For example, its performance on some high-dimensional problems is still not good enough. In the future, we will extend MaHDE to solve MMOPs in potential real-world applications, including resource-constrained project scheduling [56], electricity markets [57], energy resource management [58], and optical networks [59]. For example, detecting multiple equilibria simultaneously is a key challenge in economic game problems in electricity markets, and MaHDE can be used to solve such problems. Meanwhile, we will also further improve the performance of MaHDE on high-dimensional problems.
His current research interests include differential evolution, colony optimization, and the estimation of distribution algorithm and their applications in real-world optimization problems.
DONG LIU received the B.S. and M.S. degrees in computer science from Zhengzhou University, in 2004, and the Ph.D. degree in computer science from Tianjin University, in 2013.
He is currently an Associate Professor with Henan Normal University. His research interests include educational data mining and complex network analysis.