Enhancing Differential Evolution on Continuous Optimization Problems by Detecting Promising Leaders

Since the performance of differential evolution (DE) significantly depends on its offspring generation strategies, many DE variants with improved mutation operators have been reported. However, on the one hand, the mutation operators in most DE variants are guided by elites selected purely by fitness value, without considering their distribution information in the fitness landscape. If these elites are clustered in a locally optimal region, the population may evolve towards unpromising areas more frequently. On the other hand, in most DE variants the evolutionary information of the potential trial vectors is not fully utilized to guide the search, which hampers local exploitation in the promising regions where they are located. To overcome these weaknesses, this article proposes an enhanced DE framework (DELDG) with a leaders-detection-and-guidance mechanism that consists of an adaptive leaders detection (ALD) and a neighborhood-based tournament selection (NTS). With these two novel operators, DELDG can not only guide the mutation process of each individual with multiple promising leaders detected by ALD, but also accelerate convergence through the competition among potential trial vectors introduced by NTS. DELDG is therefore characterized by the explicit detection of promising leaders according to both their fitness values and their distribution information, and by the effective use of the potential trial vectors in the neighborhood of each leader. Compared with 36 state-of-the-art DE variants and evolutionary algorithms (EAs) on 28 IEEE CEC2013 real-parameter functions and 17 IEEE CEC2011 real-world problems, the experimental results demonstrate the competitive performance of DELDG.


I. INTRODUCTION
Since its invention in 1997, differential evolution (DE) has rapidly gained popularity and developed into a powerful and successful tool for optimization problems [1], [2]. Owing to its simple structure, ease of implementation, and strong robustness, DE has been successfully applied to many complicated optimization problems, such as multi-objective, multi-modal, large-scale, and dynamic problems, as well as various real-world applications in science and engineering [3], [4].
The associate editor coordinating the review of this manuscript and approving it for publication was Tachun Lin .
Like other population-based evolutionary algorithms (EAs), DE manipulates a population of candidate individuals with three evolutionary operators, i.e., mutation, crossover, and selection [1]. However, as shown in [3], [4], the performance of DE significantly depends on the offspring generation strategies (i.e., mutation and crossover) and the control parameters (i.e., the population size NP, the scale factor F for mutation, and the crossover rate Cr). Therefore, to further enhance the performance of DE, much research effort has been devoted to advanced search techniques that overcome the weaknesses of DE, resulting in the emergence of many DE variants. For the mutation operator of DE, the techniques in the existing DE variants mainly focus on the following aspects: developing new mutation strategies, integrating multiple strategies, and hybridizing with other EAs and swarm intelligence algorithms [2]-[4]. For the crossover operator of DE, several sophisticated techniques, e.g., the multiple exponential recombination technique [5], the linkage learning technique [6], and the Eigen coordinate system technique [7], have been introduced into DE to enhance its search ability. For parameter control in DE, adaptive or self-adaptive techniques have been designed based on the feedback information from the whole population and/or different individuals [8]-[10]. A detailed review of the enhanced DE variants with the above techniques is given in Section III.
Although the above-mentioned techniques can alleviate the shortcomings of DE to some extent, the search ability of the mutation strategy still suffers from a few limitations. Generally, the mutation process of most DE variants is mainly guided by the elites with the best fitness values. Typically, several original mutation strategies, such as ''DE/best/1'', ''DE/current-to-best/1'', and ''DE/rand-to-best/1'', use the best individual of the current population to guide the search. In [11], a local and global neighborhood-based mutation operator was presented that guides the population with the best neighbor of the target individual and the best individual of the current population concurrently. In [9], a new mutation strategy ''DE/current-to-pbest'' was proposed in JADE, which uses the top 100p% individuals as the leaders to generate the trial vector for each individual. Due to its competitive performance, many modifications of JADE have been developed to further improve its robustness, such as SHADE [12], L-SHADE [13], and jSO [14]. In [15], the strategy ''DE/current-to-nbest/1'' with the best neighbor of the target individual and the strategy ''DE/current-to-pbest'' were used independently to generate two trial vectors for each individual. In these mutation strategies, the leaders for guiding the population are mostly selected based on their fitness values, which may cause the population to evolve towards unpromising areas more frequently when the leaders are clustered in a locally optimal region. Therefore, it is necessary to develop an efficient strategy that detects the promising leaders at different stages of evolution by considering both their fitness values and their distribution information. Furthermore, in most DE variants, the trial vectors that lose the comparison with their parent rivals are directly discarded during the evolutionary process.
In this manner, the valuable information contained in the potential trial vectors is not effectively utilized for local exploitation in the promising regions where they are located. Thus, introducing a reuse mechanism for the potential trial vectors defeated during the selection process is expected to greatly benefit the convergence speed of DE.
To address the above issues, this article proposes an enhanced DE framework with a leaders-detection-and-guidance mechanism, termed DELDG. The novel mechanism in DELDG is composed of an adaptive leaders detection (ALD) and a neighborhood-based tournament selection (NTS). To be specific, on the one hand, ALD first divides the current population into a number of non-overlapping clusters with an adaptive leaders sizing strategy, and then the best individual in each cluster is identified as a promising leader and elected to a leadership group (LG) for guiding the search. On the other hand, NTS reserves the trial vectors that lose during the selection process and reuses these failed trial vectors to compete against the stagnated individuals belonging to the same neighborhood. In NTS, an individual is regarded as stagnant if it has not been improved within a certain number of generations, and each defeated trial vector has a chance to replace a stagnated individual if it performs better.
In general, the characteristics of DELDG can be summarized as follows: • Two novel operators, i.e., adaptive leaders detection (ALD) and neighborhood-based tournament selection (NTS), are introduced into DE to enhance its performance for global optimization. With these two operators, DELDG can provide an effective guidance for each individual with multiple promising leaders and further promote convergence of DE by strengthening the search within the neighboring region of each leader.
• An adaptive leaders detection (ALD) is proposed to avoid the misguidance towards the unpromising areas.
Unlike the mechanism in most DE variants that only uses the top best individuals as leaders to guide the search, ALD elaborately detects the promising leaders by considering their fitness values and distribution information simultaneously. Furthermore, armed with an adaptive leaders sizing strategy, ALD can dynamically adjust the number of leaders as the optimization proceeds.
• A neighborhood-based tournament selection (NTS) is presented to make use of the potential trial vectors for the local exploitation. Different from directly discarding the failed trial vectors in most DE variants, NTS performs the tournament selection among the failed trial vectors and the stagnated individuals of the same neighborhood to further exploit the neighboring region of each leader.
The remainder of this article is organized as follows. Section II briefly introduces the basics of DE. Section III reviews the related works on the techniques for offspring generation strategies and control parameters in DE. The proposed DELDG framework is presented in detail in Section IV. In Section V, an experimental study on two sets of benchmark problems is carried out to evaluate the performance of DELDG. Finally, the conclusions and future works of this study are presented in Section VI.

II. BASICS OF DE
DE is a population-based evolutionary algorithm for solving continuous optimization problems. In DE, an initial population ($P^0$) of $NP$ individuals is randomly sampled from the pre-specified search space at the beginning of evolution. Each individual in $P^0$ is denoted as $\vec{x}_i^0 = (x_{i,1}^0, x_{i,2}^0, \ldots, x_{i,D}^0)$, where $D$ is the dimension of the optimization problem and $i = 1, 2, \ldots, NP$. Then, the $j$th variable of $\vec{x}_i^0$ is generated as follows:

$$x_{i,j}^0 = L_j + rndreal(0,1) \times (U_j - L_j) \qquad (1)$$

where $rndreal(0,1)$ represents a uniformly distributed random number in the interval $[0,1]$, and $U_j$ ($L_j$) is the upper (lower) bound of the $j$th variable. After that, the three key operators of DE, i.e., mutation, crossover, and selection, are carried out iteratively to search for the optimal solution of the problem being optimized.
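The initialization rule of Eq. (1) can be sketched in a few lines of NumPy; the function name and array layout below are our own, not the paper's:

```python
import numpy as np

def initialize_population(NP, D, lower, upper, rng=None):
    """Sample NP individuals uniformly within [lower, upper]^D, as in Eq. (1)."""
    rng = np.random.default_rng(rng)
    # Each component: x[i, j] = L_j + rndreal(0, 1) * (U_j - L_j)
    return lower + rng.random((NP, D)) * (upper - lower)
```

Each row of the returned array is one individual $\vec{x}_i^0$.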

1) Mutation operator (MO)
In the mutation operator, each individual of the current population ($\vec{x}_i^g$, also called the target vector) employs a mutation strategy to generate a mutant vector ($\vec{v}_i^g$). In the DE community, several mutation strategies are used frequently, which are shown below:

$$\text{DE/rand/1:} \quad \vec{v}_i^g = \vec{x}_{r1}^g + F \times (\vec{x}_{r2}^g - \vec{x}_{r3}^g) \qquad (2)$$
$$\text{DE/best/1:} \quad \vec{v}_i^g = \vec{x}_{best}^g + F \times (\vec{x}_{r1}^g - \vec{x}_{r2}^g) \qquad (3)$$
$$\text{DE/current-to-best/1:} \quad \vec{v}_i^g = \vec{x}_i^g + F \times (\vec{x}_{best}^g - \vec{x}_i^g) + F \times (\vec{x}_{r1}^g - \vec{x}_{r2}^g) \qquad (4)$$
$$\text{DE/rand-to-best/1:} \quad \vec{v}_i^g = \vec{x}_{r1}^g + F \times (\vec{x}_{best}^g - \vec{x}_{r1}^g) + F \times (\vec{x}_{r2}^g - \vec{x}_{r3}^g) \qquad (5)$$
$$\text{DE/rand/2:} \quad \vec{v}_i^g = \vec{x}_{r1}^g + F \times (\vec{x}_{r2}^g - \vec{x}_{r3}^g) + F \times (\vec{x}_{r4}^g - \vec{x}_{r5}^g) \qquad (6)$$

where $r1, r2, r3, r4, r5 \in \{1, 2, \ldots, NP\} \setminus \{i\}$ are mutually different random indices, $F$ is the scale factor, and $\vec{x}_{best}^g$ is the best solution at generation $g$. More details can be found in [1], [2].
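Two of the strategies above can be sketched as follows (function names and the NumPy data layout are our own):

```python
import numpy as np

def mutate_rand_1(pop, i, F, rng):
    """DE/rand/1 (Eq. (2)): v = x_r1 + F * (x_r2 - x_r3), r1, r2, r3 != i distinct."""
    candidates = [k for k in range(len(pop)) if k != i]
    r1, r2, r3 = rng.choice(candidates, size=3, replace=False)
    return pop[r1] + F * (pop[r2] - pop[r3])

def mutate_current_to_best_1(pop, fitness, i, F, rng):
    """DE/current-to-best/1 (Eq. (4)): v = x_i + F*(x_best - x_i) + F*(x_r1 - x_r2)."""
    best = int(np.argmin(fitness))                 # minimization: lowest fitness is best
    candidates = [k for k in range(len(pop)) if k != i]
    r1, r2 = rng.choice(candidates, size=2, replace=False)
    return pop[i] + F * (pop[best] - pop[i]) + F * (pop[r1] - pop[r2])
```

With `F = 0`, DE/current-to-best/1 degenerates to the target vector itself, which makes the role of the difference terms easy to see.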

2) Crossover operator (XO)
After the mutation operator, the crossover operator is applied to each pair of $\vec{x}_i^g$ and $\vec{v}_i^g$ to generate a trial vector $\vec{u}_i^g$. The binomial crossover used in this article is expressed as follows:

$$u_{i,j}^g = \begin{cases} v_{i,j}^g, & \text{if } rndreal(0,1) \le Cr \text{ or } j = j_{rand} \\ x_{i,j}^g, & \text{otherwise} \end{cases} \qquad (7)$$

where $Cr \in [0, 1]$ is the crossover rate and $j_{rand} \in [1, D]$ is a randomly selected integer. If a variable of the trial vector is out of the pre-specified interval, it will be re-initialized by Eq. (1).
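A minimal sketch of binomial crossover (names are our own); the forced `j_rand` component guarantees the trial vector inherits at least one coordinate from the mutant:

```python
import numpy as np

def binomial_crossover(x, v, Cr, rng):
    """Binomial crossover (Eq. (7)): u_j = v_j if rndreal <= Cr or j == j_rand, else x_j."""
    D = len(x)
    mask = rng.random(D) <= Cr       # components taken from the mutant vector
    mask[rng.integers(D)] = True     # j_rand: at least one component from the mutant
    return np.where(mask, v, x)
```

At `Cr = 1` the trial vector equals the mutant; at `Cr = 0` it differs from the target in (almost surely) exactly one component.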

3) Selection operator (SO)
After the crossover operator, the selection operator is carried out to choose the better solution for the next generation from each pair of $\vec{x}_i^g$ and $\vec{u}_i^g$, which is executed as follows:

$$\vec{x}_i^{g+1} = \begin{cases} \vec{u}_i^g, & \text{if } f(\vec{u}_i^g) \le f(\vec{x}_i^g) \\ \vec{x}_i^g, & \text{otherwise} \end{cases} \qquad (8)$$

where $f(\vec{x})$ is the fitness value of $\vec{x}$ for the optimization problem to be minimized. As shown in Eq. (8), the trial vector replaces its parent if it performs at least as well; otherwise, it is discarded directly.
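Putting the three operators together, one DE/rand/1/bin generation can be sketched as below (a self-contained illustration; all names are our own, and greedy selection per Eq. (8) guarantees the best fitness never worsens):

```python
import numpy as np

def de_generation(pop, fit, f, F=0.5, Cr=0.9, rng=None):
    """One DE/rand/1/bin generation: mutation, binomial crossover, greedy selection."""
    rng = np.random.default_rng(rng)
    NP, D = pop.shape
    for i in range(NP):
        # DE/rand/1 mutation with three distinct random indices != i
        r1, r2, r3 = rng.choice([k for k in range(NP) if k != i], 3, replace=False)
        v = pop[r1] + F * (pop[r2] - pop[r3])
        # binomial crossover with a forced j_rand component
        mask = rng.random(D) <= Cr
        mask[rng.integers(D)] = True
        u = np.where(mask, v, pop[i])
        # greedy selection (Eq. (8)): trial replaces parent if no worse
        fu = f(u)
        if fu <= fit[i]:
            pop[i], fit[i] = u, fu
    return pop, fit
```

For example, iterating this on the sphere function steadily (and monotonically) reduces the best fitness found.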

III. RELATED WORK
In recent decades, much research effort has been devoted to further improving the performance of DE on complex optimization problems, leading to the emergence of many advanced DE variants. As pointed out in [2]-[4], the offspring generation strategies, i.e., the mutation and crossover operators, and the control parameters greatly affect the performance of DE. Therefore, a large body of research has focused on techniques that overcome the weaknesses of DE from the following aspects.
For the studies on setting the control parameters, numerous adaptive or self-adaptive techniques have been introduced into DE. For example, a self-adaptive DE (SaDE) was proposed in [8] by using a control parameter strategy based on the historical success and failure experience in generating offspring. In [10], an individual-dependent parameter setting was proposed that sets the parameter values for each individual based on its fitness value. In [16], a control parameter technique based on zoning evolution was designed for DE by considering its performance with different combinations of control parameters in different search zones. In [17], a new historical and heuristic DE (HHDE) was presented with a historical-heuristic-based parameter adaptation mechanism. In [18], a self-adaptive method for parameter control was proposed by learning from the information of the superior individuals. In [19], a novel parameter-adaptive DE was proposed by employing a grouping strategy with an adaptation scheme for both F and Cr and a parabolic population size reduction scheme. In [20], a decreasing mechanism for the j_rand number in the crossover operator was proposed based on a feedback guidance technique, and a new strategy for updating the control parameters was designed using a fixed orientation and the Levy distribution.
For the studies on crossover operators, some researchers have focused on investigating the strategy that can effectively handle different types of variable interrelations. In [6], a novel DE with hybrid linkage crossover was proposed by extracting and incorporating the linkage information of the problem to be optimized into the crossover process. In [5], a multiple exponential recombination for DE was presented, where multiple segments of mutant vector and target vector will be exchanged to obtain the trial vector. In [7], an adaptive DE framework to tune the coordinate system was proposed and thus the crossover operator was performed based on both the original and Eigen coordinate systems.
In addition to the works on setting the control parameters and enhancing the crossover operator, a great deal of research is focused on developing the high-efficiency mutation operator for DE. In general, the studies on mutation operators can be broadly classified under the following four categories.
The first category is to propose new mutation strategies. In [21], a novel interactive information scheme (IIN) was proposed by incorporating a directional vector, constructed with the ranking information, into the mutation operator to provide promising directional information. In [22], an enhanced fitness-adaptive DE (EFADE) with a new triangular mutation scheme was presented, in which three solutions are selected randomly and their difference vectors are used to define the convex combination vector of the triplet. In [23], an adaptive guided DE (AGDE) with a novel mutation scheme was proposed by randomly selecting two vectors from the top and the bottom 100p% (p ∈ (0, 1)) individuals respectively and selecting the third vector from the middle individuals of the current population. In [9], an important and powerful DE variant, JADE, was proposed with a novel mutation strategy ''current-to-pbest'' in which the top 100p% best solutions are used to guide the mutation process. Due to its encouraging performance, plenty of improvements have been introduced into JADE. For example, in [12], an enhanced JADE variant with a success-history based parameter adaptation (SHADE) was proposed by using a historical memory of recently successful parameter sets to generate new control parameter values. Following that, an extended version of SHADE with linear population size reduction (L-SHADE) [13] was presented to further improve its performance by continually decreasing the population size from a larger value (i.e., 18 × D) to a smaller value (i.e., 4) according to the number of fitness evaluations. Recently, an improved version of L-SHADE, termed jSO [14], was proposed with a novel mutation strategy ''DE/current-to-pbest-w/1'' where a weighted factor F_w is used to control the influence of the guided vector. More details of the JADE and SHADE-based variants can be found in [24].
The second category is to employ multiple mutation strategies simultaneously to generate new mutant vectors. In [25], a composite DE (CoDE) was proposed by using three different mutation schemes simultaneously to generate trial vectors. In [26], a cheap surrogate model-based multi-operator search strategy was proposed for DE, where multiple mutant vectors are generated by multiple reproduction operators, and the best one evaluated by the surrogate model is selected as the final offspring. Similar to this, a DE variant with an underestimation-based multi-mutation strategy was presented in [27] by selecting the offspring with a cheap abstract convex underestimation model. In [28], a multi-population ensemble DE (MPEDE) was presented, where three mutation strategies are used concurrently with three equally sized smaller indicator subpopulations, and an additional larger reward subpopulation is allocated to the best performing mutation strategy. In [18], a DE variant based on the individual difference information (DI-DE) was proposed by selecting a suitable strategy from three different mutation strategies for the superior and inferior individuals respectively based on an adaptive method. In [29], an adaptive DE was proposed by combining two mutation operators with different characteristics, and a cooperative rule was employed to adaptively select a suitable mutation operator based on their historical successful information. In [30], a fitness-based adaptive DE algorithm (FADE) was proposed by adaptively selecting one of three specified mutation strategies to generate offspring for different individuals in the same swarm based on their fitness values.
The third category is to hybridize with other techniques. In [31], a comprehensive review of the existing hybrids of DE and particle swarm optimization (PSO) was given by considering five hybridization factors and building a systematic taxonomy of hybridization strategies. In [32], a hybrid artificial bee colony with DE (HABCDE) was proposed by modifying the scout bee phase with the search mechanisms of DE. In [33], a hybridization of cultural algorithm and DE was proposed where the two meta-heuristics are performed in parallel. In [34], an estimation of distribution algorithm (EDA) was hybridized with DE to generate the offspring together by using the evolutionary operators of DE and the Gaussian probabilistic model of EDA. In [35], a learning-enhanced DE (LeDE) was proposed by combining DE with a clustering-based learning strategy (CLS). In LeDE, the one-step K-means clustering technique is used to divide the population into several clusters, and the information of the population is exchanged both within the same cluster and between different clusters [35]. In [36], the K-means clustering technique was also incorporated into DE to present a cluster-based population initialization operator where the initial population is composed of the best solution in each cluster. In [37], a novel DE framework, termed self-organizing neighborhood-based DE (SON-DE), was presented by incorporating a two-dimensional self-organizing map (SOM) into DE to learn the neighborhood relationship of the population for guiding the search of DE. In [38], an adaptive dimension level adjustment framework was designed for DE to address the problem of stagnation through two re-initialization strategies with different search characteristics. In [39], a double layer search mechanism with a species-based clustering partition method was introduced into DE to deal with multi-modal optimization problems, with the purpose of enhancing the population's diversity and locating more optima.
The fourth category is to design novel algorithmic frameworks for the parents selection. In [40], a proximity-based DE framework (ProDE) was proposed by selecting the parents based on the probability that is inversely proportional to their Euclidean distances from the target individual. In [41], a DE framework with the ranking-based mutation operators (RankDE) was presented by selecting the individuals as the parents for mutation based on their ranking values in the current population. In [42], an enhanced DE with multi-objective sorting-based mutation operators (MSDE) was proposed, in which the non-dominated sorting is used to sort all the individuals based on their fitness and diversity contribution, and then the parents selection is executed based on their ranking values. In [43], a neighborhood-adaptive DE (NaDE) was proposed by using different index-based topologies to define multiple neighborhood relationships for each individual and adaptively selecting a suitable neighborhood for it based on the historical successful and failure experiences of these topologies. In [44], a social learning DE (SL-DE) was presented, in which the social influence of each individual is adopted to construct the neighborhood relationships between individuals and the parents are selected from the neighborhood of each individual.
Issues existing in the previous DE variants: As reviewed above, much research effort has been devoted to studying the mutation operator of DE. Although promising results have been achieved by these DE variants, the matter of selecting appropriate parents for mutation has not yet been solved properly. For example, JADE and its variants (e.g., SHADE, L-SHADE, and jSO) select a set of the best individuals to guide the mutation process only based on their fitness values in the objective space, regardless of their distribution information in the decision space. Accordingly, the evolution of the population might be misguided towards unpromising areas if these leaders are clustered in a locally optimal region. Therefore, how to choose appropriate parents to guide the mutation process is still an issue worth studying. On the other hand, the failed trial vectors generated by the mutation operator are directly discarded during the selection process in most DE variants. In general, each trial vector can provide useful feedback information about the problem being optimized, which is beneficial in describing the characteristics of its local fitness landscape. Hence, how to effectively utilize the information of the failed trial vectors is also an important issue for local exploitation, especially for computationally expensive problems.

IV. PROPOSED ALGORITHM: DELDG
To overcome the above weaknesses in the existing DE variants, this study proposes an enhanced DE framework (DELDG) with a leaders-detection-and-guidance mechanism. In DELDG, an adaptive leaders detection (ALD) is used to dynamically and elaborately detect the promising leaders to build a leadership group (LG) for guiding the evolution of population, while a neighborhood-based tournament selection (NTS) is designed to enhance the local exploitation within the neighborhood of each leader by executing the tournament selection among the failed trial vectors and the stagnated individuals.
To present the proposed algorithm more clearly, several notations used in DELDG are listed below.
• $LG = \{\vec{l}_1, \vec{l}_2, \ldots, \vec{l}_p\}$: the constructed leadership group with $p$ leaders;
• $\vec{l}_i$: the $i$th leader in $LG$;
• $p$: the size of $LG$;
• $N_i$: the neighborhood of $\vec{l}_i$;
• $Q_i$: the set of the failed trial vectors generated by the individuals in $N_i$;
• $t$: the parameter to control the initial value of $p$;
• $DIV$: the division method used in ALD;
• $CC$: the crowding clustering method;
• $SC$: the speciation clustering method;
• $KC$: the K-means clustering method;
• $C_i$: the $i$th cluster obtained by applying $DIV$ to $P^g$;
• $S_i$: the indicator of the search state of $\vec{x}_i^g$;
• $TH$: the pre-defined threshold for detecting stagnation;
• $SN_i$: the number of consecutive generations in which $\vec{x}_i^g$ has not been updated successfully.

A. THE GENERAL FRAMEWORK OF DELDG
The general framework of DELDG is summarized in Algorithm 1. It is clear that DELDG consists of three main components: adaptive leaders detection (ALD), the offspring generation operators (i.e., the DE operators), and neighborhood-based tournament selection (NTS). In the initialization phase, the population $P^0 = \{\vec{x}_1^0, \vec{x}_2^0, \ldots, \vec{x}_{NP}^0\}$ is initialized randomly from the specified search space by using Eq. (1), and the fitness value of each individual is evaluated. At each generation, ALD is first performed on the current population to detect the promising leaders and construct the neighborhood $N_j$ of each leader $\vec{l}_j$. Then, each individual is updated by the DE operators, and the failed trial vectors generated by the individuals in $N_j$ are stored in $Q_j$. Finally, NTS is executed on the failed trial vectors in $Q_j$ and the stagnated neighbors in $N_j$ of $\vec{l}_j$ to obtain the next population.
Further, the flowcharts of the original DE algorithm and the DELDG framework are depicted in Figure 1. Clearly, compared with the original DE algorithm in Figure 1 (a), DELDG is armed with ALD to detect promising leaders for guiding the mutation process and with NTS to introduce competition into the neighborhood of each leader for exploiting the local areas. These two main components of DELDG, i.e., ALD and NTS, will be detailed in the following subsections.

B. ADAPTIVE LEADERS DETECTION (ALD)
To detect a group of promising leaders for guiding, ALD needs to deal with the following three problems: 1) How many promising leaders should be included in LG during the process of evolution? 2) How to detect the promising leaders for LG from current population? 3) How to construct the neighborhood for each leader?

1) SETTING THE SIZE OF LG
For the first problem, ALD uses an adaptive leaders sizing strategy to set the value of $p$ dynamically. To be specific, $p$ is determined as follows:

$$p = \max\left(\left\lfloor \frac{NP}{t} \times \left(1 - \frac{g}{G_{max}}\right) \right\rfloor,\; 2\right) \qquad (9)$$

where $t$ is used to control the initial size of $LG$, $\lfloor \cdot \rfloor$ is the function for rounding down, $g$ is the current generation number, and $G_{max}$ is the maximal number of generations. Note that the termination criterion is set based on the maximum number of fitness evaluations (Max_FE) in this article, and thus $G_{max}$ can be obtained by $G_{max} = \text{Max\_FE}/NP$. It can be seen that the value of $p$ gradually decreases from $NP \times t^{-1}$ to 2 as the iteration proceeds. The rationale for adaptively setting $p$ is as follows. First, as shown in [40], the individuals of the population are distributed sparsely at the beginning of the iteration and then gradually cluster densely along with the evolution. Therefore, the distribution information of the population at different evolutionary stages can be effectively used by setting $p$ through Eq. (9). Second, an LG with more leaders benefits population diversity in the early phase of evolution, while an LG with fewer leaders focuses on convergence in the late phase of optimization. Hence, the balance between convergence and diversity of the population can be dynamically adjusted by relating the value of $p$ to the number of generations.
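The sizing rule can be sketched as a small helper; note that the exact decreasing schedule is our assumption (a linear decay between the stated endpoints), since the text only states that $p$ shrinks from $NP \times t^{-1}$ to 2:

```python
import math

def leaders_group_size(NP, t, g, G_max):
    """Adaptive LG size: decays from roughly NP/t at g = 0 to a floor of 2.
    The linear schedule between the two endpoints is our reconstruction."""
    return max(math.floor((NP / t) * (1 - g / G_max)), 2)
```

For instance, with `NP = 100` and `t = 5`, the group starts with 20 leaders and ends with 2.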

2) DETECTING THE LEADERS FOR LG
For the second problem, a division method (DIV ) is first introduced into ALD to partition the current population into a number of non-overlapping clusters, and then the best individual in each cluster will be identified and selected as a promising leader for LG.
Concretely, after p is determined by Eq. (9), a DIV is employed to partition the whole population into p clusters. In this article, three clustering methods are considered as the DIV .
• Crowding clustering (CC) [45]: In CC, a reference point ($\vec{r}$) is first randomly generated in the search space. Then, the individual $\vec{x}$ that is closest to $\vec{r}$ is determined. Afterward, the $NP/p - 1$ individuals nearest to $\vec{x}$ are selected and combined with $\vec{x}$ to form a cluster. Once the cluster is constructed, the individuals within it are eliminated from the population, and the other $p - 1$ clusters are constructed in the same way. The procedure of CC is presented in Algorithm 2.

• Speciation clustering (SC) [45]: In SC, all the individuals of $P^g$ are first sorted based on their fitness values. Then, the best individual of the population ($\vec{x}_{best}$) is chosen as a seed. Next, the $NP/p - 1$ individuals nearest to $\vec{x}_{best}$ are identified and combined with $\vec{x}_{best}$ to form a cluster. After the cluster is formed, the individuals that belong to it are removed from $P^g$, and the other $p - 1$ clusters are constructed analogously. The procedure of SC is given in Algorithm 3.

• K-means clustering (KC) [46]: In KC, $p$ individuals are randomly selected from the current population as the initial centroids of the $p$ clusters. Then, each individual is assigned to the cluster whose centroid is closest. After that, the centroid of each cluster is recalculated as the mean of all the individuals of that cluster. These steps are repeated until the assignment of the individuals to the clusters no longer changes, at which point the $p$ clusters have been constructed. The procedure of KC is shown in Algorithm 4.

Once the $p$ clusters have been constructed by a $DIV$, the $LG$ is built as follows:

$$LG = \{\vec{l}_1, \vec{l}_2, \ldots, \vec{l}_p\}, \quad \vec{l}_i = \arg\min_{\vec{x} \in C_i} f(\vec{x}) \qquad (10)$$

From Eq. (10), it can be found that the promising leaders in $LG$ are detected by considering both their fitness information in the objective space and their distribution information in the decision space. In this way, the problem of misguidance with clustered leaders is alleviated, and thus the guidance of the population towards the promising search regions can be more effective.
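As one concrete instance, KC-based leader detection could look like the following sketch (a plain Lloyd's-iteration K-means over the population; function names and the fixed iteration cap are our own assumptions):

```python
import numpy as np

def detect_leaders_kmeans(pop, fit, p, iters=20, rng=None):
    """Partition the population into p clusters with K-means (KC), then pick the
    fittest member of each cluster as its leader (minimization)."""
    rng = np.random.default_rng(rng)
    # random individuals serve as initial centroids
    centroids = pop[rng.choice(len(pop), p, replace=False)]
    for _ in range(iters):
        # assign each individual to its nearest centroid
        dists = np.linalg.norm(pop[:, None] - centroids[None], axis=2)
        labels = np.argmin(dists, axis=1)
        # recompute each centroid as the mean of its members
        for c in range(p):
            if np.any(labels == c):
                centroids[c] = pop[labels == c].mean(axis=0)
    leaders, neighborhoods = [], []
    for c in range(p):
        members = np.where(labels == c)[0]
        if len(members) == 0:
            continue
        leaders.append(members[np.argmin(fit[members])])  # best individual of the cluster
        neighborhoods.append(members)
    return leaders, neighborhoods
```

Each returned leader index is, by construction, the fitness minimizer within its own cluster, which is exactly the selection rule the paper describes for building LG.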

3) CONSTRUCTING THE NEIGHBORHOOD FOR EACH LEADER
For the third problem, based on the partition of the current population by a $DIV$, the members of the cluster that the leader belongs to are treated directly as its neighbors. Therefore, the neighborhood of $\vec{l}_i$ is constructed as follows:

$$N_i = C_i \setminus \{\vec{l}_i\} \qquad (11)$$

4) SUMMARIZING THE PROCEDURE OF ALD
By solving the above three problems, the procedure of ALD is summarized in Algorithm 5, where we can see that ALD includes four key steps.

Algorithm 5 Adaptive Leaders Detection (ALD)
1: INPUT: the current generation number $g$, $P^g$, the used $DIV$;
2: OUTPUT: $LG$ and $N_i$, $i = 1, 2, \ldots, p$;
3: Calculate $p$ using Eq. (9);
4: $\{C_1, C_2, \ldots, C_p\} \leftarrow DIV(P^g, p)$;
5: Build $LG$ using Eq. (10);
6: Construct $N_1, N_2, \ldots, N_p$ using Eq. (11).

To be specific, by using the adaptive leaders sizing strategy, the first step of ALD is to determine
the value of p related to the number of the current generation g. Next, the p clusters are obtained by applying a DIV method (i.e., CC, SC, or KC) to the current population. At the third step, the LG is built with p promising leaders, each of which is selected from an obtained cluster based on its fitness value. Finally, the neighborhood of each leader is constructed with the individuals of the same cluster.

C. NEIGHBORHOOD-BASED TOURNAMENT SELECTION (NTS)
To effectively utilize evolutionary information of the failed trial vectors for the local exploitation, the following two issues will be addressed by NTS.
1) How to detect the individuals of the current population being stagnant? 2) How to perform the tournament selection among the failed trial vectors and the detected stagnant individuals?

1) DETECTING THE STAGNANT INDIVIDUALS
As expressed in [47], stagnation is used to describe the phenomenon that the population cannot generate any better solutions. It means that all the individuals of a population might get trapped in the locally optimal regions and cannot be updated with improved solutions. Therefore, in this study, the individual being located in an unpromising or locally optimal area will be referred to as a stagnant individual.
To detect the stagnant individuals of the current population, the rule in Eq. (12) is used [47], where S_i is an indicator of the search state of x_i^g: S_i = 1 means x_i^g is in stagnation. SN_i records the number of consecutive generations in which x_i^g cannot be updated successfully: if x_i^g is replaced by its trial vector, SN_i = 0; otherwise, SN_i = SN_i + 1. TH is a pre-defined threshold for detecting stagnation.
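This rule can be illustrated with a minimal sketch (the variable names mirror the text, not the original code):

```python
def update_stagnation(replaced, SN, TH):
    """Per-generation update of the stagnation counters.
    `replaced[i]` is True when x_i was replaced by its trial vector this
    generation; SN[i] counts consecutive failures; S[i] = 1 flags x_i as
    stagnant once SN[i] reaches the threshold TH."""
    S = [0] * len(SN)
    for i, ok in enumerate(replaced):
        SN[i] = 0 if ok else SN[i] + 1   # reset on success, count failures
        if SN[i] >= TH:
            S[i] = 1                     # stagnant individual detected
    return S, SN
```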

2) TOURNAMENT SELECTION BASED ON THE NEIGHBORHOOD OF EACH LEADER
As discussed above, the trial vectors generated during the iterations can provide useful feedback information about the local fitness landscape of the optimization problem.
Based on this consideration, the tournament selection is executed based on the neighborhood of each detected leader in LG.
In each generation, each individual in {l_i} ∪ N_i is first updated via the DE operators, and the trial vector that loses the comparison with its parent vector is stored in Q_i.
After that, if Q_i ≠ ∅, the tournament selection is executed among the failed trial vectors in Q_i and the stagnant individuals in N_i. Specifically, the tournament selection with regard to l_i is given in Eq. (13): each failed trial vector in Q_i competes against all the stagnant individuals in N_i, and it replaces a competitor that is worse than it. In this study, to keep the diversity of the population, each failed trial vector only replaces the first worse competitor that it encounters.
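A hedged sketch of this tournament for one neighborhood is given below; the data structures (a list Q_i of failed trial vectors and a dictionary of stagnant neighbors keyed by index) are illustrative assumptions, not the authors' implementation:

```python
def neighborhood_tournament(Q_i, stagnant, fitness, f):
    """NTS tournament for one neighborhood (in the spirit of Eq. (13)):
    each failed trial vector replaces the FIRST stagnant neighbor it
    beats and then leaves Q_i, so every trial vector replaces at most
    one individual, which helps preserve diversity.
    `stagnant` maps a stagnant neighbor's index to its vector, `fitness`
    maps the same indices to fitness values, `f` evaluates a trial
    vector (minimization assumed)."""
    for trial in list(Q_i):
        f_trial = f(trial)
        for j in list(stagnant):
            if f_trial < fitness[j]:      # trial beats this stagnant neighbor
                stagnant[j] = trial       # replace the neighbor
                fitness[j] = f_trial
                Q_i.remove(trial)         # each trial replaces at most one
                break
    return stagnant, fitness
```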

3) SUMMARIZING THE PROCEDURE OF NTS
With the above two operators, the procedure of NTS is given in Algorithm 6. NTS updates the population for the next generation based on the neighborhood of each leader in LG. Concretely, for each leader l_i ∈ LG, NTS first checks the search state (S_i) of each neighbor within its neighborhood N_i. If a neighbor cannot be improved for TH consecutive generations, it is identified as a stagnant individual and S_i is set to 1; otherwise, S_i remains 0. Next, the stagnant neighbors are detected according to their S_i values. After that, the tournament selection is carried out among the failed trial vectors in Q_i and the detected stagnant individuals in N_i. If a failed trial vector is better than a stagnant neighbor, it replaces the stagnant neighbor and is removed from Q_i immediately. Finally, when the neighborhood of each leader has been updated by the tournament selection, the new population for the next generation is obtained by combining LG and the updated neighborhoods.

Algorithm 6 Neighborhood-Based Tournament Selection (NTS)
1: INPUT: LG, N_i, Q_i, i = 1, 2, . . . , p;
2: OUTPUT: P_{g+1};
3: For each leader l_i ∈ LG Do
4: Check the search state of each individual in N_i by Eq. (12);
5: Detect the stagnant individuals in N_i;
6: Replace the stagnant individuals with Q_i by Eq. (13);
7: End For
8: Obtain the population P_{g+1} for the next generation by combining N_1, N_2, . . . , N_p, and LG.

D. APPLYING DELDG TO JADE AND SHADE
As presented in Algorithm 1 and Figure 1, DELDG has a simple structure and can be applied to most DE variants easily. When implementing a specific DE algorithm under the DELDG framework, the only thing to consider is how to generate the offspring with both the evolutionary operators and the guidance of LG. In this article, DELDG is applied to two important and representative DE variants, JADE [9] and SHADE [12]. The DELDG variants with JADE and SHADE are termed JADE-LDG and SHADE-LDG, respectively. In JADE-LDG, an improved mutation strategy, named DE/current-to-LG, is designed under the guidance of LG, as shown in Eq. (14), where l_r is a leader randomly selected from LG, and x_r1^g and x_r2^g are the same as those in the original JADE. From Eq. (14), it is clear that the generation of the mutant vector in JADE-LDG is guided by the detected promising leaders in LG instead of the top best solutions as in the original JADE. Additionally, the size of LG is set by an adaptive leaders sizing strategy (described in 4.3.1) in JADE-LDG, rather than by a fixed value that controls the greediness of the mutation strategy in the original JADE.
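Since the body of Eq. (14) is not reproduced above, the following sketch assumes that DE/current-to-LG mirrors JADE's DE/current-to-pbest/1 form, v_i = x_i + F(l_r − x_i) + F(x_r1 − x_r2), with the p-best solution replaced by a randomly selected leader l_r ∈ LG; treat it as an illustration rather than the authors' exact operator:

```python
import numpy as np

def de_current_to_lg(x_i, LG, pop, archive, F, rng):
    """Hypothetical DE/current-to-LG mutation, assumed to follow JADE's
    current-to-pbest/1 template with the p-best replaced by a leader:
        v_i = x_i + F*(l_r - x_i) + F*(x_r1 - x_r2)
    x_r1 is drawn from the population and x_r2 from population ∪ archive,
    as in the original JADE."""
    l_r = LG[rng.integers(len(LG))]          # random leader from LG
    x_r1 = pop[rng.integers(len(pop))]
    union = pop + archive                     # population ∪ archive
    x_r2 = union[rng.integers(len(union))]
    return x_i + F * (l_r - x_i) + F * (x_r1 - x_r2)
```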
Compared with JADE-LDG, SHADE-LDG differs only in the parameter adaptation scheme. Specifically, JADE-LDG generates the control parameters based on the distribution around a single pair of µCr and µF, while SHADE-LDG uses a historical memory to store a set of µCr and µF pairs and then samples the control parameters based on the distribution around a randomly selected pair. The details of the parameter adaptation schemes in JADE-LDG and SHADE-LDG can be found in [9] and [12], respectively. Algorithm 7 depicts the offspring generation operators in JADE-LDG/SHADE-LDG. For each l_i ∈ LG, the mutation strategy (MO) with DE/current-to-LG, the crossover operator (XO), and the selection operator (SO) are applied sequentially to each individual x_j^g in {l_i} ∪ N_i. Additionally, as shown in Steps 8-14 of Algorithm 7, x_j^g is updated if its trial vector performs better than it; otherwise, the trial vector that loses the comparison with its parent is stored in Q_i.
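The contrast between the two schemes can be illustrated with a sketch of SHADE-style sampling (the normal/Cauchy distributions with scale 0.1 follow the standard JADE/SHADE settings; the truncation details are simplified here, and JADE-LDG would instead keep a single adaptive (µCr, µF) pair):

```python
import numpy as np

def sample_parameters_shade(M_CR, M_F, rng):
    """SHADE-style parameter sampling: pick one stored (mu_CR, mu_F) pair
    from the historical memory at random, then draw CR from a normal
    distribution and F from a Cauchy distribution centred on that pair."""
    k = rng.integers(len(M_CR))              # random memory slot
    CR = float(np.clip(rng.normal(M_CR[k], 0.1), 0.0, 1.0))
    F = M_F[k] + 0.1 * rng.standard_cauchy()
    while F <= 0.0:                          # regenerate non-positive F
        F = M_F[k] + 0.1 * rng.standard_cauchy()
    return CR, min(float(F), 1.0)            # F truncated to (0, 1]
```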
In summary, by integrating the three main components (i.e., ALD in Algorithm 5, NTS in Algorithm 6, and the offspring generation operators in Algorithm 7) into the general framework of DELDG, the complete procedure of JADE-LDG/SHADE-LDG is built. Obviously, the main differences between JADE-LDG/SHADE-LDG and the original JADE/SHADE are the use of ALD to guide the mutation operator and the employment of NTS to promote local exploitation.
In addition, to show clearly how DELDG works, the detailed steps of JADE-LDG are described below:
Step 1: Initialize the population, the archive set, µF, and µCr as those in JADE.
Step 2: Calculate the number of leaders p by Eq. (9).
Step 3: Apply a division method (e.g., CC in Algorithm 2, SC in Algorithm 3, or KC in Algorithm 4) to partition the whole population into p clusters C 1 , . . . , C p .
Step 4: Build the LG with the best individual in each cluster by Eq. (10).
Step 5: Construct the neighborhood N i for each leader l i ∈ LG by Eq. (11), and set Q i = ∅, i = 1, . . . , p.
Step 6: Generate the control parameters CR and F as those in JADE.
Step 7: Generate the mutant vector by the DE/current-to-LG strategy in Eq. (14).
Step 8: Generate the trial vector by the crossover operator.
Step 9: Select the better one between the trial vector and its parent vector.
Step 10: Store the trial vector that loses the comparison with its parent into the corresponding Q_i.
Step 11: Return to Step 6 until all the individuals of the population have been considered.
Step 12: Update the archive set, µF, and µCr as those in JADE.
Step 13: Check the search state of each individual in N_i by Eq. (12), i = 1, . . . , p.
Step 14: Detect the stagnant individuals in N_i and replace them with Q_i by Eq. (13).
Step 15: Establish the population P_{g+1} for the next generation by combining N_1, . . . , N_p, and LG.
Step 16: Go to Step 2 for the next iteration if the termination criterion is not satisfied; otherwise, output the best individual of the current population and terminate the algorithm.

2) COMPARISONS BETWEEN DELDG AND THE EXISTING DE VARIANTS
On the one hand, compared with the existing DE variants, the proposed DELDG is characterized by its leaders-detection-and-guidance (LDG) mechanism. As analyzed in Section III, two important issues, i.e., guiding the search with promising leaders and reusing the failed trial vectors, are still not well solved in most DE variants. Thus, the LDG mechanism with ALD and NTS is proposed to deal with these two issues. The former is used to detect multiple promising solutions by considering their information in both the objective and decision spaces, while the latter is employed to update the neighborhood of each leader with the failed trial vectors by the tournament selection. With these two components, the LDG mechanism is expected to effectively alleviate the above issues existing in the previous DE variants. On the other hand, similar to DELDG, the DE variants for multi-modal optimization usually divide the population into several subpopulations (or species) to locate multiple optimal solutions [45], [48]. Although the multi-modal DE variants employ a division technique similar to that of DELDG, there are significant differences between them: (1) The division operator in the multi-modal DE variants aims to maintain the diversity of the population so that multiple potential optimal solutions can be found and preserved during the optimization process [48]. Conversely, in the proposed DELDG, the division operator is employed to detect multiple leaders for guiding the search towards the promising areas. In addition, the number of leaders is dynamically adjusted by an adaptive leaders sizing strategy. (2) The selection operator in most multi-modal DE variants is performed by comparing the trial vector with a similar individual of the current population based on the Euclidean distance. Further, the trial vector is directly discarded if it performs worse than its competitor.
On the contrary, in DELDG, the trial vector that loses the comparison with its parent is archived and used to replace a stagnant individual of the same neighborhood if it is better than that competitor.

V. EXPERIMENTAL STUDY
In this section, a suite of benchmark functions from the CEC2013 special session on real-parameter optimization [49] is used to evaluate the performance of the proposed DELDG framework. The CEC2013 test set consists of 28 functions, which can be classified into three categories based on their properties: unimodal functions (F1-F5), basic multimodal functions (F6-F20), and composition functions (F21-F28). More details can be found in [49].
In the experiments, each compared algorithm is run 30 times independently on every test function. The Max_FE for each function is set to 10^4 × D as the termination criterion of a run. At the end of a run, the average value and the standard deviation of the best error obtained by the algorithm are recorded for comparison. In addition, unless otherwise stated, the parameters for the compared DE algorithms and their DELDG counterparts are set as follows: NP = 100, TH = 50, and t = 10. The other parameters under the DELDG framework are kept the same as those in the original papers.
To test the statistical significance between DELDG and the compared algorithms, single- and multiple-problem analyses by Wilcoxon's signed-rank test [50], [51] are executed. In the single-problem analysis at α = 0.05, ''w/t/l'' denotes the number of functions on which the considered algorithm is significantly better than, similar to, and worse than its competitor, respectively [50]. In the multiple-problem analysis, R+ and R− denote the rank sums over all the test functions on which the considered algorithm performs better and worse than its competitor, respectively, and ''+'', ''='', and ''−'' mean that the considered algorithm is overall better than, equal to, and worse than the compared algorithm at the corresponding significance level, respectively [50]. In this article, only the statistical results of the comparisons are given for brevity; the detailed results with errors and standard deviations are shown in the supplementary file.
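A minimal sketch of the multiple-problem bookkeeping using SciPy is shown below; it assumes minimization, pairs the mean errors per function, and derives R+, R−, and the ''+''/''=''/''−'' verdict (the function name is ours):

```python
from scipy import stats

def multiple_problem_wilcoxon(errors_a, errors_b, alpha=0.05):
    """Multiple-problem analysis: pair the mean errors of two algorithms
    over the benchmark functions, run Wilcoxon's signed-rank test, and
    report the rank sums R+ / R- with the verdict symbol used in the
    tables ('+', '=', or '-') for algorithm A."""
    diffs = [a - b for a, b in zip(errors_a, errors_b)]
    ranks = stats.rankdata([abs(d) for d in diffs])
    r_plus = sum(r for r, d in zip(ranks, diffs) if d < 0)   # A better (smaller error)
    r_minus = sum(r for r, d in zip(ranks, diffs) if d > 0)  # A worse
    p = stats.wilcoxon(errors_a, errors_b).pvalue
    verdict = "=" if p >= alpha else ("+" if r_plus > r_minus else "-")
    return r_plus, r_minus, p, verdict
```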
Based on the purpose of this study, the experiments are carried out with the following points:
1) Three different division methods (i.e., CC in Algorithm 2, SC in Algorithm 3, and KC in Algorithm 4) are used in DELDG respectively to analyze their effect on the performance of DELDG, which is shown in Section V-A.
2) The effectiveness of the two proposed components (i.e., ALD and NTS) is investigated to show their contributions to the performance of DELDG, which is given in Section V-B.
3) The sensitivity of the parameter settings is studied to demonstrate the effect of these parameters on the performance of DELDG, which is presented in Section V-C.
4) DELDG is applied to four JADE-based variants to verify the effectiveness of the proposed framework, which is shown in Section V-D.
5) The comparisons between DELDG and other advanced DE variants, as well as the state-of-the-art EAs, are made to display the competitive performance of the proposed framework, which are given in Section V-E to Section V-G.
6) DELDG is applied to the real-world numerical optimization problems from the CEC2011 competition [52] to evaluate its performance on the real-world problems, which is presented in Section V-H.
In the following subsections, JADE-LDG and SHADE-LDG will be used as two representative DELDG variants for comparison.

A. COMPARISON OF DIFFERENT DIVISION METHODS
To investigate the influence of the division method in ALD on the performance of DELDG, the DELDG variants with different division methods, i.e., DELDG with CC (DELDG-CC), DELDG with SC (DELDG-SC), and DELDG with KC (DELDG-KC), are considered. The statistical comparison results between them are summarized in Tables 1 and 2.
As can be seen from Table 1, all the DELDG variants are capable of significantly enhancing the performance of the original DE on the majority of the test functions. To be specific, when compared with the original JADE algorithm, JADE-LDG-CC, JADE-LDG-SC, and JADE-LDG-KC obtain significantly better results on 13, 12, and 10 functions, respectively, and are outperformed by it on 2, 3, and 4 functions, respectively. When compared with the original SHADE algorithm, SHADE-LDG with CC, SC, and KC is significantly better than it on 16, 14, and 12 functions, respectively, and is worse than it on no more than 4 functions. Also, based on the Friedman test [50] in Table 2, DELDG with CC gets the first ranking for both cases, followed by the variants with SC and KC. The original JADE and SHADE algorithms are ranked last in their respective comparisons.
From the above results, we can draw some conclusions: 1) DELDG with CC, SC, or KC can greatly improve the performance of the considered DE algorithm, which verifies the effectiveness of different division methods in constructing LG under the proposed framework, and 2) DELDG with CC achieves the best results in terms of the average ranking among the three variants. Compared with DELDG with SC and KC, the better performance of DELDG-CC is largely attributed to the advantage of CC in finding and maintaining the detected promising leaders [45]. In the following sections, only CC is used as the division method in DELDG, and DELDG-CC is referred to as DELDG for the sake of simplicity.

B. EFFECTIVENESS OF THE PROPOSED COMPONENTS
To study the effectiveness of the two proposed components (i.e., ALD and NTS), the following variants of DELDG are used for comparison.
• DE-ALD: the proposed framework with ALD only.
• DE-NTS: the proposed framework with NTS only.
In these two variants, DE-ALD only uses ALD to build LG for guiding the mutation process but without NTS, while DE-NTS only uses NTS to execute the tournament selection among the failed trial vectors and the stagnant individuals based on the DIV but without constructing LG. The comparison results between DELDG and these two variants are shown in Table 3.
From Table 3, some comments can be given below:
• Compared with the original DE variant, both DE-ALD and DE-NTS can achieve better results in most cases according to the multiple-problem statistical analysis. Further, based on the p-value, JADE-NTS is significantly better than JADE overall at both α = 0.05 and α = 0.1, while SHADE-NTS significantly outperforms SHADE overall at α = 0.1.
• In the case of DE-ALD, JADE-LDG and SHADE-LDG are significantly better than JADE-ALD and SHADE-ALD on most functions, respectively. Based on the results of the multiple-problem statistical analysis, both JADE-LDG and SHADE-LDG obtain higher R+ values than R− values. In addition, the p-values are less than 0.05, which indicates that JADE-LDG and SHADE-LDG outperform JADE-ALD and SHADE-ALD overall.
• In the case of DE-NTS, JADE-LDG and SHADE-LDG obtain slightly better results than JADE-NTS and SHADE-NTS, respectively. Although both JADE-LDG and SHADE-LDG obtain higher R+ values than R− values, no significant differences between them can be observed.
According to the above comparisons, some interesting observations can be obtained: 1) Both ALD and NTS are beneficial to improving the performance of JADE and SHADE in most cases; 2) NTS plays a more significant role than ALD in enhancing the performance of DE; 3) The combination of ALD and NTS results in synergy that further enhances the search ability of DE. In general, these observations verify the synergy between the two proposed components under the DELDG framework.

C. SENSITIVITY ANALYSIS OF PARAMETERS
To investigate the sensitivity of TH for detecting the stagnant individuals and t for setting the LG size, comparisons among the DELDG variants with different parameter values are made. Here, three functions with different characteristics are selected from the CEC2013 test set, i.e., the unimodal function F4, the basic multimodal function F13, and the composition function F25. From Figures 2 and 3, we can obtain some observations: 1) for F4, the DELDG variants with a larger TH value (e.g., 50, 60, or 70) have a faster convergence speed than those with a smaller TH value; 2) for F13, the DELDG variants with different TH values get similar results; 3) for F25 at 30D, the DELDG variants with a smaller TH value (e.g., 10, 20, or 30) can obtain better results than those with a larger TH value.
Therefore, it can be concluded that the performance of DELDG is not sensitive to the setting of TH in most cases. For the unimodal functions, DELDG with a larger TH value can provide more robust results, while, for the composition functions, DELDG with a smaller TH value might achieve better performance. In the future, adaptive or self-adaptive parameter control techniques will be studied to select the appropriate value of TH for different types of functions. As shown in Figures 4 and 5, the DELDG variant with a larger t value is capable of obtaining slightly better results. Overall, DELDG with t = 10 or 15 can achieve a better performance on the considered unimodal and basic multimodal functions.

D. APPLYING DELDG TO FOUR JADE-BASED VARIANTS
The statistical comparison results between the four JADE-based variants and their DELDG counterparts are summarized in Table 4.
Some findings can be obtained from Table 4:
• In the case of D = 30, DELDG obtains better results than the corresponding JADE-based variants overall. Concretely, JADE-LDG, SHADE-LDG, L-SHADE-LDG, and jSO-LDG are significantly better than the corresponding competitors on 13, 16, 6, and 10 functions, respectively.
• According to the multiple problem statistical analysis, DELDG obtains the higher R+ values than the R− values in all the cases. Further, the p values are less than 0.05 in two cases (i.e., JADE and SHADE).
• DELDG achieves great performance improvement over JADE and SHADE on most test functions. For L-SHADE and jSO, although DELDG does not bring significant improvement, the proposed framework is able to obtain better results overall.
From the above results, we can conclude that the proposed DELDG framework is beneficial to the performance enhancement of the JADE-based variants on the test functions. These improvements might derive from two aspects. First, the constructed LG in DELDG guides the mutation process of the JADE-based variants with the detected promising leaders, instead of the top best solutions in these variants. Second, the update of population via NTS can further take advantage of the evolutionary information of the potential trial vectors for the local exploitation.

E. COMPARISON WITH THE CEC2013 COMPETITORS
To evaluate the performance of the proposed framework, SHADE-LDG is used as the representative DELDG algorithm to compare with the advanced DE variants and the non-DE EAs participating in the CEC2013 competition [49].

1) COMPARISON WITH EIGHT DE VARIANTS
From Table 5, SHADE-LDG exhibits its superiority over the other eight DE variants. Concretely, based on the multiple-problem statistical analysis, SHADE-LDG obtains higher R+ values than R− values in all the cases. Besides, the p-value demonstrates that SHADE-LDG is significantly better than all the competitors for the functions at both 30D and 50D. Further, the results of the Friedman test in Table 6 also show the outstanding performance of SHADE-LDG among all the compared algorithms: SHADE-LDG is ranked first in terms of the average ranking values for the functions at both 30D and 50D. Therefore, the above comparisons demonstrate that SHADE-LDG is capable of providing significantly better solutions for the test functions when compared with the other advanced DE variants in the CEC2013 competition.

2) COMPARISON WITH TEN NON-DE EAs
In this section, comparisons between SHADE-LDG and ten non-DE EAs participating in the CEC2013 competition [49] are made. These non-DE EAs consist of NBIPOP-ACMA-ES [61], iCMAES-ILS [62], MVMO [63], DRMA-LSCh-CMA [64], CMAES-RIS [65], ABC-SPSO [66], fk-PSO [67], TPC-GA [68], PLES [69], and CDASA [70]. The results of these compared EAs are directly obtained from their original papers. The statistical comparison results are presented in Tables 7 and 8, which clearly show that SHADE-LDG can also obtain significantly better results in most cases. To be specific, first, based on the multiple-problem analysis, SHADE-LDG gets higher R+ values than R− values in eight and seven cases for the functions at 30D and 50D, respectively. Second, the p-values are less than 0.05 in seven and four cases for the functions at 30D and 50D, respectively, which demonstrates the superiority of SHADE-LDG over these compared EAs. Third, according to the Friedman test, iCMAES-ILS and NBIPOP-ACMA-ES are ranked first and second for the functions at both 30D and 50D, respectively, followed by SHADE-LDG as the third-best among all the compared EAs. Based on the above observations, some conclusions can be drawn: 1) SHADE-LDG has demonstrated its outstanding performance when compared with these state-of-the-art EAs from the CEC2013 competition; 2) although iCMAES-ILS and NBIPOP-ACMA-ES are slightly better than SHADE-LDG, no significant differences can be observed between them; 3) compared with the complicated structures and sophisticated operators in iCMAES-ILS and NBIPOP-ACMA-ES, SHADE-LDG has the advantages of a simple structure and easy implementation; 4) the effectiveness of the proposed framework has been verified in comparison with these state-of-the-art EAs, which indicates that DELDG is also a competitive algorithm for the test functions.

F. COMPARISON WITH UP-TO-DATE DE VARIANTS
To further study the effectiveness of DELDG, comparisons of DELDG with 11 up-to-date DE variants are made. These DE variants include ZEPDE [16], DI-DE [18], IIN-SHADE [21], EFADE [22], AGDE [23], UMS-SHADE [27], MPEDE [28], DMPSADE [71], CoBiDE [72], EBSHADE [73], and FADE [30]. ZEPDE, DI-DE, IIN-SHADE, EFADE, AGDE, UMS-SHADE, MPEDE, and FADE have been briefly introduced in Section III. DMPSADE is a self-adaptive DE variant with discrete mutation control parameters, where each individual has its own control parameters and mutation strategy [71]. CoBiDE is a novel DE variant in which covariance matrix learning and a bimodal distribution parameter setting are used [72]. EBSHADE is a DE framework that hybridizes the mutation strategy of SHADE with a new mutation strategy, ''DE/current-to-ord_pbest/1'' [73]. These DE variants are selected for comparison due to their outstanding performance on the CEC2013 test functions. Note that three recently proposed DE frameworks with SHADE (i.e., IIN-SHADE, UMS-SHADE, and EBSHADE) are also included to show the advantages of DELDG over other excellent DE frameworks. The results of these DE variants are directly obtained from their published papers. Tables 9 and 10 summarize the statistical comparison results between SHADE-LDG and the compared DE variants.
As shown in Tables 9 and 10, SHADE-LDG is capable of achieving significantly better results than most compared DE variants. To be specific,
• For the functions at 30D, SHADE-LDG obtains higher R+ values than R− values in all the cases according to the multiple-problem statistical analysis. Additionally, the p-values demonstrate that significant differences can be observed between SHADE-LDG and the compared DE variants in eight and nine cases at α = 0.05 and α = 0.1, respectively. In these cases, SHADE-LDG significantly outperforms the compared DE variants overall.
• For the functions at 50D, SHADE-LDG exhibits consistently good performance when compared with these DE variants. The results of the multiple-problem statistical analysis also indicate the superior performance of SHADE-LDG in most cases.
• Compared with the three DE frameworks with SHADE (i.e., IIN-SHADE, UMS-SHADE, and EBSHADE), SHADE-LDG obtains better results in most cases, the exception being UMS-SHADE for the functions at 50D. UMS-SHADE gets slightly better results than SHADE-LDG for the functions at 50D, but no significant differences can be observed between them.
• According to the Friedman test, SHADE-LDG is ranked first for the functions at 30D, followed by UMS-SHADE and EBSHADE. For the functions at 50D, UMS-SHADE is ranked first, followed by SHADE-LDG. Overall, the mean aggregated ranks across all the functions and all the dimensions show that SHADE-LDG is the best among all the compared DE variants.
From the above discussions, the superior and competitive performance of SHADE-LDG has been demonstrated by comparing with these up-to-date DE variants, which further confirms the effectiveness of the proposed framework.

G. OVERALL COMPARISON
From the results in Sections V-E and V-F, the effectiveness of DELDG has been clearly demonstrated when compared with the 18 CEC2013 competitors and 11 up-to-date DE variants on the CEC2013 test functions. To further show the advantages of the proposed framework overall, the average ranking values of all the competitors by the Friedman test are shown in Figure 6.
As depicted in Figure 6, some interesting observations can be obtained:
• Among all the 30 competitors, the proposed SHADE-LDG algorithm achieves better overall results than most of the competitors. Specifically, SHADE-LDG obtains the second and fourth ranks for the functions at 30D and 50D, respectively.
• Compared with the 18 EAs from the CEC2013 competition, SHADE-LDG is only outperformed by iCMAES-ILS for the functions at 30D and is outperformed by iCMAES-ILS, NBIPOP-ACMA-ES, and UMS-SHADE for the functions at 50D.
• Compared with the 11 up-to-date DE variants, SHADE-LDG obtains better ranking values than most of them for the functions at both 30D and 50D.
• Among all the DE-based algorithms in this comparison, SHADE-LDG can achieve the best results overall.
H. COMPARISON ON THE CEC2011 REAL-WORLD PROBLEMS
From the results of Table 12, SHADE-LDG achieves better performance in most cases. Specifically, according to the multiple-problem statistical analysis, SHADE-LDG obtains higher R+ values than R− values in 13 out of 15 cases, the exceptions being ED-DE and DE-∧Cr. Based on the p-values, SHADE-LDG significantly outperforms 9 EAs overall. The results of the Friedman test in Table 13 show that SHADE-LDG obtains a better ranking than most compared algorithms. To be specific, DE-∧Cr and GA-MPC are ranked first and second, respectively, followed by SHADE-LDG as the third-best among all the compared algorithms. Note that the main purpose of this study is to propose a general framework for further improving the performance of DE with the promising leaders. Therefore, DE-∧Cr, as a hybrid DE variant, may be further enhanced by combining it with DELDG for real-world problems, which will be studied in future work.
From the above results, we can draw some conclusions. First, the proposed framework is capable of bringing great benefits to the performance of SHADE on real-world problems. Second, the competitive and outstanding performance of SHADE-LDG has been verified by comparing it with the state-of-the-art EAs, which indicates that SHADE-LDG is an effective alternative for solving complex real-world problems.

VI. CONCLUSION AND FUTURE RESEARCH
To guide the search process of DE with multiple promising individuals, an enhanced DE framework with a leaders-detection-and-guidance mechanism, termed DELDG, is proposed for continuous optimization problems. In DELDG, two novel operators are designed and incorporated into DE. One is an adaptive leaders detection (ALD), which is used to adaptively select the promising individuals of the population to build a leadership group for guiding the mutation process. The other is a neighborhood-based tournament selection (NTS), which is employed to enhance the local exploitation in the neighboring region of each leader with the potential trial vectors. To evaluate the effectiveness of the proposed framework, DELDG is applied to four JADE-based variants for solving a suite of 28 real-parameter functions from the IEEE CEC2013 and 17 real-world problems from the IEEE CEC2011. Simulation results have clearly demonstrated the superior performance of DELDG when compared with the advanced DE variants and the state-of-the-art EAs participating in the CEC2013 competition, as well as 11 up-to-date DE variants. Further, for the real-world problems, the competitive performance of DELDG has been shown when compared with 14 state-of-the-art EAs. Also, the influence of different division methods, the effectiveness of the proposed components, and the sensitivity analysis of parameters have been investigated.
In the future, the present work will be extended in the following directions. First, other sophisticated division methods will be studied under the DELDG framework to show their potential in detecting the promising leaders. Second, comparisons between DELDG and other representative DE variants on more test suites will be made to further investigate the effectiveness of DELDG. Third, DELDG will be extended to solve other optimization problems, such as multi-modal, multi-objective, or multi-tasking problems.