Co-Evolutionary Niching Differential Evolution Algorithm for Global Optimization

Preserving an appropriate population diversity is critical for the performance of evolutionary algorithms. In this paper, we present a co-evolutionary niching strategy (CoEN) that dynamically evolves appropriate niching methods, and we incorporate it into differential evolution (DE) to preserve the population diversity. The proposed CoEN strategy works by optimizing a criterion that involves both the fitness improvement and the population diversity resulting from employing the niching methods during the evolution of DE. To verify the performance of the proposed method, an extensive test on benchmark functions taken from CEC2019 and CEC2014 has been carried out. The results show the significance of the proposed CoEN and, by incorporating CoEN, the resulting DE achieves performance better than or competitive with that of related algorithms.


I. INTRODUCTION
Optimization problems are frequently encountered in many fields such as engineering design, data analysis, financial planning and business [1]. Global optimization algorithms aim to identify the optimal or a near-optimal solution. As the solution space usually involves many local optima, avoiding these local optima is a major challenge for global optimization. Evolutionary algorithms (EAs) [2]-[4], which are able to avoid local optima of the solution space, have been widely employed for global optimization [5], [6]. Among various EAs, differential evolution (DE) [7], [8] turns out to be a viable EA. Unlike other EAs, DE generates offspring by employing scaled differences among randomly sampled individuals in the population, which makes it self-adaptive to the fitness landscape of the search space [9]. However, like other EAs, DE suffers from premature convergence, especially for problems involving a complex search space with many local optima.
The associate editor coordinating the review of this manuscript and approving it for publication was Ehab Elsayed Elattar.
VOLUME 9, 2021. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
To address premature convergence, preserving the population diversity is critical and many schemes have been proposed. Among these schemes, the niching method is perhaps the most popular. A niching method refers to the technique of preserving multiple niches, or favorable parts of the solution space, possibly around multiple solutions. Niching methods have been widely incorporated into EAs to identify multiple solutions of a problem [10]. As niching methods can be used to preserve population diversity, they have also been employed by EAs to avoid local optima of the solution space. For example, niching methods including crowding, speciation and fitness sharing have been incorporated into DE to preserve population diversity [11]-[15]. Existing niching based EAs, however, typically employ a single niching method during evolution [16]-[20]. Since different niching methods possess different diversity preserving properties, employing a fixed niching method during DE evolution may limit performance. To alleviate this issue, niching ensemble strategies [21], [22], which combine multiple niching schemes or employ a certain niching method with various parameter settings, have been introduced and incorporated into EAs [14], [23]-[25]. These studies show that using multiple niching methods can improve on the performance of niching based EAs that rely on a single niching scheme. However, in these strategies, the niching methods are not allowed to change dynamically during the evolution process, which limits their performance. In this paper, a co-evolutionary niching strategy (CoEN), which can dynamically evolve appropriate niching schemes, is proposed and incorporated into DE for optimization.
The main goal of this study is to produce a flexible, dynamic niching strategy, which can synergistically generate appropriate niching schemes depending on the state of the evolution. This study differs significantly from previous studies, which rely on a single fixed niching scheme or an ensemble niching scheme. The proposed strategy has been incorporated into JADE and evaluated on benchmark functions taken from CEC2019 and CEC2014. The results show the significance of the proposed CoEN and, by incorporating CoEN, the resulting DE achieves performance better than or competitive with that of related algorithms.
The rest of the paper is organized as follows. Related studies are reviewed in Section II. The proposed method is presented in Section III. After that, we report experimental results in Section IV along with an analysis and discussion. Finally, a summary and remarks of the study are given in Section V.

II. RELATED WORKS
This section will first briefly review JADE, which is used as a base for our proposed co-evolutionary niching-based DE algorithms. This is followed by reviewing DE variants, niching-based DE algorithms as well as ensemble and adaptive niching-based algorithms.
DE is a simple yet powerful EA for global optimization [26]. Since the concept of DE was formally introduced by Storn and Price [7], it has attracted increasing research interest. Many variants of DE have been proposed, among which JADE represents one of the state-of-the-art DE variants. JADE implements a mutation strategy called ''DE/current-to-pbest'' with an optional external archive. The DE/current-to-pbest mutation strategy is defined as:
$$v_{i,g} = x_{i,g} + F_i \, (x^{p}_{best,g} - x_{i,g}) + F_i \, (x_{r1,g} - \tilde{x}_{r2,g}) \tag{1}$$
where $v_{i,g}$ is the mutant vector and $F_i$ is the scale factor of the current individual $x_{i,g}$, $x^{p}_{best,g}$ is randomly chosen from the top $100p\%$ individuals in the current population $P$ with $p \in (0, 1]$, $x_{r1,g}$ is randomly selected from $P$, and $\tilde{x}_{r2,g}$ is randomly selected from the union $P \cup A$. Here, the archive $A$ is used to collect a set of inferior solutions. In JADE, at the beginning of each generation, the crossover rate (CR) of each individual is randomly drawn from a normal distribution with mean $\mu_{CR}$ and standard deviation 0.1, while the scale factor (F) is randomly drawn from a Cauchy distribution with location parameter $\mu_F$ and scale parameter 0.1. After each generation, $\mu_{CR}$ is updated based on the arithmetic mean ($mean_A$) of the successful CR values, while $\mu_F$ is updated based on the Lehmer mean ($mean_L$) of the successful F values. The procedure of JADE is shown in Algorithm 1.
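The mutation and parameter sampling described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation used in this work: the function names are hypothetical, minimization is assumed, and the truncation rules for CR and F (clip CR to [0, 1], redraw F when it is non-positive, clip F to 1) are an assumption taken from the original JADE description rather than from the text here.

```python
import math
import random


def sample_cr(mu_cr, rng):
    """Draw CR from N(mu_cr, 0.1) and clip it to [0, 1] (assumed JADE rule)."""
    return min(1.0, max(0.0, rng.gauss(mu_cr, 0.1)))


def sample_f(mu_f, rng):
    """Draw F from Cauchy(mu_f, 0.1); redraw if F <= 0, clip to 1 if F > 1."""
    while True:
        # Inverse-CDF sampling of a Cauchy variate.
        f = mu_f + 0.1 * math.tan(math.pi * (rng.random() - 0.5))
        if f > 0.0:
            return min(f, 1.0)


def current_to_pbest(pop, fitness, archive, i, f, p, rng):
    """Return the DE/current-to-pbest mutant for individual i (minimization)."""
    n = len(pop)
    # Indices of the top 100p% individuals (at least one).
    n_best = max(1, int(round(p * n)))
    ranked = sorted(range(n), key=lambda k: fitness[k])
    pbest = pop[rng.choice(ranked[:n_best])]
    # x_r1 is drawn from P and must differ from x_i.
    r1 = rng.choice([k for k in range(n) if k != i])
    # x_r2 is drawn from P united with A, excluding x_i and x_r1.
    union = [pop[k] for k in range(n) if k not in (i, r1)] + list(archive)
    x_r2 = rng.choice(union)
    x_i, x_r1 = pop[i], pop[r1]
    return [xi + f * (pb - xi) + f * (a - b)
            for xi, pb, a, b in zip(x_i, pbest, x_r1, x_r2)]
```

With `f = 0` the mutant coincides with the current individual, which is a quick sanity check on the two difference terms.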

Algorithm 1: Procedure of JADE
Set $\mu_{CR} = 0.5$, $\mu_F = 0.5$, $A = \emptyset$;
Create a random initial population $\{x_{i,0} \mid i = 1, 2, \cdots, NP\}$;
Randomly choose $x^{p}_{best,g}$ as one of the $100p\%$ best vectors;
Randomly choose $x_{r1,g} \neq x_{i,g}$ from the current population $P$;
Randomly remove solutions from $A$ so that $|A| \leq NP$;

Apart from JADE, many other DE variants have also been proposed. For example, in [27], a Success-History based Adaptive DE (SHADE) is proposed, in which the information of recently successful parameter settings is used to adaptively adjust the parameters of DE. LSHADE [28] extends SHADE by embedding a linear population size reduction scheme. The performance of LSHADE is further enhanced by Awad et al. [29], where the values of F and CR are configured self-adaptively. In [30], Brest et al. devised a DE variant called iLSHADE to improve LSHADE by employing a memory based parameter updating mechanism. In [25], Awad et al. proposed to adaptively set the control parameters by using an ensemble sinusoidal mechanism. In [31], Meng et al. proposed PaDE, which improves traditional DE through a grouping strategy for parameter adaptation together with a parabolic population size reduction scheme. The grouping strategy sets the crossover rate during evolution, while the parabolic scheme is devised to tackle the drawback of the linear population size reduction scheme. Additionally, a time stamp based mutation strategy is introduced to avoid inferior solutions as well as to enhance the diversity of external individuals. In [33], Cui et al. proposed to enhance traditional DE based on the concept of working specialization. In this method, the population is divided into sub-populations, which are assigned different DE strategies for either exploitation or exploration.
In [34], a hybrid framework named HMJCDE, which combines the merits of a modified JADE (MJADE) and a modified CoDE (MCoDE), is developed to deal with global optimization problems. The results show that HMJCDE is generally able to outperform state-of-the-art methods in terms of accuracy, convergence rate and robustness. In [35], Cui et al. sought to strengthen the performance of DE by obtaining the guidance of each individual from multiple elites concurrently and independently. Apart from DE, a recently proposed nature-inspired algorithm, the gaining-sharing knowledge based algorithm (GSK) [32], has also been shown to be promising for global optimization. GSK, which is inspired by the process of gaining and sharing knowledge during the human life span, is found to compare favorably to many biologically inspired and nature-inspired optimization methods.
Niching methods help EAs preserve the capability to explore the search space by maintaining a diverse population. Thomsen [36] proposed a crowding-based DE to preserve population diversity. Li et al. [37] proposed a Particle Swarm Optimizer (PSO) based on a fitness sharing niching scheme. Gong et al. [38] employed the neighborhood information of individuals to divide a population into niches. In [15], species-based and crowding-based niching methods were incorporated into DE to deal with multimodal optimization problems. In [39], a hierarchical tree structure is used to form niches for preserving population diversity during evolution. In this method, nodes located at the top level of the tree denote the centers of niches, while the nodes beneath each of them represent the individual members of that niche. In [40], a niching DE algorithm, termed automatic niching DE, was proposed for solving multimodal optimization problems.
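As a concrete illustration of how a niching method preserves diversity, the replacement step of a crowding scheme can be sketched as follows. This is a generic sketch of nearest-neighbor crowding under stated assumptions (minimization, Euclidean distance, comparison against the whole population); the function name is hypothetical and the code is not taken from any of the cited works.

```python
import math


def crowding_replace(pop, fitness, child, child_fit):
    """Let `child` compete with its nearest individual (minimization).

    The child replaces the most similar individual only if it is at least as
    fit, so individuals in distant niches are never displaced by it.
    Returns the index of the individual the child competed against.
    """
    nearest = min(range(len(pop)), key=lambda k: math.dist(pop[k], child))
    if child_fit <= fitness[nearest]:
        pop[nearest] = list(child)
        fitness[nearest] = child_fit
    return nearest
```

Because competition is local, a good offspring near one optimum cannot wipe out individuals clustered around another optimum.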
Typically, niching based EAs are based on a single niching scheme. Since different niching methods possess different diversity preserving properties, employing a fixed niching method during EA evolution may limit performance. To address this issue, the idea of ensemble [22], [24] has been adopted to employ multiple niching schemes. The existing ensemble niching methods generally differ in the following three aspects: the components of the niching candidate pool, the configuration of the parameter candidate pool, and the rule for selecting candidate niching schemes. For example, in [23], four niching methods are assigned to four subpopulations and each is responsible for evolving one subpopulation. In [41], an ensemble of restricted tournament selection (RTS) with different window sizes is incorporated into a DE algorithm for global optimization. Similarly, an ensemble of clearing methods with different parameter values has also been proposed and embedded into DE in [14] to handle multimodal problems, where the population is divided into three subpopulations and each of them is assigned a parameterized clearing method. These ensemble strategies alleviate, to a certain extent, the issue of using a single fixed niching method. However, the niching schemes used in these strategies are fixed, rather than being allowed to change dynamically according to the search situation, which limits their performance.
It should also be mentioned that the performance of niching schemes relies on the setting of their parameters. For example, in the fitness sharing scheme, the niche radius is closely related to its effectiveness. It has been widely established that using a fixed niche radius during evolution may significantly deteriorate performance. Many attempts have therefore been made to cope with this issue by proposing adaptive parameter control strategies. For example, in [42], Jelasity and Dombi adopted a radius function to control the niche radius, while Dick [43] employed the fitness landscape information to identify a single uniform niche radius for all niches. Other works that adaptively set the niche radius can be found in [44]-[46].
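To make the role of the niche radius concrete, the classical fitness sharing computation can be sketched as below. This is a minimal sketch of the textbook sharing function, assuming maximization, Euclidean distance and a positive radius; the function name and the default `alpha` are illustrative choices, not settings from this work.

```python
import math


def shared_fitness(pop, raw_fitness, radius, alpha=1.0):
    """Divide each raw fitness by its niche count (maximization assumed).

    The sharing function is sh(d) = 1 - (d / radius) ** alpha for d < radius
    and 0 otherwise, so only neighbors closer than `radius` share fitness.
    """
    shared = []
    for x_i, f_i in zip(pop, raw_fitness):
        niche_count = 0.0
        for x_j in pop:
            d = math.dist(x_i, x_j)
            if d < radius:
                niche_count += 1.0 - (d / radius) ** alpha
        # niche_count >= 1 because every individual is at distance 0 from itself.
        shared.append(f_i / niche_count)
    return shared
```

A radius that is too small leaves every individual in its own niche (no diversity pressure), while a radius that is too large merges distinct optima into one niche, which is why a fixed value can perform poorly.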

III. CO-EVOLUTIONARY NICHING BASED DE (CoENDE)
The basic idea of the co-evolutionary niching strategy lies in synchronously mutating and updating the niching schemes during the process of evolution. In the proposed method, the mutation probability associated with each niching scheme is generated based on a criterion that considers both the fitness improvement and the population diversity resulting from its application during evolution. In the following sections, we describe the details of the proposed strategy. The proposed strategy is incorporated into JADE for evaluation. In JADE, the optional archive A is used to record a set of inferior solutions. Apart from the optional archive A, our method also introduces an external archive B to preserve the elites found in each population.
The procedure of the resulting method, CoENDE, is presented in Algorithm 2.
Algorithm 2: Procedure of CoENDE
Select niching scheme $N_j$ from NPop;
According to $N_j$, generate matingPool using Algorithms 3-5;
Use Eq. (1) to generate a trial vector;
Calculate $div_1$ and $div_2$ using Eqs. (7) and (8);
Update the sliding window: $SW_j \leftarrow \min\{f(x_{best}), Sr_j(-1)\}$;
Calculate Prob and Q according to Eqs.

[...] niching replacement component in RCP along with their associated parameters. As the six niching components possess different degrees of selection or replacement pressure, the niching schemes formed from them have different characteristics for diversity preservation. Consequently, the NCP can be appropriately used for dynamic niching scheme generation and evolution.
The same operation is carried out between d2 and c2.

B. CoEN FRAMEWORK AND WORKFLOW
The proposed CoENDE consists of three layers: a strategy layer, an interface layer and an optimization layer, as shown in Fig. 1. The strategy layer contains the proposed CoEN strategy for evolving niching schemes. The interface layer is responsible for niching scheme generation, while the optimization layer contains a niching-based DE for evolution. The workflow of CoENDE is shown in Fig. 2. In CoENDE, the evolution is divided into multiple evolution periods, each spanning a learning period of $L_P$ generations. At each generation, an offspring population is generated from the parent population. During each generation, certain well-performing pivotal parameter values are memorized in predefined memories. At the end of each generation, the control parameters of each individual $x_i$ are independently generated according to certain adaptive rules. At the end of $L_P$ generations, the goodness of the niching schemes is evaluated and the results are used to evolve the niching schemes, details of which are presented in the following subsections.

C. NICHING EVOLUTION
The niching evolution aims to dynamically evolve appropriate niching schemes according to the current search situation. Specifically, the process works as follows. First, a niching scheme population, $NPop = \{N_1, N_2, \cdots, N_m\}$, is initialized by randomly selecting, for each scheme, one component from the selection component pool and one from the replacement component pool, where $m$ is the size of NPop and $N_k$ ($k = 1, \cdots, m$) is the $k$th niching scheme. Then, all niching schemes are employed at least once to evolve the population. After that, every niching scheme undergoes a mutation operation with probability $P_k$, which is calculated from its goodness $Q_k$ and the maximum goodness $\max(Q)$, where $P_k$ and $Q_k$ denote the mutation probability and the goodness of the $k$th niching scheme, respectively, $Q$ is the set of goodness evaluations of all niching schemes and $\max(\cdot)$ denotes the maximum value. Under this rule, a niching scheme suited to the current evolution stage has a small probability of being mutated, whereas an unsuitable one is assigned a high mutation probability. The mutation operation works by randomly changing the selection and replacement components of the niching scheme.
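The initialization and mutation of the niching scheme population can be sketched as follows. This is a sketch under explicit assumptions, not the authors' code: the component names in the two pools are hypothetical placeholders, and the mutation probability is assumed to take the form $P_k = 1 - Q_k / \max(Q)$, which is one simple rule consistent with the description above (the best scheme is never mutated, poor schemes are mutated with high probability).

```python
import random

# Hypothetical placeholder names; the actual pool contents are not assumed here.
SELECTION_POOL = ["sel_a", "sel_b", "sel_c"]
REPLACEMENT_POOL = ["rep_a", "rep_b", "rep_c"]


def random_scheme(rng):
    """A niching scheme is one selection plus one replacement component."""
    return (rng.choice(SELECTION_POOL), rng.choice(REPLACEMENT_POOL))


def init_npop(m, rng):
    """Create NPop = {N_1, ..., N_m} by random component selection."""
    return [random_scheme(rng) for _ in range(m)]


def mutation_probabilities(goodness):
    """Assumed form P_k = 1 - Q_k / max(Q); requires positive goodness values."""
    q_max = max(goodness)
    if q_max <= 0.0:
        raise ValueError("goodness values are assumed to be positive")
    return [1.0 - q / q_max for q in goodness]


def mutate_npop(npop, goodness, rng):
    """Re-draw the components of each scheme with its mutation probability."""
    mutated = []
    for scheme, p_k in zip(npop, mutation_probabilities(goodness)):
        if rng.random() < p_k:
            scheme = random_scheme(rng)
        mutated.append(scheme)
    return mutated
```

For example, goodness values `[1.0, 2.0, 4.0]` give probabilities `[0.75, 0.5, 0.0]` under this assumed rule, so the third scheme always survives unchanged.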
To evaluate the goodness of a niching scheme, both the fitness improvement (FI) and the population entropy (PE) are considered. Specifically, the goodness $Q$ of a niching scheme is a weighted sum of the ranking scores of FI and PE, where $score(\cdot)$ denotes a ranking score and $\omega \in (0, 1)$ is the weight factor. PE, the entropy of the current population, is calculated as $PE = -\sum_{i=1}^{np} p_i \log(p_i)$, where $p_i$ is the probability of the $i$th non-repeating individual, $np$ is the number of different individuals and $\log(\cdot)$ is the logarithm function. To calculate FI, the best fitness values achieved so far are first stored in a queue memory (i.e., a sliding window); Fig. 3 shows its structure. FI is then defined as the sum, over the window, of the differences between the best fitness value $f_{best}$ and each stored value $SW(i)$, $i = 1, \cdots, W$, where $W$ is the length of the sliding window (denoted by SW), $SW(i)$ is the fitness of the $i$th solution stored in SW and $f_{best}$ is the best fitness value in the current generation. During evolution, if the number of stored values exceeds the predefined $W$, the earliest entries are removed.
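The goodness evaluation can be illustrated with a short Python sketch. Several details are assumptions made only for illustration: minimization is assumed, so FI is taken as $\sum_i (SW(i) - f_{best})$; the ranking score assigns 1 to the smallest value and $m$ to the largest; and $\omega$ is assumed to weight FI with $1 - \omega$ weighting PE. All function names are hypothetical.

```python
import math
from collections import Counter, deque


def population_entropy(pop):
    """Shannon entropy over the distinct individuals of the population."""
    counts = Counter(tuple(x) for x in pop)
    n = len(pop)
    return -sum((c / n) * math.log(c / n) for c in counts.values())


def fitness_improvement(window, f_best):
    """Sum of gaps between stored best values and the current best (assumed sign)."""
    return sum(v - f_best for v in window)


def rank_scores(values):
    """Score 1 for the smallest value up to len(values) for the largest."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    scores = [0] * len(values)
    for rank, k in enumerate(order, start=1):
        scores[k] = rank
    return scores


def goodness(fi_values, pe_values, omega=0.5):
    """Assumed form Q_k = omega * score(FI_k) + (1 - omega) * score(PE_k)."""
    fi_s = rank_scores(fi_values)
    pe_s = rank_scores(pe_values)
    return [omega * a + (1.0 - omega) * b for a, b in zip(fi_s, pe_s)]
```

A `deque(maxlen=W)` behaves like the sliding window described above: appending to a full window silently drops the earliest entry.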

D. PARAMETER ADAPTATION
The niching scheme generated with the six niching components involves certain parameters, which need to be properly set. For the niching selection components, we introduce a distance-based self-adaptive strategy to set their parameter values. Specifically, for each selection component, the corresponding parameter value is first randomly initialized from a normal distribution with mean $\mu_s$ and standard deviation 0.5. At the beginning of each generation, the disparity between the current population and the trial vectors is then measured by two diversity values, $div_1$ and $div_2$, which are computed from the Euclidean distance $dist(\cdot)$, where $x_i$ and $x_j$ are the $i$th and $j$th individuals of the population and $u_j$ is the $j$th trial vector. The current parameter setting is considered successful if $div_2$ is larger than $div_1$. In this case, the current parameter value is saved in a variable $S_S$, and the parameter value is updated using the arithmetic mean of the values in $S_S$. For the replacement component, successful parameter values are first recorded in a variable $S_R$. Then, the Lehmer mean of $S_R$ is used to update these parameters at the end of each generation. The crossover probability and the mutation factor in DE are updated according to the scheme proposed in JADE. Specifically, at each generation $g$, the crossover probability $CR_i$ and mutation factor $F_i$ of each individual $x_i$ are generated independently from a normal distribution with mean $\mu_{CR}$ and standard deviation 0.1 and from a Cauchy distribution with location parameter $\mu_F$ and scale parameter 0.1, respectively. At the end of each generation, $\mu_{CR}$ and $\mu_F$ are updated from the arithmetic mean of the successful CR values and the Lehmer mean of the successful F values, respectively, where $\mu_{CR_j}$ and $\mu_{F_j}$ are initialized to 0.5.
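The ingredients of this adaptation can be sketched in Python as follows. The sketch rests on stated assumptions: $div_1$ is taken as the mean pairwise Euclidean distance within the population and $div_2$ as the mean distance between population members and trial vectors; the smoothed update $\mu \leftarrow (1 - c)\mu + c \cdot mean(S)$ with $c = 0.1$ is the rule of the original JADE and is assumed here for illustration. The function names are hypothetical.

```python
import math


def mean_pairwise_distance(a, b):
    """Mean Euclidean distance over all pairs (x, y) with x in a and y in b."""
    total = sum(math.dist(x, y) for x in a for y in b)
    return total / (len(a) * len(b))


def arithmetic_mean(values):
    return sum(values) / len(values)


def lehmer_mean(values):
    """Lehmer mean sum(v^2) / sum(v), which favors larger successful values."""
    return sum(v * v for v in values) / sum(values)


def jade_update(mu, successes, mean_fn, c=0.1):
    """JADE-style update mu <- (1 - c) * mu + c * mean(S); no change if S is empty."""
    if not successes:
        return mu
    return (1.0 - c) * mu + c * mean_fn(successes)


def setting_is_successful(pop, trials):
    """A parameter setting counts as successful when div_2 exceeds div_1."""
    div_1 = mean_pairwise_distance(pop, pop)     # assumed definition
    div_2 = mean_pairwise_distance(pop, trials)  # assumed definition
    return div_2 > div_1
```

The Lehmer mean of `[0.5, 1.0]` is about 0.833 against an arithmetic mean of 0.75, which shows how it biases the update toward larger successful values.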

A. DATA SETS AND PARAMETER SETTING
Twenty-four benchmark functions taken from [47] and [48] have been used to evaluate the performance of CoENDE. The test functions include multimodal functions with shift and rotation, hybrid functions and composition functions. The functions are described in Table 1, where D denotes the dimensionality of the function, $F_{min}$ is the optimal value of the function and NGs denotes the maximum number of generations. The parameter settings of the proposed method are listed in Table 2.
In the proposed method, the purpose of introducing $L_P$ is to ensure that the algorithm runs for a period of time in the current state, so that it can acquire more information from the optimization process. The SW is a queue memory that dynamically stores the best fitness values achieved so far by each niching method. The value of $L_P$ and the size of SW are set experimentally. Specifically, ten trials of the algorithm with various values of $L_P$ and SW sizes have been tested, and the combination that delivers the best average fitness has been chosen.

B. COMPARED ALGORITHMS
The proposed CoENDE is compared with fourteen algorithms, denoted A1-A14: JADE, SHADE, LSHADE, LSHADE-EpSin, PaDE, GSK, ENA [23], EPSO [50], NShDE [51], and five variants of the proposed method (CrowdingDE, DCDE, RTSDE, DCGRSDE and RTSGRSDE). Among these, A1-A9 are classical or recently proposed algorithms, while A10-A14 are variants of the proposed CoENDE, in which one or two niching schemes are employed. All experiments are implemented using MATLAB on a machine running the Ubuntu operating system (version 16.04) with an Intel Core i7-8700 and 16 GB of DDR4 RAM. To make the comparison fair, all algorithms are implemented using the same population sizes and maximum number of generations. Unless otherwise stated, we report the mean and standard deviation of the difference between the obtained best fitness and the known optimal value, as well as the Wilcoxon rank-sum test at a significance threshold of p < 0.1. It should also be noted that the difference is considered to be 0 when it is equal to or smaller than 1E−8, which is a guideline of the CEC2014 test suite.
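The reported error measure can be sketched as follows. This is a minimal illustration, assuming minimization and the sample standard deviation; the function names are hypothetical, and the significance test is omitted because it would require a statistics library beyond the Python standard library.

```python
import statistics


def error_value(best_fitness, optimum, tol=1e-8):
    """Difference to the known optimum, treated as 0 when it is at most tol."""
    err = best_fitness - optimum
    return 0.0 if err <= tol else err


def summarize(best_per_run, optimum):
    """Mean and sample standard deviation of the error over independent runs."""
    errors = [error_value(b, optimum) for b in best_per_run]
    return statistics.mean(errors), statistics.stdev(errors)
```

For instance, two runs ending at 101.0 and 103.0 on a function with optimum 100.0 give a mean error of 2.0.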

C. COMPARING CoENDE WITH NON-ENSEMBLE BASED RELATED ALGORITHMS
In this section, we compare our proposed method with JADE, JADE-based algorithms (SHADE, LSHADE and LSHADE-EpSin) and two recently proposed algorithms (PaDE and GSK). The results are shown in the top rows of Tables 3-6. A summary of the Wilcoxon rank-sum tests between our method and the related methods is given in Table 7. The results show that CoENDE generally delivers performance better than or comparable to that of the six related algorithms. Specifically, on functions $f_1$-$f_{10}$, our method performs better than JADE, SHADE, LSHADE-EpSin, GSK and PaDE, and comparably to LSHADE. On the hybrid and composition functions $f_{11}$-$f_{24}$, our method achieves better results than JADE, SHADE, LSHADE and LSHADE-EpSin and performance comparable to GSK, while being slightly outperformed by PaDE. These results indicate that the proposed method searches the space appropriately and thus delivers competitive performance.

D. COMPARING CoENDE WITH ENSEMBLE AND HYBRID EA ALGORITHMS
This section reports the results of comparing CoENDE with two ensemble niching based EAs (i.e., ENA [23] and EPSO [50]) and a hybrid niching based DE (NShDE [51]). In ENA, several existing niching methods are combined into a unified GA framework for optimization. EPSO, on the other hand, focuses on the ensemble of different PSO variants for optimization. NShDE is a sharing-based hybrid niching DE algorithm, in which a neighborhood based mutation is also proposed to generate offspring. The results are reported in the middle rows of Tables 3-6. Table 7 shows the results of the significance tests.
From the results in Tables 3-6, we can see that CoENDE generally performs better than ENA, EPSO and NShDE on most functions in terms of the mean value. Among the three compared algorithms, NShDE performs best, followed by EPSO and ENA. The significance tests show that the better performance of CoENDE compared to ENA, EPSO and NShDE is generally statistically significant. The worse performance of ENA and EPSO is mainly because they employ a fixed niching pool, which limits their performance. It should be noted that our method is outperformed by NShDE on a few of the test functions. However, in most of these cases, the better performance of NShDE is not statistically significant.

E. COMPARING CoENDE WITH ITS VARIANTS
To examine the effectiveness of the proposed CoEN strategy, we compare CoENDE with its variants, in which only one or two fixed niching schemes, rather than the CoEN strategy, are incorporated into the proposed framework for optimization. Five variants (i.e., CrowdingDE, DCDE, RTSDE, DCGRSDE, and RTSGRSDE), in which standard crowding, DC, RTS, GRS and their combinations are employed as the niching scheme, have been compared. The results are reported in the last rows of Tables 3-6. The results show that the two RTS-based variants perform relatively better than the other three variants. However, all variants are outperformed by CoENDE. Table 7 shows that the better performance of our method over its variants is generally statistically significant. This suggests that CoEN can overcome the drawback of employing fixed niching schemes.

V. CONCLUSION
Niching techniques have been widely investigated in the EA community, and existing studies generally focus on employing a single fixed niching scheme or several niching schemes in an ensemble manner. To effectively solve various optimization problems, different niching schemes may be required for an EA to work properly. Even for a specific optimization problem, the most suitable niching scheme may differ at different stages of the evolutionary process. In this paper, we propose to co-evolve the niching schemes, thus allowing appropriate niching schemes to be dynamically generated and employed during DE evolution. The performance of the proposed method has been evaluated on benchmark functions from CEC2019 and CEC2014. The results show the significance of the proposed evolutionary niching scheme, and the resulting method achieves performance better than or comparable to that of related methods.
The proposed method can be extended in a few aspects. Firstly, it is desirable to employ more niching components to evaluate the effectiveness of the proposed niching evolution scheme. Secondly, as the evolved niching schemes are able to preserve multiple niches in the population, the proposed method can be suitably modified to address multimodal optimization problems. Finally, it is interesting to incorporate the devised niching evolution strategy into other meta-heuristics, such as PSO, for optimization.