An Adaptive Mechanism With Cooperative Coevolution and Covariance for Differential Evolution

Differential evolution (DE) is an evolutionary algorithm widely used to solve optimization problems with different characteristics in fields where actions and decisions depend on numerical data, such as engineering, economics, and logistics. In this paper, an adaptive differential evolution mechanism with cooperative co-evolution and covariance (A-CC/COV-DE) is proposed to overcome the low efficiency of differential evolution when solving large-scale numerical optimization problems, especially when the correlation between the variables of the problem is unknown. An unknown correlation of variables hinders DE from achieving an optimal search process, since different types of correlations ideally require distinct optimization strategies. According to the separability of variables, the appropriate evolutionary strategy is selected adaptively. For separable functions, cooperative coevolution is adopted. After using extended differential grouping to split the problem, the sub-components are optimized by differential evolution. This reduces the dimensionality and complexity of the problem, improving its convergence speed and global search ability. For non-separable functions, a covariance matrix is calculated, and its eigenvectors are used to rotate the coordinate system. This eliminates the correlation between variables and improves the search efficiency of differential evolution. We evaluated the performance of A-CC/COV-DE on the CEC 2014 test suite and compared it with state-of-the-art differential evolution algorithms. The experimental results show that our proposal is quite competitive with recent algorithms.


I. INTRODUCTION
Differential evolution (DE) is a random search algorithm for numerical optimization problems inspired by natural species evolution [1]. In DE, a mutant vector is generated according to a mutation strategy, and then a trial vector is generated according to the crossover operator [2]. Finally, the target vector is compared with the trial vector, and the individual with the better fitness value is kept for the next generation. Since DE has an important influence in the field of evolutionary computing, researchers have proposed many improvement strategies for DE. For example, Sun et al. [3] adopted a novel Gaussian mutation operator and a modified common mutation operator to collaboratively produce new mutant vectors. In [4], they also proposed an elite representative based individual adaptive regeneration framework (EIR) that can be incorporated into any DE variant easily. In [5], Sun et al. proposed a time-varying strategy-based DE algorithm (TVDE), a novel simple variant of DE. In [6], Deng et al. designed an adaptive dimension level adjustment framework (ADLA) to relieve the premature convergence or stagnation problem faced by the DE algorithm. In [7], an elite regeneration framework for differential evolution (ERG-DE) was proposed, where a new individual is produced from the search space around each elite individual by sampling Gaussian or Cauchy probability models. In [8], Zhang and Sanderson proposed an adaptive differential evolution with an optional external archive (JADE).
However, variable interactions in objective functions (i.e., whether the variables are separable or not) have an important influence on evolutionary operators [9]. The operators of the original differential evolution algorithm are not rotation-invariant, so evolution efficiency suffers when optimizing functions with correlated variables. In [10], the CoBiDE (differential evolution based on covariance matrix learning and bimodal distribution parameter setting) algorithm proposed by Wang et al. combines DE with a crossover operator based on eigenvectors to form a new DE algorithm with rotation invariance. When correlations between variables exist, the covariance matrix of the sample points can be used to describe them. The eigenvectors and eigenvalues are used to map the sample points into a new space and eliminate the correlation between variables.
In [11], Potter et al. proposed a framework named cooperative coevolution (CC), combined with the classic genetic algorithm, as an evolutionary approach to function optimization. In [12], cooperative coevolutionary differential evolution (CCDE) was proposed to combine CC with differential evolution for solving large-scale optimization problems: the optimization problem is decomposed into several independent subcomponents, which are then optimized simultaneously. The experiments show that it works well when solving problems with uncorrelated variables.
According to the CC framework, if variables are grouped randomly, the optimization process for some variables may interfere with the correct evolution of other variables. Ideally, only interacting decision variables should be assigned to the same subcomponent.
In [13], Yao et al. proposed the differential grouping method (DG), which was added to the CC framework. This strategy detects and assigns the interacting decision variables to the same subcomponent. Since DG can only detect direct correlations between variables but cannot detect indirect correlations, its decomposition accuracy is relatively low on some test functions. On this basis, an extended differential grouping (XDG) method was proposed in [14], which can decompose the variables of optimization problems appropriately. By embedding XDG into the CC framework, the optimization problem can be decomposed into several subcomponents, where variables are related if they belong to the same subcomponent and unrelated otherwise. Therefore, this paper proposes an adaptive mechanism with cooperative coevolution and covariance for differential evolution. The main contributions are as follows: (1) After analysis, we found that cooperative coevolution is effective when solving problems with separable variables, but its performance on problems with only non-separable variables is relatively poor. Differential evolution with covariance can analyze the characteristics of the samples and rotate the original coordinates according to the rotation-invariant eigenvectors to eliminate the correlation between variables and improve the algorithm's performance. (2) For different degrees of variable correlation, the CC framework and the rotation method based on covariance and eigenvectors are analyzed, respectively. (3) A new learning mechanism based on coevolution and covariance is proposed, which automatically chooses between the differential evolution algorithm based on the CC framework (CCDE) and the differential evolution algorithm based on covariance (COVDE). Under the premise of unknown correlation of objective function variables, competitive experimental results are obtained. The structure of the paper is as follows.
Section II presents the background for this work. Section III presents an adaptive mechanism with cooperative coevolution and covariance for differential evolution, called A-CC/COV-DE, which is used in these experiments. In Section IV experimental results of the A-CC/COV-DE algorithm on benchmark functions are presented. Section V concludes this paper with some final remarks.

II. RELATED WORK
A. DIFFERENTIAL EVOLUTION
Differential evolution is a population-based metaheuristic algorithm [15]. Similar to other numerical optimization algorithms, DE uses a set of real parameter vectors x_i = (x_1, . . . , x_D), i = 1, 2, . . . , N, where D is the dimension of the optimization problem and N is the population size. At the beginning of the search, the individuals in the population are randomly initialized. During evolution, a mutant vector v_i,G is generated by the mutation operation, a trial vector u_i,G is generated by the crossover operation, and finally individuals with better fitness values are retained in generation G according to selection. The three key steps in differential evolution are mutation, crossover and selection:

(1) Mutation: this operation uses randomly chosen target vectors to produce the mutant vector:

v_i,G = x_r1,G + F · (x_r2,G − x_r3,G)

where the subscripts r1, r2 and r3 are randomly selected from [1, N] but should be different from i, and the parameter F is a positive scaling factor.

(2) Crossover: this operation uses the mutant vector v_i,G and the target vector x_i,G to generate a trial vector u_i,G. The common crossover operation of DE is binomial crossover:

u_j,i,G = v_j,i,G if rand(0, 1) ≤ CR or j = j_rand; otherwise u_j,i,G = x_j,i,G

where rand(0, 1) represents a uniform random number in (0, 1), j_rand is the index of a decision variable selected uniformly at random from [1, D], and CR is the crossover rate.

(3) Selection: DE uses greedy selection, which keeps the individual with the better fitness between the target vector x_i,G and the trial vector u_i,G:

x_i,G+1 = u_i,G if f(u_i,G) ≤ f(x_i,G); otherwise x_i,G+1 = x_i,G

The operation of DE is relatively simple, and it can quickly find optimal solutions in low-dimensional problems. However, its performance degrades in high-dimensional problems. To address this, the framework of cooperative coevolution was proposed.
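The three operators above can be sketched in a short, self-contained Python implementation of DE/rand/1/bin. The function name, parameter defaults, and the bound-clipping step are illustrative assumptions, not the exact setup used in this paper:

```python
import numpy as np

def de_rand_1_bin(f, bounds, N=50, F=0.5, CR=0.9, generations=200, seed=0):
    """Minimal DE/rand/1/bin sketch; names and defaults are illustrative."""
    rng = np.random.default_rng(seed)
    low, high = bounds
    D = len(low)
    pop = rng.uniform(low, high, size=(N, D))   # random initialization
    fit = np.array([f(x) for x in pop])
    for _ in range(generations):
        for i in range(N):
            # mutation: r1, r2, r3 drawn from [1, N], all different from i
            r1, r2, r3 = rng.choice([j for j in range(N) if j != i], 3, replace=False)
            v = pop[r1] + F * (pop[r2] - pop[r3])
            # binomial crossover with a guaranteed j_rand component
            j_rand = rng.integers(D)
            mask = rng.random(D) < CR
            mask[j_rand] = True
            u = np.clip(np.where(mask, v, pop[i]), low, high)
            # greedy selection
            fu = f(u)
            if fu <= fit[i]:
                pop[i], fit[i] = u, fu
    best = int(np.argmin(fit))
    return pop[best], fit[best]
```

For example, on a low-dimensional sphere function this sketch typically drives the best fitness very close to zero within a few hundred generations.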

B. COOPERATIVE COEVOLUTION
Cooperative coevolution (CC) is often used to solve large-scale optimization problems. Many researchers have worked on CC [16]-[22], and it has been widely used in various fields [23]-[25]. CC decomposes a high-dimensional optimization problem into multiple low-dimensional subproblems and then cyclically optimizes those subproblems [26]-[28]. As the name implies, the difficulty of cooperative coevolution lies in the method for decomposing the optimization problem. There are several decomposition techniques, such as static grouping [29], random grouping [12], [30] and grouping strategies based on variable interactions [31].
The most common decomposition method is random grouping, in which the decision variables are randomly assigned to subcomponents in each evolution cycle. From a mathematical point of view, the probability of placing two interacting decision variables into the same group over several evolution cycles is quite large, but when the number of interacting decision variables is greater than two, random grouping becomes unreliable and evolution performance deteriorates. Delta grouping [32] is similar to random grouping; this method decomposes a high-dimensional problem into several low-dimensional subproblems. It requires a pre-defined decomposition of the decision variables into k groups. If there are many interacting variables in the optimization problem, a large k value will affect the performance of the algorithm negatively; conversely, for a smaller number of interacting variables, a small k value will reduce the performance of the algorithm. In short, once the value of k is determined, the decision variables are decomposed into k groups of a fixed size, which may be disadvantageous for practical optimization problems.
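The random grouping step described above can be sketched in a few lines (assuming, for simplicity, that k divides D exactly; the function name and this restriction are illustrative):

```python
import random

def random_grouping(D, k, seed=None):
    """Randomly partition D decision variables into k equal-size groups.
    Sketch only: assumes k divides D exactly."""
    rng = random.Random(seed)
    idx = list(range(D))
    rng.shuffle(idx)          # a fresh random permutation each evolution cycle
    size = D // k
    return [idx[g * size:(g + 1) * size] for g in range(k)]
```

Calling this once per evolution cycle yields a new random decomposition, which is exactly why interacting variables only sometimes land in the same group.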
Differential grouping (DG) is an automatic decomposition strategy [13] that allocates decision variables to subcomponents and is derived from the definition of partially additively separable functions. The DG algorithm checks the interaction between each pair of variables. If there is an interaction between two variables, both are placed into the same subcomponent; if a variable does not interact with any other variable, it is considered a separable variable. The DG method can detect simple direct variable interactions but performs poorly when identifying complex interactions. Therefore, an extended differential grouping (XDG) method was proposed to address the shortcomings found in DG.
Extended differential grouping can capture two types of interaction between variables. As shown in Figure 1, variables of Type I interact directly, e.g., x1 and x2 (or x2 and x3), while variables of Type II interact indirectly, e.g., x1 and x3 are linked through x2. Informally, variables x_i and x_j interact directly if there exist candidate solutions for which the change in fitness caused by perturbing x_i depends on the value of x_j; they interact indirectly if they do not interact directly but are connected through a chain of variables {x_k1, . . . , x_km} in which consecutive variables interact directly; and they are independent of each other if, for all candidate solutions, neither condition holds. XDG decomposes the optimization problem into several subproblems according to the correlation between the decision variables of the objective function [14]. Algorithm 1 shows the pseudo-code of the XDG method, which is divided into three stages. The first stage determines direct interactions between variables (lines 3-25). It executes a pairwise comparison between each pair of decision variables to detect interactions. The detection works by evaluating a series of vectors with small perturbations of each pair of variables; the difference in fitness values is checked to decide whether the variables are dependent on or independent of each other. The second stage identifies indirect interactions between variables (lines 26-33). It searches for overlaps between the groups resulting from the first stage and merges groups with common variables: if two subcomponents share a decision variable, they are merged, until all subcomponents are disjoint. The third stage groups all separable variables into the same subcomponent (lines 34-40), merging every group that contains only one decision variable into a single group. The decision variables contained in each subproblem are interacting and non-separable, but the subproblems themselves are separable from one another.
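The first two XDG stages can be illustrated with two small helper sketches: a DG-style pairwise check for direct interaction based on fitness differences under perturbation, and a merge of overlapping groups for indirect interaction. The function names, the perturbation size delta, and the threshold eps are illustrative assumptions:

```python
import numpy as np

def interact_directly(f, base, i, j, delta=1.0, eps=1e-9):
    """DG-style check (sketch): x_i and x_j interact directly if perturbing
    x_i changes f by a different amount depending on the value of x_j."""
    a = base.copy()
    b = a.copy(); b[i] += delta
    d1 = f(b) - f(a)                 # effect of moving x_i at the base value of x_j
    a2 = a.copy(); a2[j] += delta
    b2 = b.copy(); b2[j] += delta
    d2 = f(b2) - f(a2)               # same move of x_i after shifting x_j
    return abs(d1 - d2) > eps        # differing effects signal an interaction

def merge_overlapping(groups):
    """Stage two of XDG (sketch): merge groups sharing a variable until disjoint."""
    groups = [set(g) for g in groups]
    merged = True
    while merged:
        merged = False
        for a in range(len(groups)):
            for b in range(a + 1, len(groups)):
                if groups[a] & groups[b]:
                    groups[a] |= groups.pop(b)
                    merged = True
                    break
            if merged:
                break
    return [sorted(g) for g in groups]
```

For a separable function such as f(x) = x1² + x2², the pairwise check reports no interaction, while for f(x) = (x1 + x2)² it does; merging [{x1, x2}, {x2, x3}] then links x1 and x3 indirectly.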
Algorithm 2 shows the pseudo-code of the CC framework. The algorithm is divided into two stages: the grouping stage and the optimization stage. The grouping stage is performed by XDG (Algorithm 1). The optimization stage optimizes the subcomponents formed in the grouping phase; for simplicity, the optimization process uses differential evolution. However, even with XDG added to the CC framework, optimizing non-separable intragroup variables remains a problem. Therefore, an adaptive selection mechanism with two different methods is proposed. The first method adds XDG to the CC framework and then optimizes with the DE mutation operation; the other adds covariance to the DE process, which compensates for the limited performance of CC on non-separable problems.
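The CC optimization stage can be sketched as follows, assuming DE as the subcomponent optimizer and a shared context vector holding the best-so-far values of the dimensions outside the group currently being optimized (the context-vector mechanism and all parameter defaults here are illustrative, not the paper's exact Algorithm 2):

```python
import numpy as np

def cc_optimize(f, low, high, groups, cycles=30, N=20, F=0.5, CR=0.9, seed=0):
    """Cooperative-coevolution loop (sketch): each variable group is evolved
    by DE in turn while the other dimensions stay fixed in a context vector."""
    rng = np.random.default_rng(seed)
    context = rng.uniform(low, high)         # best-so-far full solution
    best_f = f(context)
    groups = [np.asarray(g) for g in groups]
    subs = [rng.uniform(low[g], high[g], size=(N, len(g))) for g in groups]
    for _ in range(cycles):
        for g, sub in zip(groups, subs):
            def sub_fit(s):                  # evaluate a group against the context
                x = context.copy()
                x[g] = s
                return f(x)
            fit = np.array([sub_fit(s) for s in sub])
            for i in range(N):               # one DE generation per subcomponent
                r1, r2, r3 = rng.choice([j for j in range(N) if j != i], 3, replace=False)
                v = sub[r1] + F * (sub[r2] - sub[r3])
                jr = rng.integers(len(g))
                mask = rng.random(len(g)) < CR
                mask[jr] = True
                u = np.clip(np.where(mask, v, sub[i]), low[g], high[g])
                fu = sub_fit(u)
                if fu <= fit[i]:
                    sub[i], fit[i] = u, fu
            b = int(np.argmin(fit))
            if fit[b] <= best_f:             # write the improved group back
                context[g] = sub[b]
                best_f = fit[b]
    return context, best_f
```

On a fully separable problem each subcomponent converges independently, which is exactly the regime where CC shines.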

C. COVARIANCE
It is usually difficult to determine the optimal search direction in the evolution process using differential evolution alone, especially when the variables are correlated. In this section, we use the covariance matrix to analyze the features of the sample points. Additionally, we use the eigenvectors to transform the points in the original Cartesian coordinate system to a new coordinate system and eliminate the correlation between variables according to rotation invariance [33].
According to the best N/2 individuals in the population after differential evolution, the covariance matrix is calculated:

cov(P_1:N/2) = [cov(i, j)], i, j = 1, . . . , D

where cov(i, j) is the covariance of the ith and the jth dimensions of the best N/2 individuals in the current population, calculated as:

cov(i, j) = (1/(N/2)) · Σ_m=1..N/2 (x_m,i − x̄_i)(x_m,j − x̄_j)

where x̄_i is the mean of the ith dimension over those individuals. cov(P_1:N/2) is then eigendecomposed as:

cov(P_1:N/2) = R Λ R^T

where R is the D × D orthogonal matrix whose columns are eigenvectors of the covariance matrix cov(P_1:N/2), representing the transformation from the characteristic coordinate system to the original coordinate system, and Λ is a diagonal matrix composed of the corresponding eigenvalues.
An individual x_i in the original Cartesian coordinate system is expressed in the characteristic coordinate system as:

x'_i = R^T x_i

Using the search equation of differential evolution, the candidate solution v'_i is generated in the characteristic coordinate system:

v'_i = x'_r1 + F · (x'_r2 − x'_r3)

where r1, r2 and r3 are randomly selected from [1, N] and differ from i. The candidate solution is then mapped back to the original coordinate system to be evaluated against the current population:

v_i = R v'_i
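These steps can be sketched as a single mutation routine: estimate the covariance matrix from the best half of the population, rotate into the eigen coordinate system, apply rand/1 mutation there, and rotate the mutants back (a NumPy sketch; the function name and defaults are illustrative assumptions):

```python
import numpy as np

def eigen_rotated_mutation(pop, fit, F=0.5, rng=None):
    """Sketch of the covariance step: mutate in the eigen coordinate system of
    the covariance matrix estimated from the best N/2 individuals."""
    if rng is None:
        rng = np.random.default_rng()
    N, D = pop.shape
    top = pop[np.argsort(fit)[: N // 2]]             # best half of the population
    C = np.cov(top, rowvar=False)                    # D x D covariance matrix
    _, R = np.linalg.eigh(C)                         # columns of R are eigenvectors
    prime = pop @ R                                  # x' = R^T x for each row vector
    mutants = np.empty_like(pop)
    for i in range(N):
        r1, r2, r3 = rng.choice([j for j in range(N) if j != i], 3, replace=False)
        v = prime[r1] + F * (prime[r2] - prime[r3])  # rand/1 in the rotated space
        mutants[i] = v @ R.T                         # map back: v = R v'
    return mutants
```

Crossover and selection then proceed as in standard DE on the rotated-back mutants; with F = 0 the rotation round-trips exactly, since R is orthogonal.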

III. A-CC/COV-DE A. MOTIVATION
Differential evolution has clear advantages in some optimization problems, but its performance degrades in large-scale optimization problems. The most popular methods to solve large-scale optimization problems are based on grouping and dimensionality reduction, the CC framework being one of these approaches. We observe that the performance of the cooperative coevolution algorithm is better for optimization problems with separable variables, but relatively poor for optimization problems with non-separable variables. For example, if a 10-dimensional problem is fully separable, CC will decompose the individuals into 10 one-dimensional ones, and each portion of the original individuals is then optimized separately. When the dimensionality increases, CC can accelerate the convergence speed of the algorithm when solving this kind of problem. For non-separable problems, CC will split the individuals, allocating related variables to the same subcomponent. Therefore, during the evolution, the convergence speed and global search ability of the algorithm will be affected, especially as the dimensionality increases. Figure 2 shows the process of generating a mutant vector v_i,G+1 in a two-dimensional cost function. As shown in Figure 3, the construction of the characteristic coordinate system is used to remove the correlation between variables. In the characteristic coordinate system, the mutant vector is closer to the global optimal solution, which helps to improve the convergence of the algorithm. In differential evolution, the covariance matrix is added to analyze the characteristics of the sample points. According to the eigenvectors, the points in the original Cartesian coordinate system are transformed into the feature coordinate system to eliminate the correlation between variables.
Our experiments show that, for optimization problems with non-separable variables, using covariance to analyze variables in differential evolution can improve the performance of the algorithm; for problems with separable variables, however, its performance is poor. In practical optimization problems, it is difficult to know beforehand whether the variables are fully separable. Therefore, to improve the optimization performance of the algorithm, we propose an adaptive evolutionary mechanism based on cooperative coevolution and covariance.

B. ALGORITHM FRAMEWORK
The flow chart of the proposed algorithm is shown in Figure 4 and consists of two parts. The left part is cooperative coevolution with differential evolution as the optimizer (CCDE). First, within the CC framework, extended differential grouping (XDG) is used to group variables according to the correlation between them, and then differential evolution is used to evolve the resulting subcomponents in a cyclic manner. The part on the right is COVDE. First, differential evolution is used to evolve the population, and then covariance is used to analyze the characteristics of the variables (to describe the correlation between them). The DE operator with covariance is rotation-invariant. According to the eigenvectors of the samples, the points in the original coordinate system are transformed into a characteristic coordinate system, which eliminates the correlation between variables and improves performance when evolving non-separable problems. Finally, the two algorithms are adaptively selected according to the success rate of each evolutionary algorithm (the ratio between the offspring successfully selected for the next generation over all generations and the total number of function evaluations used by each algorithm), so that the proposed method performs well on optimization problems with either dependent or unrelated variables.

C. AN ADAPTIVE EVOLUTIONARY SELECTION MECHANISM
According to the experimental observations, using XDG along with CC and differential evolution works well on problems with separable variables, but the results on functions with partially separable or non-separable variables are far from ideal, especially when compared with COVDE. To improve the performance of the algorithm, adaptive selection is conducted between CCDE and COVDE. After each generation, a probability p is calculated as:

p = p1 / (p1 + p2)

where p represents the probability of CCDE being selected for the next generation, calculated from its relative success rate. The overall success rates of CCDE and COVDE, denoted p1 and p2, are calculated as:

p1 = ω1 / (k · n1), p2 = ω2 / n2

where ω1 and ω2 represent the number of times CCDE and COVDE were successful in the evolutionary process, respectively, n1 and n2 represent the evolution times of CCDE and COVDE, respectively, and k is the number of XDG groups. The selection probability calculation does not depend on the setting of any parameters; it depends only on the fitness values produced by the two evolution methods during the evolution process, which makes it simple and straightforward. If p > 0.5 at the start of a generation, the success rate of CCDE is greater than COVDE's, so CCDE is the optimizer for that generation; otherwise, the evolution continues with COVDE. The advantage of this method is that it can obtain good optimization results whether or not the optimization problem is separable. The pseudocode of the adaptive evolutionary selection mechanism is given in Algorithm 3.
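A sketch of the selection rule follows, using success rates measured as successes per unit of evolution effort, with k discounting CCDE for the extra evaluations its k subcomponents consume; the exact formulas, argument names, and the handling of zero counts here are illustrative assumptions based on the text's description, not the paper's Algorithm 3:

```python
def choose_optimizer(wins_cc, runs_cc, wins_cov, runs_cov, k):
    """Adaptive switch (sketch): pick CCDE when its relative success rate
    exceeds COVDE's. Zero-count handling is an illustrative assumption."""
    p1 = wins_cc / (k * runs_cc) if runs_cc else 1.0   # CCDE success rate
    p2 = wins_cov / runs_cov if runs_cov else 1.0      # COVDE success rate
    total = p1 + p2
    p = p1 / total if total > 0 else 0.5
    return ("CCDE" if p > 0.5 else "COVDE"), p
```

For instance, if CCDE has been succeeding three times as often per run as COVDE, p evaluates to 0.75 and CCDE is kept for the next generation.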

D. ALGORITHM COMPLEXITY ANALYSIS
To objectively compare the performance of the algorithms, here we analyze the complexity of the proposed algorithm and of the differential evolution algorithm. The computational complexity can be divided into two parts: time complexity and space complexity. The space complexity of the algorithm is the same as DE, with the addition of O(D²) to store the covariance matrix in COVDE and O(D) to store the group assigned to each dimension in CCDE.
For the DE algorithm, the time complexity is mainly related to the number of individuals in the population and the problem dimensionality. With population size N and problem dimensionality D, the algorithm has a time complexity of O(N · D) for a single iteration, and O(N · D · Gen) for the whole run, where Gen is the number of generations, given by FES/N.
For the proposed algorithm A-CC/COV-DE, CCDE has a complexity of O(N · D · S), where S represents the number of groups calculated by XDG. Since the maximum number of possible subpopulations in a problem is D, this complexity can also be represented as O(N · D²). For a whole run of the algorithm, the complexity is given by O(N · D · S · Gen/S), so it has the same time complexity as DE. The second part of A-CC/COV-DE, COVDE, also has a complexity of O(N · D²), whose upper bound is given by the calculation of the covariance matrix.
Since both parts of the algorithm have the same complexity, we can conclude that the overall time complexity of A-CC/COV-DE is O(N · D²).

IV. EXPERIMENTS AND ANALYSIS
In this section, experimental tests are conducted and their results are presented and discussed.

A. EXPERIMENTAL SETTINGS
To evaluate the performance of the proposed algorithm, A-CC/COV-DE is compared with CCDE, COVDE, DE, ADLADE, CSDE and TVDE. The subcomponent optimizer used in this paper is DE, the crossover rate CR is 0.9, the scale factor F is 0.5, the population size N is 2D, the maximum number of function evaluations FESmax is set to D × 10,000, and D = 30, 50 is the function dimension. In this paper, the experiments are performed independently 50 times.

B. COMPARISON OF A-CC/COV-DE RESULTS WITH OTHER ALGORITHMS
To evaluate the performance of the proposed algorithm, this paper uses CCDE, COVDE, DE, ADLADE, CSDE and TVDE on the CEC2014 benchmark suite as a comparison, with Kruskal-Wallis tests conducted between the proposed algorithm and the other compared algorithms. The test results of the compared algorithms are marked as ''+/−/∼'', meaning that the marked value is worse than, better than, or similar to those obtained by A-CC/COV-DE, respectively. The test results for D = 30 and D = 50 are shown in Table 1 and Table 3, and the average convergence plots for some representative benchmark functions are presented in Figure 5 and Figure 6, respectively. The values highlighted in bold represent the best average results for each benchmark function.
It can be seen from Tables 1 and 3 that the results confirm our initial hypothesis: cooperative coevolution is effective when solving problems that contain separable variables, but its performance on problems with only non-separable variables is relatively poor. For example, on the separable problems F8 and F10, CCDE performs better than COVDE, while on non-separable problems, COVDE performs better than CCDE. The cause is the ability of differential evolution with covariance to analyze the characteristics of the samples and rotate the original coordinates according to the rotation-invariant eigenvectors, eliminating the correlation between variables and improving the algorithm's performance. The results of the A-CC/COV-DE algorithm are better than those of CCDE, COVDE and DE on most functions, which means that adaptively selecting the appropriate evolutionary method based on the correlation of variables can improve the efficiency of the algorithm. For D = 30, the experimental results of our algorithm are similar to ADLADE, CSDE and TVDE. For D = 50, A-CC/COV-DE performs better than ADLADE, CSDE and TVDE. This means that as dimensionality increases, A-CC/COV-DE shows better robustness and scalability when solving separable as well as non-separable problems, especially on functions F17, F18, F19 and F20. Hence, these results show some competitive advantages compared with other algorithms. Tables 2 and 4 show the p values for the Wilcoxon rank-sum tests of A-CC/COV-DE vs. CCDE, COVDE, DE, ADLADE, CSDE and TVDE over 50 independent runs on the CEC2014 benchmarks with 30D and 50D.
All the algorithms in the experiments were also compared by the Friedman test, carried out on the medians of the results; the resulting mean ranks are shown in Table 5. Considering the average mean ranks, we can see that CSDE and TVDE show a better overall performance, indicating a possibly good balance for solving a wide range of problems. However, as can be seen from Table 1 and Table 3, A-CC/COV-DE has the largest number of wins among the 30 benchmark problems, depicted in bold typeface in the tables. This indicates the strong ability of our algorithm to converge towards the global optimum. The performance is especially remarkable on hybrid functions with complex landscapes and a set of different mixed properties.

V. CONCLUSION
Over the years, researchers have explored optimization problems extensively. As a numerical optimization method, differential evolution is popular due to its simplicity, but it struggles to find the optimal solution for high-dimensional and complex optimization problems. Cooperative coevolution resolves the shortcomings of differential evolution to some extent. In this paper, the grouping method used in the CC framework is the extended differential grouping (XDG) algorithm, which excels at detecting interacting variables and grouping them. Through experimental analysis, it is found that the performance of CCDE is better when the variables are separable (variables are not related), and the performance of COVDE on most problems is better when the variables are not separable (variables are related).
Since the correlation of variables is not clear in reality, we propose an adaptive mechanism based on cooperative coevolution and covariance. The mechanism comprises two evolutionary methods. One adds XDG to the CC framework and then uses differential evolution to optimize each subcomponent; the other optimizes the population with differential evolution and then uses covariance to analyze the characteristics of the resulting population. The better evolutionary method is chosen at the beginning of each generation by adaptive selection based on the success rates achieved by both methods during the evolution.
Most existing adaptive DE variants are intended to update operators and control parameter settings. Instead, this paper proposed an adaptive mechanism with cooperative coevolution and covariance for differential evolution, which chooses between two evolutionary methods according to the fitness during the evolution, so that CCDE can solve problems with separable variables and COVDE can solve problems with non-separable variables.
The adaptive mechanism with cooperative coevolution and covariance for differential evolution proposed in this paper can effectively solve optimization problems with unknown separability. Compared with COVDE, CCDE, DE, ADLADE, CSDE and TVDE, A-CC/COV-DE outperforms its counterparts, showing a competitive advantage when solving the optimization problems in the CEC2014 benchmark suite. However, the proposed algorithm still has some deficiencies, which will be addressed in subsequent research.
JUAN DIEGO PRADO received the B.S. degree in computer engineering from Universidad del Valle, Cali, Colombia, in 2017. He is currently pursuing the M.S. degree in computer application technology with the Faculty of Computer Science and Engineering, Xi'an University of Technology, Xi'an, China. His current research interests include evolutionary algorithms, machine learning, artificial intelligence, and software development.
WENJUAN HE received the B.S. degree in computer management and application from the Xi'an University of Finance and Economics, Xi'an, China, in 1997, and the M.S. degree in computer science and technology from the Xi'an University of Technology, Xi'an, in 2004. She is currently working with the Shaanxi Key Laboratory for Network Computing and Security Technology. Her current research interests include evolutionary algorithms and artificial intelligence.
HAIYAN JIN received the B.S. and Ph.D. degrees from Xidian University, Xi'an, China, in 2005 and 2007, respectively. She is currently a Professor with the Faculty of Computer Science and Engineering, Xi'an University of Technology, Xi'an. Her current research interests include computer vision, image processing, intelligent information processing, and intelligent optimization algorithms.
QIAOYONG JIANG received the B.S. degree in applied mathematics from Wenzhou University, Wenzhou, China, in 2008, the M.S. degree in computational mathematics from the Beifang University of Nationalities, Yinchuan, China, in 2011, and the Ph.D. degree in pattern recognition and intelligent systems from the Xi'an University of Technology, Xi'an, China. He is currently a Lecturer with the Faculty of Computer Science and Engineering, Xi'an University of Technology. His current research interests include meta-heuristics, evolutionary computation, and multi-objective optimization.
XIAOFAN WANG received the B.S. and M.S. degrees from the Xi'an University of Technology, Xi'an, China, in 1999 and 2003, and the Ph.D. degree in computer science and engineering from Xidian University, Xi'an, in 2012. He is currently an Associate Professor with the Faculty of Computer Science and Engineering, Xi'an University of Technology. His current research interests include data mining, machine learning, and intelligent information processing.