Two-Archive Evolutionary Algorithm for Constrained Multi-Objective Optimization

When solving constrained multi-objective optimization problems, an important issue is how to balance convergence, diversity and feasibility simultaneously. To address this issue, this paper proposes a parameter-free constraint handling technique, a two-archive evolutionary algorithm, for constrained multi-objective optimization. It maintains two co-evolving populations simultaneously: one, denoted as the convergence archive, is the driving force that pushes the population toward the Pareto front; the other, denoted as the diversity archive, mainly tends to maintain the population diversity. In particular, to complement the behavior of the convergence archive and provide as much diversified information as possible, the diversity archive aims at exploring areas under-exploited by the convergence archive, including the infeasible regions. To leverage the complementary effects of both archives, we develop a restricted mating selection mechanism that adaptively chooses appropriate mating parents from them according to their evolution status. Comprehensive experiments on a series of benchmark problems and a real-world case study fully demonstrate the competitiveness of our proposed algorithm, compared with five state-of-the-art constrained evolutionary multi-objective optimizers.


Introduction
The constrained multi-objective optimization problem (CMOP) considered in this paper is defined as:

minimize F(x) = (f_1(x), · · · , f_m(x))^T
subject to g_j(x) ≥ a_j, j = 1, · · · , q
h_j(x) = b_j, j = q + 1, · · · , ℓ
x ∈ Ω
(1)

where x = (x_1, . . . , x_n)^T is a candidate solution, and Ω = Π_{i=1}^{n} [x_i^L, x_i^U] ⊆ R^n defines the search (or decision variable) space. F : Ω → R^m constitutes m conflicting objective functions, and R^m is the objective space. g_j(x) and h_j(x) are the j-th inequality and equality constraints, respectively. For a CMOP, the degree of constraint violation of x at the j-th constraint is calculated as [1]:

c_j(x) = ⟨g_j(x) − a_j⟩, j = 1, · · · , q
c_j(x) = ⟨ε − |h_j(x) − b_j|⟩, j = q + 1, · · · , ℓ
(2)

where ε is a relaxation term for the equality constraints, and ⟨α⟩ returns 0 if α ≥ 0, otherwise it returns the negative of α. The constraint violation value of x is calculated as:

CV(x) = Σ_{j=1}^{ℓ} c_j(x)
(3)

x is feasible in case CV(x) = 0; otherwise x is infeasible. Given two feasible solutions x_1, x_2 ∈ Ω, we say that x_1 dominates x_2 (denoted as x_1 ⪯ x_2) in case F(x_1) is not worse than F(x_2) in any individual objective and is better in at least one objective. A solution x* is Pareto-optimal with respect to (1) in case there is no x ∈ Ω such that x ⪯ x*. The set of all Pareto-optimal solutions is called the Pareto set (PS). Accordingly, PF = {F(x) | x ∈ PS} is called the Pareto front (PF).
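For concreteness, the feasibility machinery above can be sketched in a few lines of code. This is a minimal illustration, not part of the proposed algorithm: the constraint callables are hypothetical placeholders, and we follow the convention of rewriting g_j(x) ≥ a_j as g_j(x) − a_j ≥ 0 before applying the bracket operator.

```python
def bracket(alpha):
    """<alpha>: returns 0 if alpha >= 0, otherwise the negative of alpha."""
    return 0.0 if alpha >= 0 else -alpha

def constraint_violation(x, ineqs, eqs, epsilon=1e-4):
    """CV(x): aggregate violation over all constraints.

    ineqs: callables g with the convention g(x) >= 0 when satisfied
           (i.e., already shifted by a_j).
    eqs:   callables h with the convention h(x) = 0 when satisfied,
           relaxed by epsilon as in equation (2).
    """
    cv = sum(bracket(g(x)) for g in ineqs)
    cv += sum(bracket(epsilon - abs(h(x))) for h in eqs)
    return cv

def is_feasible(x, ineqs, eqs, epsilon=1e-4):
    """A solution is feasible exactly when its CV is zero."""
    return constraint_violation(x, ineqs, eqs, epsilon) == 0.0
```

For instance, with a single inequality x_0 − 1 ≥ 0 and a single equality x_1 = 0, the point (2, 0) is feasible while (0, 0) has CV = 1.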
Since evolutionary algorithms (EAs) are able to approximate a population of non-dominated solutions, which portray the trade-offs among conflicting objectives, in a single run, they have been recognized as a major approach for multi-objective optimization. Over the past two decades, much effort has been devoted to developing evolutionary multi-objective optimization (EMO) algorithms, e.g., the elitist non-dominated sorting genetic algorithm (NSGA-II) [2][3][4][5], indicator-based EAs [6][7][8] and multi-objective EAs based on decomposition [9][10][11][12][13]. Nevertheless, although most, if not all, real-life optimization scenarios have various constraints by nature, it is surprising that research on constraint handling remains lukewarm in the EMO community [14], compared with the body of algorithms designed for unconstrained scenarios.
Generally speaking, convergence, diversity and feasibility are three basic issues in CMO. Most, if not all, currently prevalent constraint handling techniques at first tend to push a population toward the feasible region as much as possible, before considering the balance between convergence and diversity within the feasible region. This might lead to the population being stuck at some local optima or local feasible regions, especially when the feasible regions are narrow and/or disparately distributed in the search space.
In this paper, we propose a two-archive EA, denoted as C-TAEA, for solving CMOPs. Specifically, we simultaneously maintain two co-evolving and complementary populations: one is denoted as convergence archive (CA); while the other is denoted as diversity archive (DA). The main characteristics of C-TAEA are delineated as follows: • As the name suggests, the CA is the driving force to maintain the convergence and feasibility of the evolution process. It provides a consistent selection pressure toward the PF.
• In contrast, without considering the feasibility, the DA mainly tends to maintain the convergence and diversity of the evolution process. In particular, the DA explores the areas that have not been exploited by the CA. This not only improves the population diversity of the CA within the currently investigating feasible region, but also helps jump over the local optima or local feasible regions.
• To leverage the complementary effect and the elite information of these two co-evolving populations, we develop a restricted mating selection mechanism that selects the appropriate mating parents from the CA and DA separately according to their evolution status.
We admit that the two-archive or multi-population strategy is not a brand new technique in the EMO literature. For example, [15][16][17] developed several two-archive EMO algorithms that use two "conceptually" complementary populations to strike the balance between convergence and diversity of the evolutionary process. Li et al. [18] developed a dual-population paradigm that combines the strengths of decomposition-and Pareto-based selection mechanisms. In this paper, we would like to, for the first time, explore the potential advantages of the two-archive strategy for CMOPs.
The rest of this paper is organized as follows. Section 2 briefly overviews the state-of-the-art evolutionary approaches developed for CMOPs and then elicits our motivations. Section 3 describes the technical details of the proposed algorithm step by step. Afterwards, in Section 4 and Section 5, the effectiveness and competitiveness of the proposed algorithm are empirically investigated and compared with five state-of-the-art constrained EMO algorithms on various benchmark problems. Finally, Section 6 concludes with a summary and ideas for future directions.

Preliminaries
In this section, we first briefly review some recent developments of constraint handling techniques in the EMO community. Afterwards, we will give our motivations based on some examples.

Literature Review
Generally speaking, the ideas of the existing constraint handling techniques in multi-objective optimization can be divided into the following three categories.
The first category is mainly driven by the feasibility information, where feasible solutions are always granted a higher priority to survive to the next iteration. As early as the 1990s, Fonseca and Fleming [19] developed a unified framework for solving MOPs with multiple constraints. In particular, they assign a higher priority to constraints than to objective functions. This results in a prioritization of the search for feasible solutions over optimal solutions. In [20], Coello Coello and Christiansen proposed a naïve constraint handling method that simply ignores infeasible solutions. Although this method is easy to implement, it suffers from a loss of selection pressure when tackling problems with a narrow feasible region. In particular, this algorithm has no selection pressure at all when the population is filled with infeasible solutions. In [2], Deb et al. developed a constrained dominance relation for CMO. Specifically, a solution x_1 is said to constrained-dominate another one x_2 if: 1) x_1 is feasible while x_2 is not; 2) both of them are infeasible and CV(x_1) < CV(x_2); or 3) both of them are feasible and x_1 ≺ x_2. By simply replacing the Pareto dominance relation with this constrained dominance relation, the state-of-the-art NSGA-II and NSGA-III [21] can be readily used to tackle CMOPs. Borrowing a similar idea, several MOEA/D variants [21][22][23] use the CV as an alternative criterion in the subproblem update procedure. Different from [2], Oyama et al. [24] modified the Pareto dominance relation so that solutions that violate fewer constraints are preferred. To improve the interpretability of infeasible solutions, Takahama et al. [25] and Martínez et al. [26] proposed an ε-constraint dominance relation where two solutions are regarded as violating the constraints equally in case the difference of their CVs is smaller than a threshold ε. In particular, this threshold can be adaptively tuned according to the ratio of feasible solutions in the population.
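Deb's three rules translate directly into code. The following is a minimal sketch on raw objective vectors and CV values (minimization is assumed; the function names are our own):

```python
def dominates(f1, f2):
    """Standard Pareto dominance on objective vectors (minimization):
    f1 is no worse everywhere and strictly better somewhere."""
    return all(a <= b for a, b in zip(f1, f2)) and \
           any(a < b for a, b in zip(f1, f2))

def constrained_dominates(f1, cv1, f2, cv2):
    """Deb's constrained dominance: feasibility first, then CV, then Pareto."""
    if cv1 == 0 and cv2 > 0:        # 1) x1 feasible, x2 infeasible
        return True
    if cv1 > 0 and cv2 > 0:         # 2) both infeasible: smaller CV wins
        return cv1 < cv2
    if cv1 == 0 and cv2 == 0:       # 3) both feasible: plain Pareto dominance
        return dominates(f1, f2)
    return False                    # x1 infeasible, x2 feasible
```

Note that a feasible solution constrained-dominates any infeasible one regardless of objective values, which is exactly the feasibility-first bias discussed in Section 2.2.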
In [27], Asafuddoula et al. proposed an adaptive constraint handling method that treats infeasible solutions as feasible ones in case their CVs are less than a threshold. Analogously, Fan et al. [28] developed an angle-based constrained dominance principle by which two infeasible solutions are regarded as non-dominated from each other when their angle is larger than a threshold.
The second category aims at balancing the trade-off between convergence and feasibility during the search process. In [29], Jiménez et al. proposed a min-max formulation that drives feasible solutions to evolve toward optimality and infeasible solutions toward feasibility. In [30], Ray et al. suggested the Ray-Tai-Seow algorithm, which uses three different methods to compare and rank non-dominated solutions. Specifically, the first ranking procedure is conducted by sorting the objective values; the second one is performed according to the different constraints; while the last one is based on a combination of objective values and constraints. Following the same logic, Young [31] proposed a constrained dominance relation that compares solutions according to a blended rank from both the objective space and the constraint space. A similar approach was developed by Angantyr et al. [32], which uses the weighted average of the ranks in both the objective space and the constraint space. By transforming each of the original objective functions of a CMOP into the sum of a distance measure and a penalty function, [14] developed a new constraint handling technique for CMO. In particular, the modified objective functions are used in the non-dominated sorting procedure of NSGA-II to facilitate the search for optimal solutions in both feasible and infeasible regions. To improve the population diversity, Li et al. [33] developed a method that preserves infeasible solutions in case they are located in isolated regions. More recently, Ning et al. [34] proposed a constrained non-dominated sorting method where each solution is assigned a constrained non-domination rank based on its Pareto rank and its constraint rank.
The last category tries to repair infeasible solutions and thus drive them toward the feasible region. For example, Harada et al. [35] proposed a so-called Pareto descent repair operator that explores possible feasible solutions around infeasible solutions in the constraint space. However, the required gradient information is usually unavailable in practice. In [36], Singh et al. suggested using simulated annealing to accelerate the movement from infeasible solutions toward feasible ones. Jiao et al. [37] developed a feasible-guiding strategy in which the feasible direction is defined as a vector starting from an infeasible solution and ending at its nearest feasible solution. Afterwards, infeasible solutions are guided toward the feasible region by leveraging the information provided by this feasible direction.

Challenges to Existing Constraint Handling Techniques
From the above literature review, we find that most, if not all, constraint handling techniques in multi-objective optimization overly emphasize the importance of feasibility, whereas they rarely consider the balance among convergence, diversity and feasibility simultaneously. This can lead to an ineffective search when encountering complex constraints.
Let us first consider the test problem C1-DTLZ3 defined in [21], where the objective functions are the same as in the classic DTLZ3 problem [38] while the constraint is defined as:

c(x) = ( Σ_{i=1}^{m} f_i(x)^2 − 16 ) ( Σ_{i=1}^{m} f_i(x)^2 − r^2 ) ≥ 0

Fig. 1 shows a two-objective example where r is set to 6. From this figure, we can see that the feasible region of this test problem is intersected by an infeasible ribbon. In addition, within this infeasible region, the CV of a solution increases when it moves away from the feasible boundary, and decreases otherwise. Therefore, it is not difficult to infer that a feasibility-driven strategy will easily be trapped at the outermost feasible boundary. To validate this assertion, we employ the state-of-the-art C-MOEA/D and C-NSGA-III [21] as benchmark algorithms, with the corresponding parameters set as in [21]. As shown in Fig. 1, the solutions found by both algorithms are stuck at the outermost feasible boundary after 1,000 generations.

Let us consider another test problem, C2-DTLZ2 defined in [21], where the objective functions are the same as in the classic DTLZ2 problem [38] while the constraint is defined as:

c(x) = − min { min_{i∈{1,··· ,m}} [ (f_i(x) − 1)^2 + Σ_{j=1, j≠i}^{m} f_j(x)^2 − r^2 ], Σ_{i=1}^{m} (f_i(x) − 1/√m)^2 − r^2 } ≥ 0

Fig. 2 gives an example in the two-objective scenario, where three feasible regions are sparsely located on the PF. If the size of each feasible region is small, a feasibility-driven strategy will easily be trapped in some, but not all, of the feasible regions. Furthermore, it is highly likely that none of the weight vectors used in the state-of-the-art decomposition-based EMO algorithms, e.g., C-MOEA/D and C-NSGA-III, crosses these feasible regions if their sizes are sufficiently small. In this case, the decomposition-based EMO algorithms will struggle to find feasible solutions. The results shown in Fig. 2 fully validate our assertions: neither C-MOEA/D nor C-NSGA-III can find Pareto-optimal solutions in all three feasible regions when we set r to a relatively small value, say 0.1.
Based on these discussions, we find that an excessive use of the feasibility information can restrict the search ability of a constrained EMO algorithm. In Section 3, we will demonstrate how to use a two-archive strategy to balance convergence, diversity and feasibility simultaneously over the entire search space. In particular, we find that an appropriate use of the infeasibility information can help resolve the dilemma between exploration and exploitation.

Proposed Algorithm
The general flow chart of our proposed C-TAEA is given in Fig. 3. As its name suggests, C-TAEA maintains two co-evolving archives, named CA and DA, each of which has the same fixed size N. Specifically, the CA, as the main force, is mainly responsible for driving the population toward the feasible region and approximating the PF; the DA, as a complement, is mainly used to explore the areas under-exploited by the CA. It is worth noting that, to provide as much diversified information as possible, the update of the DA does not take the feasibility information into account. During the reproduction process, mating parents are separately selected from the CA and DA according to their evolution status, as described in Section 3.4. Afterwards, the offspring are used to update the CA and DA according to the mechanisms described in Section 3.2 and Section 3.3, respectively.

Density Estimation Method
Algorithm 1: Association Procedure
Input: Solution set S, weight vector set W
Output: S with each solution associated with its subregion

Before explaining the update mechanisms of the CA and DA in C-TAEA, we first introduce the density estimation method that is used in both cases. To facilitate the density estimation, we borrow the idea from [39] and divide the objective space into N subregions, each of which is represented by a unique weight vector on the canonical simplex. In particular, we employ our previously developed weight vector generation method [33], which is scalable to many-objective scenarios, to sample a set of uniformly distributed weight vectors, i.e., W = {w^1, · · · , w^N}. Specifically, a subregion ∆^i, where i ∈ {1, · · · , N}, is defined as:

∆^i = { F(x) ∈ R^m | ⟨F(x), w^i⟩ ≤ ⟨F(x), w^j⟩, for all j ∈ {1, · · · , N} }

where ⟨F(x), w⟩ is the acute angle between F(x) and the reference line formed by the origin and w. After the setup of subregions, each solution x of a population is associated with the unique subregion whose index is determined as:

k = argmin_{i∈{1,··· ,N}} ⟨F̄(x), w^i⟩

where F̄(x) is the normalized objective vector of x, whose i-th objective function is calculated as:

f̄_i(x) = ( f_i(x) − z*_i ) / ( z^nad_i − z*_i )

where i ∈ {1, · · · , m}, and z* and z^nad are respectively the estimated ideal and nadir points, i.e., z*_i and z^nad_i are the minimum and maximum values of the i-th objective found so far. The pseudo-code of this association procedure is given in Algorithm 1. After associating solutions with subregions, the density of a subregion is counted as the number of its associated solutions.
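The association step can be sketched as follows. This is a minimal illustration of angle-based association with the normalization described above; the function names are our own and the ideal/nadir points are assumed to be given:

```python
import math

def normalize(F, z_star, z_nad):
    """Normalize an objective vector by the ideal and nadir points."""
    return [(f - zs) / (zn - zs) for f, zs, zn in zip(F, z_star, z_nad)]

def acute_angle(F, w):
    """Angle between vector F and the reference line of weight vector w."""
    dot = sum(f * wi for f, wi in zip(F, w))
    nF = math.sqrt(sum(f * f for f in F))
    nw = math.sqrt(sum(wi * wi for wi in w))
    # clamp to guard against floating-point round-off outside [-1, 1]
    return math.acos(max(-1.0, min(1.0, dot / (nF * nw))))

def associate(F_bar, W):
    """Index of the subregion whose weight vector has the smallest angle
    to the (normalized) objective vector F_bar."""
    return min(range(len(W)), key=lambda i: acute_angle(F_bar, W[i]))
```

The density of subregion i is then simply the number of solutions whose `associate` result equals i.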

Update Mechanism of the CA
The role of the CA is similar to that of other constrained EMO algorithms in the literature: it first pushes the population toward the feasible region as much as possible, and then tries to balance convergence and diversity within the feasible region. The pseudo-code of the update mechanism of the CA is given in Algorithm 2. Specifically, we first form a hybrid population H_c as the combination of the CA and the offspring population Q. Feasible solutions in H_c are chosen into a temporary archive S_c (lines 3 to 5 of Algorithm 2). Afterwards, the follow-up procedure depends on the size of S_c: • If the size of S_c equals N (i.e., the predefined size of the CA), it is directly used as the new CA and the update procedure terminates (lines 6 and 7 of Algorithm 2).
• If |S_c| > N, we use the fast non-dominated sorting method [2] to divide S_c into several non-domination levels, i.e., F_1, F_2, and so on. Starting from F_1, each non-domination level is sequentially added to a temporary archive S until its size equals or for the first time exceeds N (lines 9 to 11 of Algorithm 2). If we denote the last acceptable non-domination level as F_l, solutions belonging to F_{l+1} onwards are exempt from further consideration. Note that S can be used as the new CA directly if its size equals N; otherwise, we associate each solution in S with its corresponding subregion and calculate the density information of S. Iteratively, the worst solution of the most crowded subregion (ties are broken randomly) is trimmed one at a time until the size of S equals N (lines 11 to 21 of Algorithm 2). Note that, to improve the population diversity within a subregion, we propose the following process to identify the worst solution x^w. First, we calculate the distance between each solution x in ∆^i and its nearest neighbor:

d(x) = min_{x'∈∆^i, x'≠x} ‖F̄(x) − F̄(x')‖

where ‖·‖ indicates the ℓ_2-norm. Afterwards, the solutions having the smallest such distance are stored in a temporary archive S_t, and x^w is chosen from S_t as the solution having the worst fitness value, as defined in equation (11).

• Otherwise, if the feasible solutions in H_c cannot fill the new CA (|S_c| < N), we formulate a new bi-objective optimization problem, defined in (12), which trades the original objectives off against the constraint violation. Based on (12), we use the fast non-dominated sorting method to divide the infeasible solutions in H_c into several non-domination levels (lines 24 and 25 of Algorithm 2). Solutions in the first several levels have a higher priority to survive into the new CA. Excess solutions are trimmed according to their CVs, i.e., the solution having a larger CV is trimmed first (lines 28 and 29 of Algorithm 2). These operations tend to further balance convergence, diversity and feasibility.
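The iterative trimming of the most crowded subregion can be sketched as follows. This is a simplified illustration, not the paper's exact procedure: the `fitness` callable stands in for equation (11), which is not reproduced here, and solutions are represented as (subregion index, normalized objective vector) pairs:

```python
import math

def euclid(a, b):
    """Euclidean distance between two objective vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def trim_most_crowded(archive, N, fitness):
    """Iteratively drop the worst solution from the most crowded subregion
    until the archive size equals N. Each entry is a (region, F) pair."""
    archive = list(archive)
    while len(archive) > N:
        # identify the most crowded subregion by counting members
        counts = {}
        for region, _ in archive:
            counts[region] = counts.get(region, 0) + 1
        crowded = max(counts, key=counts.get)
        members = [s for s in archive if s[0] == crowded]

        def nn_dist(s):
            # distance to the nearest neighbour within the same subregion
            others = [t for t in members if t is not s]
            return min(euclid(s[1], t[1]) for t in others) if others else float("inf")

        # candidates are the members with the smallest neighbour distance;
        # among them, the one with the worst (largest) fitness is removed
        d_min = min(nn_dist(s) for s in members)
        candidates = [s for s in members if nn_dist(s) == d_min]
        worst = max(candidates, key=lambda s: fitness(s[1]))
        archive.remove(worst)
    return archive
```

For example, trimming an archive of four solutions down to two removes first one of the two near-duplicates in the crowded subregion, then the remaining member of that subregion with the worse stand-in fitness.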

Update Mechanism of the DA
Different from the CA, the DA aims at providing as many diversified solutions as possible. In particular, its update mechanism has two characteristics: 1) it does not take the constraint violation into consideration; 2) it takes the up-to-date CA as a reference set, so that it complements the behavior of the CA by exploring its under-exploited areas. The pseudo-code of this update procedure is presented in Algorithm 3. Specifically, similar to Section 3.2, we first combine the DA with the offspring population Q to form a hybrid population H_d. Then, we separately associate each solution in H_d and in the up-to-date CA with its corresponding subregion according to the method introduced in Section 3.1. Thereafter, the subregions are investigated one by one, where the best solution x^b among the candidates of a subregion is identified and moved into the new DA. This iterative investigation continues until the DA is filled.
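A rough sketch of this subregion-by-subregion filling is given below. Two simplifying assumptions are made, since the exact rule for x^b is not reproduced here: under-exploited subregions are taken to be those with no CA member, and x^b is simply the candidate with the best (smallest) stand-in fitness value:

```python
def update_DA(H_d, CA_regions, N, region_of, fitness):
    """Fill the DA of size N from the hybrid population H_d, visiting
    subregions without a CA member first. `region_of` maps a solution to
    its subregion index; `fitness` is a stand-in scalar quality measure."""
    # group the candidates by their associated subregion
    buckets = {}
    for x in H_d:
        buckets.setdefault(region_of(x), []).append(x)
    # visit subregions not covered by the CA before the covered ones
    order = sorted(buckets, key=lambda r: (r in CA_regions, r))
    DA = []
    while len(DA) < N:
        progressed = False
        for r in order:
            if len(DA) >= N:
                break
            if buckets[r]:
                # take the best remaining candidate of this subregion
                best = min(buckets[r], key=fitness)
                buckets[r].remove(best)
                DA.append(best)
                progressed = True
        if not progressed:   # no candidates left anywhere
            break
    return DA
```

With this ordering, a subregion already occupied by the CA only contributes to the DA after every uncovered subregion has donated a candidate, which mirrors the complementary intent described above.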

Offspring Reproduction
The interaction and collaboration between the two co-evolving archives is a vital step in C-TAEA. Apart from the complementary behavior of the update mechanisms of the CA and DA, the other contributing factor to this collaboration is the restricted mating selection. Generally speaking, its major purpose is to leverage the elite information from both archives for offspring reproduction. Algorithm 4 provides the pseudo-code of this restricted mating selection procedure. Specifically, we first combine the CA and the DA into a composite set H_m. Afterwards, we separately evaluate the proportions of non-dominated solutions of the CA and the DA in H_m, denoted as ρ_c and ρ_d respectively (lines 2 and 3 of Algorithm 4). If ρ_c > ρ_d, the convergence status of the CA is better than that of the DA. Accordingly, the first mating parent is chosen from the CA; otherwise, it comes from the DA (lines 4 to 7 of Algorithm 4). As for the other mating parent, whether it is chosen from the CA or the DA depends on these proportions of non-dominated solutions (lines 8 to 11 of Algorithm 4). In other words, the higher an archive's proportion of non-dominated solutions, the larger its chance of being chosen as the mating pool. As shown in lines 5 to 11 of Algorithm 4, we use a binary tournament selection to choose a mating parent. As shown in Algorithm 5, this tournament selection procedure is feasibility-driven. Specifically, if the randomly selected candidates are both feasible, the choice is made based on Pareto dominance; if only one of them is feasible, the feasible one is chosen; otherwise, the mating parent is chosen at random. Once the mating parents are chosen, we use the popular simulated binary crossover [40] and the polynomial mutation [41] for offspring reproduction. In principle, any other reproduction operator can be readily applied with a minor modification.
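A condensed sketch of the selection logic of Algorithm 4 follows. For brevity, the binary tournament of Algorithm 5 is replaced by a uniform random pick within the chosen archive, and drawing the second parent with probability proportional to ρ_c and ρ_d is our reading of lines 8 to 11; treat both as illustrative assumptions:

```python
import random

def nondominated_fraction(pop, union, dominates):
    """Fraction of `pop` members not dominated by anyone in `union`."""
    nd = [p for p in pop
          if not any(dominates(q, p) for q in union if q is not p)]
    return len(nd) / len(pop)

def restricted_mating_selection(CA, DA, dominates, rng=random):
    """Pick two mating parents according to the archives' evolution status."""
    H = CA + DA
    rho_c = nondominated_fraction(CA, H, dominates)
    rho_d = nondominated_fraction(DA, H, dominates)
    # first parent comes from the archive with the better convergence status
    p1 = rng.choice(CA) if rho_c > rho_d else rng.choice(DA)
    # second parent: CA is favoured in proportion to rho_c
    p2 = rng.choice(CA) if rng.random() < rho_c / (rho_c + rho_d) \
         else rng.choice(DA)
    return p1, p2
```

When the CA dominates every DA member (ρ_c = 1, ρ_d = 0), both parents come from the CA, which matches the intuition that a stalled DA should not dilute the mating pool.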

Experimental Setup
Before discussing the empirical results, this section briefly introduces the benchmark problems, performance metrics and the state-of-the-art constrained EMO algorithms used for peer comparisons in our empirical studies.

Algorithm 5: Tournament Selection
Input: Solution set S
Output: Mating parent
1 Randomly pick two solutions x_1 and x_2 from S;
2 if x_1 and x_2 are both feasible then return the one that dominates the other (at random if they are non-dominated);
3 else if exactly one of x_1 and x_2 is feasible then return the feasible one;
4 else return one of x_1 and x_2 at random;

Performance Metrics
Two widely used metrics are chosen to assess the performance of different algorithms.
1. Inverted Generational Distance (IGD) [42]: Let P* be a set of points uniformly sampled along the PF and P be the set of solutions obtained by an EMO algorithm. The IGD value of P is calculated as:

IGD(P) = ( Σ_{z∈P*} dist(z, P) ) / |P*|

where dist(z, P) is the Euclidean distance between z and its nearest neighbor in P.
2. Hypervolume (HV) [43]: Let z^r = (z^r_1, · · · , z^r_m)^T be a worst point dominated by all Pareto-optimal objective vectors. The HV of P is defined as the volume of the objective space dominated by the solutions in P and bounded by z^r:

HV(P) = VOL( ∪_{x∈P} [f_1(x), z^r_1] × · · · × [f_m(x), z^r_m] )

where VOL(·) indicates the Lebesgue measure.
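Both metrics are straightforward to compute for small solution sets. The following minimal sketch computes IGD exactly; for HV it covers only the two-objective case, where a single sweep over the sorted non-dominated set suffices (the general m-objective HV requires a dedicated algorithm):

```python
import math

def igd(P_star, P):
    """Mean Euclidean distance from each reference point in P* to its
    nearest neighbour in the obtained set P (smaller is better)."""
    def dist(z, pts):
        return min(math.sqrt(sum((a - b) ** 2 for a, b in zip(z, p)))
                   for p in pts)
    return sum(dist(z, P) for z in P_star) / len(P_star)

def hv_2d(P, z_ref):
    """Hypervolume of a two-objective non-dominated set (minimization),
    bounded by the reference point z_ref, via a sweep along f1."""
    # keep only points that dominate the reference point, sorted by f1
    pts = sorted(p for p in P if p[0] < z_ref[0] and p[1] < z_ref[1])
    hv, prev_f2 = 0.0, z_ref[1]
    for f1, f2 in pts:
        hv += (z_ref[0] - f1) * (prev_f2 - f2)  # rectangle added by this point
        prev_f2 = f2
    return hv
```

For instance, the front {(0, 1), (1, 0)} with z^r = (2, 2) yields an HV of 3: each point dominates a 2 x 1 box and the two boxes overlap in a unit square.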
To calculate the IGD, we need to sample a sufficient number of points from the PF to form P*. For the C-DTLZ problem instances, we use the method developed in [33] for this purpose. Before calculating the HV, we remove the solutions dominated by z^r, which is set to (1.1, · · · , 1.1)^T in our empirical studies, except for C3-DTLZ4 where z^r = (2.1, · · · , 2.1)^T. Note that both IGD and HV evaluate convergence and diversity simultaneously: a smaller IGD or a larger HV value indicates a better approximation of the PF. Each algorithm is independently run 51 times. The median and the interquartile range (IQR) of the IGD and HV values are presented in the corresponding tables, where the best results are highlighted in boldface with a gray background. To draw statistically sound conclusions, we use the Wilcoxon rank-sum test at a significance level of 5% to validate the significance of the better performance achieved by the proposed C-TAEA with respect to the other peer algorithms.

EMO Algorithms Used for Comparisons
Five state-of-the-art constrained EMO algorithms, i.e., C-MOEA/D, C-NSGA-III, C-MOEA/DD [33], I-DBEA [27] and CMOEA [14], are chosen for peer comparison. All algorithms use the simulated binary crossover and the polynomial mutation for offspring generation. The termination criterion is a predefined number of function evaluations. Section II of the supplementary document briefly describes these peer algorithms and lists their corresponding parameter settings.

Empirical Studies
In this section, we discuss the empirical results on different benchmark problems separately.

C-DTLZ Benchmark Suite
† denotes that the performance of C-TAEA is significantly better than that of the other peers according to the Wilcoxon rank-sum test at a 0.05 significance level; ‡ denotes that the corresponding algorithm significantly outperforms C-TAEA.
The comparison results of IGD and HV values are given in Table 1 and Table 2 respectively. Generally speaking, our proposed C-TAEA produces superior IGD and HV values on most test instances.
Let us first look at the Type-1 constrained problems. Although the feasible region of C1-DTLZ1 is only a narrow region above the PF, it actually does not pose any difficulty to the algorithms considered here. In particular, all algorithms, especially the purely feasibility-driven ones, simply push solutions toward the feasible boundary. As for C1-DTLZ3, C-TAEA shows the best performance on all 3- to 15-objective problem instances. In particular, it obtains IGD values around 50 times smaller than those of the other peer algorithms on average; and only C-TAEA obtains effective HV values, while the HV values obtained by the other peer algorithms are all 0, which means that their obtained non-dominated solutions are all dominated by z^r. As shown in Fig. 2 of the supplementary document, C1-DTLZ3 places an infeasible barrier in the attainable objective space, which obstructs the population from converging to the true PF. As discussed in Section 2.2, due to their feasibility-driven selection strategies, the other peer algorithms cannot provide any further selection pressure to push the population forward once it approaches the outer boundary of this infeasible barrier, as shown in Fig. 4. In contrast, since the selection mechanism of the DA does not take the feasibility information into account, it constantly pushes the solutions of the DA toward the PF regardless of the existence of this infeasible barrier. Meanwhile, the CA can eventually overcome this infeasible barrier via the restricted mating selection between the CA and DA. We also notice that C-TAEA cannot push solutions to fully converge on the PF in the high-dimensional cases, as shown in Figs. 17 to 20 of the supplementary document. This is because the size of the infeasible barrier increases with the dimensionality, which makes C1-DTLZ3 even more difficult in many-objective scenarios. Nevertheless, the solutions obtained by C-TAEA are much closer to the PF than those of the other peer algorithms.
The Type-2 constrained problem, i.e., C2-DTLZ2, spreads several feasible regions over disparate parts of the PF. None of the algorithms has any difficulty in finding at least one feasible PF segment, whereas only C-TAEA can find all of the disparately distributed small feasible PF segments, as shown in Fig. 5. The reason for this phenomenon is similar to that for C1-DTLZ3. Specifically, each feasible region is small when setting a small r in C2-DTLZ2, thus different feasible regions are separated by large infeasible barriers. In this case, once an algorithm finds one of the feasible PF segments, it hardly has sufficient selection pressure to jump beyond this local feasible PF segment. (We only show the 3-objective scatter plots in this paper; the high-dimensional plots, which are not as intuitive as the 3-objective ones, are put in the supplementary document.) However, the DA in C-TAEA complements the coverage of the CA. As shown in Fig. 6, the solutions in the CA and DA complement each other perfectly in terms of their coverage of the PF. As a result, the DA helps drive the CA to explore new feasible regions.

As for the Type-3 constrained problems, i.e., C3-DTLZ1 and C3-DTLZ4, the original PF of the baseline problem becomes infeasible when considering the constraints, while the new PF is formed by the feasible boundaries. In terms of constraint handling, this type of problem does not pose much difficulty. From the comparison results shown in Table 1 and Table 2, we find that all algorithms obtain comparable IGD and HV values on all C3-DTLZ1 and C3-DTLZ4 problem instances. In particular, C-TAEA is outperformed by C-MOEA/D on the 5-objective C3-DTLZ1 problem instance, and by C-NSGA-III on the 8- and 10-objective C3-DTLZ4 problem instances. In general, due to the advanced selection mechanisms of the CA and DA for balancing convergence and diversity, C-TAEA obtains better IGD and HV values in most cases.
The comparison results of IGD and HV values on the DC-DTLZ benchmark suite are given in Table 3 and Table 4, respectively. From these results, the overwhelmingly superior performance of C-TAEA over the other peer algorithms is obvious, given that C-TAEA obtains the best IGD and HV values in all comparisons. The following paragraphs try to decipher the potential reasons for the ineffectiveness of the other peer algorithms.
Let us start with the Type-1 constrained problem. As described in Section I-B1) of the supplementary document, the constraints restrict the feasible region to a couple of narrow cone-shaped strips. Similar to C2-DTLZ2, the other peer algorithms run the risk of being trapped in one feasible region and thus fail to find all feasible PF segments. However, DC1-DTLZ1 and DC1-DTLZ3 seem to be less challenging than C2-DTLZ2 with a small r setting, given the observation that some peer algorithms are able to find a good number of solutions in different feasible PF segments, as shown in Fig. 7 and Fig. 8. This might be attributed to the g(x) function of the baseline test problems, i.e., DTLZ1 and DTLZ3, which can make crossover and mutation generate offspring far apart from their parents. Therefore, we can expect that solutions have some opportunities to jump out of a locally feasible region. Nevertheless, as shown in Table 3 and Table 4, the IGD and HV values obtained by our proposed C-TAEA consistently outperform those of the other peer algorithms, and the better results are of statistical significance.

The Type-2 constrained problem seems, at first glance, to be similar to C1-DTLZ1, as shown in Fig. 8 and Fig. 9 of the supplementary document, where the constraints reduce the feasible region to a thin ribbon above the PF.
However, it is more challenging due to the fluctuation in the CV of an infeasible solution when it approaches the PF, as shown in Fig. 10 of the supplementary document. From Fig. 9 and Fig. 10, we can clearly see that all the other peer algorithms are trapped in a region far away from the PF. Given the problem definitions of DC2-DTLZ1 and DC2-DTLZ3 shown in the supplementary document, all solutions obtained by the other peer algorithms are infeasible. Their failures on this type of constrained problem can be attributed to their feasibility-driven selection mechanisms, which make the population fluctuate among the local optima of the CV. As for our proposed C-TAEA, its success can be attributed to the use of the DA. In particular, the selection mechanism of the DA does not take the CV into account, so it has sufficient selection pressure to move toward the PF. As shown in Fig. 9 and Fig. 10, only C-TAEA finally finds solutions on the PF.

As for the Type-3 constrained problem, its constraints are a combination of the previous two. In particular, the feasible regions are restricted to a couple of segmented cone-shaped strips. In addition, the CV of an infeasible solution exhibits the same fluctuation as in the Type-2 constrained problem when it approaches the PF. In this case, the other peer algorithms not only struggle to jump out of a particular locally feasible region, but also have significant trouble with the fluctuation (back and forth) of the population. Again, the success of our proposed C-TAEA is attributed to the collaborative and complementary effects of the two co-evolving archives. As shown in Fig. 11 and Fig. 12, only C-TAEA finds all feasible PF segments, while the other peer algorithms are stuck at some locally feasible regions away from the PF.

Further Analysis
The experimental results in Section 5.1 and Section 5.2 have demonstrated the superior performance of C-TAEA on a variety of constrained multi-objective benchmark problems. To better understand its design principles, this subsection investigates some important algorithmic components of C-TAEA by comparing it with the following two variants.
• Variant-I: As shown in lines 15 to 21 of Algorithm 2, we iteratively remove the worst solution from the most crowded region when updating the CA. In particular, the worst solution is determined by both its local crowdedness and its fitness value as defined in equation (11). This operation mainly aims to improve the population diversity. To validate its effectiveness, we develop a variant in which the worst solution is simply the one having the worst fitness value within the currently identified most crowded region.
• Variant-II: We claimed that the collaboration between the CA and DA is partially implemented by the restricted mating selection, which automatically chooses appropriate mating parents for offspring reproduction according to the archives' evolution status. To validate the effectiveness of this operation, we develop another variant that randomly chooses mating parents from the CA and DA with equal probability.
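The difference between the CA truncation rule and Variant-I can be sketched as follows. This is not the authors' code: the per-solution `crowd` value (a niche count, higher meaning denser) and the product used to combine crowdedness with fitness are illustrative stand-ins for the density estimation and the fitness of equation (11).

```python
# Sketch: two criteria for picking the "worst" solution during archive truncation.
def remove_worst(archive, variant=False):
    """archive: list of dicts with 'crowd' (niche count) and 'fit'
    (lower = better, assumed positive). Removes and returns one solution."""
    if variant:
        # Variant-I: worst fitness restricted to the single most crowded niche.
        densest = max(s["crowd"] for s in archive)
        pool = [s for s in archive if s["crowd"] == densest]
        worst = max(pool, key=lambda s: s["fit"])
    else:
        # C-TAEA-style: trade off crowdedness against fitness (illustrative score).
        worst = max(archive, key=lambda s: s["crowd"] * s["fit"])
    archive.remove(worst)
    return worst
```

The combined criterion can sacrifice a poorly converged solution in a sparse niche, whereas Variant-I always trims the densest niche, which explains its weaker diversity on problems with small feasible segments.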
In the empirical studies, we use the same parameter settings as in Section 5.1 and Section 5.2 and compare the performance of C-TAEA with these two variants on the C-DTLZ and DC-DTLZ benchmark problems. From the comparison results, i.e., the IGD and HV values shown in Table IV and Table V of the supplementary document respectively, we can see that the performance of C-TAEA and its two variants is comparable when the constraints are not difficult to handle, e.g., C1-DTLZ1 and C3-DTLZ1/DTLZ4, whereas the superiority of C-TAEA becomes evident otherwise. More specifically, we find that Variant-I fails to maintain good diversity when the feasible region is a small segment, e.g., C2-DTLZ2, DC1-DTLZ1/DTLZ3, and DC3-DTLZ1/DTLZ3. Fig. 13 compares the solutions found by C-TAEA and Variant-I on C2-DTLZ2 with r = 0.1. From this figure, we can see that the solutions found by Variant-I are sparsely distributed within the feasible region. This is because the purely fitness-based selection strategy tends to drive solutions toward the corresponding weight vectors within the feasible region as much as possible.
As for Variant-II, its random mating selection mechanism does not take full advantage of the complementary effect of the CA and DA, so it fails to help the algorithm overcome locally infeasible barriers, e.g., on C1-DTLZ3, DC2-DTLZ1/DTLZ3, and DC3-DTLZ1/DTLZ3.
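A minimal sketch of the two mating strategies is given below. Measuring an archive's evolution status by the fraction of its members that are non-dominated in the combined population is an illustrative assumption rather than the paper's exact rule, and the helper `nondominated_frac` is a hypothetical callback supplied by the caller.

```python
import random

# Sketch: restricted mating selection (C-TAEA) vs. random mating (Variant-II).
def restricted_mating(ca, da, nondominated_frac):
    """nondominated_frac(pop): fraction of pop that is non-dominated in CA+DA,
    used here as a proxy for the archive's evolution status (an assumption)."""
    rho_ca, rho_da = nondominated_frac(ca), nondominated_frac(da)
    first = random.choice(ca)                     # one parent always from the CA
    # The second parent comes from the archive in the better evolution status.
    second = random.choice(ca if rho_ca > rho_da else da)
    return first, second

def random_mating(ca, da):
    """Variant-II: each parent drawn from CA or DA with equal probability."""
    pool = lambda: ca if random.random() < 0.5 else da
    return random.choice(pool()), random.choice(pool())
```

When the CA stagnates behind an infeasible barrier, the restricted rule steers reproduction toward the DA's more advanced solutions, whereas random mating dilutes this complementary effect.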

Case Study: Water Distribution Network Optimization
Having tested C-TAEA's ability to solve various kinds of constrained benchmark problems, this section investigates the performance of C-TAEA and the other peer algorithms on a real-world case study on the optimal design of a water distribution network (WDN). In the past decade, multi-objective optimal design and rehabilitation of WDNs has attracted increasing attention [44]. The shift from least-cost design to multi-objective performance-based design advances decision makers' understanding of the trade-off relationships between conflicting design objectives [45]. This paper uses the Anytown WDN, one of the most popular benchmark networks, as the case study. The Anytown WDN has many typical features and challenges found in real-world networks, e.g., pump scheduling, tank storage provision, and fire-fighting capacity provision. The network layout is shown in Fig. 14; it has 35 pipes, 2 storage tanks, and 3 identical pumps delivering water from the treatment plant into the system. To meet the city's expansion and increasing demands, 77 decision variables are considered, including 35 variables related to the existing pipes (with options of cleaning and lining, or duplication with a parallel pipe), six variables for new pipe diameters, 12 variables for two potential tanks, and 24 variables for the number of pumps in operation during the 24 hours of a day. In this paper, the WDN design problem is formulated as a four-objective optimization problem with two constraints. In particular, we consider cost, resilience index, statistical flow entropy, and water age as the objective functions. More detailed descriptions of the problem formulation can be found in Section IV of the supplementary document.
In the experiment, C-TAEA and the other five peer algorithms use the solution encoding scheme suggested in [46]. The population size is set to N = 100, and the number of function evaluations for each algorithm is set to 10,000 × N. The reproduction operators and their corresponding parameters are kept the same as before. Since the true PF is unknown for this real-world WDN model, we only use the HV as the performance metric, with the reference point z^r = (1.1, · · · , 1.1)^T. In particular, we normalize the objective functions before calculating the HV metric. From the box plots (with respect to 51 independent runs) shown in Fig. 15, we can clearly see that our proposed C-TAEA performs better than the other five peer algorithms.
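The metric computation described above can be sketched with a Monte Carlo hypervolume estimator: each objective is min-max normalized to [0, 1] and the dominated volume is measured inside the box bounded by the reference point z^r = (1.1, · · · , 1.1)^T. The sampling-based estimator and its budget are illustrative choices, not the exact HV routine used in the experiments.

```python
import random

def normalized_hv(points, n_samples=100_000, ref=1.1, seed=0):
    """Monte Carlo estimate of the hypervolume (minimization) of a set of
    objective vectors after min-max normalization of each objective."""
    m = len(points[0])
    lo = [min(p[i] for p in points) for i in range(m)]
    hi = [max(p[i] for p in points) for i in range(m)]
    norm = [tuple((p[i] - lo[i]) / (hi[i] - lo[i] or 1.0) for i in range(m))
            for p in points]
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        s = tuple(rng.uniform(0.0, ref) for _ in range(m))
        # A sample counts if some normalized point weakly dominates it.
        if any(all(q[i] <= s[i] for i in range(m)) for q in norm):
            hits += 1
    return hits / n_samples * ref ** m   # scale by the sampling-box volume
```

Exact HV algorithms are preferable in low dimensions; sampling is a simple stand-in that still scales to the four-objective WDN case.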

Conclusions and Future Directions
In this paper, we have proposed a parameter-free constraint handling technique, a two-archive evolutionary algorithm (C-TAEA), for constrained multi-objective optimization. In C-TAEA, we simultaneously maintain two co-evolving archives. Specifically, one population, denoted as the CA, mainly focuses on driving the population toward the PF, while the other, denoted as the DA, mainly explores the areas under-exploited by the CA (including infeasible regions), thus providing more diversified information. In this way, the CA and DA have different behaviors and complementary effects. In particular, they complement each other via a restricted mating selection mechanism that selects complementary mating parents for offspring reproduction. The performance of C-TAEA has been investigated on a series of benchmark problems with various types of constraints and up to 15 objectives. The empirical results fully demonstrate its competitiveness on CMOPs in comparison to five state-of-the-art constrained EMO algorithms. In addition to artificial benchmark problems, the effectiveness of C-TAEA has also been validated on a real-world case study of WDN design optimization.
As also demonstrated previously in [15][16][17], we believe that C-TAEA is more than a specific algorithm. Its basic idea, co-evolving multiple complementary archives, can be widely applied in general EMO algorithm design. In the future, it is worth further investigating its underlying mechanisms from both the algorithm design and the theoretical foundation perspectives. Furthermore, we plan to investigate the effectiveness of this two-archive co-evolving framework on a wider range of problems, such as unconstrained MOPs, including those with complex properties (e.g., problems with complicated PSs [47] and imbalanced convergence and diversity [48]), dynamic optimization (e.g., problems with a changing number of objectives or constraints [49]), and other real-world applications.