Introduction
Recently, the development of evolutionary multiobjective optimization (EMO) algorithms was discussed from the point of view of co-evolution with test problems in [1], where the Pareto front of each test problem was compared with randomly generated initial solutions to explain its characteristic features. For example, Fig. 1 shows a test problem in [2] used for performance evaluation of EMO algorithms in the mid-1990s. As we can see from Fig. 1, some initial solutions are very close to the Pareto front. This observation may explain why nonelitist EMO algorithms with no strong convergence property, such as the nondominated sorting genetic algorithm (NSGA) [3] and the niched Pareto genetic algorithm (NPGA) [4], were proposed in the mid-1990s. In contrast, the initial solutions of the test problem in Fig. 2 [5] are not close to the Pareto front. Thus strong convergence is needed in Fig. 2, whereas strong diversification is not (since the randomly generated initial solutions already have large diversity). These observations explain why elitist EMO algorithms such as the strength Pareto evolutionary algorithm (SPEA) [6], NSGA-II [7], and SPEA2 [8] were proposed around 2000. In these algorithms, Pareto dominance is used as the main fitness evaluation criterion together with a secondary criterion for diversity maintenance.
200 randomly generated solutions and the Pareto front of the two-objective ZDT1 problem [5].
In parallel with the increase in the popularity of Pareto dominance-based EMO algorithms [6]–[8], the many-objective test suites DTLZ [9] and WFG [10] were proposed as scalable test problems in which the number of objectives can be arbitrarily specified. Multiobjective knapsack problems [6] were also generalized to many-objective test problems with up to 25 objectives [11], [12]. The DTLZ, WFG, and knapsack problems have repeatedly been used to demonstrate the difficulties that many-objective optimization poses for Pareto dominance-based EMO algorithms [13]–[16]. When an EMO algorithm is applied to a many-objective problem, almost all solutions in the population become mutually nondominated in very early generations (e.g., within ten generations), long before they converge to the Pareto front. This means that the Pareto dominance-based selection pressure toward the Pareto front becomes very weak. As a result, the convergence ability of Pareto dominance-based EMO algorithms is severely degraded by the increase in the number of objectives.
For improving the convergence ability on many-objective problems, various approaches have been proposed, such as the modification of the Pareto dominance relation [17] and the introduction of an additional ranking mechanism [18]–[21]. The use of a different fitness evaluation mechanism has also been actively studied. Approaches in this direction can be classified into two categories. One is an indicator-based approach (e.g., hypervolume-based algorithms), and the other is a decomposition-based approach such as MOEA/D [25].
Whereas MOEA/D [25] was not originally proposed for many-objective problems, its high performance as a many-objective optimizer has been observed [16], [26]. Recently, a number of new many-objective algorithms have been proposed using the framework of MOEA/D (e.g., the improved decomposition-based evolutionary algorithm (I-DBEA) [27], MOEA/D with a distance-based updating strategy (MOEA/D-DU) [28], ensemble fitness ranking with a ranking restriction scheme (EFR-RR) [28], $\theta$-DEA [29], NSGA-III [30], and MOEA/DD [31]).
MOEA/D [25] searches for well-distributed solutions using systematically generated weight vectors. As an example, we show a set of 91 weight vectors for a three-objective problem in Fig. 3. In Fig. 4, we show an example of the solutions obtained by MOEA/D with the PBI function (MOEA/D-PBI).
Example of the nondominated solutions obtained by MOEA/D-PBI for a three-objective DTLZ2 problem using the 91 weight vectors in Fig. 3.
In Figs. 3 and 4, a one-to-one mapping is realized between the weight vectors in Fig. 3 and the obtained solutions in Fig. 4. However, this is not always the case. As examined in [33], the number of nondominated solutions obtained by MOEA/D is often much smaller than the number of weight vectors. This is because: 1) a single good solution can be shared by multiple weight vectors and 2) not all solutions in the final population are nondominated. Recently proposed many-objective algorithms [27]–[32] with the MOEA/D framework have mechanisms for improving both the convergence of solutions toward the Pareto front and their uniformity over the entire Pareto front. Their common feature is the use of systematically generated weight vectors. In those algorithms, reference points and/or reference lines are constructed using the weight vectors.
Surprisingly good results have been reported for those weight vector-based algorithms on the DTLZ and WFG problems in the literature. For example, very small average inverted generational distance (IGD) values over 20 runs were reported in [29] on a 15-objective DTLZ2 problem.
Reference points for the IGD calculation and a set of solutions obtained by MOEA/D-PBI.
Whereas the difficulties of many-objective problems have repeatedly been pointed out (e.g., see the survey papers [12], [34], and [35]), Fig. 5 and the results reported in [27]–[32] may suggest that some DTLZ and WFG problems are not difficult. In this paper, we first examine why surprisingly good results were obtained for some DTLZ and WFG test problems. Then we show that a slight change in the DTLZ and WFG formulations degrades the performance of weight vector-based many-objective algorithms. Our experimental results suggest the overspecialization of those algorithms for these test problems.
This paper is organized as follows. In Section II, we briefly explain the DTLZ [9] and WFG [10] problems, focusing on the shape of the Pareto front of each test problem. In Section III, we explain a common search mechanism of weight vector-based algorithms such as MOEA/D-PBI [25]. In Section IV, we report computational experiments on the original test problems and their minus versions. In Section V, we discuss the experimental results and future research topics.
Many-Objective Test Problems
In general, an $M$-objective minimization problem is written as \begin{equation} {\mathrm{ Minimize}}~f_{1} ({ \boldsymbol {x}}), f_{2} ({ \boldsymbol {x}}), \ldots , f_{M} ({ \boldsymbol {x}})~{\mathrm{ subject~to}}~{ \boldsymbol {x}}\in { \boldsymbol {X}} \end{equation} where $f_{i}({\boldsymbol{x}})$ is the $i$th objective to be minimized, ${\boldsymbol{x}}$ is a decision vector, and ${\boldsymbol{X}}$ is the feasible region.
Scalable many-objective test problems called DTLZ [9] and WFG [10] have been frequently used to evaluate many-objective algorithms in the literature. Table I summarizes many-objective test problems used for performance evaluation of recently proposed weight vector-based algorithms [27]–[32]. For comparison, we also show multiobjective test problems used for evaluating MOEA/D in [25]. We can see from this table that DTLZ1-4 and WFG1-9 have often been used in the literature. In this section, we briefly explain those test problems.
A. DTLZ Test Problems
The DTLZ test suite was designed by Deb et al. [9] as a set of nine scalable test problems (DTLZ1-9). The number of decision variables (say $n$) and the number of objectives (say $M$) can be arbitrarily specified.
Let ${\boldsymbol{y}}^{*} = (y_{1}^{*}, y_{2}^{*}, \ldots, y_{M}^{*})$ denote a point on the Pareto front in the objective space. The Pareto fronts of DTLZ1-4 are written as \begin{align}&{\mathrm{ DTLZ}}1:~\sum _{i=1}^{M}y_{i}^{*} = 0.5~{\mathrm{ and}}~y_{i}^{*} \ge 0~{\mathrm{ for}}~i = 1, 2, \ldots , M \\&{\mathrm{ DTLZ}}\hbox {2-4}:~\sum _{i=1}^{M}\left ({y_{i}^{*} }\right )^{2} = 1~{\mathrm{ and}}~y_{i}^{*} \ge 0~{\mathrm{ for}}~i = 1, 2, \ldots ,M. \end{align}
As shown in Table I, DTLZ1-4 have been frequently used for performance evaluation of many-objective algorithms. This is because their Pareto fronts are represented by the simple formulations in (2) and (3). DTLZ5-6 were originally proposed as many-objective test problems with degenerate Pareto fronts in [9]. However, their Pareto fronts are not degenerate when they have four or more objectives [10], [36], [37]. Constraint conditions were introduced to remove the nondegenerate parts of the Pareto fronts [36], [37]. The Pareto fronts of DTLZ7-9 cannot be represented by a simple form as in (2) or (3).
B. WFG Test Problems
The WFG test suite was proposed by Huband et al. [10] as a set of nine scalable test problems (WFG1-9). As in DTLZ, the number of decision variables and the number of objectives $M$ can be arbitrarily specified.
WFG1 has a complicated Pareto front, which cannot be represented by a simple form as in (2) or (3). The Pareto front of WFG2 is disconnected. WFG3 was originally designed as a many-objective test problem with a degenerate Pareto front. However, its Pareto front is not degenerate when it has three or more objectives, as recently pointed out in [38]. All the other test problems (i.e., WFG4-9) have the following Pareto front:\begin{align} {\mathrm{ WFG}}\hbox {4-9}:\sum _{i=1}^{M}\left ({\frac {y_{i}^{*}}{2i} }\right )^{2} = 1~{\mathrm{ and}}~y_{i}^{*} \ge 0~{\mathrm{ for}}~i = 1, 2, \ldots , M. \end{align}
One important feature of the Pareto fronts of WFG4-9 in (4) is that the domain of each objective has a different magnitude (e.g., $0 \le y_{1}^{*} \le 2$ for the first objective and $0 \le y_{M}^{*} \le 2M$ for the $M$th objective). That is, the objective space of WFG4-9 is not normalized.
C. Common Feature of DTLZ1-4 and WFG4-9
By normalizing the range of the Pareto front for each objective into the unit interval [0, 1], the Pareto fronts of DTLZ1-4 and WFG4-9 can be rewritten as follows:\begin{align}&{\mathrm{ DTLZ}}1:~\sum _{i=1}^{M}y_{i}^{*} = 1~{\mathrm{ and}}~y_{i}^{*} \ge 0~{\mathrm{ for}}~i=1,2,\ldots , M \\&{\mathrm{ DTLZ}}\hbox {2-4}~{\mathrm{ and~WFG}}\hbox {4-9}:~\sum _{i=1}^{M}\left ({y_{i}^{*} }\right )^{2} = 1~{\mathrm{ and}}~y_{i}^{*} \ge 0~{\mathrm{ for}}~i = 1, 2, \ldots ,M. \end{align}
These formulations show that DTLZ1-4 and WFG4-9 are similar test problems with respect to the shape of their Pareto fronts. This means that most test problems used for evaluating many-objective algorithms in [27]–[32] in Table I are similar with respect to the shape of their Pareto fronts, whereas they differ in other aspects such as the curvature of the Pareto front (e.g., linear or concave) and the type of the objective functions (e.g., multimodal or deceptive). In this paper, we explain our concern that the development of weight vector-based many-objective evolutionary algorithms seems to be overspecialized to this shared shape of the Pareto fronts of DTLZ1-4 and WFG4-9.
Weight Vector-Based Many-Objective Algorithms
In MOEA/D [25], a multiobjective problem is decomposed into single-objective problems, each of which is generated by a scalarizing function with a different weight vector. Thus the number of single-objective problems is the same as the number of weight vectors. Since a single best solution is stored for each single-objective problem, the population size is also the same as the number of weight vectors. MOEA/D can be viewed as an improved version of a cellular multiobjective genetic algorithm [39]. MOEA/D is also similar to multiobjective genetic local search (MOGLS) [40]–[43]: both of them optimize scalarizing functions. Whereas weight vectors are systematically generated and fixed in MOEA/D, they are randomly updated in each generation in MOGLS.
A. Weight Vector Specification
In the original MOEA/D [25], all weight vectors ${\boldsymbol{w}} = (w_{1}, w_{2}, \ldots, w_{M})$ satisfying the following two conditions are generated:\begin{align}&\sum _{i=1}^{M}w_{i} = 1~{\mathrm{ and}}~w_{i} \ge 0~{\mathrm{ for}}~i = 1, 2, \ldots , M \\&w_{i} \in \left \{{0,\frac {1}{H} ,\frac {2}{H} ,\ldots , \frac {H}{H} }\right \}~{\mathrm{ for}}~i = 1, 2, \ldots , M \end{align} where $H$ is a positive integer specifying the granularity of the weight vectors.
Weight vectors in (7) and (8) are uniformly distributed on the hyper-plane specified by (7) in the $M$-dimensional objective space. The number of such weight vectors rapidly increases with the number of objectives $M$, which makes it difficult to use fine-grained weight vectors for many-objective problems.
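In code, the specification in (7) and (8) corresponds to a simplex-lattice (stars-and-bars) enumeration. The following sketch (the function name is ours) generates all such vectors:

```python
from itertools import combinations

def simplex_lattice_weights(M, H):
    """All weight vectors w with w_i in {0, 1/H, ..., H/H} and sum(w) = 1,
    enumerated via stars and bars: choose M-1 divider positions among
    H + M - 1 slots; the gap sizes give the numerators of the weights."""
    weights = []
    for c in combinations(range(H + M - 1), M - 1):
        bounds = (-1,) + c + (H + M - 1,)
        parts = [b - a - 1 for a, b in zip(bounds, bounds[1:])]
        weights.append([p / H for p in parts])
    return weights

# M = 3 with H = 12 divisions gives C(14, 2) = 91 vectors, as in Fig. 3.
print(len(simplex_lattice_weights(3, 12)))  # 91
```

The rapid growth mentioned above is visible here: the count is the binomial coefficient $\binom{H+M-1}{M-1}$, which explodes as $M$ increases.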
For handling this difficulty, a two-layered approach was proposed in NSGA-III [30] and has been used in recently proposed weight vector-based many-objective algorithms. In the two-layered approach, two sets of weight vectors are generated from (7) and (8): one on the boundary (outer) layer and the other on an inner layer. Each weight vector in the inner layer is shrunk toward the center point $(1/M, \ldots, 1/M)$ as follows:\begin{equation} w_{i} =\left ({w_{i} +1/M}\right )/2~{\mathrm{ for}}~i = 1, 2, \ldots , M. \end{equation}
The two-layered approach is used for many-objective problems with eight or more objectives in this paper. It should be noted that all weight vectors generated by the two-layered approach are on the same hyper-plane specified by (7), which is the same as all weight vectors in the original MOEA/D.
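The two-layered construction can be sketched as follows (function names and the example division numbers are ours; the actual numbers of weight vectors per problem are listed in Table II):

```python
from itertools import combinations

def lattice(M, H):
    # All weight vectors with H divisions on the simplex (stars and bars).
    vecs = []
    for c in combinations(range(H + M - 1), M - 1):
        bounds = (-1,) + c + (H + M - 1,)
        vecs.append([(b - a - 1) / H for a, b in zip(bounds, bounds[1:])])
    return vecs

def two_layered(M, H1, H2):
    """Boundary layer with H1 divisions plus an inner layer with H2
    divisions, shrunk toward the centre (1/M, ..., 1/M) by (9):
    w_i <- (w_i + 1/M) / 2. The shrunk vectors still satisfy (7),
    since their elements again sum to (1 + 1) / 2 = 1."""
    inner = [[(w + 1.0 / M) / 2.0 for w in v] for v in lattice(M, H2)]
    return lattice(M, H1) + inner
```

Because both layers lie on the hyper-plane (7), the combined set stays consistent with the original MOEA/D specification while keeping the total number of vectors manageable.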
Another frequently used modification is the use of a small number of divisions (i.e., a small value of $H$) in each layer so that the total number of weight vectors does not become impractically large for many-objective problems.
B. Basic Idea of MOEA/D
The basic idea of MOEA/D (and its variants) is to find a set of well-distributed nondominated solutions along the Pareto front using the systematically generated weight vectors, as shown in Fig. 6. The realization of this idea is often explained by two distances in Fig. 7: 1) the distance $d_{1}$ from the reference point along the reference line and 2) the perpendicular distance $d_{2}$ from a solution to the reference line.
In Fig. 7, $d_{1}$ measures the convergence of a solution along the reference line, whereas $d_{2}$ measures its deviation from the reference line. Minimizing $d_{2}$ for every solution leads to a distribution of solutions along the reference lines.
It is also important to assign a single solution to each reference line. Weight vector-based algorithms [27]–[32] have their own mechanisms to minimize $d_{2}$ while assigning a single (or a small number of) solution(s) to each reference line.
The ideal point ${\boldsymbol{z}}^{*}$ is usually unknown beforehand. It is often estimated using the best value of each objective among the solutions examined during the search.
C. Scalarizing Functions
Three scalarizing functions were examined in the original MOEA/D [25]: 1) the weighted sum; 2) the weighted Tchebycheff function; and 3) the penalty-based boundary intersection (PBI) function. For the minimization problem in (1), the weighted sum with a weight vector ${\boldsymbol{w}} = (w_{1}, \ldots, w_{M})$ is written as \begin{equation} {\mathrm{ Minimize}}~f^{\mathrm{ WS}} ({ \boldsymbol {x}}|{ \boldsymbol {w}})=w_{1} f_{1} ({ \boldsymbol {x}})+\cdots +w_{M} f_{M} ({ \boldsymbol {x}}). \end{equation}
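In code, (10) is a direct transcription (the function name is ours):

```python
def weighted_sum(f, w):
    """f^WS(x|w) in (10): the weighted sum of objective values,
    to be minimized. `f` is the objective vector f(x), `w` the weights."""
    return sum(wi * fi for wi, fi in zip(w, f))

print(weighted_sum([1.0, 2.0], [0.5, 0.5]))  # 1.5
```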
The weighted Tchebycheff function is written using a reference point ${\boldsymbol{z}}^{*} = (z_{1}^{*}, \ldots, z_{M}^{*})$ as \begin{equation} {\mathrm{ Minimize}}~f^{\mathrm{ Tch}} \left ({{ \boldsymbol {x}}|{ \boldsymbol {w}}, { \boldsymbol {z}}^{*} }\right )=\max \limits _{i=1,2,\ldots ,M} \left \{{ w_{i} \cdot \left |{z_{i}^{*} -f_{i} ({ \boldsymbol {x}})}\right |}\right \}. \end{equation}
Each element $z_{i}^{*}$ of the reference point is usually specified as the minimum value of $f_{i}({\boldsymbol{x}})$ among the solutions examined during the search (i.e., the reference point is an estimate of the ideal point).
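Likewise, (11) can be transcribed as (the function name is ours):

```python
def tchebycheff(f, w, z_star):
    """f^Tch(x|w, z*) in (11): the largest weighted deviation of the
    objective vector f(x) from the reference point z*, to be minimized."""
    return max(wi * abs(zi - fi) for wi, fi, zi in zip(w, f, z_star))

print(tchebycheff([1.0, 2.0], [0.5, 0.5], [0.0, 0.0]))  # 1.0
```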
Using a penalty parameter $\theta$, the PBI function is written as \begin{equation} {\mathrm{ Minimize}}~f^{\mathrm{ PBI}} \left ({{ \boldsymbol {x}}|{ \boldsymbol {w}}, { \boldsymbol {z}}^{*} }\right )=d_{1} +\theta \, d_{2} \end{equation}
where \begin{align} d_{1}=&\left |{\left ({\,{ \boldsymbol {f}}({ \boldsymbol {x}})-{ \boldsymbol {z}}^{*} }\right )^{T} { \boldsymbol {w}}}\right | \big / \left \|{ { \boldsymbol {w}}}\right \| \\ d_{2}=&\left \|{ \,{ \boldsymbol {f}}({ \boldsymbol {x}})-{ \boldsymbol {z}}^{*} -d_{1} \frac {{ \boldsymbol {w}}}{||{ \boldsymbol {w}}||} }\right \|. \end{align}
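The PBI computation in (12)-(14) can be sketched in pure Python as follows (the function name is ours):

```python
import math

def pbi(f, w, z_star, theta):
    """f^PBI(x|w, z*) in (12)-(14): d1 is the projection length of
    f(x) - z* onto the reference line in direction w; d2 is the
    perpendicular distance from f(x) to that line."""
    diff = [fi - zi for fi, zi in zip(f, z_star)]
    norm_w = math.sqrt(sum(wi * wi for wi in w))
    d1 = abs(sum(di * wi for di, wi in zip(diff, w))) / norm_w
    d2 = math.sqrt(sum((di - d1 * wi / norm_w) ** 2
                       for di, wi in zip(diff, w)))
    return d1 + theta * d2
```

For a point on the reference line (e.g., f = (1, 1) with w = (1, 1) and z* = (0, 0)), $d_{2} = 0$ and the PBI value reduces to $d_{1}$ regardless of $\theta$.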
D. Neighborhood Structure
In the original MOEA/D [25], each weight vector has its neighbors. A prespecified number of similar weight vectors are defined as neighbors for each weight vector. A weight vector itself is included in its own neighbors. When a solution is to be generated for a weight vector, parents are selected from its neighbors. The generated new solution is compared with the solution of each neighbor. If the new solution is better, the current solution is replaced. The comparison for replacement is performed against all solutions of the neighbors.
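The replacement rule just described can be sketched as follows; the population layout and the `scalarize` and `make_child` callables are illustrative assumptions of ours, not the original implementation:

```python
import random

def moead_sweep(pop, weights, neighbors, scalarize, make_child):
    """One generation sweep with the original MOEA/D replacement rule:
    parents are drawn from the neighborhood of each weight vector, and
    the child replaces the solution of EVERY neighbor it improves on."""
    for i in range(len(weights)):
        p, q = random.sample(neighbors[i], 2)
        child = make_child(pop[p], pop[q])
        for j in neighbors[i]:  # comparison against all neighbors
            if scalarize(child, weights[j]) < scalarize(pop[j], weights[j]):
                pop[j] = child

# Toy run: two weight vectors, each neighboring both; a child that
# improves both scalarized values replaces both stored solutions.
pop = [[2.0, 0.0], [0.0, 2.0]]
moead_sweep(pop,
            weights=[[1.0, 0.0], [0.0, 1.0]],
            neighbors=[[0, 1], [0, 1]],
            scalarize=lambda x, w: sum(wi * xi for wi, xi in zip(w, x)),
            make_child=lambda a, b: [min(ai, bi) for ai, bi in zip(a, b)])
print(pop)  # [[0.0, 0.0], [0.0, 0.0]]
```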
This replacement strategy has two potential difficulties. One is illustrated by the following case: a new solution generated far from its corresponding reference line is very good for a different reference line, while its evaluation for the corresponding reference line is poor. This case is explained in Fig. 8. Let us assume in Fig. 8 that the set of the seven open circles is the current population. We also assume that solution A is newly generated for one of the reference lines.
E. Performance Improvement Mechanisms
These two difficulties can be remedied by the following replacement policy. A new solution is compared with its similar solutions, and only a single solution is replaced with the new solution. Instead of using the prespecified neighbors, a set of similar solutions is selected for the new solution as follows. First, the nearest reference line to the new solution is identified in the objective space.
In NSGA-III [30], reference points are specified in the normalized objective space in a similar manner to the weight vector specification in MOEA/D. Using the reference points and the origin of the normalized objective space, reference lines are generated. Each solution is assigned to its nearest reference line. Solution comparison in NSGA-III is performed using nondominated sorting, the number of solutions assigned to each reference line, and the distance from each solution to its nearest reference line. Solutions are updated by a generational scheme in which parents and offspring are merged before the next population is selected.
In
In MOEA/DD [31], a (
As we have already explained, the weight vector-based algorithms in [27]–[30] and [32] have normalization mechanisms. The two-layered approach of NSGA-III [30] is used in [27]–[32]. In addition to these two common features, each algorithm has its own mechanisms for efficiently realizing the search for a set of well-distributed solutions along the Pareto front of a many-objective problem.
F. Relation Between Test Problems and Algorithms
As shown in Section II, the DTLZ1-4 and WFG4-9 test problems have the following Pareto front in the normalized objective space, where $k = 1$ for DTLZ1 and $k = 2$ for DTLZ2-4 and WFG4-9:\begin{align} {\mathrm{ Pareto~Front}}:~\sum _{i=1}^{M}\left ({y_{i}^{*} }\right )^{k} = 1~{\mathrm{ and}}~y_{i}^{*} \ge 0~{\mathrm{ for}}~i = 1, 2, \ldots , M. \end{align}
Weight vectors in the weight vector-based algorithms [27]–[32] are generated on the following hyper-plane in the $M$-dimensional objective space:\begin{align} {\mathrm{ Weight~Vectors}}:\sum _{i=1}^{M}w_{i} = 1~{\mathrm{ and}}~w_{i} \ge 0~{\mathrm{ for}}~i = 1, 2, \ldots , M. \end{align}
The shape of the Pareto front of each test problem in (15) is the same as or similar to the shape of the hyper-plane in (16) on which the weight vectors are generated. This explains why the search for a single best solution for each weight vector leads to a set of well-distributed solutions over the entire Pareto front. From the similarity between the shape of the Pareto front in (15) and the shape of the distribution of the weight vectors in (16), the following questions may arise.
How general is the high performance of weight vector-based algorithms on the DTLZ1-4 and WFG4-9 problems?
How general are the triangular Pareto fronts of the DTLZ1-4 and WFG4-9 test problems?
In the next section, we discuss the first question through computational experiments. The second question is briefly discussed in Section V as a future research topic.
Computational Experiments
In this section, we first explain our test problems generated by slightly changing the DTLZ and WFG formulations. Then we demonstrate how the high performance of weight vector-based algorithms on the DTLZ and WFG problems deteriorates under this slight change in the test problem formulations.
A. Our Test Problems: DTLZ^{-1} and WFG^{-1}
As we have already explained, the DTLZ and WFG test problems have the following form:\begin{equation} {\mathrm{ Minimize}}~f_{1} ({ \boldsymbol {x}}), \ldots , f_{M} ({ \boldsymbol {x}})~{\mathrm{ subject~to}}~{ \boldsymbol {x}}\in { \boldsymbol {X}}. \end{equation}
Our idea is to generate a slightly different test problem from each of the DTLZ and WFG problems. More specifically, we change their general form from (17) to \begin{equation} {\mathrm{ Maximize}}~f_{1} ({ \boldsymbol {x}}), \ldots , f_{M} ({ \boldsymbol {x}})~{\mathrm{ subject~to}}~{ \boldsymbol {x}}\in { \boldsymbol {X}}. \end{equation}
We use exactly the same objective functions and the same constraint conditions except for changing from “Minimize” in (17) to “Maximize” in (18). The generated problems in (18) are referred to as the Max-DTLZ and Max-WFG problems. Those problems are handled as the following minimization problems:\begin{equation} {\mathrm{ Minimize}} \,\, -f_{1} ({ \boldsymbol {x}}), \ldots , -f_{M} ({ \boldsymbol {x}})~{\mathrm{ subject~to}}~{ \boldsymbol {x}}\in { \boldsymbol {X}}. \end{equation}
All objectives in DTLZ and WFG are multiplied by (−1) in our test problems in (19). That is, a negative sign is added to all objectives in DTLZ and WFG. In this paper, the minus versions in (19) of DTLZ and WFG are referred to as DTLZ$^{-1}$ and WFG$^{-1}$, respectively.
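The change from (17) to (19) is purely mechanical; a minimal sketch with placeholder objective functions (all names are ours):

```python
def minus_version(objectives):
    """Turn a list of objective functions f_i (to be maximized in (18))
    into the minimized objectives -f_i of (19). Everything else,
    including the feasible region X, is left unchanged."""
    return [lambda x, f=f: -f(x) for f in objectives]

# Toy two-objective example (the objective functions are placeholders).
fs = minus_version([lambda x: x[0] ** 2, lambda x: (x[0] - 1.0) ** 2])
print([f([2.0]) for f in fs])  # [-4.0, -1.0]
```

The `f=f` default argument captures each objective at definition time; without it, every wrapped function would refer to the last objective in the list.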
In Fig. 9, we show the Pareto fronts of the original three-objective DTLZ2 problem and its maximization version in (18): Max-DTLZ2. The Pareto fronts of the two test problems have the same shape but different sizes. In Fig. 10, we show the Pareto fronts of the minus versions of the three-objective DTLZ1 and DTLZ2 (i.e., DTLZ1$^{-1}$ and DTLZ2$^{-1}$).
Pareto fronts of three-objective DTLZ2 and Max-DTLZ2 problems. (a) Original three-objective DTLZ2. (b) Maximization version of DTLZ2.
Pareto fronts of the minus versions of DTLZ1 and DTLZ2. (a) Three-objective DTLZ1$^{-1}$. (b) Three-objective DTLZ2$^{-1}$.
As shown in Figs. 9 and 10 for DTLZ1 and DTLZ2, the minus versions of DTLZ have much larger Pareto fronts than the original DTLZ problems. When a DTLZ problem has a concave Pareto front, its minus version has a convex Pareto front. Moreover, the difficulty of optimization is not the same. For example, the Pareto optimal solutions of DTLZ2 are obtained when all of its distance variables are 0.5. However, the Pareto optimal solutions of DTLZ2$^{-1}$ are obtained when each of its distance variables is 0 or 1 (i.e., when the distance function is maximized).
A test problem called the inverted DTLZ1 was formulated in [44] by applying the following transformation to each objective $f_{i}({\boldsymbol{x}})$ of DTLZ1:\begin{equation} f_{i} ({ \boldsymbol {x}})\leftarrow 0.5\, (1+g({ \boldsymbol {x}}))-f_{i} ({ \boldsymbol {x}}) \end{equation} where $g({\boldsymbol{x}})$ is the distance function of DTLZ1.
B. Examined Algorithms and Parameter Specifications
Our computational experiments on DTLZ1-4 and WFG1-9 with 3, 5, 8, and 10 objectives are performed under the same settings as in [29]. We examine the performance of NSGA-II [7], NSGA-III [30], $\theta$-DEA [29], MOEA/DD [31], and four versions of MOEA/D with different scalarizing functions (MOEA/D-WS, MOEA/D-Tch, MOEA/D-PBI, and MOEA/D-IPBI [45]).
The IPBI function [45] is defined using the distance from the nadir point ${\boldsymbol{z}}^{N}$ as follows:\begin{equation} {\mathrm{ Maximize}}~f^{\mathrm{ IPBI}} \left ({{ \boldsymbol {x}}|{ \boldsymbol {w}}, { \boldsymbol {z}}^{N} }\right )=d_{1} -\theta \, d_{2} \end{equation}
where \begin{align} d_{1}=&\left |{\left ({{ \boldsymbol {z}}^{N} -{ \boldsymbol {f}}({ \boldsymbol {x}})}\right )^{T} { \boldsymbol {w}}}\right | / \left \|{ { \boldsymbol {w}}}\right \| \\ d_{2}=&\left \|{ { \boldsymbol {z}}^{N} -{ \boldsymbol {f}}({ \boldsymbol {x}})-d_{1} \frac {{ \boldsymbol {w}}}{||{ \boldsymbol {w}}||} \, }\right \|. \end{align}
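The IPBI computation in (21)-(23) mirrors the PBI code with ${\boldsymbol{z}}^{N}$ in place of ${\boldsymbol{z}}^{*}$ and a subtracted penalty term; a sketch (the function name is ours):

```python
import math

def ipbi(f, w, z_nadir, theta):
    """f^IPBI(x|w, z^N) in (21)-(23): d1 is the projection length of
    z^N - f(x) onto the reference line in direction w; d2 is the
    perpendicular distance. The value is MAXIMIZED, so solutions are
    pushed away from the nadir point toward the Pareto front."""
    diff = [zi - fi for zi, fi in zip(z_nadir, f)]
    norm_w = math.sqrt(sum(wi * wi for wi in w))
    d1 = abs(sum(di * wi for di, wi in zip(diff, w))) / norm_w
    d2 = math.sqrt(sum((di - d1 * wi / norm_w) ** 2
                       for di, wi in zip(diff, w)))
    return d1 - theta * d2
```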
In this paper, we use the same implementation of MOEA/D-IPBI as in [45], including the specification of the penalty parameter $\theta$.
Multiobjective search is performed in MOEA/D-IPBI by pushing each solution in the direction from the nadir point to the Pareto front, i.e., by maximizing $d_{1}$ while minimizing $d_{2}$ in (22) and (23).
Exactly the same set of weight vectors is used in all algorithms for each test problem. Table II shows the number of weight vectors. The two-layered approach is used for test problems with eight and ten objectives. The population size is the same as the number of weight vectors except for NSGA-II, NSGA-III, and $\theta$-DEA.
As a termination condition for each test problem, we use the same prespecified total number of generations as in [29]. The penalty parameter
We use our own implementations of the four versions of MOEA/D because we have already examined them on various test problems. This is also because we have not observed any clear inconsistency between our results and the reported results in [29].
C. Performance Measures
As a performance measure, we use the hypervolume in the same manner as [29] for the original DTLZ and WFG problems (the setting of the reference point for the hypervolume calculation is also the same as in [29]). First, the objective space is normalized using the ideal point and the nadir point of the Pareto front of each test problem. Then the hypervolume is calculated in the normalized objective space.
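This normalization step can be sketched as follows (assuming the ideal and nadir points of each test problem are known; the function name is ours):

```python
def normalize(points, ideal, nadir):
    """Map each objective vector so that the ideal point becomes
    (0, ..., 0) and the nadir point becomes (1, ..., 1)."""
    return [[(p - zi) / (zn - zi) for p, zi, zn in zip(pt, ideal, nadir)]
            for pt in points]

print(normalize([[0.5, 1.5]], ideal=[0.0, 1.0], nadir=[1.0, 2.0]))  # [[0.5, 0.5]]
```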
We also calculate the IGD indicator in the normalized objective space for all test problems, using the Euclidean distance. A set of reference points for the IGD calculation is generated for each test problem as follows. For DTLZ1-4 and WFG4-9 and their minus versions, reference points are uniformly generated on the Pareto fronts described in Section II.
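The IGD indicator itself is straightforward to transcribe (a sketch; `math.dist` computes the Euclidean distance):

```python
import math

def igd(reference_points, solutions):
    """Inverted generational distance: the average, over the reference
    points, of the Euclidean distance to the nearest obtained solution.
    Smaller is better; small values require both convergence and
    diversity of the obtained solution set."""
    return sum(min(math.dist(r, s) for s in solutions)
               for r in reference_points) / len(reference_points)
```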
D. Experimental Results on Original Test Problems
The average hypervolume values over 101 runs on the original DTLZ1-4 and WFG1-9 problems are summarized in Table III. The best average result is highlighted in bold and underlined for each test problem. The worst four average results are shaded. For DTLZ1-4, the best average results are obtained by MOEA/DD for 12 out of the 16 problems (75%). For WFG4-9, the best average results are mainly obtained by $\theta$-DEA.
The best average result is not obtained by MOEA/D or MOEA/DD for WFG4-9 due to the lack of an objective space normalization mechanism. The performance of MOEA/DD on WFG4-9 can be improved by a normalization mechanism. Just for comparison, we perform computational experiments on the normalized WFG4-9 test problems after rescaling all objectives as $f_{i}({\boldsymbol{x}})/2i$ for $i = 1, 2, \ldots, M$.
E. Experimental Results on Modified Test Problems
Table IV shows the average hypervolume values on the minus versions of the test problems for a reference point specified close to the Pareto front.
The only difference in the test problems between Tables III and IV is the multiplication of each objective by (−1). Except for this change, the test problems are the same between the two tables. However, totally different results are obtained. In particular, the performance of the best algorithms in Table III (e.g., MOEA/DD) is severely deteriorated in Table IV.
In Table V, we show the experimental results evaluated by the hypervolume with a reference point specified farther away from the Pareto front.
Performance evaluation results by the IGD indicator are shown in Table VI for DTLZ and WFG and in Table VII for DTLZ$^{-1}$ and WFG$^{-1}$.
Discussion and Future Research Topics
A. Discussion on Experimental Results
For further discussion of the experimental results in Tables III–VII, we show a single-run result of each algorithm on the three-objective DTLZ1$^{-1}$ problem in Fig. 11.
Experimental results of a single run of each algorithm on the three-objective DTLZ1$^{-1}$ problem.
Well-distributed solutions are obtained by MOEA/D-IPBI in Fig. 11(i) and (j). Such a good solution set is not obtained by the other algorithms. In Fig. 11(d), ten solutions are obtained inside the Pareto front together with many solutions on its boundary. This result is explained by the inconsistency between the shape of the Pareto front and the shape of the distribution of the weight vectors in Fig. 12. The shaded region is the projection of the Pareto front; weight vectors in this region intersect with the Pareto front.
Almost the same figure as Fig. 12 was used in [44] for explaining experimental results of NSGA-III on the inverted DTLZ1 problem. As shown in Fig. 12 (and in [44]), the weight vectors are uniformly distributed over the triangle whereas the shape of the Pareto front is a rotated triangle. We can see from Fig. 12 that the ten weight vectors are inside the projection of the Pareto front. Those weight vectors correspond to the ten inside solutions in Fig. 11(d). Since each weight vector in MOEA/D-PBI always has a single solution, many solutions are obtained on the boundary of the Pareto front in Fig. 11(d). Those boundary solutions are the best solutions for the outside weight vectors in Fig. 12. For the same reason, many solutions are obtained on the boundary in Fig. 11(e).
Multiobjective search in weight vector-based algorithms in [27]–[32] as well as MOEA/D-PBI and MOEA/D-Tch can be viewed as pulling all solutions toward the ideal point using the weight vectors. The weight vectors in those algorithms are illustrated in Fig. 13(a). Figs. 12 and 13(a) show the same weight vectors. The ten inside solutions in Fig. 11(d) are obtained by the ten inside weight vectors in Fig. 12. Almost the same ten inside solutions are obtained in Fig. 11(a)–(d) since all algorithms in Fig. 11(a)–(d) have the same basic idea of multiobjective search: pulling all solutions toward the ideal point using the weight vectors in Fig. 13(a).
Weight vectors used for three-objective minimization. (a) Most weight vector-based algorithms. (b) Weighted sum and IPBI.
When IPBI is used, multiobjective search is performed by pushing all solutions from the nadir point to the Pareto front as shown in Fig. 13(b). In this case, the distribution of weight vectors (i.e., reference lines) is consistent with the shape of the Pareto front. This explains why well-distributed solutions are obtained by MOEA/D-IPBI in Fig. 11(i) and (j).
In NSGA-III, after a single solution is assigned to each reference line, the remaining population slots are filled by randomly selecting a second solution for each reference line.
In the same manner as Fig. 11, we show an experimental result of a single run of each algorithm on the three-objective DTLZ2$^{-1}$ problem in Fig. 14.
Experimental results on the three-objective DTLZ2$^{-1}$ problem.
As in Fig. 11, the distribution of the weight vectors is not consistent with the shape of the Pareto front in Fig. 14 except for MOEA/D-WS and MOEA/D-IPBI. Thus, many solutions are obtained on the boundary of the Pareto front in Fig. 14(d) and (e). However, the distribution of solutions around the center of the Pareto front is similar between the two groups: Fig. 14(b)–(d) with the inconsistent weight vector distribution and Fig. 14(f), (g), (i), and (j) with the consistent weight vector distribution. This can be explained by the convexity of the Pareto front of DTLZ2$^{-1}$.
Thanks to the random selection of the second solution for each reference point in NSGA-III, more solutions are obtained around the center of the Pareto front in Fig. 14(a) than by the other algorithms in Fig. 14. This observation explains why good average results are obtained by NSGA-III in Table IV, where the hypervolume is measured from a reference point close to the Pareto front.
Hypervolume contributions of solutions around the center of the Pareto front are relatively large when the reference point is close to the Pareto front. By moving the reference point away from the Pareto front, their contributions become relatively small since the contributions of solutions on the boundary of the Pareto front increase. This explains why the evaluation results of NSGA-III are not good in Table V with the reference point far away from the Pareto front.
We also show an experimental result of a single run of each algorithm on the ten-objective DTLZ1$^{-1}$ problem in Fig. 15 and a ten-objective WFG$^{-1}$ problem in Fig. 16.
Experimental results on the ten-objective DTLZ1$^{-1}$ problem.
Experimental results on a ten-objective WFG$^{-1}$ problem.
The large diversity of the solutions obtained by NSGA-II in Figs. 15(h) and 16(h) also explains its good evaluation results by the IGD indicator in Table VII. Fig. 16(h) also shows the low convergence ability of NSGA-II. In Fig. 16, the Pareto front satisfies the condition obtained from (4) by multiplying each objective by (−1).
B. Future Research Topics: Test Problems
As reported in [27]–[32] on evolutionary many-objective optimization, very good results are obtained by weight vector-based algorithms such as NSGA-III and MOEA/DD on test problems whose normalized Pareto fronts are written as \begin{equation} \sum _{i=1}^{M}\left ({y_{i}^{*} }\right )^{k} = 1~{\mathrm{ and}}~y_{i}^{*} \ge 0~{\mathrm{ for}}~i = 1, 2, \ldots , M \end{equation} where $k = 1$ or $k = 2$.
One question for future research is the similarity of these test problems to real-world many-objective problems. In other words, the question concerns their generality as test problems. Much more research is needed to answer this question. However, we can say that the Pareto front in (24) is very special in the following sense: the values of an arbitrarily selected (M − 1) objectives of a Pareto optimal solution uniquely determine the value of the remaining objective.
The minus versions of DTLZ1-4 and WFG4-9 do not have this strange feature. However, they may have some other strange features because they are derived from original problems with the above-mentioned feature. For example, the Pareto front of the DTLZ1$^{-1}$ problem is a rotated (inverted) triangle, which is also a special shape.
Our experimental results show that the consistency between the shape of the Pareto front and the shape of the distribution of the weight vectors has a large effect on the performance of weight vector-based algorithms. This was demonstrated by Jain and Deb [44] for NSGA-III on the three-objective and five-objective inverted DTLZ1 problems. Our experimental results also suggest that the size of the Pareto front may have a large effect on the performance of EMO algorithms on many-objective problems. In Fig. 17, we show the Pareto fronts of the two-objective DTLZ2 and DTLZ2$^{-1}$ problems.
Pareto fronts of the two-objective DTLZ2 and its maximization version Max-DTLZ2, which is equivalent to the two-objective DTLZ2$^{-1}$ problem.
However, in DTLZ2$^{-1}$, the Pareto front is much larger than that of the original DTLZ2 problem.
Let us briefly explain why good solutions can be obtained for many-objective DTLZ, WFG, DTLZ$^{-1}$, and WFG$^{-1}$ problems. Each decision variable of these problems is either a position variable or a distance variable. In Figs. 18 and 19, we show solutions generated from the same parent solution by changing only its distance variables and only its position variables, respectively.
Generated solutions by changing distance variables. (a1) 2-Obj. DTLZ1. (a2) 5-Obj. DTLZ1. (a3) 10-Obj. DTLZ1. (b) 2-Obj. DTLZ2. (c) 2-Obj. DTLZ3. (d) 2-Obj. DTLZ4. (e) 2-Obj. WFG1. (f) 2-Obj. WFG2. (g) 2-Obj. WFG3. (h) 2-Obj. WFG4. (i) 2-Obj. WFG5. (j) 2-Obj. WFG6. (k) 2-Obj. WFG7. (l) 2-Obj. WFG8. (m) 2-Obj. WFG9.
Generated solutions by changing position variables. (a1) 2-Obj. DTLZ1. (a2) 5-Obj. DTLZ1. (a3) 10-Obj. DTLZ1. (b) 2-Obj. DTLZ2. (c) 2-Obj. DTLZ3. (d) 2-Obj. DTLZ4. (e) 2-Obj. WFG1. (f) 2-Obj. WFG2. (g) 2-Obj. WFG3. (h) 2-Obj. WFG4. (i) 2-Obj. WFG5. (j) 2-Obj. WFG6. (k) 2-Obj. WFG7. (l) 2-Obj. WFG8. (m) 2-Obj. WFG9.
In Fig. 18, the solutions generated from the same parent are on the same line except for WFG7 and WFG9. Since this also holds for many-objective problems under the DTLZ and WFG problem formulations, the solutions generated by changing distance variables can be compared using the Pareto dominance relation. This explains why good results are obtained by NSGA-II for some many-objective test problems. However, this is totally different from the general case of many-objective optimization, where almost all solutions are mutually nondominated. Moreover, each test problem in DTLZ and WFG has only a single distance function independent of the number of objectives. By optimizing this single distance function, we can obtain Pareto optimal solutions. This means that the search for Pareto optimal solutions is a single-objective optimization problem independent of the number of objectives.
Fig. 19 shows the solutions generated by changing position variables. Except for WFG2, the solutions generated from the same parent have the same distance from the Pareto front. This also holds for many-objective problems from the DTLZ and WFG problem formulations. If a solution is Pareto optimal (i.e., if it is on the Pareto front), all solutions generated from it by changing position variables are also Pareto optimal. Thus, we can easily increase the diversity of solutions by changing position variables without deteriorating their convergence. From these discussions on Figs. 18 and 19, we can see that most DTLZ and WFG test problems are not difficult as many-objective problems. This observation is consistent with the results reported for the weight vector-based algorithms in [27]–[32], where very good solutions were obtained (which are almost the same as the reference point set for the IGD calculation, as shown in Fig. 5). It is also consistent with our experimental results by NSGA-II (e.g., the best results are obtained by NSGA-II for some of the WFG problems in Table VI with the IGD indicator).
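The distance-preserving effect of the position variables can also be checked numerically. In the two-objective DTLZ2 sketch below (our own illustrative code), varying only the position variable x[0] while holding the distance variables fixed leaves the norm of the objective vector, and hence the distance from the unit-circle Pareto front, unchanged:

```python
import math

def dtlz2_2obj(x):
    g = sum((xi - 0.5) ** 2 for xi in x[1:])
    return ((1 + g) * math.cos(x[0] * math.pi / 2),
            (1 + g) * math.sin(x[0] * math.pi / 2))

# Change only the position variable x[0]; the distance variables (and thus g) stay fixed.
base = [0.2, 0.6, 0.6, 0.6]
variants = [[t] + base[1:] for t in (0.0, 0.25, 0.5, 0.75, 1.0)]

# Every variant has the same norm 1 + g, i.e., the same distance from
# the Pareto front, so diversity increases at no cost in convergence.
norms = [math.hypot(*dtlz2_2obj(v)) for v in variants]
print(all(abs(n - norms[0]) < 1e-12 for n in norms))   # → True
```

This separation of convergence (distance variables) from diversification (position variables) is exactly what makes uniform coverage of the front easy to obtain on these test problems.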
Our observations from Figs. 18 and 19 also explain the features of some weight vector-based algorithms. Since the Pareto dominance relation holds among solutions generated by changing distance variables, Pareto dominance-based fitness evaluation is used in some weight vector-based algorithms (whereas it does not work well on many-objective problems in general). For the same reason, improving convergence is not difficult. Thus a large penalty value (i.e., a strong emphasis on the minimization of the distance from the nearest reference line) does not cause serious convergence difficulties on these test problems.
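The role of the penalty value can be made concrete with the standard penalty-based boundary intersection (PBI) scalarizing function used in MOEA/D. The sketch below is our own minimal implementation; d1 measures progress along the reference line through a weight vector and d2 the perpendicular distance from it:

```python
import math

def pbi(f, w, z, theta):
    """Penalty-based boundary intersection (PBI) scalarizing function:
    d1 = projection of (f - z) onto the reference line through weight w,
    d2 = perpendicular distance from f to that line; value = d1 + theta * d2."""
    diff = [fi - zi for fi, zi in zip(f, z)]
    wnorm = math.sqrt(sum(wi * wi for wi in w))
    d1 = sum(di * wi for di, wi in zip(diff, w)) / wnorm
    d2 = math.sqrt(max(0.0, sum(di * di for di in diff) - d1 * d1))
    return d1 + theta * d2

# A point off the reference line is penalized heavily when theta is large,
# which emphasizes minimizing the distance to the nearest reference line.
f, w, z = (1.0, 1.0), (1.0, 0.0), (0.0, 0.0)
print(pbi(f, w, z, theta=0.1))   # → 1.1  (d1 = 1, d2 = 1)
print(pbi(f, w, z, theta=5.0))   # → 6.0
```

With a large theta, the d2 term dominates the evaluation, so the search concentrates on staying close to the reference lines; this is harmless when convergence is easy, as on the DTLZ and WFG problems.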
As we have explained in this paper, the DTLZ and WFG problems are test problems with special characteristic features. Thus an important future research topic is the creation of a wide variety of test problems with respect to the shape of the Pareto front, the size of the Pareto front, the relation among decision variables, and the relation among objectives. The size of the Pareto front can be rephrased as the shape of the feasible region in the objective space.
C. Future Research Topics: Algorithm Developments
Our experimental results showed that good results are obtained when the shape of the distribution of the weight vectors is the same as or similar to the shape of the Pareto front. In general, we do not know the shape of the Pareto front in advance, and it is much more difficult to find the exact shape of the Pareto front of a many-objective problem than that of a two- or three-objective problem. So it may be desirable for many-objective optimizers to have robust search ability with respect to the shape of the Pareto front. A simple idea is to simultaneously use multiple sets of weight vectors with different distributions. For example, it may be a good idea to simultaneously use both the PBI and IPBI functions in a single MOEA/D algorithm. In [50], the simultaneous use of the weighted sum and the weighted Tchebycheff function was examined, and better results were obtained by their simultaneous use than by their individual use (i.e., MOEA/D-WS and MOEA/D-Tch) on many-objective knapsack problems.
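The two scalarizing functions in question are standard; the following Python sketch gives their definitions and a hypothetical alternating assignment of them to subproblems (how [50] actually distributes the functions over subproblems is not specified here, so the assignment rule below is purely illustrative):

```python
def weighted_sum(f, w):
    """Weighted sum scalarizing function (minimization)."""
    return sum(wi * fi for wi, fi in zip(w, f))

def weighted_tchebycheff(f, w, z):
    """Weighted Tchebycheff scalarizing function; z is the ideal point."""
    return max(wi * abs(fi - zi) for wi, fi, zi in zip(w, f, z))

# Hypothetical sketch: alternate the two scalarizing functions over the
# subproblems, so a single MOEA/D population hedges against both
# front shapes favored by each individual function.
f, z = (0.4, 0.6), (0.0, 0.0)
weights = [(0.25, 0.75), (0.5, 0.5), (0.75, 0.25), (1.0, 0.0)]
scalarized = [weighted_sum(f, w) if i % 2 == 0 else weighted_tchebycheff(f, w, z)
              for i, w in enumerate(weights)]
print(scalarized)
```

The weighted sum tends to work well on convex fronts while the Tchebycheff function handles nonconvex fronts, which is the intuition behind combining them.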
Another idea is the adaptation of the distribution of weight vectors to the shape of the Pareto front. This idea has been suggested for MOEA/D in the literature.
Conclusion
In this paper, we clearly showed the similarity between the shape of the distribution of the weight vectors in weight vector-based many-objective algorithms and the shape of the Pareto fronts of frequently used many-objective test problems (i.e., DTLZ1-4 and WFG4-9). To demonstrate the high sensitivity of their performance to the shape of the Pareto front, we formulated maximization (minus) versions of these test problems, whose Pareto fronts have inverted shapes.
One difficulty of recently developed weight vector-based algorithms is the lack of appropriate handling of reference lines outside the Pareto front (i.e., with no intersection with the Pareto front). This difficulty can be rephrased as the lack of appropriate criteria for choosing the second solution for each reference line. It was demonstrated by our experimental results on the maximization (minus) versions of the test problems.
We also suggested that the DTLZ and WFG test problems are not difficult even when they have many objectives, due to the special features of their problem formulations. As a result, weight vector-based algorithms developed for those problems have fitness evaluation mechanisms that are not always suitable for many-objective optimization in general, such as Pareto dominance-based evaluation and the emphasis on the minimization of the distance from the nearest reference line (e.g., the use of a large penalty value for that distance).
Our computational experiments revealed the following reasons for the high performance of recently proposed weight vector-based algorithms on the many-objective DTLZ1-4 and WFG4-9 test problems.
1) The triangular shape of the Pareto front of each test problem, which is the same as or similar to the shape of the distribution of the weight vectors.
2) The relatively small size of the Pareto front of each test problem in comparison with the feasible region in the objective space, which is suitable for multiobjective search by pulling solutions toward the ideal point using the weight vectors.
3) The easy convergence and easy diversification of solutions due to separable decision variables, which make it possible to focus on the uniformity of the obtained solutions over the Pareto front.