Introduction
Various multiobjective evolutionary algorithms (MOEAs) have been proposed to effectively solve multiobjective optimization problems (MOPs). However, most of these algorithms have been evaluated and applied to MOPs with only two or three objectives [1], [2], despite the fact that optimization problems involving a high number of objectives indeed appear widely in real-world applications [3]–[5]. In the literature, such MOPs that have more than three objectives are often termed many-objective optimization problems (MaOPs) [6], [7].
Unfortunately, experimental and analytical results [8], [9] have indicated that MaOPs pose serious difficulties for existing MOEAs, particularly for the popular Pareto-dominance-based MOEAs, e.g., the nondominated sorting genetic algorithm II (NSGA-II) [10] and the strength Pareto evolutionary algorithm 2 (SPEA2) [11]. The primary reason is that the proportion of nondominated solutions in a population rises rapidly with the number of objectives, which leads to a severe loss of Pareto-dominance-based selection pressure toward the Pareto front (PF). In the existing research, three types of techniques are generally used to overcome this shortcoming of Pareto-dominance-based MOEAs.
The first type of approach adopts new preference relations (relaxed forms of Pareto dominance or other ranking schemes) that can induce a finer-grained order in the objective space. So far, many alternative preference relations have been proposed for many-objective optimization, such as average and maximum ranking [12], the favor relation [13], fuzzy Pareto dominance [14], the expansion relation [15], and preference order ranking [16], among others.
The second type of approach intends to improve the diversity maintenance mechanism. The rationale is that, since Pareto dominance cannot exert sufficient selection pressure toward the PF in a high-dimensional objective space, selection has to rely almost solely on diversity, which is generally regarded as the secondary selection criterion in MOEAs. However, most existing diversity maintenance criteria, e.g., the crowding distance [10], prefer dominance-resistant solutions [8], which biases the search toward solutions with poor proximity to the global PF even though these solutions may exhibit good diversity over the objective space. Thus, diversity preservation in the many-objective scenario requires some care to avoid or weaken this phenomenon. Compared to the first type of research, the work along this direction [24]–[26] has received much less attention.
In many-objective optimization, it is indeed very difficult for a MOEA to emphasize both convergence and diversity within a single population or archive. Motivated by this, the third type of approach maintains two archives, one for convergence and one for diversity, during the evolutionary search, and at least one of the two archives still relies on Pareto dominance. Related work can be found in [27] and [28].
Unlike Pareto-dominance-based MOEAs, the other two classes of MOEAs, indicator- and decomposition-based MOEAs, have not been criticized much for a lack of selection pressure in a high-dimensional objective space. However, they encounter their own difficulties when handling MaOPs.
The indicator-based MOEAs aim to optimize a performance indicator that provides a desired ordering among sets that represent PF approximations. The hypervolume (HV) [1] is probably the most popular performance indicator adopted so far, owing to its nice theoretical properties [29]. There are a few well-established indicator-based MOEAs [30]–[32] in the literature that employ the HV as the selection criterion. Nevertheless, the computational cost of HV grows exponentially as the number of objectives increases. To address this issue, one alternative strategy is to estimate the rankings of solutions induced by the HV indicator without computing the exact indicator values [33]. Another strategy is to replace the HV with other indicators that are much less computationally expensive but still have good theoretical properties.
The decomposition-based MOEAs generally aggregate the objectives of a MOP into an aggregation function using a unique weight vector, and a set of weight vectors (or directions) generates multiple aggregation functions, each of which defines a single-objective optimization problem. The diversity of the population is implicitly maintained by specifying a good spread of the weight vectors in the objective space. MOEA based on decomposition (MOEA/D) [39], [40] and multiple single-objective Pareto sampling (MSOPS) [41] are the two most typical decomposition-based MOEAs. Very recently, Yuan et al. [42] proposed a new framework based on NSGA-II for many-objective optimization, called ensemble fitness ranking (EFR), which is more general than MSOPS. In terms of its form, EFR can be viewed as an extension of average ranking and maximum ranking [12]; the significant difference lies in that EFR employs more general fitness functions (aggregation functions) instead of objective functions. However, decomposition-based MOEAs still have deficiencies in many-objective search. Take MOEA/D for instance: it can normally approach the PF very well even in a high-dimensional objective space, but it struggles to maintain diversity in many-objective optimization, usually failing to achieve a good coverage of the PF [20], [26], [43]. The reason is largely the nature of the contour lines of the adopted aggregation functions, which will be further explained in Section III. The other decomposition-based MOEAs, e.g., MSOPS and EFR, suffer from a similar problem, since they also drive the search using aggregation functions. Note that, as a major framework for designing MOEAs, MOEA/D has spawned a large amount of research work (see [44]–[47]), but almost all the newly proposed MOEA/D variants are yet to be validated on MaOPs to evaluate their suitability for many-objective optimization. Moreover, several recent decomposition-based MOEAs, e.g., MOEA/D-M2M [47] and NSGA-III [26], adopt a new decomposition scheme, which decomposes the objective space into subregions without using any aggregation function. Although these algorithms address the difficulty in diversity maintenance to some extent, they struggle to converge to the PF in many-objective optimization because Pareto dominance is still relied on to provide the selection pressure.
This paper mainly focuses on the decomposition-based MOEAs that use aggregation functions to promote the convergence in many-objective optimization. Our contributions to this topic are summarized as follows.
We propose to achieve a better tradeoff between convergence and diversity in decomposition-based many-objective optimizers by exploiting the perpendicular distance from a solution to a weight vector in the objective space.
We have implemented this idea in MOEA/D and EFR, respectively, to improve their performance on MaOPs, resulting in two enhanced algorithms, i.e., a MOEA/D variant (MOEA/D-DU) with a distance-based updating strategy and a new EFR version (EFR-RR) with a ranking restriction scheme.
We have provided an optional online normalization procedure that can be incorporated into MOEA/D-DU and EFR-RR to effectively tackle scaled problems, which is a relatively new practice [26], [48] that has not been extensively studied in the literature.
We have, for the first time, combined different existing ideas synergistically in a single MOEA. For example, MOEA/D-DU inherits the merits of systematic sampling (the same as [26]), the notion of neighborhood (the same as [40]), adaptive normalization (similar to [26] and [48]), single first replacement (the same as [48]), and the preferred order of replacement (introduced in this paper).
We have discussed the similarities and differences between our proposals and other existing decomposition-based approaches. Based on extensive experimental studies, we have also given possible explanations of why some of these approaches are inferior to ours in many-objective optimization.
Based on our experimental studies, we have suggested a practice for comparing the performance of MOEAs (with or without sophisticated normalization) on benchmark problems more fairly and reasonably.
The rest of this paper is organized as follows. In Section II, the background knowledge is introduced. In Section III, the basic idea is given based on the analysis of the drawback of original MOEA/D and EFR. Section IV describes in detail how to implement the basic idea to enhance MOEA/D and EFR in many-objective optimization, respectively. Section V presents the test problems, performance metrics, and algorithm settings used for performance comparison. Section VI experimentally investigates the performance of enhanced algorithms from two aspects. In Section VII, the extensive experiments are carried out to compare the enhanced algorithms with state-of-the-art methods. Finally, Section VIII concludes this paper.
Preliminaries and Background
In this section, some basic concepts in multiobjective optimization are first given. Then, we briefly introduce several recent proposals in the literature that are somewhat similar to ours.
A. Basic Concepts
A MOP can be mathematically defined as \begin{align} \min \quad&\mathbf {f}(\,\mathbf {\mathbf {x}}) = \left ({\,f_{1}(\mathbf {x}), f_{2}(\mathbf {x}), \ldots , f_{m}(\mathbf {x})}\right )^{\mathrm {T}} \notag \\ \text {subject to} \quad&\mathbf {x} \in \Omega \subseteq \mathbb {R}^{n} \quad \end{align} where $\mathbf{x}$ is an $n$-dimensional decision vector, $\Omega$ is the decision space, and $\mathbf{f}: \Omega \to \Theta \subseteq \mathbb{R}^{m}$ consists of $m$ objective functions, with $\Theta$ denoting the objective space.
Definition 1:
A vector $\mathbf{u} = (u_1, u_2, \ldots, u_m)^{\mathrm{T}}$ is said to dominate another vector $\mathbf{v} = (v_1, v_2, \ldots, v_m)^{\mathrm{T}}$, denoted as $\mathbf{u} \prec \mathbf{v}$, if $u_i \le v_i$ for all $i \in \{1, 2, \ldots, m\}$ and $u_j < v_j$ for at least one $j \in \{1, 2, \ldots, m\}$.
Definition 2:
For a given MOP, a decision vector $\mathbf{x}^{*} \in \Omega$ is Pareto optimal if there exists no other decision vector $\mathbf{x} \in \Omega$ such that $\mathbf{f}(\mathbf{x}) \prec \mathbf{f}(\mathbf{x}^{*})$.
Definition 3:
For a given MOP, the Pareto set, PS, is defined as \begin{equation} {\rm PS} = \left \{{\,\mathbf {x} \in \Omega | \mathbf {x}~\text {is Pareto optimal}}\right \}. \end{equation}
Definition 4:
For a given MOP, the PF is defined as \begin{equation} {\rm PF} = \left \{{\,\mathbf {f}(\mathbf {x}) \in \Theta | \mathbf {x} \in {\rm PS}}\right \}. \end{equation}
Definition 5:
For a given MOP, the ideal point $\mathbf{z}^{*} = (z_1^{*}, z_2^{*}, \ldots, z_m^{*})^{\mathrm{T}}$ is the vector whose $i$th component is $z_i^{*} = \min_{\mathbf{x} \in \Omega} f_i(\mathbf{x})$, $i = 1, 2, \ldots, m$.
Definition 6:
For a given MOP, the nadir point $\mathbf{z}^{\text{nad}} = (z_1^{\text{nad}}, z_2^{\text{nad}}, \ldots, z_m^{\text{nad}})^{\mathrm{T}}$ is the vector whose $i$th component is $z_i^{\text{nad}} = \max_{\mathbf{x} \in \mathrm{PS}} f_i(\mathbf{x})$, $i = 1, 2, \ldots, m$.
The goal of multiobjective optimization is to find a set of nondominated objective vectors that are close to the PF (convergence) and, at the same time, well distributed over the PF (diversity).
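To make the preceding definitions concrete, the following small Python sketch (our own illustration, not code from the paper) checks Pareto dominance between two objective vectors of a minimization MOP and filters a set of objective vectors down to its nondominated subset.

```python
import numpy as np

def dominates(u, v):
    """Return True if objective vector u Pareto-dominates v (minimization), cf. Definition 1."""
    u, v = np.asarray(u, dtype=float), np.asarray(v, dtype=float)
    return bool(np.all(u <= v) and np.any(u < v))

def nondominated(points):
    """Return the Pareto-nondominated subset of a collection of objective vectors."""
    pts = [np.asarray(p, dtype=float) for p in points]
    return [p for i, p in enumerate(pts)
            if not any(dominates(q, p) for j, q in enumerate(pts) if j != i)]

if __name__ == "__main__":
    pop = [(1.0, 4.0), (2.0, 2.0), (3.0, 3.0), (4.0, 1.0)]
    print([tuple(p) for p in nondominated(pop)])  # (3.0, 3.0) is dominated by (2.0, 2.0)
```

In a many-objective setting such a filter quickly retains almost the whole population, which is exactly the loss of selection pressure discussed in Section I.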
B. Related Approaches
In this section, we review several representative decomposition-based approaches related to this paper.
Qi et al. [45] proposed an improved MOEA/D with the adaptive weight vector adjustment (MOEA/D-AWA), which mainly enhances MOEA/D in two aspects. First, MOEA/D-AWA uses a new weight vector initialization method based on the geometric analysis of the original Tchebycheff function. Second, MOEA/D-AWA adopts an adaptive weight vector adjustment strategy to deal with problems having complex PFs. This strategy adjusts weights periodically so that weights of subproblems can be redistributed adaptively to obtain better uniformity of solutions.
Li et al. [46] proposed to use a stable matching (STM) model to coordinate the selection process in MOEA/D, and the resulting MOEA/D variant is referred to as MOEA/D-STM. In MOEA/D-STM, the subproblems and solutions are regarded as two sets of agents. A subproblem prefers solutions which can lower its aggregation function values, while a solution prefers subproblems whose corresponding direction vectors are close to it. To balance the preference of subproblems and solutions, a matching algorithm is used to assign a solution to each subproblem in order to balance the convergence and diversity of the evolutionary search. Unlike MOEA/D, MOEA/D-STM uses the generational scheme.
Deb and Jain [26] suggested a reference-point-based many-objective algorithm built on the NSGA-II framework, called NSGA-III. Similar to NSGA-II, NSGA-III still employs Pareto nondominated sorting to partition the population into a number of nondomination fronts. In the last accepted front, instead of using the crowding distance, the solutions are selected by a niche-preservation operator, where solutions associated with less crowded reference lines have higher priority to be chosen. Overall, NSGA-III emphasizes nondominated solutions that are close to the reference lines of a set of supplied reference points. A sophisticated online normalization procedure is incorporated into NSGA-III to effectively handle scaled test problems.
Around the time this paper was originally submitted, the following two proposals became available in the literature.
Asafuddoula et al. [48] proposed an improved decomposition-based evolutionary algorithm (I-DBEA) for many-objective optimization. In I-DBEA, a child solution attempts to enter the population via replacement only if it is nondominated with respect to the individuals in the population. The child solution competes with the solutions in the population in a random order until it makes a successful replacement or all the solutions have been tried. I-DBEA also includes an online normalization procedure similar to that in NSGA-III.
Wang et al. [49] presented a MOEA/D variant with global replacement (MOEA/D-GR). Once a new solution is produced in MOEA/D-GR, it is associated with the subproblem on which it achieves the minimum aggregation function value. Then, a number of subproblems closest to the associated subproblem are selected to form the replacement neighborhood, and the new solution attempts to replace the current solutions of these selected subproblems.
Note that MOEA/D-STM and MOEA/D-GR were studied and verified only on 2- and 3-objective problems. MOEA/D-AWA was studied on MaOPs, but the problems were limited to degenerate ones. NSGA-III and I-DBEA were designed specially for many-objective optimization. Moreover, although NSGA-III and I-DBEA use the decomposition-based idea, they still use Pareto dominance instead of aggregation functions to control the convergence, which differs significantly from the other mentioned decomposition-based approaches.
Basic Idea
The Tchebycheff function is one of the most common types of aggregation functions used in decomposition-based MOEAs. In this paper, we adopt a modified version of the Tchebycheff function. Let $\boldsymbol{\lambda}_j = (\lambda_{j,1}, \lambda_{j,2}, \ldots, \lambda_{j,m})^{\mathrm{T}}$ be the weight vector of the $j$th subproblem; the modified Tchebycheff function is defined as \begin{equation} \mathcal {F}_{j}(\mathbf {x}) = \max _{k=1}^{m}\left \{{\frac {1}{\lambda _{j, k}}\left |{\,f_{k}(\mathbf {x}) - z_{k}^{*}}\right |}\right \} \end{equation} where $\mathbf{z}^{*}$ is the ideal point.
This form of the Tchebycheff function has two advantages over the original one [39]. First, uniformly distributed weight vectors lead to uniformly distributed search directions in the objective space. Second, each weight vector corresponds to a unique solution located on the PF. The proof can be found in [45]. These two advantages alleviate the difficulty of diversity preservation to some extent.
However, even (4) is not perfect in practice. Ideally, if the optimal solution were obtained for each aggregation function $\mathcal{F}_j$, the resulting solutions would lie on the intersections of the weight vectors and the PF and would thus be well distributed. During the search, however, two undesirable situations arise: a solution that is far from the $j$th weight vector may nevertheless achieve a rather good value on $\mathcal{F}_j$ and be preferred over solutions close to that weight vector, and, in the updating procedure of MOEA/D, a new solution may replace the current solutions of subproblems whose weight vectors are far from it.
Both situations mentioned above can be attributed to the fact that a solution distant from a weight vector may still achieve a better aggregation function value on the corresponding subproblem, which weakens the ability of the algorithm to maintain a well-distributed population.
Given all this, we are motivated to consider not only the aggregation function value of a solution but also its distance to the corresponding weight vector in MOEA/D and EFR. This practice is expected to force the solutions to stay close to the weight vectors and to explicitly maintain the desired distribution of solutions in the evolutionary process, leading to a better balance between convergence and diversity in many-objective optimization. It is worth mentioning that the penalty-based boundary intersection function [39] implicitly considers the closeness of a solution to the weight vector to a certain extent, but it still has the problem mentioned above. In this paper, without loss of generality, we only consider the modified Tchebycheff function defined in (4).
Suppose $\mathbf{x}$ is a solution and $\boldsymbol{\lambda}_j$ is a weight vector. The perpendicular distance from $\mathbf{f}(\mathbf{x})$ to the line passing through $\mathbf{z}^{*}$ along the direction $\boldsymbol{\lambda}_j$ is computed as \begin{equation} d_{j,2}(\mathbf {x}) = \left \|{\mathbf {f}(\mathbf {x}) - \mathbf {z}^{*} - d_{j,1}(\mathbf {x})\left ({ \boldsymbol {\lambda } _{j} \big / \left \|{ \boldsymbol {\lambda } _{j}}\right \|}\right )}\right \| \end{equation} where \begin{equation} d_{j,1}(\mathbf {x}) = \left \|{\left ({\,\mathbf {f}(\mathbf {x}) - \mathbf {z}^{*}}\right )^{\mathrm {T}} \boldsymbol {\lambda } _{j}}\right \| \Big / \left \|{ \boldsymbol {\lambda } _{j}}\right \| \end{equation} is the length of the projection of $\mathbf{f}(\mathbf{x}) - \mathbf{z}^{*}$ onto the direction of $\boldsymbol{\lambda}_j$.
Illustration of the perpendicular distance from the solution to the weight vector in the 2-D objective space.
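The quantities above are straightforward to compute. The sketch below (illustrative Python with names of our choosing) evaluates the modified Tchebycheff function (4) together with the projection length $d_{j,1}$ and the perpendicular distance $d_{j,2}$ for one solution and one weight vector.

```python
import numpy as np

def modified_tchebycheff(f, z_star, lam):
    """Modified Tchebycheff value of objective vector f under weight vector lam, cf. (4).
    All components of lam are assumed to be strictly positive."""
    return float(np.max(np.abs(f - z_star) / lam))

def projection_and_perpendicular(f, z_star, lam):
    """Length of the projection of f - z* onto lam, and the perpendicular
    distance from f to the line through z* along lam (Section III)."""
    diff = np.asarray(f, float) - np.asarray(z_star, float)
    unit = lam / np.linalg.norm(lam)
    d1 = abs(float(np.dot(diff, unit)))
    d2 = float(np.linalg.norm(diff - d1 * unit))
    return d1, d2

if __name__ == "__main__":
    f, z, lam = np.array([0.6, 0.3]), np.zeros(2), np.array([0.5, 0.5])
    print(modified_tchebycheff(f, z, lam), projection_and_perpendicular(f, z, lam))
```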
Proposed Algorithms
In this section, two enhanced algorithms, i.e., MOEA/D-DU and EFR-RR, are described in detail in Sections IV-A and IV-B, respectively. Section IV-C provides an optional normalization procedure that can be incorporated into the proposed algorithms. Section IV-D briefly analyzes the computational complexity of one generation of MOEA/D-DU and EFR-RR. In Section IV-E, we discuss the similarities and differences between the proposed algorithms and several existing ones in the literature.
A. Enhancing MOEA/D
The framework of the proposed MOEA/D-DU is depicted in Algorithm 1. First, a set of uniformly spread weight vectors $\Lambda = \{\boldsymbol{\lambda}_1, \boldsymbol{\lambda}_2, \ldots, \boldsymbol{\lambda}_N\}$ is generated, and the population $P$ and the ideal point $\mathbf{z}^{*}$ are initialized. In each iteration of the main loop, a new solution $\mathbf{y}$ is generated for the current subproblem by recombination and mutation, and it is then used to update the ideal point and the current population.
Algorithm 1 Framework of MOEA/D-DU
Generate a set of weight vectors $\Lambda = \{\boldsymbol{\lambda}_1, \boldsymbol{\lambda}_2, \ldots, \boldsymbol{\lambda}_N\}$
Initialize the population $P = \{\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_N\}$
Initialize the ideal point $\mathbf{z}^{*}$
for $i = 1$ to $N$ do
Set $B(i) \leftarrow \{i_1, i_2, \ldots, i_T\}$, the indices of the $T$ weight vectors closest to $\boldsymbol{\lambda}_i$
end for
while the termination criterion is not satisfied do
for $i = 1$ to $N$ do
if $\mathrm{rand}() < \delta$ then set the mating pool $E \leftarrow B(i)$
else set $E \leftarrow \{1, 2, \ldots, N\}$
end if
Randomly select an index $k$ from $E$, and generate a new solution $\mathbf{y}$ from $\mathbf{x}_i$ and $\mathbf{x}_k$ by the recombination and mutation operators
UpdateIdealPoint($\mathbf{y}, \mathbf{z}^{*}$)
UpdateCurrentPopulation($\mathbf{y}, \mathbf{z}^{*}, \Lambda, P, K$)
end for
end while
return all the nondominated solutions in $P$
The updating strategy (step 17 of Algorithm 1) is the characteristic procedure of MOEA/D-DU, which differs significantly from that of the original MOEA/D. This strategy is presented in detail in Algorithm 2 and works as follows. Once a new solution $\mathbf{y}$ is generated, the perpendicular distances from $\mathbf{f}(\mathbf{y})$ to all the $N$ weight vectors are computed, and the $K$ subproblems whose weight vectors are closest to $\mathbf{f}(\mathbf{y})$ are selected. These subproblems are then visited in ascending order of distance, and $\mathbf{y}$ replaces the current solution of the first visited subproblem on which it achieves a better aggregation function value; at most one solution is replaced.
Algorithm 2 UpdateCurrentPopulation($\mathbf {y}, \mathbf {z}^{\ast }, \Lambda , P, K$)
for $j = 1$ to $N$ do
Compute the perpendicular distance $d_{j,2}(\mathbf{y})$ from $\mathbf{f}(\mathbf{y})$ to the weight vector $\boldsymbol{\lambda}_j$
end for
Select the $K$ subproblems whose weight vectors are closest to $\mathbf{f}(\mathbf{y})$, and let $j_1, j_2, \ldots, j_K$ denote their indices sorted in ascending order of distance
for $k = 1$ to $K$ do
if $\mathcal{F}_{j_k}(\mathbf{y}) < \mathcal{F}_{j_k}(\mathbf{x}_{j_k})$ then
Replace $\mathbf{x}_{j_k}$ with $\mathbf{y}$
return
end if
end for
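To complement Algorithm 2, here is a minimal Python sketch of the distance-based updating strategy; the function and variable names (`update_current_population`, `pop_objs`, and so on) are ours, and the helpers simply re-implement the modified Tchebycheff function and the perpendicular distance so that the snippet is self-contained.

```python
import numpy as np

def tcheby(f, z, lam):
    # Modified Tchebycheff function, cf. (4); lam is assumed strictly positive.
    return np.max(np.abs(f - z) / lam)

def perp_dist(f, z, lam):
    # Perpendicular distance from f to the line through z along lam (Section III).
    unit = lam / np.linalg.norm(lam)
    diff = f - z
    d1 = abs(np.dot(diff, unit))
    return np.linalg.norm(diff - d1 * unit)

def update_current_population(y, f_y, z_star, weights, population, pop_objs, K):
    """y may replace at most one of the K subproblems whose weight vectors are
    closest (in perpendicular distance) to f(y), visited nearest-first."""
    d2 = np.array([perp_dist(f_y, z_star, lam) for lam in weights])
    for j in np.argsort(d2)[:K]:
        if tcheby(f_y, z_star, weights[j]) < tcheby(pop_objs[j], z_star, weights[j]):
            population[j], pop_objs[j] = y, f_y   # single first replacement
            return
```

Visiting the $K$ candidate subproblems nearest-first and stopping after a single replacement is what keeps a new solution from disturbing subproblems whose weight vectors lie far from it.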
B. Enhancing EFR
EFR [42] adopts the generational scheme, and its framework is based on NSGA-II but with significant differences in the environmental selection phase.
At the $t$th generation, the parent population $P_t$ and the offspring population $Q_t$ are merged into $U_t$. For each aggregation function $\mathcal{F}_j$, $j = 1, 2, \ldots, N$, the solutions in $U_t$ are ranked in ascending order of their function values, and $r_j(\mathbf{x})$ denotes the resulting rank of solution $\mathbf{x}$. In the original EFR, the global rank of $\mathbf{x}$ is the best rank it achieves over all the aggregation functions \begin{equation} R_{g}(\mathbf {x}) = \min _{j = 1}^{N} r_{j}(\mathbf {x}). \end{equation}
From the above description, it can be seen that all the solutions in $U_t$ are ranked on every aggregation function, so a solution that is far from a weight vector may still obtain a very good rank on the corresponding function, which harms diversity in a high-dimensional objective space. In EFR-RR, each solution $\mathbf{x}$ is therefore only allowed to be ranked on the $K$ aggregation functions whose weight vectors are closest to $\mathbf{f}(\mathbf{x})$ in terms of perpendicular distance. Let $B(\mathbf{x})$ denote the index set of these $K$ functions; the global rank is then computed as \begin{equation} R_{g}(\mathbf {x}) = \min _{j \in B(\mathbf {x})} r_{j}(\mathbf {x}). \end{equation}
In Algorithm 3, we describe the procedure of maximum ranking in EFR-RR. Further, we summarize the whole procedure of EFR-RR in Algorithm 4.
Algorithm 3 MaximumRanking $(U_{t}, K)$
for each solution $\mathbf{x} \in U_t$ do
Compute the set $B(\mathbf{x})$ of the indices of the $K$ weight vectors closest to $\mathbf{f}(\mathbf{x})$ in terms of perpendicular distance
for each $j \in B(\mathbf{x})$ do add $\mathbf{x}$ to the candidate list of the $j$th aggregation function
end for
end for
for $j = 1$ to $N$ do
Sort the candidate solutions of the $j$th aggregation function in ascending order of $\mathcal{F}_j$, and let $r_j(\mathbf{x})$ denote the rank of solution $\mathbf{x}$
for each candidate solution $\mathbf{x}$ of the $j$th aggregation function do
if $r_j(\mathbf{x}) < R_g(\mathbf{x})$ then set $R_g(\mathbf{x}) \leftarrow r_j(\mathbf{x})$
end if
end for
end for
return the global ranks $R_g(\mathbf{x})$ of all the solutions $\mathbf{x} \in U_t$
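A compact Python sketch of the restricted maximum-ranking scheme follows; it is an illustration under our own naming, assuming the merged population is given as an objective matrix and that ties within an aggregation function are broken by the sort order.

```python
import numpy as np

def restricted_global_ranks(objs, z_star, weights, K):
    """objs: (n, m) objective matrix of the merged population U_t; weights: (N, m),
    strictly positive. Returns the global rank R_g of every solution under the
    K-restricted maximum-ranking scheme (smaller is better)."""
    objs, weights = np.asarray(objs, float), np.asarray(weights, float)
    n, N = len(objs), len(weights)
    diff = objs - z_star                                            # (n, m)
    units = weights / np.linalg.norm(weights, axis=1, keepdims=True)
    d1 = np.abs(diff @ units.T)                                     # (n, N) projection lengths
    d2 = np.linalg.norm(diff[:, None, :] - d1[:, :, None] * units[None, :, :], axis=2)
    closest = np.argsort(d2, axis=1)[:, :K]                         # B(x) for each solution

    fit = np.max(np.abs(diff)[:, None, :] / weights[None, :, :], axis=2)  # Tchebycheff, cf. (4)
    global_rank = np.full(n, n + 1)
    for j in range(N):
        members = [i for i in range(n) if j in closest[i]]
        for rank, i in enumerate(sorted(members, key=lambda s: fit[s, j]), start=1):
            global_rank[i] = min(global_rank[i], rank)              # keep the best restricted rank
    return global_rank
```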
Algorithm 4 Framework of EFR-RR
Generate a set of weight vectors $\Lambda = \{\boldsymbol{\lambda}_1, \boldsymbol{\lambda}_2, \ldots, \boldsymbol{\lambda}_N\}$
Initialize the population $P_0$
Initialize the ideal point $\mathbf{z}^{*}$
while the termination criterion is not met do
while
end while
Randomly shuffle
end while
C. Optional Normalization Procedure
Normalization is particularly useful for an algorithm to solve problems having a PF whose objective values are differently scaled. For example, Zhang and Li [39] showed that even a naive normalization procedure can significantly improve the performance of MOEA/D on scaled test problems.
In essence, the task of normalization is to estimate the ideal point $\mathbf{z}^{*}$ and the nadir point $\mathbf{z}^{\text{nad}}$ during the search, and to replace each objective $f_i(\mathbf{x})$ with its normalized counterpart \begin{equation} \tilde {f}_{i}(\mathbf {x}) = \left ({\,f_{i}(\mathbf {x}) - z_{i}^{\ast }}\right ) \Big / \left ({z_{i}^{\text {nad}} - z_{i}^{\ast }}\right ). \end{equation}
In this paper, we provide an online normalization procedure similar to that in NSGA-III, which can be incorporated into the proposed algorithms to deal with scaled test problems.
The difference between the normalization presented here and that in NSGA-III lies in the slightly different achievement scalarizing function used to identify the extreme points. Suppose $\mathbf{w}_j$ is the weight vector associated with the $j$th objective axis; the extreme point in the $j$th objective direction is the solution that minimizes \begin{equation} {\rm ASF}\left ({\mathbf {x}, \mathbf {w}_{j}}\right ) = \max _{i=1}^{m}\left \{{\frac {1}{w_{j,i}}\left |{\frac {f_{i}(\mathbf {x}) - z_{i}^{\ast }}{z_{i}^{\text {nad}} - z_{i}^{\ast }}}\right |}\right \} \end{equation}
It is worth pointing out that the nadir point is generally much harder to estimate than the ideal point, so the quality of the identified extreme points largely determines the effectiveness of the normalization.
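The following sketch outlines one way such an online normalization step can be realized; it is our simplification of the procedure described above (ideal point as the running component-wise minimum, extreme points via an axis-aligned ASF, nadir estimate from hyperplane intercepts with a fallback to the observed range), not the authors' exact implementation.

```python
import numpy as np

def normalize(objs, eps=1e-6):
    """objs: (n, m) raw objective values of the current population.
    Returns objectives normalized to roughly [0, 1] using an estimated ideal
    point and hyperplane intercepts derived from ASF-based extreme points."""
    objs = np.asarray(objs, float)
    n, m = objs.shape
    z_star = objs.min(axis=0)                        # ideal-point estimate
    span = np.maximum(objs.max(axis=0) - z_star, eps)

    # Extreme point for each objective axis: minimize an axis-aligned ASF.
    extremes = np.empty((m, m))
    for j in range(m):
        w = np.full(m, eps)
        w[j] = 1.0
        asf = np.max(np.abs(objs - z_star) / span / w, axis=1)
        extremes[j] = objs[np.argmin(asf)] - z_star

    # Nadir estimate from the intercepts of the hyperplane through the extremes.
    try:
        b = np.linalg.solve(extremes, np.ones(m))    # plane: b . (f - z*) = 1
        if np.any(b < eps):                          # degenerate or negative intercepts
            raise np.linalg.LinAlgError
        intercepts = 1.0 / b
    except np.linalg.LinAlgError:
        intercepts = span                            # fallback: observed objective range
    return (objs - z_star) / intercepts
```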
D. Computational Complexity
For MOEA/D-DU, the major computational costs are in the updating procedure depicted in Algorithm 2. Steps 1–3 require $O(mN)$ basic operations to compute the perpendicular distances from $\mathbf{f}(\mathbf{y})$ to all the $N$ weight vectors, selecting the $K$ minimum distances takes $O(KN)$, and the subsequent comparisons take $O(mK)$. Since the procedure is invoked once for each of the $N$ new solutions produced in a generation, the updating cost of one generation of MOEA/D-DU is $O(mN^{2})$.
As for EFR-RR, the major computational costs lie in the calculation of the global ranks. First, the computation of perpendicular distances requires $O(mN^{2})$ basic operations in total, since each of the $2N$ solutions in $U_t$ computes its distance to all the $N$ weight vectors. Second, the sorting on the aggregation functions takes at most $O(N^{2}\log N)$. Hence, one generation of EFR-RR is bounded by $O(N^{2}(m + \log N))$.
E. Discussion
After describing the details of MOEA/D-DU and EFR-RR, we now discuss the main similarities and differences between the proposed algorithms and the ones listed in Section II-B. It should be pointed out that all these algorithms employ a set of weight vectors to guide the search process.
1) MOEA/D-DU Versus MOEA/D-AWA:
Although both of them make sure that the uniformly spread weight vectors lead to uniformly spread search directions in the objective space, they use slightly different means to implement this. MOEA/D-AWA adopts a new weight vector initialization method, whereas MOEA/D-DU uses a modified form of aggregation function. More importantly, MOEA/D-AWA aims to adjust the weights periodically so as to obtain a better uniformity of solutions on problems having complex PFs, whereas MOEA/D-DU intends to maintain a better balance between convergence and diversity using a set of fixed weight vectors. In other words, MOEA/D-AWA emphasizes weight adjustment strategy, whereas MOEA/D-DU emphasizes a new strategy to update the population.
2) MOEA/D-DU Versus MOEA/D-STM:
Both of them use the modified version of the Tchebycheff function and employ the closeness between a solution and a weight vector (direction vector) to relate solutions to subproblems. However, MOEA/D-STM adopts the generational scheme and coordinates the preferences of subproblems and solutions through a stable matching model, which implicitly gives the two kinds of preferences equal importance, whereas MOEA/D-DU works in a steady-state manner and explicitly gives priority to the perpendicular distance between a solution and a weight vector in its updating strategy.
3) MOEA/D-DU Versus I-DBEA:
Both of them enhance MOEA/D with major modifications in the updating procedure and need to compute the distance of a solution to the weight vectors (reference directions). However, in I-DBEA, a child solution can enter the population only if it is nondominated with respect to the current individuals, and it then competes with the solutions in a random order until a single replacement occurs, whereas MOEA/D-DU does not rely on Pareto dominance at all and lets the child compete only with the solutions of the $K$ subproblems whose weight vectors are closest to it, visited in ascending order of distance.
4) MOEA/D-DU Versus MOEA/D-GR:
Both of them conduct a careful selection of the solutions that can be replaced by the new solution. However, the two algorithms have rather different original intentions. MOEA/D-GR aims to find the most suitable subproblem for the newly produced solution, and the neighborhood concept of subproblems defined in MOEA/D is still used in its replacement scheme. Unlike MOEA/D-GR, MOEA/D-DU no longer uses the neighborhood of subproblems in the updating procedure. Instead, it takes a purely geometric view and just exploits the perpendicular distance from the new solution to the weight vectors, restricting the replacement to the $K$ closest subproblems so as to explicitly maintain the distribution of the population.
5) EFR-RR Versus NSGA-III:
Both of them use the generational scheme and split the combined population into a number of “fronts” to select solutions. However, there is a significant difference between them in how solutions are ranked in the environmental selection phase. In NSGA-III, Pareto dominance is still used to promote convergence, and a niche-preservation operator aided by a set of weight vectors is adopted to select solutions in the last accepted front, intending to maintain population diversity. In contrast, EFR-RR promotes convergence using the aggregation functions instead of Pareto dominance. Note that EFR-RR also performs a niching procedure with the aid of weight vectors, so each weight vector in EFR-RR serves to promote both convergence and diversity, whereas in NSGA-III each weight vector is mainly used to emphasize diversity. Even in the niching methodology there is a distinction between NSGA-III and EFR-RR: in NSGA-III each solution can only be associated with a single weight vector, but in EFR-RR each solution can be associated with several weight vectors. In Section VII-B, we will show that EFR-RR compares favorably with NSGA-III on scaled problems.
Experimental Design
This section is devoted to the experimental design for investigating the performance of MOEA/D-DU and EFR-RR. First, the test problems and performance metrics used in the experiments are given. Then, we list eight MOEAs that are used to validate the proposed algorithms. Finally, the experimental settings adopted in this paper are provided.
A. Test Problems
For comparison purposes, we use the test problems from DTLZ [53] and WFG [54] test suites. We divide the problems into two groups.
The first group of problems are all normalized test problems including DTLZ1–DTLZ4, DTLZ7, and WFG1–WFG9. Note that the objective values of the original DTLZ7 and WFG1–WFG9 problems are scaled differently. Since their true PFs are known, we modify their objective functions as \begin{equation} f_{i} \leftarrow \left ({\,f_{i} - z_{i}^{\ast }}\right ) \Big / \left ({z_{i}^{\text {nad}} - z_{i}^{\ast }}\right ), \hspace {5pt} i = 1, 2, \ldots , m. \end{equation}
The second group of problems are all scaled test problems, including the scaled DTLZ1 and DTLZ2 problems [26] and the original WFG4–WFG9 problems. The scaled DTLZ1 and DTLZ2 are modifications of DTLZ1 and DTLZ2, respectively. To illustrate, with a scaling factor of 10, the objectives are scaled as \begin{equation} f_{i} \leftarrow 10^{i-1}f_{i}, \hspace {5pt} i = 1, 2, \ldots , m. \end{equation}
All these problems can be scaled to any number of objectives and decision variables. In this paper, we consider the number of objectives $m \in \{2, 5, 8, 10, 13\}$.
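For reference, the two objective transformations used to construct the problem groups can be written in a few lines; the helper names below are hypothetical, and the scaling factor of 10 matches the example above.

```python
import numpy as np

def normalize_objectives(f, z_star, z_nad):
    """Group 1: rescale raw objectives so that the true PF lies in [0, 1]^m."""
    f, z_star, z_nad = (np.asarray(a, dtype=float) for a in (f, z_star, z_nad))
    return (f - z_star) / (z_nad - z_star)

def scale_objectives(f, factor=10.0):
    """Group 2: scaled DTLZ-style objectives, f_i <- factor**(i-1) * f_i."""
    f = np.asarray(f, dtype=float)
    return f * factor ** np.arange(f.size)

if __name__ == "__main__":
    print(scale_objectives([0.2, 0.3, 0.5]))   # -> [ 0.2  3.  50. ]
```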
B. Performance Metrics
Performance metrics are needed to evaluate the concerned algorithms. The inverted generational distance (IGD) [55] is one of the most widely used metrics in the multiobjective scenario, providing combined information about the convergence and diversity of a solution set. However, in a high-dimensional objective space, a huge number of uniformly distributed points on the PF are required to calculate IGD in a reliable manner [56]. In this paper, we use another very popular metric, HV [1], as the primary comparison criterion. The HV metric is strictly Pareto compliant [55], and its nice theoretical qualities make it a rather fair metric [31]. HV measures, in a sense, both the convergence and diversity of a solution set, and larger values mean better quality.
For calculating HV, the choice of reference point is a crucial issue. In our experiments, following the recommendation in [57] and [58], we set the reference point to be slightly worse than the nadir point of the corresponding PF in every objective.
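Exact HV computation becomes expensive in high dimension; as a simple illustration of what the metric measures (not the HV computation procedure used in our experiments), the following sketch estimates the HV of a nondominated set with respect to a reference point by Monte Carlo sampling.

```python
import numpy as np

def hv_monte_carlo(front, ref, n_samples=100_000, seed=0):
    """Estimate the hypervolume (minimization) dominated by `front` w.r.t. `ref`."""
    front, ref = np.asarray(front, float), np.asarray(ref, float)
    lower = front.min(axis=0)
    rng = np.random.default_rng(seed)
    samples = rng.uniform(lower, ref, size=(n_samples, ref.size))
    dominated = np.zeros(n_samples, dtype=bool)
    for p in front:                       # a sample counts if some front point dominates it
        dominated |= np.all(samples >= p, axis=1)
    return float(np.prod(ref - lower) * dominated.mean())

if __name__ == "__main__":
    front = [(0.0, 1.0), (0.5, 0.5), (1.0, 0.0)]
    print(hv_monte_carlo(front, ref=(1.1, 1.1)))   # close to the exact value 0.46
```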
Additionally, we use the generational distance (GD) [60] and the diversity comparison indicator (DCI) [61] as two auxiliary performance metrics. GD indicates how close the obtained solution set is to the true PF and thus only reflects the convergence of an algorithm; smaller values are preferred. DCI has recently been proposed for assessing the diversity of PF approximations in many-objective optimization, and larger values indicate better diversity.
C. MOEAs for Comparison
Eight MOEAs are considered to evaluate MOEA/D-DU and EFR-RR, four out of which are MOEA/D variants.
The first MOEA/D variant is a slightly modified version of MOEA/D-DE [40]. For a fair comparison, we replace the differential evolution (DE) operator in MOEA/D-DE with the recombination schemes used in MOEA/D-DU. Another reason for this is that the results in [26] indicate that the DE operator seems not quite suitable for MaOPs. In our experimental studies, this MOEA/D version is referred to as MOEA/D for short, and it is the base algorithm of MOEA/D-DU.
The other three compared MOEA/D variants are MOEA/D-STM [46], MOEA/D-GR [49], and I-DBEA [48], since they are somewhat similar to MOEA/D-DU. Based on the same reasons mentioned above, we replace the DE operators in MOEA/D-STM and MOEA/D-GR with the recombination schemes in MOEA/D-DU.
The original EFR [42] is also involved in comparison because it is the predecessor of EFR-RR.
Moreover, two nondecomposition-based MOEAs are adopted for comparison, i.e., grid-based evolutionary algorithm (GrEA) [20] and shift-based density estimation (SDE) [25], both of which were proposed specially for many-objective optimization and compared favorably with several state-of-the-art many-objective optimizers. For SDE, the version SPEA2+SDE is used here as it achieves the best overall performance among the three considered versions (NSGA-II+SDE, SPEA2+SDE, and PESA-II+SDE) in [25].
Of the seven MOEAs mentioned above, only MOEA/D-STM and I-DBEA incorporate explicit normalization techniques in their original studies: MOEA/D-STM uses a naive normalization procedure, while I-DBEA uses a sophisticated one. Since normalization is not the focus of this comparison, for fairness we simply remove the normalization procedures in MOEA/D-STM and I-DBEA. The seven MOEAs, together with MOEA/D-DU and EFR-RR, are compared on normalized test problems in Section VII-A. This practice eliminates the influence of differently scaled objective values and of the normalization procedure on the performance of MOEAs, and purely verifies the effectiveness of the proposed strategies.
To investigate the performance on scaled test problems, we further incorporate the normalization procedure provided in Section IV-C into MOEA/D-DU and EFR-RR, and compare them in Section VII-B with NSGA-III [26], one of whose features is to include an ingenious normalization procedure to handle differently scaled objective values.
Except GrEA and SDE, all the other concerned MOEAs use the same approach [50] to generate structured weight vectors. MOEA/D, MOEA/D-STM, MOEA/D-GR, MOEA/D-DU, EFR, and EFR-RR all use the modified Tchebycheff function defined in (4).
All MOEAs including MOEA/D-DU and EFR-RR are implemented in the jMetal framework [62], and run on an Intel Core i7 2.9 GHz processor with 8 GB of RAM.
D. Experimental Settings
The experimental settings include general settings and parameter settings. The general settings are as follows.
Number of Runs and Termination Criterion: Each algorithm is run 30 times independently on each test instance, and the average metric values are recorded. The algorithm terminates after $20\,000 \times m$ function evaluations per run.
Significance Test: To test the difference for statistical significance in some cases, the Wilcoxon signed-rank test [63] at a 5% significance level is conducted on the metric values obtained by two competing algorithms.
As for parameter settings, we first list several common ones.
Population Size: The population size is the same as the number of weight vectors in all the algorithms except GrEA and SDE. Because of the generation mode of the weight vectors, the population size for these algorithms cannot be arbitrary and is controlled by a parameter $H$ ($N = C_{H + m - 1}^{m - 1}$); the generation of these structured weight vectors is sketched after this list. As for GrEA and SDE, the population size can be set to any positive integer, but to ensure a fair comparison, all the algorithms adopt the same population size for each problem instance. Table IV lists the population sizes used for varying numbers of objectives. To avoid producing only boundary weight vectors, two layers of weight vectors [26] are adopted for problems having more than five objectives.
Parameters for Crossover and Mutation: All the algorithms use SBX crossover and polynomial mutation to produce new solutions. The settings are shown in Table V. In EFR, EFR-RR, I-DBEA, and NSGA-III, a larger distribution index for crossover ($\eta_{c} = 30$) is used according to [26], [42], and [48].
Neighborhood Size $T$ and Probability $\delta$: In MOEA/D, MOEA/D-STM, MOEA/D-GR, and MOEA/D-DU, $T$ is set to 20 and $\delta$ is set to 0.9.
Parameter $K$ in MOEA/D-DU and EFR-RR: $K$ is set to 5 for MOEA/D-DU and 2 for EFR-RR. In Section VI-A, the influence of $K$ will be investigated.
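The structured weight vectors mentioned in the population-size setting can be generated with the systematic (simplex-lattice) sampling of [50]; the sketch below is a straightforward illustration producing the $N = C_{H+m-1}^{m-1}$ vectors whose components are multiples of $1/H$ (in practice, zero components are usually replaced by a tiny positive value before being used in the aggregation functions).

```python
from itertools import combinations
import numpy as np

def simplex_lattice_weights(m, H):
    """All m-dimensional weight vectors whose components are multiples of 1/H and
    sum to 1; their number is N = C(H + m - 1, m - 1) (stars-and-bars encoding)."""
    weights = []
    for cuts in combinations(range(1, H + m), m - 1):
        prev, parts = 0, []
        for c in cuts:
            parts.append(c - prev - 1)
            prev = c
        parts.append(H + m - 1 - prev)
        weights.append(np.array(parts) / H)
    return np.array(weights)

if __name__ == "__main__":
    w = simplex_lattice_weights(m=3, H=4)
    print(len(w))          # C(6, 2) = 15 weight vectors
```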
MOEA/D, MOEA/D-GR, GrEA, and SDE have their own specific parameters. In MOEA/D, the maximal number of solutions replaced by each child solution ($n_r$) is set to 2, as recommended in [40]; the remaining algorithm-specific parameters are set as suggested in the corresponding original studies, except that the grid division parameter div of GrEA is set suitably for each test instance.
Analysis of the Performance of Enhanced Algorithms
In this section, the performance of the enhanced algorithms is analyzed. First, the influence of the parameter $K$ on MOEA/D-DU and EFR-RR is investigated. Then, the convergence and diversity of the proposed algorithms are examined in comparison with their predecessors.
A. Influence of Parameter $K$
Due to space limitation, Fig. 3 only shows the variation of the HV values across all the considered values of $K$ on a subset of the test instances. The following observations can be made.
$K$ exerts a similar impact on the performance of MOEA/D-DU and EFR-RR for each test instance.
The most suitable setting of $K$ depends not only on the problem to be solved but also on the number of objectives the problem has.
For some problems, e.g., DTLZ4, the performance of MOEA/D-DU or EFR-RR is not very sensitive to the setting of $K$; both of them perform quite stably over a wide range of $K$ values.
For some problems, e.g., WFG3 and WFG9, different settings of $K$ may only lead to a slight variation of performance.
For other problems, e.g., DTLZ7, a more careful setting of $K$ is required; in such cases, a slight variation of $K$ may result in a huge change of performance.
Examination of the influence of $K$ on the HV values obtained by MOEA/D-DU and EFR-RR.
Overall, although the performance of MOEA/D-DU and EFR-RR varies with different settings of $K$, the settings given in Section V-D ($K = 5$ for MOEA/D-DU and $K = 2$ for EFR-RR) provide satisfactory overall performance and are used in all the remaining experiments.
B. Investigation of Convergence and Diversity
This section is devoted to investigate the convergence and diversity of the proposed algorithms using GD and DCI metrics, respectively. Here, we only consider normalized DTLZ1–DTLZ4 and WFG4–WFG9 problems. Because the PFs of these problems are all regular geometries, GD value can be easily determined analytically without sampling points on the PF. The PF shape of DTLZ1 is a hyperplane, while that of the other problems is a hypersphere.
We compare MOEA/D-DU and EFR-RR with their corresponding predecessors (MOEA/D and EFR) in the many-objective scenario. Table VII provides the average GD and DCI results. From Table VII, both MOEA/D-DU and EFR-RR are generally better than their predecessors at maintaining diversity, achieving better DCI results on almost all the concerned instances. Regarding convergence, MOEA/D-DU is overall superior to MOEA/D and obtains better GD results in most cases, while EFR-RR is highly competitive with EFR, each performing better in terms of average GD on different problem instances. It is worth noting that, compared with EFR, EFR-RR obtains much poorer GD and DCI results on 5-objective DTLZ3, which has a huge number of local PFs. The exact reason for EFR-RR's underperformance on this instance is a subject of future investigation.
To describe the distribution of solutions in a high-dimensional objective space more intuitively, Fig. 4 plots, by parallel coordinates, the final nondominated solutions of MOEA/D-DU, MOEA/D, EFR-RR, and EFR in a single run on the 10-objective WFG4 instance. This particular run is associated with the result closest to the average HV value. As can be seen from Fig. 4, both MOEA/D-DU and EFR-RR find widely distributed solutions over the whole range of each objective, whereas MOEA/D and EFR fail to cover some portions of the PF.
Final nondominated solution set of (a) MOEA/D-DU, (b) MOEA/D, (c) EFR-RR, and (d) EFR on the normalized 10-objective WFG4 instance, shown by parallel coordinates.
Based on the above comparisons, it can be concluded that MOEA/D-DU and EFR-RR can generally achieve better tradeoff between convergence and diversity than their corresponding predecessors.
Comparison With State-of-the-Art Algorithms
In this section, we compare the proposed MOEA/D-DU and EFR-RR with several state-of-the-art MOEAs according to the experimental design described in Section V.
A. Comparison on Normalized Test Problems
In this section, the algorithms without explicit normalization are compared on the first group of problems, i.e., the normalized test problems. Table VIII shows the comparative results of the nine algorithms in terms of the HV metric on the normalized DTLZ test problems, and Table IX on the normalized WFG test problems. To draw statistically sound conclusions, the significance test is conducted between MOEA/D-DU (EFR-RR) and each of the seven other algorithms. In the two tables, the results significantly outperformed by MOEA/D-DU, by EFR-RR, and by both of them are marked with different symbols, respectively. Table X summarizes the significance tests of the HV results between the proposed MOEA/D-DU (EFR-RR) and the other algorithms on all 70 considered test instances.
From the above results, we can obtain the following observations for the proposed MOEA/D-DU and EFR-RR.
MOEA/D-DU is clearly superior to MOEA/D. Indeed, MOEA/D-DU significantly outperforms MOEA/D in 52 out of 70 instances, whereas MOEA/D only wins in eight instances and half of them are 2-objective ones. This strongly indicates that the proposed updating strategy in MOEA/D-DU is more effective than the original one in MOEA/D for solving MaOPs.
MOEA/D-DU also shows certain advantages over the other three MOEA/D variants, i.e., MOEA/D-STM, MOEA/D-GR, and I-DBEA. Specifically, MOEA/D-DU significantly outperforms both MOEA/D-STM and MOEA/D-GR on the vast majority of instances. Compared with I-DBEA, MOEA/D-DU is generally significantly better on 2-, 5-, and 8-objective instances, and it remains very competitive on 10-objective instances; on 13-objective instances, however, I-DBEA generally shows higher performance. These results verify that the selection mechanism in MOEA/D-DU is comparable with, or even better than, the alternative mechanisms in state-of-the-art MOEA/D variants.
EFR-RR achieves an obvious improvement over EFR in the many-objective scenario. Among the 56 many-objective ($m > 3$) instances, EFR-RR performs significantly better on 46, whereas EFR is significantly better on only 5. For 2-objective instances, there is no obvious performance difference between EFR-RR and EFR. These results clearly demonstrate the effectiveness of the ranking restriction scheme used in EFR-RR on MaOPs.
MOEA/D-DU and EFR-RR are both particularly competitive with the two nondecomposition-based MOEAs designed specially for many-objective optimization, i.e., GrEA and SDE. Each of the four algorithms performs best on some specific problems.
Based on the above extensive results, we further introduce the performance score [33] to quantify the overall performance of the compared algorithms, which makes it easier to gain insight into their behaviors. For a problem instance, suppose there are $l$ algorithms $\text{Alg}_1, \text{Alg}_2, \ldots, \text{Alg}_l$ involved in the comparison, and let $\delta_{i,j}$ be 1 if $\text{Alg}_j$ significantly outperforms $\text{Alg}_i$ in terms of HV on this instance, and 0 otherwise. The performance score of $\text{Alg}_i$ is defined as \begin{equation} P\left ({\text {Alg}_{i}}\right )=\sum _{\substack {j=1\\ j \neq i}}^{l}\delta _{i,j}. \end{equation} This score counts how many other algorithms significantly outperform $\text{Alg}_i$ on the instance; the smaller the score, the better the algorithm.
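Given the pairwise Wilcoxon results on one instance, the performance score is a simple count; the sketch below (illustrative only, with a made-up significance matrix) computes it for all compared algorithms.

```python
import numpy as np

def performance_scores(delta):
    """delta[i][j] = 1 if Alg_j is significantly better than Alg_i on this
    instance (delta[i][i] = 0). Returns P(Alg_i) for every algorithm, i.e.,
    the number of competitors that significantly outperform it."""
    return np.asarray(delta).sum(axis=1)

if __name__ == "__main__":
    # Three algorithms: Alg_2 beats Alg_1 and Alg_3; Alg_1 beats Alg_3.
    delta = [[0, 1, 0],
             [0, 0, 0],
             [1, 1, 0]]
    print(performance_scores(delta))   # -> [1 0 2]
```

Averaging these per-instance scores over problems or over numbers of objectives gives the values summarized in Fig. 5 and Table XI.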
Average performance score over all (a) normalized test problems for different numbers of objectives and (b) dimensions for the different normalized test problems, namely DTLZ(Dx) and WFG(Wx). The smaller the score, the better the PF approximation in terms of the HV metric. The values of the proposed MOEA/D-DU and EFR-RR are connected by a solid line to make the scores easier to assess.
From Fig. 5(a), both MOEA/D-DU and EFR-RR rank very well on MaOPs. For 2-objective problems, MOEA/D-DU is overall slightly better than MOEA/D, whereas EFR-RR is overall slightly worse than EFR. It is interesting to observe that, GrEA and I-DBEA, which show competitive performance on MaOPs, do not perform well on 2-objective problems. Conversely, MOEA/D, MOEA/D-STM, and MOEA/D-GR, which perform satisfactorily on 2-objective problems, scale poorly on MaOPs. Moreover, I-DBEA presents an interesting search behavior, i.e., it shows more competitiveness on problems having higher number of objectives.
From Fig. 5(b), EFR-RR achieves the best overall performance on 10 out of the 14 test problems, and it takes second place on the WFG9 problem. However, EFR-RR does not show such excellent performance on WFG1 and WFG3. MOEA/D-DU performs best overall on WFG3, and it ranks second on eight problems, in each case behind EFR-RR. However, MOEA/D-DU struggles on DTLZ7 and WFG1, where it is only overall better than MOEA/D-GR.
Table XI presents the average performance score over all 70 problem instances for the nine concerned algorithms, and an overall rank is given to each algorithm according to this score. The proposed EFR-RR and MOEA/D-DU are ranked first and second, respectively, followed by SDE and GrEA. It is worth pointing out that this is just a comprehensive evaluation of the algorithms on all the problem instances considered in this paper. Indeed, no algorithm can beat all the other algorithms on all possible instances, and some algorithms are more suitable for certain kinds of problems; e.g., I-DBEA may show advantages over the others on problems with a very high number of objectives. In addition, the overall excellent performance of GrEA in our experiments is obtained by suitably setting div for each test instance; in this regard, GrEA has an advantage over the other compared algorithms.
Since GrEA and SDE are the two many-objective optimizers most competitive with MOEA/D-DU and EFR-RR, we further compare the average CPU time consumed by these four algorithms per run.
B. Comparison on Scaled Test Problems
In this section, we incorporate the normalization procedure presented in Section IV-C into the proposed MOEA/D-DU and EFR-RR, and compare them with NSGA-III on the second group of test problems, i.e., scaled test problems.
Table XIII shows the average HV results obtained by MOEA/D-DU, EFR-RR, and NSGA-III. From Table XIII, EFR-RR shows an obvious advantage over NSGA-III. To be specific, EFR-RR significantly outperforms NSGA-III on 30 out of 40 instances, whereas NSGA-III is significantly better than EFR-RR on only five instances, including 5- and 13-objective DTLZ1, 13-objective DTLZ2, and 5- and 13-objective WFG5. As for MOEA/D-DU, it is generally superior to NSGA-III on 2-, 5-, and 8-objective problems. On 10-objective problems, MOEA/D-DU and NSGA-III have equal shares, each performing significantly better than the other on half of the problems. Nevertheless, MOEA/D-DU does not perform as well as NSGA-III on 13-objective problems, where NSGA-III wins on 7 of them.
Fig. 6 plots the evolutionary trajectories of HV over generations for MOEA/D-DU, EFR-RR, and NSGA-III on the 8-objective WFG9 instance. From Fig. 6, although all three algorithms stably increase HV as the generations elapse, MOEA/D-DU and EFR-RR converge to the PF faster than NSGA-III and finally achieve higher HV values.
Evolutionary trajectories of HV over generations for MOEA/D-DU, EFR-RR, and NSGA-III on the 8-objective WFG9 instance.
From the above results, it can be concluded that either MOEA/D-DU or EFR-RR with an elaborate normalization procedure is at least competitive to or even better than NSGA-III for solving scaled problems.
C. Further Discussion
In this section, we further discuss several issues concerning the experimental studies.
The first issue is the setting of the population size in our experiments. Intuitively, exponentially more points are needed to represent the tradeoff surface of a higher-dimensional PF, so a very large population would be required for a MOEA to approximate the entire PF. However, as mentioned in [26], it is difficult for a decision maker to comprehend and select a preferred solution in this scenario. Another drawback is that a large population size makes MOEAs computationally inefficient and sometimes impractical. For example, if we use a population size of 100 in the 2-objective case, then a population size on the order of $100^{m-1}$ would be needed to represent an $m$-objective PF with a similar resolution, which is clearly unaffordable.
The second issue is why MOEA/D-DU and EFR-RR do not exhibit clear superiority over their predecessors on 2-objective problems, although they have shown much better performance on many-objective ones. We suspect there are two possible reasons. First, just as mentioned above, we can in fact only use a population with a sparse distribution of solutions (and weight vectors) in a high-dimensional objective space, and a particular region may be associated with only one solution or one weight vector. So, if the aggregation function value is emphasized too much, a solution is very likely to deviate far from its corresponding region and thus deteriorate the diversity preservation. In a 2-D objective space, a normal population size, e.g., 100, already yields a very dense distribution of solutions, so it is much less likely that some regions of the PF are missed in this case, even for MOEA/D or EFR. Second, there are more degrees of freedom in a higher-dimensional objective space, so it is much more likely to produce solutions that achieve a good aggregation value but are far from the weight vector. Hence, the distance-based updating strategy in MOEA/D-DU and the ranking restriction scheme in EFR-RR are more helpful for maintaining the desired distribution of solutions in the evolutionary process.
The third issue is why two proposals similar to MOEA/D-DU, i.e., MOEA/D-STM and MOEA/D-GR, fail to show performance on par with MOEA/D-DU on MaOPs. MOEA/D-STM uses a matching algorithm to coordinate the preferences of subproblems and solutions, and thus implicitly gives equal importance to the two kinds of preferences. Nevertheless, it may be more appropriate to bias toward the preference of solutions in many-objective optimization, since it is much easier to lose track of some regions of a high-dimensional objective space with sparsely distributed solutions. This is also experimentally supported by the parametric study of $K$ in Section VI-A.
The last issue is the setting of the parameter $K$ in MOEA/D-DU and EFR-RR. As shown in Section VI-A, the most suitable value of $K$ is problem dependent; nevertheless, the fixed settings used in this paper ($K = 5$ for MOEA/D-DU and $K = 2$ for EFR-RR) already lead to highly competitive overall performance.
Conclusion
In this paper, MOEA/D-DU and EFR-RR are proposed for many-objective optimization, which enhance two decomposition-based MOEAs, i.e., MOEA/D and EFR, respectively. The basic idea is to explicitly maintain the distribution of solutions during the evolutionary process by exploiting the perpendicular distance from a solution to a weight vector in the objective space. Specifically, in MOEA/D-DU, the updating procedure first determines the $K$ subproblems whose weight vectors are closest to the new solution in terms of perpendicular distance, and the new solution can replace at most one of the current solutions of these subproblems; in EFR-RR, a solution is only allowed to be ranked on the aggregation functions associated with its $K$ closest weight vectors.
In the experimental studies, we have investigated the influence of the key parameter $K$, examined the convergence and diversity of the enhanced algorithms against their predecessors, and compared them extensively with several state-of-the-art algorithms on both normalized and scaled test problems.
In the future, we will study how to dynamically control the parameter $K$ during the evolutionary process, so as to further improve the performance of MOEA/D-DU and EFR-RR.