Methods That Optimize Multi-Objective Problems: A Survey and Experimental Evaluation

Most current multi-optimization survey papers classify methods into broad objective categories and do not draw clear boundaries between the specific techniques employed by these methods. This may lead to the misclassification of unrelated methods/techniques into the same objective category. Moreover, most of these survey papers classify algorithms independently of the specific techniques they employ. To this end, we introduce in this survey paper a methodology-based taxonomy that classifies multi-optimization methods into hierarchically nested, fine-grained, and specific classes. The taxonomy classifies methods in the following hierarchical fashion: objective categories → objective functions → optimization methods → optimization sub-methods. We present a comprehensive survey of the methods that are contained under each optimization method, the optimization methods contained under each objective function, and the objective functions contained under each objective category. We selected the objective functions that should be maximized for solving most real-world multi-objective optimization problems, which are pairs of the following: partitions separability, internal density, dynamic similarity, and structural similarity. For each optimization method, we surveyed the various algorithms in the literature that pertain to the method. We experimentally compared and ranked the optimization methods that fall under each objective function, the objective functions that fall under each objective category, and the objective categories used for solving a specific optimization problem.


I. INTRODUCTION
Multi-objective optimization (MOO) is a scheme for optimizing more than one conflicting objective function simultaneously, subject to specific constraints [15]. It refers to identifying the optimal solutions of more than one desired objective. Examples of MOO problems are selecting an investment source that minimizes the risks and maximizes the returns for an investor, or selecting a manufacturing procedure that maximizes production and minimizes fuel consumption. Therefore, a single-objective optimization algorithm that identifies a single optimal solution is not applicable to multi-objective problems (MOPs). Some objectives can conflict with one another, so it is impossible to make them all optimal simultaneously. As a result, solving a MOP requires a MOO algorithm to perform a tradeoff (compromise) to determine which of the contradictory objectives should be optimized. In MOO, a vector of objective functions is constructed, each of which is a function of the solution vector. A general MOO problem can be formalized as follows:
$$\min/\max \; f_1(x), f_2(x), \ldots, f_n(x) \quad \text{subject to } x \in S,$$
where $f_i(x)$ is the $i$th objective function that needs to be minimized (min) or maximized (max), $n$ is the number of these functions, $x$ is a solution, and $S$ is the feasible set.
The associate editor coordinating the review of this manuscript and approving it for publication was Kaitai Liang.
A conventional technique for solving MOO problems requires prior knowledge about each of the objectives. Alternatively, a multi-objective evolutionary algorithm (MOEA) can be used. In a MOEA, a set of solutions (population) is processed iteratively. In each iteration, a subset of solutions is generated and returned. A MOEA attempts to identify a set of non-dominated solutions that approximates the Pareto front in the objective space. A solution u is said to dominate a solution y if u is no worse than y in all objectives and better than y in at least one objective. A solution that is not dominated by any other solution is called non-dominated; otherwise, it is called dominated. We introduce in this paper a comprehensive survey on multi-objective algorithms that are contained under each optimization method, the optimization methods contained under each objective function, and the objective functions contained under each objective category. We provide a methodology-based taxonomy that classifies multi-optimization methods into hierarchically nested, fine-grained, and specific classes. We survey and discuss the algorithms and techniques that fall under 28 fine-grained optimization methods.
VOLUME 8, 2020. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/

A. MOTIVATION AND KEY CONTRIBUTIONS
Most real-world MOO problems require maximizing a pair of the following objective functions [21], [82]: partitions separability, internal density, dynamic similarity, and structural similarity. After investigating this, we found that capturing most real-world MOO problems requires maximizing one of the following four pairs of objective functions: (1) population's separability and dynamic similarity, (2) population's internal density and dynamic similarity, (3) population's separability and structural similarity, or (4) population's separability and internal density. In this paper, we therefore classify multi-optimization methods based on the pair of objective functions that they seek to maximize. To the best of our knowledge, this is the first paper to survey and classify multi-optimization methods based on this classification.
Unfortunately, most current multi-optimization survey papers classify methods into broad objective categories and do not draw clear boundaries between the specific techniques employed by these methods [53]. This may lead to the misclassification of unrelated methods/techniques into the same objective category. Moreover, most of these survey papers classify algorithms independently of the specific techniques they employ. To overcome this, we introduce in this paper a methodology-based taxonomy that classifies multi-optimization methods into hierarchically nested, fine-grained, and specific classes. To the best of our knowledge, this is the first survey paper to use a methodological taxonomy to classify methods into the following hierarchically nested form: objective categories → objective functions → optimization methods → optimization sub-methods. The lowest subclass in a hierarchy is a fine-grained and specific optimization method; the classification resulted in 28 fine-grained and specific methods.
We introduce a comprehensive survey on the methods that are contained under each optimization method, the optimization methods contained under each objective function, and the objective functions contained under each objective category. We experimentally evaluated, compared, and ranked the following:
1) The algorithms contained under each optimization method.
2) The optimization methods contained under each objective function.
3) The objective functions contained under each objective category.
4) The objective categories for solving a specific optimization problem.
Our methodology-based taxonomy enables a researcher to gain knowledge about the following:
1) The very specific method under which the researcher's proposed algorithm falls.
2) The objective's category and function under which the researcher's proposed method falls.
3) The advantages and limitations of the method, objective category, and objective function under which the researcher's proposed algorithm falls.
4) The research works in the literature that are the most comparable with the researcher's proposed algorithm.

B. CURRENT SURVEY PAPERS ON THE TOPIC
Current survey papers can be classified into the following broad categories: multi-objective and many-objective methods. Multi-objective papers can be further classified into evolutionary-based and decomposition-based methods. In this Subsection, we outline the most notable survey papers based on the mentioned classifications.

1) SURVEY PAPERS ON MULTI-OBJECTIVE OPTIMIZATION

a: EVOLUTIONARY-BASED SURVEY PAPERS
Antonio and Coello [2] presented a survey on the state-of-the-art coevolutionary algorithms, which extend traditional MOEAs to address relatively large-scale global optimization problems. The paper presented a taxonomy of approaches. Under each approach, the paper selected representative algorithms and described them. Tanabe and Ishibuchi [85] presented a survey on the state-of-the-art multi-modal MOO evolutionary methods. Huang et al. [45] classified MOEAs and their research status into four categories. They analyzed the advantages and disadvantages of the algorithms under each category. Mukhopadhyay et al. [69] surveyed MOEAs for data mining problems. Specifically, they surveyed MOEAs related to classification and feature selection data mining tasks. Mukhopadhyay et al. [70] again surveyed MOEAs for data mining problems, but this time, they targeted algorithms used for association rule mining and clustering tasks. Gunantara [40] surveyed MOO applications and methods and proposed an enhanced settlement method.
Fan et al. [36] surveyed both traditional and machine-learning-based MOEAs. They selected model-based MOEAs as representatives and described them. Zhang and Xing [106] classified MOEAs into three categories. They selected representative algorithms from each category and described them. Purshouse et al. [76] surveyed methods that combine multiple criteria decision making and evolutionary MOO techniques. They classified these methods into a priori, a posteriori, and interactive methods and described them.
Von et al. [94] surveyed methods that use MOEAs to solve many-objective problems, and they described these methods and their findings. Zhou et al. [105] surveyed coevolutionary, multimodal, constraint-handling, dynamic, combinatorial, and decomposition-based MOEAs. Azzouz et al. [5] surveyed dynamicity constraints with time-varying MOEAs. Vimal et al. [96] surveyed MOEAs that reduce the size of the Pareto set and maintain solution diversity.

b: DECOMPOSITION-BASED SURVEY PAPERS
Xu et al. [102] surveyed decomposition-based MOEAs and their variants, evolutionary operators, and weight vector generation methods. They discussed the extension of the methods to many-objective optimization. Trivedi et al. [84] surveyed decomposition-based MOEAs and their different directions and components. Specifically, they surveyed works that studied each of the following: weight vector generation, modifications in the reproduction operation, decomposition approaches, mating selection, allocation of computational resources, dominance-based approaches, and hybridizing decomposition. In addition, they surveyed works on extending decomposition-based methods to constrained MOO.
Santiago et al. [79] surveyed multi-objective decomposition approaches and described how these approaches (1) achieve Pareto optimal solutions and (2) decompose a MOO problem. The authors also discussed the trends in decomposing a multi-objective problem into single-objective problems. Cho et al. [15] surveyed the state-of-the-art modeling techniques that solve decomposition-based MOO problems. Furthermore, they outlined the advantages and disadvantages of each modeling technique. Peitz and Dellnitz [74] surveyed multi-objective optimal control methods that solve complex problems. In addition, the authors discussed predictive control methods that use online and offline decomposition.

2) SURVEY PAPERS ON MANY-OBJECTIVE OPTIMIZATION
Bechikh et al. [7] surveyed many-objective optimization evolutionary algorithms (MaOEAs). The authors also described how these MaOEAs solve many-objective optimization problems (MaOPs). The authors classified these MaOEAs and described the evolution of each in the field. Li et al. [64] categorized MaOEAs into the following seven classes: diversity-based, indicator-based, relaxed-dominance-based, preference-based, aggregation-based, reference-set-based, and dimensionality-reduction-based. The authors surveyed the related works under each class.
He and Yen [44] categorized the visualization methods used in MaOPs into five classes. The authors surveyed the methods under each class and compared these methods to evaluate their performance for decision making. Chand and Wagner [16] surveyed MaOEAs and how they address real-world MaOPs. The authors also outlined the challenges associated with many-objective optimization. Mane and Rao [81] surveyed MaOEAs from 2005 to 2017. The authors also described how these algorithms solve MaOPs. Ishibuchi et al. [49] surveyed the different methods for handling MaOPs using evolutionary algorithms. Here, the authors also discussed the scalability of MOEAs to many-objective problems by investigating the behaviors of NSGA-II objectives.

C. STRUCTURE OF THE PAPER
The rest of this paper is structured as follows.
• Sections III-IV survey and describe the optimization methods that maximize the objective functions that fall under the Pareto dominance-based, indicator-based, and reference-based objective categories, respectively.
• Section V presents empirical experiments that evaluate and compare the different optimization methods, objective categories, and objective functions.
• Section VI presents our conclusions.

II. BASIC CONCEPTS
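The MOO problems considered in this survey can be formulated as follows:
$$\min/\max \; F(x) = [f_1(x), f_2(x), \ldots, f_m(x)]$$
subject to the constraints $g_j(x)$, $j = 1, \ldots, n$, and the bounds $x_i^{(u)} \le x_i \le x_i^{(v)}$, $i = 1, \ldots, k$, where $x = [x_1, x_2, \ldots, x_k]^T$ is the vector of $k$ decision variables, $f$ denotes an objective, $g$ is a constraint, $m$ is the number of objectives, $n$ is the number of constraints, and $x^{(u)}$ and $x^{(v)}$ are the lower and upper values of a decision variable, respectively.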
The concepts of maximizing a population's separability, internal density, dynamic similarity, and structural similarity are defined as follows:
• Maximizing a population's separability: This refers to clustering the Pareto front in such a way that each population has dense similarity among its points and sparse similarity with the points of other populations. This can be achieved by maintaining diversity by obtaining well-distributed solutions and selecting points uniformly. Maximizing populations' separability reflects the strength of dividing the Pareto front into populations of similar characteristics. A good population has a larger number of points with similar characteristics and a smaller number of inter-population points with similar characteristics.
• Maximizing a population's internal density: This refers to achieving a strong and well-converged set of solutions as close as possible to the Pareto front. This can be achieved by (1) increasing the selection pressure toward the Pareto front according to the density of the population, (2) converging points to a density that is proportional to a specific value (i.e., to manage convergence), (3) employing a density-based adaptive sampling routine, or (4) employing an agglomerative internal density function.
• Maximizing a population's dynamic similarity: This refers to clustering the Pareto front in such a way that each point within a population is adjacent to a large number of points confined within the population that have similar characteristics. This can be achieved by dynamically minimizing the distances between a population's points with similar characteristics using techniques such as random probability distribution and random walk to gain knowledge about the Pareto front's topology. Thus, it requires a technique that combines the accuracy of global processing with the efficiency of local search.
• Maximizing a population's structural similarity: This refers to clustering the Pareto front in such a way that if the structural similarity of a point and a subset of points is high (e.g., small cluster variance), the point and the subset should belong to the same population. It characterizes the internal structure of a population, such as being difficult to split into two sub-populations. This can be achieved by techniques such as non-dominated sorting with crowding distance and a greedy procedure for reproducing the approximated optimal distributions.
Fig. 1 shows our proposed methodology-based taxonomy, which classifies multi-optimization methods into hierarchically nested, fine-grained, and specific classes.

III. PARETO DOMINANCE-BASED OBJECTIVE CATEGORY
Pareto dominance is one of the most used general approaches for solving a multi-objective problem and evaluating the quality of a population. In this approach, each objective function is treated separately. The approach maintains the individual elements of the solution vectors as independent (separate from one another) during optimization. Dominated and non-dominated solutions are differentiated. A multi-objective problem is not transformed into single-objective problems, and the approach produces a set of non-dominated solutions. For example, in a minimization problem with $m$ objective functions, a solution u is said to dominate a solution v (u ≺ v) if:
$$\text{(1)}\;\; f_i(u) \le f_i(v) \;\; \forall i \in \{1, \ldots, m\}, \qquad \text{(2)}\;\; f_j(u) < f_j(v) \;\; \text{for at least one } j \in \{1, \ldots, m\}.$$
Condition (1) indicates that v should not be better than u, and condition (2) indicates that u should be better than v in at least one objective function. A Pareto optimal solution is achieved when one objective function cannot be improved without deteriorating another objective function. In Subsection A, we describe the methods that maximize both the population separability and dynamic similarity objective functions.
In Subsection B, we describe the methods that maximize both the internal density and dynamic similarity objective functions.
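The two dominance conditions described above translate directly into code. The following is a minimal Python sketch, assuming that every objective is minimized; the function names and the sample objective vectors are hypothetical and only illustrate the definition.

```python
from typing import List, Sequence


def dominates(u: Sequence[float], v: Sequence[float]) -> bool:
    """Return True if u Pareto-dominates v (all objectives minimized)."""
    # Condition (1): v is not better than u in any objective.
    no_worse = all(a <= b for a, b in zip(u, v))
    # Condition (2): u is strictly better than v in at least one objective.
    strictly_better = any(a < b for a, b in zip(u, v))
    return no_worse and strictly_better


def non_dominated(points: List[Sequence[float]]) -> List[Sequence[float]]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical objective vectors (f1, f2), both to be minimized.
candidates = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0)]
front = non_dominated(candidates)
```

In this toy set, (3.0, 4.0) is dominated by (2.0, 3.0), so it is the only point excluded from the non-dominated set.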

A. MAXIMIZING A POPULATION'S SEPARABILITY AND DYNAMIC SIMILARITY OBJECTIVE FUNCTIONS
We describe in Subsections 1-3 the diversity management-based, ε-approximate-based, and nondominated sorting-based methods that maximize a population's separability and dynamic similarity objective functions.

1) DIVERSITY MANAGEMENT-BASED METHODS
Adra and Fleming [1] proposed a multi-objective mechanism called DM that maximizes the separability of populations by managing their diversity by maintaining diverse sets of solutions. DM improves the performance of many-objective optimization problems regarding diversity and convergence. The diversity management mechanism of DM includes an adaptive mutation operator that ensures a diversity-preserving process. Furthermore, the operator ensures locally nondominated approximate solutions. The magnitude of the variation of each decision variable is controlled by the following: a) A value of the spread indicator that measures the extent of the diversity of the locally non-dominated set of solutions. b) A dynamic ''crowding'' measure that maximizes a population's dynamic similarity. This is an improvement of the measure adopted by NSGA-II that computes the diversity in each single solution.
Cheng et al. [13] proposed a multi-objective method based on the enhancement of two selection strategies that facilitates diversity management and convergence. The first selection mechanism maximizes the separability of populations using the directional diversity (DD) procedure, while the second mechanism maximizes a population's dynamic similarity using the favorable convergence (FC) procedure. The diversity maintenance mechanism considers the information of both diversity and convergence to balance the diversity and convergence of the algorithm. Favorable convergence combined with the Pareto dominance criterion ensures the construction of a mating pool to generate offspring with satisfactory convergence performance. In the algorithm, DD and FC indicators are used to measure the performance of diversity and convergence, respectively, of an individual.
Vrugt et al. [93] proposed a multi-objective method that maximizes the separability of populations by preserving their diversities and maximizes a population's dynamic similarity by employing an iterative procedure for comparing different populations using a Euclidean distance measure; the method is an improvement of the diversity and search mechanisms presented in [101]. Diversity is maintained in a solution using a diversity management distance parameter α, which is user defined. The input to the algorithm is a population P sorted in a decreasing order. Initially, the first element in P (the best in P) is assigned to cluster C as its first element. In a dynamic iterative procedure, thereafter, the subsequent element in P is compared with the current elements in C. If the Euclidean distance of this element to all elements in C is greater than α, this element will be included in C. The procedure is repeated until the size of C becomes N , which is half the size of P.
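The iterative selection procedure described above can be sketched as follows. This is a minimal Python illustration, not the authors' implementation: the function name `select_diverse`, the sample population, and the behavior when the population is exhausted before the cluster reaches the target size (the loop simply stops) are assumptions.

```python
import math
from typing import List, Sequence


def select_diverse(sorted_pop: List[Sequence[float]],
                   alpha: float,
                   target_size: int) -> List[Sequence[float]]:
    """Greedy selection: keep a candidate only if it is farther than alpha
    (Euclidean distance) from every element already in the cluster."""
    if not sorted_pop:
        return []
    cluster = [sorted_pop[0]]  # the best element of P seeds cluster C
    for candidate in sorted_pop[1:]:
        if len(cluster) >= target_size:
            break  # C has reached the requested size N
        if all(math.dist(candidate, member) > alpha for member in cluster):
            cluster.append(candidate)
    return cluster


# Hypothetical population, assumed to be already sorted from best to worst.
P = [(0.0, 0.0), (0.1, 0.0), (1.0, 0.0), (1.0, 1.0), (1.05, 1.0), (3.0, 3.0)]
C = select_diverse(P, alpha=0.5, target_size=len(P) // 2)
```

Here the second element is rejected because it lies within α of the first, and the selection stops once C holds half of P.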
Korkmaz et al. [57] proposed a multi-objective method that maximizes the following two objective functions: (a) the separability of populations by minimizing the total intracluster variation and (b) the dynamic similarity of a population by minimizing the number of populations to minimize clustering overhead. Since minimizing the number of populations and the total intracluster variation can conflict, the proposed method employs Pareto dominance to manage the discovery of (a) a diverse set of non-dominated solutions and (b) the smallest possible total intracluster variance for each different number of populations. This produces a set of solutions that have different tradeoffs between the two objective functions so that the user can determine the option that suits the problem under consideration.
Wu and Pan [100] proposed a multi-objective method with a local search procedure. Let i be a non-dominated individual. The average distance between each pair of non-dominated individuals on each side of i along each of the two objectives is considered the fitness value of i. One of the two objectives is to maximize the separability of populations by managing their diversities by minimizing their intra modularity. The other objective is to maximize the dynamic similarity of a population by minimizing the average distance between its points.
Mukhopadhyay et al. [73] proposed a multi-objective method that optimizes two objectives. One of the two objectives is to maximize the separability of populations by managing their diversities by minimizing their intracluster variations. The other objective is to maximize the dynamic similarity of a population by minimizing the distances between its points using a dynamic global measure.
Shi et al. [82] proposed a multi-objective method that maximizes the separability of populations by managing their diversities through minimizing their inter variations. The local search procedure is executed dynamically after applying a crossover operator to the current non-dominated individuals of the Pareto front and using label propagation to change the class membership of individuals.

2) ε-DOMINANCE-BASED METHODS
Laumanns et al. [58] proposed the concept of an ε-approximate Pareto set for constructing a method that manages diversity and convergence. Based on the concept of ε-dominance, the method employs selection (archiving) strategies that lead to the desired distribution and convergence properties by maximizing populations' separabilities and dynamic similarities, respectively. The proposed selection strategies cover a wide range of non-dominated solutions that is iteratively updated, and they ensure progression toward the Pareto-optimal set. This is achieved by providing the decision maker the opportunity to select an appropriate ε value. Using a dynamic update operator and a selection mechanism, the proposed method guarantees convergence and stochastic convergence to an ε-Pareto set.
Takahashi et al. [90] proposed an extension of the ε-dominance scheme, called cone ε-dominance, which combines the following: (a) a diversity management scheme, (b) the convergence properties of the ε-dominance scheme, and (c) a dominance scheme less sensitive to geometrical features of the Pareto front than the ε-dominance scheme. The archive adopted by the proposed method employs a two-level concept and dynamically updates the function in the cone ε-dominance strategy. The proposed method discretizes the objective space into boxes, each containing a single vector. Diversity management is achieved by applying the cone ε-dominance relation at these boxes to maintain a set of cone ε-dominated solutions, which maximize the separability of populations. The convergence property is achieved by dynamically storing the non-dominated solutions in the archive, which maximizes the dynamic similarity between the points of a population.
Deb et al. [27] proposed a steady-state multi-objective evolutionary method based on the concept of ε-dominance. The method adopts efficient archive and parent update strategies by maximizing the dynamic similarity of each population. Two solutions with a difference of less than ε_i in the i-th objective are not allowed to be non-dominated to each other, which maintains good diversity management of the populations and in turn maximizes their separability. A parent and an archive population are created simultaneously. Two offspring solutions come from the two populations. Each of the two offspring is utilized to dynamically update the archive and parent populations based on the ε-dominance concept, which maximizes the dynamic similarity of each population. The user can choose an ε_i value based on the required resolution in the i-th objective.
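One common way to realize ε-dominance is to discretize the objective space into boxes whose side in the i-th objective is ε_i and to compare box indices instead of raw objective values. The Python sketch below illustrates that idea under stated assumptions: all objectives are minimized, the box grid is anchored at the origin, and the function names and sample points are hypothetical. It is not taken from the surveyed papers.

```python
import math
from typing import Sequence, Tuple


def box_index(f: Sequence[float], eps: Sequence[float]) -> Tuple[int, ...]:
    """Index of the epsilon-box containing objective vector f
    (assumption: the grid is anchored at the origin)."""
    return tuple(math.floor(fi / ei) for fi, ei in zip(f, eps))


def same_box(u: Sequence[float], v: Sequence[float],
             eps: Sequence[float]) -> bool:
    """Solutions sharing a box are treated as indistinguishable at
    resolution eps, so an archive would keep only one of them."""
    return box_index(u, eps) == box_index(v, eps)


def box_dominates(u: Sequence[float], v: Sequence[float],
                  eps: Sequence[float]) -> bool:
    """True if the box of u dominates the box of v (minimization)."""
    bu, bv = box_index(u, eps), box_index(v, eps)
    return bu != bv and all(a <= b for a, b in zip(bu, bv))


eps = (0.5, 0.5)          # hypothetical resolution per objective
u, v = (1.1, 2.4), (1.3, 2.2)
w = (1.1, 3.1)
```

With these values, u and v are mutually non-dominated in the ordinary Pareto sense but fall into the same box, so only one of them would be retained, whereas the box of u dominates the box of w.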
Soo-Yong et al. [83] proposed an ε-dominance-based evolutionary steady-state genetic algorithm that employs an elite archive and the ε-dominance relation. The proposed design aims to improve the reliability of the probe set by combining (a) a diverse criteria fitness calculation, which maximizes the separability of populations, (b) a dynamic sequence similarity search, which maximizes the dynamic similarity of a population, and (c) user-defined criteria. In the design, u ε-dominates v if the difference between u and v is (a) equal to or greater than a predefined ε value in all objectives and (b) greater than v by ε in at least one objective. A decision maker can select the desired solution among Pareto solutions.

3) NONDOMINATED SORTING-BASED METHODS
Vrugt and Robinson [92] proposed an evolutionary optimization method that employs a population-based elitism search procedure. The method manages diversity by producing a well-distributed set of Pareto solutions in only a single optimization run, which maximizes the separability of populations. The method creates an offspring population adaptively to produce the highest possible reproduction, which maximizes the dynamic similarity of each population. Each parent is given a rank by employing a non-dominated sorting algorithm. An offspring population of size N is created by using a multimethod search procedure. The method uses multiple reproduction operators simultaneously to generate the offspring. Elitism is ensured by comparing the current and previous offspring, because all nondominated members are always included. Members of the next offspring population are selected from subsequent nondominated fronts based on their crowding distance and rank.
Ripon et al. [77] proposed a multi-objective evolutionary clustering method using variable-length real jumping genes genetic algorithms (JGGA) [1]. The method manages diversity by minimizing the average intra cluster variation of the points of a population, which maximizes the separability of populations. This results in detecting populations with strongly associated nodes using a non-dominated sorting procedure. The method measures the intra cluster distances to minimize the distances between points, which maximizes the dynamic similarity of each population. The average value across populations was used to compute a normalized value of the measure. The value of each feature is randomly initialized and confined within the upper and lower boundaries of its values. For strength fitness evaluation, an in-cluster similarity procedure is utilized to ensure that the generated random clustering solutions are valid. Each feature is assigned to its corresponding population based on its nearest Euclidean distance to the center of the population.

B. MAXIMIZING A POPULATION'S INTERNAL DENSITY AND DYNAMIC SIMILARITY OBJECTIVE FUNCTIONS
We describe in Subsections 1-4 the ε-dominance-based, strength Pareto-based, nondominated sorting-based, and fuzzy dominance-based methods that maximize a population's internal density and dynamic similarity objective functions.

1) ε-DOMINANCE-BASED METHODS
Laumanns and Zenklusen [66] proposed an ε-dominance-based general archiving scheme to construct a dynamic iterative randomized algorithm whose intermediate solutions converge to an optimal solution. The scheme ensures strong convergence of the results, which maximizes the internal density of a population. Specifically, for a given cardinality, a sequence of solution sets converges with probability one to an ε-Pareto set. The algorithm adopts a dynamic randomized scheme to modify the current value of ε to adapt to the information obtained from the run for that algorithm, which maximizes the dynamic similarity of a population. That is, the algorithm learns the best achievable approximation value and adapts its internal value accordingly, which also maximizes the dynamic similarity of a population. This results in achieving convergence to the smallest ε value and to the solutions that ε-dominate all other solutions.
Cai et al. [20] proposed an ε-dominance-based differential evolution algorithm for multi-objective optimization. The algorithm employs the ε-dominance and parent update strategies proposed in [27] to update the population and archive, which maximizes the dynamic similarity of a population. The algorithm shifts some points closer to the global minimum during each iteration to improve the population of points, which maximizes the internal density of a population. The algorithm initializes the population by scattering the points uniformly over the solution space. This is performed to identify good points to be explored in subsequent iterations.

2) STRENGTH PARETO-BASED METHODS
Zitzler et al. [108] proposed an improved version of the strength Pareto evolutionary algorithm (SPEA) [109], called SPEA2. The proposed method maximizes the internal density of a population by including a nearest neighbor density estimation technique, which causes the search process to become more precise. SPEA2 maintains an archive that contains a representation of the non-dominated front among all solutions.
All non-dominated members of the population are copied to the archive; the only way for an individual member to survive several generations is to be copied to the archive. If the size of the updated archive exceeds a predefined limit, some archive members are deleted. The deleted archive members are selected in such a manner that the characteristics of the non-dominated front are preserved, which maximizes the dynamic similarity of a population. An archive member is deleted if a) it has been dominated by another solution or b) it is located in an overcrowded portion of the front.
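To make the role of the nearest neighbor density estimate concrete, the sketch below follows the commonly cited SPEA2 fitness formulation, which the text above does not spell out: a strength value (number of solutions a member dominates), a raw fitness (summed strengths of a member's dominators), and a density term 1 / (σ_k + 2), where σ_k is the distance to the k-th nearest neighbor. Minimization of all objectives, the function names, the default k = 1, and the sample data are assumptions made for illustration.

```python
import math
from typing import List, Sequence


def dominates(u: Sequence[float], v: Sequence[float]) -> bool:
    """True if u Pareto-dominates v, assuming every objective is minimized."""
    return (all(a <= b for a, b in zip(u, v))
            and any(a < b for a, b in zip(u, v)))


def spea2_fitness(objs: List[Sequence[float]], k: int = 1) -> List[float]:
    """Fitness = raw fitness + density for the combined population and
    archive (lower is better). Assumes k >= 1."""
    n = len(objs)
    # Strength: how many members each solution dominates.
    strength = [sum(dominates(objs[i], objs[j]) for j in range(n))
                for i in range(n)]
    fitness = []
    for i in range(n):
        # Raw fitness: summed strength of every member that dominates i.
        raw = sum(strength[j] for j in range(n) if dominates(objs[j], objs[i]))
        # Density: based on the distance to the k-th nearest neighbor.
        dists = sorted(math.dist(objs[i], objs[j]) for j in range(n) if j != i)
        sigma_k = dists[min(k, len(dists)) - 1] if dists else 0.0
        fitness.append(raw + 1.0 / (sigma_k + 2.0))
    return fitness


# Hypothetical objective vectors; the last one is dominated by (2.0, 2.0).
members = [(1.0, 4.0), (2.0, 2.0), (4.0, 1.0), (3.0, 3.0)]
scores = spea2_fitness(members)
```

Under this formulation, non-dominated members always receive a fitness below 1, so they can be identified and copied to the archive, while the density term breaks ties in favor of members in less crowded regions.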
Corne et al. [14] likewise proposed an improved version of the strength Pareto evolutionary algorithm (SPEA) [109], called PESA. The proposed method includes a nearest neighbor density estimation technique, which maximizes the internal density of a population. PESA maintains an archive that contains a representation of the non-dominated front among all solutions, and it employs an external population archive to store the current approximation to the Pareto front. In addition, PESA employs an internal population storage to store new candidate solutions that will be incorporated in the archive. Furthermore, PESA uses a crowding distance measure to keep track of the degree of crowding in each region of the archive, which maximizes the dynamic similarity of a population. The diversity maintenance adopted by PESA employs a fine-grained fitness assignment strategy.

3) NONDOMINATED SORTING-BASED METHODS
Deb et al. [28] proposed a multi-objective framework for nondominated sorting, called NSGA-II. NSGA-II constructs a population of contended individuals and ranks the individuals based on the non-dominance procedure, which maximizes the dynamic similarity of a population. The internal density of a population is maximized by increasing the number of groups and decreasing the number of populations. Solutions are selected if they have a higher number of groups and lower number of populations. Thus, the Pareto front solution has the highest value of modularity. The final solutions reflect the network's hierarchical organization, which allows the network to be analyzed at various hierarchical levels.
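The ranking by non-dominance and the crowding distance used throughout this family of methods can be sketched as follows. This is a simplified Python illustration that peels off fronts one at a time rather than using the bookkeeping of the original fast non-dominated sort; minimization of all objectives, the function names, and the sample data are assumptions.

```python
import math
from typing import Dict, List, Sequence


def dominates(u: Sequence[float], v: Sequence[float]) -> bool:
    """True if u Pareto-dominates v, assuming every objective is minimized."""
    return (all(a <= b for a, b in zip(u, v))
            and any(a < b for a, b in zip(u, v)))


def non_dominated_sort(objs: List[Sequence[float]]) -> List[List[int]]:
    """Split solution indices into fronts; front 0 is the non-dominated set."""
    remaining = set(range(len(objs)))
    fronts = []
    while remaining:
        front = sorted(i for i in remaining
                       if not any(dominates(objs[j], objs[i]) for j in remaining))
        fronts.append(front)
        remaining -= set(front)
    return fronts


def crowding_distance(objs: List[Sequence[float]],
                      front: List[int]) -> Dict[int, float]:
    """Sum, over objectives, of the normalized gap between each solution's
    two neighbors; boundary solutions get infinity."""
    dist = {i: 0.0 for i in front}
    for k in range(len(objs[front[0]])):
        ordered = sorted(front, key=lambda i: objs[i][k])
        lo, hi = objs[ordered[0]][k], objs[ordered[-1]][k]
        dist[ordered[0]] = dist[ordered[-1]] = math.inf
        if hi == lo:
            continue  # all values equal: no spread along this objective
        for prev, cur, nxt in zip(ordered, ordered[1:], ordered[2:]):
            dist[cur] += (objs[nxt][k] - objs[prev][k]) / (hi - lo)
    return dist


# Hypothetical objective vectors (f1, f2), both minimized.
objs = [(1.0, 5.0), (2.0, 3.0), (4.0, 1.0), (3.0, 4.0), (5.0, 5.0)]
fronts = non_dominated_sort(objs)
cd = crowding_distance(objs, fronts[0])
```

Selection then prefers solutions from lower-ranked fronts and, within a front, those with a larger crowding distance.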
Ferringer et al. [35] proposed a general MOEA framework adapted for use with large heterogeneous clusters. It is based on the Non-dominated Sorting Genetic Algorithm-II (ε-NSGA-II) and employs the crossover and mutation operators. The search shifts from one generation to another by looping through the following processes: a) a non-domination sorting procedure applied to a combined child and parent population of size 2m, b) a crowded tournament selection procedure that produces a new parent population of size m, which maximizes the internal density of a population, and c) a mutation and crossover selection procedure for creating a new child population of size m. The features of the framework include the dynamic auto-adaptive sizing of a population, which maximizes its dynamic similarity [23], and epsilon-dominance archiving [22].
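The crowded selection step relies on a crowding measure computed inside each front. The sketch below shows the crowding distance commonly used with NSGA-II-style selection; the function name is hypothetical, and the framework surveyed above may normalize or combine the values differently.

```python
def crowding_distance(front):
    """Crowding distance of each member of a front (list of objective tuples)."""
    n = len(front)
    distance = [0.0] * n
    if n == 0:
        return distance
    for k in range(len(front[0])):
        order = sorted(range(n), key=lambda i: front[i][k])
        lo, hi = front[order[0]][k], front[order[-1]][k]
        # Boundary members are always kept, so they get an infinite distance.
        distance[order[0]] = distance[order[-1]] = float("inf")
        if hi == lo:
            continue
        for pos in range(1, n - 1):
            gap = front[order[pos + 1]][k] - front[order[pos - 1]][k]
            distance[order[pos]] += gap / (hi - lo)
    return distance
```

When the combined population of size 2m is reduced to m, members of the last accepted front with a larger crowding distance would be preferred, so sparse regions of the front are retained.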
Pizzuti [75] proposed a multi-objective framework to detect community structures. It is based on the framework of non-dominated sorting GA (NSGA-II), proposed by Deb et al. [28]. The framework maximizes the internal density of a population by maximizing the in-degree of its points, and it minimizes the clustering overhead by minimizing the community fitness, proposed in [65]. Furthermore, it constructs a population of contended individuals and ranks the individuals based on the non-dominance procedure, which maximizes the dynamic similarity of a population. Thus, the Pareto front solution has the highest value of modularity. The final Pareto front solutions reflect the network's hierarchical organization, which allows the network to be analyzed at various levels.

4) FUZZY DOMINANCE-BASED METHODS
Mukhopadhyay et al. [73] proposed a multi-objective genetic method based on a fuzzy clustering procedure that optimizes the fuzzy compactness, which maximizes the internal density of each population. This procedure employs the fuzzy separation of populations, which maximizes their dynamic similarities. The crowded binary tournament is used as a selection operation, which maximizes the internal density of a population. Then, the final generated solution is a set of non-dominated solutions. The method employs a uniform crossover with a random mask for constructing offspring solutions, and for each non-dominated solution, the population label vector is obtained from the solution by giving each point to the population with the highest membership. The method employs the following dissimilarity measure: The distance measurement between two points in two category objects is the total mismatches of their corresponding attribute categories, which maximizes the dynamic similarity of a population.
Farina and Amato [37] introduced a method for fuzzy-based dominance and optimality of solutions. This method extends the notions of k-optimality and (1 − k)-dominance using fuzzy relations. Specifically, the method uses fuzzy arithmetic and fuzzy numbers to compute the degree to which a point v_1 is equal to or different from a point v_2 in each objective function. The method considers the dominance relation as a fuzzy relation. In addition, it maximizes the internal density and dynamic similarity by applying fuzzy arithmetic on a given objective domain search space to associate each point with the following fuzzy set: a fuzzy number for "less than," a fuzzy number for "greater than," and a fuzzy number for equality.

IV. INDICATOR-BASED OBJECTIVE CATEGORY
Indicator-based methods permit user preferences to be implicitly incorporated into the search to solve MOO problems. Indicator-based algorithms employ quality indicators to direct the selection process by assigning individuals to an objective fitness. They aim to identify a set of solutions that maximizes the fundamental quality indicator rather than optimizing the objective functions directly. The most widely used indicators are hypervolume and R2. The hypervolume indicator can be used as a measure for the quality of Pareto front approximations and a criterion for guiding the search algorithms toward Pareto fronts. On the other hand, the R2 indicator can be employed as a mutual preference based on the contribution of the population to each weight vector in a set of weight vectors by ranking them.
In Subsection A, we describe the optimization methods that maximize a population's internal density and dynamic similarity objective functions. In Subsection B, we describe the optimization methods that maximize a population's structural similarity and population separability objective functions. In Subsection C, we describe the optimization methods that maximize a population's dynamic similarity and population separability objective functions.
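For orientation, the two indicators named above can be computed as in the following sketch. It assumes two minimized objectives for the hypervolume and the commonly used unary R2 form based on the weighted Tchebycheff utility with respect to an ideal point; the function names, the reference point, and the weight set are hypothetical inputs chosen for illustration.

```python
def hypervolume_2d(points, ref):
    """Area dominated by the points and bounded by ref (two minimized objectives)."""
    inside = sorted(p for p in points if p[0] < ref[0] and p[1] < ref[1])
    volume, best_f2 = 0.0, ref[1]
    for f1, f2 in inside:
        if f2 < best_f2:  # dominated points add nothing and are skipped
            volume += (ref[0] - f1) * (best_f2 - f2)
            best_f2 = f2
    return volume


def r2_indicator(points, weights, ideal):
    """Mean over weight vectors of the best weighted Tchebycheff value (lower is better)."""
    total = 0.0
    for w in weights:
        total += min(
            max(wj * abs(pj - zj) for wj, pj, zj in zip(w, p, ideal))
            for p in points
        )
    return total / len(weights)
```

With the front (1, 3), (2, 2), (3, 1), the reference point (4, 4) gives a hypervolume of 6.0, and the weights (1, 0), (0.5, 0.5), (0, 1) with the ideal point (0, 0) give an R2 value of 1.0. A larger hypervolume and a smaller R2 value both indicate a better approximation set.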

A. MAXIMIZING A POPULATION'S INTERNAL DENSITY AND DYNAMIC SIMILARITY OBJECTIVE FUNCTIONS
We describe in Subsections 1 and 2 the hypervolume-based and R2 indicator-based methods that maximize a population's internal density and dynamic similarity objective functions.

1) HYPERVOLUME INDICATOR-BASED METHODS
Bader and Zitzler [12] proposed a hypervolume-based estimation method for MOO. In this method, the available computing resources and accuracy of estimates are traded off so that the multi-objective problems using hypervolume-based search become feasible, and the runtime becomes flexibly adapted. The method uses a greedy heuristic-based search to achieve an approximation. Furthermore, it maximizes the internal density of a population by employing a density-based adaptive sampling routine to estimate the hypervolume contributions, and it maximizes the dynamic similarity of a population by employing Monte Carlo simulation to approximate hypervolume values, then dynamically ranking solutions based on these values. Solutions are evaluated based on their usefulness, and those solutions that do not exceed a predefined parameter are considered unimportant and are removed.
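The idea of approximating hypervolume values by Monte Carlo simulation can be sketched as below. This is a plain uniform-sampling estimate of the total hypervolume under stated assumptions (minimized objectives, a sampling box spanned by a hypothetical `ideal` corner and the reference point `ref`); it does not reproduce the adaptive sampling routine or the per-solution contribution estimates of the surveyed method.

```python
import random


def mc_hypervolume(points, ref, ideal, samples=20000, seed=0):
    """Estimate the hypervolume by sampling the box between ideal and ref."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        s = [rng.uniform(lo, hi) for lo, hi in zip(ideal, ref)]
        # A sample counts if at least one point weakly dominates it.
        if any(all(pj <= sj for pj, sj in zip(p, s)) for p in points):
            hits += 1
    box = 1.0
    for lo, hi in zip(ideal, ref):
        box *= hi - lo
    return box * hits / samples
```

The number of samples trades accuracy for runtime, which mirrors the resource and accuracy trade-off described in the paragraph above.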
Cinalli et al. [17] employ a collective intelligence operator that biases the search and limits the objective space during the optimization phase. This causes successive stages of evolution to be improved using dynamic group contributions. The method changes the optimization goal through an interactive procedure by employing a weighted hypervolume indicator. The preferred solutions in a current population are consequently indicated, which impacts the weight function used by the hypervolume indicator. Through the interaction, collective intelligence reference points are identified. This method maximizes the internal density of a population by aggregating multiple indicator points based on various opinions, which results in an accurate representation of preferences. In addition, the method maximizes the dynamic similarity of a population by dynamically measuring the minimum distance between a current approximation set and the Pareto-optimal front using the front coverage indicator.
Cao et al. [18] proposed a multi-objective framework for determining the sets of points that maximize the influence of the hypervolume indicator on optimal point distributions. In terms of a density of points, the framework derives a limit result for points going to infinity. To achieve extreme points in an optimal point distribution in the Pareto front, the framework derives lower bounds for placing the reference points. The framework maximizes the internal density of a population by increasing the number of points to infinity, which leads to optimizing the density of points associated with the hypervolume indicator for the sake of optimal point distributions. The framework maximizes the dynamic similarity of a population by minimizing the distances between reference points dominated by the nadir points.
Auger et al. [4] proposed a framework for spreading finite sets of solutions over the Pareto front of multi-objective problems to maximize the hypervolume indicator. Toward this end, the framework characterizes the density via the optimal distribution of the points that maximize the hypervolume indicator. These distributions are performed based on the density that approximates the percentage of points in each portion of the front. Let N_d be the negative of the derivative of the front. The framework maximizes the internal density of a population by converging points to a density proportional to (N_d)^2, and it maximizes the dynamic similarity of a population by distributing points dynamically and uniformly with similar distance. The authors concluded that the shape of the Pareto front determines the optimal distribution of the points that maximize the hypervolume indicator.
Emmerich et al. [32] proposed a framework that uses cone-based hypervolume indicators (CHI) as a generalization of the hypervolume indicator (HI) in Pareto optimization. The framework replaces the classical HI by CHI through the use of γ-cones in hypervolume-based algorithms. The framework distributes points uniformly and maximizes the internal density of the cone-based Pareto front approximations as follows: (1) computing for each finite set the cone non-dominated subset and (2) building the base vectors of each point based on its angle parameter. The framework maximizes the dynamic similarity of a population by employing the Manhattan distance to compute the optimal µ-distribution for each γ approaching zero.

2) R2 INDICATOR-BASED METHODS
Wei et al. [99] proposed a many-objective particle swarm optimizer based on the R2 indicator to achieve better convergence, which maximizes the internal density of a population. This method employs a bi-level archive-maintaining strategy based on the R2 indicator and objective space decomposition to maintain well-distributed solutions, which maximizes the dynamic similarity of a population. A leader pool that connects the decision variable space and the objective space includes the following components: an objective space decomposition leader, a personal-best leader, and a global-best leader. The method maximizes the internal density of a population by employing a parametric probability density function that uses the selected personal-best leader, global-best leader, and current particle. By maximizing the dynamic similarity, the objective space decomposition leader and the global-best leader are selected dynamically by fetching feedback information from the bi-level archive. The decomposition procedure prunes the candidate solutions; however, no explicit archive maintenance strategy is introduced.
Wagner et al. [97] proposed a multi-objective method that integrates preferences into the R2 indicator. Specifically, this method optimizes the generation of weight vectors in such a way that preferences regarding the extremes of the front are increased. This is because the optimal distribution of solution sets based on R2 can be influenced by the weight vector distribution. The internal density of a population is maximized by (1) shifting the weight vectors' density away from the center of the Pareto front, (2) moving the positions of solutions toward the extremes of the front for weight vectors' coarser density at the center of the front, and (3) restricting the weight space when moving the reference point. The dynamic similarity of a population is maximized by more strongly dynamically skewing the initial uniform distribution.
Brockhoff et al. [10] proposed a method to achieve the optimal approximate µ-distributions for the R2 indicator based on uniform weight distributions. For the optimal placement of a point according to the R2 indicator, this method places each point based on its neighbors and weight vectors. The method shifts the points' R2 contributions toward the center of the front, which maximizes the internal density of a population. The distances between neighboring points are smaller at the front's middle and larger at the extremes. These distances are larger in angle space than in weight space. By maximizing the dynamic similarity of a population, each point is driven dynamically to cover a subset of weight vectors in its direct neighborhood to select the best solution for each weight vector of the R2 indicator.

B. MAXIMIZING A POPULATION'S STRUCTURAL SIMILARITY AND SEPARABILITY OBJECTIVE FUNCTIONS

1) HYPERVOLUME INDICATOR-BASED METHODS
Igel et al. [48] proposed a variant of the covariance matrix adaptation evolution strategy (CMA-ES) for MOO, called MO-CMA-ES, which maintains a population of individuals that adapt their search strategy, similar to CMA-ES. The strategy adaptation technique of MO-CMA-ES is combined with the following two multi-objective sorting criteria: (1) contributing hypervolume to improve the selection and (2) non-dominated sorting with crowding distance, which maximizes the structural similarity of a population. This method employs the non-dominated sorting approach NSGA-II [28] and maximizes the separability of populations by employing the combination of crowding distance and non-dominance sorting, which preserves diversity.
Ulrich et al. [91] proposed a method that integrates decision space diversity into hypervolume-based multi-objective search. This method employs a modified version of the hypervolume indicator and integrates it into an evolutionary algorithm. Since structural characteristics in decision space can reveal valuable insights, the method maximizes the structural similarity of a population by searching for structurally diverse Pareto-set approximations. Furthermore, the method maximizes the separability of populations by weighting them based on the diversity of their dominating points and then summing them. Consider a population with width b and a set of solutions A ⊆ X. The coverage diversity Dc(A) is computed from the decision variable values x_i and z_i, which are the i-th decision variable values of solutions x and z, respectively.

2) R2 INDICATOR-BASED METHODS
Li et al. [67] proposed a many-objective optimization method based on the enhanced R2 indicator, called TS-R2EA. This method employs the following two-stage selection strategy: (1) R2 indicator as the primary selection and (2) reference vector guided as the secondary selection. The method combines the R2 indicator and reference vector guided selections for the sake of achieving both convergence (which maximizes the structural similarity of a population) and diversity (which maximizes the separability of populations). To maximize the structural similarity of a population, TS-R2EA employs, as the primary selection strategy, an R2 indicator-based achievement scalarizing function. To maximize the separability of populations, TS-R2EA employs a secondary selection strategy, the reference vector guided objective space methodology used in diversity management. By maximizing both the structural similarity and population separability, the two selection strategies achieve a balance between diversity and convergence.
Manriquez et al. [68] proposed a method that ranks individual solutions of multi-objective evolution algorithms by integrating the R2 indicator and a modified version of the non-dominated sorting approach proposed by Goldberg [41]. This method maximizes the structural similarity of a population using variation operators, which adjust the weight vectors at each iteration. The utopian point continues to be updated following the generation of each new solution using the variation operators. The point is identified based on the best obtained values for each objective by checking whether there exists an objective value better than the current utopian point in each new solution. The method maximizes the separability of populations by employing a combination of the differential evolution recombination operator and the nondominated sorting procedure.
Trautmann et al. [86] proposed an indicator-based evolutionary multi-objective method that employs the R2 indicator as a secondary selection criterion. This method maximizes the separability of populations by focusing the search behavior, which it achieves by adjusting the distribution of the R2 indicator's weight vectors. The method maximizes the structural similarity of a population by employing a greedy procedure that reproduces the approximated optimal distributions of µ points.

C. MAXIMIZING A POPULATION'S DYNAMIC SIMILARITY AND SEPARABILITY OBJECTIVE FUNCTIONS
We describe in Subsections 1 and 2 the hypervolume-based and R2 indicator-based methods that maximize a population's dynamic similarity and separability objective functions.

1) HYPERVOLUME INDICATOR-BASED METHODS
Beume et al. [11] proposed an evolutionary multi-objective optimization method that combines the non-dominated sorting procedure and hypervolume measure to feature a selection operator. This method maximizes the separability of populations by (1) employing the non-dominated sorting approach proposed by Goldberg [41] as a ranking criterion and (2) applying the hypervolume as a selection criterion to discard the individual that contributes the least hypervolume. The dynamic similarity of a population is maximized by handling the reference point and dynamically ranking the hierarchical levels of domination. After initializing the population, randomized variation operators are used to generate new individuals. A new individual is considered a member of the subsequent population if the population's quality can be improved by replacing another member with it.
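The discard step based on hypervolume contributions can be sketched for two objectives as follows. The sketch assumes minimized objectives, a front of mutually non-dominated points, and a reference point that is worse than every point in both objectives; the function names are hypothetical, and in a full algorithm the step would be applied to the worst-ranked front only.

```python
def least_contributor(front, ref):
    """Index of the point with the smallest exclusive hypervolume contribution.

    Assumes two minimized objectives, mutually non-dominated points, and
    every point strictly better than ref in both objectives.
    """
    order = sorted(range(len(front)), key=lambda i: front[i][0])
    worst, worst_gain = None, float("inf")
    for pos, i in enumerate(order):
        # The exclusive region is a rectangle bounded by the two neighbors.
        next_f1 = front[order[pos + 1]][0] if pos + 1 < len(order) else ref[0]
        prev_f2 = front[order[pos - 1]][1] if pos > 0 else ref[1]
        gain = (next_f1 - front[i][0]) * (prev_f2 - front[i][1])
        if gain < worst_gain:
            worst, worst_gain = i, gain
    return worst


def discard_least_contributor(front, ref):
    """Return the front without its least-contributing point."""
    drop = least_contributor(front, ref)
    return [p for i, p in enumerate(front) if i != drop]
```

In two dimensions the exclusive contribution of a point is the rectangle between its two neighbors on the front, so the whole step costs only one sort.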
Ishibuchi et al. [50] proposed a two-objective optimization method that maximizes the hypervolume in the objective space and maximizes diversity in the decision space. This method maximizes the separability of populations by employing the Solow-Polasky diversity measure. Furthermore, it maximizes the dynamic similarity of a population by (1) employing the Euclidean distance to dynamically measure the distances between solutions and (2) maximizing the objective space hypervolume. Let H (S) and D(S) denote the hypervolume measure of a solution in set S and the Solow-Polasky diversity measure in the decision space, respectively. The two-objective problems are formulated as follows: Maximize H (S) and D(S), where |S| = m, in which m is a predefined integer parameter, the number of solutions in S.
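The Solow-Polasky measure D(S) used above is commonly defined as the sum of all entries of the inverse of the matrix M with entries M_ij = exp(−θ d(x_i, x_j)), where d is the distance between two solutions in the decision space. A minimal sketch under that definition follows; the parameter `theta`, its default value, and the plain Gauss-Jordan solver are assumptions made for this example.

```python
from math import dist, exp


def solow_polasky(solutions, theta=1.0):
    """Solow-Polasky diversity: sum of the entries of inverse(M).

    M[i][j] = exp(-theta * d(x_i, x_j)). The sum is obtained by solving
    M y = 1 and adding up y. Duplicate solutions make M singular, so the
    solutions are assumed to be pairwise distinct.
    """
    n = len(solutions)
    # Augmented matrix [M | 1].
    m = [[exp(-theta * dist(a, b)) for b in solutions] + [1.0] for a in solutions]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return sum(m[i][n] / m[i][i] for i in range(n))
```

The value ranges from 1 for solutions that are nearly identical up to |S| for solutions that are very far apart, so it can be read as an effective number of distinct solutions.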

2) R2 INDICATOR-BASED METHODS
Fei et al. [34] proposed an evolutionary method based on the R2 indicator and decomposition procedure to achieve a well-converged and well-distributed Pareto front. The method maximizes the separability of populations by (1) adopting the objective space population strategy and (2) ensuring diversity by generating a set of reference vectors using a two-layer reference vector generation procedure. The dynamic similarity of a population is maximized by dynamically (1) bridging between the decomposition R2 indicator using the set of reference vectors and (2) discarding a solution that achieves the worst performance in crowded subspaces. First, the objective space is repopulated into a number of subspaces if the ideal point's position is changed. Then, the R2 indicator and decomposition procedure are performed to prune the combined solutions.
Gómez and Coello [39] proposed a many-objective optimization method based on the R2 indicator. This method updates reference points based on statistical information of previous generations of individuals and ranks the solutions dynamically. The method maximizes the separability of populations by adopting a technique that serves as a cut-off for objective space, where outliers are removed. The technique utilizes the statistical information of previous generations to keep track of the parent population's nadir point at each generation to determine the closeness of individuals to the true Pareto front. Reference points are updated accordingly. A small variance indicates that the solutions are too close, which requires small movements to be performed. Meanwhile, a large variance indicates that the solutions are far apart. The method maximizes the dynamic similarity of a population by dynamically (1) ranking solutions by placing those that optimize the set of weight vectors at the top and (2) using the Euclidean distance to remove weaker solutions that have the same utility value as another solution but a lower Euclidean distance.
Gómez and Coello [38] proposed a many-objective optimization method based on the R2 indicator. This method employs a non-dominated sorting scheme and Tchebycheff values to rank and group the solutions. Furthermore, the method maximizes the dynamic similarity of a population by employing a non-dominated sorting scheme to group similar solutions. The solutions are ranked dynamically by placing those that optimize the set of weight vectors on top. The method maximizes the separability of populations by using Tchebycheff values to serve as a cut-off for objective space. For every two individuals that have a Tchebycheff value, only the individual with the lower Tchebycheff is retained.

V. REFERENCE POINT-BASED OBJECTIVE CATEGORY
Reference-based methods aim to solve MOO problems by interactively representing a decision maker's preferences through points in the objective space called reference points. Reference-points-based methods enhance the pressure of selection toward the Pareto front. Moreover, they maintain uniform distribution among different solutions. A typical interactive multi-criterion optimization method requires a decision maker to indicate reference points that yield preferred solutions on the Pareto-optimal front. A decision maker's preferences constitute the components of a reference point and are conveyed as desirable values on each objective. Most preference-based MOO methods employ only one reference point. However, employing a series of reference points can solve many-objective optimization problems and obtain the whole Pareto front. In interactive-based methods, the preferences of the decision maker can expand by using the following iterative procedure: obtaining optimization results using current preferences and obtaining new preferences based on feedback from the decision maker regarding current solutions.
In Subsection A, we describe the optimization methods that maximize a population's internal density and population separability objective functions. In Subsection B, we describe the optimization methods that maximize a population's dynamic similarity and population separability objective functions. In Subsection C, we describe the optimization methods that maximize a population's internal similarity and dynamic similarity objective functions.

A. MAXIMIZING A POPULATION'S INTERNAL DENSITY AND SEPARABILITY OBJECTIVE FUNCTIONS
We describe in Subsections 1 and 2 the local search-based and global search-based methods that maximize a population's internal density and separability objective functions.

i. DECOMPOSITION-BASED METHODS
Hu et al. [46] proposed a method that solves the multi-objective minimum-weighted node problem by combining neighborhood search and decomposition procedure to decompose the problem into scalar optimization subproblems. This method uses one of the objective space's points as a reference point to result in the population's convergence. The method maximizes the internal density of a population by searching for nodes that have both high degrees and large weights. Toward this end, the method assigns a score to each node based on its degree and weight, called WDscore. To maximize the separability of populations, this method ensures the diversity of the population by restricting the list of nodes based on their WDscores. For each individual, an iterated neighborhood search is performed to improve the current solution.
Konstantinidis et al. [55] proposed a multi-objective evolutionary framework to search for objects/users in a mobile social community. The framework maximizes the separability of populations by employing decomposition to identify a diverse set of non-dominated objects in a single run. The method employs an a priori reference point to manage a tradeoff between the following two objectives: (1) maximizing the internal density of a population by increasing the recall rate of user querying and (2) minimizing the query response time in performing a search. The pre-processing phase includes (1) representing a query and (2) dynamically decomposing the problem into m solutions for the initial population by employing any internal density aggregating function. The method employs an optimizer for identifying a diverse set of non-dominated objects, which facilitates the query's resolution.

ii. DOMINANCE-BASED METHODS
Li and Deb [62] proposed a multi-objective evolutionary method that combines dominance and decomposition procedures to balance the convergence and diversity of the process. This method employs a weight vector as a reference point to guide the selection procedure for each solution. The method maximizes the separability of populations by (1) preserving diversity in the population by estimating its density using the local niche count of a sub-region, (2) updating the population of the last non-domination level's worst solution if it is associated with an isolated sub-region, and (3) using a penalty-based boundary intersection function. Furthermore, the method maximizes the internal density of a population by (1) decomposing a multi-objective problem into single-objective optimization problems using aggregation functions, and (2) updating the population in a hierarchical manner using local density estimation. The method applies a non-dominated sorting procedure to divide the population into non-domination levels based on the Pareto dominance.
Deb et al. [22] employed the concept of the reference point in a multi-objective optimization to identify a set of Pareto-optimal solutions close to the decision maker's regions of interest. This method employs the elitist non-dominated sorting NSGA-II [28] and predator-prey procedure [63]. The proposed method employs a reference-point-based procedure in which a decision maker can provide reference points. The method maximizes the separability of populations by preserving diversity by accepting a newly created child only if (1) it weakly dominates all prey and (2) it is not within a predefined region of prey. The method maximizes the internal density of a population by employing the elitist non-dominated sorting NSGA-II [28]. After combining parent and offspring populations, non-dominated sorting is performed to classify the population into various levels of non-domination. The Euclidean distance of each front's solution is computed for each reference point. The solutions are ranked based on their distances.
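The distance-based ranking described at the end of the paragraph above can be sketched as follows. The sketch assumes that each solution receives, for every reference point, a rank given by its Euclidean distance to that point, and that the smallest rank over all reference points is kept; the function name is hypothetical, and objective normalization is omitted for brevity.

```python
from math import dist


def preference_ranks(front, reference_points):
    """Best (smallest) distance rank of each solution over all reference points."""
    ranks = [len(front)] * len(front)
    for r in reference_points:
        order = sorted(range(len(front)), key=lambda i: dist(front[i], r))
        for rank, i in enumerate(order, start=1):
            ranks[i] = min(ranks[i], rank)
    return ranks
```

A solution with rank 1 is the closest one to at least one reference point, so selection by ascending rank keeps solutions near every region of interest rather than near a single one.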

i. DECOMPOSITION-BASED METHODS
Liu et al. [60] proposed a decomposition-based multi-objective evolutionary method that employs adaptively generated reference points to achieve convergence and good distribution of a population. This method maximizes the separability of populations by increasing their diversities as follows: (1) defining a set of well-distributed weight vectors to direct the population's individuals to search in different directions simultaneously, and (2) selecting an individual only once if it is the best candidate, using several reference points. The method maximizes the internal density of a population by increasing the selection pressure according to the density of the population at the Pareto front. Toward this end, it computes the Tchebychev distances between individuals and the reference points and selects the individuals with small distances.
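The selection rule described above, in which each individual is chosen at most once and individuals with small Tchebychev distances to the reference points are preferred, can be illustrated with a greedy sketch. The processing order of the reference points and the function names are assumptions made for this example and are not taken from the surveyed method.

```python
def chebyshev(a, b):
    """Tchebychev (maximum-coordinate) distance between two points."""
    return max(abs(x - y) for x, y in zip(a, b))


def select_by_reference_points(population, reference_points):
    """For each reference point, pick the closest individual not chosen yet."""
    chosen = []
    for r in reference_points:
        candidates = [p for p in population if p not in chosen]
        if not candidates:
            break
        chosen.append(min(candidates, key=lambda p: chebyshev(p, r)))
    return chosen
```

Because an individual that is already selected is skipped, two nearby reference points cannot both select the same individual, which preserves diversity in the selected set.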
Asafuddoula et al. [3] proposed a decomposition-based evolutionary method with systematic sampling and adaptively generated reference points. This method maximizes the separability of populations by preserving diversity through an adaptive epsilon control scheme. It maximizes the internal density of a population by ensuring that the neighborhood of each reference point p consists of several reference points whose Euclidean distances to p are small. A systematic sampling is used to generate the reference directions, and an epsilon comparison is used to deal with constraints.

ii. DOMINANCE-BASED METHODS
Thiele et al. [89] proposed a dominance-based multi-objective evolutionary method that employs adaptively generated reference points for maximizing the internal density of each population and the separability of the populations. This method maximizes the internal density of a population by employing an agglomerative internal density function. In addition, the method maximizes the separability of populations by preserving diversity. This is performed by applying knowledge about interesting regions of the search space to the initial mating pool. Non-dominated (Pareto-optimal) solutions are employed. A concept called weakly non-dominated solutions is adopted, in which all inequalities are replaced by strict inequalities. The decision maker reveals the desirable reference point at each iteration, which is used to generate better solutions.
Deb and Jain [26] proposed an evolutionary method based on adaptively generated reference points and a modified procedure of the non-dominated sorting approach NSGA-II [28]. This method maximizes the separability of populations by preserving diversity among population members while emphasizing structural similarity. It selects only members that are closest to the reference points, which results in a wide diversity of solutions. The clustering-based selection procedure employed by the method emphasizes the maximization of the internal density of each population.
Wang and Yao [98] proposed a novel two-archive algorithm (TAA) and its improved version (Two_Arch2), which separates the non-dominated solutions of each generation into two archives, namely the convergence archive (CA) and diversity archive (DA). The CA can be seen as an online-updated real reference set and contains only non-dominated solutions that once dominated some existing archive members. When the total number of solutions in the union of CA and DA overflows, the solution in DA with the shortest distance to CA is removed iteratively until the archives satisfy the constraint.

2) GLOBAL SEARCH-POSTERIORI REFERENCE
Dujardin and Chadès [30] proposed a global reference-point-based multi-objective method for solving environmental investment decision-making problems. This method was demonstrated to solve spatial allocation resource management and dynamic multi-species management problems. It maximizes the separability of populations by preserving diversity by ensuring that each cluster contains individuals of different types. In addition, the maximization of the internal density of a population is managed by maximizing the total number of individuals in each cluster.
Emmerich et al. [33] proposed a steady-state evolutionary method using a global posteriori reference point that covers a maximal hypervolume. It employs the non-dominated sorting approach of NSGA-II [28] as a ranking criterion and a hypervolume measure as a selection operator. This method maximizes the internal density of a population by increasing the density in regions with fair trade-offs. The method maximizes the separability of populations by preserving diversity through the adoption of a steady-state scheme, which is easily parallelized. The method joins a new point p to the population if an increase of the hypervolume covering the population can be achieved by replacing an existing member by p. The hypervolume is applied to discard individuals contributing the lowest hypervolume to the worst-ranked Pareto-optimal front.
Reihanian et al. [78] proposed a multi-objective evolutionary method that employs a rank-based global migrant operator reference to solve global optimization problems needed for finding overlapping communities in a social network with available node attributes. This method maximizes the internal density of a population by increasing the connection density among the individuals of the population. In addition, the method maximizes the separability of populations by preserving the diversity by producing populations with similar nodes' attributes. The result is a set of non-dominated populations of a network with dense connections and similar nodes' attributes. The method measures the link closeness (LC) between two neighboring sub-graphs G_{KN_1} and G_{KN_2} based on the number of links between G_{KN_1} and G_{KN_2}, which is computed from the adjacency matrix A of the social network G.

B. MAXIMIZING A POPULATION'S DYNAMIC SIMILARITY AND SEPARABILITY OBJECTIVE FUNCTIONS
We describe in Subsections 1 and 2 the local search-based and interactive reference-based methods that maximize a population's dynamic similarity and separability objective functions.

i. DECOMPOSITION-BASED METHODS
Li et al. [59] proposed a decomposition-based multi-objective method that employs a selection operator to serve as a reference point to construct interrelationships between solutions and sub-problems based on their mutual preferences. This method maximizes the separability of populations by ensuring the diversity of the search process, which it achieves by promoting the mutual preferences between solutions and sub-problems using the reference point selection operator. Specifically, the diversity is promoted by ensuring that sub-regions are as sparse as possible in the objective space. Furthermore, the method maximizes the dynamic similarity of a population by employing a distance-ordering matrix and dynamically associating a solution s to a sub-problem p if the perpendicular distance between s and p is small.
Zhang et al. [107] proposed a multi-objective evolutionary method that decomposes a problem into scalar optimization sub-problems. This method optimizes each sub-problem based on its neighboring sub-problems' information. The method applies a priori reference point on scalar optimization problems by minimizing the scalar function. Furthermore, the method maximizes the separability of populations by preserving diversity by applying scalar optimization on problems rather than directly solving them as a whole. The diversity of the resulting sub-problems results in diversity in the entire population. The method maximizes the dynamic similarity of a population by dynamically measuring the closeness between neighboring weight vectors using the Euclidean distance. The solutions to two neighboring sub-problems are considered optimal if they are similar.
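The neighborhood structure described above can be sketched in Python as follows. The sketch assumes a weighted-sum scalar function, which is only one of several scalarizations that a decomposition-based method may use, and the names `weighted_sum` and `neighborhoods` are illustrative rather than taken from [107].

```python
import math


def weighted_sum(objs, weights):
    """Scalar value of one sub-problem (smaller is better for minimization)."""
    return sum(w * f for w, f in zip(weights, objs))


def neighborhoods(weight_vectors, t):
    """For each weight vector, the indices of its t nearest vectors (itself included)."""
    result = []
    for wi in weight_vectors:
        # Rank all weight vectors by Euclidean distance to the current one.
        order = sorted(
            range(len(weight_vectors)),
            key=lambda j: math.dist(wi, weight_vectors[j]),
        )
        result.append(order[:t])
    return result
```

Each sub-problem would then be optimized using only the solutions of the sub-problems listed in its neighborhood.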

ii. DOMINANCE-BASED METHODS
Branke and Deb [9] proposed a method that employs a user-predefined reference point for identifying solutions close to the best-found solution of the utility function based on elitist non-dominated sorting. The method maximizes the separability of populations by preserving diversity through a parameter that controls the extent of diversity required for achieving optimal solutions. The method maximizes the dynamic similarity of a population by computing the crowding distance values as the ratio of neighboring solutions' distances in the original objective space and projected hyperplane. The preferred solutions have a large crowding distance and reside on a plane parallel to the selected hyperplane.
Deb and Jain [25] proposed a multi-objective evolutionary method that employs a predefined set of reference points and the non-dominated sorting approach NSGA-II [28] to identify the set of points close to the reference points to ensure diversity in the achieved solutions. This method maximizes the separability of populations by promoting diversity preservation by providing (1) a set of well-distributed reference points and (2) a niching procedure for identifying a Pareto-optimal solution associated with each reference point. The method maximizes the dynamic similarity of a population by dynamically associating a point p to a reference line l if the perpendicular Euclidean distance between p and l is small. The point with the smallest perpendicular distance from the reference line is selected.
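The association step described above can be illustrated with a short Python sketch. It assumes that objective vectors have already been normalized and that each reference line passes through the origin and a nonzero reference point; the function names are hypothetical.

```python
import math


def perpendicular_distance(point, direction):
    """Distance from point to the line through the origin along direction."""
    norm_sq = sum(d * d for d in direction)
    # Length of the projection of point onto the line, in units of direction.
    scale = sum(p * d for p, d in zip(point, direction)) / norm_sq
    return math.sqrt(sum((p - scale * d) ** 2 for p, d in zip(point, direction)))


def associate(point, reference_points):
    """Index of the reference line with the smallest perpendicular distance."""
    return min(
        range(len(reference_points)),
        key=lambda j: perpendicular_distance(point, reference_points[j]),
    )
```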
Sindhya et al. [80] proposed a method that applies a priori reference point on scalar optimization problems and employs it as a search operator of an evolutionary multi-objective optimization algorithm. The method employs the non-dominated sorting approach of NSGA-II [28] as a ranking criterion. Furthermore, it employs a population-based evolutionary algorithm to serve as a global optimizer in addition to a mathematical programming approach to serve as a local search procedure. The method maximizes the separability of a population by preserving diversity through minimizing the scalar function and applying the NSGA-II crowding distance operator, which leads to diversity in the entire population. A local search procedure is employed to solve an augmented achievement scalarizing function. The method maximizes the dynamic similarity by dynamically applying the crowding distance operator.
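For readers unfamiliar with the crowding distance operator mentioned above, the following Python sketch shows the usual NSGA-II style computation for a single non-dominated front. It is a generic illustration under that assumption, not the implementation used in [80], and the function name is hypothetical.

```python
def crowding_distance(front):
    """Crowding distance of each objective vector in a non-dominated front."""
    n = len(front)
    distance = [0.0] * n
    if n == 0:
        return distance
    for k in range(len(front[0])):
        order = sorted(range(n), key=lambda i: front[i][k])
        f_min, f_max = front[order[0]][k], front[order[-1]][k]
        # Boundary solutions get an infinite distance so they are always kept.
        distance[order[0]] = distance[order[-1]] = float("inf")
        if f_max == f_min:
            continue
        for pos in range(1, n - 1):
            gap = front[order[pos + 1]][k] - front[order[pos - 1]][k]
            distance[order[pos]] += gap / (f_max - f_min)
    return distance
```

Solutions with a larger crowding distance lie in less crowded regions and are preferred when two solutions have the same non-domination rank.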

i. DECOMPOSITION-BASED METHODS
Mohammadi et al. [71] proposed a decomposition-based multi-objective evolutionary method that employs adaptively generated reference points for providing the efficient tracking of problems. The method selects the weight vector associated with the closest point to each of the reference points to serve as the base vector, and the set of newly generated weight vectors resides around this base vector. The method maximizes the dynamic similarity of a population by dynamically (1) adapting the weight vectors and (2) measuring the closeness between the closest point in the objective space and each of the user reference points using the Euclidean distance. The method maximizes the separability of populations by constructing a small-sized set of weight vectors to identify a solution as close as possible to (1) the user reference point and (2) the Pareto-optimal front.
Deb and Jain [26] proposed a decomposition-based multi-objective evolutionary method that employs adaptively generated reference points to maintain a good distribution of the reference points. This method maximizes the separability of populations by promoting diversity preservation through combining a clustering-based selection procedure and a preference scheme for less crowded reference points to maintain well-diversified solutions. The method maximizes the dynamic similarity of a population by dynamically computing (1) the Euclidean distance between the closest obtained point and the projected reference point and (2) the average distance between each neighboring pair of reference points. The method systematically maintains members of a population that are both non-dominated and close to the set of well-distributed reference points. At the end, the number of identified trade-off points depends on the number of selected reference points.
Tian et al. [88] proposed a decomposition-based multi-objective evolutionary method that employs a reference point adaptation procedure to improve the versatility of the algorithm. This method maximizes the separability of populations by preserving diversity through adjusting the reference points according to candidate solutions' indicator contributions in an external archive. The reference points' adjustment occurs at each generation for the indicator calculation. Furthermore, the method maximizes the dynamic similarity of a population by dynamically generating the distance indicator to adjust the reference points. The method is parameterless, which makes it easy to deploy in most decomposition-based algorithms.

ii. DOMINANCE-BASED METHODS
Branke et al. [8] proposed a multi-objective method that employs adaptively generated reference points and a dominance scheme. According to the proposed dominance scheme, a guided non-domination ranking is applied, which is a topological sorting. This method maximizes the separability of populations by supporting diversity along the Pareto-optimal front, as follows. The niche count is computed for each individual i, which is the number of individuals in the neighborhood of i. This diversity procedure prevents different levels from converging in the same Pareto-optimal front. In addition, the method maximizes the dynamic similarity of a population by dynamically ranking the individual i based on the described procedure. Each ranked individual is then removed from the population, and the same procedure is repeated for the remaining individuals, for which the new best rank is awarded adaptively.
Jain and Deb [54] proposed a multi-objective method that adaptively associates a reference point based on its proximity to the ideal point obtained. This method employs the non-dominated sorting approach of NSGA-II [28] as a ranking criterion. The proposed method ensures population diversity by spreading the population along the entire front and finding an associated Pareto-optimal solution for each reference point. It maximizes the dynamic similarity of a population by ranking and constructing a hierarchical organization of the population dynamically by employing a dynamic similarity procedure. The method provides a denser representation of the Pareto-optimal front than NSGA-II. The method solves constrained problems of the following type: i , i = 1, 2, . . . , n.
Jain and Deb [52] proposed a multi-objective method that employs adaptively generated reference points and a dominance scheme to achieve better distribution of the Pareto-optimal points. This method maximizes the separability of populations by selecting members in such a manner that the desired diversity is maintained in a population. Toward this end, the method yields an identical range for reference points and objective values by normalizing them. Moreover, the method maximizes the dynamic similarity of a population by dynamically joining the ideal member and a reference point by computing the orthogonal distance between each member and each of the reference lines. The member with the smallest orthogonal distance from a reference point is associated with this point. Finally, populations are combined and sorted based on their level of domination.
Yuan et al. [104] proposed a multi-objective method that employs adaptively generated reference points and a new dominance relation for achieving better diversity and convergence. The method maximizes the separability of populations by preserving diversity through ensuring that selected solutions are distributed evenly among clusters using the non-dominated sorting scheme. Toward this end, only solutions with a competitive relationship within the same cluster are retained. This is ensured by making certain that a fitness function similar to the penalty-based boundary intersection function is constructed. In addition, this prefers solutions that have good fitness values in each cluster. The method maximizes the dynamic similarity of a population by dynamically ranking solutions in the selection phase by employing the non-dominated sorting scheme. The solution with the shortest perpendicular distance to the reference point is selected.

2) GLOBAL SEARCH-POSTERIORI REFERENCE
Miettinen et al. [72] proposed a multi-objective method in which the analyst selects a global posteriori reference point to be established as the ideal objective vector. This method maximizes the separability of populations by preserving diversity, ensuring that each solution is unique. The method maximizes the dynamic similarity of a population by dynamically minimizing the distance between the objective region and desirable posterior reference points. The achieved solution depends on the employed distance measure. For $1 \le p < \infty$, we have the following problem:

$$\min \left( \sum_{i=1}^{n} w_i \left| f_i(x) - z_i^{*} \right|^{p} \right)^{1/p} \quad \text{subject to } x \in S,$$

where $z^{*}$ is the reference point and $w_i$ are the weights. The exponent $1/p$ can be dropped. The following weighted Chebyshev problem can also be used:

$$\min \max_{i = 1, \ldots, n} \left[ w_i \left| f_i(x) - z_i^{*} \right| \right] \quad \text{subject to } x \in S.$$

The solution of the problem above is weakly Pareto optimal for positive weights, and the problem has at least one Pareto-optimal solution.
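The effect of the distance measure on the achieved solution can be seen in a small Python sketch. It assumes the commonly used weighted $L_p$ and weighted Chebyshev distances to a reference vector; the weights, the reference vector, and the function names are illustrative assumptions and not part of [72].

```python
def weighted_lp(objs, ref, weights, p):
    """Weighted L_p distance between an objective vector and a reference point."""
    total = sum(w * abs(f - z) ** p for f, z, w in zip(objs, ref, weights))
    return total ** (1.0 / p)


def weighted_chebyshev(objs, ref, weights):
    """Weighted Chebyshev distance: the largest weighted deviation from ref."""
    return max(w * abs(f - z) for f, z, w in zip(objs, ref, weights))


def closest(candidates, distance):
    """Candidate objective vector with the smallest value of distance."""
    return min(candidates, key=distance)
```

With candidates `(3, 3)` and `(0, 4)`, a zero reference vector, and unit weights, `weighted_lp` with `p = 1` prefers `(0, 4)`, whereas `weighted_chebyshev` prefers `(3, 3)`.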
Kim et al. [56] proposed a multi-objective framework based on global migrant reference points that solves dynamically changing social networks. The framework employs a dynamic adaptability procedure to control the ratio of immigrants, which is the ratio of the observed and predicted fitness landscape. The method maximizes the separability of populations by preserving diversity by employing the migrant reference points to replace a proportion of the population. This is performed by optimizing the min-max cut and the global silhouette value that measures the similarity between an object and its own cluster. The method maximizes the dynamic similarity of a population by dynamically adapting to changes.

C. MAXIMIZING A POPULATION'S INTERNAL DENSITY AND DYNAMIC SIMILARITY OBJECTIVE FUNCTIONS
We describe in Subsections 1 and 2 the local search-based and global search-based methods that maximize a population's internal density and dynamic similarity objective functions.

i. DECOMPOSITION-BASED METHODS
Giagkiozis et al. [42] proposed a generalized decomposition-based framework that employs priori reference points represented by a set of weighting vectors close to the regions of interest. This framework allows the decision maker to guide the proposed algorithm toward certain regions of interest in the population. First, the method generates N equally spaced vectors. The method then maximizes the dynamic similarity of a population by dynamically evaluating the newly generated solution and updating the ideal vector for each individual i ∈ {1, . . . , N }. If the new solution outperforms the previous solution in the archive, the ith problem is swapped with the new solution. Furthermore, the method maximizes the internal density of a population by applying a non-parametric density estimation that minimizes the variance of the estimator.
Deb [24] proposed a method that combines decomposition strategies with priori reference points for enabling the algorithm to direct the search on more desirable regions to effectively optimize many-objective problems. The internal density of a population is maximized by employing an internal density procedure based on the importance weight vector to specify the objectives' relative importance. The method maximizes the dynamic similarity of a population by dynamically adapting the weight vectors such that the algorithm can converge near the desired regions.

ii. DOMINANCE-BASED METHODS
Lancichinetti et al. [65] proposed a method that combines dominance-based strategies with a predefined resolution parameter that acts as a reference point for identifying both hierarchical structure and overlapping communities. The structure of a community is based on the peaks in the fitness histogram. The method maximizes the dynamic similarity of a community by dynamically investigating all hierarchical levels of a network based on the influences of its nodes. In addition, the method maximizes the internal density of a community by (1) maximizing the internal degree of nodes by doubling the number of internal links of the module and (2) maximizing the external degree of nodes by increasing the number of links that connect each node in the module with the remaining nodes. By maximizing the internal and external degrees of nodes, a subgraph is built by maximizing the following function:

$$f_{\partial} = \frac{k_{in}^{\partial}}{\left( k_{in}^{\partial} + k_{out}^{\partial} \right)^{\alpha}},$$

where $\alpha$ is a parameter that controls the size of the module, and $k_{in}^{\partial}$ and $k_{out}^{\partial}$ are the internal and external degrees of the nodes of module $\partial$, respectively. A subgraph is detected starting from node n in such a way that the inclusion or elimination of a new non-dominating node from the subgraph lowers $f_{\partial}$.
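The role of the parameter α can be illustrated with a minimal Python sketch. It assumes the widely cited form of the module fitness, the internal degree divided by the total degree raised to the power α; the function names are hypothetical.

```python
def module_fitness(k_in, k_out, alpha=1.0):
    """Fitness of a module from its total internal and external degrees."""
    total = k_in + k_out
    return k_in / total ** alpha if total > 0 else 0.0


def node_fitness(k_in_with, k_out_with, k_in_without, k_out_without, alpha=1.0):
    """Change in module fitness attributable to one node.

    The first pair of degrees describes the module with the node included,
    the second pair describes the module without it.
    """
    return module_fitness(k_in_with, k_out_with, alpha) - module_fitness(
        k_in_without, k_out_without, alpha
    )
```

A positive `node_fitness` means that including the node raises the module fitness. Larger values of `alpha` penalize large modules more strongly and therefore tend to yield smaller modules.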
Branke et al. [8] proposed a method that combines dominance-based strategies with predefined reference points represented by user preferences. This method maximizes the internal density of a population by maximizing the pair-wise links based on their information. In addition, the method maximizes the dynamic similarity of a population by allowing the decision maker (DM) to dynamically specify the trade-offs for each pair of objectives. For example, consider the following DM preference scenario and corresponding dominance scheme. The DM considers that an improvement of one unit in objective $f_2$ outweighs a degradation of objective $f_1$ by $a_{12}$ units. Likewise, a gain of one unit in objective $f_1$ is considered to outweigh a degradation of objective $f_2$ by $a_{21}$ units. The above information can be employed to adjust the dominance scheme.
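One way to express such a trade-off-adjusted dominance check is sketched below in Python. The sketch assumes two objectives that are minimized and the commonly cited guided-dominance form, in which the objectives are replaced by the weighted combinations $f_1 + a_{12} f_2$ and $a_{21} f_1 + f_2$ before the usual Pareto comparison. This form and the function names are assumptions made for illustration and should be checked against [8].

```python
def guided_objectives(f, a12, a21):
    """Weighted combinations of two objectives that encode the DM trade-offs."""
    return (f[0] + a12 * f[1], a21 * f[0] + f[1])


def guided_dominates(x, y, a12, a21):
    """True if objective vector x dominates y under the adjusted scheme."""
    ox = guided_objectives(x, a12, a21)
    oy = guided_objectives(y, a12, a21)
    no_worse = all(u <= v for u, v in zip(ox, oy))
    strictly_better = any(u < v for u, v in zip(ox, oy))
    return no_worse and strictly_better
```

Setting `a12 = a21 = 0` recovers the ordinary Pareto dominance relation, and larger values widen the region that a solution dominates.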

i. DOMINANCE-BASED METHODS
Liu et al. [61] proposed an evolutionary many-objective optimization method that generates reference points adaptively and selects new offspring individuals based on non-dominated sorting solutions. A series of reference points with good performance in convergence and distribution is generated according to the current population to guide the evolution. The dynamic similarity of a population is maximized by dynamically measuring the distances between individuals and reference points and selecting superior individuals accordingly. The internal density of a population is maximized by applying evolution structural similarity on its individuals. Each temporarily generated population is checked by computing the distances between its individuals and the reference points. The resulting population is constructed by selecting superior individuals through the use of non-dominated sorting.
Agrawal [6] proposed a bi-objective genetic method for community detection that employs adaptively generated reference points and the non-dominated sorting approach NSGA-II [28]. This method maximizes the internal density by maximizing the in-degree of a cluster's nodes, which in turn increases the modularity of the cluster. Moreover, the method maximizes the dynamic similarity by constructing a cluster of competing nodes and dynamically ranking them based on their non-dominance status. The objectives to be minimized are defined in terms of Q and CS, where Q denotes modularity and CS denotes a community score that maximizes the in-degree of the nodes of a cluster. The maximum modularity of a cluster is achieved through successive decomposition of the network into bi-populations.
Yang et al. [103] proposed a grid-based evolutionary method that employs adaptively generated reference points and grid dominance relations for solving many-objective optimization problems. This method aims to focus the selection toward the optimal direction and maintain uniform distribution among solutions. Furthermore, the method adopts a fitness adjustment strategy by adaptively punishing individuals based on their grid dominance relations and neighborhoods to (1) direct the search toward different directions and (2) avoid partial overcrowding. The method maximizes the internal density of a population by increasing the range of considered regions by adaptively increasing the neighborhood, whose range varies with the number of objectives. The method maximizes the dynamic similarity of a population by dynamically computing the crowding distance between an individual and its neighbors to identify those that increase density.

ii. FUZZY-BASED METHODS
Jin and Sendhoff [51] proposed a method that incorporates adaptive human preferences represented by fuzzy preferences into evolutionary multi-objective optimization. Adaptive reference points are selected by employing random weighted aggregation, and fuzzy preferences are converted into real-valued, interval-based weights. In addition, the method maximizes both the internal density and dynamic similarity of a population by dynamically combining the dynamic weighted aggregation and the weight intervals to achieve the desired Pareto-optimal solutions. A slow change in the weights of an individual will force it to continue moving gradually along the Pareto front.
Cvetković and Parmee [19] proposed a multi-objective method that adopts coarse adaptive guidance by transforming fuzzy preferences into specific quantitative weights. This method identifies not only the dominance scheme in terms of which solution is better than another for every criterion but also by how much it is better. The method maximizes the dynamic similarity of a population by dynamically assigning each criterion a weight $w_i$. The method maximizes the internal density of a population by employing a weighted aggregation procedure. Dominance with a strict inequality for at least one objective is defined with respect to $\tau$, a minimum level for dominance.

2) GLOBAL SEARCH-POSTERIORI REFERENCE
Taha and Yoo [87] proposed a multi-objective framework using a global posteriori reference point represented by dominant keywords in the messages associated with a specific social group. This framework detects overlapping communities of nodes based on their attribute information that describes human characteristics such as culture, ethnicity, demographic, religion, and age, among others. The framework aims to detect the smallest sub-communities with the largest number of domains to which a specific user belongs. Furthermore, the framework employs a graphical model that depicts the ontological relationships between communities.
The framework maximizes both the internal density and dynamic similarity of a population by combining the attribute information and the structural topology of a network. That is, the framework groups nodes based on both the density of their connectivity and their common attribute similarities. The framework accounts for the sub-communities with multiple domains that exist as a result of the interrelations between communities.

VI. EMPIRICAL EXPERIMENTS AND EVALUATIONS
Since an adequate MOEA should achieve the convergence and diversity of a population, in this section, we experimentally evaluate and rank the methods presented in this paper based on their performance in achieving convergence and diversity. In addition, we perform the following:
• We rank the performances of the different optimization methods contained under the same objective function in achieving convergence and diversity.
• We rank the performances of the different objective functions contained under the same objective category in achieving convergence and diversity.
• We rank the performances of the different objective categories in achieving convergence and diversity.
We performed the following procedure for the experimental evaluations:
1) For each optimization method, we selected one of the proposed algorithms that falls under the method. That is, for each optimization method, we selected a paper whose proposed algorithm employs the underlying principles of the method. We considered the selected algorithm/paper to be a representative of the optimization method. From among all papers in which the proposed algorithms adopted the same optimization method, we selected the most influential one. We based the influence of a paper on factors such as its number of citations, recency, and state of the art.
2) We performed the rankings by averaging the convergence and diversity scores achieved by each optimization method, objective function, and objective category.
We ran the prototypes adopting the different algorithms on a machine running Windows 10 Pro with an Intel(R) Core(TM) i7-6820HQ processor at 2.70 GHz and 16 GB of RAM.

A. BENCHMARK TEST PROBLEMS
We conducted the empirical experiments on the following popular test suites for MOEAs:
• Deb-Thiele-Laumanns-Zitzler (DTLZ) [23]: We selected the following four normalized problems of the DTLZ set [31]: DTLZ1, with a multi-frontal and linear Pareto front; DTLZ2 and DTLZ3, with a concave geometry (multi-frontal); and DTLZ4, with a concave geometry (biased). We followed the recommendation of [23] by setting the number of decision variables as u = m + v − 1, where v is the variable concerning position, equal to 5 for DTLZ1 and 10 for DTLZ2, DTLZ3, and DTLZ4, and m is the number of objectives, set as m ∈ {3, 5, 8, 10, 15}.
• Walking-Fish-Group (WFG) [43]: We selected the following four normalized, concave problems with different scales from the WFG set [47]: WFG2, a problem with a disconnected Pareto front; WFG3, a non-separable problem with no bias in parameters, whose front is a line regardless of the number of dimensions; WFG4, a multi-frontal optimization problem with a concave Pareto front; and WFG5, a separable, unimodal, and deceptive problem with a concave Pareto front and no bias. We followed the recommendation of [43] by setting the number of decision variables as u = v + i, where v = 2 × (m − 1) is the variable concerning position and i = 20 is the variable concerning distance. In addition, m is the number of objectives, set as m ∈ {3, 5, 8, 10, 15}.
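The resulting problem sizes can be tabulated with a few lines of Python. The sketch assumes the usual DTLZ recommendation u = m + v − 1 and the WFG setting u = 2(m − 1) + 20 stated above; the function names are hypothetical.

```python
OBJECTIVE_COUNTS = (3, 5, 8, 10, 15)


def dtlz_num_variables(m, problem):
    """Number of decision variables for a DTLZ problem with m objectives."""
    v = 5 if problem == "DTLZ1" else 10  # 10 for DTLZ2, DTLZ3, and DTLZ4
    return m + v - 1


def wfg_num_variables(m, distance=20):
    """Number of decision variables for a WFG problem with m objectives."""
    position = 2 * (m - 1)
    return position + distance


if __name__ == "__main__":
    for m in OBJECTIVE_COUNTS:
        print(m, dtlz_num_variables(m, "DTLZ1"),
              dtlz_num_variables(m, "DTLZ2"), wfg_num_variables(m))
```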

B. EXPERIMENTAL SETUP
We employed the following parameters in all experiments. We set the population size to n = 100. The computational results were measured based on the average number of function evaluations (NFE), equal to 20,000, required to reach the stopping criterion over 100 runs of a test problem. We set the rate of crossover to be 2, the rate of mutation to be 0.001, and the scaling factor to be 0.6. The success rate (SR), which is the number of times that an optimal solution is found, was determined after 100 runs. A run was counted as a success when the best individual satisfied |f(x) − f(x*)| ≤ ε, where f(x) is the function value of the best individual x, f(x*) is the value of the optimal solution x*, and ε = 0.01. Table 1 presents the ε values for different problems. We compared the performances of the algorithms in a pairwise fashion using the Wilcoxon rank sum test with the Bonferroni correction [110].
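The success criterion above can be sketched in Python as follows. The sketch assumes that each run reports the function value of its best individual and that the optimal value is known; the function names, and the per-comparison significance level computed for the Bonferroni correction, are illustrative assumptions.

```python
def success_count(best_values, optimum, eps=0.01):
    """Number of runs whose best value lies within eps of the optimum."""
    return sum(1 for f in best_values if abs(f - optimum) <= eps)


def bonferroni_alpha(alpha, num_algorithms):
    """Per-comparison significance level when all pairs of algorithms are tested."""
    num_pairs = num_algorithms * (num_algorithms - 1) // 2
    return alpha / num_pairs
```

For example, four runs with best values 0.005, 0.02, 0.0, and 0.011 on a problem whose optimum is 0 count as two successes.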

C. PERFORMANCE INDICATOR
Since an adequate MOEA should achieve the convergence and diversity of a population, we adopted the following widely used quality metrics for the evaluations: (1) generational distance (GD) indicator [95] for assessing the convergence property and (2) diversity measure (DM) [29] for assessing the diversity property. That is, our goal for the comparative studies was to validate and compare the performance of the methods in achieving convergence and diversity.
We employed the GD indicator introduced in [95], which measures the distance (Dis) of the elements in a set S from the nearest point of the reference Pareto front. It is defined as follows:

$$\mathrm{Dis}(S, P^{*}) = \frac{\sum_{s \in S} \min\left\{ \lVert P_1^{*} - s \rVert_2, \ldots, \lVert P_N^{*} - s \rVert_2 \right\}}{|S|},$$

where $P^{*}$ is the reference Pareto-optimal set, $N = |P^{*}|$ is the cardinality of $P^{*}$, and $S$ is the approximation of the true Pareto front.
The DM metric measures the degree of spread and diversity among non-dominated solutions. Specifically, it measures the diversity of a set of solutions with reference to a set that represents the Pareto front. The metric returns an indicator value ranging from zero to one, in which a larger value indicates better coverage of the Pareto front. A detailed description of the DM metric can be obtained from [29]. Tables 2 to 5 present the results of the evaluations, as follows:
• Table 2 lists the mean and standard deviation of the GD values of the algorithms on the DTLZ problem set.
• Table 3 lists the mean and standard deviation of the GD values of the algorithms on the WFG problem set.
• Table 4 lists the mean and standard deviation of the DM values of the algorithms on the DTLZ problem set.
• Table 5 lists the mean and standard deviation of the DM values of the algorithms on the WFG problem set.
• Each table ranks the performances of the different optimization methods that fall under the same objective function in achieving convergence and diversity.
• Each table ranks the performances of the different objective functions that fall under the same objective category in achieving convergence and diversity.
• Each table ranks the performances of the different objective categories in achieving convergence and diversity.
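The GD indicator defined earlier in this subsection can be computed with a few lines of Python. The sketch assumes that both the approximation set and the reference Pareto front are given as sequences of objective vectors of equal dimension; the function name is hypothetical.

```python
import math


def generational_distance(approx, reference):
    """Mean Euclidean distance from each point in approx to its nearest reference point."""
    if not approx or not reference:
        raise ValueError("both sets must be non-empty")
    nearest = (min(math.dist(s, p) for p in reference) for s in approx)
    return sum(nearest) / len(approx)
```

A value of zero means that every point of the approximation lies on the reference front, so smaller values indicate better convergence.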

1) CONVERGENCE PROPERTY
The "Maximizing Population's Internal Density and Dynamic Similarity" objective category obtained the best results for the convergence metric on both DTLZ and WFG, followed by the "Maximizing Population's Separability and Internal Density" objective category. Regarding the objective categories, the "Pareto Dominance-Based" category obtained the best results for the convergence metric on both DTLZ and WFG, followed by the "Reference Point-Based" category. Regarding the specific optimization methods, the "Strength Pareto-Based" method [108] obtained the best results for the convergence metric on both DTLZ and WFG, followed by the "ε-Dominance-Based" method [66]. The outstanding convergence performance of the "Strength Pareto-Based" method is attributed to its inclusion of a nearest-neighbor density-estimation technique, which makes the search process more precise. The good convergence of [108] is also due to the assurance that an archive member is deleted only if it has been dominated by another solution or is located in an overcrowded portion of the front. This ensures a better search ability because the archive preserves the previous best solutions. Regarding the "ε-Dominance-Based" method, its good performance is attributed to its methodology of causing extreme solutions to be dominated by those within ε, which supports the convergence objective. This can also improve the proximity to the Pareto-optimal front and can help in avoiding premature convergence to the final solution.
The "Maximizing Population's Separability and Structural Similarity" objective category is the farthest from the true front in the case of DTLZ, while the "Maximizing Population's Separability and Dynamic Similarity" objective category is the farthest from the true front in the case of WFG. In general, the "Maximizing Population's Separability and Dynamic Similarity" category exhibited a slower convergence than the "Maximizing Population's Separability and Structural Similarity" category. Regarding individual optimization methods, the "Decomposition-Based" method was the worst among all methods in terms of the rate of convergence to the true Pareto-optimal front. This was due in part to the lack of extreme solutions on the Pareto-optimal front. The method becomes trapped in a local Pareto front in many runs. We observed from the experimental results that the method experienced difficulties in some multi-frontal problems and was unable to reach the true Pareto front in approximately 23% of test instances.

2) DIVERSITY PROPERTY
The "Maximizing Population's Separability and Structural Similarity" objective category obtained the best results for the diversity metric on both DTLZ and WFG, followed by the "Maximizing Population's Separability and Internal Density" objective category. Regarding objective categories, the "Indicator-Based" category obtained the best results for the diversity metric on both DTLZ and WFG, followed by the "Reference Point-Based" category. Regarding the specific optimization methods, the "Hypervolume Indicator-Based" method [48] obtained the best results for the diversity metric on both DTLZ and WFG, followed by the "R2 Indicator-Based" method [86]. The crowding distance comparison procedure employed by [48] enables it to preserve the explicit diversity of Pareto-optimal solutions. Moreover, the combination of crowding distance and non-dominance sorting employed by [48] conserves already-obtained Pareto-optimal solutions. In a few test instances, the crowding distance function failed to produce a crowd of solutions, even with multiple objectives. Its local search procedure helped it to identify diverse non-dominated solutions. Even in the DTLZ4 problem's non-uniformly distributed Pareto-optimal front, the method maintained a very good distribution of points.
By adjusting the R2 indicator's weight vector distributions, a solution is selected by [86] only if it is better than the current solution. Otherwise, the solution is not updated. This helps [86] in obtaining Pareto optimality and solution diversity. The "Maximizing Population's Internal Density and Dynamic Similarity" category achieved the worst diversity and distribution of solutions. The "Reference Point-Based" objective category achieved the worst diversity among all objective categories, followed by the "Indicator-Based" objective category. The "Fuzzy-based Interactive Reference" optimization method [51] achieved the worst diversity among all optimization methods. The method achieved poor results in the DTLZ1 and DTLZ3 diversity metrics due to the lack of extreme solutions in the Pareto-optimal front. In addition, it produced a poor distribution of solutions. The "Decomposition-Based" method [42] maintained a poor distribution of points in the non-uniformly distributed Pareto-optimal front of the DTLZ4 problem; it generated disconnected Pareto fronts.

VII. CONCLUSION
Most real-world MOO problems require maximizing one of the following four pairs of objective functions: (1) population's separability and dynamic similarity, (2) population's internal density and dynamic similarity, (3) population's separability and structural similarity, or (4) population's separability and internal density. In this survey paper, we therefore classify multi-optimization methods based on the pair of objective functions that they seek to maximize. In addition, we introduce a comprehensive survey on the multi-objective algorithms contained under each optimization method, the optimization methods contained under each objective function, and the objective functions contained under each objective category. We provide a methodology-based taxonomy that classifies multi-optimization methods into hierarchically nested, fine-grained, and specific classes. We experimentally compared and ranked the optimization methods that fall under each objective function, the objective functions that fall under each objective category, and the objective categories for solving a specific optimization problem. We found the following:
1) The ''Maximizing Population's Internal Density and Dynamic Similarity'' objective category obtained the best results for the convergence metric.
2) The ''Pareto Dominance-Based'' objective category obtained the best results for the convergence metric.
3) The ''Strength Pareto-Based'' optimization method obtained the best results for the convergence metric.
4) The ''Maximizing Population's Separability and Dynamic Similarity'' objective category was the farthest from the true front.
5) The ''Decomposition-Based'' method was the worst among all methods in terms of the rate of convergence to the true Pareto-optimal front.
6) The ''Maximizing Population's Separability and Structural Similarity'' objective category obtained the best results for the diversity metric.
7) The ''Indicator-Based'' objective category obtained the best results for the diversity metric.
8) The ''Hypervolume Indicator-Based'' optimization method obtained the best results for the diversity metric.
9) The ''Maximizing Population's Internal Density and Dynamic Similarity'' objective category achieved the worst diversity and distribution of solutions.
10) The ''Reference Point-Based'' objective category achieved the worst diversity among all objective categories.
11) The ''Fuzzy-based Interactive Reference'' optimization method achieved the worst diversity among all optimization methods.
12) The ''Decomposition-Based'' method [42] maintained a poor distribution of points on the non-uniformly distributed Pareto-optimal front.