A Multi-Objective Evolutionary Algorithm With Hierarchical Clustering-Based Selection

This paper proposes an evolutionary algorithm with hierarchical clustering-based selection for multi-objective optimization. In the proposed algorithm, hierarchical clustering is employed to design the environmental and mating selections, named local coverage selection and local area selection, respectively, for a multi-objective evolutionary algorithm. The local coverage selection strategy aims to preserve well-distributed individuals with good convergence, while the local area selection strategy is devised to deliver a balanced evolutionary search. This is achieved by encouraging individuals toward exploration or exploitation according to the $I_{\epsilon+}$ indicator. In both strategies, a hierarchical clustering method is employed to discover the population structure. Based on the clustering results, in local coverage selection, the individuals of different clusters are retained according to their coverage areas and crowding distances, such that they are distributed as evenly as possible along the Pareto front. In local area selection, the individual(s) with the best value of the $I_{\epsilon+}$ indicator in each cluster are selected to perform mating, with the purpose of achieving a balanced exploration and exploitation. The proposed algorithm has been evaluated on 26 benchmark problems and compared with related methods. The results clearly show the significance of the proposed method.


I. INTRODUCTION
(The associate editor coordinating the review of this manuscript and approving it for publication was Christian Pilato.)

Multi-objective optimization problems (MOPs) refer to problems with two or more conflicting objectives. They are commonly seen in the real world, e.g., electrical engineering [1], scheduling [2], engineering modeling [3] and transport engineering [4]. An MOP can be formulated as:

    minimize  F(x) = (f_1(x), f_2(x), ..., f_m(x))
    subject to  x ∈ X                                                    (1)

where x is the decision vector, X is the decision space and m is the number of objectives [5]. Due to the conflicting nature of the objectives in MOPs, there is no single solution
that can optimize all the objectives simultaneously. Instead, a set of tradeoff solutions, which cannot be improved in any objective without degrading at least one other objective, can be obtained. These tradeoff solutions are called Pareto optimal solutions. The collection of all Pareto optimal solutions is called the Pareto optimal set (PS), and the image of the PS in the objective space is called the Pareto optimal front (PF). Over the last decades, a number of multi-objective evolutionary algorithms (MOEAs) have been proposed, which have demonstrated good performance in dealing with MOPs [6]. Generally, MOEAs can be classified into three categories. The first category is Pareto-based MOEAs [7], which utilize the non-dominated sorting mechanism along with a diversity maintenance scheme to select candidates. In this approach, when the proportion of non-dominated solutions increases, the Pareto dominance relation could fail to provide sufficient selection pressure, resulting in poor population diversity and convergence. In recent years, a few relaxed domination principles have also been proposed to loosen the comparison criteria, so that otherwise incomparable solutions can be compared with each other. Example methods include the ε-domination principle [8] and fuzzy Pareto dominance [9]. It is worth noting that the performance of these methods depends on their parameter settings. Consequently, although the diversity and convergence can be improved, the final non-dominated solutions provided by these methods may not cover the entire PF uniformly [10]. The indicator-based MOEAs fall into the second category. These methods adopt an indicator (e.g., hypervolume (HV) [11], inverted generational distance (IGD) [12] and R2 [13]) to measure both convergence and diversity.

(VOLUME 11, 2023. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/)
These indicators, however, may prefer some specific regions of the PF, causing certain optimal solutions to disappear during evolution. As a result, the final solutions may not be evenly distributed along the PF. For example, in HV-indicator-based MOEAs [14], [15], the candidate solutions could be biased toward the middle of a convex/concave PF [16].
The decomposition-based MOEAs tend to decompose the original MOP into a number of single-objective optimization problems or simpler MOPs to be solved in a collaborative manner [17], [18]. In these methods, since the predefined weight vectors are uniformly sampled on the unit hyperplane, the distribution of candidate solutions in the middle of convex/concave PFs will be more/less crowded than those on the border. The situation could become even worse when the PF has a sharp peak or long tail [19]. Further, as the distribution of candidate solutions is mostly determined by the predefined weight vectors, the difference between PF shape and distribution of weight vectors could also lead to a substantial deterioration of the performance of decomposition-based MOEAs [20].
To alleviate the above issues, a clustering-based environmental selection (named local coverage selection) along with a clustering-based mating selection (named local area selection) have been proposed and incorporated into an evolutionary algorithm for multi-objective optimization. The main contributions of this study are two-fold: • A local coverage selection strategy, which is designed to preserve a group of well-distributed individuals with good convergence. In this strategy, fast non-dominated sorting is first employed to eliminate individuals that are far away from the Pareto front. Then, the population is divided into clusters by a hierarchical clustering method. The number of individuals to be retained in each cluster is determined based on the coverage area of that cluster. These individuals are subsequently selected based on the crowding distance, so that the individuals in the same cluster are evenly distributed. Consequently, the local coverage selection strategy can be used to preserve evenly distributed individuals along the Pareto front.
• A local area selection strategy, which is designed to deliver a balanced evolutionary search. In this strategy, the population is divided into clusters by a hierarchical clustering method. The individuals are selected for exploration or exploitation according to the I_ε+ indicator, with the purpose of achieving a balanced exploration and exploitation. The proposed algorithm has been evaluated on widely used benchmark problems and compared with related methods. The experimental results clearly show the significance of our devised strategies. The results also show that the proposed algorithm generally outperforms the related methods under comparison.
The remainder of this paper is structured as follows. Section II introduces related work. Section III describes the proposed algorithm including a detailed explanation of the proposed local coverage selection and local area selection strategies. The experimental setting and results are reported and analyzed in Section IV. Section V provides our conclusion.

II. RELATED WORK
In the literature, many MOEAs have been proposed. For example, an adaptive reference vector generation approach for inverse model-based MOEA [21] was developed for problems with degenerated and disconnected Pareto fronts. In this method, at the early stage of evolution, a reference vector in a crowded area is replaced by a randomly generated vector. Since the replacement vector is created randomly, this method may have difficulty guaranteeing an even distribution of the obtained solutions. A variant of MOEA/D that adjusts weight vectors, termed MOEA-ABD, was proposed in [22] for bi-objective optimization problems with discontinuous Pareto fronts. This method detects the discontinuous parts of the Pareto front by calculating the deviation between each weight vector and its corresponding normalized objective vector, and adjusts the number of weight vectors according to the length of each continuous part. Some other variants of MOEA/D have also been proposed, such as an improved MOEA/D with a two-phase strategy and a niche-guided scheme, called MOEA/D-TPN [23], MOEA/D with sorting and selection [24], AMOEA/D [25] with an auto-switching strategy based on aggregation function enhancement, and an angle-based adaptive penalty (AAP) scheme for MOEA/D [26]. These methods, however, still suffer from drawbacks due to employing a fixed weight vector set. An adaptive weight adjustment method, known as MOEA/D-AWA, was introduced in [19]. This method first calculates the sparsity level of each individual using a vicinity distance. Then, overcrowded subproblems are deleted while new subproblems are added into the sparse regions. This method helps to reduce invalid subproblems at the cost of extra computation time, and it cannot guarantee an even distribution of the solutions. In [27], a reference vector guided evolutionary algorithm (RVEA) was proposed.
In this method, the predefined reference vectors are adjusted at certain generations using the Hadamard product [28] to adapt to the shape of the front. In [27], a variant of RVEA, termed RVEA*, was also proposed, in which a reference vector with no associated individual is replaced by a randomly generated vector. Consequently, it is difficult to guarantee an even distribution of the vectors. In [29], an improved version of A-NSGA-III [30], termed A²-NSGA-III, was introduced to address several limitations of A-NSGA-III. As in A-NSGA-III, reference points with no associated individuals are deleted and relocated around a reference point associated with more than two individuals. This method also requires relocation of reference points, which may lead to an uneven distribution of the solutions. In MOEA/D-SCS [31], the evolutionary process is separated into three stages, where different indicators are utilized to screen elite and inferior solutions. However, it is difficult to distinguish the early, middle, and late stages with fixed values for different types of MOPs.
Clustering algorithms have previously been employed to improve MOEAs [32], [33]. These methods generally employ clustering to adapt reference vectors or reference points. For example, the clustering-ranking method for many-objective optimization (termed crEA) [32] employs a series of predefined reference lines to cluster individuals into subregions, and the individual nearest to the Pareto front in each cluster is selected. Since the reference lines are fixed, this clustering approach has difficulty adapting to MOPs with irregular Pareto fronts. A clustering-based MOEA (termed CLUMOEA) [34] adopts the k-means clustering algorithm to divide the individuals of the population into clusters, and only individuals in the same subproblem are allowed to perform crossover, thereby accelerating convergence. However, the limitations of k-means, such as its sensitivity to initial cluster centers and its assumption of spherical data distributions, could reduce the performance of CLUMOEA. An evolutionary many-objective optimization algorithm with clustering-based selection (termed EMyO/C) [33] first calculates the sum of Euclidean distances between the individuals of two clusters and then merges the clusters with the minimum sum.

III. PROPOSED ALGORITHM
In this section, we propose an evolutionary algorithm with hierarchical clustering-based selection (HCCA) for multi-objective optimization. The method proceeds as follows. First, two populations, DP and PP, each with N individuals, are randomly sampled from the search space to form the initial populations. Then, individuals are randomly selected from DP to produce offspring, for which differential evolution (DE) [35] and polynomial-based mutation (PM) [36] are adopted. For PP, in the proposed local area selection (LAS) strategy, simulated binary crossover (SBX) [37] and polynomial-based mutation (PM) are employed to generate offspring. Subsequently, a decomposition-based criterion and the proposed local coverage selection (LCS) strategy are performed to select N individuals for DP and PP, respectively. It should be noted that the decomposition-based criterion used in our method adopts the Tchebycheff method [17]. Finally, at the end of evolution, the crowding degree strategy, as presented in [38], is applied to select a group of individuals from DP and PP as the final output P. The main loop of HCCA is shown in Algorithm 1.

Algorithm 1 Main Loop of HCCA
Input: population size N
Output: final population P
 1: initialize populations DP and PP, each with N random individuals
 2: while the termination criterion is not met do
 3:   Off_DP ← generate offspring from parents selected from DP (DE and PM)
 4:   Off_PP ← local area selection(PP) (SBX and PM)
 5:   DP ← decomposition-based selection(DP ∪ Off_DP)
 6:   PP ← local coverage selection(PP ∪ Off_PP)
 7: end while
 8: P ← crowding degree strategy(PP, DP)
In the following subsections, we describe the details of the proposed LAS and LCS strategies.

A. LOCAL AREA SELECTION STRATEGY
Mating selection aims at selecting a group of parents for producing offspring. Typically, parents with better performance have higher chances to produce offspring. The traditional Pareto dominance criterion, which has been adopted in existing methods, may fail to discriminate the convergence degrees of individuals. In addition, mating strategies in existing methods generally emphasize convergence and ignore population diversity [39]. To improve the situation, a local area selection strategy is proposed here.
In LAS, a convergence fitness Fitness(x) based on a convergence indicator (the I_ε+ indicator in IBEA [40]) is introduced to measure the convergence performance of an individual, which is defined as:

    Fitness(x1) = Σ_{x2 ∈ PP, x2 ≠ x1} e^( −I_ε+(x2, x1) / κ ),  with
    I_ε+(x1, x2) = max_{1≤j≤m} ( f_j(x1) − f_j(x2) )                     (2)

where x1 and x2 are individuals from the population PP, m is the number of objectives and κ is a scaling factor. The convergence fitness value reflects the convergence performance of each individual: a smaller convergence fitness value implies a better convergence performance. More importantly, a hierarchical clustering is used to divide the population into clusters, and parents are selected from different clusters to maintain diversity.

Algorithm 2 Local Area Selection Strategy
Input: population PP, number of offspring to be generated K, population size N
Output: offspring pool OFP
 1: (F1, F2, ..., Ft) ← non-dominated sorting(PP); Q ← ∅
 2: for i ← 1 to t do
 3:   Q ← Q ∪ Fi
 4:   if |Q| ≥ N then
 5:     break
 6:   end if
 7: end for
 8: AF ← calculate the convergence fitness of each individual in PP according to Eq. 2
 9: C1, C2, ..., CK ← divide the individuals in Q into K clusters by a hierarchical clustering algorithm
10: OFP ← ∅; OP ← ∅
11: for i ← 1 to K do
12:   OP ← OP ∪ {the individual with the best convergence fitness in Ci}
13:   if |Ci| ≥ 2 then
14:     [x1, x2] ← the two individuals with the best convergence fitness in Ci
15:     OFP ← OFP ∪ crossover and mutation(x1, x2)
16:   end if
17: end for
18: while |OFP| < K do
19:   [x1, x2] ← randomly select two individuals from OP
20:   OFP ← OFP ∪ crossover and mutation(x1, x2)
21: end while
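The indicator and fitness above can be sketched in a few lines of Python. The scaling factor κ = 0.05 is an assumed value for illustration; IBEA's original fitness also negates the exponential, which only reverses the ordering convention, whereas this sketch follows the smaller-is-better convention used in the text.

```python
import math

def eps_indicator(x1, x2):
    """Additive epsilon indicator I_eps+(x1, x2): the smallest shift by
    which x1's objectives must be improved so that x1 weakly dominates x2."""
    return max(a - b for a, b in zip(x1, x2))

def convergence_fitness(x, population, kappa=0.05):
    """Aggregate indicator values of all other individuals against x.
    Smaller values indicate better convergence (dominated individuals
    accumulate large exponential terms)."""
    return sum(math.exp(-eps_indicator(x2, x) / kappa)
               for x2 in population if x2 is not x)
```

For example, in a population where one individual is dominated by all others, its fitness is far larger (worse) than that of a non-dominated individual.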
A detailed process of the proposed LAS can be summarized as follows. First, the population PP is sorted into different fronts (F1, F2, ..., Ft) based on the principle of non-dominated sorting [7]. Then, the fronts are moved one by one into population Q, from the lowest level to the highest, until a front Ft is encountered whose inclusion makes |Q| ≥ N. Subsequently, the convergence fitness is assigned to each individual in the population PP. After that, a hierarchical clustering method is employed to divide the population into K clusters. Here, the parameter K is dynamically adjusted according to the fitness improvement values of PP and DP (i.e., FI_PP and FI_DP) at each generation:

    FI_PP = Σ_{o1 ∈ Off_PP} FI(x, o1 | λ_i, z*),   FI_DP = Σ_{o2 ∈ Off_DP} FI(x, o2 | λ_i, z*)

where Off_PP and Off_DP denote the groups of offspring generated from PP and DP, respectively, and o1 and o2 stand for the offspring in Off_PP and Off_DP, respectively. N is the population size. FI(x, y | λ_i, z*) is the enhancement brought by the new offspring y associated with the parent x, which is defined as:

    FI(x, y | λ_i, z*) = max( g^tch(x | λ_i, z*) − g^tch(y | λ_i, z*), 0 )

where g^tch(x | λ_i, z*) is the fitness value assigned using the following TCH decomposition function:

    g^tch(x | λ_i, z*) = max_{1≤j≤m} λ_i^j | f_j(x) − z_j* |

where W = (λ_1, λ_2, ..., λ_N) is a set of weight vectors and f_j(x) represents the j-th objective value. z* = (z_1*, z_2*, ..., z_m*) is the ideal vector for the m objectives, which is approximated by the minimum value of each objective in the current population, i.e., z_j* = min_x f_j(x) for all j ∈ {1, ..., m}. The normalized fitness improvement values FI_PP and FI_DP are then obtained by:

    FI_PP ← FI_PP / (FI_PP + FI_DP),   FI_DP ← FI_DP / (FI_PP + FI_DP)

so that 0 ≤ FI_PP, FI_DP ≤ 1. Based on FI_PP and FI_DP, the parameter K is finally updated from its value at the previous generation, bounded above by N using min() and below using max(), where max() and min() return the maximum and minimum values, respectively. It is worth noting that the minimum K is set to 3, which ensures that the population PP has at least three individuals to be selected into the mating pool.
After obtaining the clusters, the individual with the best convergence performance within each cluster is selected and added into the mating pool. If the number of individuals in a cluster is greater than or equal to two, the two individuals with the best convergence fitness values are selected to produce offspring. Finally, if the number of individuals in the offspring pool OFP is less than the number of offspring that need to be generated, parents are randomly selected from OP, and K − |OFP| offspring are generated and added to OFP. The pseudo-code of LAS is shown in Algorithm 2.
To help further understand the LAS strategy, an example is given in Fig. 1 to illustrate its process. In Fig. 1, eight solutions are allocated to five clusters. For the clusters C_2 and C_3, in which the number of solutions is greater than or equal to two, the two solutions with the best convergence fitness values are selected, i.e., solutions D and E from C_2 and solutions F and H from C_3 are selected for mating, and the generated offspring are inserted into the offspring pool OFP. At the same time, the solution with the best convergence fitness value in each of C_1 to C_5 is added into the mating pool. If the number of generated offspring is less than the number of offspring to be generated, parents for the remaining K − |OFP| offspring are selected from the mating pool, and the offspring are added to OFP.
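The per-cluster selection logic of this example can be sketched as follows. The function name and data representation (clusters as lists of individuals, fitness as a smaller-is-better lookup) are illustrative assumptions; crossover and mutation are abstracted away by returning parent pairs.

```python
import random

def local_area_mating(clusters, fitness, k):
    """Pick mating pairs following the LAS rules: in each cluster the two
    best-fitness individuals mate, every cluster contributes its single
    best individual to a mating pool OP, and OP is used to top up to k
    pairs when some clusters are too small to mate internally."""
    pairs, op = [], []
    for c in clusters:
        ranked = sorted(c, key=fitness)      # smaller fitness is better
        op.append(ranked[0])                 # best individual joins OP
        if len(ranked) >= 2:
            pairs.append((ranked[0], ranked[1]))
    while len(pairs) < k and len(op) >= 2:
        pairs.append(tuple(random.sample(op, 2)))  # top up from OP
    return pairs
```

With the five clusters of Fig. 1, only C_2 and C_3 produce pairs directly, and the remaining offspring are generated from parents drawn from the mating pool.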

B. LOCAL COVERAGE SELECTION STRATEGY
The main purpose of environmental selection is to preserve a group of individuals from the union of the current population and its offspring to create a new population for the next generation. These surviving individuals should be well-distributed and have good convergence. When the PF has a complicated geometry, the diversity maintenance mechanisms of Pareto-based MOEAs may struggle to distribute individuals evenly over the entire PF, thus reducing their performance. To alleviate this issue, we propose to employ hierarchical clustering to divide the population into clusters to discover the population structure. The number of individuals to be retained is determined according to the coverage area of each cluster, while the crowding distance is used to determine the finally retained individuals. This allows individuals to be distributed as evenly as possible throughout the entire PF, thus maintaining population diversity.
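Since both LAS and LCS rely on hierarchical clustering, a minimal agglomerative clustering sketch may help. The paper does not state the linkage criterion; single linkage is an assumption here, and the quadratic implementation is for clarity rather than efficiency.

```python
import math

def hierarchical_clusters(points, k):
    """Plain agglomerative clustering: start with one cluster per point
    and repeatedly merge the two closest clusters (single linkage, i.e.
    minimum pairwise Euclidean distance) until k clusters remain."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(math.dist(a, b)
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] += clusters.pop(j)       # merge the closest pair
    return clusters
```

In practice, library routines such as those in scipy.cluster.hierarchy would be used instead; this sketch only shows the bottom-up merging idea that underlies both selection strategies.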
The proposed local coverage selection works as follows. Firstly, the efficient non-dominated sorting method is used to maintain the convergence of the population. Then, the crowding distance is applied to reflect the crowding degree among individuals. The crowding distance dis(X_n) can be defined as:

    dis(X_n) = Σ_{k=1}^{m} ( f_k(X_{n+1}) − f_k(X_{n−1}) ) / ( f_k^max − f_k^min )    (9)

where m is the number of objectives, and f_k^max and f_k^min denote the maximum and minimum, respectively, of the k-th objective value. X_{n−1} and X_{n+1} are the two nearest solutions on either side of X_n along the k-th objective. For the non-dominated members that have the maximum or minimum value for any objective, the crowding distance is assigned an infinite value. A bigger crowding distance value for an individual implies a better diversity performance.

Algorithm 3 Local Coverage Selection Strategy
Input: union population R of the current population and its offspring, population size N
Output: population Q
 1: (F1, F2, ..., Ft) ← non-dominated sorting(R); Q ← ∅; i ← 1
 2: while |Q| < N do
 3:   Q ← Q ∪ Fi
 4:   i ← i + 1
 5: end while
 6: if |Q| > N then
 7:   AD ← calculate the crowding distance of each individual in Q according to Eq. 9
 8:   normalize the objective values of the individuals in Q according to Eq. 10
 9:   C1, C2, ..., CN ← divide the individuals in Q into N clusters by a hierarchical clustering algorithm
10:   U ← ∅; j ← 1
11:   for i ← 1 to |Q| do
12:     if the individual x_i is not in U then
13:       C ← the cluster to which x_i belongs; U_j ← C
14:       if |U_j| = 1 then
15:         while x_i's nearest individual X is also a scattered individual do
16:           U_j ← U_j ∪ {X}
17:         end while
18:       end if
19:       U ← U ∪ U_j; j ← j + 1
20:     end if
21:   end for
22:   CA ← calculate the size of the coverage area of each cluster according to Eq. 11
23:   for i ← 1 to |U| do
24:     keep the Num(C) individuals with the best crowding distance in each cluster
25:   end for
26:   Q ← the retained individuals
27: end if
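The crowding distance of Eq. 9 is the standard NSGA-II quantity and can be sketched directly; the index-based representation below is an implementation choice.

```python
def crowding_distance(front):
    """NSGA-II-style crowding distance over a list of objective vectors.
    Boundary solutions get infinity; interior solutions accumulate the
    normalized gap between their two neighbors along each objective."""
    n, m = len(front), len(front[0])
    dist = [0.0] * n
    for k in range(m):
        order = sorted(range(n), key=lambda i: front[i][k])
        fmin, fmax = front[order[0]][k], front[order[-1]][k]
        dist[order[0]] = dist[order[-1]] = float('inf')
        if fmax == fmin:
            continue  # degenerate objective, no spread to measure
        for idx in range(1, n - 1):
            i = order[idx]
            dist[i] += (front[order[idx + 1]][k] -
                        front[order[idx - 1]][k]) / (fmax - fmin)
    return dist
```

For three evenly spaced points on a linear front, the interior point receives a finite distance while both extremes are marked infinite, which is exactly the behavior LCS relies on to protect boundary solutions.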
Since Euclidean distance is used in the hierarchical clustering, the objective values are normalized to the range [0, 1] for robust clustering. The objective values of each individual in population Q are scaled as:

    f_k'(X) = ( f_k(X) − f_k^min ) / ( f_k^max − f_k^min ),  k = 1, 2, ..., m    (10)

Based on the normalized values, a hierarchical clustering is then used to divide the population into N clusters. As too many singular clusters are not helpful in recovering the population structure, the following steps are employed to reduce their number. First, identify the nearest neighbor, X_j, of an individual X_i from a singular cluster. If X_j also forms a singular cluster, the individuals X_i and X_j are merged to form a new cluster. This process, termed cluster recombination, is repeated until the nearest neighbor is no longer from a singular cluster. After obtaining the clusters, the coverage area of each cluster is subsequently calculated as:

    CA(C) = Σ_{k=1}^{m} ( f_k^max(C) − f_k^min(C) )    (11)

where C represents a cluster, and f_k^max(C) and f_k^min(C) denote the maximum and minimum, respectively, of the k-th objective value in the cluster C. The number of individuals to be retained in each cluster, Num(C), is then obtained by distributing the survival budget over the T clusters in proportion to their coverage areas, where T and sN denote the number of clusters and the number of singular clusters, respectively; the function floor() returns the nearest integer in the direction of negative infinity, and min() caps the allocation at the cluster size. Finally, each cluster retains the Num(C) individuals with the best crowding distance to form a new population, so that the distribution of individuals within each cluster is as uniform as possible over the entire PF. The pseudo-code of the proposed LCS strategy is shown in Algorithm 3.
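The coverage-area calculation and the proportional allocation can be sketched as follows. Since Eq. 11's exact rounding rule is garbled in the source, the specific combination of floor(), a minimum of one survivor per cluster, and the cap at cluster size used here is an assumption for illustration.

```python
import math

def coverage_area(cluster):
    """CA(C): sum over objectives of the cluster's objective-value range."""
    return sum(max(c[k] for c in cluster) - min(c[k] for c in cluster)
               for k in range(len(cluster[0])))

def allocate(clusters, n_keep):
    """Distribute a survival budget of n_keep individuals over the
    clusters in proportion to coverage area, rounded down with floor(),
    with at least one survivor per cluster and at most the cluster size."""
    areas = [coverage_area(c) for c in clusters]
    total = sum(areas) or 1.0                # avoid division by zero
    return [min(len(c), max(1, math.floor(n_keep * a / total)))
            for c, a in zip(clusters, areas)]
```

A cluster that spans a wide stretch of the front thus keeps more individuals than a tight singular cluster, which is the intended coverage-proportional behavior of LCS.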
To facilitate understanding, a simple example of the LCS is illustrated in Fig. 2. In Fig. 2, suppose the individuals A, B, C, D, E, F, G and H are partitioned into four clusters C_1, C_2, C_3 and C_4 by a hierarchical clustering method. The number of retained individuals Num(C) is then calculated based on the coverage areas of the clusters C_1, C_2, C_3 and C_4, respectively. Assume the calculated values of Num(C) for clusters C_1 to C_4 are 1, 2, 2 and 1, respectively, and that the crowding distances satisfy B > D > C, E > F and G > H. Then, for cluster C_1, the individual A is retained. For cluster C_2, the two individuals with the best crowding distance (i.e., B and D) are preserved. Similarly, the individuals E, F and G are also kept. Finally, following the procedure of LCS, a set of solutions containing A, B, D, E, F and G is obtained for the evolution of the next generation.

IV. EXPERIMENTS
In this section, we first evaluate the significance of the LAS and LCS strategies in the proposed algorithm. Then, we compare the proposed method with state-of-the-art multi-objective evolutionary algorithms. Unless otherwise stated, the median and corresponding interquartile range (IQR) over 30 independent trials of each algorithm are reported and formatted as (median ± IQR). For each row in a table, we highlight the best value in bold. To obtain a statistically sound conclusion, the Wilcoxon rank sum test is performed at a significance level of α = 0.05. In the tables, the symbols "+", "-" and "≈" indicate that the results of the compared methods are significantly better than, worse than and similar to those of our method, respectively. The experiments are carried out on a machine with Microsoft Windows 10 Pro, an Intel Core i5-5200 2.40 GHz CPU and 16 GB RAM.

A. TEST PROBLEMS AND PARAMETER SETTINGS
The MOPs to be tested include WFG [41], DTLZ [42] and UF [43] problems. The characteristics and parameter settings of these problems are shown in Table 1. These parameter settings are the same as recommended in [44].
We compare our proposed method HCCA with seven related algorithms, including MOEA/D-PaS [45], EAG-MOEA/D [46], CA-MOEA [47], SPEA/R [48], DEA-GNG [49], A-NSGA-III [30] and EMyO/C [33]. The parameters of these algorithms are listed in Table 2, which are specified or chosen according to the original settings with the best performance. Here, P_c is the crossover probability, P_m is the mutation probability, and η_c and η_m are the distribution indexes of SBX and PM, respectively. For the DE operator, CR and F are the crossover rate and scaling factor, respectively. T denotes the size of the neighborhood for weight vectors, δ is the probability of selecting the parents from the T neighbors, and nr is the maximum number of parent solutions to be updated by each offspring solution. |A_S| is the size of the signal archive and HP_max is the maximum hit point of a node. age_max is the maximum cluster age and λ represents the cycle for topology reconstruction. a stands for the learning coefficient and nb denotes the learning coefficient of neighbors. α indicates the error reduction constant of the nodes r1max and r2max, while β is the error reduction coefficient.

B. PERFORMANCE METRICS
Two widely employed performance metrics, the inverted generational distance (IGD) [12] and HV [11], are adopted for evaluation. IGD can reflect both convergence and diversity. Let P* denote a subset of the PS that is distributed evenly along the PF and P denote the approximate set obtained by an algorithm. The IGD value of P with respect to P* is calculated as:

    IGD(P, P*) = ( Σ_{x ∈ P*} d(x, P) ) / |P*|

where d(x, P) is the minimum Euclidean distance from the point x to P and |P*| returns the size of P*. Generally, a smaller IGD value means that the approximate set is closer to the PF and distributed more evenly.
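The IGD formula above translates directly into code; point sets are represented as lists of objective tuples here.

```python
import math

def igd(approx, reference):
    """Inverted generational distance: the average distance from each
    reference point (sampled evenly on the true PF) to its nearest
    point in the approximation set. Smaller is better."""
    return sum(min(math.dist(r, p) for p in approx)
               for r in reference) / len(reference)
```

An approximation set that contains every reference point scores exactly zero, and the score grows as points drift away from, or fail to cover, the reference front.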
The HV [11] is also able to account for both convergence and diversity. To calculate the HV of a final solution set, a reference point that is dominated by all Pareto optimal solutions should be predefined. Here, as suggested by [44], each component of the reference point for computing HV is set to 1.
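For intuition, the bi-objective case of HV admits a short rectangle-sweep implementation; this sketch handles minimization with two objectives only (general HV computation is considerably more involved), and the default reference point (1, 1) mirrors the setting above.

```python
def hypervolume_2d(points, ref=(1.0, 1.0)):
    """Hypervolume for a bi-objective minimization problem: the area
    dominated by the non-dominated points and bounded by the reference
    point, accumulated as horizontal slabs while sweeping f1 upward."""
    pts = sorted(p for p in points if p[0] < ref[0] and p[1] < ref[1])
    hv, prev_f2 = 0.0, ref[1]
    for f1, f2 in pts:
        if f2 < prev_f2:                     # skip dominated points
            hv += (ref[0] - f1) * (prev_f2 - f2)
            prev_f2 = f2
    return hv
```

A single point at (0.5, 0.5) with reference (1, 1) dominates a quarter of the unit square, and adding a second non-dominated point increases the covered area by exactly its non-overlapping rectangle.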

C. RESULTS
We first examine the merit of the LAS strategy by comparing HCCA with its variant HCCA-RAND, in which random selection is employed as the mating selection instead of the proposed local area selection strategy. Table 3 shows the comparison results of HCCA and HCCA-RAND on the test problems. The results show that HCCA delivers significantly better performance on the problems in terms of both metrics. Using the IGD metric, HCCA outperforms HCCA-RAND on 20 out of 26 cases. Similar results can also be found in terms of the HV metric. Based on the results, we can conclude that the proposed LAS strategy can help select a suitable group of individuals for reproduction, thereby improving the performance of the algorithm. Then, the effectiveness of the proposed LCS strategy in HCCA is assessed. For this purpose, we compare HCCA with its variant HCCA-CROWD, in which the proposed LCS strategy is replaced by the crowding distance as the environmental selection. The results are reported in Table 4. From the results, we can see that HCCA delivers better results on most problem cases in terms of both HV and IGD metrics. This is mainly because the proposed LCS strategy can effectively retain a set of individuals with good convergence and diversity during evolution, thus leading to the performance improvement.
Subsequently, the sensitivity of the parameter K in the LAS strategy is evaluated. Five different initial values of K (i.e., K = 0, 0.25N, 0.5N, 0.75N, N) are used for evaluation. The results are reported in Table 5. The results show that, with different initial values of the parameter K, HCCA delivers comparable performance. The results thus indicate that HCCA is robust to the initial value of the parameter K. This is mainly because the value of K is dynamically adjusted during the evolution.
Finally, we compare the performance of HCCA with the related methods. The results are shown in the tables, and Fig. 4 shows that the final populations obtained by HCCA, DEA-GNG and EMyO/C are more evenly distributed along the PF than those obtained by the remaining methods. Based on the results, our method is clearly the best alternative and could significantly outperform the related algorithms under comparison.

V. CONCLUSION
This paper proposes and implements a multi-objective evolutionary algorithm with hierarchical clustering-based selection. In the proposed method, a hierarchical clustering method is employed to design both the environmental selection (termed local coverage selection) and the mating selection (termed local area selection). The local coverage selection strategy is employed to preserve a group of well-distributed individuals with good convergence during evolution, thus appropriately searching the PF, while the local area selection strategy tends to deliver a balanced evolutionary search. The significance of the two proposed strategies has been clearly shown in the results. The results also show that our algorithm could greatly outperform the related methods under comparison.
The proposed method can be extended in several directions. Firstly, it would be interesting to employ a region partitioning technique such as zoning search [50] to design the LAS and LCS, so that their performance can be compared. Secondly, it is desirable to employ clustering with the crowding degree to extract the solutions from multiple populations as the final output. Additionally, employing the proposed algorithm to address problems including image segmentation [51], parameter estimation [52], [53] and nonlinear system control [54], [55], [56], [57] can also be investigated.

ZHOUCHENG BAO is currently pursuing the M.Sc. degree in electronic information with Hangzhou Normal University. His research interests include meta-heuristic algorithms and data mining.
WENDA HE is currently pursuing the M.Sc. degree in electronic information with Hangzhou Normal University. His research interests include machine learning and semi-supervised algorithms.
WEIGUO SHENG (Member, IEEE) received the M.Sc. degree in information technology from the University of Nottingham, U.K., in 2002, and the Ph.D. degree in computer science from Brunel University, U.K., in 2005. Then, he worked as a Researcher with the University of Kent, U.K., and Royal Holloway, University of London, U.K. He is currently a Professor with Hangzhou Normal University. His research interests include evolutionary computation, data mining/clustering, pattern recognition, and machine learning.