Knowledge Transfer for Dynamic Multi-objective Optimization with a Changing Number of Objectives

Different from most other dynamic multi-objective optimization problems (DMOPs), DMOPs with a changing number of objectives usually result in expansion or contraction of the Pareto front or Pareto set manifold. Knowledge transfer has been used for solving DMOPs, since it can transfer useful information from solving one problem instance to solve another related problem instance. However, we show that the state-of-the-art transfer algorithm for DMOPs with a changing number of objectives lacks sufficient diversity when the fitness landscape and Pareto front shape present nonseparability, deceptiveness or other challenging features. Therefore, we propose a knowledge transfer dynamic multi-objective evolutionary algorithm (KTDMOEA) to enhance population diversity after changes by expanding/contracting the Pareto set in response to an increase/decrease in the number of objectives. This enables a solution set with good convergence and diversity to be obtained after optimization. Comprehensive studies using 13 DMOP benchmarks with a changing number of objectives demonstrate that our proposed KTDMOEA is successful in enhancing population diversity compared to state-of-the-art algorithms, improving optimization especially in fast changing environments.

However, few studies have addressed DMOPs with a changing number of objectives (NObj). The most recent work is the Dynamic Two Archive Evolutionary Algorithm (DTAEA) [18]. Its main idea is to simultaneously maintain two co-evolving populations, i.e., a convergence archive (CA) and a diversity archive (DA), during the evolution. Whenever environmental changes occur, CA and DA are reconstructed to preserve as much convergence and diversity as possible in the new environment.
Considering that the reconstruction of CA and DA in DTAEA involves copying (optimal) solutions from the past problem instance to the next after changes, DTAEA can be regarded as a kind of knowledge transfer-based algorithm, as it makes use of knowledge acquired from solving the previous problem instance to solve the new problem instance. However, in this study, we show that DTAEA cannot adequately handle DMOPs with a changing NObj that contain more complex problem features, including PF shapes (convex, discontinuous, and mixed convex and concave) and fitness landscapes (nonseparability and deceptiveness). Specifically, the knowledge transfer (i.e., CA and DA reconstruction) in DTAEA is incapable of providing enough diversity in these complex scenarios. There are two reasons: the change in the NObj changes the distribution of the reconstructed CA on the true PF of the new problem instance, and the solutions uniformly sampled in the search space by the DA reconstruction are not uniformly distributed in the objective space because of the features of the more complex problems.
In this paper, we aim to answer the following research questions:
• How to increase diversity when solving DMOPs with a changing NObj, so as to improve knowledge transfer right after changes?
• How does knowledge transfer help the optimization process itself?
In order to answer these research questions, we propose to expand or contract the PS of the problem after the NObj increases or decreases, respectively, to improve the knowledge transfer. This strategy works better than DTAEA because DMOPs with a changing NObj usually result in the expansion or contraction of the dimension of the PS manifold [18]. Experimental studies have been carried out on 13 DMOPs with a changing NObj, modified from 4 DTLZ [19] and 9 WFG [20] problems, to demonstrate the effectiveness of our proposed approach.
The novel contributions of our work are summarized as follows:
• Comprehensive experiments have been carried out on representative problems with complex features in the fitness landscape (nonseparability and deceptiveness) and complex PF shapes (convex, discontinuous, and mixed convex and concave) to understand the limitations of the state-of-the-art algorithm DTAEA. Our analyses reveal that DTAEA lacks diversity when solving more complex DMOPs with a changing NObj;
• A novel knowledge transfer-based method, called knowledge transfer dynamic multi-objective evolutionary algorithm (KTDMOEA), is proposed. This method introduces PS expansion and contraction mechanisms to enhance diversity when the NObj changes in DMOPs;
• Systematic computational studies have been conducted to compare our proposed KTDMOEA with 5 algorithms on 13 DMOPs with a changing NObj under different frequencies and types of changes in the NObj. Experimental results show that our algorithm is competitive against all compared algorithms.
The remainder of this paper is organized as follows. Section II describes related work on DMOPs with a changing NObj and evolutionary transfer optimization, as well as the motivation of our proposal. The proposed knowledge transfer-based algorithm is elaborated in Section III. Section IV describes the specific experimental setup. The experimental results are presented in detail in Section V. Section VI concludes this paper and points out possible future work.

II. RELATED WORK AND MOTIVATION
This section first reviews related work on DMOPs with a changing NObj and evolutionary transfer optimization. Then, a preliminary investigation of the existing work DTAEA [18] is conducted to reveal its limitations in solving DMOPs with complex problem features.

A. DMOPs with a Changing NObj
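In this paper, we focus on continuous minimization DMOPs defined as follows:
$$\min F(x, t) = (f_1(x, t), \ldots, f_{m(t)}(x, t))^T, \quad \text{subject to } x \in \Omega,\ t \in \Omega_t, \quad (1)$$
where $\Omega \subseteq \mathbb{R}^n$ is the decision (variable) space; $t$ is the discrete time instance; $\Omega_t \subseteq \mathbb{R}$ is the time space. $F(x, t) : \Omega \times \Omega_t \to \mathbb{R}^{m(t)}$ is the objective function vector that evaluates a candidate solution $x = (x_1, \ldots, x_n)$ at time $t$, and $m(t)$ is the number of objectives at time $t$.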
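To illustrate an objective vector whose length $m(t)$ changes over time, the following minimal sketch evaluates the standard DTLZ2 function for a given number of objectives. It assumes a fixed number of decision variables in $[0, 1]$; the benchmark F2 used later may split position and distance variables differently, and the names `dtlz2`, `evaluate` and `m_of_t` are hypothetical.

```python
import math


def dtlz2(x, m):
    """Evaluate standard DTLZ2 with m objectives on x in [0, 1]^n."""
    n = len(x)
    if not 2 <= m <= n:
        raise ValueError("need 2 <= m <= n")
    # Distance function g uses the last n - m + 1 variables.
    g = sum((xi - 0.5) ** 2 for xi in x[m - 1:])
    f = []
    for j in range(m):
        value = 1.0 + g
        # Cosine factors over the first m - 1 - j position variables.
        for xi in x[:m - 1 - j]:
            value *= math.cos(0.5 * math.pi * xi)
        # Every objective except the first carries one sine factor.
        if j > 0:
            value *= math.sin(0.5 * math.pi * x[m - 1 - j])
        f.append(value)
    return f


def evaluate(x, t, m_of_t):
    """F(x, t): the returned objective vector has length m(t).

    m_of_t is a hypothetical schedule mapping a time step to the NObj.
    """
    return dtlz2(x, m_of_t(t))
```

The same decision vector yields objective vectors of different lengths, for example `evaluate(x, 0, lambda t: 2 + t)` has two entries and `evaluate(x, 1, lambda t: 2 + t)` has three.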
Although the importance of tackling DMOPs with a changing NObj has been recognized and the concept is mentioned in [21][22][23][24], few works studied this problem until recently [18]. A comprehensive investigation of the challenges of DMOPs with a changing NObj was conducted in [18]. It was experimentally demonstrated that the key issues are how to propel crowded solutions to cover the whole PF when increasing the NObj, and how to pull unconverged solutions back to the PF with good diversity when decreasing the NObj. Bearing this challenge in mind, the authors in [18] proposed DTAEA to tackle DMOPs with a changing NObj, in which two complementary populations, CA and DA, are simultaneously maintained in the evolution process to focus on population convergence and diversity, respectively. Whenever environmental changes occur, CA and DA are reconstructed to preserve as much convergence and diversity as possible in the new environment. More specifically, when increasing the NObj, solutions in the old CA are all copied to the new CA. When decreasing the NObj, nondominated and dominated solutions of the old CA are all copied to the new CA and new DA, respectively. Therefore, DTAEA can be seen as a kind of knowledge transfer-based algorithm, as it makes use of knowledge acquired from previous solutions.
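The CA reconstruction rule summarized above can be sketched as follows. This is only an illustration of the copy and split rule stated in the text, not the DTAEA implementation of [18]; the function names and the `evaluate_new` callback (which returns the objective vector in the new environment) are assumptions.

```python
def dominates(a, b):
    """Return True if objective vector a Pareto-dominates b (minimization)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def reconstruct_ca(old_ca, evaluate_new, nobj_increased):
    """Return (new CA members, solutions handed over to the new DA)."""
    if nobj_increased:
        # Increasing the NObj: every old CA solution is copied to the new CA.
        return list(old_ca), []
    # Decreasing the NObj: re-evaluate, then split by dominance.
    objs = [evaluate_new(x) for x in old_ca]
    new_ca, to_da = [], []
    for x, fx in zip(old_ca, objs):
        if any(dominates(fy, fx) for fy in objs):
            to_da.append(x)   # dominated in the new environment
        else:
            new_ca.append(x)  # nondominated in the new environment
    return new_ca, to_da
```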

B. Evolutionary Transfer Optimization
Knowledge transfer has been applied to evolutionary computation to solve multi-objective optimization and dynamic multi-objective optimization problems [25]. Specifically, knowledge transfer is able to learn useful knowledge from related problem instances to solve the targeted problem instance [26,27]. However, evolutionary multi-tasking optimization (EMT) [26][27][28][29] differs from our scenario because EMT considers solving multiple tasks simultaneously, while our work considers solving different problem instances sequentially as the environment, e.g., the NObj, changes. At any given time, we solve only one problem instance, not multiple ones.
For dynamic multi-objective optimization, knowledge transfer can help to predict good solutions for the next change based on previously optimized solutions. The transfer learning-based dynamic multi-objective evolutionary algorithm (Tr-DMOEA) [14] was the first work to apply knowledge transfer to DMOPs; it uses transfer component analysis to transfer solutions in the PF of the previous environment and thereby generate an initial population for the next environment. An autoencoding evolutionary search is regarded as a knowledge transfer method to predict the movement of PSs based on nondominated solutions obtained before the change [30]. In [16], a manifold transfer learning method is applied to forecast the changing PSs over time. However, these knowledge transfer-based DMOEAs never considered a changing NObj and cannot solve DMOPs with a changing NObj, since they were designed to track the changing position or shape of PSs or PFs rather than to expand or contract the PS/PF. In general, it is always very challenging to decide what, when and how to transfer in DMOPs [31,32].

C. A Preliminary Investigation of DTAEA
Even though DTAEA has been computationally demonstrated in [18] to be effective on DMOPs with a changing NObj based on knowledge transfer, the test problems used to evaluate DTAEA are somewhat limited, as their problem features are relatively simple, such as a linear or concave PF shape and a fitness landscape with multimodality, bias or no special feature at all. In order to evaluate whether DTAEA is able to solve DMOPs with a changing NObj and more complex problem features, including PF shapes (convex, discontinuous, and mixed convex and concave) and fitness landscapes (nonseparability and deceptiveness), the benchmark problem WFG4 is arbitrarily selected from the WFG suite [20] as an example for an experimental investigation of the performance of DTAEA. In contrast, F2 [18] is arbitrarily selected from the DTLZ suite [19]. When increasing the NObj, the problems are set as bi-objective problems and DTAEA is given 1000 generations to evolve before the NObj is increased from 2 to 3. When decreasing the NObj, the problems are set as tri-objective problems and DTAEA is given 1000 generations to evolve before the NObj is decreased from 3 to 2.
Figures 1 and 2 show the distribution of the old CA, the reconstructed CA and the reconstructed DA obtained by DTAEA on the two problems F2 and WFG4 in the first generation right after changes, for the cases of increasing the NObj from 2 to 3 and decreasing the NObj from 3 to 2, respectively. It should be noted that when increasing the NObj, solutions in the old CA are all copied to the new CA. Therefore, in Figure 1, 'CA' represents both the solutions in the old CA and those in the new CA. It is clear from Figure 1 that when increasing the NObj from 2 to 3, the new CA does not have good diversity on either F2 or WFG4. As for the reconstructed DA, it has a good level of diversity on F2. As shown in Figure 1, solutions randomly generated in the search space cover the whole area over the true PF. However, on WFG4, solutions in the reconstructed DA cover only a part of the area over the PF.
Similarly, it can be seen from Figure 2 that solutions in the DA also have good diversity on F2 when decreasing the NObj from 3 to 2. However, on WFG4, there are some areas close to the high values of the second objective in the objective space without any solutions covered by the DA. It is clear that on both problems and in the two cases (increasing and decreasing the NObj), the reconstructed CA and DA do not provide enough diversity. Therefore, the CA and DA reconstruction of DTAEA cannot provide enough diversity on DMOPs with more complex problem features. The reason is that the features of the more complex problems cause solutions uniformly sampled in the search space not to be uniformly distributed in the objective space.

III. KNOWLEDGE TRANSFER DYNAMIC MULTI-OBJECTIVE EVOLUTIONARY ALGORITHM (KTDMOEA)
In this section, we present our proposed knowledge transfer dynamic multi-objective evolutionary algorithm (denoted as KTDMOEA), which is designed to tackle more complex DMOPs with a changing NObj. The main component of KTDMOEA is the proposed diversity enhanced knowledge transfer, which is designed to improve population diversity right after changes in DMOPs with a changing NObj. KTDMOEA's flowchart is shown in Figure 3. As the flowchart shows, KTDMOEA maintains a single population. Whenever there is a change in the NObj, the process of knowledge transfer is invoked to reconstruct the population such that it has increased diversity; otherwise, the other procedures of the evolution process are carried out. The novel process of knowledge transfer through PS expansion/contraction proposed in this paper is elaborated in Section III-A. Then, Section III-B presents the overall evolutionary process of the proposed KTDMOEA.

A. Diversity Enhancing Knowledge Transfer
As DMOPs with an increasing/decreasing NObj result in the expansion/contraction of the PS/PF, we propose to enhance diversity through PS expansion and contraction for an increasing and decreasing NObj, respectively. This strategy is targeted at enhancing knowledge transfer right after changes. The specific details of PS expansion and contraction are given in Section III-A1 and Section III-A2, respectively.
1) Expand the PS when Increasing the NObj: Increasing the NObj usually results in the expansion of the dimension of the PS/PF manifold. Therefore, we propose to expand the PS when increasing the NObj, so as to increase the population diversity.
The idea of PS expansion in the decision space is illustrated in Figure 4(a). Note that this figure is drawn only to demonstrate the process of PS expansion; the specific PF and expansion direction in real problems may be different. As shown in the figure, suppose the blue point is one extreme point in the PS before the change ($PS_t$); the blue line is the Pareto optimal set at time step $t$ with two objectives; the expansion direction is found by generating several solutions around the blue point.
The framework of PS expansion is exhibited in Algorithm 1. As Algorithm 1 presents, in order to achieve PS expansion, the first step is to search for the potential PS expansion directions, whose procedure is given in Algorithm 2 and explained as follows. Given a set of Pareto optimal solutions at time step $t$ ($PS_t$), the algorithm first finds the solutions with the maximum objective value for each objective as the set of extreme points (denoted as $P_e$) in line 1 of Algorithm 2. The use of extreme points helps to ensure that the found expansion directions are not misleading. If middle points of the PF were selected, the found expansion directions might lie along the direction of the PF, thus wasting computational resources. Then, in line 2, a random extreme point $x_e$ is selected from the set $P_e$ as the initial point of the expansion directions. In line 3, a solution set $P_{var}$ with the same size as the population is randomly produced around $x_e$ via polynomial mutation [34] to serve as the candidate set of end points of expansion directions; it is also called the detective population.

Algorithm 2: Search of Expansion Direction
Input: Pareto optimal solution set at time $t$ ($PS_t$)
Output: Set of the searched expansion directions $D$, or NULL.
1 Find the solutions with the maximum objective value for each objective as the set of extreme points ($P_e$);
2 Randomly select one solution from $P_e$ and regard it as $x_e$;
3 Produce a solution set $P_{var}$ (also called detective population) with the same size as the population size through randomly generating solutions via the polynomial mutation [34]

Lines 4 to 10 of Algorithm 2 ensure that the remaining solutions in $P_{non}$ are nondominated and lie in different sub-spaces from the extreme point, so that they can form valid expansion directions together with the extreme point. In line 4, all solutions in $P_{var}$ are evaluated in the new environment, and dominated solutions are discarded after sorting them using nondominated sorting [33]. Then, if all solutions in $P_{var}$ are nondominated by those in $PS_t$, $P_{non}$ is set as $P_{var}$; otherwise, all solutions in $P_{var}$ that are dominated by those in $PS_t$ are deleted and the set of remaining solutions is regarded as $P_{non}$ in line 8. In line 9, evenly generated weight vectors following the method in [18] are used to estimate the density of $P_{non}$ and $PS_t$ with the method introduced in Section III-B. Solutions in $P_{non}$ that are in the same subarea as those in $PS_t$ are deleted in line 10. If no solution remains in $P_{non}$, no expansion direction is found and NULL is returned; otherwise, in line 13, the points are used to form a set of lines that represents the directions (denoted as $D$) by regarding the extreme point $x_e$ from line 2 as the starting point and the solutions in $P_{non}$ as the end points. Finally, duplicated expansion directions are deleted from $D$ and $D$ is returned.
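The direction search described above can be sketched in simplified form as follows. Several parts are assumptions rather than the paper's procedure: a clipped uniform perturbation stands in for polynomial mutation [34], the sub-region density filtering of lines 9 and 10 is omitted, directions are left unnormalized, and all function and parameter names (`search_expansion_directions`, `evaluate_new`, `radius`) are hypothetical.

```python
import random


def dominates(a, b):
    """Return True if objective vector a Pareto-dominates b (minimization)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def search_expansion_directions(ps_t, objs_old, evaluate_new, lower, upper,
                                n_detect, radius=0.1, rng=random):
    """Return a list of candidate expansion directions, or None."""
    n_obj_old = len(objs_old[0])
    # Line 1: one extreme point per old objective (maximum objective value).
    extreme = sorted({max(range(len(ps_t)), key=lambda i: objs_old[i][k])
                      for k in range(n_obj_old)})
    # Line 2: pick one extreme point at random.
    x_e = ps_t[rng.choice(extreme)]
    # Line 3: detective population around x_e. ASSUMPTION: clipped uniform
    # perturbation is used here in place of polynomial mutation.
    p_var = []
    for _ in range(n_detect):
        p_var.append([min(up, max(lo, xk + radius * (up - lo) * rng.uniform(-1.0, 1.0)))
                      for xk, lo, up in zip(x_e, lower, upper)])
    # Line 4: evaluate in the new environment, keep mutually nondominated ones.
    f_var = [evaluate_new(x) for x in p_var]
    survivors = [i for i, fi in enumerate(f_var)
                 if not any(dominates(fj, fi) for fj in f_var)]
    # Line 8: drop candidates dominated by PS_t under the new objectives.
    f_ps = [evaluate_new(x) for x in ps_t]
    p_non = [p_var[i] for i in survivors
             if not any(dominates(fp, f_var[i]) for fp in f_ps)]
    # Lines 9-10 (sub-region density filtering) are omitted in this sketch.
    # Line 13: directions from x_e to each remaining point, duplicates removed.
    directions = {tuple(xk - ek for xk, ek in zip(x, x_e)) for x in p_non}
    directions.discard(tuple(0.0 for _ in x_e))
    if not directions:
        return None
    return [list(d) for d in sorted(directions)]
```

With a toy two-variable problem whose new third objective depends only on the second variable, the returned directions have a nonzero component along that variable, which is the kind of off-manifold direction the search is meant to detect.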
After getting the expansion directions, the next step is to expand the PS to generate transferred solutions following the expansion directions. The detailed procedures are shown in Algorithm 1. Given the Pareto optimal solution set at time $t$, $PS_t$, evenly select solutions from it with the size of $N_{base}$ as $P_{base}$, where $N$ is the size of the population; $M_{t+1}$ is the NObj at time step $t+1$; $N_{dir}$ is the number of expansion directions in $D$; and $\theta$ is the number of solutions to generate along each expansion direction, which is a parameter to be set by the user. $N - M_t$ is designed to enable the $M_t$ extreme points in $PS_t$ to be preserved to the next environment. Then, in line 3 of Algorithm 1, generate $\theta$ solutions along each expansion direction in $D$ to fill the transferred solution set $P_{tr}$ through Equation (3), which produces a transferred solution based on a base solution and an expansion direction:
$$x_{tr} = x_i + C_i^j \cdot rand() \cdot D_j, \quad (3)$$
where $x_i$ is the $i$-th solution in the base population $P_{base}$; $C_i^j$ is a float variable enabling some expanded solutions to reach the boundary of the decision space, whose detailed calculation is discussed in the next paragraph; $rand()$ is a function that generates a random number in $(0, 1]$; and $D_j$ is the $j$-th expansion direction in the set $D$. After generating transferred solutions through PS expansion, if $P_{tr}$ is not full, evenly select solutions from $PS_t$ in the objective space with the size of $N - N_{base} \cdot \theta \cdot N_{dir}$.
The calculation of $C_i^j$ should follow the criterion that all solutions expanded from the solutions $x_i$ in $P_{base}$ stay within the bounds of each decision variable while reaching as close to the boundary of the search space as possible. Given a base solution $x_i$ and one expansion direction $D_j$, suppose $para_k$ is the value that makes the $k$-th variable of the generated solution reach the boundary of this variable. Each $para_k$ can then be calculated according to whether the expansion direction is positive or negative, via the following equation:
$$para_k = \begin{cases} (upper_k - x_i^k) / D_j^k, & \text{if } D_j^k > 0, \\ (lower_k - x_i^k) / D_j^k, & \text{if } D_j^k < 0, \end{cases} \quad (4)$$
where $upper_k$ and $lower_k$ are the upper bound and lower bound of the $k$-th dimension of the decision space; $x_i^k$ is the $k$-th decision value; and $D_j^k$ is the value of the direction $D_j$ at the $k$-th dimension. In order to ensure that each generated solution is located within the feasible region, $C_i^j = \min_{k=1,\ldots,n} para_k$, where $n$ is the dimension of the decision space.
2) Contract the PS when Decreasing the NObj: It has been observed that decreasing the NObj usually results in the contraction of the dimension of the PS/PF manifold. Therefore, we propose to contract the PS when decreasing the number of objectives.
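The boundary step $C_i^j$ and the generation of transferred solutions along expansion directions in Section III-A1 can be sketched as below. This is a minimal sketch under assumptions: a zero direction component is treated as imposing no constraint (the text only covers positive and negative components), and the names `step_to_boundary` and `expand_ps` are hypothetical.

```python
import random


def step_to_boundary(x, d, lower, upper):
    """Largest c >= 0 such that x + c * d stays inside the box bounds."""
    paras = []
    for xk, dk, lo, up in zip(x, d, lower, upper):
        if dk > 0:
            paras.append((up - xk) / dk)   # distance to the upper bound
        elif dk < 0:
            paras.append((lo - xk) / dk)   # distance to the lower bound
        # ASSUMPTION: dk == 0 never hits a bound, so it adds no constraint.
    return min(paras) if paras else 0.0


def expand_ps(base, directions, lower, upper, theta, rng=random):
    """Generate theta solutions per (base solution, direction) pair."""
    transferred = []
    for x in base:
        for d in directions:
            c = step_to_boundary(x, d, lower, upper)
            for _ in range(theta):
                r = 1.0 - rng.random()     # uniform in (0, 1]
                transferred.append([xk + c * r * dk for xk, dk in zip(x, d)])
    return transferred
```

For example, with `x = [0.2, 0.5]`, direction `[1.0, -1.0]` and unit bounds, the second variable reaches its lower bound first, so the step is 0.5 and every generated point stays inside the box.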
The idea of PS contraction in the decision space is illustrated in Figure 4(b). Note that this figure is drawn only to demonstrate the process of PS contraction; the specific cases in real problems may be different. PS contraction tries to generate well-spread and uniformly distributed solutions given the current nondominated solution set. Lines 4 to 7 of Algorithm 3 are designed to improve the spread of the population, and line 8 tries to make the distribution of the population more even. As illustrated in Figure 4(b), the black star points are the optimal solutions found for the problem with 3 objectives. When decreasing the NObj from 3 to 2, the black star points in the true PS of the bi-objective DMOP (blue line) are selected as the nondominated solutions.
One solution (denoted by the red point) is generated based on the extreme point (denoted by the blue star point) following the spread direction between the extreme point's closest point and itself, so as to increase the spread of the population. Other solutions (denoted by the blue points) are produced from two randomly selected solutions in the nondominated solution set, to improve the evenness of the distribution of the population (supposing there is no bias in the problem).

6 Produce a new solution $x_{new}$ along the direction $D_j$ to make it reach the boundary of the search space;
7 Put the solution $x_{new}$ into $P_{tr}$;
end
8 Randomly select two solutions ($x_a$ and $x_b$) from $P_{tr}$ and generate one solution between them to fill $P_{tr}$ until the size reaches the population size $N$;
Return $P_{tr}$

Given the Pareto optimal solution set $PS_t$, the first step in line 1 of Algorithm 3 is to evaluate all solutions in $PS_t$ in the new environment and put the nondominated solutions into $P_{non}$ after conducting nondominated sorting [33] on $PS_t$. Then, all solutions of $P_{non}$ are put into the set of transferred solutions $P_{tr}$ in line 2.
Then, in line 3, find the solutions in $P_{non}$ with the maximum objective value for each objective as the set of extreme points $P_e$. Subsequently, for each extreme point $x_e^j$ in the set $P_e$, find the solution $P_{non}^j$ in $P_{non}$ that is closest to $x_e^j$, connect $P_{non}^j$ to $x_e^j$ as a direction and normalize it as $D_j = \frac{x_e^j - P_{non}^j}{|x_e^j - P_{non}^j|}$, as shown in line 5. Then, produce a new solution $x_{new}$ along $D_j$ to make it reach the boundary of the search space according to the following equation:
$$x_{new} = x_e^j + C_j \cdot D_j, \quad (5)$$
where $C_j$ is a float variable whose calculation method is the same as that in the process of PS expansion, as shown in Equation (4). The newly generated solution $x_{new}$ is then put into $P_{tr}$. Later on, in line 8, randomly select two solutions ($x_a$ and $x_b$) from $P_{tr}$ and generate one solution between them through Equation (6) to fill $P_{tr}$ until the size of $P_{tr}$ reaches the population size $N$.
We believe this strategy of expanding/contracting the PF and PS works better than DTAEA [18] because increasing or decreasing the NObj usually results in the expansion or contraction of the dimension of the PS manifold. When increasing the NObj, the proposed PS expansion is able to find the expansion directions and generate solutions along these directions, increasing the population diversity in the new environment. When decreasing the NObj, the two mechanisms in PS contraction are targeted at improving the spread and the evenness of the distribution. Therefore, the solutions produced by PS expansion and contraction may achieve better diversity than those of DTAEA.
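The PS contraction steps of Section III-A2 can be sketched as follows. The sketch makes several assumptions: a pairwise dominance check replaces nondominated sorting [33], the "solution between" two parents is taken as a uniformly random point on the segment joining them (the exact form of Equation (6) is not reproduced here), and the names `contract_ps` and `evaluate_new` are hypothetical.

```python
import math
import random


def dominates(a, b):
    """Return True if objective vector a Pareto-dominates b (minimization)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def step_to_boundary(x, d, lower, upper):
    """Largest c >= 0 such that x + c * d stays inside the box bounds."""
    paras = [(up - xk) / dk if dk > 0 else (lo - xk) / dk
             for xk, dk, lo, up in zip(x, d, lower, upper) if dk != 0]
    return min(paras) if paras else 0.0


def contract_ps(ps_t, evaluate_new, lower, upper, n, rng=random):
    """Build a transferred population of size n after the NObj decreases."""
    objs = [evaluate_new(x) for x in ps_t]
    # Lines 1-2: keep solutions that are nondominated in the new environment.
    keep = [i for i, fx in enumerate(objs)
            if not any(dominates(fy, fx) for fy in objs)]
    p_non = [list(ps_t[i]) for i in keep]
    f_non = [objs[i] for i in keep]
    p_tr = [list(x) for x in p_non]
    # Line 3: one extreme point per objective (maximum objective value).
    extreme = sorted({max(range(len(p_non)), key=lambda i: f_non[i][k])
                      for k in range(len(f_non[0]))})
    for e in extreme:
        x_e = p_non[e]
        others = [x for i, x in enumerate(p_non) if i != e]
        if not others:
            continue
        # Line 5: normalized direction from the closest solution to x_e.
        nearest = min(others, key=lambda x: math.dist(x, x_e))
        norm = math.dist(x_e, nearest)
        if norm == 0.0:
            continue
        d = [(a - b) / norm for a, b in zip(x_e, nearest)]
        # Lines 6-7: push the extreme point to the boundary along d.
        c = step_to_boundary(x_e, d, lower, upper)
        p_tr.append([a + c * dk for a, dk in zip(x_e, d)])
    # Line 8: fill up by interpolating between two random members.
    # ASSUMPTION: a random convex combination of the two parents.
    while 2 <= len(p_tr) < n:
        x_a, x_b = rng.sample(p_tr, 2)
        r = rng.random()
        p_tr.append([a + r * (b - a) for a, b in zip(x_a, x_b)])
    return p_tr
```

Because every generated point is either a boundary projection or a convex combination of points already inside the box, the transferred population stays within the variable bounds.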

B. KTDMOEA
As the PS expansion/contraction is designed to enhance the diversity of the population after changes, the proposed KTDMOEA does not maintain a separate DA. As a result, there is no need to update a DA as in DTAEA. The population update mechanism in KTDMOEA is the same as the update mechanism of CA in [18].
The flowchart of KTDMOEA is shown in Figure 3. The overall framework of the proposed KTDMOEA is given in Algorithm 4. KTDMOEA starts by generating an initial population $Pop$ of size $N$, as shown in line 1. While the stopping criteria are not satisfied, the following steps are carried out. First, detect whether an environmental change has occurred. If the NObj is detected to increase, invoke the process of PS expansion on $Pop$ using Algorithm 1. If the NObj decreases, invoke the process of PS contraction on $Pop$ using Algorithm 3. If no change is detected, conduct the evolutionary optimization process on $Pop$. In line 9, an offspring population $A$ is produced through the following two steps until the size of $A$ reaches the population size $N$. Firstly, randomly pick two solutions from $Pop$ as the parent solutions via the mating selection. Then, those two solutions are used to generate two offspring solutions via appropriate crossover and mutation operators. Here, we utilize the simulated binary crossover [ Lastly, the generated offspring population $A$ is used to update $Pop$ with the CA update mechanism in DTAEA [18]. Due to space limitation, the CA update mechanism of DTAEA is not introduced here. Interested readers can refer to [18].

IV. EXPERIMENTAL SETUP
In this section, experimental studies are designed to verify whether the improved knowledge transfer-based approach answers the research questions mentioned in Section I. Analyses will be carried out to reveal whether existing DMOEAs for DMOPs are able to deal with a changing NObj despite not being designed to do so, and how well static MOEAs could perform on DMOPs with a changing NObj.These are important baselines.

A. Benchmark Problems
Two suites of multi-objective optimization test problems, DTLZ [19] and WFG [20], are modified to be DMOPs with a changing NObj. Four DMOPs with a changing NObj from DTLZ1-DTLZ4 are renamed as F1-F4, the same as in [18]. These two suites of benchmark functions are used to verify that the proposed algorithm is able to deal with problems with both simple and complex problem features. Detailed descriptions of the problem features can be found in Section I of our Supplementary File, which can be downloaded from https://github.com/ganrandom/Supplementary-Filefor-KTDMOEA.
There are two different sequences of changes for these benchmark problems: 1) The initial NObj is set as 2. It first increases from 2 to 7 one by one and then decreases from 7 to 2 one by one (simply denoted as '2-7-2 one by one'), which was used in [18]; 2) The initial NObj is set as 7. It first decreases from 7 to 2 one by one and then increases from 2 to 7 one by one (simply denoted as '7-2-7 one by one').
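The two change sequences can be enumerated with a small helper; the function name `nobj_sequence` is hypothetical. Each sequence visits 11 values of the NObj and therefore contains 10 environmental changes.

```python
def nobj_sequence(start, turn):
    """NObj values visited one by one from start to turn and back to start."""
    step = 1 if turn > start else -1
    forward = list(range(start, turn + step, step))
    backward = list(range(turn - step, start - step, -step))
    return forward + backward


seq_272 = nobj_sequence(2, 7)  # '2-7-2 one by one'
seq_727 = nobj_sequence(7, 2)  # '7-2-7 one by one'
```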

B. Compared Algorithms
In our experimental studies, five algorithms are selected for comparison, so as to verify the performance of our proposal against the state-of-the-art. Considering their popularity and good performance on static MOPs, the elitist nondominated sorting genetic algorithm (NSGA-II) [33] and the multi-objective evolutionary algorithm based on decomposition (MOEA/D) [36] are chosen to verify whether they are able to tackle DMOPs with a changing NObj. For NSGA-II and MOEA/D, whenever there is a change, the whole population of the last generation in the old environment is simply copied to the next generation and then re-evaluated in the new environment to respond to the change in the NObj. Besides, two popular and state-of-the-art DMOEAs, DNSGA-II [2] and MOEA/D-KF [13], which were specifically designed for DMOPs with changing shapes and/or positions of the PS and/or PF, are selected for comparison, so as to find out whether DMOEAs for DMOPs with a fixed NObj are able to deal with DMOPs with a changing NObj. These four algorithms are compared to verify whether it is necessary to develop extra approaches tailored for DMOPs with a changing NObj. In order to verify whether the improved knowledge transfer-based approach (KTDMOEA) answers the research questions mentioned in Section I, DTAEA [18], one of the popular and recently developed algorithms targeted at handling changes in the NObj, is also chosen for comparison. Section II of the supplementary file presents detailed descriptions of these algorithms.

C. Parameter Settings
The parameters of these compared algorithms are set as follows:
• Population size: 300, the same as that of DTAEA. θ in KTDMOEA is set as 2. The impact of θ on KTDMOEA's performance will be analyzed in Section V-E;
• Several different frequencies of change: τ_t is set as 5, 25, 50 and 200. These values are used to assess the effects of different algorithms under different frequencies of change;
• All algorithms are run 31 times independently, also the same as in DTAEA's work [18];
• 1000 generations are given to each algorithm before the first change so that the population before the change can converge;
• The crossover probability is p_c = 1.0 and its distribution index is η_c = 20. The mutation probability is p_m = 1/n (where n denotes the number of decision variables) and its distribution index is η_m = 20. These parameters are chosen because of their good performance on continuous problems, which has been analyzed in [37] and [19];
• The neighbourhood size and the number n_r of solutions allowed to be replaced in MOEA/D are set to 20 and 2, respectively, which is the same as in the original paper [36].

D. Performance Metrics
• Hypervolume (HV) [38] comprehensively measures the convergence and diversity of solution sets; the larger the better.
• Generational Distance (GD) [3][4] evaluates the convergence of obtained solution sets; the smaller the better.
• Maximum Spread (MS) [39] assesses the diversity of solution sets; the larger the better.
Note that these three metrics are used to measure the quality of a found solution set. They can also be used to measure the comprehensive performance of an algorithm by averaging the metric values of all obtained solutions under multiple environmental changes.
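As an illustration of the convergence and spread metrics, the sketch below computes one common variant of GD (mean Euclidean distance to the nearest reference point) and a maximum spread measure relative to a sampled reference front. The exact definitions in [3], [4] and [39] may differ in normalization; the clamping of non-overlapping ranges to zero and the function names are assumptions of this sketch.

```python
import math


def generational_distance(front, reference):
    """Mean distance from each obtained point to its nearest reference point."""
    return sum(min(math.dist(p, r) for r in reference) for p in front) / len(front)


def maximum_spread(front, reference):
    """Fraction of the reference front's extent covered by the obtained front."""
    m = len(reference[0])
    total = 0.0
    for k in range(m):
        f_min = min(p[k] for p in front)
        f_max = max(p[k] for p in front)
        r_min = min(r[k] for r in reference)
        r_max = max(r[k] for r in reference)
        # ASSUMPTION: a negative overlap (disjoint ranges) counts as zero.
        overlap = max(0.0, min(f_max, r_max) - max(f_min, r_min))
        total += (overlap / (r_max - r_min)) ** 2
    return math.sqrt(total / m)
```

A front that coincides with the two end points of a bi-objective reference front attains a GD of 0 and an MS of 1, while a front collapsed to a single point attains an MS of 0.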

V. EXPERIMENTAL RESULTS AND ANALYSES
In order to achieve the objectives of the experiment, i.e., answering the research questions and verifying whether existing static MOEAs and DMOEAs designed for a fixed NObj are able to tackle DMOPs with a changing NObj, experimental results of all compared algorithms are presented in this section. Further analyses of the improved knowledge transfer, of the performance under different NObj changing sequences, and of the impact of algorithm parameters are also given in this section.
Three metrics (HV, GD and MS) are used to measure the quality of the solutions found by the six algorithms at the first generation after the change and in the last generation before the next change. To show the significant superiority of the proposed KTDMOEA to the other algorithms across all problem instances in general, Friedman and Nemenyi statistical tests [40] are adopted across all benchmark problems regarding the three metrics (HV, GD and MS) of the six compared algorithms. Larger values of HV and MS indicate a better algorithm, so a larger Friedman ranking is better for these two metrics. Conversely, a smaller Friedman ranking is better for GD.
The mean metric value over 31 independent runs that each algorithm obtains on one problem with one frequency of change at each environmental change is regarded as one observation of the Friedman and Nemenyi tests. Therefore, there are 520 observations (13 problems, 4 different frequencies of change and 10 environmental changes) for each algorithm in the Friedman and Nemenyi tests. Additionally, in order to show the significant superiority of the proposed KTDMOEA to the other algorithms on each individual problem under each parameter setting, the Wilcoxon rank sum test at the 5% significance level is applied on each benchmark problem regarding each metric of the six compared algorithms at each parameter setting. Therefore, there are 31 observations obtained from 31 independent runs for each algorithm on each problem and parameter setting in the Wilcoxon rank sum test.
Due to the space limitation of the paper, only the results of the Friedman and Nemenyi statistical tests are presented here. The Wilcoxon rank sum test results are shown in the Supplementary File. Mean and standard deviation values of HV, GD and MS of the solutions obtained in the first generation after changes and in the last generation before the next change, averaged across 10 environmental changes in the two sequences of changes '2-7-2 one by one' and '7-2-7 one by one', are also presented in the Supplementary File. Moreover, mean and standard deviation values of HV, GD and MS of the solutions obtained at the first generation after changes and at the last generation before the next change at each environmental change over 31 independent runs in those two sequences of changes are also recorded and presented in the Supplementary File.
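The Friedman ranking used here can be sketched as follows: each observation (one row) ranks the algorithms (the columns) in ascending order of metric value, tied values share their mean rank, and the ranks are averaged over all observations. The statistic below is the standard Friedman chi-square without tie correction; the function names are hypothetical, and in the setting above the input would have 520 rows and 6 columns.

```python
def average_ranks(row):
    """Ascending ranks (1 = smallest value); tied values share the mean rank."""
    order = sorted(range(len(row)), key=lambda i: row[i])
    ranks = [0.0] * len(row)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and row[order[end + 1]] == row[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2.0 + 1.0
        for idx in order[pos:end + 1]:
            ranks[idx] = mean_rank
        pos = end + 1
    return ranks


def friedman(observations):
    """Return (mean rank per algorithm, Friedman chi-square statistic)."""
    n = len(observations)
    k = len(observations[0])
    sums = [0.0] * k
    for row in observations:
        for j, r in enumerate(average_ranks(row)):
            sums[j] += r
    mean_ranks = [s / n for s in sums]
    stat = (12.0 * n / (k * (k + 1))
            * (sum(r * r for r in mean_ranks) - k * (k + 1) ** 2 / 4.0))
    return mean_ranks, stat
```

With ascending ranks, the algorithm with the largest HV or MS receives the largest rank, which matches the convention that a larger Friedman ranking is better for HV and MS and a smaller one is better for GD.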

A. Initial Effectiveness of Knowledge Transfer
In order to verify (1) whether the proposed PS expansion/contraction mechanism is able to increase diversity so as to improve knowledge transfer after the change, and (2) whether the MOEAs not tailored for DMOPs with a changing NObj can achieve this aim after the change, the quality of the solutions obtained by all algorithms in the first generation after changes is compared.
1) NObj increasing from 2 to 7 and then decreasing from 7 to 2: Figure 5 presents the Nemenyi post-hoc test results among HV, GD and MS of the solutions obtained at the first generation after changes by the 6 algorithms. The Friedman test detects significant differences in average values for HV, GD and MS with p-values of 3.57E-251, 9.14E-256, and 1.44E-117, respectively.
Overall, it can be observed from Figure 5 that, when comparing all algorithms, KTDMOEA significantly outperforms all others in all three metrics. More details can be found in Table 2 of the Supplementary File. This implies that the proposed knowledge transfer technique via PS expansion/contraction indeed improves the diversity and maintains the convergence of transferred solutions right after changes, under all frequencies of change on most problems.
For readers who want to examine the details, mean and standard deviation values for HV, GD and MS when the NObj increases from 2 to 7 and then decreases from 7 to 2 are presented in Tables 6, 7 and 8 of the Supplementary File, respectively.
2) NObj decreasing from 7 to 2 and then increasing from 2 to 7: Figure 6 presents the Nemenyi post-hoc test results among HV, GD and MS of the solutions obtained at the first generation by the 6 algorithms. The Friedman test detects significant differences in average values for HV, GD and MS with p-values of 1.97E-240, 4.72E-221, and 6.87E-113, respectively.
Fig. 6. Friedman ranking among HV, GD and MS of obtained solutions at the first generation by 6 algorithms in the changing sequence of firstly decreasing the NObj from 7 to 2 and then increasing it from 2 to 7, both one by one.
Overall, it can be seen from the Friedman test results in Figure 6 that KTDMOEA performs significantly better than all others regarding the HV and MS metrics. There is no significant difference between KTDMOEA and DTAEA regarding GD. More details can be found in Table 3 of the Supplementary File. This further supports that the proposed knowledge transfer technique via PS expansion/contraction indeed improves the diversity and maintains the convergence of transferred solutions right after changes, under all frequencies of change on most problems.
For readers who are interested in details, mean and standard deviation values for HV, GD and MS in the benchmark of decreasing the NObj from 7 to 2 and then increasing it from 2 to 7 are presented in Tables 9, 10 and 11 of the Supplementary File, respectively. The comparison results of all algorithms at each NObj regarding HV, GD and MS are presented in Tables 44-53, Tables 54-63 and Tables 64-73 of the Supplementary File, respectively.
3) Why Does Knowledge Transfer Usually Get Better Solution Quality Right after Changes?: In this section, two examples are presented to elaborate the reason why the proposed PS expansion/contraction works well on most problems.
Figure 7 shows the distributions of the old population, the detective population and the transferred population via PS expansion on F2 and WFG1 when increasing the NObj from 2 to 3. It is clear that nondominated solutions in the detective population are still nondominated by the selected extreme point. Following steps 5 to 10 in Algorithm 2, there are only two solutions in P_var, which are located in areas away from that of the old population. Then, each of those two solutions is regarded as the ending point of an expansion direction, with the extreme point as the starting point of the direction. Therefore, when evenly selecting solutions from the old population to conduct the PS expansion, almost all areas of F2 can be covered right after the change. Even for WFG1, a large area of the PF is covered by the transferred solutions. In addition, it is clear that some of the transferred solutions are able to reach the boundary of the PF.
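The expansion step described above can be sketched as follows. This is only an illustration under several assumptions that are not specified in this section: base solutions are picked at a regular index stride from the old population, `theta` new solutions are placed along each direction at multiples of a hypothetical `step` length, and results are clipped to hypothetical box bounds `lower` and `upper`.

```python
import math


def unit_direction(start, end):
    """Normalized vector pointing from `start` (extreme point) to `end`."""
    diff = [e - s for s, e in zip(start, end)]
    norm = math.sqrt(sum(d * d for d in diff))
    return [d / norm for d in diff]


def expand_ps(old_pop, directions, theta, step, lower, upper, n_base):
    """Generate transferred solutions by moving base solutions along directions.

    old_pop:    list of decision vectors from the previous environment
    directions: list of unit vectors in decision space
    theta:      number of solutions generated along each direction
    step:       hypothetical spacing between generated solutions
    """
    # Evenly pick n_base expansion bases (index stride is an assumption).
    stride = max(1, len(old_pop) // n_base)
    bases = old_pop[::stride][:n_base]
    transferred = [list(b) for b in bases]
    for base in bases:
        for d in directions:
            for m in range(1, theta + 1):
                child = [
                    min(max(b + m * step * di, lo), hi)
                    for b, di, lo, hi in zip(base, d, lower, upper)
                ]
                transferred.append(child)
    return transferred
```

With `n_base` bases, `|D|` directions and `theta` points per direction, the sketch returns `n_base * (1 + |D| * theta)` solutions, which shows how the old PS is spread into the newly opened region of the search space.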
Figure 8 presents the distributions of the old population and the transferred solutions via PS contraction on F2 and WFG1 at the first generation when decreasing the NObj from 3 to 2 in the changing NObj sequence of firstly decreasing from 7 to 2 and then increasing from 2 to 7. It is clear that on those two problems the transferred population via PS contraction has better convergence and diversity than the old population.

B. How Does Knowledge Transfer Help Optimization?
In order to verify whether the proposed KTDMOEA can find solutions with better convergence and diversity in the last generation after optimization against all other algorithms, the solution quality of all compared algorithms after optimization and before the next change is compared.
1) NObj increasing from 2 to 7 and then decreasing from 7 to 2: Figure 9 presents the Nemenyi post-hoc test results among HV, GD and MS of the solutions obtained at the last generation after optimization by the 6 algorithms. The Friedman test detects significant differences in average values for HV, GD and MS with p-values of 6.83E-215, 6.22E-160, and 4.27E-164, respectively.
Overall, it can be seen from those statistical test results that KTDMOEA performs significantly better than or the same as the other approaches. Specifically, it is clear from the Friedman ranking results in Figure 9 that KTDMOEA obtains significantly the best results among all compared algorithms regarding HV and GD values. It is the equal best, together with DTAEA and NSGA2, regarding the MS value. These three algorithms outperform the other algorithms regarding the MS value. The statistical results show that the proposed knowledge transfer is able to help the optimization, which achieves better convergence and at least similar diversity compared to the state-of-the-art algorithms when the NObj increases from 2 to 7 and then decreases from 7 to 2, under all frequencies of change on most problems. More details can be seen in Table 4 of the Supplementary File.
2) NObj decreasing from 7 to 2 and then increasing from 2 to 7: Figure 10 presents the Nemenyi post-hoc test results among HV, GD and MS of the solutions obtained at the last generation after optimization by the 6 algorithms. The Friedman test detects significant differences in average values for HV, GD and MS with p-values of 3.56E-223, 1.58E-129, and 4.98E-183, respectively.
It can be seen from the Friedman test results in Figure 10 that KTDMOEA achieves significantly better results than all other algorithms regarding the HV and GD metrics. As for the MS results, KTDMOEA and DNSGA2 rank second in the Friedman ranking; both are outperformed only by NSGA2. Overall, those statistical results imply that the proposed knowledge transfer is able to help the optimization in obtaining better convergence and at least similar diversity compared to the state-of-the-art algorithms in the changing sequence of firstly decreasing from 7 to 2 and then increasing from 2 to 7, under all frequencies of change on most problems.
3) Why Does Knowledge Transfer Help Optimization?: It has been shown in Section V-A that the proposed PS expansion/contraction indeed enhances the diversity of knowledge transfer, resulting in better solution quality than other state-of-the-art algorithms in the first generation after changes. In other words, because it starts from better solutions than other algorithms in the first generation after changes, our proposed KTDMOEA is able to find solutions with good convergence and diversity at all frequencies of change, even when the frequency of change is very high. This means that our proposed approach is robust to different frequencies of change.
Because the transferred solutions are already better distributed in the new environment, with better diversity, KTDMOEA is able to find better solutions across different frequencies of change. This is also the reason why KTDMOEA is able to respond quickly to changes in the NObj, since finding good solutions under a high frequency of change means a fast response to changes. There are some problems where KTDMOEA did not perform best when the frequency of changes is large. The specific results and analyses are presented in Section III.B.3) of the Supplementary File.

C. Further Analysis of Our Knowledge Transfer Methods
In order to further verify the effectiveness of the proposed PS expansion/contraction method against the state-of-the-art method DTAEA, two pairs of comparisons are designed. First, DTAEA is compared to DTAEAv1, where the CA reconstruction after changes is replaced by the proposed PS expansion/contraction with other components of DTAEA unchanged. Second, KTDMOEA is compared to KTDMOEAv1, where the proposed PS expansion/contraction is replaced by the CA reconstruction of DTAEA with other components of KTDMOEA unchanged.
All experimental settings are the same as in Section IV-C except for the frequency of change and the NObj changing sequence, which are set as 25 and NObj increasing from 2 to 7 and then decreasing from 7 to 2 one by one, respectively, to save space. For the Friedman and Nemenyi tests, the mean metric value over 10 environmental changes that each algorithm obtains on one problem with one frequency of change in each independent run is regarded as an observation of the test. Therefore, there are 403 (13 problems and 31 independent runs) observations for each algorithm in the Friedman and Nemenyi tests.
Figure 11 presents the Nemenyi post-hoc test results among HV, GD and MS of the solutions obtained at the last generation after optimization before the next change by DTAEA and DTAEAv1. The Friedman test detects significant differences in average values for HV, GD and MS with p-values of 9.97E-34, 5.40E-14, and 0.0017, respectively. Similarly, Figure 12 presents the Nemenyi post-hoc test results among HV, GD and MS of the solutions obtained at the last generation after optimization before the next change by KTDMOEA and KTDMOEAv1. The Friedman test detects significant differences in average values for HV, GD and MS with p-values of 1.50E-57, 4.88E-07, and 8.63E-35, respectively.
Overall, it can be observed from Figures 11 and 12 that the algorithm with our proposed knowledge transfer strategy significantly outperforms the one without it, i.e., DTAEAv1 outperforms DTAEA and KTDMOEA outperforms KTDMOEAv1. More details can be found in Tables 138-140 of the Supplementary File. From this result, we can conclude that the proposed PS expansion/contraction method works better than the knowledge transfer in DTAEA, further confirming the effectiveness of our proposed knowledge transfer method.

D. Performance Comparison on Other Changes in the NObj
In the previous experiments, the NObj only increases or decreases one by one. This section aims to verify the performance of the proposed algorithm in the scenario where the NObj increases or decreases by more than one. Two different changing sequences where the NObj increases or decreases by one or two each time are designed as follows:
• The initial NObj is set as 2. Then, the NObj firstly increases from 2 to 3. Then there are four changes, with the first two changes increasing the NObj by two and the next two changes decreasing the NObj by two. Lastly, the NObj decreases from 3 to 2 (simply denoted as '2-3-5-7-5-3-2').
• The initial NObj is set as 7. Then, there are two changes where the NObj decreases by two. Later on, the NObj decreases from 3 to 2 and then increases from 2 to 3. In the last two changes, the NObj increases by two at each change (simply denoted as '7-5-3-2-3-5-7').
All experimental settings are the same as in Section IV-C except for the frequency of change and the metric, which are set as 25 and HV, respectively, to save space. For the Friedman and Nemenyi tests, the HV values that all algorithms obtain on one problem with one frequency of change in one of the 31 independent runs are regarded as an observation of the test. Therefore, there are 403 (13 problems, 1 frequency of change and 31 independent runs) observations.
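As a quick check of the two sequence labels and the observation count, the short sketch below parses a label into the NObj at each stage and the size of each change. The helper name `parse_sequence` is hypothetical.

```python
def parse_sequence(label):
    """Split a label such as '2-3-5-7-5-3-2' into NObj values and per-change deltas."""
    counts = [int(tok) for tok in label.split("-")]
    deltas = [b - a for a, b in zip(counts, counts[1:])]
    return counts, deltas


counts_up, deltas_up = parse_sequence("2-3-5-7-5-3-2")
counts_down, deltas_down = parse_sequence("7-5-3-2-3-5-7")

# 13 problems x 1 frequency of change x 31 independent runs
n_observations = 13 * 1 * 31
```

Both sequences contain six changes, and every change alters the NObj by one or two.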
Overall, it can be observed from Figure 13 that, in the changing sequence of '2-3-5-7-5-3-2', our proposed KTDMOEA performs the best among all compared algorithms. In the other changing sequence of '7-5-3-2-3-5-7', both KTDMOEA and DTAEA perform the best. More details can be found in Tables 141 and 142 of the Supplementary File.

E. Impact of Algorithm Parameters
In the process of PS expansion, the parameter θ sets the number of solutions to generate along each expansion direction. In this section, different values of this parameter are tested to verify whether the parameter setting affects the performance of KTDMOEA.
All experimental settings are the same as in Section IV-C except for the frequency of change and the metric, which are set as 25 and HV, respectively, to save space. The changing sequence is that the NObj firstly increases from 2 to 7 and then decreases from 7 to 2 one by one. There are three KTDMOEA variants (denoted as KTDMOEA-1, KTDMOEA-2 and KTDMOEA-4), which set θ to 1, 2 and 4, respectively. In order to verify whether different parameter settings affect the performance of KTDMOEA, the Friedman and Nemenyi tests on the 5 state-of-the-art algorithms and the 3 KTDMOEA variants are conducted to indicate the significant differences among them. The HV values that all algorithms obtain on one problem with one frequency of change in one of the 31 independent runs are regarded as an observation of the test. Therefore, there are 403 (13 problems, 1 frequency of change and 31 independent runs) observations.
Figure 14 presents the Nemenyi post-hoc test results among HV values of the optimized solutions at the last generation by the 8 algorithms in the changing sequence of firstly increasing the NObj from 2 to 7 and then decreasing it from 7 to 2, both one by one. The Friedman test detects significant differences in average values for HV with a p-value of 1.7925E-281. It is clear that the three KTDMOEA variants obtain the best HV values among all algorithms. It can also be seen from Figure 14 that there is no significant difference among the three KTDMOEA variants with different settings of the parameter θ. These results verify that the performance of the proposed PS expansion/contraction is not sensitive to the setting of the parameter θ, and that the advantage of the proposed KTDMOEA over the existing algorithms is not affected by the tested parameter values.

VI. CONCLUSION
This paper has shown that existing work cannot handle well DMOPs with a changing NObj that have more complex PF shapes (convex, discontinuous and mixed shape of convex and concave) and fitness landscapes (nonseparability and deceptiveness). The main reason is the lack of sufficient population diversity right after dynamic changes. To solve this problem, two research questions are studied: first, how to transfer knowledge so as to enhance diversity, and second, how the knowledge transfer helps optimization.
In order to answer both research questions, inspired by the characteristics of DMOPs with a changing NObj, a dynamics handling strategy, PS expansion/contraction, is proposed. As a result, a new algorithm, KTDMOEA, is designed to make use of this strategy. Experimental studies have demonstrated the effectiveness of the proposed knowledge transfer in enhancing the diversity right after changes and assisting the optimization under different NObj changing sequences.
In comparison with the state-of-the-art in solving DMOPs with a changing NObj, our KTDMOEA achieved the best performance according to the HV, GD and MS metrics across a number of test functions. We argue that it is important to use both DTLZ and WFG functions to build dynamic benchmarks because they provide rather different problem characteristics. KTDMOEA performed well on different problems under different parameter settings and different frequencies of change.
As expected, no algorithm is the best on all possible problems. According to the details in the Supplementary File, there are several problems on which KTDMOEA did not outperform existing algorithms. Although we have done an initial analysis of these, as reported in the Supplementary File, a more in-depth analysis is left for future work. In addition, testing our proposed approach on real-world problems is part of our future work.

Fig. 1. Distribution of the reconstructed CA and DA obtained by DTAEA in the first generation right after changes when increasing NObj from 2 to 3 on F2 and WFG4.
Fig. 2. Distribution of the reconstructed CA and DA obtained by DTAEA in the first generation right after changes when decreasing NObj from 3 to 2 on F2 and WFG4.

Fig. 4. Brief illustration of how to expand and contract the PS for increasing and decreasing NObj, respectively. ... and connecting the blue point to the point nondominated to it; the plane formed by 4 black lines is the Pareto optimal set at time step t + 1 with three objectives and the red arrows are the expansion directions. Solutions evenly selected in PS_t, which are the starting points of the red arrows, are regarded as the PS expansion base solutions to cover the whole PS right after the change (PS_{t+1}).

Fig. 5. Friedman ranking among HV, GD and MS of obtained solutions at the first generation by 6 algorithms in the changing sequence of firstly increasing the NObj from 2 to 7 and then decreasing it from 7 to 2, both one by one.

Fig. 7. The distribution of the old population ('Old pop'), the extreme point ('ExPoint'), the detective population ('DetePop') and transferred population ('TrPop') via PS expansion on F2 and WFG1 at the first generation when increasing the NObj from 2 to 3.
Fig. 8. The distribution of the old population ('Old pop') and transferred population ('TrPop') via PS contraction on F2 and WFG1 at the first generation when decreasing the NObj from 3 to 2.

Fig. 9. Friedman ranking among HV, GD and MS of optimized solutions at the last generation by 6 algorithms in the changing sequence of firstly increasing the NObj from 2 to 7 and then decreasing it from 7 to 2, both one by one.
Fig. 10. Friedman ranking among HV, GD and MS of optimized solutions at the last generation by 6 algorithms in the changing sequence of firstly decreasing the NObj from 7 to 2 and then increasing it from 2 to 7, both one by one.

Fig. 11. Friedman ranking among HV, GD and MS of optimized solutions at the last generation by DTAEA and DTAEAv1 in the changing sequence of firstly increasing the NObj from 2 to 7 and then decreasing it from 7 to 2, both one by one.
Fig. 12. Friedman ranking among HV, GD and MS of optimized solutions at the last generation by KTDMOEA and KTDMOEAv1 in the changing sequence of firstly increasing the NObj from 2 to 7 and then decreasing it from 7 to 2, both one by one.

Fig. 14. Friedman ranking among HV of optimized solutions at the last generation by 5 state-of-the-art algorithms and 3 KTDMOEA variants with different values of the parameter θ (1, 2 and 4) in the changing sequence of firstly increasing the NObj from 2 to 7 and then decreasing it from 7 to 2, both one by one.

[Flowchart of the algorithm. Node labels: Initialization; Change? (Yes/No); Transfer Knowledge to Reconstruct Pop; Reproduction; Update Pop; Stop? (Yes/No); Output Pop.]
... around the selected extreme point x_e;
4 Evaluate all solutions in P_var and delete dominated solutions after conducting nondominated sorting on them;
5 if all solutions in P_var are nondominated by PS_t then
...
Delete solutions from P_var that are dominated by those in PS_t and set the remaining solutions as P_non;
end
9 Use evenly generated weight vectors following the method in [18] to estimate the density of P_non and PS_t with the method introduced in Section III-B;
10 Delete solutions in P_non located in the same subarea as solutions of PS_t;
11 if P_non is NULL then
Return NULL.
13 Use the remaining solutions in P_non and the extreme point x_e to form a set of lines that represent the directions (denoted as D): $D_j = \frac{P_{non}^{j} - x_e}{\lVert P_{non}^{j} - x_e \rVert}$, (j = 1, ..., |P_non|);
14 Delete duplicated directions from D;
end
Return D.
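The filtering and direction-forming steps above can be sketched as follows, assuming minimization. The density-based filtering of steps 9 and 10 is deliberately omitted, each solution is represented as a hypothetical `(decision_vector, objective_vector)` pair, and duplicate directions are detected after rounding to `decimals` places, which is an implementation choice made here for illustration.

```python
import math


def dominates(f_a, f_b):
    """True if objective vector f_a Pareto-dominates f_b (minimization)."""
    return (all(a <= b for a, b in zip(f_a, f_b))
            and any(a < b for a, b in zip(f_a, f_b)))


def expansion_directions(p_var, ps_t, x_e, decimals=8):
    """Return unit directions from x_e to candidates not dominated by PS_t.

    p_var, ps_t: lists of (decision_vector, objective_vector) pairs
    x_e:         decision vector of the selected extreme point
    Returns None when no candidate survives (the 'Return NULL' branch).
    """
    # Keep candidates that no solution of PS_t dominates.
    p_non = [(x, f) for x, f in p_var
             if not any(dominates(g, f) for _, g in ps_t)]
    if not p_non:
        return None
    directions, seen = [], set()
    for x, _ in p_non:
        diff = [xi - ei for xi, ei in zip(x, x_e)]
        norm = math.sqrt(sum(d * d for d in diff))
        if norm == 0.0:  # candidate coincides with the extreme point
            continue
        d = tuple(round(v / norm, decimals) for v in diff)
        if d not in seen:  # delete duplicated directions
            seen.add(d)
            directions.append(d)
    return directions or None
```

Normalizing each direction makes two candidates that lie on the same ray from x_e collapse to a single direction, which is what the duplicate-removal step relies on.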
Algorithm 3: Contract the PS to generate transferred solutions
Input: Pareto optimal solution set at time t (PS_t);
Output: Transferred solutions P_tr
1 Evaluate PS_t in the new environment and put the nondominated solutions into P_non after conducting nondominated sorting on PS_t;
2 Put all solutions in P_non into P_tr;
3 Find the solutions with the maximum objective value on any objective from P_non as the set of extreme points (P_e);
4 for j = 1 to |P_e| do
...
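Steps 1 to 3 of the contraction procedure can be sketched as below, assuming minimization and assuming that the objective vectors passed in have already been re-evaluated in the new environment. The `(decision_vector, objective_vector)` representation and the function names are hypothetical; the loop over the extreme points in step 4 is not reproduced because its body is not available here.

```python
def dominates(f_a, f_b):
    """True if objective vector f_a Pareto-dominates f_b (minimization)."""
    return (all(a <= b for a, b in zip(f_a, f_b))
            and any(a < b for a, b in zip(f_a, f_b)))


def nondominated(pop):
    """Step 1: keep the solutions that no other solution dominates."""
    return [(x, f) for x, f in pop
            if not any(dominates(g, f) for _, g in pop)]


def extreme_points(p_non):
    """Step 3: solutions holding the maximum value on at least one objective."""
    n_obj = len(p_non[0][1])
    extremes = []
    for k in range(n_obj):
        best = max(p_non, key=lambda s: s[1][k])
        if best not in extremes:
            extremes.append(best)
    return extremes


# Toy PS_t evaluated with two objectives after one objective was removed.
ps_t = [
    ((0.0,), (0.0, 1.0)),
    ((0.5,), (0.5, 0.5)),
    ((1.0,), (1.0, 0.0)),
    ((0.9,), (1.0, 1.0)),  # dominated in the new environment
]
p_non = nondominated(ps_t)
p_tr = list(p_non)          # step 2: nondominated solutions are transferred as-is
p_e = extreme_points(p_non)
```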

Algorithm 4: Framework of KTDMOEA
Input: Population size N;
Output: The found population Pop
1 Randomly generate an initial population Pop;
2 while stopping criteria not satisfied do
...
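Only the first steps of the framework are available above, so the following is a hedged skeleton of the overall loop rather than the algorithm itself. It assumes that a detected change triggers the knowledge transfer before reproduction in the same generation, and that the stopping criterion is a generation budget. Every callback (`init`, `change_detected`, `transfer`, `reproduce`, `update`) is a hypothetical placeholder supplied by the caller.

```python
def run_ktdmoea_skeleton(pop_size, max_gen, init, change_detected,
                         transfer, reproduce, update):
    """Generic change-aware evolutionary loop (illustrative skeleton only)."""
    pop = init(pop_size)                      # random initial population
    for gen in range(max_gen):                # stopping criterion: generation budget
        if change_detected(gen):
            # Rebuild the population via PS expansion/contraction.
            pop = transfer(pop, gen)
        offspring = reproduce(pop)
        pop = update(pop, offspring, pop_size)
    return pop


# Toy usage with scalar "solutions" where smaller is better.
final_pop = run_ktdmoea_skeleton(
    pop_size=3,
    max_gen=4,
    init=lambda n: [0.0] * n,
    change_detected=lambda gen: gen == 2,
    transfer=lambda pop, gen: [p + 10 for p in pop],
    reproduce=lambda pop: [p - 1 for p in pop],
    update=lambda pop, off, n: sorted(pop + off)[:n],
)
```

Passing the components as callbacks keeps the loop independent of any particular reproduction operator or environmental selection scheme.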