A Novel Maximin-Based Multi-Objective Evolutionary Algorithm Using One-by-One Update Scheme for Multi-Robot Scheduling Optimization

With the continuous development of E-commerce, warehouse logistics is also facing emerging challenges, including more batches of orders and shorter order processing cycles. When more orders need to be processed simultaneously, some existing task scheduling methods may not be able to give a suitable plan, which delays order processing and reduces the efficiency of the warehouse. Therefore, the intelligent warehouse system that uses autonomous robots for automated storage and intelligent order scheduling is becoming mainstream. Based on this concept, we propose a multi-robot cooperative scheduling system in the intelligent warehouse. The aim of the multi-robot cooperative scheduling system of the intelligent storage is to drive many robots in an intelligent warehouse to perform the distributed tasks in an optimal (e.g., time-saving and energy-conserved) way. In this paper, we propose a multi-robot cooperative task scheduling model in the intelligent warehouse. For this model, we design a maximin-based multi-objective algorithm, which uses a one-by-one update scheme to select individuals. In this algorithm, two indicators are devised to discriminate the equivalent individuals with the same maximin fitness value in the environmental selection process. The results on benchmark test suite show that our algorithm is indeed a useful optimizer. Then it is applied to settle the multi-robot scheduling problem in the intelligence warehouse. Simulation experiment results demonstrate the efficiency of the proposed algorithm on the real-world scheduling problem.


I. INTRODUCTION
Recently, with the widespread application of autonomous robots, these cheap, small and smart robots have been widely employed in the intelligent storage management of the logistic industry [1]. In principle, the basic task of the warehouse system in the intelligent storage management is to transport goods, store goods and distribute goods The associate editor coordinating the review of this manuscript and approving it for publication was Chun-Hao Chen . efficiently [2]. Accordingly, the multi-robot coordination mechanism, if being really efficient and robust, it will make the entire intelligent warehouse management system more efficient [3], [4]. In the multi-robot coordination, a set of individual tasks needs to be scheduled (e.g., allocated and executed) in an optimal way. Such multi-robot scheduling problem can be formulated as a task allocation model where a group of autonomous robots should cater to a set of orders (tasks) in optimal routes satisfied with certain criteria [5].
There are already some heuristic approaches to deal with the multi-robot task scheduling problems for the intelligent storage scenarios [6]. Milica have proposed a scheduling problem of a single robot in the process of cargo transportation and employed a whale optimization algorithm (WOA) to resolve [7]. Elmi formulated a robotic scheduling problem for a blocked hybrid workshop and solved it by simulated annealing method [8]. A multi-objective evolutionary algorithm is proposed by Nastasi to solve a management problem of automated warehouse [9]. Xing exploited ant colony optimization algorithm (ACO) to solve the picking path optimization of storage and retrieval machines [10]. Xin proposed the problem of multi-robot coordinated path planning in intelligent warehouse [11]. Li proposed a multi-task robot scheduling and task allocation optimization problem under the dynamic requirements of customers and employed an improved particle swarm optimization heuristic algorithm (PSO) to solve it [12]. Chen proposed an on-line route selection method based on ACO to find picking routes for multiple robots in uncertain picking time [13].
However, most of the existing works focus on minimizing the total time cost of multiple robots without balancing the individual travel cost of each robot [14]- [16]. How to evenly allocate tasks to each robot, reducing the energy consumption of the robots in the warehouse, has become a key issue. For this problem, we need to consider two goals about the robot energy consumption, i.e., to minimize the total energy of all the robots, and to minimize the individual energy of each robot while each robot in the warehouse is not overused. Therefore, we research the multi-robot cooperative scheduling optimization problem by considering above goals, where the energy consumption of each robot needs to be formulated as an objective function of minimizing the total energy and the individual energy of the robots. Obviously, such multi-robot cooperative scheduling problem is an MaOP [17], where multiple objectives, each of which is specified by a robot, need to be optimized simultaneously. With the increase of the robots, the task allocation of the multirobot cooperative scheduling model will become increasingly difficult.
Recently, using multi-objective optimization algorithms (MOEAs) to solve real-world MaOPs has become an effective way [18], [19]. Zhou proposed a multi-objective ant algorithm to solve the airline crew rostering problem [20]. Fang used an improved multi-objective evolutionary algorithm for HQPM to solve the multi-objective problem for high quality pattern mining (HQPM), where the objectives are support, occupancy, and utility [21]. Sun proposed the multi-objective immune algorithm to solve the multi-objective scheduling problem [22]. Among those, one effective approach is the indicator-based evolutionary algorithm (IBEA) [23], which employs performance evaluation indicators to ensure the convergence and distribution of the solution set [24], such as HypE [25], R2-MOEA [26], and MOBI-II [27]. Especially, compared with other MOEAs such as Pareto-based algorithms [28]- [30] and decomposition-based algorithms [31]- [33], IBEAs are more effective in solving MaOPs since they will not lose selection pressure in high-dimensional space like Pareto dominance, or need the support of a large number of reference vectors [34]. However, the indicators in IBEAs are rather computationally expensive, especially when dealing with a large number of objectives.
Based on the above consideration, we propose a novel indicator-based algorithm to solve the multi-robot cooperative task scheduling problem, which usually involves more than ten objectives. In this algorithm, the maximin indicator, which is computationally efficient, is used to evaluate the solutions, and a one-by-one update scheme is devised for the environmental selection. Especially, the maximin fitness function has shown some effective properties for the optimization (see Section III).
The main contributions of this paper are as follows.
• A multi-robot scheduling optimization model is proposed in this paper, which is designed based on the energy consumption of the robot, taking into account the consumption of the robots in performing tasks and the additional consumption in the process of task alternation. In particular, for this model, a new chromosome coding method and a special crossover mutation operator are proposed.
• Aiming to solve our proposed multi-robot scheduling optimization model effectively, a novel maximin-based algorithm called MO-MOEA is developed by using oneby-one update scheme and two environmental selection indicators.
• Experiments verify the effectiveness and efficiency of MO-MOEA. Through a large number of comparisons and discussions, it is proved that the performance of our algorithm is more competitive than comparison algorithms.
The rest of this paper is organized as follows. Section II proposes the model of multi-robot cooperative scheduling. In Section III, the maximin fitness function and the proposed algorithm MO-MOEA are shown. Experimental comparisons on benchmark test suites and multi-robot cooperative scheduling optimization problem are provided in Section IV. Section V outlines conclusions and future work.

II. MULTI-ROBOT COOPERATIVE SCHEDULING PROBLEM A. MODEL
In the intelligent warehouse system, first we pass the task list into the system, and the system completes task assignment through the algorithm, and then the specific storage tasks are handed over to autonomous robots to complete. In principle, these storage tasks can be classified as follows: (1) The warehousing task, which is the first task type of the system. In this task, goods are transported to empty shelves according to warehouse requirements.
(2) The transferring task, which is the task of moving the goods in the warehouse according to user's demands or warehouse plans.
(3) The shipment task, where goods are moved out of the warehouse according to user demands. Fig. 1 shows a two-dimensional warehouse environment schematic diagram, where the black grids indicate that goods are placed on the shelves, and white grids indicate that the shelves are empty. The autonomous robot moves along the aisle area to reach each shelf position. In general, the storage area of the warehouse is regarded as a two-dimensional planar grid map. Using this approach, a two-dimensional coordinate system can be constructed, where goods, robots and obstacles can be uniformly marked and calculated. For the convenience of display, all goods are received from the left side of the warehouse and transported out from the bottom of the warehouse. Table 1 shows the list of tasks that the robots need to complete. Based on the schematic diagram of this intelligent storage environment, the multi-robot scheduling model is established in the followings.  Specifically, the multi-robot scheduling model consists of two parts: the task-specific consumption function and the inter-task consumption function. Assume that there are m free moving autonomous robots in an intelligent warehouse {r 1 , r 2 , . . . , r m }, and n pending tasks to be processed {t 1 , t 2 , . . . , t n }, which can be assigned to the warehousing tasks, transferring tasks and shipment tasks, respectively. The robot will encounter the following situations when completing its own series of tasks. For the shipment tasks and warehousing tasks, if the coordinates of the task t i is (x i , y i ), under the condition without considering the inter-task consumption, the self-cost of autonomous robot performing task t i can be defined as the function C in (t i ). In the intelligent warehouse environment, the coordinate of the exit is defined as (f out , 0), and the coordinate of the entrance is (0, f in ). Therefore, the task-specific consumption of the shipment task C out (t i ) is The task-specific consumption of the warehousing task C in (t i ) is: In addition, if goods are transported from a storage location (x i , y i ) to another location (x i , y i ) by an autonomous robot, it is called the transferring task, Different from the above tasks, its task-specific consumption C t (t i ) is In the intelligent warehouse, in addition to the consumption during the execution of the task, there is an additional energy consumption, i.e., the inter-task consumption function. Based on the above model, the Manhattan distance can be used to quickly measure the inter-task consumption of tasks. The inter-task consumption functions are divided into four types: 1) the inter-task consumption between warehousing and shipment tasks; 2) the inter-task consumption between warehousing and transferring tasks; 3) the inter-task consumption between transferring and transferring tasks; 4) the inter-task consumption between shipment and transferring tasks. Since the robot will return to the origin when performing the warehousing task and shipment task, these two can be grouped into one category when calculating the inter-task consumption.
Suppose that an autonomous robot performs a warehousing or a shipment task t i , the final storage location of the task is located at (x i , y i ), while the initial storage location of another warehousing or shipment task t j is (x j , y j ). Then, the inter-task consumption function ITC(t i , t j ) of the task interval is Obviously, for warehousing and shipment tasks, the changes in the order of execution of tasks will not affect its inter-task consumption function, i.e., ITC(t i , t j ) = ITC(t j , t i ).
Assume that a robot carries out the transferring task t m from the storage location (x m , y m ) to another location (x m , y m ) while another transferring task t n from the storage location (x n , y n ) to another location (x n , y n ). Then, the inter-task consumption function ITC(t m , t n ) between transferring task and transferring task can be formulated as Similarly, the inter-task consumption function ITC(t m , t i ) between the transferring task t m and the warehousing task t i is defined as Eq. (6), and the inter-task consumption function ITC(t i , t m ) between the warehousing task t i and the transferring task t m is defined as Eq. (7), the inter-task consumption function between the transferring task and the shipment task, is same to the inter-task consumption function between the transferring task and the shipment task: For a better efficiency of the task allocation, an efficiency function based on the consumption of the autonomous robots is defined as . . → t ik represents the sequence assigned to the robot r i , which is a linear sequence indicates that the task sequence for the robot r i consists of k tasks. f i (r i , S i ) represents the energy consumption of autonomous robot r i to complete the given task sequence S i . After the above design, the final optimization goal is to minimize the total consumption of the robots while minimizing the consumption of each robot. All consumption functions are evaluated by the Manhattan distance, which can accurately reflect the travel distance of autonomous robots in the warehouse. In this process, the homogeneous autonomous robots with the same speed and energy consumption are employed. The consumption evaluation function based on Manhattan distance not only reflects the energy consumption, but also reflects the task execution time. Therefore, the proposed model covers the two issues in the intelligence warehouse, i.e., the energy consumption and time consumption.

B. CROSSOVER AND MUTATION
In the optimization process of the task sequence, several effective chromosome crossover and mutation operators are designed to obtain better Pareto optimal front.
To effectively reduce the solution space of the task allocation scheme, we use a combined chromosome method, as shown in Fig. 2, which also can enhance the converge performance. In the figure, the task sequences are arranged in order and evenly distributed. The first part of the encoding is the sequence information of the tasks to be executed. The second part is the number of tasks assigned to each autonomous robot, which is composed of multiple randomly generated positive integers, the sum is equal to the total number of tasks to be assigned, and each number represents the number of tasks assigned to each robot. As shown in the figure below, five tasks are assigned to the first robot, then its corresponding task number is the first five task numbers of the first part, and so on. For the common crossover operators, such as single-point crossover, double-point crossover and multi-point crossover, they cannot guarantee that newly generated individuals are still completely allocated. Therefore, we adopt a combined crossover operator based on the order crossover operator and binary crossover operator. Specifically, the order crossover operator is used for the task sequence part of chromosomes, while the binary crossover operator is for the task number part assigned by each autonomous robot.
The flow of order crossover operator is shown in Fig. 3. First, two genes in a pair of parent chromosomes are selected randomly as the start and end positions, while the selected gene positions of each parent are the same. Then, two offspring are generated, and the genes between the start and end gene positions are consistent with the parents. Finally, the position of the gene selected in the first step is found out in another parent, and then the remaining genes are put in order in the offspring generated in the previous step. This crossover operator can ensure that the newly generated offspring is a scheme of complete task allocation scheme without conflict detection. Unlike the crossover operator, the mutation operator acts on a single gene, and changes the individual effect by changing the value of individual or multiple genes. Generally speaking, the mutation operator can improve the global random search ability of the algorithm, maintain the diversity of the population, and prevent premature convergence. In this paper, the replacement mutation operator is used to mutate the task sequence part of chromosome. The specific process is shown in Fig. 4. Firstly, a sub-bit string is randomly selected from individuals. Then, a mutation site is selected in the remaining bit string. Finally, the selected sub-bit string is inserted into the variant site to form a string of new genes. For the part of number of tasks assigned by each autonomous robot, simple Gaussian mutation can be adopted.

III. PROPOSED METHOD
In the recent literatures, the great potential of IBEAs in solving MaOPs has been demonstrated [35]- [39]. However, the indicators in IBEAs are rather computationally expensive, especially when settling many objectives (i.e., Hypervolume). The maximin fitness function mentioned in this paper is much less computational complexity than Hypervolume. Other advantages and features of the maximin fitness function have been described in the section III.A.

A. MAXIMIN FITNESS FUNCTION
In [40], Balling proposed the maximin fitness function, within which the maximin function is developed to evaluate the individual's fitness. The maximin fitness function is defined as: where the minimum value is taken from objective 1 to m, and the maximum fitness is taken on all solutions x from population 1 to N , except for the same solution i. From Eq. (9), we can obtain the following characteristics of the maximin fitness function [40]- [44]: 1) The dominance relation can be determined by the maximin fitness. If the fitness of individual less than or equal to 0 means the individual is non-dominated or weaklydominated. The fitness of the individual larger than 0 means the individual is dominated.
2) The clusters generated by non-dominated individuals will be penalized by the maximin fitness function, as shown in Fig. 5(a).
3) The maximin fitness of dominated individuals measure their distance to the non-dominated front. See Fig. 5(b). For multi-objective and many-objective optimization, the above characteristics are useful to select promising solutions with good diversity and convergence in the multi-objective search space. However, in Eq. (9), the maximin fitness of the non-dominated individuals are not only controlled by other non-dominated individuals, but also by the dominated individuals, as shown in Fig. 5(b), which may lower the performance of search Pareto-optimal solutions. Then, we can obtain a modified maximin fitness function [43]: where ND is the non-dominated solution set. When Eq. (10) is employed to evaluate individuals in the population, we can see each non-dominated individual is only affected by other non-dominated individuals. For example, in Fig. 5(b), individual B is no longer controlled by individual D, so the maximin fitness of individual B equals to −1.

B. ONE-BY-ONE UPDATE SCHEME
Through the maximin function, the contribution of individuals to the convergence of the population, as well as the distribution of individuals in the population can be reflected. However, if the indicator is used directly to evaluate the quality of individuals in the population, the process of individual selection will face the dilemma of equivalent selection. That is, there may be individuals with the same fitness value and cannot be selected. As shown in Fig. 6, individuals A to E are evenly distributed in the population, so they have the same fitness. However, in the environmental selection process, not all individuals will be retained. For example, keep three of them. Because all individuals have the same convergence, it is necessary to consider the distribution, individuals A, C, and individual E should be retained, and individuals B and D should be discarded. However, if only considering maximin fitness, the algorithm may not be able to make the correct selection operation, which will affect the optimization efficiency of the entire algorithm. To this end, this section will propose a one-by-one update scheme. By using the one-by-one update scheme, we can effectively deal with the dilemma of equivalence selection. The strategy steps are as follows. First, use the maximin fitness function to select the first individual of the new population, the maximin fitness function is used to calculate the individuals' maximin fitness according to the Eq. (11). Then the oneby-one update scheme is adopted to select the individual with the least fitness among the candidate individuals to the new population. In this process, the new population can maintain good distribution and convergence. As shown in Fig. 7, individual0073 A to E are uniformly arranged in the two-dimensional objective space, then three individuals will be uniformly selected from them using the one-by-one update scheme. In the first selection, since there is no selected individual, set the ideal point as the reference point to calculate the individuals' maximin fitness according to Eq. (11), as shown in Fig. 7(a). According to the fitness, individual A or individual E is selected, and we assume individual A is selected. In the second selection, according to the selected individual A, the fitness of the remaining individuals will be calculated. As shown in Fig. 7(b), since the fitness of individual E is the smallest, individual E is selected. In the third selection, the selected population includes individual A and individual E, while the candidate individuals include individual B, individual C, and individual D. After that, calculate the fitness of the candidate individuals, the result is shown in Fig. 7(c), individual C has the smallest fitness and is selected. Finally, through this scheme, individual A, individual C, and individual E are selected, as shown in Fig. 7(d), maintaining a good distribution. The following is the formula for selecting the first individual for the new population: where z * is the ideal point after normalization.

C. PROPOSED ALGORITHM:MO-MOEA
In this section, the overall framework of the algorithm is given. The specific execution steps of Algorithm 1 are as follows.
First, N individuals are randomly initialized as the initial parent population (Line 1). Then, based on the maximin Algorithm 1 General Framework of MO-MOEA 1: Objective normalization (P) 2: while the given termination condition is not reached do 3: P = MatingSelection (P, N ); 4: P = Variation (P', N ); 5: Q = P ∪ P 6: P = Environmental_Selection (Q, N ); 7: end 8: return P fitness value of each individual, a sufficient number of individuals are added to the mating pool (Line 3). Then, the simulated binary crossover operator and polynomial mutation operator are used to perform crossover and mutation operation to generate offspring population P (Line 4). After that, a certain number of individuals are selected from the combined population Q through the Environmental Selection process to form a new population P (Line 6). Continue the above process until the given termination condition is reached. In the following sections, the environmental selection process will be introduced in Algorithm 2.

1) NORMALIZATION
In this paper, the Normalization is also indispensable, which can be used to improve the algorithm's ability to solve scaled problems. Normalization has proved to be helpful in solving the problem of different value ranges for each objective [45]. In the normalization process, the formula is defined as where z * m is the ideal objective, z nad m is the nadir objective, m represents the index of the objective dimension, m ∈ {1, 2, . . . , M }. Due to the Pareto optimal front is unknown, we estimate the ideal objective and nadir objective from the solution set.

2) ENVIRONMENTAL SELECTION
The steps of the Environment Selection are shown in Algorithm 2. First, objective normalization is implemented for each objective, the formula is defined in Eq. (12) (Line 1 in Algorithm 2). Then, the individuals are selected according to the similarity verifying method (Line 2 in Algorithm 2). Then the number of non-dominated individuals ND in the population S is determined according to the characteristic 1 of the maximin fitness function (Line 3 in Algorithm 2). If the size of non-dominated individuals is larger than N , we set the ideal point as the reference point. According to Eq. (11), the individuals with the smallest fitness are added to the new population P. Then, by calculating the maximin fitness between the new population P and the candidate individuals in the population S according to the Eq.

3) TWO PERFORMANCE INDICATORS
Although one-by-one update scheme can be used to select individuals that are more suitable for the new population, it may still have a small probability of facing the equivalent selection problem, so other methods are needed to select better individuals from individuals which have same maximin fitness. The following two performance indicators are employed to further select individuals. First of all, convergence is an essential evaluation indicator when solving MaOPs. The Euclidean distance from the candidate  13) is applied to evaluate the convergence of the candidate individual. This indicator can effectively evaluate individuals with better convergence in the objective space [46].
The individual with the smallest DIS value is regarded as the closest one to the ideal point after normalization, and it will be given priority in the next selection.
If the candidate individuals have the same DIS value at this time, considering from the perspective of population diversity and propose a diversity indicator to evaluate individuals according to Eq. (14). In high dimensional space, cosine value is effective to measure the distribution of the individuals. In this paper, the angle of the individual in the objective space is used to measure the diversity, the formula is as follows: where f (x) and f (y) are the normalized vectors of the two individuals x and y in the objective space, respectively.
Niching is used to further solve the dilemma of equivalent selection, as shown Algorithm 3. In this part, we use two indicators to evaluate the individuals with the same maximin fitness, called the same equivalent individual set (ES). First, the Euclidean distance to the ideal point of the ES were compared, the individual with minimal distance to the ideal point will be selected, called individuals with minimal Euclidean distance (MS). If the size of MS is larger than 1, the acute angle between MS and population P is used to measure the distribution of MS to the population P. Only one individual will be selected and added to the new population P. If more than one individual remains after selection by these two indicators, an individual is randomly selected and added into the new population P.

4) SIMILARITY VERIFICATION
In Algorithm 4, we proposed an approach to verify the individuals' similarity. The individuals in S are selected one by one if their similarity is smaller than min_dif, which is defined in [43] (Lines 2-9 in Algorithm 4).

IV. EXPERIMENTAL STUDY
This section includes two parts: 1) preformation validation on benchmarks, which exhibits the advantages of the proposed algorithm in many-objective optimization; 2) application on multi-robot cooperative scheduling optimization problem [47], [48].

A. EXPERIMENTAL RESULTS ON BENCHMARKS
This section gives comparison results between MO-MOEA and other five start-of-art MOEAs, i.e., MOEA/D [31], MOMBI-II [27], NSGA-III [49], RVEA * [50], and MD-MOEA [43], on the benchmark test suite MaF [51]. These contrast algorithms include three mainstream methods: 1) Pareto-based algorithm, i.e., NSGA-III, 2) decomposition-based algorithm, i.e., MOEA/D, RVEA * , and 3) indicator-based algorithm i.e., MOMBI-II and MD-MOEA. We use these classical algorithms to compare with our proposed algorithm to verify its effect. Experiments are performed on PlatEMO with MATLAB R2018b [52,53]. Each test problem is tested in 5-, 8-, 10-and 15-objective instances. Then, each algorithm is run for 20 times on each test instance independently [54]. We verify the performance of the algorithm from two aspects: the image obtained by the algorithm solving the instance and the widely used performance indicators.

1) BENCHMARK TEST PROBLEMS AND PERFORMANCE INDICATORS
MaF is a widely used test suite, which is adopted for empirical comparisons in this paper. And MaF test suite includes some ''convex'' DTLZ problems [55] and several WFG problems [56]. Compared with other test suites, MaF test suite shows some characteristics that are more in line with actual problems, which represent the real scene well, and can more effectively verify the effectiveness and effect of MO-MOEA. This test suite is composed of optimization problems with many different Pareto front. Therefore, the MaF test suite is used for comparison between algorithms.
In this paper, we conduct experiment on problems with 5, 8, 10, and 15 objectives. The number of decision variables is set to M + 9 for MaF1-6, MaF10-12, M + 19 for MaF7, 5 for the MaF13 and 2 for MaF8, 9. For the large-scale case MaF14,15, the number of decision variables are M * 20. A detailed description of the parameters can be found in [51].
In this work, we use IGD as metric to evaluate the experimental results of the algorithms. IGD is a comprehensive indicator to measure the convergence and distribution of the solution sets. For more detail of metric IGD, refer to [57], [58].  Table 2. 2) Crossover and mutation: We use the commonly used simulated binary crossover (SBX) [59] and polynomial mutation [60] for crossover mutation. The probability of crossover and mutation is set to 1.0 and 1/D (where D is the number of decision variables). Set the distribution index of SBX and polynomial mutation to 20 [61].
3) Termination Conditions: Each algorithm runs independently 20 times. For MaF1-15, the maximum number of fitness evaluations (FEs) of the algorithm is determined by the decision variable, which is max {100000, 10000 × D}. Details can refer to [49]. 4) Other parameters: For MOEA/D, the Tchebycheff approach is used and the neighborhood range is set to N/10. On the algorithms MD-MOEA and MO-MOEA, min_dif = 0.0001 (minimum difference) are employed.

3) EXPERIMENTS AND ANALYSIS ON MAFS
In this section, the performance of MO-MOEA in solving many-objective optimization problems is verified on the MaF benchmark problems. Table 3 shows the experimental results of the six algorithms after 20 runs and highlights the best results (completed results are provided in the supplementary file). The Wilcoxon rank sum test is used, where the significance level is set to 0.05, and the statistics are performed. The symbols ''+'', ''−'', and ''≈'' respectively indicate that the compared MOEA performs significantly better, significant worse or significantly similarly to the MO-MOEA.
As shown, we can see that MO-MOEA has demonstrated strong competitiveness, achieving first ranks on most of the MaF instances. MD-MOEA performs poorly on the MaF test suite. The reason may be that MD-MOEA uses a one-time comparison method to select the solution. By contrast, the reason for the promising performance of MO-MOEA is that one-by-one update scheme makes a better diversity between non-dominated individuals. This scheme ensures that the algorithm considers convergence and diversity in the process of selecting individuals. Through the above experiments, it can be determined that MO-MOEA has performed   well in solving MaOPs, and MO-MOEA has also shown high efficiency on large-scale problems.

4) OTHER EXPERIMENTAL RESULTS
The parallel coordinates of the final solutions of the algorithm on 15 objectives on several MaFs are provided in the supplementary file. As shown, the compromise surface formed by the solution finally generated by MO-MOEA is better than the surface generated by other algorithms in both convergence and distribution, while MD-MOEA and RVEA * are also competitive, but the effect is not very good from the perspective of distribution.

B. EXPERIMENT RESULTS ON MULTI-ROBOT SCHEDULING PROBLEM
The benchmark test suite proves that MO-MOEA is effective in many-objective optimization. Then, the proposed algorithm is used to optimize the multi-robot scheduling problem of intelligent warehouse. Suppose that in the intelligent warehouse with a length and width of 100m, the goods entrance of the warehouse is at (0, 100) and the exit of the warehouse is at (100, 0). There are 10 or 15 autonomous robots in the intelligent warehouse, all of which run at a speed of 1m/s and are initially located near the origin of coordinates. There are 1,000 storage tasks to be processed, including 300 warehousing tasks, 300 shipment tasks, and 400 transferring tasks in the intelligence warehouse. All tasks are generated randomly.
To ensure that the experimental results are not affected by randomness, the task list is executed 10 times independently by the above algorithm MO-MOEA and the other two indicator-based algorithms IBEA and MD-MOEA. The population size of these algorithms is 500, and the maximum fitness evaluations is 100000. Table 4 and Table 5 show the average motion distance of each robot optimized by MO-MOEA, IBEA and MD-MOEA. They also represent the energy consumption of each robot to perform storage tasks. The above algorithms all realize the scheduling and allocation of all storage tasks. It can be found from the data that the consumption function values of all autonomous robots are on the same order of magnitude, so the load of each robot is basically balanced. Fig. 8 and Fig. 9 show the coordinates of the final solutions formed by the algorithms in the 10− and 15-objective multi-robot scheduling optimization problem. As can be seen from the figures that the compromise surface formed by the solutions finally generated by MO-MOEA is superior to the compromise surface generated by other algorithms in terms of convergence and distribution, the energy consumption of robots obtained by MO-MOEA is smaller in general. It can be seen from the figure that the solution set optimized by the MO-MOEA algorithm rarely has extreme solutions, that is, it is rare that one robot undertakes most of the tasks and other robots undertake a small part of the tasks.
We employ HV [62] to determine the effect of the algorithm on actual problems. Choose a set of maximum values as the reference point for each objective of HV, such as 200000 for each objective. Generally, the HV value of MO-MOEA is larger than traditional indicator-based algorithm IBEA and maximin-based algorithm MD-MOEA. In other words, MO-MOEA can obtain a solution set with better distribution and convergence. Through such experiments, it can be learned that MO-MOEA can provide managers with more diverse choices from the similar efficient plans. After the above experiments, it can be shown that the proposed algorithm has also achieved good results in solving multi-robot scheduling problem.

V. CONCLUSION
This paper formulated a multi-robot scheduling model in the intelligence warehouse and designed a novel maximin based algorithm to solve the model. This algorithm uses one-by-one update scheme together with additional performance indicators and similarity verification method to solve the dilemma of equivalent selection within traditional maximin function. Experimental results on a set of many-objective benchmarks with up to 15 objectives show the effectiveness of our proposed algorithm in solving MaOPs. Then our algorithm was applied in the multi-robot scheduling problem, and received promising results.
In the future, we will make further research on the efficiency of the maximin fitness function in solving practical problems. This approach will be used to resolve more complex real-world multi-robot scheduling problems with more objectives.