Multi-Agent Task Allocation Based on Discrete DEPSO in Epidemic Scenarios

Multi-agent task allocation is an emerging technology that serves hospitals requiring unmanned operation in epidemic scenarios. In this environment, end users want high-quality unmanned service at low loss and high efficiency. We define and formulate a new multi-agent task allocation problem (MATAP) for the epidemic scenario. This paper presents a novel hybrid discrete approach, named D-DEPSO, that is based on the Differential Evolution (DE) algorithm and Particle Swarm Optimization (PSO), for handling this problem. First, the initial personal population is handled by the “mutation operation”, in which modulus operations correct the numerical overflow of a variable. Second, when the speed matrix is updated, it is discretized using a “round” function that we define. Then, in the “crossover operation”, a random permutation is used to delete repeated numbers and to reinsert the missing integers. The diversity of the population is expanded by introducing the discrete mutation operation of DE into PSO, while the best solution of each generation is preserved using the properties of PSO. The approach is designed for optimizing a single objective function. Experimental results are compared with other existing metaheuristic algorithms, namely discrete DE, discrete PSO, improved discrete DE, improved discrete PSO, and improved discrete genetic algorithm, in terms of running time and loss. The experiments show that the solutions obtained by D-DEPSO are better than those obtained by the other five algorithms. For the actual problem, D-DEPSO with suitable parameter settings generates a solution that allocates tasks rationally in the prevention of disease.


I. INTRODUCTION
In the last few years, because of the influence of COVID-19, reducing person-to-person contact in the epidemic environment has become particularly important. Meanwhile, Internet of Things and artificial intelligence technologies have made great progress in the past decade. Benefiting from this, multi-agent systems (MAS) [1] have begun to be widely applied in practice, for example in smart cities [2], smart manufacturing [3], unmanned systems [4], smart transportation systems [5], and unmanned aerial vehicle (UAV) formation combat systems [6]. Hence, major medical institutions have begun to use intelligent robots to gradually replace simple manual operations [7]. In this specific scenario, agents are regularly allocated different tasks [8], such as medicine distribution, medical material handling, periodic disinfection, long-distance measurement of body temperature, and patient supervision. As COVID-19 spreads globally, the demand of medical institutions around the world for medical robots is growing significantly. Task allocation is an important part of a multi-agent system. Hence, the multi-agent task allocation problem is especially critical.
Due to the sudden emergence of COVID-19, medical resources are in sharp shortage, so governments must establish temporary hospitals. Because of the environmental complexity, abnormal communication signals, lack of medical staff, and task diversity of temporary hospitals, the application of MAS is the best option for them [7]. As reported in [9], robots are wirelessly connected to the network at all times, so they can listen to the server from anywhere in the building and accept more than one task along their way, but temporary hospitals need higher reliability. They have to use pre-programmed robots so that one robot executes one task at a time, which avoids communication failures in the tough environment.
The associate editor coordinating the review of this manuscript and approving it for publication was Xiwang Dong. VOLUME 10, 2022. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
The Task Allocation Problem (TAP) [10] originally addressed the task of distributing programs among the processors of a distributed computer system to reduce program turnaround time and increase system throughput. Later, with the development of multi-agent systems [1], a single agent could no longer meet the demands of military and civilian applications, and the gradual development of multi-agent and group-agent cooperation systems made task allocation applicable to these fields. It has been demonstrated that finding an optimal solution is an NP-hard problem [11], [12], [13], so no exact algorithm is known that finds the optimal solution in polynomial time. The goal of task allocation is to optimize the performance of the multi-agent system or the loss of task execution, for example by increasing the number of successfully executed tasks and decreasing task execution time and resource consumption. Furthermore, to optimize the objective model, a large number of constraint conditions must be satisfied during task allocation. For example, because of their differences in structure, function, and performance, different types of agents play different roles and handle different tasks. An agent dedicated to distributing medicine can only perform a medicine distribution task, a patrol disinfection agent can only perform a periodic disinfection task, a material handling agent can only perform a medical material handling task, and so on.
In the last few years, many researchers have devoted themselves to the study of MATAP, and many methods have been proposed. Task allocation strategies are divided into two types: exact methods and heuristic algorithms. For small-scale task allocation, exact methods [14] often provide an effective solution. When the scale of the task grows, exact methods usually fail. In contrast to finding the exact solution, meta-heuristic algorithms typically find sub-optimal solutions. Heuristic algorithms have been researched more thoroughly than exact methods and have been shown to provide an efficient foundation for achieving sub-optimal solutions. Zhang Chunmei presents a distributed memetic differential evolution algorithm to solve the discrete problem [15]. Xin Bin presents a review of ways of hybridizing differential evolution and particle swarm optimization [16]. There have been several achievements in the field of meta-heuristic algorithms for the multi-agent task allocation problem. J. Schwarzrock presents swarm intelligence [17] to solve the task allocation problem in multi unmanned aerial vehicle systems, whose principle is similar to MATAP. Jing Zhou presents a distributed many-objective evolutionary algorithm with greedy algorithm (GA) [18] to solve multi-agent task allocation problems, but that work does not fully consider a mathematical model for the epidemic scenario. We therefore propose a new mathematical model to address the epidemic scenario. A metaheuristic algorithm easily falls into a local optimal solution and thus misses the global optimal solution. Wang Lu presents a novel task allocation method named Collection Path Ant Colony Optimization (CPACO) [19] to counter this tendency to converge to a local optimal solution, but the method depends on the set of initial parameters selected, and its results are unstable.
Dai Jing presents a differential evolution algorithm (DE) [20] to handle the allocation problem of multi-heterogeneous UAV cooperation, but the number of tasks considered in that scenario is too small to satisfy the multi-task requirements of the epidemic scenario. Maroua Nouiri proposes a distributed particle swarm optimization (DPSO) [21] for flexible job-shop scheduling problems, which demonstrates the effectiveness of DPSO in handling discrete problems. Feng Zhang improves quantum particle swarm optimization (QPSO) [22] to handle task allocation in MAS. It enhances the diversity of the population and gives the algorithm stronger search abilities, but it does not consider the application of QPSO in complex task scenarios. All of the above address static problems; Junier Caminha Amorim [23] presents a swarm-GAP based solution for the task allocation problem in dynamic scenarios.
In view of the above, MATAP in the epidemic scenario is a discrete problem and, at the same time, a combinatorial optimization problem. A simple discretization of the commonly used continuous algorithms is not applicable to it. Moreover, to our knowledge there is no mathematical model for MATAP in the epidemic scenario and no algorithm for large-scale discrete problems in this scenario. The purpose of this paper is to build a mathematical model and to design a discrete hybrid algorithm for the epidemic scenario that maximizes the number of successfully executed tasks, minimizes the loss of performing tasks, and minimizes task execution time and resource consumption. Overall, the main contributions of our study are as follows:
(1) In the epidemic scenario, we introduce and define the mathematical model of MATAP. The model's objective function minimizes the loss of allocation, which is achieved by selecting appropriate potential agents to execute the corresponding tasks and then determining the order of execution.
(2) A new approach is presented that combines the discrete DE and discrete PSO algorithms. We apply this approach to solve the task allocation problem in multi-agent systems. Given the urgency of the epidemic scenario and the requirement for minimal allocation losses, the algorithm emphasizes speed and minimization of the result. We therefore combine discrete DE and discrete particle swarm optimization by designing an algorithm framework, a crossover operation, and a method of updating the speed and position variables.
(3) Multiple simulation experiments are carried out to demonstrate the effectiveness of the proposed algorithm. The results show that our algorithm is more effective at solving MATAP in the epidemic scenario.
The remainder of this paper is arranged as follows. In Section II, the MATAP is defined and its mathematical model is introduced. In Section III, an illustrative example of MATAP is presented to describe how it works. In Section IV, the frameworks of DE and PSO are briefly presented, and the procedure of D-DEPSO is described in detail. Section V presents the experimental data and analysis. In Section VI, conclusions and future prospects of the research are discussed.

II. MATHEMATICAL MODEL FOR MULTI-AGENT TASK ALLOCATION PROBLEM
In order to better explain the task allocation problem of MAS in the epidemic scenario, it needs to be formally described.
To build a mathematical model, we refer to the paper [24]. In this paper, all parameters of MATAP are assumed to be known and timing is ignored by default, and two metrics are prioritized: the probability that a medical agent completes a task on time and the minimum loss of task allocation. Since the parameters are known, only the pairing scheme between robots and tasks when the epidemic hits needs to be considered. For the same multi-agent system, different allocation plans have different effects. Optimizing the task behavior of each medical robot plays a crucial role in minimizing the loss incurred when the MAS executes the tasks. Given the preceding, the objective function is designed to yield the least loss of multi-agent task allocation in the epidemic scenario. Table 1 lists the relevant indices, sets, parameters, and variables used in this section. The model is as follows: $p_{ij}$ represents the probability of robot $i$ completing task $j$ on time, and $y_{ij}$ indicates whether robot $i$ executes task $j$: $y_{ij} = 1$ if robot $i$ executes task $j$, and $y_{ij} = 0$ otherwise. Thus $(1 - p_{ij})\,y_{ij}$ represents the probability that robot $i$ does not complete task $j$ on time.
Objective function (1) minimizes the loss of MATAP while ensuring the quality of task allocation. Consider a system with $i$ agents $A = \{1, 2, \ldots, i\}$, $j$ tasks $T = \{1, 2, \ldots, j\}$, and task values $v_k$, $k = 1, 2, \ldots, K$. Because $p_{ij}$ represents the probability that robot $i$ finishes task $j$ on time, $1 - p_{ij}$ represents the probability that robot $i$ does not complete task $j$ on time. $y_{ij} = 1$ if robot $i$ is allocated task $j$ to execute; otherwise $y_{ij} = 0$. $v_k$ represents the value of task $j$ completed by robot $i$.
Constraint condition (2) guarantees that every task is executed by only one robot. Constraint condition (3) guarantees that every robot can implement only one task. Constraint condition (4) guarantees that each task is specific to a particular resource only and that every robot is assigned only one task.
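For orientation, one model that is consistent with the verbal definitions above can be written as follows. This is a sketch under stated assumptions, not necessarily the exact form of Equations (1) to (4): the task value is indexed by the task ($v_j$), $I$ and $J$ denote the numbers of robots and tasks, and constraint (4) is read as the binary restriction on $y_{ij}$.

```latex
\begin{align}
\min \quad & \sum_{i=1}^{I} \sum_{j=1}^{J} (1 - p_{ij})\, v_j\, y_{ij} \\
\text{s.t.} \quad & \sum_{i=1}^{I} y_{ij} = 1, \qquad j = 1, \ldots, J \\
& \sum_{j=1}^{J} y_{ij} \le 1, \qquad i = 1, \ldots, I \\
& y_{ij} \in \{0, 1\}, \qquad i = 1, \ldots, I,\; j = 1, \ldots, J
\end{align}
```

When the numbers of robots and tasks are equal, as in the examples of this paper, the second constraint holds with equality and every feasible $y$ is a permutation matrix.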
In the next section, we describe how an initial solution generated by D-DEPSO maps to a 0-1 matrix through an explicit example.

III. ILLUSTRATIVE EXAMPLE OF MATAP
In this section, the working principle of MATAP is explained through an illustrative example. Assume a scenario with five different tasks to be allocated to five robots. The tasks are medicine distribution, medical material handling, periodic disinfection, long-distance measurement of body temperature, and patient supervision. So N = 5, and an initial individual created by D-DEPSO is an integer arrangement such as x = [3 1 2 5 4], where the position of an integer denotes the number of the robot and the integer itself denotes the number of the task. In this example, 3 means that task 3 is allocated to robot 1, 1 means that task 1 is allocated to robot 2, and so on. The initial individuals of the D-DEPSO algorithm are random permutations of the integers 1 to N, $x_i$ ($i = 1, 2, \ldots, NP$, where NP is the population size).
Although robots are universal and one machine can perform different tasks, e.g. medicine distribution, medical material handling, or periodic disinfection, the economic loss incurred differs from task to task because the value $v_k$ of each task is different. Different tasks performed by the same agent therefore result in different economic losses. Matrix (5) depicts the result of task allocation when a rapid allocation is requested after the arrival of the epidemic. The first row represents medicine distribution; the second row represents medical material handling; the third row represents periodic disinfection; the fourth row represents long-distance measurement of body temperature; the fifth row represents patient supervision. Matrix (5) is obtained from the initial individual solution x = [3 1 2 5 4]. Due to the emergency nature of COVID-19, every robot can only complete one task at a time in this scenario. Integer arrangements are used to create the robot-task pairs that form the initial individuals generated by D-DEPSO. In matrix (5), the row in which a ''1'' is located denotes the number of the robot, and the column in which it is located denotes the number of the task.
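The mapping from an integer arrangement to a 0-1 matrix can be sketched in a few lines of Python. This is a minimal illustration, assuming the layout stated at the end of the paragraph above (row = robot, column = task); the function name `permutation_to_matrix` is ours and does not come from the paper.

```python
def permutation_to_matrix(x):
    """Convert a 1-based task permutation into a 0-1 assignment matrix.

    Assumed layout: row r is robot r+1, column c is task c+1.
    """
    n = len(x)
    if sorted(x) != list(range(1, n + 1)):
        raise ValueError("x must be a permutation of 1..N")
    y = [[0] * n for _ in range(n)]
    for robot, task in enumerate(x):
        y[robot][task - 1] = 1  # robot (0-based) executes task (1-based)
    return y


# The individual used in the example: robot 1 -> task 3, robot 2 -> task 1, ...
y = permutation_to_matrix([3, 1, 2, 5, 4])
for row in y:
    print(row)
```

Each row and each column of the result contains exactly one ''1'', so the one-task-per-robot and one-robot-per-task constraints hold by construction.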

IV. D-DEPSO
In this paper, a discrete differential evolution algorithm and a discrete particle swarm optimization are combined to solve the above multi-agent task allocation problem in the epidemic scenario. The differential evolution mutation operation is used to mutate the personal optimal positions, and particle swarm optimization is mainly used to record the personal optimal positions and update the speed values. The pseudo-code of the D-DEPSO algorithm is shown in Algorithm 1, and the parameters related to the D-DEPSO algorithm are shown in Table 2.

A. FRAMEWORK OF D-DEPSO
DE and PSO are both meta-heuristic iterative algorithms. Because of their simple calculation, fast convergence speed, ease of implementation, and few control parameters, they have attracted the interest of many scholars. The advantage of DE is the diversity of its population of solutions, and the virtue of PSO is its ability to store the personal best value and the global best value. Based on this, we first mutate the personal variable with the DE mutation operation, store the personal and global best positions, and then update the velocity and position variables with PSO, after which the search for an optimum is carried out. The algorithm framework combines the strong local search of DE with the fast convergence of PSO.
The above strategies are integrated into the traditional PSO to form an improved D-DEPSO, and the process is illustrated in Figure 1.

B. INITIALIZATION
At the beginning of the algorithm, the first step is initialization. Both the x and v matrices are 10 × 10 matrices in which each row contains the integers 1 to 10 without repetition.

Algorithm 1 (excerpt)
for j = 1 : NP do
    update matrix of p and pbest
    z ← x(j, :)
    convert z to a 0-1 matrix of y
    if func(y) < pbest(j) then
        p(j, :) ← x(j, :)
        pbest(j) ← func(y)
    end if
    update g and gbest
    if pbest(j) < gbest then
        g ← p(j, :)
        gbest ← pbest(j)
    end if
    adapted operator (see Section IV-C)
    r1 ← randi(NP)
    while r1 == j do
        r1 ← randi(NP)
    end while
    r2 ← randi(NP)
    while (r2 == j)||(r1 == r2) do
        r2 ← randi(NP)
    end while
    r3 ← randi(NP)
    r3 ← randi(NP)

The following steps initialize the personal best value: assign the value of the x matrix to the p matrix, extract each row of the p matrix and convert it to a 0-1 matrix y by the method presented in Section III, and evaluate y with the formula shown in Section II. The result at this point is a personal best value, which is saved in pbest.
The next step is to initialize the global best value, which is simply a comparison of pbest with gbest. If pbest < gbest, the g variable is replaced by the row of the p matrix being processed and gbest is replaced by pbest; gb records the pbest of every generation.
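The initialization described above can be sketched as follows. This is an illustration under assumptions rather than the authors' MATLAB code: the helper names are ours, the loss is evaluated directly on the permutation (which gives the same value as summing over the corresponding 0-1 matrix), and the task value is assumed to be indexed by task.

```python
import random


def expected_loss(x, p, v):
    """Loss of permutation x, where robot i (0-based) executes task x[i] (1-based).

    Assumed form: sum over robots of (1 - p[i][j]) * v[j].
    """
    return sum((1.0 - p[i][t - 1]) * v[t - 1] for i, t in enumerate(x))


def initialize(np_size, n, p, v, rng):
    """Create positions, speeds, personal bests and the global best."""
    x = [rng.sample(range(1, n + 1), n) for _ in range(np_size)]    # positions
    vel = [rng.sample(range(1, n + 1), n) for _ in range(np_size)]  # speeds
    pbest_pos = [row[:] for row in x]                # p matrix starts as a copy of x
    pbest = [expected_loss(row, p, v) for row in x]  # personal best values
    best = min(range(np_size), key=pbest.__getitem__)
    g, gbest = pbest_pos[best][:], pbest[best]       # global best position and value
    return x, vel, pbest_pos, pbest, g, gbest


# Hypothetical small instance, only to exercise the functions.
rng = random.Random(1)
n, np_size = 5, 6
p = [[rng.random() for _ in range(n)] for _ in range(n)]
v = [rng.uniform(0, 100) for _ in range(n)]
x, vel, pbest_pos, pbest, g, gbest = initialize(np_size, n, p, v, rng)
print("global best position:", g)
```

Because every row is drawn with `Random.sample`, it is a permutation of 1 to N from the start and needs no repair at this stage.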

C. MUTATION OPERATION
In the DE algorithm [25], initialization, mutation strategies, and crossover operations have a significant effect on the diversity of the population of solutions and thus on the quality of the solutions. We therefore regard the mutation operation and the crossover operation as the most important parts in designing our D-DEPSO algorithm. Meanwhile, in order to improve population diversity in the early stages of the algorithm and to preserve good solutions in the later stages so that the optimal solutions are not destroyed, we add an adapted operator. The adapted operator is given in Equation (6), where T represents the maximum number of iterations, i represents the current iteration number, and f represents the initial mutation rate. At first, i = 1 and F = 2f, so the algorithm can maintain individual diversity. As the number of iterations increases, i gradually approaches T and F gradually converges to f. This allows the algorithm to retain the optimal solution.
Because the MATAP in this paper is a discrete problem, we have to design a discrete mutation operation to replace the traditional one. The idea of the mutation operation of the classical DE algorithm is that the weighted difference of two vectors is added to a third vector, as shown in Equation (7):
$v_{j,g} = x_{r_1,g} + F \cdot (x_{r_2,g} - x_{r_3,g})$ (7)
Without altering this core idea, we use the mutation mode shown in Equation (8).
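A discrete variant of Equation (7) can be sketched as follows. This is not Equation (8) itself: we assume that the weighted difference is rounded to an integer and that values leaving the range 1 to N are wrapped back with a modulus, which is how the abstract describes the handling of numerical overflow. The function name is hypothetical.

```python
def discrete_mutation(p_r1, p_r2, p_r3, F):
    """Integer mutation p_r1 + round(F * (p_r2 - p_r3)), wrapped into 1..N.

    Assumption: overflow is corrected with a modulus, as the abstract suggests.
    """
    n = len(p_r1)
    mutant = []
    for a, b, c in zip(p_r1, p_r2, p_r3):
        raw = a + round(F * (b - c))       # may fall outside 1..N
        mutant.append((raw - 1) % n + 1)   # wrap back into 1..N
    return mutant


mutant = discrete_mutation([3, 1, 2, 5, 4], [5, 4, 3, 2, 1], [1, 2, 3, 4, 5], F=0.8)
print(mutant)  # [1, 3, 2, 3, 1]
```

The result stays within 1 to N but can contain repeated integers, as in this example, which is why a repair step is needed afterwards.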

D. CROSSOVER OPERATION
As shown in the concrete example at the end of Section IV-C, $p_j$ at this point does not satisfy the criterion of containing the integers 1 to 10 without repetition in each row. We use an example to show how our method makes $p_j$ satisfy the criterion. The procedure is shown in Figure 3.
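The repair can be sketched as below, following the description in the abstract (repeated numbers are deleted and the missing integers are reinserted using a random permutation). The details are assumptions: we keep the first occurrence of each integer and fill the freed positions with the missing integers in random order; Figure 3 may differ in which occurrence is kept.

```python
import random


def repair_permutation(row, rng):
    """Turn a row with repeats into a permutation of 1..N.

    First occurrences are kept; repeated or out-of-range entries are replaced
    by the missing integers, inserted in random order (assumed policy).
    """
    n = len(row)
    seen, kept = set(), []
    for value in row:
        if 1 <= value <= n and value not in seen:
            seen.add(value)
            kept.append(value)
        else:
            kept.append(None)  # slot to be refilled
    missing = [k for k in range(1, n + 1) if k not in seen]
    rng.shuffle(missing)       # random permutation of the missing integers
    fill = iter(missing)
    return [value if value is not None else next(fill) for value in kept]


rng = random.Random(0)
fixed = repair_permutation([1, 3, 2, 3, 1], rng)
print(fixed)
```

The number of freed slots always equals the number of missing integers, so the iterator is exhausted exactly when the row is complete.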

E. UPDATE SPEED AND POSITION VARIABLE
The particles in the D-DEPSO algorithm have position and velocity variables. The position of a particle represents a task allocation scheme for the agents, and the position variable is generated from the speed variable. The fitness of a particle represents the expected loss of the task allocation scheme, so Equation 1 is selected as the fitness function of the algorithm. A smaller fitness value therefore corresponds to a lower expected loss of the task allocation scheme that the particle represents.
Every time a particle-update operation is completed, the optimal particle is updated. The process is as follows. The fitness of the current position of a particle is calculated. If the fitness of x is less than pbest, the fitness of the best known position of x, then pbest is replaced with the fitness of x. Similarly, if pbest is lower than gbest, the fitness of the global optimal position, then gbest is replaced with pbest and g is replaced with p. If these conditions are not met or the iteration ends, the optimal particles are not updated.
Traditional PSO was proposed by Kennedy and Eberhart in 1995 [26]; its speed-update formulation is shown in Equation 11. Y. Shi and R. Eberhart introduced PSO with an inertia coefficient [27], which gradually became the standard PSO, as shown in Equation 12. Based on Equation 12, we present a new speed-update formulation, shown in Equation 13, in which we define the operation ''R''. The results of the calculation in parentheses are first handled by a boundary condition that removes numbers less than zero or greater than ten, and then all remaining results are rounded up to the next integer. Because the new speed variable v(t + 1) does not meet the criterion of having no repetition in each row from 1 to 10, we handle it using the method presented in Section IV-D. w = 0.6, c1 = 1. A new variable v(t + 1) = [2 1 8 3 7 9 6 5 4 10] is created.
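The discretized speed update can be sketched as follows. We assume that Equation 13 wraps the standard inertia-weight update $w\,v + c_1 r_1 (p - x) + c_2 r_2 (g - x)$ in the ''R'' operation; the per-dimension random factors, the value of `c2`, and all function names are assumptions made for this illustration.

```python
import math
import random


def repair(row, rng):
    """Replace invalid (None, out of range) or repeated entries by the missing integers."""
    n = len(row)
    seen, kept = set(), []
    for value in row:
        if value is not None and 1 <= value <= n and value not in seen:
            seen.add(value)
            kept.append(value)
        else:
            kept.append(None)
    missing = [k for k in range(1, n + 1) if k not in seen]
    rng.shuffle(missing)
    fill = iter(missing)
    return [value if value is not None else next(fill) for value in kept]


def update_speed(v, x, p, g, w, c1, c2, rng):
    """Inertia-weight speed update followed by the assumed 'R' operation."""
    n = len(v)
    raw = []
    for k in range(n):
        r1, r2 = rng.random(), rng.random()
        val = w * v[k] + c1 * r1 * (p[k] - x[k]) + c2 * r2 * (g[k] - x[k])
        # Boundary condition: drop values below 0 or above N, then round up.
        raw.append(math.ceil(val) if 0 <= val <= n else None)
    return repair(raw, rng)  # restore the no-repetition criterion


rng = random.Random(0)
x = [3, 1, 2, 5, 4, 6, 7, 8, 9, 10]
v = [2, 1, 8, 3, 7, 9, 6, 5, 4, 10]
p = [4, 10, 1, 6, 7, 9, 5, 2, 8, 3]
g = [3, 10, 2, 8, 7, 9, 1, 6, 4, 5]
# w = 0.6 comes from the text; c1 = 1.2 is the Section V setting; c2 is hypothetical.
new_v = update_speed(v, x, p, g, w=0.6, c1=1.2, c2=1.2, rng=rng)
print(new_v)
```

Whatever the raw values are, the returned speed vector is again a permutation of 1 to N, so it can be recombined with a position vector directly.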
The position-update operation updates the original position variable by recombining the two variables x and v. Many types of recombination operators exist [28]. We present a new recombination operator based on the core idea of extracting a part of each of the variables x and v, from which a new x variable is obtained. As shown in Figure 4, the procedure is as follows. A random number, here 4, is generated, and the first four columns of x are extracted. Consequently, six columns of v(t) need to be extracted. A new x variable is then obtained by combining the two parts. This new x variable generally does not satisfy the criterion of having no repetition in each row from 1 to 10, so we again apply the crossover operation presented in Section IV-D. A new x variable that satisfies all the constraints is then obtained.
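The recombination can be sketched as below. We assume that the six columns taken from v are its last six (columns 5 to 10), which the text does not state explicitly, and that repeats are repaired by keeping first occurrences; the function names are hypothetical.

```python
import random


def repair(row, rng):
    """Keep first occurrences; refill repeated slots with the missing integers."""
    n = len(row)
    seen, kept = set(), []
    for value in row:
        if value not in seen:
            seen.add(value)
            kept.append(value)
        else:
            kept.append(None)
    missing = [k for k in range(1, n + 1) if k not in seen]
    rng.shuffle(missing)
    fill = iter(missing)
    return [value if value is not None else next(fill) for value in kept]


def update_position(x, v, rng, cut=None):
    """First `cut` columns from x, remaining columns from v, then repair."""
    n = len(x)
    if cut is None:
        cut = rng.randint(1, n - 1)  # random split point
    child = x[:cut] + v[cut:]        # assumed: the tail of v fills the rest
    return repair(child, rng)


rng = random.Random(0)
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
v = [2, 1, 8, 3, 7, 9, 6, 5, 4, 10]  # the speed vector quoted in the text
new_x = update_position(x, v, rng, cut=4)
print(new_x)  # [1, 2, 3, 4, 7, 9, 6, 5, 8, 10]
```

In this run the raw child [1 2 3 4 7 9 6 5 4 10] repeats the integer 4 and lacks 8, so the repair replaces the second 4 with 8.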
The experiments in the next section examine the effectiveness of this framework, through which D-DEPSO attains a faster convergence speed and a better global value.

V. EXPERIMENTAL RESULT ANALYSIS

A. EXPERIMENTAL ANALYSIS OF ALGORITHM
Because of the time-critical nature of the epidemic, the multi-agent task allocation problem in the epidemic scenario has rarely been studied. Given the large-scale and static nature of the problem in this scenario, we are not aware of other algorithms designed specifically for it. We therefore compare D-DEPSO with discrete DE, discrete PSO, improved discrete DE (IDE) [29], improved discrete PSO (IPSO) [30], and improved discrete genetic algorithm (IGA) [31] to illustrate the superiority of the proposed method.
To demonstrate and analyze the performance of the proposed D-DEPSO algorithm in an epidemic scenario, several experiments are carried out: (1) a comparison of the discrete DE, discrete PSO, improved discrete DE, improved discrete PSO, improved discrete GA, and D-DEPSO algorithms on MATAP at different scales and with different NP; (2) experiments on the performance of the D-DEPSO algorithm on MATAP at the same scale but with different population sizes (NP). In this paper, we assume that parameters such as the value of a task and the probability of robot i completing task j on time are fixed and known.
All experiments are executed in the following environment: 64-bit Windows 11 21H2; 3.20 GHz AMD Ryzen 7 5800H with Radeon Graphics CPU; 16 GB memory; programming environment: MATLAB R2020b.
In the experiments, the D-DEPSO algorithm is compared with the discrete DE, discrete PSO, improved DE, improved PSO, and improved GA algorithms in different dimensions and with different NP. The matrices of $v_k$ and $p_{ij}$ are randomly generated: $v_k$ denotes the value of an executed task, and every entry of the $v_k$ matrix is a random number in [0, 100]; $p_{ij}$ denotes the probability of robot i completing task j on time, and every entry of the $p_{ij}$ matrix is a random number in [0, 1]. For each comparison experiment of the same dimension and different NP, once $v_k$ and $p_{ij}$ are determined, they are not changed. The parameters of the six algorithms involved in the experiments are set as follows (all parameters are the best settings obtained by experiments): 1. DE/IDE (1) For the initial mutation rate, F = 0.4.
(2) For the initial cognitive factor, c1 = 1.2.
To illustrate the application of the above work in an epidemic scenario, we conduct experiments in 10 dimensions on simulated data generated by a random function; the results appear in the first row of Figure 5. Ten dimensions means that there are ten tasks waiting to be matched to ten agents. In this case, there are three medicine distribution tasks, two medical material handling tasks, four periodic disinfection tasks, and one long-distance measurement of body temperature task. Every robot can execute any task.
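The random test data described above can be generated as in the following sketch. The paper used MATLAB; this Python version, the function name `make_instance`, the uniform distribution, and the use of a fixed seed to keep $v_k$ and $p_{ij}$ unchanged across runs with different NP are our assumptions.

```python
import random


def make_instance(n, seed=0):
    """Random MATAP instance: task values in [0, 100], on-time probabilities in [0, 1]."""
    rng = random.Random(seed)  # fixed seed: the same instance for every NP setting
    v = [rng.uniform(0, 100) for _ in range(n)]               # value of each task
    p = [[rng.random() for _ in range(n)] for _ in range(n)]  # p[i][j]: robot i, task j
    return p, v


p, v = make_instance(10)
print(len(p), len(p[0]), len(v))  # 10 10 10
```

Calling `make_instance(10)` again with the same seed reproduces the same matrices, which mirrors the requirement that the data stay fixed within one comparison experiment.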
Six algorithms, discrete DE, discrete PSO, improved DE, improved PSO, improved GA, and D-DEPSO, were run for this case with NP set to 50, 100, and 200. The first row of Figure 5 shows the comparison plots. When NP is 50, D-DEPSO still has a clear advantage. When NP is 100, however, IDE has the fastest convergence speed. When NP is 200, IPSO has a faster convergence speed and the result of IGA almost catches up with D-DEPSO. On the whole, D-DEPSO has the best performance in this case.
As shown in the second row of Figure 5, we conducted experiments to compare the losses of the six algorithms in twenty dimensions for NP of 50, 100, and 200. The trend is almost the same as in ten dimensions. D-DEPSO obtains the best solution in this scenario, although IGA has a faster convergence speed when NP is 50 or 200, and IDE and IPSO have a faster convergence speed when NP is 100. In terms of solution quality, D-DEPSO has a clear advantage in twenty dimensions.
For fifty-dimensional MATAP in an epidemic scenario, we have conducted simulation experiments on 50 tasks allocated to 50 robots in this part. Because of the increase in the number of tasks, we need a longer iteration period. The experimental results are presented in the third row of Figure 5. IPSO and IGA have started to show excellent performance.
Whether NP is 50, 100, or 200, IPSO and IGA have almost equally fast convergence speed, but the PSO-based D-DEPSO algorithm obtains the better solution. The larger the NP, the better the solution obtained.
Because the one-hundred-dimensional MATAP is computationally more complex, we not only increase the number of iterations but also change the population sizes from 50, 100, 200 to 100, 200, 300. The experimental results in the last row of Figure 5 show that D-DEPSO has a clear advantage in searching for a global optimal solution. For the high-dimensional MATAP in the epidemic scenario, D-DEPSO has a faster convergence speed and the lowest losses.
Because of the complexity of tasks under the epidemic, we compare 10, 20, 50, and 100 dimensions in Figure 5. The IPSO and IGA algorithms converge fast in the early stages of operation. In low dimensions, IDE finds a sub-optimal solution quickly and converges fast, but it easily falls into a local optimal solution. The ability of IPSO, IGA, and D-DEPSO to search for global optimal solutions is much better than that of DE and IDE. In the case of three medicine distribution tasks, two medical material handling tasks, four periodic disinfection tasks, and one long-distance measurement of body temperature task, D-DEPSO obtains the optimal solution [4,10,1,6,7,9,5,2,8,3] when ''NP'' is set to 100. Meanwhile, D-DEPSO generates the minimum loss for task allocation in the epidemic scenario. For 100 dimensions, D-DEPSO has a definite advantage over the other five algorithms.

B. ANALYSIS OF STATISTICAL DATA
To guide the setting of the ''NP'' parameter, we test the effect of different population sizes on D-DEPSO's handling of MATAP in the same dimension. We conducted four experiments, in 10, 20, 50, and 100 dimensions. The results are shown in Figure 6. For 10 and 20 dimensions, the solutions generated when NP is 100 and 200 are almost the same, which means that once NP reaches 100, increasing it further has little effect. For the fifty-dimensional MATAP, D-DEPSO generates its best solution when NP is set to 200. For the one-hundred-dimensional MATAP, the solutions generated when NP is 200 and 300 are almost the same. So NP is set to at most 200.
From the above experiments, we obtain the following ''NP'' settings. For the task allocation problems of 10 and 20 dimensions, setting ''NP'' to 100 yields the optimal solution. For the higher-dimensional task allocation problems of 50 and 100 dimensions, an ''NP'' setting of 200 is more appropriate.
In this part, the statistical data of discrete DE, discrete PSO, improved discrete DE, improved discrete PSO, improved discrete GA, and D-DEPSO are summarized. The ''NP'' parameter is set according to the experimental results above. For every dimension, the mean value, standard deviation, and average time were obtained from 100 sets of data. The statistical data of all experimental results are shown in Table 3.
Three types of data are reported for each algorithm: the minimum loss, ''X ± S'', and the average time. The unit of the minimum loss and of X ± S is k, where 1k means an economic loss of one thousand RMB. The unit of the average time is s (seconds). Bolded characters mark the best value for each quantity. Next, we discuss Table 3 for ten, twenty, fifty, and one hundred dimensions in turn.
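The three reported quantities can be computed from repeated runs as in the sketch below. The input lists are hypothetical placeholders, not values from Table 3, and we assume the sample standard deviation; the paper does not say whether the sample or the population form was used.

```python
import statistics


def summarize(losses, times):
    """Minimum loss, mean and standard deviation of the loss, and average run time."""
    return {
        "min_loss": min(losses),
        "mean": statistics.mean(losses),
        "std": statistics.stdev(losses),  # sample standard deviation (assumed)
        "avg_time": statistics.mean(times),
    }


# Hypothetical losses (in k RMB) and run times (in s) of a few repeated runs.
summary = summarize([1.0, 2.0, 3.0], [0.1, 0.3])
print(summary)
```

In the setting of this section, each list would hold 100 entries, one per run of a given algorithm at a given dimension.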
In the ten-dimensional statistical data, IGA has the minimum loss and the minimum mean among all six algorithms. Meanwhile, IDE has the shortest average time, and the average time of D-DEPSO is almost the same as that of DE. Compared with discrete DE, discrete PSO, IDE, IPSO, and IGA, D-DEPSO has a smaller standard deviation, which means that its results for ten dimensions fluctuate less.
The trend for twenty dimensions is similar to that for ten dimensions. D-DEPSO generates the smallest loss and obtains the minimum mean, but its average time is almost the longest. In contrast, IDE has the shortest average time. The smallest standard deviation is generated by IGA.
For the fifty-dimensional statistics, the trend is the same as for the twenty-dimensional statistics. Minimum loss, minimum mean, and minimum standard deviation are all obtained by D-DEPSO. Of course, the average time of D-DEPSO has improved. IDE, in contrast, has the shortest average time. IGA has the minimum standard deviation, which means that the statistics of IGA fluctuate less.
The trend begins to change in the one-hundred-dimensional statistical data. Here D-DEPSO not only obtains the minimum loss and mean but also produces the smallest standard deviation and average time. The values of the mean and standard deviation show that the results of D-DEPSO fluctuate less. D-DEPSO has the best performance in the one-hundred-dimensional experiments.

C. ANALYSIS OF ACTUAL TASK ALLOCATION

1) TEN DIMENSIONS MATAP
For the ten-dimensional MATAP in the epidemic scenario ($v_k$ of each task is a specific value obtained from our evaluation), D-DEPSO generates the optimal solution [3,10,2,8,7,9,1,6,4,5] when ''NP'' is 50, the optimal solution [4,10,1,6,7,9,5,2,8,3] when ''NP'' is 100, and the optimal solution [4,10,1,6,7,9,5,2,8,3] when ''NP'' is 200. From the conclusions of Section V-B, ''NP'' should be set to 100 to produce the optimal solution for the 10-dimensional task allocation problem, so we choose the solution obtained with ''NP'' set to 100 for analysis. This solution is [4,10,1,6,7,9,5,2,8,3], and the matrix transformed from it is shown in Matrix 14. The matrix is read as follows: the first to third columns represent medicine distribution tasks; the fourth and fifth columns represent medical material handling tasks; the sixth to ninth columns represent periodic disinfection tasks; and the tenth column represents the long-distance measurement of body temperature task. It means that robots No.
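The column-to-task-type reading described above can be applied mechanically to the reported solution. The sketch below is an illustration, assuming as in Section III that the position in the vector is the robot number and the value is the task number; the names `TASK_TYPES` and `robots_by_task_type` are ours.

```python
# Task numbers (matrix columns) belonging to each task type, as stated in the text.
TASK_TYPES = [
    ("medicine distribution", range(1, 4)),
    ("medical material handling", range(4, 6)),
    ("periodic disinfection", range(6, 10)),
    ("long-distance measurement of body temperature", range(10, 11)),
]


def robots_by_task_type(solution):
    """Group robot numbers (1-based positions) by the type of their assigned task."""
    groups = {name: [] for name, _ in TASK_TYPES}
    for robot, task in enumerate(solution, start=1):
        for name, tasks in TASK_TYPES:
            if task in tasks:
                groups[name].append(robot)
    return groups


groups = robots_by_task_type([4, 10, 1, 6, 7, 9, 5, 2, 8, 3])
for name, robots in groups.items():
    print(name, robots)
```

Under this reading, robots 3, 8, and 10 receive the medicine distribution tasks, robots 1 and 7 the material handling tasks, robots 4, 5, 6, and 9 the disinfection tasks, and robot 2 the temperature measurement task.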
2) TWENTY DIMENSIONS MATAP
In this part, a twenty-dimensional task allocation in the epidemic scenario is illustrated. From the conclusion of Section V-B, an ''NP'' of 100 is most suitable for the twenty-dimensional MATAP, so we take the optimal solution generated by D-DEPSO in twenty dimensions with ''NP'' set to 100. The optimal solution is [13,10,15,20,9,16,18,7,12,4,11,8,6,3,19,1,5,17,14,2], where the position of an integer denotes the number of the robot and the integer denotes the number of the task. The meaning of the optimal solution is shown in Figure 7. The optimal task allocation is that robots No. 16, No. 20, No. 14, No. 10, No. 17, and
Actual experiments show that D-DEPSO has good performance in twenty dimensions; it not only finds the optimal solution but also has a fast convergence speed. Limited by the length of the article, the 50-dimensional and 100-dimensional MATAP in the epidemic scenario are not described in detail here.

VI. CONCLUSION
This paper proposed a discrete hybrid algorithm named D-DEPSO to handle multi-agent task allocation problems based on a task allocation model and a meta-heuristic algorithm. Based on the constraints of the epidemic scenario, a multi-agent task allocation strategy was proposed, and the task allocation problem in the epidemic scenario was introduced and defined using a mathematical model. D-DEPSO is used to minimize the loss of task allocation. It improves the diversity of the population through the mutation operation, and it incorporates the discrete PSO algorithm to improve the global searching ability. The results of experiments that compare D-DEPSO with the other five algorithms demonstrate that D-DEPSO obtains optimal solutions in different dimensions and that its running speed is faster than that of discrete DE, discrete PSO, IDE, IPSO, and IGA in 100 dimensions. This suggests that D-DEPSO performs better in higher dimensions and handles large-scale task allocation with significant advantages. Given the large scale of tasks in the epidemic scenario, D-DEPSO can obtain lower losses and more diverse solutions than the other five algorithms.
However, this paper only considered single-objective optimization. In the future, we will concentrate on the multi-objective optimization of multi-agent task allocation problems in epidemic scenarios. More complex task scenarios will be considered in future work.