Distributed Task Rescheduling With Time Constraints for the Optimization of Total Task Allocations in a Multirobot System

This paper considers the problem of maximizing the number of task allocations in a distributed multirobot system under strict time constraints, where other optimization objectives need also be considered. It builds upon existing distributed task allocation algorithms, extending them with a novel method for maximizing the number of task assignments. The fundamental idea is that a task assignment to a robot has a high cost if its reassignment to another robot creates a feasible time slot for unallocated tasks. Multiple reassignments among networked robots may be required to create a feasible time slot and an upper limit to this number of reassignments can be adjusted according to performance requirements. A simulated rescue scenario with task deadlines and fuel limits is used to demonstrate the performance of the proposed method compared with existing methods, the consensus-based bundle algorithm and the performance impact (PI) algorithm. Starting from existing (PI-generated) solutions, results show up to a 20% increase in task allocations using the proposed method.

One challenge in using teams of robots is to co-ordinate them to perform tasks while optimizing one [13], [14] or more objectives [15]- [17]. Considering a search and rescue scenario, in which survivors need to be assisted before specified deadlines, the two main objectives are: 1) to maximize the number of rescued survivors and 2) to minimise the average waiting time before their rescue [18]. Similarly, tasks in an assembly line or manufacturing process require completion with different constraints and optimization objectives often including completion of all tasks in the shortest possible time [9]. As opposed to a factory environment, search and rescue missions often deal with very dynamic conditions, unstructured environments, and limited resources. Increasing the number of survivors and reducing waiting time are top priorities. The novel algorithm presented in this paper applies particularly to search and rescue scenarios, but applications extend to other scenarios in which similar conditions and constraints are present.
The specific problem investigated in this paper is that of maximizing task assignments in time-constrained scenarios. Robots, or autonomous vehicles, can only perform one task at a time, each task requires only one vehicle to perform it, and each vehicle may be assigned multiple tasks that it executes based on a schedule. Using the Gerkey and Matarić taxonomy [13], [19], this is known as the single-task (ST), single-robot (SR), and time-extended assignment (TA) problem. Due to the complexity of the problem, existing heuristic task allocation methods are likely to generate a locally optimal solution and lack the flexibility to escape from it. Following the principle that tasks are assigned to minimize costs, we introduce a method of measuring the cost of a task assignment, called performance impact (PI)-MaxAss, that effectively shifts task assignments among vehicles to create feasible time slots for unassigned tasks where none would exist otherwise. The maximum number of reassignments can be adjusted to match performance requirements. With this method, existing task assignment solutions are iteratively improved without the need to repeat the whole task allocation procedure. The procedure follows a two-phase task assignment strategy that starts from a solution generated by an existing distributed task allocation algorithm, PI [20], that minimizes average waiting time. The proposed method PI-MaxAss is used in the second stage for maximizing task allocations.
This paper introduces PI-MaxAss, an extension of [21] with further configuration settings and new findings using previously untested simulation scenarios. A new convergence guarantee method is proposed, as well as a complexity analysis. In line with the updating of terminology from [22] to [20] that more precisely reflects the contribution of the PI algorithm, the taxonomy used in [21] has also been updated for this paper.
This work is licensed under a Creative Commons Attribution 3.0 License. For more information, see http://creativecommons.org/licenses/by/3.0/
The remainder of this paper is organized as follows. The task assignment problem and current approaches are presented in Section II. In Section III the concept of PI-MaxAss is introduced. Simulation results are presented in Section IV followed by a discussion in Section V and concluding remarks in Section VI.

A. Related Work
The search and rescue scenarios considered in this paper have similarities with the traveling salesman problem (TSP), a well-known NP-hard combinatorial optimization problem in graph theory [23]. The objectives considered in this paper are comparable to the constraints of two variants of the TSP: 1) the team orienteering problem with time windows (TOPTW) [24], also known as the multiple tour maximum collection problem and 2) the K-traveling repairmen problem (K-TRP) [25], also known as the minimum latency problem. The TOPTW considers multiple time-limited paths with the objective to maximize the total collected score over a set of vertices. Each vertex is assigned a time window and is to be visited once at most. The K-TRP tries to determine a set of tours for multiple repairmen to visit a set of customers with the objective to minimize the average time a customer must wait before a repairman arrives. These objectives and constraints are applicable to a variety of scenarios such as those found in healthcare, target tracking, pick-up and delivery, logistics, dynamic ride sharing [26], cleaning chemical spills, patrolling, checking for structural integrity of buildings [4], and any scenario that requires many urgent jobs to be completed in a minimum time by multiple agents.
Various algorithms have explored strategies to solve multiobjective TSPs or vehicle routing problems; see [16] for a survey. Paquete and Stützle [15] tackled a bi-objective TSP with a two-phase local search procedure. The first phase generates a solution that optimizes only one objective. The second phase begins the search from the solution generated in the first phase to optimize the second objective. The advantages of this approach highlighted by [15] are that it exploits the strong performance of single-objective local search algorithms by chaining them together, and that it keeps the procedure modular and easy to understand, which allows for modifications and enhancements. Heuristic methods to solve combinatorial optimization problems are prone to finding a local optimum [27]; however, a second search can perturb the first-phase solution out of local optima to reach an enhanced solution closer to a nondominated global optimum. Algorithms previously developed to solve variants of the TSP, such as [15], [16], and [28]-[30], rely on computing a solution with a centralized approach.
Centralized task allocation systems, where a central server gathers information from each vehicle in the team and then computes an allocation for each vehicle, can optimize a chosen global objective based on a complete set of information from all vehicles. The drawbacks are the resulting single point of failure, and the requirement that each vehicle must have a communication link with the central server. Thus, the possible mission range is limited, and a heavy communication and computation burden is put on the central server. Distributed methods for task allocation overcome these limitations. In such cases, the task allocation algorithm runs on each vehicle simultaneously and the solution is reached through the interaction and exchange of information among them [11], [12], [31]. One of the drawbacks of distributed systems is that each vehicle has a different situational awareness, and therefore, consensus procedures are required for the team of vehicles to reach agreement.
ST-SR-TA is a combinatorial optimization problem known to be strongly NP-hard [13], [19]. A subcategory of this problem, for which the cost of a task assignment depends on the other tasks that agent is performing, is called in-schedule dependencies (ID [ST-SR-TA]). Variants of the K-TSP can be modeled under this class of problem [19]. Due to the high complexity of the problem, as the number of tasks and vehicles increases, it is usually too computationally expensive to consider each combination of tasks for each vehicle in order to find the optimal solution. The computational limitations are particularly relevant in search and rescue scenarios in which time and resources could be limited. Therefore, heuristic methods are employed to speed up the process of task allocation while maintaining an efficient and scalable algorithm [11], [28]- [30], [32].
Market-based multirobot (MR) co-ordination approaches [11] have been applied successfully to the ST-SR-TA problem to find suboptimal solutions efficiently and in a distributed fashion. With this approach, teams of self-interested agents iteratively trade tasks to maximize their own profit or minimize their costs. A cost is associated with an agent visiting a task within its path and is often measured as the total estimated use of individual resources to reach that task, such as fuel consumption, distance traveled, or time to reach the target. The local cost of an agent's path is equal to the sum of costs of each task the agent is assigned to [33], and the global cost of an agent team is the sum of costs of all task assignments in the team. An auction is a commonly used market-based approach to assign tasks [34]. The process consists of several rounds of bidding in which agents place bids on each task where the value of a bid for a task is equal to the agent's estimated cost of visiting that task. The agent wins and is allocated those tasks for which it has placed a bid lower than any other agent. The effect of using this market-based approach is that local costs and subsequently global costs are minimized [11].
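The bidding rule described above can be illustrated with a minimal sketch. It assumes a single bidding round in which every agent has already computed its bids; the function name `run_auction` and the example agents, tasks, and costs are hypothetical and are not taken from [11] or [34].

```python
from typing import Dict, Hashable


def run_auction(bids: Dict[Hashable, Dict[Hashable, float]]) -> Dict[Hashable, Hashable]:
    """Award each task to the agent that placed the lowest bid on it.

    bids[agent][task] is the agent's estimated cost of visiting the task.
    Ties go to the agent that appears first in `bids` (an assumption).
    """
    winners: Dict[Hashable, Hashable] = {}
    best_cost: Dict[Hashable, float] = {}
    for agent, agent_bids in bids.items():
        for task, cost in agent_bids.items():
            # A strictly lower bid displaces the current winner of the task.
            if task not in best_cost or cost < best_cost[task]:
                best_cost[task] = cost
                winners[task] = agent
    return winners


# Hypothetical bids: estimated travel times of two vehicles to two tasks.
bids = {
    "v1": {"t1": 4.0, "t2": 9.0},
    "v2": {"t1": 6.0, "t2": 3.0},
}
print(run_auction(bids))  # {'t1': 'v1', 't2': 'v2'}
```

Real auction schemes usually iterate this step over several rounds, recomputing bids as each agent's path grows; the sketch shows only the winner determination.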
Zheng and Koenig [35] developed a multirobot, distributed reallocation mechanism called K-swaps that describes multiple task exchanges among multiple agents at a time, and showed empirically that the method can optimize an existing task allocation solution by reducing team costs. Extending the idea of K-swaps, [36]-[38] introduced a decentralized task assignment algorithm considering instantaneous assignment, such that each robot is assigned exactly one task, the ST-SR-IA problem. The algorithm requires the differentiation of two roles, organizer and member robots, and can be used to optimize existing suboptimal task assignments.

B. Formal Problem Description
Consider a search and rescue scenario with n heterogeneous autonomous vehicles and m survivors. In this scenario, attending to a survivor is synonymous with executing a task. The goal is to provide targeted emergency support to the survivors as quickly as possible, e.g., some survivors may require food supplies, while others may require medical provisions. Thus in some scenarios different types of vehicles are necessary to complete different tasks. The distributed vehicles in a network rely on local communication to co-ordinate a rescue plan over multiple iterations.
In the particular scenario considered, each survivor must be visited by one vehicle in order to be deemed rescued. Each vehicle can be assigned multiple targets and will sequentially visit those targets, while not required to return to its initial location. The main challenge is to reach an optimal allocation where allocation numbers are maximized and waiting time minimized, while respecting time constraints.
To formulate the problem mathematically, a set of $n$ heterogeneous autonomous vehicles is defined by $V = [v_1, \ldots, v_n]$, and a set of $m$ tasks waiting to be completed is defined by $T = [t_1, \ldots, t_m]$. A list of key symbols used hereafter is provided in Table I. The ordered task allocation of the $i$th vehicle $v_i$ is stored in $a_i$, which can contain a variable number of tasks depending on how many tasks are assigned to $v_i$. Each task is to be assigned to one vehicle only, or left unassigned when time constraints cannot be satisfied.
Different task types can be executed by heterogeneous vehicles with the right capabilities. Thus, each task will be assigned only to vehicles functionally capable of performing them.
A latest start time $s_k$ is defined for each task $t_k$ after which it is too late for the task to be executed successfully; it is therefore necessary to determine whether a vehicle can arrive at the location of a task $t_k$ before the latest start time $s_k$. The objective of minimizing average waiting time measures the cost of a task assignment as the time it takes to start servicing the task from the start of the vehicle's schedule, i.e., the total time the survivor must wait before being attended to. The time cost of a task $t_k$ in $a_i$, defined as $c_{i,k}(a_i)$ in [22], is the predicted time taken by the vehicle $v_i$ to arrive at the location of the task $t_k$. This time includes the duration of earlier tasks in $a_i$ and travel time to and from those earlier tasks, but does not include the duration of the execution of $t_k$. For this particular scenario, the duration of a task is dependent on the task type [22]. Vehicles are additionally assumed to have limited fuel capacity that restricts the time that they can be active for. All tasks must be started before the vehicle reaches its fuel capacity. The latest time at which $v_i$ can arrive at a task before reaching its fuel capacity is defined as $f_i$. The start time of the $k$th task must therefore also be no later than $f_i$ such that
$$c_{i,k}(a_i) \le \min(s_k, f_i). \tag{1}$$
In [20] and [22], the global objective $J$ is to minimize the average start time of all tasks, such that
$$J = \min \frac{1}{\sum_{i=1}^{n} |a_i|} \sum_{i=1}^{n} \sum_{t_k \in a_i} c_{i,k}(a_i) \tag{2}$$
where $|a_i|$ is the number of tasks assigned to $v_i$. In [20] and [22], unassigned tasks are given the highest cost and are therefore prioritized for inclusion following the inclusion criteria described later in Section II-E. The main contribution of this paper is a novel way to measure the PI of a task assignment such that an assignment's cost is correlated to the PI of tasks that it can be replaced with were it to be reassigned to another vehicle. The objective is to maximize the number of allocated tasks.
Maximizing task allocations is defined as
$$\max \sum_{i=1}^{n} |a_i|. \tag{3}$$
The new version of PI, which maximizes the number of assignments, is referred to as PI-MaxAss, and is the main contribution of this paper. The PI presented in [22] that minimizes average time is referred to as PI-MinAvg in order to distinguish the two.
The motivation for PI-MaxAss is to prioritize assigning the maximum number of tasks in scenarios in which time constraints severely restrict the number of tasks that can be assigned. For scenarios in which all tasks can be assigned, it is recommended to use PI-MinAvg to optimize average waiting time.
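As an illustration of the time cost and the time constraints described in this section, the following sketch computes arrival times along an ordered task list and checks them against task deadlines and the fuel limit. The planar positions, constant speed, dictionary keys, and function names are assumptions made for the example only.

```python
from math import hypot


def time_costs(start, speed, tasks):
    """Arrival time at each task of an ordered task list.

    tasks: list of dicts with keys 'pos', 'duration', 'latest_start'
    (hypothetical field names). The time cost of a task includes travel
    and the durations of earlier tasks, but not its own duration.
    """
    costs, clock, here = [], 0.0, start
    for task in tasks:
        dx = task["pos"][0] - here[0]
        dy = task["pos"][1] - here[1]
        clock += hypot(dx, dy) / speed   # travel to the task
        costs.append(clock)              # time cost = arrival time
        clock += task["duration"]        # service the task
        here = task["pos"]
    return costs


def is_feasible(start, speed, tasks, fuel_limit):
    """True if every task is reached by its latest start time and fuel limit."""
    costs = time_costs(start, speed, tasks)
    return all(c <= min(t["latest_start"], fuel_limit)
               for c, t in zip(costs, tasks))


schedule = [
    {"pos": (3.0, 4.0), "duration": 2.0, "latest_start": 6.0},
    {"pos": (3.0, 0.0), "duration": 1.0, "latest_start": 12.0},
]
print(time_costs((0.0, 0.0), 1.0, schedule))         # [5.0, 11.0]
print(is_feasible((0.0, 0.0), 1.0, schedule, 20.0))  # True
print(is_feasible((0.0, 0.0), 1.0, schedule, 10.0))  # False
```

In the last call the second task would be reached at time 11, after the assumed fuel limit of 10, so the schedule is rejected even though the task deadline itself is met.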

C. CBBA, Extensions, and Variations
The consensus-based bundle algorithm (CBBA) [39] is a robust and fully distributed multiassignment task allocation algorithm that employs a greedy auction strategy to enable agents to build a bundle of tasks sequentially. This task building phase is followed by a consensus procedure phase that resolves conflicting assignments. These two stages alternate until consensus has been reached by the team on all task assignments. For an analysis of CBBA's scalability, see [39]. Of the various extensions and modifications, [40] and [41] address MR task assignments and heterogeneous networks for the ST-MR-TA problem in which multiple robots may be required to service one task [13]. Choi et al. [40] addressed the case in which a task requires only one single agent, one or two agents, and exactly two agents of different type. Hunt et al. [41] proposed the consensus-based grouping algorithm that addresses the problem of multiagent multitask assignment with group and equipment-based dependencies, and which can accommodate any number of robots.
Ponda et al. [42] increased the overall efficiency of a task assignment by incorporating time windows of validity and fuel costs as part of the scoring scheme. The scoring scheme rewards agents for arriving at the optimal time for each task and for minimizing fuel consumption. Ponda et al. [42] also addressed real-time replanning for broken communication links, solving the problem of conflicting assignments when unconnected sub networks each have an agent assigned to the same task.
The consensus phase of CBBA requires synchronized communication between all agents. In a real-time dynamic environment, co-ordinating a large number of agents to communicate in sync may overburden the network and require artificially delaying the broadcast of new messages until all earlier messages have been received by the network of agents. Johnson et al. [43] extended CBBA with an asynchronous communication protocol to permit the agents to run the consensus phase of the algorithm on their own schedule. The asynchronous communication protocol also uses less bandwidth than CBBA. Ponda et al. [44] introduced the CBBA with Relays algorithm, which improves the range of the team of agents and ensures network connectivity in a dynamic environment by utilizing agents as communication relays.
Di Paola et al. [45], [46] proposed the heterogeneous robots consensus-based allocation (HRCA) algorithm that deals with multiassignments in heterogeneous networked teams. The algorithm consists of two outer stages. Stage 1 iterates two inner phases that closely resemble the two phases of CBBA. As opposed to CBBA, in Stage 1 of HRCA the maximum task bundle size is ignored. Stage 2 is performed only if there exist bundles exceeding the maximum limit. In this case, iterative task elimination based on least penalty is performed to resize the bundle. Binetti et al. [7], [47] developed the decentralized assignment algorithm based on CBBA and HRCA to solve the task allocation problem for assigning critical tasks for heterogeneous agents with limited capacity.
Cui et al. [48] introduced a game theory approach for task allocation. As with CBBA, the process of task allocation is split into two phases. A contract net protocol is used for the initial task allocation and a game theory approach is then used to reallocate the tasks to satisfy Pareto optimality. Smith et al. [49] extended CBBA to develop the cluster-formed CBBA to reduce the communication necessary for reaching consensus on task allocation. The communication reduction has a tradeoff of a drop in optimality of task allocation as complexity increases.

D. Performance Impact Algorithm
Whitbrook et al. [20] and Zhao et al. [22] proposed a concept called PI as an extension of CBBA. This method introduces PI, a value used by vehicles to prioritize task assignments. With PI, unlike CBBA, tasks included into a vehicle's task list can push back the execution times of later tasks in that same list, provided that all time constraints are satisfied. Likewise after a task is removed from a task list, the execution times of later tasks in the list may be shifted forward. With the PI algorithm, a vehicle does not release a task until it is reassigned elsewhere at a lower cost, i.e., once a task is assigned it does not become unassigned. PI considers not only the cost of a task assignment but also the impact of that task assignment on the cost of other assignments in the vehicle's task list. The authors demonstrate the effectiveness of PI through a simulated search and rescue scenario with a global objective to minimize the average start times of tasks with deadlines. The PI algorithm was shown empirically to solve time-critical task allocation problems that CBBA could not, and was shown to find a lower average start time compared with CBBA. Despite the improved performance, the PI algorithm still fails to solve some problems that are solvable due to converging to locally optimal but globally suboptimal solutions [20].
The PI algorithm is a distributed task allocation algorithm that runs simultaneously on each vehicle. Using the same two-phase architecture as CBBA, the PI algorithm iterates over a task inclusion phase and a consensus and conflict resolution phase. During the first phase vehicles locally and iteratively build themselves a task bundle; during the second phase vehicles share their assignment lists with neighboring vehicles and resolve conflicting assignments. Both phases repeatedly alternate until a global conflict-free task allocation is agreed upon by all vehicles. These main steps in an iteration of the algorithm are expressed with pseudocode in Algorithm 1.

Algorithm 1
...
6: converged ← Check Convergence
7: T ← T + 1
8: end while
The PI algorithm measures the local impact of a task assignment to the total cost of a vehicle's task list with the removal performance impact (RPI) and the inclusion performance impact (IPI) of a task assignment. The IPIs are computed during the task inclusion phase and determine which task to include next into a task list. The RPIs are computed at the end of the task inclusion phase and are communicated to networked vehicles during the communication and conflict resolution phase. RPIs determine which vehicle keeps a task in case of conflict.

E. PI Task Inclusion Phase
The IPI of a task $t_q$ in $a_i$, as defined for PI-MinAvg, is measured as the time cost of $t_q$ in $a_i$ plus the sum of increase in time costs of other tasks in $a_i$ that have been assigned previously. The increase in time costs occurs if later tasks need to be shifted to create enough time to service $t_q$. If no tasks have been assigned previously, the IPI of $t_q$ in $a_i$ is equal to its time cost, i.e., the time for $v_i$ to reach $t_q$. This is because the sum of increase in time costs of other tasks in $a_i$ is necessarily equal to 0. Let $a_i \oplus_l t_q$ be the insertion of task $t_q$ at position $l$ in $a_i$. The IPI of $t_q$ in $a_i$ is computed as
$$w_q^{\oplus}(a_i, t_q) = \min_{l = 1, \ldots, |a_i|+1} w_{q,l} \tag{4}$$
where
$$w_{q,l} = c_{i,l}(a_i \oplus_l t_q) + \sum_{z=l+1}^{|a_i|+1} \left[ c_{i,z}(a_i \oplus_l t_q) - c_{i,z-1}(a_i) \right]. \tag{5}$$
Equation (5) computes the IPI of $t_q$ at each position $l$ in $a_i$, where $c_{i,z}(a_i)$ denotes the time cost of the task at position $z$ in $v_i$'s task list. Equation (4) finds the smallest IPI and records it as $t_q$'s IPI in $a_i$. A list to store the IPIs of each task is kept on each vehicle and is defined as
$$\gamma_i^{\oplus} = [w_1^{\oplus}, \ldots, w_m^{\oplus}].$$
During this task inclusion phase, vehicles select tasks to include into their task lists until no more tasks can be added. This repeating process is depicted on lines 1-21 in Algorithm 2. Before including a task, the algorithm computes the IPIs of all candidate tasks $t_q$ according to (4) and (5), where candidate tasks are those compatible with $v_i$'s capabilities and not already in $a_i$. The computation of IPIs is depicted on lines 3-12 in Algorithm 2.

Algorithm 2
...
3: for each task q do
4:   if task q is a candidate then
5:     for each insertion position l in task list do
6:       if $a_i \oplus_l t_q$ is feasible then
7:         Compute $w_{q,l}$ according to (5)
8:       end if
9:     end for
10:    Compute $w_q^{\oplus}$ and position l according to (4)
11:  end if
12: end for
13: Compute g from (6)
14: if g > 0 then
15:   Insert task q yielding g in position l of task list
16:   Update vehicle list $\beta_q = i$
17:   Update time costs of task list
18: else
19:   break
20: end if
21: end while
22: Compute $\gamma_i$ (only RPIs in task list will be affected)
When there are already tasks in $a_i$ that have been assigned previously, it is necessary to determine which position in the task list yields the lowest IPI, i.e., whether it is best to include $t_q$ at the start of $a_i$, at the end, or in a position between tasks. Thus the IPI of $t_q$ is computed in each position $l$ (lines 5-9), and the position $l$ in which the IPI is lowest is the optimal position (line 10).
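The search over insertion positions can be sketched as follows. This is a minimal illustration, assuming vehicles and tasks lie on a line, unit speed, and tasks given as hypothetical `(position, duration, latest_start)` tuples; it is not the authors' implementation.

```python
def time_costs(start, tasks):
    """Arrival time at each task; unit speed on a line (assumption)."""
    costs, clock, here = [], 0.0, start
    for pos, duration, _ in tasks:
        clock += abs(pos - here)
        costs.append(clock)
        clock += duration
        here = pos
    return costs


def feasible(start, tasks):
    """True if every task is reached by its latest start time."""
    return all(c <= latest
               for c, (_, _, latest) in zip(time_costs(start, tasks), tasks))


def ipi_min_avg(start, tasks, new_task):
    """Return (IPI, position) for the best feasible insertion, else (None, None)."""
    old = time_costs(start, tasks)
    best = (None, None)
    for l in range(len(tasks) + 1):
        trial = tasks[:l] + [new_task] + tasks[l:]
        if not feasible(start, trial):
            continue
        new = time_costs(start, trial)
        # Own time cost plus the delay imposed on every later task.
        w = new[l] + sum(new[z + 1] - old[z] for z in range(l, len(tasks)))
        if best[0] is None or w < best[0]:
            best = (w, l)
    return best


current = [(10.0, 1.0, 100.0)]   # one task already in the list
candidate = (4.0, 2.0, 100.0)    # task being considered for inclusion
print(ipi_min_avg(0.0, current, candidate))  # (6.0, 0)
```

Inserting the candidate first costs 4 for its own arrival plus a delay of 2 on the existing task, giving an IPI of 6; inserting it last would cost 17, so position 0 is selected.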
After the IPIs of all candidate tasks have been computed, $v_i$ selects for inclusion the task whose IPI can improve upon that task's current RPI the most. At this stage candidate tasks' RPIs will either have their initial value if unassigned, or an updated value received during the communication and conflict resolution phase. RPIs for all tasks are initialized to their highest permissible cost such that RPIs of tasks must be lower than this value once they are assigned. An IPI of $t_q$ in $a_i$ lower than $t_q$'s RPI in another vehicle's task list $a_j$ indicates that the global cost can be reduced if $t_q$ is reallocated to $v_i$. The RPI of a task $t_q$ is referred to formally as $w_q$ and each vehicle stores the vector $\gamma_i = [w_1, \ldots, w_m]$. A task $t_q$ assigned to $v_j$ with an RPI greater than the IPI of $t_q$ in $a_i$ is written formally as $w_q(a_j, t_q) > w_q^{\oplus}(a_i, t_q)$. Multiple IPIs may improve on the current RPIs; as such, $v_i$ selects for inclusion the task that reduces the global cost most. The maximum difference between the RPIs of all tasks and the IPIs of all tasks is computed as
$$g = \max_{q = 1, \ldots, m} \left[ w_q - w_q^{\oplus}(a_i, t_q) \right]. \tag{6}$$
Line 13 in Algorithm 2 computes $g$ according to (6). If $g > 0$ (line 14), the task corresponding to $g$ is included into the vehicle's ordered task list, leading to the maximum reduction to the global cost. If $g \le 0$, IPIs of all tasks are greater than or equal to the current RPIs, meaning that the current assignments cannot be improved upon, or that time constraints of candidate tasks cannot be met. In this case the task inclusion process ends (line 19).
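The selection rule can be sketched in a few lines. The sketch assumes that IPIs have already been computed for the feasible candidate tasks only; the function name `select_task`, the dictionaries, and the numerical values are hypothetical.

```python
def select_task(rpi, ipi):
    """Pick the candidate whose IPI undercuts its current RPI the most.

    rpi: current RPI of every task known to the vehicle.
    ipi: IPI of each feasible candidate task in this vehicle's list.
    Returns (task, g); task is None when no candidate gives g > 0.
    """
    best_task, g = None, 0.0
    for task, w_in in ipi.items():
        gain = rpi[task] - w_in
        if gain > g:
            best_task, g = task, gain
    return best_task, g


U = 1000.0  # illustrative initial RPI of an unassigned task
rpi = {"t1": 9.0, "t2": U, "t3": 4.0}
ipi = {"t1": 6.0, "t2": 12.0, "t3": 7.0}
print(select_task(rpi, ipi))  # ('t2', 988.0)
```

Because unassigned tasks carry the highest RPI, they dominate the selection, which matches the prioritization of unassigned tasks described in the text.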
RPIs are updated at the end of the task inclusion phase (line 22). While RPIs are constant for unassigned tasks, once assigned, the RPI is measured as $t_k$'s time cost in $a_i$ plus the sum of the changes in time cost of remaining tasks in $a_i$ before and after the removal of $t_k$. By removing $t_k$ from $a_i$, $v_i$ may be able to execute its remaining task assignments earlier. The time costs of tasks earlier in the task list than $t_k$ are not affected by the removal of $t_k$. The RPI of a task $t_k$ in $a_i$ is formally written as
$$w_k(a_i, t_k) = c_{i,b}(a_i) + \sum_{z=b+1}^{|a_i|} \left[ c_{i,z}(a_i) - c_{i,z-1}(a_i \ominus t_k) \right] \tag{7}$$
where $b$ is the position of task $t_k$ in $v_i$'s task list, $c_{i,z}(a_i)$ denotes the time cost of the task at position $z$ in $v_i$'s task list, and $a_i \ominus t_k$ denotes $a_i$ with $t_k$ removed. When a global consensus is reached, all vehicles have an identical copy of $\gamma$.
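A minimal sketch of the RPI computation follows, under the same illustrative assumptions as before: positions on a line, unit speed, and tasks given as hypothetical `(position, duration)` tuples.

```python
def time_costs(start, tasks):
    """Arrival time at each task; unit speed on a line (assumption)."""
    costs, clock, here = [], 0.0, start
    for pos, duration in tasks:
        clock += abs(pos - here)
        costs.append(clock)
        clock += duration
        here = pos
    return costs


def rpi_min_avg(start, tasks, b):
    """RPI of the task at index b (0-based) of the ordered task list.

    Equals the task's own time cost plus the time that every later task
    gains when the task is removed from the list.
    """
    with_task = time_costs(start, tasks)
    without = time_costs(start, tasks[:b] + tasks[b + 1:])
    saved = sum(with_task[z] - without[z - 1] for z in range(b + 1, len(tasks)))
    return with_task[b] + saved


schedule = [(4.0, 2.0), (10.0, 1.0)]
print(rpi_min_avg(0.0, schedule, 0))  # 6.0
print(rpi_min_avg(0.0, schedule, 1))  # 12.0
```

Removing the first task lets the second start at time 10 instead of 12, so the first task's RPI is 4 + 2 = 6. The last task in a list has no later tasks to delay, so its RPI is simply its time cost.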

F. PI Communication and Conflict Resolution Phase
Once the task inclusion phase is complete, the RPI list and an $m$-sized vehicle ID list that keeps track of which vehicle is assigned to which task are broadcast to neighboring vehicles. The vehicle ID list is necessary for consensus and is defined as $\beta_i = [\beta_1, \ldots, \beta_m]$. Neighboring vehicles are those between which a communication link exists according to the network topology. This topology may be dynamic and depend on, e.g., communication range and physical distance between two local vehicles. The vehicles communicate once per algorithmic iteration and this paper does not consider a communication cost. As two or more vehicles may be assigned the same task, the consensus procedure introduced in [39] is used to resolve these conflicting assignments. A lower RPI indicates a better assignment; therefore, vehicles with a higher RPI for a conflicting assignment release the task. RPIs and associated vehicle IDs are updated during consensus.
The task inclusion and conflict resolution phases repeat until no inclusions or removals can be made. At this point, the system is deemed to have converged and the task allocation procedure ends.
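The rule that the lower RPI wins a conflict can be illustrated with a simplified sketch. It merges one received message into a vehicle's local lists and deliberately omits the full consensus rules of [39]; the function name `resolve`, the list layout, and the values are assumptions made for the example.

```python
def resolve(local_rpi, local_owner, recv_rpi, recv_owner):
    """Merge a neighbor's RPI and vehicle ID lists into the local ones.

    For each task the pair (RPI, vehicle ID) with the lower RPI is kept,
    so the vehicle holding the higher RPI releases the task.
    """
    rpi, owner = list(local_rpi), list(local_owner)
    for q, (w, vehicle) in enumerate(zip(recv_rpi, recv_owner)):
        if w < rpi[q]:
            rpi[q], owner[q] = w, vehicle
    return rpi, owner


U = 1000.0  # illustrative RPI of an unassigned task
# v1 holds t1 with RPI 5; v2 also claims t1 (RPI 8) and holds t2 (RPI 3).
merged = resolve([5.0, U, U], [1, None, None], [8.0, 3.0, U], [2, 2, None])
print(merged)  # ([5.0, 3.0, 1000.0], [1, 2, None])
```

After the merge, t1 stays with vehicle 1, t2 is recorded as belonging to vehicle 2, and the third task remains unassigned.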

III. PERFORMANCE IMPACT FOR MAXIMIZING TASK ASSIGNMENTS
Simulated experiments have shown that the PI algorithm both allocates more tasks and optimizes average waiting time better than CBBA in time critical scenarios with a low task-to-vehicle ratio [20], [22]. However, preliminary experiments showed that when there is a higher ratio of tasks to vehicles, PI can fail to allocate all tasks even though it is possible to do so. Due in part to their scoring strategies, the baseline CBBA and PI do not reassign tasks when this is necessary in order to assign additional tasks. In the search and rescue scenario the safety and rescue of survivors is a high priority; a poorer quality of solution results in fewer survivors being rescued than is possible with the available resources.

Fig. 1. Solid lines indicate tasks assigned after reaching consensus. (a) Each task assignment is labeled with its PI-MinAvg IPI or RPI. With PI-MinAvg $t_1$ is assigned to $v_1$, $t_2$ is assigned to $v_2$, and $t_3$ is left unassigned. $v_2$ may also include $t_1$ if $v_2$ has not yet received $v_1$'s RPI list. In this case $v_2$ releases $t_1$ during the conflict resolution phase due to a higher RPI than $v_1$. (b) PI-MaxAss reassigns tasks starting from the PI-MinAvg solution and creates a time slot for $t_3$. Each task assignment is labeled with its IPI and RPI for maximizing the number of task assignments.
The new version of PI, that maximizes the number of assignments is referred to as PI-MaxAss, and is the main contribution of this paper. An early version of PI-MaxAss was presented in [21] and is extended here to include better cost scoring, convergence guarantee, and extended simulations including scenarios with battery limits only, and scenarios with task deadlines and battery limits.
Starting from a suboptimal assignment in which additional tasks cannot be directly included without violating time constraints, the extension PI-MaxAss presented in this paper is able to reassign tasks to increase the total number of allocated tasks simply through a change in the computation of IPIs and RPIs. The idea introduced in this paper is to attribute a high cost (RPI) to an assigned task when the release of this task can permit an additional task to be inserted within the free time created. An assignment is considered optimal and without cost if the release of any task does not permit another task to be assigned within the free time created. Likewise, a task's IPI is set to be without cost if it can be included into a task list and satisfy time constraints. During the conflict resolution phase, conflicts resolve in favor of vehicles offering the lowest RPI. Vehicles that can create a time slot for candidate tasks through the release of an assigned task therefore release that task during a conflict. The result is that tasks are reassigned and feasible time slots are created for unassigned tasks.
To illustrate the limitation of previous methods and the proposed solution, consider the simple scenario shown in Fig. 1(a) and the associated schedule on a timeline in Fig. 2(a). With the PI algorithm, the vehicles include tasks into their lists starting with the lowest IPI. With PI-MinAvg, $v_1$ first includes $t_1$ and $v_2$ first includes $t_2$ into their task lists. Once included, $t_1$ cannot be released from $v_1$ unless $v_2$ includes $t_1$ with a lower RPI. Likewise, $t_2$ cannot be released from $v_2$ unless $v_1$ includes $t_2$ with a lower RPI. For $t_3$ to be serviced before its deadline, $v_1$ must go to $t_3$ directly. However, $v_1$ cannot service both $t_1$ and $t_3$ while meeting both of their time constraints. Task $t_1$ is not reassigned to $v_2$ because the RPI of $t_1$ is lower in $v_1$'s task list than in $v_2$'s task list. Therefore, $t_3$ is not assigned. The suboptimal task allocation is due to the minimization of waiting time performed by PI-MinAvg. The novelty in PI-MaxAss is that the cost of $t_1$ in $v_1$'s task list is higher than in $v_2$'s task list, causing $t_1$ to be reassigned to $v_2$. This creates a time slot in $v_1$'s schedule for $t_3$. Therefore, PI-MaxAss achieves the optimal allocation illustrated in Figs. 1(b) and 2(b). Although the waiting time for $t_1$ and $t_2$ has increased in Fig. 2(b), this reassignment has enabled an additional task to be assigned.

Fig. 2. Task schedules for $v_1$ and $v_2$. A travel time is assumed between the vehicles' initial locations and between different task locations, based on the distance and speed that they can travel. A fixed task duration is also assumed. A task must be started before the deadline in order to rescue that survivor, but may end after the deadline. $v_1$ is the only vehicle close enough to reach $t_3$ in time. (a) $t_1$ and $t_2$ are optimized to minimize waiting time but $t_3$ is unallocated. $v_1$ cannot feasibly include $t_3$ into its schedule given $t_1$. (b) If $t_1$ is reassigned from $v_1$ to $v_2$, this creates the time slot for $v_1$ to include the unallocated task $t_3$.
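The two-vehicle example can be reproduced numerically. The following is a minimal sketch, not the authors' implementation: it assumes a 1-D world, a common speed, and hypothetical positions, deadlines, and durations chosen only so that the schedule behaves as described for Fig. 2. The names `Task`, `start_times`, and `can_insert` are illustrative.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Task:
    name: str
    pos: float              # 1-D location (hypothetical simplification)
    deadline: float         # latest permitted start time
    duration: float = 10.0  # fixed task duration (hypothetical value)


def start_times(origin: float, speed: float, tasks: List[Task]) -> Optional[List[float]]:
    """Start time of each task when serviced in order, or None if a deadline is missed."""
    clock, pos, starts = 0.0, origin, []
    for task in tasks:
        clock += abs(task.pos - pos) / speed  # travel to the task
        if clock > task.deadline:             # a task must start by its deadline
            return None
        starts.append(clock)
        clock += task.duration                # it may finish after the deadline
        pos = task.pos
    return starts


def can_insert(origin: float, speed: float, tasks: List[Task], new: Task) -> bool:
    """True if `new` fits at some position of `tasks` without missing any deadline."""
    return any(
        start_times(origin, speed, tasks[:l] + [new] + tasks[l:]) is not None
        for l in range(len(tasks) + 1)
    )


# Hypothetical numbers, not taken from the paper.
t1 = Task("t1", pos=50.0, deadline=70.0)
t2 = Task("t2", pos=90.0, deadline=30.0)
t3 = Task("t3", pos=-25.0, deadline=30.0)
V1, V2, SPEED = 0.0, 100.0, 1.0  # vehicle start positions and common speed

print(can_insert(V1, SPEED, [t1], t3))   # v1 keeps t1: is there a slot for t3?
print(can_insert(V1, SPEED, [], t3))     # v1 releases t1: does t3 fit?
print(start_times(V2, SPEED, [t2, t1]))  # v2 services t2 and then t1
```

With these numbers, $t_1$ starts later on $v_2$ than it would on $v_1$, which mirrors the increase in waiting time that the text accepts in exchange for allocating $t_3$.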

Algorithm 3 Computing RPI-MaxAss for Tasks in $v_i$'s Task List
1: Set RPI of tasks in $a_i$ to 0: $\gamma_{i,k} \leftarrow 0, \forall t_k \in a_i$
2: Identify candidate tasks: $\psi_i$
3: for each task $t_k$ in $a_i$ do
4:   $a_i^k = a_i \ominus t_k$
5:   Update times $c_{i,z}(a_i^k)$ for tasks after $t_k$
6:   for each task $t_q$ in $\psi_i$ do
7:     if $\gamma_{i,q} - r > \gamma_{i,k}$ then
8:       for each position $l$ in $a_i^k$ do
9:         if $a_i^k \oplus_l t_q$ is feasible then
10:          $\gamma_{i,k} \leftarrow \gamma_{i,q} - r$

A. Formal Description
With PI-MaxAss, unallocated tasks are initially set to have a fixed highest RPI-MaxAss, a constant defined as $U$, such that if $t_q$ is unassigned then $w_q = U$. The RPIs of assigned tasks $t_k$ are initially set to 0, such that $w_k = 0$.
The steps of PI-MaxAss follow the two phases depicted in Algorithm 1. During the task inclusion phase shown in Algorithm 2, as with PI-MinAvg, the PI-MaxAss candidate tasks for inclusion into $a_i$ are those compatible with $v_i$'s capabilities, not already in $a_i$, and with an RPI-MaxAss greater than 0. The candidate tasks for inclusion into $a_i$ are formally defined in (8). The IPI-MaxAss of $t_q$ in $a_i$ is formally defined in (9). In other words, the IPI-MaxAss of the candidate task $t_q$ is set to 0 if there exists a position $l$ in $a_i$ where the task $t_q$ can be inserted with all time constraints met. On line 10 in Algorithm 2, IPI-MaxAss is recorded in place of IPI-MinAvg such that $w_q^{\oplus} = 0$ if the condition on line 6 returns true for at least one position $l$. The optimal position $l$ is computed as it is for IPI-MinAvg, according to (4) and (5).
Lines 13-21 in Algorithm 2 remain the same for PI-MaxAss. As the RPI-MaxAss of assigned tasks were initialized to 0, only unassigned tasks are candidates for inclusion in the first round of the task inclusion phase. RPI-MaxAss is computed on line 22 in the place of RPI-MinAvg. The steps for computing RPI-MaxAss are shown in Algorithm 3.
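The candidate filter and the IPI-MaxAss rule described above can be sketched as follows. This is an illustrative sketch under assumptions, not the paper's code: `is_feasible` is a hypothetical callable standing in for the time-constraint check, compatibility is modeled as a simple set, and returning `None` for a task with no feasible position is an assumption, since the text only states the feasible case.

```python
from typing import Callable, Dict, List, Optional, Set


def inclusion_candidates(all_tasks: List[str], task_list: List[str],
                         compatible: Set[str], rpi: Dict[str, float]) -> List[str]:
    """Tasks the vehicle may try to include: compatible, not yet held, RPI-MaxAss > 0."""
    return [q for q in all_tasks
            if q in compatible and q not in task_list and rpi[q] > 0]


def ipi_maxass(task_list: List[str], q: str,
               is_feasible: Callable[[List[str]], bool]) -> Optional[float]:
    """0 if q fits at some position l of task_list; None (assumed) if it fits nowhere."""
    for l in range(len(task_list) + 1):
        if is_feasible(task_list[:l] + [q] + task_list[l:]):
            return 0.0
    return None


# First round: assigned tasks start with RPI 0, unassigned tasks with U = 100.
rpi = {"t1": 0.0, "t2": 100.0, "t3": 100.0}
fits_two = lambda schedule: len(schedule) <= 2  # toy feasibility rule (hypothetical)

print(inclusion_candidates(["t1", "t2", "t3"], ["t1"], {"t1", "t2"}, rpi))
print(ipi_maxass(["t1"], "t2", fits_two))
```

In this toy run only the unassigned, compatible task is a candidate, which matches the observation that only unassigned tasks are candidates in the first round.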
Candidate tasks in the computation of RPI-MaxAss follow the same constraints as the candidates in (8), with the added constraint that the candidate task's RPI-MaxAss is greater than $\delta$. This constraint is used to limit the number of reassignments permissible to allocate an additional task (see Section III-B). Candidate tasks used in the computation of RPI-MaxAss for a task $t_k$ are formally defined in (10). The identification of candidate tasks occurs on line 2 in Algorithm 3.

To compute the RPI-MaxAss of a task $t_k$ in $a_i$, first, a temporary task list $a_i^k$ is created that is equivalent to $a_i$ with $t_k$ removed, as formally defined in (11). The creation of $a_i^k$ occurs on line 4 in Algorithm 3. Next, a candidate task $t_q$ is inserted into each position $l$ in $a_i^k$ to determine whether there exists a position in which $t_q$ can be inserted with all time constraints met. If such a position $l$ exists, then $t_k$ can feasibly be replaced by $t_q$ in $a_i$ and the RPI-MaxAss of $t_k$ is computed as the RPI-MaxAss of $t_q$ reduced by $r$. This computation is repeated for each task $t_q$ in $\psi_i$. The list of tasks that can replace $t_k$ in $a_i$ while respecting time constraints is formally defined in (12). If a task $t_k$ in $a_i$ can be replaced by two or more candidate tasks $t_q$ with different RPI-MaxAss values, the highest resulting value is recorded. The RPI-MaxAss of a task is formally defined in (13).

The condition on line 7 in Algorithm 3 ensures that the feasibility of inserting $t_q$ into $a_i^k$ is not computed if the resulting RPI-MaxAss of $t_k$ would not be higher than its current value. This condition reduces unnecessary computation and satisfies finding the maximum RPI-MaxAss according to (13). The condition on line 9 checks the feasibility of inserting $t_q$ in position $l$ in $a_i^k$ so that the computation of RPI-MaxAss on line 10 is performed only with candidate tasks that satisfy (12). Fig. 3 illustrates how the computation of a decreasing RPI-MaxAss allows for multiple reassignments to create a time slot for an unassigned task, and signposts the path with the fewest reassignments. Fewer reassignments minimize the time to reach consensus and better maintain the original solution's optimization for minimizing average waiting time.
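The RPI-MaxAss computation described above can be sketched in Python. This is a hedged illustration rather than the authors' implementation: `is_feasible` is a hypothetical stand-in for the time-constraint check, the candidate list is assumed to be already filtered, and the `break` after the first feasible position is an added shortcut that does not change the result.

```python
from typing import Callable, Dict, List


def rpi_maxass(task_list: List[str], candidates: List[str], rpi: Dict[str, float],
               r: float, is_feasible: Callable[[List[str]], bool]) -> Dict[str, float]:
    """RPI-MaxAss of every task in task_list, following the steps of Algorithm 3."""
    new_rpi = {k: 0.0 for k in task_list}               # line 1: initialize to 0
    for k in task_list:                                 # line 3
        reduced = [t for t in task_list if t != k]      # line 4: list without t_k
        for q in candidates:                            # line 6
            if rpi[q] - r > new_rpi[k]:                 # line 7: only if it raises the RPI
                for l in range(len(reduced) + 1):       # line 8
                    if is_feasible(reduced[:l] + [q] + reduced[l:]):  # line 9
                        new_rpi[k] = rpi[q] - r         # line 10
                        break                           # assumed shortcut: one position suffices
    return new_rpi


# Toy feasibility rule (hypothetical): each vehicle has room for one task only.
one_task = lambda schedule: len(schedule) <= 1

# Values follow the Fig. 3 narrative: U = 100, r = 10.
print(rpi_maxass(["t3"], ["t4"], {"t4": 100.0}, 10.0, one_task))  # t3 replaceable by t4
print(rpi_maxass(["t2"], ["t3"], {"t3": 90.0}, 10.0, one_task))   # t2 replaceable by t3
```

Because each replacement subtracts $r$, a larger RPI-MaxAss marks a task that is fewer reassignments away from an unallocated task.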

B. Swap Distance
In a time-critical scenario such as search and rescue, it may be necessary to limit the time it takes for the distributed system to converge to a task allocation. The time to converge partly depends on the number of iterations until consensus. Depending on the network topology, propagating new assignments across the network may require multiple iterations, affecting the total time to consensus. Therefore, with PI-MaxAss, limiting the number of reassignments permissible to assign an unassigned task is required. A maximum number of reassignments, expressed as the "Swap Distance" $SD$, is defined. $SD$ is a new parameter, not present in CBBA or PI-MinAvg, introduced in PI-MaxAss to limit the maximum number of reassignments. As defined by (10), a candidate task in $\psi_i$ must have an RPI-MaxAss greater than $\delta$, which limits the number of reassignments to $SD$; $\delta$ is defined as

$$\delta = U - SD \cdot r.$$

In Fig. 3, $U = 100$ and $r = 10$. If $SD = 0$ then $\delta = 100$, resulting in no candidates for the computation of RPI, according to (10). As a consequence, only unassigned tasks have an RPI greater than 0 and can therefore be included in the task inclusion phase according to (8). If $SD = 1$ then $\delta = 90$ and one reassignment is permissible for the inclusion of an unassigned task. In Fig. 3, the path that requires two reassignments, in which the RPI-MaxAss of $t_2$ is 80, is not permissible when $SD = 1$. When $SD = 1$, $t_3$ does not satisfy the constraints to be in $\psi_2$ because its RPI-MaxAss is not greater than $\delta$; therefore the RPI-MaxAss of $t_2$ remains 0. The path with two reassignments is only possible with $SD = 2$ (or higher). $SD$ therefore restricts the tasks eligible to be candidates so that the number of reassignments is less than or equal to $SD$. Guidance on setting $SD$ is discussed in Section V.

Fig. 3. RPI-MaxAss minimizes the number of changes to existing task assignments to create a time slot for an unallocated task. In this scenario it is assumed that $v_3$ is the only vehicle near enough to $t_4$ to service it in time. $t_4$ is unallocated and takes RPI-MaxAss $= U = 100$. $r$ is set as 10. $t_3$ can be replaced by $t_4$ according to (12); therefore $t_3$'s RPI-MaxAss is $100 - 10 = 90$ according to (13). $t_2$ can be replaced by $t_3$; therefore $t_2$'s RPI-MaxAss is $90 - 10 = 80$. During the task inclusion phase, $v_1$ can include $t_3$ or $t_2$ (without removing $t_1$); therefore the IPI-MaxAss values of $t_3$ and $t_2$ are 0 according to (9). Given (6), $v_1$ selects $t_3$ for inclusion as $t_3$ yields the greatest difference between RPI and IPI. During the communication and conflict resolution phase, $v_3$ releases $t_3$ due to having a higher RPI-MaxAss for $t_3$ than $v_1$. During the task inclusion phase, $v_3$ includes $t_4$. The decreasing RPI-MaxAss ensures that the minimal number of reassignments is selected when different options are available for the inclusion of an unassigned task.
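The effect of the swap distance on candidate eligibility can be checked with a few lines of Python. The sketch assumes the threshold takes the form $\delta = U - SD \cdot r$, which reproduces the values quoted for Fig. 3 ($\delta = 100$ for $SD = 0$ and $\delta = 90$ for $SD = 1$); the function names are illustrative.

```python
def swap_threshold(U: float, r: float, SD: int) -> float:
    """Threshold delta, assuming delta = U - SD * r (consistent with the Fig. 3 values)."""
    return U - SD * r


def is_rpi_candidate(rpi_value: float, U: float, r: float, SD: int) -> bool:
    """A task qualifies as a candidate for the RPI computation only if its RPI exceeds delta."""
    return rpi_value > swap_threshold(U, r, SD)


U, r = 100.0, 10.0
for SD in (0, 1, 2):
    print(SD, swap_threshold(U, r, SD),
          is_rpi_candidate(100.0, U, r, SD),  # unallocated task such as t4
          is_rpi_candidate(90.0, U, r, SD))   # task one swap away such as t3
```

A task whose RPI-MaxAss equals $U - n \cdot r$ lies $n$ reassignments from an unallocated task, so the strict inequality admits chains of at most $SD$ reassignments.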

C. Convergence
Preliminary experiments running PI showed that two or more vehicles occasionally get caught in an infinite cycle exchanging the same tasks. In order to avoid infinite cycles and to guarantee convergence, the proposed solution is to limit the number of times that a vehicle can remove the same task from its list before it no longer attempts to include it. A maximum limit on removals $\Upsilon$, where $\Upsilon \in \mathbb{Z}^+$, can be set. This precaution may prevent tasks that are being repeatedly exchanged from being allocated optimally; however, it ensures that the system can converge. Each vehicle $v_i$ maintains a vector of counters that stores the number of times each task has been removed from its task list. During the conflict resolution phase, when a task $t_k$ is removed from $v_i$'s task list, the counter for $t_k$ is incremented by one. During the task inclusion phase, a task $t_k$ is considered a candidate in $\psi_i$ for inclusion only if its counter is less than $\Upsilon$.
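The removal limit can be kept as a small per-vehicle bookkeeping object. The sketch below is one possible arrangement under assumptions; the class name `RemovalLimiter` and its methods are hypothetical and not part of the paper.

```python
from typing import Dict


class RemovalLimiter:
    """Per-vehicle removal counters with an upper limit on removals per task."""

    def __init__(self, max_removals: int) -> None:
        if max_removals < 1:
            raise ValueError("the removal limit must be a positive integer")
        self.max_removals = max_removals
        self.removals: Dict[str, int] = {}

    def record_removal(self, task: str) -> None:
        """Call during conflict resolution when the vehicle drops `task`."""
        self.removals[task] = self.removals.get(task, 0) + 1

    def may_include(self, task: str) -> bool:
        """Call during task inclusion: the task stays a candidate below the limit."""
        return self.removals.get(task, 0) < self.max_removals


limiter = RemovalLimiter(max_removals=2)
limiter.record_removal("t1")
limiter.record_removal("t1")
print(limiter.may_include("t1"), limiter.may_include("t2"))
```

Once a task reaches the limit on a given vehicle, that vehicle stops bidding for it, which breaks any exchange cycle involving that task.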

D. Complexity
To assess the computational complexity of running PI-MaxAss on one vehicle, the method used in [22] is followed. In [22], the computational complexity of PI-MinAvg is determined to be polynomial. The complexity is dominated by the computation of IPI-MinAvg during the task inclusion phase, and the expression given in [22] uses the following quantities. $|a_i|$ represents the cardinality of the task list $a_i$. $m_i$ is the capacity of vehicle $v_i$. A maximum of $m_i - |a_i|$ tasks can be added into a vehicle's task list during each iteration of the algorithm. $\sigma$ denotes the complexity of computing the time cost of a task. $\vartheta_y$ denotes the number of tasks that are not yet in the task list and meet the compatibility constraints. $\vartheta_y$ is equivalent to the cardinality of candidate tasks $|\psi_i|$ as defined in this paper. In the experiments conducted in this paper, no hard limit was imposed on the number of candidate tasks. However, such a parameter could be introduced to limit the computational cost of the task inclusion phase.

The complexity of PI-MaxAss is dominated by the computation of each task's RPI-MaxAss in vehicle $v_i$'s task list, as shown in Algorithm 3. The first step in the outer loop (for each task in vehicle $v_i$'s task list) is to remove a task and adjust the times of the remaining tasks in the temporary task list $a_i^k$; the complexity is $|a_i^k|(|a_i^k| + 1)\sigma/2$. Within the inner loop, the task times of each task starting from the position of the included task are computed: $|a_i||\psi_i|(|a_i^k| + 1)((|a_i^k| + 1) + 1)\sigma/2$. Since $|a_i^k| + 1 = |a_i|$, altogether this equates to $|a_i^k|(|a_i^k| + 1)\sigma/2 + |a_i||\psi_i||a_i|(|a_i| + 1)\sigma/2$, which is polynomial in $|a_i|$ and linear in $|\psi_i|$. The RPI-MaxAss computation has a higher complexity than the RPI-MinAvg computation, but is equivalent to the complexity of computing IPI-MinAvg.
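For reference, the combined expression can be simplified step by step. The derivation below is a worked sketch that uses only the quantities quoted above together with $|a_i^k| = |a_i| - 1$; the symbol $C$ is introduced here for the total cost, and the final form is not necessarily the one printed in the paper.

```latex
\begin{align*}
C &= \frac{|a_i^k|\,(|a_i^k|+1)\,\sigma}{2}
   + \frac{|a_i|\,|\psi_i|\,|a_i|\,(|a_i|+1)\,\sigma}{2} \\
  &= \frac{(|a_i|-1)\,|a_i|\,\sigma}{2}
   + \frac{|\psi_i|\,|a_i|^{2}\,(|a_i|+1)\,\sigma}{2} \\
  &= \frac{|a_i|\,\sigma}{2}\Bigl[(|a_i|-1) + |\psi_i|\,|a_i|\,(|a_i|+1)\Bigr] \\
  &= O\bigl(|\psi_i|\,|a_i|^{3}\,\sigma\bigr) \quad \text{for } |\psi_i| \ge 1 .
\end{align*}
```

The leading term is cubic in the task list length and linear in the number of candidate tasks, which is why capping $|\psi_i|$ would bound the cost of each iteration.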

IV. NUMERICAL RESULTS
This section presents the results of numerical simulations conducted to test the performance of the proposed PI-MaxAss compared with the performance of PI-MinAvg and CBBA when maximizing allocated tasks in scenarios with time constraints. CBBA is an established benchmark for comparison in distributed task allocation problems and therefore provides a useful metric for general comparisons with similar algorithms. Thus, the evaluation of the proposed method is performed by comparison with CBBA using a range of parameter settings.

A. Scenario and Simulation Setup
To test the robustness of the proposed approach, the same types of scenarios as in [20] and [22] were used. These include scenarios with a variety of different parameters including task and vehicle numbers, and network topologies. Moreover, the parameter settings are extended in this paper to include a more challenging high task-to-vehicle ratio, and to include fuel constraints on vehicles. Preliminary experiments revealed that changing other parameter settings such as the starting positions of the vehicles, e.g., all vehicles starting from the same position, did not significantly affect the number of task allocations. The setup uses a rescue team equally split into two vehicle types with different functions. One vehicle type provides medicine, the other provides food. All tasks are considered to have equal priority to facilitate a clearer analysis of the task allocation maximization process. However, a range of priorities could be introduced in future extensions of the algorithm through an ordering of candidate tasks. The scenario specification, summarized in Table II, is as follows: the vehicles' speeds are assumed to be constant and are set to 30 and 50 m/s, respectively. The survivors are likewise equally split into those requiring food and those requiring medicine. The medicine tasks last for a duration of 300 s and the food tasks last 350 s. The deadlines for starting each rescue are uniformly distributed on a timeline between 0 and 2000 s. The mission takes place in a 3-D space spanning 10 000 m × 10 000 m × 1000 m. The tasks are randomly placed in a 3-D space, and vehicles on the 2-D ground space, with coordinates drawn from uniform distributions. The battery limit of each vehicle is set randomly between 1000 and 2000 s. Given the random initialization of task and vehicle locations and deadlines, it is sometimes impossible for some tasks to be started by any vehicle before their deadline. 
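A scenario of this kind can be generated as follows. This is a minimal sketch under stated assumptions, not the simulator used in the paper: the mapping of the 30 and 50 m/s speeds to the medicine and food vehicle types is assumed from the order in which they are listed, the alternating split of types is one way to obtain equal halves, and all names (`Vehicle`, `Task`, `make_scenario`, `travel_time`) are illustrative.

```python
import math
import random
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, float, float]
KINDS = ("medicine", "food")
SPEED = {"medicine": 30.0, "food": 50.0}       # m/s; type-to-speed mapping is an assumption
DURATION = {"medicine": 300.0, "food": 350.0}  # s, as stated in the scenario


@dataclass
class Vehicle:
    kind: str
    pos: Point
    speed: float
    battery_limit: float  # s


@dataclass
class Task:
    kind: str
    pos: Point
    duration: float  # s
    deadline: float  # latest start time, s


def make_scenario(n_vehicles: int, n_tasks: int,
                  seed: Optional[int] = None) -> Tuple[List[Vehicle], List[Task]]:
    """Draw one random scenario: vehicles on the ground plane, tasks in the 3-D volume."""
    rng = random.Random(seed)
    vehicles = []
    for i in range(n_vehicles):
        kind = KINDS[i % 2]  # alternate types for an equal split
        pos = (rng.uniform(0, 10_000), rng.uniform(0, 10_000), 0.0)
        vehicles.append(Vehicle(kind, pos, SPEED[kind], rng.uniform(1000, 2000)))
    tasks = []
    for j in range(n_tasks):
        kind = KINDS[j % 2]
        pos = (rng.uniform(0, 10_000), rng.uniform(0, 10_000), rng.uniform(0, 1000))
        tasks.append(Task(kind, pos, DURATION[kind], rng.uniform(0, 2000)))
    return vehicles, tasks


def travel_time(a: Point, b: Point, speed: float) -> float:
    """Straight-line travel time between two points at constant speed."""
    return math.dist(a, b) / speed


vehicles, tasks = make_scenario(n_vehicles=6, n_tasks=12, seed=1)
print(len(vehicles), len(tasks))
```

Because deadlines and positions are drawn independently, some tasks may be unreachable by any compatible vehicle before their deadline, as noted in the text.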
In these simulations, all task information is available to all vehicles up front. The task allocation procedure is performed before any tasks are executed, although previous studies have demonstrated a version of the PI algorithm that is effective at allocating new tasks online [50].

1) PI-MinAvg Versus PI-MaxAss: Fig. 4 compares the PI-MinAvg solutions with the PI-MaxAss solutions that are initialized with the PI-MinAvg solution. A row formation was used for these experiments and a swap distance of 2 (SD = 2) was set. Fig. 4(a) shows the percentage of runs where PI-MaxAss increased the number of allocated tasks from the PI-MinAvg solution. Fig. 4(b) shows the corresponding average percentage change and standard deviation of the number of allocated tasks when PI-MaxAss changed the number of allocated tasks. Fig. 4 shows both the results using the same experimental setup as in [20] and [22], with a task-to-vehicle ratio of 2 to 1 (ratio p = 2), deadlines for each task and no battery limit time constraints, and the results using a task-to-vehicle ratio p = 4.6 with task deadlines only, vehicle battery limits only, and combined task deadlines and battery limits, respectively. Ratio p = 4.6 was selected to test the system approaching maximum capacity. In [20] and [22], experimental results showed that PI-MinAvg was capable of finding a solution that maximized the number of allocated tasks in most cases. The ratio p = 2 results in Fig. 4(a) reflect these findings. For each of the five setups with ratio p = 2, PI-MaxAss increased the number of allocated tasks from the PI-MinAvg solution; in the best case 14% of the runs were improved upon. In each run in which PI-MaxAss increased the number of allocations (starting from PI-MinAvg with p = 2), one extra task was allocated.

The results for ratio p = 4.6 show that when the system is approaching maximum capacity, i.e., when the order and allocation of tasks is critical to optimizing the number of allocated tasks, PI-MaxAss increased the number of task allocations in approximately half the runs with battery-only time constraints and in up to 100% of runs with task deadlines. Up to three extra tasks were assigned in runs with battery-only time constraints. Up to eight extra tasks were assigned in runs with task deadlines with ratio p = 4.6. In one such instance, PI-MaxAss increased the number of allocated tasks from 44 to 52 out of 56 tasks, where four tasks were impossible to allocate from the outset due to their relative positions and deadlines. In other words, PI-MaxAss facilitated an 18% increase in allocated tasks, achieving the maximum allocation. In another instance, a 20% increase was achieved by increasing the number of allocations from 35 to 42 out of 46 tasks.

Over the 2000 runs, in six cases PI-MaxAss modified the solution by reassigning tasks without increasing the total number of assigned tasks. In all other instances in which the solution was modified, the number of allocations was increased.

Fig. 4. For ratio p = 2 the numbers of tasks were 12, 16, 20, 24, and 28, and with ratio p = 4.6 the task numbers were 28, 36, 46, 56, and 64. Ratio p = 2 was tested with task deadlines; ratio p = 4.6 was tested with task deadlines, with battery limits only, and with battery limits and task deadlines, respectively. (a) Each bar shows the percentage of solutions over 50 runs in which PI-MaxAss assigned additional tasks starting from the PI-MinAvg solution. (b) Corresponding average percentage change and standard deviation in the number of allocated tasks when PI-MaxAss changed the number of allocated tasks.
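The percentages quoted for the two best instances, and the share of runs modified without gain, follow directly from the reported counts. The short worked check below uses only numbers stated in the text.

```latex
\begin{align*}
\frac{52 - 44}{44} &= \frac{8}{44} \approx 18.2\% , &
\frac{42 - 35}{35} &= \frac{7}{35} = 20\% , &
\frac{6}{2000} &= 0.3\% .
\end{align*}
```

In the first instance, 52 of 56 tasks corresponds to every task that was feasible from the outset, since four tasks could not be reached by any vehicle.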
2) Swap Distance Parameter Comparison: Fig. 5 shows the results of a comparison between the performance of CBBA, PI-MinAvg, and PI-MaxAss with swap distance set between 1 and 4. The performance with regard to the number of allocated tasks and the number of iterations until convergence is presented. The total number of iterations for one simulation is determined by the last time an allocation change was made, either through inclusion or removal. As the PI-MaxAss solutions are initialized with the solutions from PI-MinAvg, the number of iterations for a run of PI-MaxAss is the sum of the iterations taken for PI-MinAvg and PI-MaxAss, so the iteration count for PI-MaxAss is necessarily at least as high as that for PI-MinAvg in all instances. Fig. 5(a) is a box plot [51] that shows the total number of allocated tasks for each algorithm. Fig. 5(b) is a box plot that shows the corresponding total number of iterations for each algorithm. The notches in the plots show that the increases in allocated tasks between CBBA, PI-MinAvg, and PI-MaxAss with SD = 1 and SD = 2 are statistically significant, and are correlated with an increase in iterations. For SD = 3 and SD = 4 there is an increase in iterations without a significant increase in task allocations compared with SD = 2. Table III shows that when the swap distance is limited to 1, an average of 3 extra tasks are allocated from the PI-MinAvg solution (shown in the table in the supplementary material) and the 95% confidence interval for the number of iterations is 7.86 to 9.42 (not counting the iterations for PI-MinAvg). The tradeoff is just over one fewer allocated task on average compared with SD = 2. As the swap distance increases, the confidence intervals for the number of iterations also widen.
3) Average Time Comparison: Fig. 6(a) plots a comparison of the average waiting time and allocations for each run using SD = 2. Fig. 6(b) plots the same results using the starting solution of PI-MaxAss and shows the effect of switching back to optimizing waiting time after increasing allocated tasks with PI-MaxAss. Here, PI-MinAvg was initialized with the solution of PI-MaxAss. Average waiting time logically increases as more tasks are performed. This increase is reflected in the graphs, which show a proportional increase in average waiting time between CBBA, PI-MinAvg, and PI-MaxAss. Fig. 6(b) shows that average waiting time can be optimized with PI-MinAvg after allocations have increased with PI-MaxAss.

4) Topology Comparison:
Changing topologies are inherent to dynamic environments with moving vehicles. It is therefore informative to assess how the proposed method performs across different topologies [12]. Fig. 7 illustrates with nondirected graphs the different network topologies under which the system was tested. Fig. 8 shows the results of comparing different vehicle formation topologies in terms of the number of allocated tasks and iterations. The row topology, circular topology, the fully connected topology and the star topology illustrated in Fig. 7 are compared. The number of allocated tasks is consistent across topologies for CBBA and similar across topologies for PI-MinAvg and PI-MaxAss with SD = 2. Notable differences are the reduced number of iterations for each algorithm with the fully connected topology and the relative increase in iterations for the star topology for each algorithm.

V. DISCUSSION
The results show that PI-MaxAss can significantly improve the total number of allocated tasks starting from a suboptimal solution. There is a tradeoff between computation time and solution quality that should be considered depending on the application [15]. Note that computation time here is represented by the number of iterations, while in practice the processing speed and the communication speed of the agents will determine how long an iteration lasts. If extra computation time is available, the results show that switching optimization objectives from minimizing average waiting time to maximizing task allocations can break the solution out of local optima and further optimize the task allocation without reducing the quality of the solution. After more tasks have been included, the quality of the solution can then be optimized further with few iterations by switching back to the time minimization method. This switching strategy as described in [15] exploits the high optimization performance of single-objective search algorithms for a bi-objective problem, while remaining flexible and modular. In the cases where PI-MinAvg was able to reach an optimal or near optimal solution with regards to the number of allocated tasks, such as the two tasks-per-vehicle scenario, PI-MaxAss made few or no improvements on the PI-MinAvg solution and accordingly the computation time was not unnecessarily increased. These results further support the switching strategy [15] which increased computation time only when the solution could be improved by the proposed method PI-MaxAss.
The results show that a swap distance limited to 1 is preferable when a reliably low number of iterations is required while still providing a significantly higher number of allocated tasks. A higher swap distance can be used if the extra computation time is available to increase the likelihood of finding a better solution. On the other hand, although PI-MaxAss is guaranteed not to decrease allocations starting from an initial task allocation, it cannot be guaranteed that PI-MaxAss with a higher swap distance finds an equal or higher task allocation than with a lower swap distance.
For the scenarios tested, a swap distance of 3 or 4 did not significantly increase the allocations despite the correlated increase in iterations. For each additional task reassignment, the new task allocations are propagated through the network of vehicles, and this can take several iterations depending on the network topology, meaning that, as the number of reassignments increases, so does the number of iterations. It is also likely that instances where 3 or 4 reassignments are required are fewer than those requiring 1 or 2 reassignments. This may result in an insignificant increase in task allocations along with a relatively high increase in the number of iterations.
PI-MaxAss was shown to be effective at increasing allocated tasks when the time constraint was on vehicle battery limits only. In these cases, the extra flexibility in the possible ordering of task allocations meant that PI-MinAvg was more likely to find an optimal solution; nevertheless, PI-MaxAss increased the allocations in about half of the runs, a noteworthy proportion.
In 0.3% of 2000 runs, PI-MaxAss modified the solution by reassigning tasks without increasing the total number of assigned tasks. This may happen because an additional task allocation attempt may be inhibited if a time slot created to assign a new task is instead filled by a task later in that reassignment sequence.
Tests with different topologies provided strong evidence that the number of allocated tasks is independent of the specific topology. The number of iterations required to reach consensus, on the contrary, appears to vary according to the type of topology. The increase in iterations is due to information requiring multiple iterations or "hops" to reach all vehicles when the network is not fully connected. In general, the longer the network diameter, i.e., the shortest path between the two most distant vehicles, the longer the system takes to reach consensus.
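The network diameter of each tested topology can be computed with a breadth-first search. The sketch below is illustrative only: the builders `row`, `ring`, `full`, and `star` are hypothetical helpers for undirected graphs, the vehicle count of six is an arbitrary choice, and diameter is treated as a rough indicator of the number of hops rather than a predictor of iteration counts.

```python
from collections import deque
from typing import Dict, List

Graph = Dict[int, List[int]]


def row(n: int) -> Graph:
    return {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}


def ring(n: int) -> Graph:
    return {i: sorted({(i - 1) % n, (i + 1) % n}) for i in range(n)}


def full(n: int) -> Graph:
    return {i: [j for j in range(n) if j != i] for i in range(n)}


def star(n: int) -> Graph:
    graph = {i: [0] for i in range(1, n)}  # every leaf connects to hub 0
    graph[0] = list(range(1, n))
    return graph


def diameter(adj: Graph) -> int:
    """Longest shortest-path length (in hops) over all pairs of vehicles."""
    best = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from `source`
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        if len(dist) != len(adj):
            raise ValueError("graph is not connected")
        best = max(best, max(dist.values()))
    return best


for name, build in (("row", row), ("ring", ring), ("full", full), ("star", star)):
    print(name, diameter(build(6)))
```

The fully connected topology has the smallest diameter, which is consistent with it requiring the fewest iterations in the experiments.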
The task shifting effect of PI-MaxAss is similar to the theoretical task swap loop methods described and analyzed in [35]- [38] and [52]. Compared with these methods, PI-MaxAss has the advantage that it does not require distinguishing roles. Furthermore, PI-MaxAss does not require finding a complete swap loop to reassign tasks. As opposed to the task swap loop methods, with PI-MaxAss the last task reassignment in the sequence need not be assigned to the vehicle that started the sequence. By following the task swap loop strategy, the created time slot is more likely to be filled by the task being reassigned from another vehicle, inhibiting the assignment of an additional unassigned task. A final distinction is that the objective of PI-MaxAss is to increase the number of task assignments within vehicles' schedules, whereas the costs being minimized in [36]- [38] are nonspecific, and the problem being addressed considers vehicles that can be assigned one task each, at most.

VI. CONCLUSION
In a search and rescue mission, optimal task allocation for available vehicles is crucial. In this paper, an effective algorithm that allows for simple and efficient reassignment of allocated tasks is proposed and analyzed to improve the task allocation solution of a previous method for task allocation. The novel idea is to allow vehicles to reallocate tasks to create a feasible space for unallocated tasks by taking advantage of existing schedule space. Simulations showed a noteworthy increase in performance, measured as the total number of allocated tasks, making the method appealing when this objective is a priority. An increment in the number of iterations appeared proportionate to the gain in performance. Experimental results confirmed that the proposed algorithm can be applied beneficially to an existing scheduling method, thus opening the possibility of integration to other implementations.
Future work will look at restricting the number of iterations used to reach consensus while maintaining the solution quality, as well as implementation in more realistic testing scenarios that could include vehicles returning to a base to refuel.

Joanna Turner received the B.Sc. degree in computer science from Loughborough University, Loughborough, U.K., in 2013, where she is currently pursuing the Ph.D. degree.
Her current research interests include multiagent task allocation, machine learning, and biologically inspired artificial intelligence.
Qinggang Meng (M'06) received the B.Sc. and M.Sc. degrees in electronic engineering from Tianjin University, Tianjin, China, and the Ph.D. degree in computer science from Aberystwyth University, Aberystwyth, U.K.
He is currently a Reader in robotics and autonomous systems with the Department of Computer Science, Loughborough University, Loughborough, U.K. His current research interests include biologically and psychologically inspired learning algorithms and developmental robotics, robot learning and adaptation, autonomous vehicles/systems, multi-UAV/UGV cooperation, service and assistive robotics, situation awareness and decision making for driverless vehicles, verification and validation of autonomous systems, driver's distraction detection, human motion analysis and activity recognition, activity pattern detection, pattern recognition, artificial intelligence, machine learning, deep learning, and computer vision.