Task Planning on Stochastic Aisle Graphs for Precision Agriculture

This work addresses task planning under uncertainty for precision agriculture applications whereby task costs are uncertain and the gain of completing a task is proportional to resource consumption (such as water consumption in precision irrigation). The goal is to complete all tasks while prioritizing those that are more urgent, and subject to diverse budget thresholds and stochastic costs for tasks. To describe agriculture-related environments that incorporate stochastic costs to complete tasks, a new Stochastic-Vertex-Cost Aisle Graph (SAG) is introduced. Then, a task allocation algorithm, termed Next-Best-Action Planning (NBA-P), is proposed. NBA-P utilizes the underlying structure enabled by SAG, and tackles the task planning problem by simultaneously determining the optimal tasks to perform and an optimal time to exit (i.e. return to a base station), at run-time. The proposed approach is tested with both simulated data and real-world experimental datasets collected in a commercial vineyard, in both single- and multi-robot scenarios. In all cases, NBA-P outperforms other evaluated methods in terms of return per visited vertex, wasted resources resulting from aborted tasks (i.e. when a budget threshold is exceeded), and total visited vertices.


I. INTRODUCTION
Autonomous agricultural mobile robots are becoming increasingly capable of performing persistent missions such as monitoring crop health indices [1] and sampling specimens [2] across extended spatio-temporal scales to enhance efficiency and productivity in precision agriculture [3]. An autonomous robot (or a team of them) needs to perform certain tasks in distinct locations of the operating environment subject to a specific budget [4] on the actions the robot can take (e.g., a maximum capacity of soil samples to carry [5]). During in-field operations, the actual costs to complete tasks can be uncertain whereas expected costs may be known. In addition, some tasks can be more urgent than others and hence must be prioritized. It is often the case [3,6,7] that there exists some prior information about a required task (e.g., older measurements of soil moisture [8]) that can bias robot task assignment(s). Hence, it is necessary to develop approaches that utilize limited prior information to plan tasks with uncertain costs and priority levels.
There exist two key challenges for efficient robot task allocation in precision agriculture. First, prior maps can indicate biases in task assignments, but may not be trustworthy. This is because conditions in the agricultural field can change rapidly [9], are dynamic [10,11], and may be hard to predict ahead of time [12]. Second, as the budget is being depleted, the robot needs to periodically return to a base station (e.g., to drop collected samples and/or recharge). Addressing these two challenges simultaneously poses a two-layer, intertwined problem of decision making under uncertainty: How to perform optimal sampling given an approximate prior map, and how to decide an optimal stopping time (i.e. to return to base) to avoid exceeding a given task capacity? This paper introduces a new stochastic task allocation algorithm to balance optimal sampling and optimal stopping when task costs are uncertain.
1 Dept. of Electrical and Computer Engineering, University of California, Riverside. Email: {xkan001, karydis}@ucr.edu.
2 Dept. of Computer Science and Engineering, University of California, Merced. Email: {tthayer, scarpin}@ucmerced.edu.
We gratefully acknowledge the support of NSF under grants #IIS-1724341, #IIS-1901379 and #DGE-1633722, and USDA-NIFA under grants #2017-67021-25925 and #2021-67022-33453. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the funding agencies.
A direct approach for persistent sampling (and/or monitoring) is to survey the entire space and perform the desired task(s) sequentially [13][14][15]. The main drawback is that the robot would then exhaustively visit all desired sampling locations in the environment without prioritizing locations that would yield a higher gain or would be more time-critical. Orienteering [16][17][18][19] can address part of this drawback by determining paths that maximize the cumulative gain under a constant budget. The robot prioritizes visiting adjacent locations if they jointly yield higher gains than isolated high-gain locations, and provided that any budget constraints are not violated [16,17]. However, this strategy can be insufficient for missions where some tasks are more urgent than others. For instance, several existing robot task allocation strategies, albeit for distinct application domains [20][21][22][23], typically consider a deadline [24] or user-defined importance levels. In precision agriculture, overhead imagery (e.g., thermal imaging) can help pinpoint locations that appear to be under water stress [9], in which case sampling leaves or soil in those areas should be prioritized. We formalize the notion of tasks with distinct urgency (e.g., a closer deadline or greater importance) by assigning a priority level [25,26] to tasks.
Besides the task priority level, deciding a next task for a robot to complete also depends on the available budget, which can be of multiple types. For instance, the number of locations that a robot can visit and sample from in one 'trip' is constrained by both the energy capacity to move between locations and the robot's sample payload capacity. Exceeding the energy budget can prevent the robot from returning to the base station to recharge and drop collected samples, whereas exceeding the sample payload capacity may cause robot and sample damage. Here we consider an energy budget for the robot moving between locations, and a resource budget linked to task execution. The two budgets are independent of each other, and both are reset to their initial values when the robot returns to the base station. In practice, the actual amount of resources consumed to execute a task can differ from its expected value. In fact, the actual amount of resources consumed for task execution is revealed only after the task has been completed. To model this, we consider the cost to complete a task to be a random variable that follows some known distribution. The cost to move between locations, however, is considered to be deterministic [8]. Specific details are given in the following.
This paper introduces a new stochastic task allocation approach, termed Next-Best-Action Planning (NBA-P), for task planning under uncertainty in precision agriculture. The paper also contributes a new Stochastic-Vertex-Cost Aisle Graph (SAG). SAG is an extension of the aisle graph [8,27], which is often used to describe agriculture-related environments. The main novelty of SAG is that it can represent uncertain task costs. Using SAG, our proposed NBA-P algorithm simultaneously determines, at run-time, 1) which tasks to perform next, and 2) when to stop performing new tasks and return to the base station. NBA-P ensures that urgent tasks are prioritized subject to both energy and resource budgets. In addition, it can be extended to multi-robot teams. We test our method in single- and multi-robot scenarios using both simulated data and 10 real-world datasets collected in a commercial vineyard in central California. In all cases, NBA-P achieves higher efficiency than naive lawnmower, informed lawnmower, and Series Greedy Partial Row planners [28][29][30] in terms of more return per visited vertex, fewer resources wasted because of aborted tasks, and fewer total visited vertices.

II. RELATED WORK
Aisle graphs [8,27] can model motion constraints emerging when robots navigate in structured environments such as agricultural fields. Vertices denote possible task locations, and edges represent connections between locations. Any two rows connect to each other only via the two end vertices. Moving backwards is not allowed. Hence, if a robot enters a row, it must reach the row's other end before it can move to any other row. In the original aisle graph formulation [8,27], vertices and edges are associated with known and constant reward and movement costs, respectively. Our proposed extension, SAG, can also represent uncertain task costs.
Orienteering can tackle persistent sampling on aisle graphs. Even with motion constraints introduced via aisle graphs, orienteering remains an NP-hard problem and thus greedy heuristics are often employed [8]. Recent efforts on stochastic orienteering associate stochastic costs to graph edges and propose a time-aware policy for a robot to adjust its path to avoid exceeding a certain budget [31]. However, addressing cases that involve uncertain task cost on vertices for aisle graphs remains open. Our proposed NBA-P tackles the problem by simultaneously considering uncertain task costs on vertices and deterministic costs on edges.
The optimal stopping framework [32] can be used to investigate the (optimal) criteria to terminate a process while incorporating uncertainty [33]. In most cases [34][35][36], data arrive in sequence, and an irrevocable decision has to be made as to when the expected return is maximized. Optimal stopping has been used in robotics applications like target tracking [37] and marine ecosystem monitoring [38]. However, no motion constraints, like those imposed by aisle graphs, apply to robot actions in these works, and hence existing methods cannot be ported over to operations on aisle graphs. Paths planned with NBA-P fill the gap, as they directly apply to environments with motion constraints captured by aisle graphs.
The proposed method applies when: 1) the motion constraints in the application environment can be captured by a SAG; 2) the costs of completing tasks follow exponential distributions; and 3) the gain obtained by completing a task is proportional to the actual task cost.

III. STOCHASTIC TASK ALLOCATION PROBLEM SETUP
We first define the Stochastic-Vertex-Cost Aisle Graph (SAG), to incorporate uncertain task cost on vertices. Then, we present this paper's problem setup utilizing SAG.

A. The Stochastic-Vertex-Cost Aisle Graph (SAG)
We propose SAG as a way to extend the original aisle graph [8,27] to handle missions consisting of tasks with priority levels and stochastic execution costs. There are three main differences between SAG and the original aisle graphs. 1) SAG considers stochastic costs for task execution at vertices. 2) Vertices in SAG are associated with task priority levels. 3) The gain, which describes the benefit of completing a task, is proportional to the actual resource consumption if the task is fully completed. With more resource consumption, higher gain can be obtained, e.g., higher quality information during soil sampling, or better field hydration/irrigation results. Note that no gain is obtained if (i) the resource budget is exceeded during task execution and the task is aborted, or (ii) a robot only passes through a vertex on its way without performing a task. In contrast, in the original aisle graph rewards are constant and can be collected immediately when passing through vertices.
Given a field that contains $m$ rows and $n$ columns (where $n$ denotes the total number of possible sampling locations in each row), its SAG representation is an (undirected) graph $A_s(m, n+2) = (V, E)$ where $V$ and $E$ are the sets of vertices and edges, respectively. Note that in the graph representation we add two additional 'virtual' columns at indices $j = 0$ and $j = n + 1$ that connect the $m$ rows; virtual vertices carry no gain. An example of an $A_s(3, 5)$ graph is given in Fig. 1.
The set of edges $E$ is built as follows: [...] each of which has two edges. Set $S$ contains all priority levels in $A_s$. Let $c_v : V \to \mathbb{R}_{\ge 0}$ and $c_e : E \to \mathbb{R}_{\ge 0}$ be the costs for task execution at vertices and movement on edges, respectively. The actual resource consumption to complete a task at vertex $v \in V$ follows an exponential distribution, $c_v(v) \sim \mathrm{Exp}(\bar{w}_s)$, where $\bar{w}_s$ is the mean cost of all tasks with priority level $s \in S$. The actual task cost is not known before task completion, and is independent between tasks at different locations. The cost of movement on edges is a known constant. Function $f : V \to S$ returns the priority level of a vertex, and $f(v_{i_1,j_1}) < f(v_{i_2,j_2})$ implies that the task at $v_{i_2,j_2}$ is more urgent than the task at $v_{i_1,j_1}$. In other words, if the same amount of resources is consumed at $v_{i_1,j_1}$ and $v_{i_2,j_2}$, higher gain is obtained at $v_{i_2,j_2}$. Once a task is completed, its priority level is set to 0.
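To make the structure concrete, the following Python sketch builds a small SAG and samples exponential task costs. It is illustrative only: the names `build_sag` and `sample_task_costs` are ours, and the edge rule is an assumption taken from the aisle-graph description in Section II (vertices are linked along each row, and rows are linked only through the two virtual end columns). Note that `random.expovariate` takes a rate, so the mean $\bar{w}_s$ enters as $1/\bar{w}_s$.

```python
import random

def build_sag(m, n):
    """Adjacency sets for an aisle graph with m rows and n + 2 columns.

    Assumed connectivity: consecutive vertices in a row are linked, and rows
    are linked to their neighbours only through the virtual columns 0 and n + 1.
    """
    adj = {(i, j): set() for i in range(1, m + 1) for j in range(n + 2)}
    for i in range(1, m + 1):
        for j in range(n + 1):                  # edges along row i
            adj[(i, j)].add((i, j + 1))
            adj[(i, j + 1)].add((i, j))
        if i < m:                               # edges along the two end columns
            for j in (0, n + 1):
                adj[(i, j)].add((i + 1, j))
                adj[(i + 1, j)].add((i, j))
    return adj

def sample_task_costs(priority, mean_cost, rng):
    """Draw one exponential cost per pending task; priority maps vertex -> level s."""
    return {v: rng.expovariate(1.0 / mean_cost[s])
            for v, s in priority.items() if s > 0}

adj = build_sag(m=3, n=3)                       # 3 rows, 5 columns in total
num_edges = sum(len(nbrs) for nbrs in adj.values()) // 2
```

Under this assumed rule, interior vertices and the four corner vertices have two edges each, while the remaining end-column vertices have three.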
Let $r : V \to \mathbb{R}_{\ge 0}$ be the actual gain obtained when completing a task. Function $\mu : S \to \mathbb{R}_{>0}$ maps each priority level to a deterministic positive value, which indicates the gain-to-cost ratio of completing a task of given priority level. Then, $r(v) = \mu(f(v))\, c_v(v)$.
Priority levels can be user-defined or estimated via any prior environment maps. The latter can be determined based on collected data, e.g., the difference between ideal and sampled soil moisture levels [8]. However, prior information may be approximate and thus lead to suboptimality if directly set as priority levels for vertices. A way to assign priority levels from prior information is to set thresholds so that data within a range yield the same priority level. Only tasks of the same type and with the same expected cost can be assigned the same priority level.
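The gain relation and the threshold-based assignment of priority levels can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names, the threshold values, and the values of $\mu$ are hypothetical.

```python
import bisect

def priority_from_prior(deficit, thresholds):
    """Map a prior measurement (e.g. a moisture deficit) to a priority level.

    thresholds is an ascending list; a value below thresholds[0] yields level 0
    (no task), and each threshold reached raises the level by one.
    """
    return bisect.bisect_right(thresholds, deficit)

def gain(actual_cost, level, mu, completed=True):
    """Gain r = mu(s) * cost for a completed task; aborted or skipped tasks yield 0."""
    return mu[level] * actual_cost if completed and level > 0 else 0.0

mu = {1: 1.0, 2: 2.0}                 # gain-to-cost ratios (illustrative values)
thresholds = [5.0, 15.0]              # hypothetical deficit thresholds
levels = [priority_from_prior(d, thresholds) for d in (2.0, 5.0, 9.9, 20.0)]
```

With these thresholds, all deficits in the range [5, 15) share priority level 1, so small errors in the prior map do not change the assigned level.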

B. Stochastic Task Allocation on SAGs
A mission on SAG $A_s(m, n + 2) = (V, E)$ comprises tasks located at $v \in V_T \subset V$. Given an energy budget for moving along edges and a resource budget for executing tasks on vertices, the goal is to complete all tasks in $V_T$ so that:
• C1: Tasks are prioritized according to priority level.
• C2: The number of tasks aborted because the resource budget is exceeded (at run-time) is minimized.
C1 enforces time-critical decision making, whereas C2 ensures efficiency of mission completion. When a task is aborted, no gain is obtained and both the consumed resources and the energy spent moving to that vertex are wasted. Aborting tasks also delays mission completion. To avoid exceeding the resource budget at run-time, the robot thus needs to determine an optimal stopping time. Its next action should be to either 1) perform another feasible task of the highest possible priority level (determined as described next), or 2) stop performing tasks and return to the base station. Since the actual cost is unknown before completing a task, the next action and corresponding paths are determined in an adaptive manner based on the remaining budget at run-time.

IV. PROPOSED TASK PLANNING ALGORITHM
Our proposed Next-Best-Action Planning (NBA-P) approach balances sampling feasible vertices on SAG and determining when it is preferable to exit (i.e. return to base station) based on remaining resource and energy budgets. When sampling feasible vertices, we use a three-phase approach. Phase 1: sample feasible vertices subject to resource budget; Phase 2: sample feasible vertices from phase 1 subject to energy budget; Phase 3: select a row to proceed and plan corresponding paths. When sampling in phase 1, we start from the highest priority level that currently exists. If either phase 1 or phase 2 returns no feasible vertex, we decrease the examined priority level until either feasible vertices are found, or the examined priority level reaches 0, in which case it is optimal to exit. This strategy ensures that tasks with higher priority level are prioritized when possible.

A. Phase 1: Feasible Vertices Subject to Resource Budget
To tackle the stochastic task cost, we formulate the next-task selection subject to the resource budget as an optimal stopping problem. We employ a one-stage-look-ahead rule: if returning to the base station directly is better than performing one more task (of any priority level) and then returning, the robot returns at the current time. In this phase, we do not need to consider the actual robot position. Let $p$ and $q$ be the remaining resource budget and the total gain in the current 'trip' (i.e. operation since the last visit to a base station), respectively. If a task of priority level $s \in S$ consumes $x$ amount of resources, the return is $\mu(s)x$. Then, in a dynamic programming framework, with $(p, q)$ the state, the expected return function, $\Phi(p, q)$, is [...] (1), where $\lambda_s = 1/\bar{w}_s$. When $p > 0$ (i.e. some resource is available), a task of priority level $s \in S$ which maximizes the return is selected. Otherwise, no task can be completed and the total return remains the same as $q$.
1) Single Priority Level for All Tasks: We start with the case that all tasks in the mission have the same priority level, |S| = 1. In this case, we only need to determine the optimal time to exit. According to (1), for $s \in S$, the state $(p', q')$ is on the optimal stopping boundary if [...] (2), since continuing to perform another task will not result in higher expected return. Hence, the robot should exit if the current state $(p, q)$ satisfies $p < p'$ and $q > q'$, i.e. all tasks are infeasible. Solving (2) leads to [...] (3). The function $g : \mathbb{R}_{\ge 0} \times S \to \mathbb{R}_{\ge 0}$, $(p', s) \mapsto q'$, defined based on (3), represents the optimal stopping boundary curve for a given priority level. Thus, it is optimal to exit at state $(p', q')$ when $q' \ge g(p', s)$ given a priority level $s$ in $A_s$.
Definition 1. A task of priority level $s$ is feasible for the current state $(p, q)$ if $(p, q)$ lies below the optimal stopping boundary curve $g(p', s)$ (Fig. 2).
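As a worked illustration of how a boundary of this kind can arise (a sketch under our own assumption, not a reproduction of (1)-(3)), suppose that exceeding the resource budget forfeits the return $q$ accumulated in the current trip. With an exponentially distributed cost $x$ of rate $\lambda_s = 1/\bar{w}_s$, performing one more task and then stopping pays $q + \mu(s)x$ if $x \le p$ and nothing otherwise. Stopping and continuing are then equally good at a state $(p', q')$ when

```latex
\begin{align}
q' &= \int_0^{p'} \lambda_s e^{-\lambda_s x}\,\bigl(q' + \mu(s)\,x\bigr)\,dx \\
   &= q'\bigl(1 - e^{-\lambda_s p'}\bigr)
      + \mu(s)\Bigl[\tfrac{1}{\lambda_s}\bigl(1 - e^{-\lambda_s p'}\bigr) - p'\,e^{-\lambda_s p'}\Bigr], \\
\Rightarrow\quad
q' &= \mu(s)\Bigl[\bar{w}_s\bigl(e^{p'/\bar{w}_s} - 1\bigr) - p'\Bigr].
\end{align}
```

Under this assumption the boundary equals 0 at $p' = 0$ and grows with $p'$, so a state with little remaining budget and a large accumulated return lies above the curve, which is consistent with the exit rule stated above.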
2) Multiple Priority Levels Across Tasks: If |S| > 1, the robot determines the candidates with the highest possible priority level allowed by the remaining budget. The optimal strategy is to examine the feasibility of performing a task of priority level s = max(S), and then decrease s until a feasible task is found. If no feasible task is found by the time s reaches 0, then the optimal decision is to return to the base station.
When multiple priority levels exist, it is not always true that tasks with higher priority levels must be performed before any lower-priority tasks. To maximize the expected return in one 'trip' (i.e. between two consecutive visits to the base station), when the remaining resource budget is not enough for high priority level tasks, a task with lower priority level can potentially be selected to be performed next. However, in some scenarios, a lower priority task will never be selected prior to a higher priority task.
Lemma 1. At state $(p, q)$, given that tasks with priority level $s \in S$ are infeasible, all tasks with $s' \in S$ and $s' < s$ must be infeasible if $\bar{w}_{s'} \ge \bar{w}_s$.
Proof. Let $s_1, s_2 \in S$ with $1 \le s_1 < s_2 \le \max(S)$ and $\mu(s_1) < \mu(s_2)$. The mean costs of $s_1$ and $s_2$ tasks are $\bar{w}_{s_1}$ and $\bar{w}_{s_2}$, respectively. If $\bar{w}_{s_1} \ge \bar{w}_{s_2}$, then for any $p > 0$, $g(p, s_2) > g(p, s_1)$. Hence, if an $s_2$ priority level task is infeasible at state $(p, q)$, i.e. $q \ge g(p, s_2)$, then $q \ge g(p, s_2) > g(p, s_1)$, and $s_1$ tasks are infeasible too (Fig. 2(a)).
• Condition 1: [...]
• Condition 2: $\exists\, p_0 > 0$ such that [...]
For Condition 1, a state $(p, q)$ above the curve $g(p, s_2)$ can still be below the curve $g(p, s_1)$, e.g., point $b_2$ in Fig. 2(b). In this case, $s_1$ tasks should be performed next even if there still exist $s_2$ tasks. For Condition 2, when $p > p_0$, the situation is the same as described above for Condition 1. When $0 < p \le p_0$ we reduce to the conditions of Lemma 1, in which case given that $s_2$ tasks are infeasible, $s_1$ tasks must be infeasible (an example is point $c_1$ in Fig. 2(c)).
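The crossing behaviour described for Condition 2 can be checked numerically. The sketch below is illustrative only: it assumes a boundary of the form $g(p, s) = \mu(s)[\bar{w}_s(e^{p/\bar{w}_s} - 1) - p]$, which follows if exceeding the budget forfeits the accumulated return; this form is our assumption and not the paper's (3). The helper names are ours, and the parameters are those of simulation case 2 in Section V-A ($\mu(1) = 1$, $\mu(2) = 2$, $\bar{w}_1 = 1.5$, $\bar{w}_2 = 2$).

```python
import math

def boundary(p, mu, w_bar):
    """Assumed stopping boundary g(p, s) = mu * (w_bar * (exp(p / w_bar) - 1) - p)."""
    return mu * (w_bar * math.expm1(p / w_bar) - p)

def crossing_point(low, high, lo=1e-6, hi=50.0, iters=100):
    """Bisection for p0 > 0 where the low- and high-priority boundaries meet.

    low and high are (mu, w_bar) pairs; assumes the sign of the difference
    changes at most once on [lo, hi]. Returns None if there is no sign change.
    """
    diff = lambda p: boundary(p, *low) - boundary(p, *high)
    if diff(lo) * diff(hi) > 0:
        return None                      # no crossing in the bracket
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if diff(lo) * diff(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

s1, s2 = (1.0, 1.5), (2.0, 2.0)          # (mu, mean cost) for levels 1 and 2
p0 = crossing_point(s1, s2)
```

With these values the two assumed curves cross once, at a remaining budget between 4 and 5: below that point the level-2 curve lies above the level-1 curve (the situation of Lemma 1), and above it the order is reversed. Swapping the mean costs so that $\bar{w}_1 \ge \bar{w}_2$ removes the crossing.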
Let $Q_1$ be the set containing all feasible vertices subject to a given resource budget. We propose Algorithm 1 to determine $Q_1$ at a state $(p, q)$. If $Q_1 = \emptyset$, the robot returns to the base station. Otherwise, all vertices in $Q_1$ will continue to be examined in Phase 2 subject to a given energy budget.
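The descending search over priority levels can be sketched as below. This is a minimal reading of the Phase 1 description rather than a transcription of Algorithm 1; the function name, the dictionary representation of tasks, and the linear toy boundary used in the example are all hypothetical.

```python
def feasible_by_resource(p, q, tasks, boundary):
    """Phase 1 sketch: return (Q1, s) for state (p, q).

    tasks maps vertex -> priority level (0 means no pending task);
    boundary(p, s) is the stopping boundary for level s.
    Levels are examined from highest to lowest.
    """
    if p <= 0:
        return set(), 0
    for s in sorted({lvl for lvl in tasks.values() if lvl > 0}, reverse=True):
        if q < boundary(p, s):                  # state lies below the curve
            return {v for v, lvl in tasks.items() if lvl == s}, s
    return set(), 0                             # nothing feasible: exit

# Toy boundary (hypothetical): linear in p, steeper for level 1 than for level 2.
toy_boundary = lambda p, s: {1: 3.0, 2: 1.0}[s] * p
tasks = {(1, 1): 2, (1, 2): 1, (2, 1): 1, (2, 2): 0}

high = feasible_by_resource(2.0, 1.0, tasks, toy_boundary)
fallback = feasible_by_resource(2.0, 3.0, tasks, toy_boundary)
leave = feasible_by_resource(2.0, 10.0, tasks, toy_boundary)
```

In the second call the level-2 task is infeasible but level-1 tasks are not, which mirrors the case where a lower-priority task is performed while higher-priority tasks remain.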

B. Phase 2: Feasible Vertices Subject to Energy Budget
From $Q_1$, we continue sampling vertices that satisfy the energy budget constraint. Suppose a robot is at vertex $v_{i_c,j_c} \in V$, and a vertex $v_{i',j'} \in Q_1$ is a candidate to be examined. Without loss of generality, suppose two base stations are located on row $i_d$, each at one of the end vertices $v_{i_d,0}$ and $v_{i_d,n+1}$. The robot can reset at either one. Vertex $v_{i',j'}$ is feasible if the current remaining energy budget $T$ allows the robot to move to $v_{i',j'}$ and then to any base station. Let $t_\alpha(i')$ be the cost to move from $v_{i_c,j_c}$ to an end node (either on column 0 or $n+1$, depending on the robot's moving direction in its current row; recall backward motion is not allowed). Let $t_\beta(i')$ be the cost to move between the two end vertices $v_{i',0}$ and $v_{i',n+1}$ in row $i'$, and $t_\gamma(i')$ be the cost to move from the end vertex on row $i'$ closest to the robot along its direction of motion to the closest base station. Then, $v_{i',j'}$ is feasible if $t_\alpha(i') + t_\beta(i') + t_\gamma(i') \le T$. If $v_{i',j'}$ can be reached, all vertices on row $i'$ must be reachable, since $t_\alpha(i') + t_\beta(i') + t_\gamma(i')$ only depends on row $i'$. Costs $t_\beta$ and $t_\gamma$ are fixed for each row and can be precomputed prior to deployment. Cost $t_\alpha$ is computed at run-time. The set containing all vertices that satisfy both budgets is $Q_2 = \{ v_{i',j'} \in Q_1 \mid t_\alpha(i') + t_\beta(i') + t_\gamma(i') \le T \}$.
Algorithm 1 (Phase 1; only fragments of the listing survive): [...] if $q < g(p, s)$ then [...] for $v \in V_T$ do [...] return $Q_1$, $s$ [...] end procedure
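Because feasibility depends only on the row, Phase 2 reduces to a per-row filter. The sketch below assumes the three movement costs are already available per row as dictionaries; the function name and the numbers in the example are hypothetical.

```python
def feasible_by_energy(q1, t_alpha, t_beta, t_gamma, energy_left):
    """Phase 2 sketch: keep vertices of Q1 whose row fits the energy budget.

    Vertices are (row, column) pairs; t_alpha, t_beta, t_gamma map a row index
    to the corresponding movement cost.
    """
    ok_rows = {i for i in {v[0] for v in q1}
               if t_alpha[i] + t_beta[i] + t_gamma[i] <= energy_left}
    return {v for v in q1 if v[0] in ok_rows}

q1 = {(1, 1), (1, 3), (3, 2)}
t_alpha = {1: 3, 3: 5}      # run-time cost to reach the entry end of each row
t_beta = {1: 4, 3: 4}       # cost to traverse the row end to end
t_gamma = {1: 1, 3: 1}      # cost from the exit end to the closest base station
q2 = feasible_by_energy(q1, t_alpha, t_beta, t_gamma, energy_left=9)
```

Here row 1 needs 8 units of energy and is kept, whereas row 3 needs 10 and is discarded together with all of its vertices.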

C. Phase 3 and Proposed Algorithm
Phase 3 can be reached if $Q_2 \neq \emptyset$. Note that all tasks at vertices in $Q_2$ have the same priority level $s$ and hence the same expected cost $\bar{w}_s$. Therefore, sampling the next vertex turns into selecting a row $i$ which contains one or more tasks of priority level $s$. Then, the robot will perform the first encountered feasible task while moving along row $i$.
Suppose the robot is currently at $v_{i_c,j_c}$ with state $(p, q)$. The next row is then selected via [...] (7) and [...] (8), where $Q_2(i) = \{ v_{i',j'} \in Q_2 \mid i' = i \}$, and $\bar{w}_s$ is the mean cost of feasible tasks. By (7), the robot is expected to complete more tasks in row $i$ than in any other row. Thus, row $i$ should be the row that contains the largest number of feasible tasks permitted by the remaining resource budget $p$, according to the expected cost $\bar{w}_s$. If multiple rows tie, then the row closest to the robot's current position is selected as per (8).
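One way to read this selection rule is sketched below. It is an assumption-laden illustration, not a transcription of (7) and (8): a row's score is taken to be its number of feasible tasks capped by the number of tasks the remaining resource budget is expected to cover, and closeness is measured simply as the difference in row index. The function name and example values are hypothetical.

```python
def select_row(q2, p, w_bar, current_row):
    """Phase 3 sketch: pick the row expected to allow the most completed tasks.

    q2 is a non-empty set of (row, column) vertices. A row's score is the number
    of its feasible tasks, capped by p / w_bar (the number of tasks the remaining
    resource budget p is expected to cover). Ties go to the row nearest to
    current_row.
    """
    counts = {}
    for i, _ in q2:
        counts[i] = counts.get(i, 0) + 1
    cap = p / w_bar
    return min(counts, key=lambda i: (-min(counts[i], cap), abs(i - current_row)))

q2 = {(1, 1), (1, 2), (1, 3), (4, 1), (4, 2), (6, 2)}
tight = select_row(q2, p=4.0, w_bar=2.0, current_row=5)
ample = select_row(q2, p=10.0, w_bar=2.0, current_row=5)
```

With a tight budget only about two tasks are expected to be affordable, so rows 1 and 4 tie and the nearer row 4 is chosen; with an ample budget row 1, which holds three feasible tasks, wins.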
The proposed Next-Best-Action Planning (NBA-P) approach is formalized in Algorithm 2. NBA-P can be extended to multi-robot teams by sequentially determining the next best action for each robot. In a multi-robot implementation, each robot runs NBA-P independently and in parallel, and exchanges information only about the row it currently occupies. For each robot, $Q_2$ has to be modified by removing all vertices in rows that are occupied by other robots. Note that, similar to [16], multiple robots can travel simultaneously along the vertical columns 0 and $n + 1$, since space on the boundary of a field is typically much larger.

V. EXPERIMENTAL RESULTS
To study the efficiency and effectiveness of the proposed approach, we test with 1) simulated data in a 2-robot scenario, and 2) data collected from a real-world vineyard in 1- and 5-robot scenarios. Testing with simulated data enables parameter tuning so as to study the properties of NBA-P, whereas testing with real-world data reveals the spatial pattern of real tasks that exist in agricultural fields. In both cases, NBA-P is compared against the lawnmower planner [28], which is often seen in agriculture-related applications [29,30]. In naive lawnmower (N-LM), a robot follows meandering paths to survey rows in sequence. When no budget constraint is considered, and when departing from a corner in a square environment, lawnmower generates the shortest path to survey the entire field in the sense that each vertex is visited only once. In experiments using real-world datasets, NBA-P is also compared against informed lawnmower (I-LM) and Series Greedy Partial Row (S-GPR) [16]. I-LM attempts to complete a task if the remaining budget is greater than the expected cost of a task. S-GPR is modified to use both energy and resource budgets. In multi-robot cases, each robot runs N-LM, I-LM, and NBA-P independently and in parallel to each other; in S-GPR robots plan their trajectories sequentially. All vertices in occupied rows become infeasible for other robots so that each row has at most one robot performing tasks.

A. Testing with Simulated Data and Parametric Study
We consider a simulated environment $A_s(20, 17)$ of 20 rows and 17 columns (including the two virtual columns). Base stations are located at $v_{10,0}$ and $v_{10,16}$. The cost to move on each edge is 1. Consider two robots deployed from base station $v_{10,0}$ to complete all tasks. Each robot departs with energy budget 80 and resource budget 40. A vertex is considered "visited" if the robot stops at the vertex and attempts to perform a task, regardless of whether the task is ultimately completed or aborted. If the resource budget is exceeded before task completion, the task is aborted, the resources already consumed for this task are considered to be "wasted," and no gain is obtained. The total gain is the sum of the actual gain, which is proportional to the actual task cost, at all task-completed vertices. Two cases are studied: 1) S = {1}, i.e. all tasks have equal priority level, and 2) S = {1, 2}, i.e. two priority levels exist, hence tasks with s = 2 are prioritized. In case 1, $\mu(1) = 1$, $\bar{w}_1 = 2$; and in case 2, $\mu(1) = 1$, $\mu(2) = 2$, $\bar{w}_1 = 1.5$, $\bar{w}_2 = 2$. For each case study, 10 trials are conducted. In each trial, 225 tasks are randomly assigned to 225 vertices in $A_s$, with randomly generated task location and actual task cost. In each trial, the proposed method and the N-LM method are tested on the same simulated environment.
Figures 3(a) and (c) show the percentage of obtained gain over ground truth total gain as a function of visited vertices (shortened as r/v ratio). Figures 3(b) and (d) show the total wasted resources because of aborted tasks as a function of visited vertices (shortened as w/v ratio). Total wasted resources are the sum of resource consumption for all aborted tasks. Total gain, total wasted resources and visited vertices correspond to the sum of those values from both robots. Results suggest that all tasks are completed, and the robots return to the base station.
Table I contains the mean and one standard deviation of r/v ratio, w/v ratio, and total visited vertices over 10 trials. Larger r/v suggests higher efficiency since more gain is obtained by visiting the same number of vertices, i.e. the same number of attempts to execute tasks. Lower w/v indicates lower rates of aborted tasks, i.e. fewer resources are wasted by visiting the same number of vertices. Higher r/v, lower w/v, and fewer total visited vertices are desired, and these conditions together indicate higher overall effectiveness.
Results in Fig. 3 and Table I suggest that, in both cases, NBA-P achieves higher r/v ratio, lower w/v ratio, and fewer total visited vertices than N-LM. When |S| = 1, since all tasks have the same priority level, the higher r/v ratio of NBA-P is mainly due to the optimal stopping strategy that helps prevent aborting tasks. When |S| = 2, the higher r/v is due to both the optimal stopping and priority-driven strategies. This can be observed by the steep slope at the beginning of the curve of our proposed method in Fig. 3(c), during which time tasks with priority level 2 are prioritized. The high rate of aborted tasks in N-LM is the reason why N-LM visits more vertices in total than NBA-P. For N-LM, the robot attempts to perform a task whenever there is still remaining resource budget. However, if the budget is exceeded during task execution, the task is aborted and the vertex needs to be re-visited. Setting multiple priorities may be useful, but this needs to be carefully tuned, as having too many priority levels can make the process inefficient in practice by forcing the robot to move across the field to reach the tasks of the next highest priority level.
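The three reported quantities can be computed from a log of task attempts as sketched below. This reflects our reading of the definitions above (r/v as the percentage of ground-truth gain obtained per visited vertex, and w/v as wasted resources per visited vertex); the function name, the log format, and the numbers are hypothetical.

```python
def trip_metrics(attempts, total_true_gain):
    """Compute (r/v, w/v, visited) from a list of task attempts.

    Each attempt is (gain, wasted): gain > 0 for a completed task, and
    wasted > 0 for an aborted one. Every attempt counts as one visited vertex.
    """
    visited = len(attempts)
    gain_pct = 100.0 * sum(g for g, _ in attempts) / total_true_gain
    wasted = sum(w for _, w in attempts)
    return gain_pct / visited, wasted / visited, visited

# Three tasks; the second is aborted once (1.5 units wasted) and completed on re-visit.
attempts = [(2.0, 0.0), (0.0, 1.5), (3.0, 0.0), (1.5, 0.0)]
rv, wv, visited = trip_metrics(attempts, total_true_gain=6.5)
```

The aborted attempt adds a visited vertex and wasted resources without adding gain, which lowers r/v and raises w/v at the same time.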
Reducing the rate of aborted tasks can increase efficiency. We continue to study the influence of $\mu$ and $\bar{w}_s$ on the rate of aborted tasks, in which case the energy budget does not need to be considered. Starting with |S| = 1, and assuming there exist infinitely many tasks, we need to determine the optimal time to stop performing more tasks. Figure 4 (left) shows the relation between aborted tasks (ratio of occurrences over 1000 trials) and the ratio of initial budget over mean task cost $\bar{w}_1$, for different values of $\mu$. Results suggest that the rate of aborted tasks is barely influenced by the value of $\mu$. If the initial budget is close to the mean task cost, the rate of aborted tasks can be as high as 50%, and the optimal stopping rule is less effective. If the initial budget is more than 50 times the mean task cost, the rate of aborted tasks reaches 0.
Given that $\mu$ barely affects the rate of aborted tasks (Fig. 4 (left)), we examine for the case |S| = 2 the influence of the ratio between initial budget and mean task cost for each priority level. We assume there exist infinitely many tasks of priority levels 1 and 2. The goal is to determine whether it is better to select another task of priority level 1 or 2 to perform, or to stop. Figure 4 (right) suggests that, regardless of the relation between $\bar{w}_1$ and $\bar{w}_2$, the rate of aborted tasks is more influenced by the ratio of initial budget over $\bar{w}_2$. This is intuitive since tasks of priority level 2 are prioritized over tasks of priority level 1. Since more s = 2 tasks are performed when possible, their mean cost has a larger influence on the rate of aborted tasks. Thus, when higher priority tasks exist, the initial budget can be set by considering the expected cost of high priority tasks, as the expected cost of low priority tasks does not have much impact when energy is sufficient. The proposed method is more suitable when the mean cost of tasks is small compared to the initial (resource) budget.

B. Testing with Real-world Field Data
The real-world datasets used here contain soil moisture values collected in a commercial vineyard located in central California. The structure of the vineyard imposes motion constraints on ground robots moving therein. Irrigation lines are attached to metallic wires at about 12 in from the ground and run parallel to the trellis. Thus, to move from one row to another (even if adjacent), the robot must first exit the row from either end (based on its direction), and then reenter at the desired row. Samples were collected on a regular grid with 275 rows and 214 columns uniformly covering the field. Sampling locations were computed offline, and data were collected with a Campbell H2SP soil moisture sensor.
Suppose autonomous ground mobile robots are deployed to water plants in the vineyard. Vertices with moisture values less than a desired level are considered to contain a task (of precise watering). The ground truth task cost at any vertex is the difference between the collected moisture value and the desired level. An example is shown in Fig. 5. The robots' decisions are constrained by the resource budget of total water carrying capacity, and the energy budget to move between locations. All tasks are considered to have the same priority level, i.e. |S| = 1. The location of tasks and the mean cost of all tasks (averaged real costs of all tasks using ground truth) are available to the robot(s) prior to departure, whereas the actual cost for each task is unknown to the robot(s) before task completion. Without loss of generality, we consider the movement cost on edges to be 1 (all water [...]
Fig. 5. [...] (Right) Sample ground truth cost (i.e. the moisture difference between collected moisture values and a desired level) for one of the field experimental datasets used here. Low-moisture (dry) locations are indicated by higher differences. In these areas, more water (the resource in this case) needs to be consumed to reach a desired moisture level, which is equivalent to a higher cost. Discretely-sampled values were mapped to a continuous contour illustrated here using the kriging interpolation algorithm. (However, we use the discrete values directly on the SAG representation of the environment utilized by our algorithm.)
Table II shows results for 10 real-world datasets in 1- and 5-robot scenarios. The 10 datasets were collected during a timespan of 5 months, hence task locations and costs differ among datasets. Results suggest that, in all cases, our proposed NBA-P algorithm achieves higher r/v ratio, lower w/v ratio, and fewer total visited vertices than N-LM.
Even though I-LM and S-GPR achieve similar r/v ratio and total visited vertices as NBA-P, their w/v ratio is much higher compared to NBA-P. I-LM attempts a task only if the expected task cost is less than the remaining budget, yet it fails to consider the uncertainty in actual resource consumption, which can be much higher than the expected value. In addition, a high actual cost may cause multiple failed attempts at the same position. Thus, the total waste of resources can be significant, as evidenced by the average total waste over 10 datasets for 1-robot cases being 5, 1725, 671 and 1796 for NBA-P, N-LM, I-LM and S-GPR, respectively.
That is, NBA-P is able to handle uncertain task costs, and requires fewer total resources to complete the same number of tasks compared to other methods. Importantly, no tasks are aborted in 13 out of 20 cases using NBA-P, where each case contains up to 60000 tasks. For datasets 3, 8, and 9, N-LM in fact visits three times more vertices than the proposed method. The total path lengths for the evaluated methods differ by around 2%, and NBA-P achieves the shortest path for datasets 6 and 10. That is, NBA-P plans paths of similar length as lawnmower methods.
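The weakness of an expected-cost test can be quantified when costs are exponentially distributed, as assumed in Section III (the field data need not follow this distribution exactly, so this is only an illustration). For an exponential cost with mean $\bar{w}$, the probability that the cost exceeds a remaining budget $p$ is $e^{-p/\bar{w}}$. The function name and the numbers below are ours.

```python
import math

def abort_probability(remaining_budget, mean_cost):
    """P(cost > remaining_budget) for an exponential cost with the given mean."""
    return math.exp(-remaining_budget / mean_cost)

at_threshold = abort_probability(2.0, 2.0)     # budget equal to the mean cost
comfortable = abort_probability(20.0, 2.0)     # budget ten times the mean cost
```

A rule that attempts a task as soon as the remaining budget exceeds the mean cost therefore accepts an abort probability of up to about 37% (that is, $e^{-1}$) on its last attempts, whereas the probability is negligible when the budget is many times the mean cost.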
In all, testing with experimental data validates the efficacy and efficiency of our proposed method, and provides preliminary evidence that it can scale both in terms of the size of the environment and the number of robots in the team.

VI. CONCLUSIONS
Contributions and Key Findings: The paper contributes to stochastic task allocation in precision agriculture. Given resource and energy budgets, our NBA-P algorithm returns the best action on a stochastic aisle graph (SAG) by simultaneously determining optimal sampling locations and optimal stopping times to return to a base station, all at run-time. The proposed algorithm is tested using both simulated data for a 2-robot scenario and agricultural field experimental datasets for 1- and 5-robot scenarios. Results suggest that, by applying NBA-P, all tasks are completed, and tasks with high priority levels are prioritized when possible. The rate of aborted tasks is minimal when the initial resource budget is more than 50 times the mean task cost. NBA-P outperforms the N-LM, I-LM and S-GPR methods in all simulated and real-world datasets, in terms of more gain per vertex visited, fewer tasks being aborted, and fewer total visited vertices to complete the same number of tasks. In testing with real-world datasets, our method has no tasks aborted in 13 out of 20 cases with up to 60000 tasks in each case. Further, N-LM visits up to 3 times more vertices than NBA-P to complete the same number of tasks, which leads to a significant waste of resources.
Directions for Future Work: In its current form, NBA-P is not suitable for scenarios where the task and movement costs are correlated. Further, the overall paths using NBA-P can be longer than those derived via the lawnmower method, especially when multiple priority levels exist and tasks of different priority levels are intertwined. Future directions of research include 1) application of the proposed algorithm to physical robots in the field, and 2) study of the scenario that considers correlated costs for movement and task execution.