A Vehicle Routing Problem with Option for Outsourcing and Time-Dependent Travel Time

This paper studies the time-dependent vehicle routing problem with private fleet and common carriers (TDVRPPC), which provides an option to outsource the customer requests and considers real-world time-dependent travel times. The problem is commonly seen in the transportation and logistics industries as it considers the impact of changing traffic conditions on travel times, maximum working hour regulations as well as vehicle capacity constraints. The time-dependent travel time is modeled as a piecewise linear function, based on which a mixed integer programming model is proposed for the TDVRPPC. To solve this NP-hard problem, we customize a hybrid algorithm to harness an adaptive large neighborhood search algorithm for exploration and a tabu search for the exploitation of the search. Through constraint relaxation, dynamic and coordinated adjustment of diversification and intensification strengths of the hybridized procedures, as well as an effective segment-based evaluation method, the proposed algorithm performs well on newly generated test instances for the TDVRPPC and on benchmark instances for the simplified vehicle routing problem with private fleet and common carriers (VRPPC).


I. INTRODUCTION
This paper investigates a goods distribution problem motivated by the challenges encountered by a supermarket operator in Singapore. The supermarket operator owns more than 160 outlets across the island-city and handles numerous delivery requests to the outlets each day. It adopts a mixed strategy to both dispatch self-owned trucks and outsource to third-party logistic (3PL) providers to meet the delivery requests of the outlets. The operator cannot neglect the impact of varied traffic conditions on the road throughout the day, which greatly undermines the decisions of outsourcing and route planning.This paper models the problem as the time-dependent vehicle routing problem with private fleet and common carriers (TDVRPPC), a generalization of the vehicle routing problem with private fleet and common carriers (VRPPC) problem [1]. In the VRPPC, a set of customers must be served by either third-party logistic (3PL) providers (common carriers) or the dispatchers' owned fleet (private fleet), subject to capacity constraints of the private fleet. The objective is to minimize the total cost, including both fixed and variable costs of the private fleet, and outsourcing cost charged by the 3PLs. Due to the high cost of owning and managing a private fleet and seasonable and/or unpredictable demands, outsourcing has become widely common in the transportation and logistics industries. The VRPPC and its variants have been investigated in the past decade [2]- [6]. However, existing research that considers the impact of realworld traffic conditions on the solution and algorithms for the problem appears to be minimal.
Typical vehicle routing problem (VRP) research implicitly assumes that travel times on the road are time-invariant [7], which is not close to realistic situations. Hence, the time-dependent VRP (TDVRP) has attracted much attentions in the academic research community in recent years. [8] proposed a mixed integer programming (MIP) model and several heuristics for the TDVRP for the first time with a stepwise travel time model, which could potentially produce solutions violating the first-in-first-out (FIFO) property. The FIFO property is enforced when [9] modelled the travel speed as a stepwise function with a piecewise linear time-dependent travel time function. Most subsequent research investigates TDVRP and its variants with the proposed travel time model, such as under a congestion charge scheme and a speed-dependent fuel cost [10], determination of best paths between any pair of nodes [11], reduction of fuel consumption and carbon emission [12], [13], and deployment of electric vehicles [14]. Various heuristic algorithms are developed to solve related TDVRP, such as ant colony system [12], [15], [16], adaptive large neighborhood search [14], and hybrid algorithms [17], [18]. [17] studied the multi-trip timedependent vehicle routing problem with time windows (MT-TDVRPTW) and extended the efficient segment-based evaluation method in [19] with time-dependent travel times, which makes use of forward and backward propagation to generate the necessary breakpoints as speedup actions. Due to a lack of real-world time-dependent data in the research community, most of the literature relies on synthetic travel time functions where the computational campaign relies on time-dependent graphs [20]- [23]. Describing the time-dependent graphs considering less than ten speeds per time zone is generally used, because of the limitation of computational memory [24]- [26].
Though the VRPPC problem considers the capacity constraint of vehicles to minimize total cost, it ignores the working hours of the drivers. This is impractical when we consider real-world legal regulations on working hours and the varying travel times required on the road throughout the day. Specifically, [24] analyzed the working hours include driving time and service time under time-dependent travel times in the urban delivery model. The results show that working hours heavily affect road safety and routes efficiency. The TDVRPPC problem is hence an important research topic with great impact when deployed in the real world to minimize operational costs while adhering to both capacity and maximum route duration constraints.
As a generalization of both the TDVRP and the VRPPC, the TDVRPPC is NP-hard and requires planned problem modeling and specific algorithmic design. Various heuristic and meta-heuristic algorithms have been contemplated to study the VRPPC, the TDVRP and other VRP variants, for example tabu search (TS) [2], [27]- [32], variable neighborhood search (VNS) [3], [33], [34], ant colony optimization [16], and adaptive large neighbourhood search (ALNS) [17], [35]- [41]. Among them, the ALNS, which utilizes a set of destroying heuristics to remove part of the incumbent solution and recreates a new solution with another set of repair heuristics in an adaptive way based on the historical performances, has been shown to be effective in solving various problems. On the other hand, TS is a deterministic search algorithm that can flee from local optima efficiently by taking the best non-tabu move. Recently, [18] hybridized ALNS and TS (ALNS-TS) to solve the duration-minimizing time-dependent vehicle routing problem with time window (DM-TDVRPTW), in which the TS searches both feasible and infeasible solution spaces efficiently to identify local optima in a small solution region, while the ALNS performs in-tentional perturbation to the incumbent solution for adaptive exploration. We adopt the same algorithm framework in this study and tailor various algorithmic components based on the unique features of the TDVRPPC: (1) design removal operators and regret insert operation to handle outsourcing options; (2) relax both the maximum route duration constraint and the capacity constraints in the TS procedure to accept infeasible solutions with a penalty function; (3) propose neighbourhood structures with both private fleets and the common carrier; (4) extend the efficient segment-based evaluation method [17], [18] to the relaxed TDVRPPC (r-TDVRPPC) problem to speed up the feasibility checking. Specifically, capacity and maximum route duration constraints can be violated in the r-TDVRPPC and a forward and backward propagation algorithm is used to process the time-dependent ready time function and duration function.
Our main contributions are summarized as below: 1) we offer a formal description and a mixed integer programming (MIP) model for the TDVRPPC; 2) we tailor an effective hybrid algorithm that adjusts its behavior adaptively and dynamically to solve the problem; 3) we devise an efficient evaluation method for a vehicle route under the r-TDVRPPC, and 4) we derive benchmark test instances for the TDVRPPC and present computational studies that illustrate the performance of the hybrid algorithm.
The remainder of this paper is organized as follows. In Section II we introduce the problem with an arc-flow model. We then describe the ALNS procedure and the TS procedure in Section III and IV respectively. The Section V details the acceleration technique for route evaluation with the segmentbased method for the r-TDVRPPC. Lastly, computational results and analysis are given in Section VI before the conclusion in Section VII.

II. PROBLEM DESCRIPTION
The TDVRPPC problem is defined over a directed graph G = (V, A) with node set V and arc set A. V = {0} ∪ V c , where 0 represents the depot, and V c = {1, 2, ..., n} is the customer set. A is defined as Each customer i has a demand q i and must be served by exactly one vehicle from the private fleet or by the common carrier. The homogeneous private fleet is denoted by K, each of which has a maximum capacity Q. The working hour is from E to L and a private vehicle may be activated to start from the depot, serve some customers, and return to the depot within the working hour. Without loss of generality, E is set to 0 in this study. The route's duration, defined as the arrival time at the returning depot minus the departure time at the starting depot, should not exceed the driver's maximum working hour (T max ) as regulated by the government and the company's policy. This is also referred to as the maximum route duration constraint.
Each private vehicle has a fixed cost of f if it is activated. Distance-based travel cost d ij will be incurred when a vehicle from the private fleet travels from node i to node j, while a vehicle from the common carrier incurs a charge of p i if it serves node i. The TDVRPPC problem determines the set of customers to outsource and plans vehicle routes for the private fleet to serve the remaining customers. The problem then aims to minimize the sum of both the fixed and variable costs incurred by the private fleet as well as outsourced costs charged by the common carriers.

A. TIME DEPENDENT TRAVEL TIME FUNCTION
The TDVRPPC adopts the travel time model in [9], which defines the piecewise linear travel time function formally as below. A typical workday ([E, L]) is divided into nonoverlapping time zones T = {1, 2, ..., |T |}, where the mth time zone is [w m−1 , w m ]. The time values of w m , ∀m ∈ {0} ∪ T are called break points. It is assumed that the travel speed is constant within a time zone and will change only at the end of each time zone. The arcs in A are assigned to a speed profile, which share the same travel speeds for each time zone. Let v m i,j be the travel speed along (i, j) during the m-th time zone. The actual time-dependent travel time along (i, j) with departure time t 0 (denoted asτ i,j (t 0 )) depends on the speed profile of (i, j) and d ij , which can be calculated using Algorithm 1. When given a departure time t 0 , Algorithm 1 iteratively calculates the extra time required (line 2 and 6) to complete the remaining journey from i to j (line 4 ) under the travel speed of the current time zone. If the required travel time crosses into the next time zone (line 3), the calculation continues with the travel speed of the next time zone (line 7). Otherwise, the actual travel time required is determined (line 9). We can now use the calculated actual travel times to construct the piecewise linear time-dependent travel time for an arc (i, j) with any departure time t as τ i,j (t). As an example in Figure 2, this function can be fully determined using the break points (w 0 i,j to w 9 i,j ) of the arc time zones T i,j and their associated actual travel times (e.g,τ i,j (w 1 i,j ),τ i,j (w 2 i,j ), and etc, which are calculated with Algorithm 1 ). Note that the function contains two groups of breakpoints: the speed Figure 1 respectively, and the travel time breakpoints {w 1 i,j , w 3 i,j , w 5 i,j , w 7 i,j } representing the time to depart from i so that the arrival times at j is exactly one of the speed breakpoints. The resulted time zones for (i, j) is called the arc time zones T i,j , which is unique to each arc because of the arc's speed profile and d ij . We denote the m-th arc time zone for (i, j) as T m i,j . For ease of calculation, the slope (θ m i,j ) and the intersection with y-axis (η m i,j ) of the m-th line segment (representing the m-th arc time zone) can be pre-calculated and used to fully represent the piecewise linear travel time function τ i,j (t) as below: where the indicator function 1 T m i,j (t) indicates whether t belongs to T m i,j . Similarly, the backward travel time function τ −1 i,j (t), defined as the travel time required for the vehicle to definitely arrive at node j at time t along arc (i, j), is also a piecewise linear function and can be determined in a similar algorithm as Algorithm 1.
In the rest of this paper, we will use the travel time function τ i,j (t), θ m i,j and η m i,j directly for calculation of the travel time given (i, j) and t instead of invoking Algorithm 1.

B. MATHEMATICAL MODEL
This section presents an MIP model for the TDVRPPC, which incorporates θ m i,j and η m i,j directly in the model instead of using τ i,j (t). To construct the model, we first add node n + 1 to graph G as the returning depot, and thereafter create a new graph The MIP is based on the decision variables as listed below: : binary variable which is 1 if vehicle k travels from node i to node j, and 0 otherwise; • y k i (i ∈ V ′ , k ∈ K): binary variable which is 1 if node i is served by the private vehicle k, and 0 otherwise; • s i (i ∈ V c ): binary variable which is 1 if the demand of node i is served by the common carrier; : binary variable which is 1 if vehicle k travels from node i to node j during time zone T m i,j , and 0 otherwise; : the departure time if vehicle k travel from node i to node j during arc time zone T m i,j , and 0 otherwise; This TDVRPPC problem is formulated as below: x Constraint (2) is the objective function to minimize the total cost, which includes fixed costs and routing costs of the private vehicles, and the costs of outsourcing. Constraints (3) define the relationship between z m,k i,j and x k i,j to ensure that x k i,j is set correctly ∀(i, j) ∈ A ′ , k ∈ K. Constraints (4) define y k i in terms of x k i,j for the customers and is also the flow conservation constraints for the customer nodes. Furthermore, constraints (4) ensure that when a vehicle visits i, the customer must be served by the vehicle. Constraints (5) and (6) define y k i in terms of x k i,j for the departure depot and the returning depot for all the private vehicles. Both will be set to 0 if the private vehicle is not used. Constraints (7) set the right range for t m,k i,j . It forces t m,k i,j to be in the right arc time zone if the k-th vehicle travels from i to j during the m-th time zone; otherwise, t m,k i,j is set to zero. Constraints (8) define the arrival time at a node i by consideration of all possible incoming arcs and the time-dependent travel time. The constraints use pre-computed parameters θ and η directly instead of the travel time function τ , which can be directly entered into any commercial solver, such as CPLEX Solver. Constraints (9) link arrival time at and the departure time from node i together to ensure that a vehicle will depart immediately from the node after arriving at and serving customer i. As no service time is used in the original VRPPC problem, the arrival time at and departure time from a visited customer are the same for the TDVRPPC. Constraints (9) and Constraints (8) together serve as subtour elimination constraints too. Constraints (10) calculate the length of the vehicle routes and ensure that the length does not exceed the maximum route duration allowed (T max ). Constraints (11) ensure capacity constraint for each vehicle used. Constraints (12) guarantee that at most |K| private vehicles can be activated and all activated vehicles must start from and return to the depot. Constraints (13) ensure that each customer is served exactly once either by the private fleet or by the common carrier. Constraints (14), (15), (16), (17), (18) and (19) define the variables of the model.

III. ADAPTIVE EXPLORATION
The adaptive large neighbourhood search with embedded tabu search algorithm (ALNS-TS) is outlined in Algorithm 2 below. Note that the algorithm controls the strength of diversification and intensification adaptively and dynamically by setting the number of customer removals (parameter α) in the ALNS (Section III-A) and the maximum number of steps without improvement allowed (parameter ϵ T S ) in the TS (Section IV-B). When the ALNS-TS fails to achieve a better solution in an iteration, the values of α and ϵ T S are increased gradually to increase the strength of ALNS and TS respectively for the subsequent iterations. A counter C ALN S cni is maintained for this purpose.
The ALNS adopts an adaptive selection of the removal and repair operators, whose main objective is to explore unexplored yet promising solution region through large neigh-Algorithm 2 The ALNS-TS framework 1: Construct an initial solution S with Regret Insert 2: C ALN S cni ← 0, S best ← S 3: while not exceeding time limit do 4: Select removal operator to remove α customers from the S 6: Select repair operator to repair S 7: Update scores and probability for removal and repair operators 15: end while 16: return S best borhood changes. Advanced removal methods based on the Shaw removal principle are designed to select customers either served by the private fleet or the common carrier for removal and repair (Section III-C).

A. ADAPTIVE EXPLORATION STRENGTH
The number of customers to be removed in an ALNS iteration (α) controls the diversification strength. Let the counter C ALN S cni denote the number of consecutive non-improving ALNS iterations. This counter is then used to update α adaptively as below: (20) where α min and α max are the minimum and maximum perturbation rates, and α rate is the unit increase rate. Initially, C ALN S cni is set to 0 and α is simply set to α min × n.

B. ADAPTIVE OPERATOR SELECTION
The removal and repair operators are selected adaptively based on their historical performance with the roulette wheel selection principle [35]. Let O be the set of operators and w o be the weight of an operator o. Then o is selected with probability wo o ′ ∈O w o ′ . We divide the ALNS iterations into phases of ϕ phase consecutive iterations, within which the probability remains constant. w o is re-calculated based on its scores ζ o at the end of the phases as below: wherew o is the weight of o in the current phase, γ is the learning rate to control the speed of reaction, and θ o is the number of times that o is selected during the current phase. ζ o reflects the historical performance of the operator and is updated for each iteration when o is selected as below: 1) ζ o is increased by λ 1 if an overall best solution is found; 2) otherwise, ζ o is increased by λ 2 if the solution is better than the current solution. In the initial phase, all the operators are assigned the same weights and probabilities.

C. REMOVAL OPERATORS
One of the five removal operators is selected to remove α customers for each iteration. Note that at the end of each removal operator, the set of customers served by the common carriers will also be removed so that they will be considered for re-insertion in the repair stage.

1) Worst Removal
A tailored worst removal aims to obtain better solutions by removing customers assigned wrongly in the vehicle routes. It calculates the insertion cost of a customer i as , whereS i is the partial solution generated when removing i from S. It then removes customer i with probability based on ∆C i . Specifically, the probability of selecting a customer i at each round is equal to ∆Ci j∈V (S) ∆Cj , where V (S) represent the list of customers served by private fleet in the current solution S. The introduced randomness tends to remove customers with high insertion costs but avoids repeated selection of the same customers. Note that outsourced customers will not be removed by this operator.

2) Advanced Shaw Removal
A special Shaw removal is designed to increase the possibilities of relocating customers to new positions in the solution by selecting customers based on the distance-based closeness measure and deleting neighboring customers served by the same vehicle route to create a bigger gap. The distancebased closeness measure is set to d ij in this study. Algorithm 3 illustrates the pseudo-code of the operator. Initially, two empty lists are prepared for removed customers: L 1 stores customers selected based on the closeness measure, and L 2 stores customers based on the sequence of visits that are either before or after the removed customers in the vehicle route. The operator removes both L 1 and L 2 customers, but only uses L 1 customers to search for the next customer for removal based on the closeness measure. Note that no customer will be inserted into L 2 if the current customer being removed is served by the common carrier. One initial customer is randomly selected and inserted into L 1 to initiate the removal process. VOLUME 4, 2016 Algorithm 3 Advanced Shaw Removal if L 1 is empty or all L 1 customers are processed then 4: Select i randomly for L 1 .

5:
Randomly pick either the customer before or after i in the vehicle route for L 2 . Randomly pick either the customer before or after the selected customers in the vehicle route for L 2 10: Mark customer i as processed in L 1 11: end while 12: Remove all customers from the common carriers

3) Route-based Shaw Removal
As opposed to the Advanced Shaw Removal operator, this operator differs in selecting initial customers removal as it considers the service efficiency of the vehicle route instead. Formally, we define route efficiency as ϕ(r) = f (r)/ i∈V (r) q i for a route r, where f (r) is the total cost of the route r and V (r) contains all the customers served in the route r. A route r is selected with probability ϕ(r)/ r ′ ∈S ϕ(r ′ ) to initialize the Shaw removal process as in Algorithm 3.

4) Mixed Shaw Removal
To further increase the diversity of the search procedure, we incorporate greater randomness by combining the methods for initial customer selection for the removal of the above two operators. Formally, a route is selected based on its effectiveness and another customer not served by the route is randomly selected for L 1 . This operator then applies the Shaw removal procedure to select the rest of the customers.

5) Random Removal
This operator randomly selects α customers for removal from the current solution.

D. REGRET INSERTION AS REPAIR OPERATORS AND CONSTRUCTION
We treat the common carrier as a possible route for the insertion of customers to extend the Regret Insertion (RI) method for the TDVRPPC, which has been shown to be effective to repair partial solutions for various VRP problems [42], [43]. RI inserts unserved customers with the greatest regret value to the solution one by one. Let the minimum insertion cost of customer i to the r-th vehicle route be ∆C i,r , where r ∈ K and i ∈ V c . ∆C i,r is set to +∞ if i cannot be inserted into the route. Let r i,k ∈ K be a variable that indicates the route for which customer i has the k-th lowest insertion cost. RI inserts i with the highest RV = ∆C i,ri,2 − ∆C i,ri,1 into the r i,1 -th route at the position with the lowest insertion cost. RI always maintains an empty route as a candidate route for the customers whenever possible.The advanced regret-k method adopts a look-ahead strategy and chooses i to maximize: where U ⊆ V c is the set of unserved customers. We create three repair operators by setting k to 1, 2 and 3 respectively. When k = 1, regret-k is equivalent to the greedy insertion method. We also employ the regret-k method to construct initial solutions in this paper.

IV. ADAPTIVE EXPLOITATION
While the ALNS steers the search to unexplored territory in the feasible solution space, the TS organizes an intensive and comprehensive local search to improve the solution generated by the ALNS. Hence, we permit the TS to explore infeasible solution spaces for r-TDVRPPC that may violate maximum duration and/or capacity constraints. Similar to diversification, the strength of intensification is dynamically determined based on the historical performance of the algorithm. The TS is restricted to return only feasible solutions to the ALNS loop since infeasible solutions are only used as a bridge between two otherwise unconnected feasible regions during local search and are not useful for large neighborhood changes in ALNS. The TS procedure for the r-TDVRPPC is shown in Algorithm 4.

Algorithm 4 TS for r-TDVRPPC
Select the best non-tabu move of S and obtain S ′

4:
if S ′ is a feasible TDVRPPC solution then Update tabu list 15: Reset β after every ϕ p iterations 16: S ← S ′ 17: end while 18: return S best

A. ADAPTIVE PENALTY COST
A given solution S for the r-TDVRPPC is evaluated with the modified cost function below: where f (S) is the original cost function of S for the TD-VRPPC, P dur (S) and P cap (S) denote the amount of vi-olation of the maximum route duration and the capacity constraints by S respectively, while β dur and β cap indicate the adaptive penalty factors. Both β dur and β cap are handled in a similar way and we omit the subscripts for clarity below. β is restricted to [β min , β max ] to avoid trapping the search procedure in the feasible or infeasible regions for too long. β is set to β min initially and is adaptively updated based on the feasibility of the generated solution. If S is a feasible TDVRPPC solution, β is reduced to β/β rate to encourage exploitation of the infeasible solution space. Otherwise, we increase β to β × β rate to direct the search back to a feasible solution space.

B. ADAPTIVE EXPLOITATION STRENGTH
The strength of the TS depends on the termination criterion.
Hence we confine the maximum number of consecutive nonimproved steps allowed for the TS algorithm (ϵ T S ) as below: where ϕ cni is a numerical parameter and C ALN S cni was defined in previous section. After each ALNS iteration, C ALN S cni is updated accordingly and the strength of the TS is determined dynamically based on the historical performance.

C. NEIGHBORHOOD STRUCTURE
Two groups of neighborhood structures are considered for the TS with the traditional relocation and swap operators. The first group is the intra-route operators, which relocate a customer or swap two customers within the same private vehicle. The second group is the inter-route operators, which relocate a customer or swap two customers between two privates vehicles, or between a private vehicle and the common carrier.

D. TABU DURATION AND ASPIRATION
A tabu duration of µ iterations is associated with each arc removed by the TS operators. The tabu status of an arc can be revoked if a new best feasible solution can be obtained by violating the tabu.

V. SPEEDUP TECHNIQUE FOR ROUTE EVALUATION
The computation under time-dependent routing operations can be time-consuming since the TDVRPPC has to consider travel time between each customer node and in each discretized time zone (departure time) [25], [26]. Route evaluation is invoked regularly to assess the feasibility and cost of a route during local search and the repair stage of the ALNS. [19] developed an efficient segment based route evaluation method for traditional VRP problems, where the information about segments of node visits is pre-processed for faster concatenation operation and the segment of customer visits might violate certain constraints of the problem. [18] extended the method in [19] with a penalty cost for violation of time windows for the time-dependent duration function, which is customized for the r-TDVRPPC with a modified cost function f ′ (S) for the violation of both capacity constraints and maximum trip duration constraints below.

A. TIME DEPENDENT DURATION FUNCTION FOR R-TDVRPPC
Let σ = (σ 1 , σ 2 , ...σ L ) be a segment of node visits for a vehicle from the private fleet with σ i ∈ V, ∀1 ≤ i ≤ L. The ready time function δ σ σi (t) is defined as the time when vehicle arrives at and is ready to depart from node σ i if it starts at σ 1 at time t. Mathematically, it can be calculated recursively as The inverse function of the ready time function is denoted as π σ σi (t), which is formally defined as the latest time when the vehicle should start to serve node σ 1 so that node σ i will be ready by time t.
The following proposition is valid for the time-dependent ready time and duration functions under TDVRPPC. Proposition 1. Under the piecewise linear travel time model for the TDVRPPC, 1) δ σ σi (t), π σ σi (t) and ψ σ (t) are piecewise linear functions for any σ and 1 ≤ i ≤ L. 2) δ σ σ L 1 (t), π σ σ1 (t) and ψ σ (t) have the same set of break points; and 3) the minimum duration of σ must be associated with one of the break points for ψ σ (t).
Proof. The proof is based on the following: let f (t) and g(t) be two piecewise linear functions on t, then f (t) + g(t), f (g(t)), and f (t) + C are all piecewise linear functions, where C is a constant number. First, it is easy to see that δ σ σ1 (t) = t is a (piecewise) linear function. So τ σ1,σ2 (δ σ σ1 (t)) is a piecewise linear function on t as the travel time function is also a piecewise linear function. Therefore, δ σ σ2 (t) is a piecewise linear function as the sum of two piecewise linear functions. Similarly, it can be shown by induction that δ σ σi (t) is a piecewise linear function ∀1 ≤ i ≤ L. Consequently, ψ σ (t) is also piecewise linear as the sum of δ σ σ L (t) and −t. The proof for π σ σi (t) is similar to the proof for δ σ σi (t) with the backward travel time function τ −1 i,j (t).
(2) By definition of ψ σ (t), it is easy to see that ψ σ (t) and δ σ σ L 1 (t) share the same set of break points. On the other hand, the departure time break points from σ 1 and the arrival time break points at σ L are the same for both δ σ σ L 1 (t) and π σ σ1 (t) by definition of the π σ σ1 (t).
(3) It can be shown by contradiction. Suppose the minimum duration of σ is associated with a departure time t, which is not one of the break points for ψ σ (t). Let t falls under the line segment (the arc time zone) denoted by [t,t]. Then either ψ σ (t) or ψ σ (t) must have a lower duration than or both have the same duration as t.
Similar to the time-dependent travel time function, these functions can be fully represented by the values of their VOLUME 4, 2016 breakpoints, where the slopes and intersections of the line segments can be computed accordingly. Therefore, the method below will only focus on the breakpoints for storage and processing of δ σ σ L 1 (t), π σ σ1 (t) and ψ σ (t) for any segment σ.

B. SEGMENT INITIALIZATION AND FEASIBILITY
A segment for the r-TDVRPPC stores its cost C(σ), total load Q(σ), the minimum duration M D(σ), the break points set BP S(σ) for the feasible start time window at σ 1 , the associated ready time break points set RT _BP S(σ) and the associated duration set DU R(σ). The segment for a single customer i is initialized with C(σ) = 0, Q(σ) = q i , M D(σ) = C(σ) = 0, BP S(σ) and RT _BP S(σ) contains all break points in T , and DU R(σ) contains |T | 0s. The segment for the departure depot is initialized in the same way except for C(σ), which is set to the fixed cost of a vehicle f . σ is feasible for the TDVRPPC if and only if Q(σ) <= Q and M D(σ) ≤ T max . On the other hand, σ is feasible for the r-TDVRPPC as long as M D(σ) ≤ L − E.

VI. COMPUTATIONAL EXPERIMENTS
In the remainder of this section, we first describe the test instances adopted from the existing VRPPC benchmark instances and the parameter tuning process. After that, we provide the computational results on the TDVRPPC test instances as well as the VRPPC test instances. All programs are coded in Java and run in a single-thread mode on a Ubuntu 18.04.3 LTS server with Intel(R) Xeon(R) Silver 4216 CPU at 2.10 GHz. The MILP model is solved with IBM ILOG CPLEX 12.8.0 [44].

A. TEST INSTANCES
We adopt a total of 34 instances from [45] for the VRPPC, which are divided into two subsets: 14 instances originated from [46] (initialized with "C") and 20 instances proposed by [47] (initialized with "GE"). Note that our test instances contain the same information as the test instances in [45], including customer locations, customer demands, vehicle capacity, fixed cost and variable costs. The customer sizes of the instances may vary from 50 to 480 customers and up to 31 vehicles is allowed for some instances. Find the associated start timet 1 ∈ BP S(σ 1 ) for t 1 at σ 1 1 3: t dur ← t ready −t 1 6: Insertt 1 , t ready and t dur into BP S(σ 1 ⊕ σ 2 ), RT _BP S(σ 1 ⊕ σ 2 ), and DU R(σ 1 ⊕ σ 2 ) respectively 7: end for 8: for t 2 ∈ RT _BP S(σ 2 ) do 9: Find the associated start timet 2 ∈ BP S(σ 2 ) for t 2 at σ 2 1 10: Insert t start , t 2 and t dur into BP S(σ 1 ⊕ σ 2 ), RT _BP S(σ 1 ⊕ σ 2 ), and DU R(σ 1 ⊕ σ 2 ) respectively 14: end for 15: M D(σ 1 ⊕ σ 2 ) = min d∈DU R(σ 1 ⊕σ 2 ) d 16: Determine the parameters of the time-dependent ready time functions for σ 1 ⊕ σ 2 Although TDVRP and its variants have received increased attention from the scientific community in recent years, realworld time-dependent data is still rare. Only big IT players, such as Google, TomTom, AutoNavi(Gaode), etc., have the availability of high-quality historical time-dependent traffic data. As a result, there are no real travel time dataset freely available to the entire research community. To overcome this aspect, most of the literature on Time-Dependent Vehicle Routing problems relies on synthetic travel time functions. As far as synthetic time-dependent data is concerned, we observe that there are (highly-cited) contributions (see for example [9], [48] ) on vehicle routing problems where the computational campaign relies on time-dependent graphs which satisfy the sufficient conditions partially introduced by [49] and further generalized by the path ranking invariance property defined in [22]. Therefore, we derive timedependent travel time data for the adopted instances using the approach in [9]. First, We run a preliminary experiment to solve the VRPPC instances without time-dependent information to determine T max and L for the instances. E is set to 0 without loss of generality in this study. Next, the workday [0, L] is divided into five time zones with two peak hours to simulate the real world traffic conditions. The second and the fourth time zones represent the morning and afternoon rush hours. The third time zone represents the off-peak hours during the day. The first and last time zones represent the beginning and end of the workday with less traffic on the road. We create three-speed profiles as in [9] to denote road segments with fast, medium, and slow traffic speeds. Each arc is then randomly assigned to one of the three speed profiles with different traffic speeds. The detailed speed profiles are shown in Table 1.
For the computational experiments, ALNS-TS runs 10 times with different random seeds per run for each test instance. We standardize the run time allocated for each run based on the customer size: 300 seconds per run for instances with less than or equal to 50 customers, 600 seconds for instances with 51 to 100 customers, 900 seconds for instances with l01 to 200 customers, and 1200 seconds for instances with 201 or more customers. On the other hand, the CPLEX Solver is given a maximum of 2 hours run time for small scale test instances with 15 and 25 customers.

B. PARAMETER TUNING
We tune the parameters of the algorithm with the IRACE package [50], which adopts the iterated racing approach to find the best parameter settings of an optimizer automatically. IRACE defines a configuration as a set of values assigned to the algorithmic parameters. The inputs to the IRACE contains the optimizing algorithm itself, a set of algorithmic parameters, a set of training instances, and a training budget for the maximum number of algorithm runs. The iterated racing in IRACE consists of three iterative steps: (1) creating new configurations based on the current sampling distribution, (2) invoking the optimization algorithm with the sampled configurations and evaluating the configuration performance through racing, and (3) ranking and selecting configurations and updating the sampling distribution to bias towards the best configurations. At the end of the parameter tuning, IRACE returns a list of elite configurations based on performance and the best elite configuration returned is selected in this study. A total of 10 test instances are randomly used as training instances. IRACE is given a budget of 2000 iterations and each iteration was given 300 seconds to execute the ALNS-TS algorithm. The list of the numerical parameters and their best configuration values returned by IRACE can be found in Table 2.

C. EVALUATION OVER SMALL SCALE INSTANCES
In the first experiment, we apply both the ALNS-TS and IBM ILOG CPLEX 12.8.0 [44] to solve small-scale TDVRPPC instances to access the correctness of the algorithm in Table  3 and 4. Test instances used in Table 3 and 4 contain the first 15 customers and first 25 customers respectively from the corresponding TDVRPPC instances. In both tables, we present (1) the upper bound (UB, a.k.a the best feasible integer solution), the lower bound (LB), the gap between UB to LB (G opt , calculated as (U B − LB)/U B)), and the run time in seconds (T run ) for the CPLEX solver; and (2) the best solution cost found (C best ), the time in seconds to find the best solution (T best ), the gap of C best to the UB (G U B , calculated as (C best − U B)/U B), the average solution cost over 10 runs (C avg ), as well as the gap of C avg to the C best (G avg , calculated as (C avg − C best )/C best )) for the ALNS-TS algorithm. Note that the optimality gap threshold is set to 1e − 4 for CPLEX solver, which explains the inconsistencies over the UB and LB of some instances that are solved to optimality by CPLEX solver. For the instances with 15 customers, CPLEX cannot solve two instances to optimal values due to the complexity of the problem while ALNS-TS finds better feasible solutions for one of them (CE-13). Besides, ALNS-TS can obtain the same best solution for the rest of the 33 instances. The G avg measures the consistencies of the algorithm over 10 runs for the same test instance. ALNS-TS performs consistently well as it can find the best solutions in each of the 10 runs for 32 out of the 34 test instances. For the other 2 instance, the G avg is only 0.03% over 10 runs. Besides, ALNS-TS normally requires a much shorter run time to find the best solution compared to CPLEX solver.
The contrast of the performance for both solvers is even bigger for the instances with 25 customers and 3 vehicles. First, CPLEX solver can only solve 4 (G-13 to G-16) of the 34 instances to optimality within 2 hours. The G opt can be as big as 133.2% for CE-13 while the average G opt over the 34 instances is over 30%. These observations prove that the TDVRPPC instances are much more difficult to solve with the increase in customer size and CPLEX solver might not be a good tool for larger scale test instances. On the other VOLUME 4, 2016 hand, ALNS-TS is able to find the optimal solutions for the 4 test instances (G-13 to G-16) in shorter run time too and returns better solutions for all test instances that are not solved optimally by CPLEX solver. The G U B can be very significant, for example −28.08% for CE-12. The average G U B is −13%, namely, ALNS-TS finds better solutions whose solution cost is on average 13% lower than the UB provided by CPLEX solver. Lastly, ALNS-TS's performance is still consistent with the increase in complexity of the test instances. The average G avg is only about 0.28% and ALNS-TS obtains the same best solution for 12 out of the 34 test instances for each of the 10 runs.

D. EXPLORATION VS EXPLOITATION IN ALNS-TS
We design an experiment to illustrate the individual effects of exploration with ALNS and exploitation with TS and the necessity of combining both procedure to solve the TDVRPPC efficiently. First we tailor two variants of the ALNS-TS to solve the TDVRPPC instances, the first is the TS procedure in Section IV, the second is the ALNS procedure described in Section III. The detailed comparisons are shown in Table  5. For each of the three algorithms, the table shows the best solution cost found (C best ), the average best solution cost found over 10 runs (C avg ), and the gap between both (Gap avg ). For the ALNS and the TS procedure, we shows the gap between the C best found by the ALNS-TS and the C best found by the algorithmic variant too (Gap AT ). A few observations can be made from the comparison. First, the performance of TS is close to the performance of the ALNS-TS, with an average GAP AT of 1.9%, which shows that the TS performs relatively well with its compre- hensive exploitation. Second, ALNS has the largest Gap avg and a larger Gap AT . ALNS is designed to jump to unexplored regions in the search process, which explains the large value for Gap avg for solution diversity. Furthermore, each iteration in ALNS destroys the current solution through a large neighbourhood change controlled by parameters α min , α max , and α rate . No guarantee of solution improvement is enforced in the design of ALNS. This explains the larger Gap AT . Lastly, ALNS-TS performs best among the three algorithms as benefits from the exploitation strength of TS with an additional perturbation through ALNS iterations. Furthermore, even though the Gap avg are in general larger than those reported in Table 3 and 4 for the small scale instances, most of the Gap avg are in the range of 1% to 3% with a median value of 2.0%.

E. CONSISTENCY OF SOLUTION COSTS FOUND
We design a box plot to illustrate the consistency of solution costs found by the ALNS-TS for the TDVRPPC test instances. We denote the test instances in Table 3 with 15 customers as "C − 15", the test instances in Table 4 with 25 customers as "C − 25", and the test instances in Table  5 as "T DV RP P C", and compare the results for the three groups. We first determine the standard score (z score) of the solution cost for each of the 10 solutions found by the ALNS-TS algorithm for a test instance. The standard score is formally calculated as (cost − µ)/σ, where cost is the actual solution cost found in the run, µ represents the mean of the costs, and σ represents the standard deviation of the costs for the test instance. Lastly, we group the standard score of a total of 10 × 34 = 340 runs per group together and plot number by instance group in Figure 3 below. The box plot shows that the solution costs are very consistent as most of the values are near to the reference point, especially for C-15 and C-25. Indeed, the standard scores for 322 of the 340 runs for the C-15, 199 runs for the C-50 test instances, and 237 runs for the TDVRPPC test instances falls within 1σ away from the mean value.

F. RESULTS ON TDVRPPC AND VRPPC
The introduction of the time-dependent travel time and the maximum trip duration constraint in the TDVRPPC places extra constraints to the VRPPC solutions. Hence, we apply the ALNS-TS algorithm to solve both the VRPPC and the TDVRPPC instances to evaluate the impact and tightness of the capacity constraint and the maximum trip duration constraint. Besides, it is necessary to evaluate the changes in the cost allocation between internal cost and outsourced cost. Table 6 compares the best solution cost found (C best ), the average percentage of outsourcing cost over total solution cost (R cs ), the average loads of the vehicle routes over Q in terms of percentage (R cap ), and the average route duration of the vehicle routes over T max in terms of percentage (R dur ). Based on the results, the new constraints have forced more customers to be outsourced and in general increased both C best by 1.9% and R cs by 3.1%. Therefore, the TDVRPPC constraints have positive impact on the solution cost. Though R cap is reduced by about 1.1%, R dur is about 5.6% smaller than R cap for the TDVRPPC. This may suggest that either capacity constraints are tighter than maximum route duration constraints, or that the ALNS-TS prioritizes to maximize the vehicle capacity to reduce outsourcing cost and total cost. The next experiment is hence designed to evaluate a scenario when the maximum route duration is tightened.

G. IMPACT OF CAPACITY AND MAXIMUM ROUTE DURATION CONSTRAINTS
We reduce T max by 10% to generate another instance group (TDVRPPC2) for further evaluation of the impact of both constraints. As shown in Table 7, we group the instances by either type CE or G and present the average of R cap , R dur and R cs in percentage values. As shown in the results, the average R dur increased more significantly than the reduction in the average R cap for both groups in TDVRPPC2 instances when T max is reduced by 10% . The results also show that introducing T max and reducing T max increases the average R cs , while means the company is forced to outsource more customers using 3PL services due to tighter constraints on vehicle route duration. This ultimately resulted in a higher average solution cost (Cost) for both groups. However, the average R cap is still higher than the average R dur for both groups, which again suggests that the ALNS-TS may prefer to maximize vehicle capacity utilization than to maximize the route duration to reduce outsourcing costs and total costs.

H. VRPPC BENCHMARK RESULTS
Lastly, we present a comparison on the VRPPC benchmark data with two state-of-the-art algorithms, namely the tabu search algorithm in [2] (TS+) and the AVNS in [3], in Table 8. The TS+ algorithm [2] extends the TS with a neighbourhood structure based on ejection chains. The AVNS [3] develops a group of 51 cyclic-exchange neighborhoods and incorporates an adaptive mechanism to bias the random shaking step. As shown in the table, the ALNS-TS algorithm obtains bestknown solutions for two instances and the average deviation of the best found solution cost from the best solutions is about 2.6%.We conclude that the performances of the ALNS-TS on the VPRPV test instance is acceptable since it is not designed specially to solve the VRPPC problem.

VII. CONCLUSION
In this work, we examine the practical time-dependent travel times on the road and the maximum route duration constraints for the drivers for the TDVRPPC problem. Two popular heuristic algorithms, namely the ALNS and the TS, are hybridized together to minimize the total cost consisting of fixed cost, variable cost, and outsourced cost. The ALNS broadens the exploration through adaptively controlling the neighborhood size and TS dynamically deepens the exploitation while relaxing duration and/or capacity-related constraints. Computational experiments show that our algorithm performs well on the TDVRPPC benchmark data. For further research, we are interested to extend the current TDVRPPC model with richer real-world considerations, such as adding multi-trip per vehicle, customer time window limitation, as well as multi-depot for vehicles. At the same time, new efficient algorithms can be proposed and evaluated on the benchmark data.