An Improved Tabu Search Algorithm for the Stochastic Vehicle Routing Problem With Soft Time Windows

In real-world environments, vehicle travel and service time will be affected by unpredictable factors and present a random state. Because of this situation, this article proposes the vehicle routing problem with soft time windows and stochastic travel and service time (SVRP-STW). The probability distribution of vehicle travel and service time are introduced into the model, and a stochastic programming model with modification is established to minimize the distribution cost. An Improved Tabu Search algorithm (I-TS) based on greedy algorithm is proposed, in which adaptive tabu length and neighborhood structure are introduced; the greedy algorithm is used instead of the random methods to generate the initial solution. Experiments on different scale instances prove the effectiveness and superiority of the proposed algorithm.


I. INTRODUCTION
With the rapid development of the distribution industry, the expanding scale of logistics distribution, and the enhancing complexity of urban transportation networks, some phenomena occur frequently, e.g., unreasonable distribution route planning, empty distribution vehicles, and unreasonable order allocation. These phenomena have led to the increasing cost of distribution, which seriously affects the operation and development of enterprises. The vehicle routing problem (VRP) is a hot issue in the field of operations research, which refers to that the distribution center plans the optimal path according to the distribution and demand of customers. The VRP can effectively reduce the cost of distribution and overcome the unreasonable phenomenon in the process of distribution. Therefore, the vehicle routing problem has been widely concerned by scholars.
In the actual distribution process, traffic accidents, weather changes, road congestion, and other factors will affect the original distribution plan, so it is more practical to study the stochastic vehicle routing problem (SVRP) [1]. Shahparvari and Abbasi [2] have established excellent mathematical The associate editor coordinating the review of this manuscript and approving it for publication was Maurice J. Khabbaz . models for SVRP. The SVRP can be divided into two situations: (i) vehicle routing problem with stochastic demand (VRPSD) [3], [4], (ii) vehicle routing problem with stochastic time (VRPST) [5], [6].
This article investigates an extended situation of the VRPST problem, in which vehicle travel and service time are stochastic, and the vehicle is required to deliver the goods within the specified time to the customer [7]. This problem is usually termed the vehicle routing problem with time windows and stochastic travel and service time (SVRPTW). Time windows define the earliest and latest time that a vehicle may arrive at a customer and start service. In this article soft time windows are considered, i.e., the vehicle can arrive before the time window is opened or after the time window is closed, but it shall be given certain punishment. The SVRPTW problem involves a fleet of homogeneous vehicles stationed at a depot to serve different geographically scattered customers. In this process, the vehicle travel and service time are uncertain and cannot violate the vehicle capacity constraints.
The SVRPTW is an NP-hard combinatorial optimization problem. Exact algorithms can only be used to find solutions for small-and-medium scale instances. For large-scale instances, the exact algorithms cannot find solutions within an acceptable time [8]. Therefore, scholars and experts begin VOLUME 8, 2020 This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/ to consider generating high-quality solutions in reasonable computational time and shift their research efforts to the application of heuristic or meta-heuristic approaches. Heuristic and meta-heuristic methods for VRP and its variants have emerged in recent years [9], [10]. Tabu search algorithm (TS) is a meta-heuristic algorithm that has strong local development ability and fast convergence speed. Many combinatorial optimization problems can be solved efficiently by TS. It has been used to solve the VRP and its variants, e.g., VRP [11], VRPMTW [12], HFVRP [13], ConVRP [14] and MCVRP [15]. In this context, a new stochastic programming model is proposed considering the stochastic vehicle travel and service time, and an improved meta-heuristic algorithm is proposed considering the high complexity of the SVRPTW problem. The main contributions of this article are: 1. The introduction of the vehicle routing problem with soft time windows and stochastic travel and service times (SVRP-STW).

A new mathematical model is established that better
reflects the randomness of travel and service times. 3. The design and implementation of improved tabu search algorithm which includes specific components such as the generation of the initial solution, the setting of tabu length, and the setting of neighborhood structure. The remainder of this article is organized as follows. Section II presents a literature review on the VRPST, VRPTW, and SVRPTW. Section III formally defines the SVRP-STW to be considered and develops a mathematical model. Section IV describes in detail the proposed I-TS and how to solve the SVRP-STW. Section V compares the proposed I-TS with other algorithms in the literature in various scale instances. Finally, conclusions are drawn in Section VI.

II. LITERATURE REVIEW A. VRPST
For the VRPST, vehicle travel time or service time is a random variable subject to a probability distribution. VRPST is closer to real conditions and has attracted more and more attention.
Kuo et al. [16] develop a mathematical model for the VRPST problem using fuzzy theory, which fully considers the randomness of time. An improved fuzzy Ant Colony System (ACS) is proposed, which embeds a cluster insertion algorithm into the ACS algorithm. The results show that the proposed algorithm performs better than the other algorithms. Shi et al. [17] develop a stochastic programming model with recourse (SPR) for the VRPST problem. A Hybrid Genetic Algorithm (HGA) and stochastic simulation method are proposed.
For the vehicle routing problem with stochastic service and travel time, Gutierrez et al. [18] propose a Multi-Population Memetic Algorithm (MPMA). The algorithm uses a lognormal approximation to estimate the arrival time, taking into account that the failure of previous customers to improve the estimation of the mean and variance of the arrival time. The results show that MPMA is superior to different methods in different target types, thus demonstrating its efficiency and flexibility.

B. VRPTW
In the actual distribution process, customers will specify the delivery time of the goods, which is called the vehicle routing problem with time window (VRPTW). The VRPTW has effectively improved the level of vehicle service and attracts more research attention.
Marinakis et al. [19] study the vehicle routing problem with time windows and a new variant of the Particle Swarm Optimization (PSO) algorithm is proposed. Three different adaptive strategies are used in the proposed Multi-Adaptive Particle Swarm Optimization (MAPSO) algorithm. The first adaptive strategy concerns the use of a Greedy Randomized Adaptive Search Procedure (GRASP) and the second adaptive strategy concerns the use of the Adaptive Combinatorial Neighborhood Topology. The algorithm is compared with other versions of PSO and has better results.
Several heuristics algorithms have been proposed to solve the VRPTW problem. Brandao [20] constructs an Iterated Local Search (ILS) algorithm for solving this problem and do numerical experiments with their instances. Molina et al. [21] propose an Ant Colony System (ACS) algorithm with local search for solving this problem. Perez-Rodriguez and Hernandez-Aguirre [22] propose an estimation of distribution algorithm for the problem. The algorithm uses the generalized Mallows distribution as a probability model to describe the distribution of the solution space. Experimental results show that the algorithm is competitive.

C. SVRPTW
The vehicle routing problem with time windows and stochastic travel and service time (SVRPTW) is directly related to our questions. In recent years, it has begun to attract the attention of scholars.
Miranda and Conceicao [23] study the vehicle routing problem with hard time windows and stochastic travel and service time. They analyze that the arrival time distribution would be truncated at the earliest time windows, thus affecting the estimation of the arrival time of subsequent customers. To solve this problem, a specific statistical method is proposed to obtain the cumulative probability distribution of the vehicles over the customers, and a meta-heuristic algorithm based on Iterative Local Search (ILS) is proposed. A benchmark is used to evidence the superior performance and accuracy of the proposed method.
Several heuristics algorithms have been proposed to solve the SVRPTW problem. Tas et al. [24] propose a Tabu Search (TS) and an Adaptive Large Neighborhood Search (ALNS) for solving this problem and do numerical experiments for well-known problem instances. Gutierrez et al. [25] propose a Multi Population Memetic Algorithm (MPMA) for solving this problem and do numerical experiments on modified instances proposed by Solomon [26] for the deterministic counterpart. Miranda et al. [27] propose a Multi-objective Memetic Algorithm (MMA) and a Multi-objective Iterated Local Search (MILS) for solving this problem. Experiments based on an adapted version of the 56 Solomon instances demonstrate the effectiveness of the proposed algorithms.
Although a large number of literatures have been published on VRPST and VRPTW problems, few types of researches have attempted to solve SVRPTW problems, and there is a lack of efficient solutions. Besides, this article differs from the existing work in the following aspects: (1) the vehicle travel time and service time are set as random variables subject to normal distribution at the same time, soft time window constraints are adopted, and a stochastic programming model with modification is proposed; (2) an improved tabu search algorithm is proposed.

III. PROBLEM FORMULATION
In the SVRP-STW, vehicle travel and service time are unknown and the customers will specify the vehicle arrival time. The problem can be described as the depot planning the optimal distribution path according to the stochastic distribution of customers.
Let G = (N , A) be a complete digraph, where N = {0, 1, 2, . . . , n} is a set of nodes and A = {(i, j): i, j ∈ N , i = j} is a set of arcs. Each arc {i, j} ∈ A has an associated distance d ij , the travel time t ij , and the travel cost c ij . Vertex 0 represents the depot, V = {1, 2, . . . , n} is a set of customers. Each customer i ∈ V has a non-negative demand q i , service time δ i , and a time window [e i , l i ], where e i is the start of the time window and l i is the end of the time window. If the vehicle arrives at customer i before e i or after l i , it needs to be punished. K = {1, 2, . . . , m} is a set of vehicles, each vehicle k ∈ K has the same capacity Q. Both t ij and δ i are random variables that subject to normal distribution.
Before presenting the mathematical formulations, we define the following notations in Table 1.
Based on these variable definitions, the model for the problem can be described as follows: The objective function (1) is formed by three parts: the vehicle travel cost, the penalty cost for vehicles arriving earlier than the agreed time, and the penalty cost for vehicles arriving later than the agreed time. Where λ w and λ d represent the penalty cost for per unit earliness time and delay time of vehicle k; E(W jk ) and E(D jk ) represent the expected earliness time and the expected delay time at customer j served by vehicle k.
The following are constraints: Eq.8 indicates the time when the vehicle k arrives at customer i, where A ik represents all customer points the vehicle k has passed since it arrived at customer i; B ik represents all customer points the vehicle k has passed before it arrived at customer i. The difference between A ik and B ik is whether it includes the customer point i where the vehicle k is currently located.
Vehicle travel time and service time obey normal distribution. Due to the additivity of normal distribution, vehicle arrival time also obeys normal distribution. The expectation and variance of vehicle arrival time are as follows:

B. ANALYSIS OF THE OBJECTIVE FUNCTION
The last two parts of the objective function are the penalty cost for vehicles arriving earlier than the agreed time and the penalty cost for vehicles arriving later than the agreed time. Among them, the expected value of the vehicle's early arrival and delayed arrival time can be calculated.
As the vehicle travel time, service time, and arrival time are all random variables subject to the normal distribution, the expectation of vehicle early arrival time can be calculated by the following process: Similarly, the expected value of vehicle delay arrival time is as follows:

IV. IMPROVED TABU SEARCH ALGORITHM TO SOLVE THE SVRP-STW
The above research shows that SVRP-STW is an NP-hard problem with high complexity, and it cannot be solved by the exact algorithm. Meta-heuristic algorithms can provide a more effective solution approach. Our algorithm is based on Tabu Search algorithm which has been applied successfully to solve various vehicle routing problems [28]. TS is a random heuristic search algorithm that uses an initial solution as a starting basis for seeking improved solutions by searching for different neighborhoods. Some factors that affect the performance of Tabu Search algorithm, such as the pros and cons of the initial solution, tabu length, etc. In this section, we give brief information about the improved greedy algorithm to generate an initial solution, the selection of tabu length, and the definition of neighborhoods.

A. AN INITIAL SOLUTION
In general, the initial solution can be obtained randomly or generated by heuristic algorithms such as a glowworm swarm optimization algorithm [29] and a genetic algorithm [30].
A better initial solution can help the algorithm search for the global optimal solution quickly. Here, the initial vehicle routes are constructed by using an improved greedy algorithm. A detailed description of the algorithm can be obtained from Shenmaier [31]. The Greedy algorithm proposed in this article is a simple and efficient insertion procedure that takes into account travel distance, residual capacity, and time window constraint. In short, the initial vehicle routes construction process can be summarized as follows. With the depot as the initial node, in each iteration, the customer with a narrow time window and close to the current node is preferred and inserted at the best feasible insertion place in the current route. When a route cannot admit any new customer due to capacity constraints, a new route is constructed. This is repeated until all customers are visited.

B. TABU LENGTH
To avoid falling into local optimum in the search process, some movements will be recorded in the tabu list. The recorded movements are called tabu object, and the tabu length refers to the time when the tabu object is forbidden. The tabu length decreases with the increase of the number of algorithm iterations. When the tabu length is zero, the recorded movement will be deleted from the tabu list.
Tabu length is a key factor in the performance of tabu search algorithm because it directly affects the search process and behavior. On the one hand, the algorithm is easy to fall into the local optimal solution with a smaller value of tabu length. On the other hand, the search time of the algorithm will become longer and the algorithm may eventually crash when the value of tabu length is set larger. Therefore, an adaptive tabu length is proposed in this article to improve the performance of tabu search algorithm. Specifically, tabu length is randomly selected from the tabu length interval at the beginning of each iteration. If the better solution is searched in the search process, the value of tabu length will be increased. Otherwise, the value of tabu length will be reduced. In this article, the left and right boundaries of the tabu length interval are determined by experimental comparison. Six instances were randomly selected from Solomon's VRPTW benchmark and the experiment was conducted on the different values of tabu lengths. These results are summarized in TABLE 3. It is worth noting that the OS represents the optimal solution and the AS represents the average solution.
Based on the results presented in TABLE 3, when the tabu length is between 30 and 50, the performance of tabu search algorithm is optimal. Therefore, the left and right boundary of the tabu length interval are set to 30 and 50, respectively.

C. NEIGHBORHOOD STRUCTURE
The neighborhood structure plays a key role in the performance of tabu search algorithm, since it determines the extent and the quality of the explored solution space. There are several general neighborhood structures, e.g., 2-opt, 3-opt, swap, cross, relocate and reverse, which have been widely applied to solve the combinatorial optimization problems. The improved tabu search algorithm proposed in this article comprises of four neighborhood moves: one specific neighborhood and three common neighborhoods (2-opt, swap and reallocate). At the beginning of each iteration, one of the four neighborhoods is randomly selected to explore solutions. If the improved tabu search algorithm stops searching in the neighborhood, it will continue to explore the next neighborhood. This is repeated until all neighborhoods are explored. The details of the neighborhood movements are as follows: Specific neighborhood: Randomly select customer i to delete it from current path l and insert it into another path k. The path k must satisfy the capacity constraint and the customer i satisfies the time window constraint after inserting the path k, otherwise it will be inserted into the path k + 1.
2-opt: Randomly select two customers in the path and flip the path between the two customers, which is indicated by FIGURE 1.
Swap-operation: Randomly select two customer points on the path and exchange positions, which is indicated by FIGURE 2.
Reallocate-operation: Randomly select customers in the path and insert them anywhere in the other path, which is indicated by FIGURE 3.

D. IMPROVED TABU SEARCH ALGORITHM
Tabu search algorithm (TS) is a modern heuristic algorithm that has flexible memory ability and aspiration criterion, so it is not easy to fall into the local optimum in the search process. It has been successfully applied to solve various complex combinatorial optimization problems [32]. Generally, TS uses a randomly generated initial solution as a starting basis for seeking improved solutions by searching for a certain neighborhood. It is improved with parallel computation and a combined neighborhood approach by Kiziloz and Dokeroglu [32]. In this article, the greedy algorithm is proposed to replace random methods to generate the initial solution, and adaptively adjust the tabu length and the type of neighborhood structure. The specific process of the improved tabu search algorithm proposed in this article is shown in FIGURE 4: VOLUME 8, 2020

V. EXPERIMENTS AND RESULTS
In this section, we demonstrate the effectiveness of proposed algorithm by testing it using well-known benchmark data sets. This section is divided into four sub-sections with different experiments. Section 5.1 describes the influence of variance changes of stochastic travel time and service time on the results of the algorithm. Section 5.

A. DISCUSSION ON THE VARIANCE OF STOCHASTIC TRAVEL TIME AND SERVICE TIME
Six types of VRPTW examples (C1 type, C2 type, R1 type, R2 type, RC1 type and RC2 type) of Solomon were used to test I-TS. In the process of applying the improved tabu search algorithm, the variance of vehicle travel time and service time was set as 10%, 20%, 30% and 40% of the mean respectively, and a comparative analysis was carried out. The algorithm was run independently for 5 times in each case, and the optimal solution and the average solution were compared to analyze the influence of randomness on the results. The experimental results are shown in FIGURE 5. FIGURE 5 shows the optimal solution and average solution of each instance at different variance levels. For different instances, the greater the variance, the stronger the randomness of vehicle travel time and service time, and the worse the optimal solution and average solution obtained by the algorithm. Based on the results presented in FIGURE 5, the effect of the algorithm at different variance levels may be summarized as follows: • For most instances, when the variance is set to 20%, 30% and 40% of the mean value, the optimal solution and average solution obtained by the algorithm are worse than the optimal solution and average solution obtained by the algorithm when the variance is set to 10% of the mean value.
In conclusion, the variance is taken to be 10% of the mean in all experiments.

B. RESULTS FOR THE 25-CUSTOMER INSTANCES
Based on Solomon's VRPTW benchmark (56 instances in total), select the first 25 customers to construct a new benchmark. Under this benchmark, the Genetic Algorithm (GA) [33], the Simulated Annealing algorithm (SA) [34], the Ant Colony Optimization algorithm (ACO) [35] and the proposed I-TS were compared. These results are summarized in TABLE 5.

C. RESULTS FOR THE 50-CUSTOMER INSTANCES
For the 50-customer instances of the Solomon's VRPTW benchmark, each instance is formed by selecting the first 50 customers in the corresponding instance. Under these instances, the Ant Colony Optimization algorithm (ACO) [35], the Simulated Annealing algorithm (SA) [34], the Tabu Search algorithm (TS) [36] and the proposed I-TS were compared. These results are summarized in TABLE 6.

D. RESULTS FOR THE 100-CUSTOMER INSTANCES
In the Solomon's VRPTW benchmark, there are totally 56 instances which can be divided into six categories: C1 type, C2 type, R1 type, R2 type, RC1 type and RC2 type. Among the C1, R1 and RC1 type instances, the vehicle capacity is small; among the C2, R2 and RC2 type instances, the vehicle capacity is large. Under the benchmark, the Ant Colony Optimization algorithm (ACO) [35], the Tabu Search algorithm (TS) [36], the Simulated Annealing As can be seen, the best solutions obtained by I-TS that are better than the best results obtained by all comparison algorithms for instances R107. 25 TABLE 5.  From TABLE 6, we can see that the best solutions are obtained by I-TS are better than the best solutions obtained by other three comparison algorithms in 10 out of the 18 tested instances. And the average solutions are obtained by I-TS are better than the average solutions obtained by other three comparison algorithms in 12 out of the 18 tested instances. As is shown in TABLE 7, the performance of I-TS is better than ACO, TS, and SA in 14 out of the 20 tested instances. In general, the results obtained by the proposed I-TS method are good in terms of solution quality, robustness, and computational time, when compared with the state-of-the-art algorithms from literatures in different scale instances. The results demonstrate the effectiveness and superiority of the proposed algorithm.

VI. CONCLUDING REMARKS AND FURTHER RESEARCH
This article approaches the vehicle routing problem with soft time windows and stochastic travel and service time. Since vehicle travel and service time are random variables, a stochastic programming model with modification is established. The tabu search is improved according to the characteristics of the problem. The improved greedy algorithm is used to generate the initial solution, and an adaptive tabu length and neighborhood structure are designed. Meanwhile, the influence of the variance of random variables such as vehicle travel time on the objective function is analyzed through experiments. The greater the variance, the greater the randomness of the problem, and the larger the objective function.
In the process of generating the initial solution, some constraints are considered, e.g., capacity constraint and soft time window constraint. Therefore, the initial solution obtained by the proposed algorithm has a low penalty cost for violating the constraints, which plays a key role in generating the optimal solution by the algorithm.
Most algorithms first determine the vehicle travel and service time in the process of searching for the optimal solution, which will generate a large correction cost. The proposed algorithm in this article fully considers the randomness of vehicle travel and service time, and obtains the expected earliness time and the expected delay time according to its probability density function and distribution function. When the vehicle travel and service time change randomly, the proposed algorithm has a low probability of generating a large correction cost. Based on the above analysis, we can see that the proposed algorithm has a significant effect on solving the vehicle routing problem with soft time windows and stochastic travel and service time.
To demonstrate the effectiveness of the I-TS algorithm, an experimental study has been carried out in 25-customer instances, 50-customer instances, and 100-customer instances respectively from Solomon's VRPTW benchmark and compared to other meta-heuristics algorithms. Experimental results show the effectiveness of the proposed algorithm for solving the SVRP-STW problem, and the proposed algorithm has strong optimization ability and high robustness.
Future researches mainly focus on three aspects: (1) introducing stochastic demand into the problem; (2) making the variables in the problem subject to probability distributions such as gamma distribution, exponential distribution, and logarithmic normal distribution; (3) designing more efficient algorithms, such as constructing multi-objective optimization algorithms.