Region Division in Logistics Distribution With a Two-Stage Optimization Algorithm

When facing the problem of modern logistics distribution under the large-scale network, the reasonable delivery task allocation has become an important part. For this problem, this paper uses the idea of decentralized control to establish a region division model in logistics distribution and transforms it into the multiple traveling salesmen problem with constraints of the road network and task allocation. The optimization goal of the model is to minimize the sum of the distances for logistics transport agents traveling all demand points in the system, with the constraints on the equilibrium of commodity requirements in each sub region. Then, a two-stage algorithm is proposed to solve the model. In the first stage, the $K$ -means clustering method is used to obtain a highly feasible initial solution; In the second stage, the initial solution is optimized by the algorithm which is combining swap algorithm and tabu search algorithm. In the two-stage solution evaluation, the TSP solution based on the Lin-Kernighan algorithm is applied to obtain the sets of demand points with ranking of the route and the corresponding shortest distances for each divided sub region. Finally, to verify the effectiveness of the proposed model and solving method, two cases are presented in this paper. The region division method in logistics distribution proposed in this paper not only helps to reduce the total distance, but also helps to balance the workload of the logistics transport agents, thereby making great potential to reduce logistics costs and improving the overall operational efficiency of logistics distribution.


I. INTRODUCTION
Logistics distribution planning is a vital part of the modern logistics system. With the continuous development of Chinese economy, the commodity requirements have increased greatly, and demand points for logistics distribution are more and more widely distributed. Thus, we will face the problem of logistics distribution under the large-scale network. In this regard, distribution task of a commodity from distribution centers to demand points is often responsible by several logistics transport agents. Due to the obvious differences in the quantity, geographical location, and traffic connectivity of each demand point, the lack of a reasonable distribution region division may result in a high detour cost. The total distance of services for all demand points can reflect and The associate editor coordinating the review of this manuscript and approving it for publication was Ioannis Schizas . evaluate the rationality of the regional division. Therefore, the scientific and reasonable region division in logistics distribution with the shortest total distance will make great potential to save costs in the logistics distribution process which has important practical significance.
At present, there are three main methods to divide the logistics distribution region. (1) The scanning method mainly regards the area where the customer point is located as a geometric plane, then scans it as a rectangle (grid), a sector or a polar line, and divides the area. Gillett and Miller proposed a sweep algorithm (SWA) [1] 11. Afterward, Na et al. added some extension to the SWA by combining it with the nearest neighbor method [2]. Zulfiqar et al. used SWA to process the road network nodes and generate the initial distribution region division plan [3]. (2) The clustering algorithm divides the whole distribution region into several categories according to the similarity between data sets. Gao et al. proposed a CLARANS-based clustering method constrained by road network and distribution scale for the customer clustering process with the location of the distribution center unknown [4]. (3) The heuristic algorithms are getting more and more attention. Wang et al. established a logistics distribution partition model to minimize the total cost of the network, then proposed an Extended Particle Swarm Optimization and Genetic Algorithm (EPSO-GA) to get an optimal solution [5].
For the convenience to manage the delivery task of each logistics transport agent, in reality, a demand point can only be visited by one logistics transport agent. Therefore, this paper looks upon the region division problem in logistics distribution with the objective of the shortest total distance of services for all demand points as a multi traveling salesman problem (MTSP). The MTSP is an extension of TSP [6], which transforms a single traveler into multiple ones, and often adds some additional constraints conditions to make them more suitable for real-life application, such as the flying sidekick traveling salesman problem considering a drone and truck collaboration [7] and the school bus routing with mixed loads [8]. From the perspective of modeling, the MTSP objective function is divided into single objective and multi objectives. The single objective is usually to minimize the longest tour length or to minimize the total length of all tours [9]. The multi objectives may be the combination of the two single objectives [10], or adding the objectives related to the equilibrium among the numbers of nodes or the tour lengths that travelers visit. The goals of appointing missions in Lu and Yue's research were to reduce the total distance, whereas to minimize the difference between the longest and the shortest tour lengths, and the multi objectives problem was solved by lumping the two claims into a single penalty-like objective function [11]. Trigui et al. provided an analytical hierarchy process-based approach to minimize three objectives: the total traveled distance, the maximum tour, and the deviation rate [12]. For model constraints, the most common one in the existing research is time window [13]. Considered the situation that time windows might be across periods for the same location, the multi periods MTSP with time window constraints was modeled by Yapicioglu [14]. Subramaniam and Gounaris extended the definition of the TSP with time windows to allow a waiting time at each customer location and to incorporate maximum route duration limits [15]. Then, the constraints on a minimal or maximum number of nodes that a traveler must visit are also common [16]- [18]. In addition, some road network-related constraints such as multi depot [19], asymmetric network [20] and so on are also included. In the logistics distribution, the commodity requirements at demand points are often very different. For this reason, some researchers take the quantity of commodity requirements as a known parameter loading into the delivery vehicle capacity constraint, and study the pickup and delivery route planning for vehicles [21], [22]. However, no research has been found to consider the constraints on the equilibrium of commodity requirements in each sub region.
For solving the MTSP, there are extensive researches. Part of the papers can optimize large systems with one phase algorithm which usually transforms the MTSP into the traditional TSP by setting virtual depots. Bellmore and Hong showed that the MTSP with m salesmen and N nodes can be converted into a standard TSP with (n+m−1) nodes [23]. The solution methods can be roughly divided into accurate algorithms and heuristic approaches. The former is represented by branch and bound [24], and the latter has received a lot of attention in the past 20 years. Genetic algorithm is widely used and researched in MTSP. For example, Yuan et al. adopted the two-part chromosome crossover for solving the MTSP using a genetic algorithm [25]; Zhou et al. proposed a partheno genetic algorithm with a new selection operator and a more comprehensive mutation operator to solve the MTSP [26]; Lo et al. designed a new Genetic Algorithm with two local operators, branch and bound and cross elimination, to solve the MTSP [27]. Then, swarm-intelligence-based approaches have also been applied to the MTSP successfully. Venkatesh and Singh proposed two approaches based on artificial bee colony and invasive weed optimization [28]; Lu and Yue assigned ants to teams with mission-oriented approaches to enhance ant colony optimization algorithms for the MTSP [29]. And neighborhood algorithm also has important value in solving the MTSP. LKH (Lin Kernighan heuristic) algorithm evolved from k-opt is one of the fastest heuristic algorithms in the world to solve the TSP [30]. This method has been proved to be able to find the real optimal solution within the scale of 13509 cities problem network [31]; Soylu proposed a general variable neighborhood search for the MTSP, and they concluded that the algorithms performed better than other algorithms such as the artistic bee colony algorithms and the innovative wed optimization algorithm proposed by Soylu [9] and Venkatesh and Singh [28]. On this basis, some researchers designed a hybrid algorithm to solve the MTSP. Yousefikhoshbakht et al. proposed a hybrid algorithm of the ant colony algorithm with the swap, insert and 2-opt approaches [32]; Larki and Yousefikhoshbakht presented an algorithm which integrated the modified imperialist competitive algorithm and Lin-Kernighan algorithm [33]; Gulcu et al. designed a parallel cooperative hybrid method based on ant colony optimization and 3-Opt algorithm [34].
The other researches solve the MTSP with multiphase optimizations. At present, there are two kinds of multiphase optimization methods. (1) Whole-network multiphase algorithm using multiple heuristic algorithms. Na proposed a two-phase heuristic approach for the MTSP where, in the first phase, m tours were constructed, and in the second phase these tours were improved [35]; Similarly, Sedighpour et al. designed a two-stage algorithm, where at the first stage a modified genetic algorithm solved the MTSP, and at the second stage the 2-opt operator was applied to improve the solution [36]; Wang et al. proposed a two-stage logistics system with the minimum cost as the goal to divide the distribution region based on particle swarm algorithm and VOLUME 8, 2020 genetic algorithm [5]. However, for this type of multiphase algorithm, once the network increases, it will directly affect the efficiency and effect of the solution. (2) A two-phase method with clustering or partitioning and TSP solving methods based on the idea of decentralized control. In the first stage, K -means is the representative algorithm. In the second stage, the heuristic algorithm above-mentioned of the one phase algorithm is basically used. Yu et al. proposed a hierarchical approach where the top-level used a genetic algorithm to implement city exchange among traveling salesmen with the result clustered by K -means, while the bottom level used a branch-and-cut and Lin-Kernighan algorithms [37]. Majd solved MTSP by K -means and crossover based modified ACO algorithms [38]. Xu et al. introduced the two phase heuristic algorithm, which included the K -means algorithm by grouping the visited cities based on their locations, and the genetic algorithm to assess the idea route for each above set achieved [18]. Above all, the algorithms could handle the special constraints which we encountered in industrial projects. At the same time, this type of two-phase algorithm can effectively improve the computing efficiency, but is less suitable to find the global option due to the nature of the algorithms.
This paper mainly focuses on how to divide the distribution region to make great potential to reduce the logistics costs, on the premise of considering the road network conditions and the reasonable delivery task allocation for logistics transport agents. The main contributions of our method include: (1) An extension of the current research on the balance problem associated with MTSP. In fact, commodity requirements at each demand point are different. In this regard, first of all, the problem is transformed into an MTSP with the constraints on the equilibrium of commodity requirements in each sub region, which is different from the existing MTSP models only considering the number of demand points. (2) A two-stage algorithm is designed to solve the problem. In the first stage, based on the route repetition, K -means is used to preliminarily divide the distribution region, so that multiple demand points along the same route from the distribution center can be divided into one sub region as much as possible. Consequently, it could meet the benefit for the less truck load (LTL) freight by the reduced detour cost which provides great possibilities for carpooling in the delivery of commodity. (3) In the second stage, the swap algorithm is used as the core, combining with the tabu search list. The demand points in adjacent sub regions are exchanged to optimize the clustering results in the first stage. This method can effectively break through the weakness of the two-stage algorithm that is difficult to find the optimal global solution in the existing research. (4) In the two-stage solution evaluation, the TSP solution based on the Lin-Kernighan algorithm is applied to obtain the sets of demand points with the ranking of the route and the corresponding shortest distances for each divided sub region. The optimal path planning is obtained while solving the region division problem. So far, the optimization of region division in logistics distribution is realized. Therefore, the model and algorithm proposed in this paper which takes the possibility of LTL freight and the balance of delivery task allocation into consideration to divide the distribution region, can not only mobilize the enthusiasm of logistics agents, but also help to reduce the detour cost and provide support for logistics distribution planning.
The remainder of this study is organized as follows. (1) The section of model construction presents the building method of the region division problem in logistics distribution; (2) In the section of algorithm design, a two-stage algorithm integrating the K -means clustering method, swap algorithm, tabu search algorithm, and LKH solver is proposed to solve this model. (3) In the section of case studies, we test the performance of the proposed model and algorithm in solving cases. (4) The last section summarizes the whole study and concludes with a discussion of future work.

II. MODEL CONSTRUCTION
In this paper, the region division problem in logistics distribution is transformed into the MTSP. In general, the region division with the smallest total traveling distance is the solution of the problem. On the basis of the traditional MTSP, the model established in this paper adds the constraints on the equilibrium of commodity requirements in each sub region.
The assumptions of the model include: 1) During the planning period, the locations of demand points and distribution center have been given; 2) For a region division planning, the logistics distribution service resources and equipment are not taken as parameters, hence the delivery tasks of each logistics transport agent shall be completed by one-time; 3) It does not consider the dynamic variable factors such as the uncertainty of commodity requirements and the uncertainty of road network performance; 4) The logistics distribution process starts from the distribution center to demand points, and returns to the distribution center. All demand points are served once. Repeat section between shortest path from distribution center o to demand point x and shortest path from distribution center o to demand point y dp xy Length of p xy µ xy Route repetition between any two demand points.
It is equal to the ratio of the length of p xy to the larger value of d ox and d oy .

C. OBJECTIVE FUNCTION
In order to reduce the detour cost for LTL freight, the optimization objective is set to minimize the total distance of all tours in each sub region.
Among them, equation (2) ensures that the sum of demand points in the sub region is equal to the total number of demand points; equation (3) ensures that each demand point belongs to only one sub region; equation (2) and equation (3) constitute the constraints of full coverage for all demand points; equation (4) ensures that all delivery tasks are accomplished; equation (5) is the flow-balance constraint for demand points, that is, the number of travelers arriving in each demand point is equal to the number of departures of travelers in the demand point; in equation (6), a and b are lower and upper thresholds of the equilibrium ratio to ensure the balance of delivery tasks allocation for logistics transport agents in some extent.

III. ALGORITHM DESIGN
In this paper, the solution methods are proposed for models without and with the constraints on the equilibrium of commodity requirements respectively.
It should be noted that LK algorithm is used in this paper for obtaining the objective function value corresponding to the region division scheme. This algorithm is one of the most effective methods to solve TSP at present, and LKH solver which has been applied extensively, can be flexibly embedded in complex algorithms for efficient operation [30].

A. ALGORITHM FOR CONTROL GROUP (WITHOUT THE CONSTRAINT ON THE EQUILIBRIUM OF COMMODITY REQUIREMENTS)
Without considering the constraints on the equilibrium of commodity requirements, the problem of region division is transformed into the problem of MTSP without special constraints. Each sub region is regarded as a traveler, then the demand points that a traveler passes through composing the corresponding sub region set. The optimization goal is to minimize the total distance of all tours.
In the solution method, by setting virtual origin nodes, the MTSP is simplified to TSP. The specific steps are as follows: Step 1: The number of sub regions is k, and the number of demand points is N. Add (k-1) virtual distribution centers (origin nodes) with the same location of the real origin node (the distance between any two origin nodes is set to infinity).
Step 2: Construct the adjacency matrix with (N + k) nodes, and get the shortest path matrix based on the Dijkstra algorithm.
Step 3: The shortest path matrix is substituted into LKH solver to solve TSP, a set of all demand points with the ranking of the route is obtained.
Step 4: Then, the demand points between two neighboring origin nodes on the output route are divided into one sub region, the demand points ranking of the route in each sub region, and the corresponding shortest distance are recorded.

B. ALGORITHM FOR THE PROPOSED MODEL (WITH THE CONSTRAINTS ON THE EQUILIBRIUM OF COMMODITY REQUIREMENTS)
It is more difficult to solve the problem when considering the constraints on the equilibrium of commodity requirements. In this paper, a two-stage algorithm is designed to solve the problem. In the first stage, the route repetition is used to cluster the demand points with the equilibrium constraints to obtain a highly feasible initial solution. In the second stage, with the sub region adjacency and the equilibrium constraints, the swap exchange algorithm and tabu search are adopted to expand the search area and further reduce the total distance of all tours. Finally, we test the influence of the thresholds of the equilibrium ratio and the scales of road networks on solutions.

1) FIRST STAGE ALGORITHM
In the first stage, the initial solution is obtained based on the K -means algorithm, which is the most widely used clustering algorithm proposed by Macqueen in 1967. The core of the algorithm is to classify data sets according to the similarity between data samples, then optimize the criterion function for evaluating cluster performance to ensure that each cluster is compact inside and independent between clusters.
This paper obtains the shortest path matrix from the distribution center to each demand point, and then K -means algorithm is applied to cluster all demand points according to the route repetition to reflect the benefit for LTL freight. The steps are as follows: Step 1: The number of sub regions is k, and the number of demand points is N. Based on the Dijkstra algorithm, obtain the shortest path matrix from the distribution center to all demand points.
Step 2: According to the shortest path matrix, calculate the route repetition µ xy between each two demand points.
Step 3: Set the current number of iteration t = 0. Pick out k demand points to be initial clustering centers randomly.
Step 4: Compute the Euclidean distance from each demand point to each clustering center, d xy Put the demand point into a clustering with the smallest distance, then the clustering result A 0 = C 0 1 , C 0 2 , · · · C 0 k is obtained.
Step 5: Update t = t + 1, and update the clustering centers by calculating the average value of sample points on N dimensions in k clusters according to the result in iteration t. The new matrix with 1 * N elements is set to be a new clustering center.
Termination Criteria: The variance of commodity requirements among all clusters is less than a parameter q 1 , and the average distance which is defined to be the average value of Euclidean distance from each demand point to the center of the corresponding cluster is less than a parameter q 2 . The setting of the termination criteria enables the clustering process to output a relatively good initial solution, but the algorithm does not converge. Therefore, without changing the values of parameters q 1 and q 2 , the algorithm needs to be run multiple times to get a better initial solution. Furthermore, during clustering, the values of the two parameters q 1 and q 2 need to be adjusted to assure the existence of feasible solution.
According to the clustering result, the optimal TSP solution is calculated by LKH solver, the ranking of the demand points in each region is recorded as A 1 = C 1 1 , C 1 2 , · · · C 1 k and the objective function value is obtained D 1 = i∈I x∈C i y∈C i d xy θ xy .

2) SECOND STAGE ALGORITHM
In the second stage, the initial solution is optimized iteratively by combining swap algorithm and tabu search algorithm. The swap algorithm is a local search algorithm which has been widely used in the field of combinatorial optimization.
Considering the possible limitations of the initial solution, the algorithm optimizes it by adding mobile and exchange operations. To expand the search area, this paper applies the tabu list which records the local optimum in the search history and forbids it to be repeatedly selected in the next search, then the global optimal solution can be obtained by jumping out of the local optimum.
The specific parameter setting method of the second stage algorithm is as follows: x Tabu object: The historical optimal solution obtained in each iteration is put into the tabu list as a tabu object.
y Neighborhood operation method: Based on the exchange and movement of solution elements, the conventional neighborhood selection strategies are the 2-opt method and insertion method.
z Determination of tabu list length: The tabu list length is a constant according to the scale of the problem, and this paper sets it as the number of adjacent sub regions of a sub region.
{ Amnesty condition: The value of the objective function after the exchange is less than the historical optimal solution D m .
Step 1: Initialization: Input the road network, the commodity requirements of each demand point, the initial region division scheme A 1 = C 1 1 , C 1 2 , · · · C 1 k obtained by the first stage algorithm and its objective function value D 1 , and set the current number of iteration t = 0.
Step 2.1: Candidate sets generation. Based on the region division scheme, take demand point sets of each sub region as tabu objects, and randomly select a sub region and one of its adjacent sub region as a candidate pair C i , C j , generate γ candidate pairs.
x Update σ . y Perform exchange operation in candidate sets. According to the number of demand points in the sub region i and j, n i and n j get demand point sub sets C i , C j (C i ⊆ C i , C j ⊆ C j ) with a random number of demand points which is between {0, 0.5 · n i } or 0, 0.5 · n j . Swap all the demand points of set C i with the demand points of the set C j .
z Check whether the new scheme meets the constraints it is satisfied, save it. Then use LKH solver to calculate the optimal TSP solution of each sub region and obtain the objective function value. If it is not satisfied, do not save the scheme.
Step 2.3: Update γ = γ + 1, until exchange operation for all candidate pairs is carried out.
Step 2.4: After calculating all the candidate sets, select the solution A m = C m 1 , C m 2 , · · · C m k with the lowest value of the objective function D m . The demand points with the ranking of the route in each region and the corresponding value of the objective function is recorded.
Step 3: Tabu list update: Check whether is the candidate pair C m i , C m j of the solution A m , already in the tabu list, and make sure that the length of the tabu list is consistent with the number of candidate pairs.
Step 3.1: If the candidate pair C m i , C m j is not in the tabu list, make the following judgment.
x If the length of the tabu list is less than γ , put it into the tabu list; y If else, the first tabu object is released from the tabu list, and the candidate pair of the current solution A m is added at the end of the tabu list.
Step 3.2: If the candidate pair C m i , C m j is already in the tabu list, it is necessary to compare it with the amnesty condition.
x If it is satisfied with the amnesty condition, the tabu list will not be updated; y If not, the first tabu object is released from the tabu list, and the candidate pair of the sub-optimal solution will be added to the tabu list.
Termination criteria: The maximum number of iterations T is reached.
The current optimal solution output by the algorithm is the model optimal solution. In this paper, region division, TSP solution in each sub region and the total distance are the final output result.

C. COMPLEXITY ANALYSIS 1) FIRST STAGE ALGORITHM
The number of sub regions is k, and the number of demand points in each region is n i . The termination criterion is a linear constraints which inspects parameters in the clustering process. The time complexity of this step is ε which is far below k · n i . Therefore, it can be obtained that the clustering complexity is O 1 = k · n i + ε, simplified as O 1 = O 1 (n i ).

2) SECOND STAGE ALGORITHM
x Swap algorithm: The number of sub regions is k, the total number of external iterations is n max , and the number of internal iterations in neighborhood interchange is 0.5(n i +n j ). The algorithm complexity is O 2 = n max · 0.5(n i + n j ) · k, simplified as O 2 = O 2 (n i ).
y Tabu search: The number of demand points in each region is n i , the number of candidate pairs is γ , the length of the tabu list is l. The time complexity of judging the tabu list is n i · l, updating the tabu list is n i · l, the total complexity is O 3 = n max · (γ + 2n i l).
z Lin-Kernighan algorithm: According to the article ''An Effective Implementation of the Lin-Kernighan Traveling Salesman Heuristic'', the time complexity of Lin-Kernighan algorithm is O 4 = n 2.2 i . Therefore, the total time complexity of the two stage algo-

IV. CASE STUDY
The algorithm presented above is coded in Python 3.6. The algorithms are run on a desktop with a 2.8 GHz processor and RAM of 8 GB. To verify the effectiveness of the proposed model and algorithms, two cases are illustrated in this section.

A. DEMONSTRATION CASE
In order to evaluate the excellent degree of optimized region division plan and the effectiveness of the proposed algorithm, a case of logistics distribution in a small-scale network is set up in this paper, and the global optimal solution is calculated by an exhaustive method for comparison. Set node 10 as the distribution center, and nodes 1-9 as the demand points. The road network with cost (distance) of each directed arc and the quantities of commodity requirements at each node (unit: piece) are shown in the Figure 1. Besides, two logistics transport agents are designed to participate in the delivery tasks. Therefore, all demand nodes are required to be split into two parts. The shortest distance matrix based on the Dijkstra algorithm is as shown in Table 1. Without the constraints on the equilibrium of commodity requirements, a virtual origin node and a real origin node are VOLUME 8, 2020 used for region division. Combined with LKH solver, we can get the optimal solution {[0-9-0], [0-7-8-6-5-3-1-2-4-0]}. Based on this division plan and the ordered route, the total distance is 70600, and the equilibrium ratio of commodity requirements is (0.13, 1.87). The equilibrium ratio is the set of the smallest and largest values of the commodity requirements in all sub regions divided by the average value. It could be found that there is a large difference in the commodity requirements in the two sub regions, which is not conducive to balance the workload of the logistics transport agents. Exhaustive method: Set the thresholds a and b of formula (6) to be 0.8 and 1.2 respectively. Nine demand points are divided into two groups, there are C 1 9 + C 2 9 + C 3 9 + C 4 9 = 255 combinations. In this paper, only the first 20 groups of excellent solutions that meet the equilibrium constraints of commodity requirements are listed, which are sorted according to the total delivery distance from small to large. The TSP solution and equilibrium ratio for each region division plan are shown in Table 2. The accurate optimal solution is {[0-7-6-8-0], [0-4-2-1-3-5-9-0]} with the total distance of 89200. Considering the same parameter setting in equilibrium constraints of commodity requirements as in (2), the proposed two-stage algorithm is applied to solve it. The first stage algorithm obtains the initial solution of {[0-4-2-3-5-9-0], [0-7-6-1-8-0]} with the total delivery distance of 92500. In the second stage algorithm, the number of iterations is set to be 100, then, the optimal solution is obtained to be {[0-7-6-8-0], [0-4-2-1-3-5-9-0]}, and the total distance is 89200. To avoid randomness, we increase the number of iterations to 200, 300 and 400 respectively, and the obtained optimal solutions are consistent with the above. Compared to the initial solution, the total distance in the final optimal solution is reduced by 3.57%. Furthermore, the final optimal solution is consistent with the accurate optimal solution obtained by the exhaustive method. The calculation time is 230 seconds. This case also illustrates that in small-scale road networks, the algorithm proposed in this paper can effectively calculate the optimal solution.

B. ACTUAL CASES
It is known from previous researches that when the scale of nodes on the road network is gradually expanded, the MTSP will become an NP-hard, and the accurate optimal solution by traditional algorithms will take a lot of time to solve. In order to verify the effectiveness of the proposed algorithm for large-scale network cases, a real-life logistics distribution planning for a commodity is taken as an example. The commodities are needed to deliver from the commercial company in city Q of Province J (distribution center), to commercial companies in 301 cities, China. The past region division of this company was based on artificial experiences and provincial boundaries. The past total distance is 76460547, the equilibrium ratios are between 0.29-2.68. It could be observed that the historical data of commodity requirements at each demand point had stability by year. Therefore, the case analysis is performed based on the historical data in the past two years. The entire distribution region is determined to divide into 6 sub regions, that is to say, 6 logistics transport agents will be needed to responsible for the corresponding delivery tasks in the following several years. Therefore, specifically speaking, the optimized solution is used for the rough evaluation of logistics cost and bidding for logistics transport agents in this case. The results of actual cases are shown graphically.

1) DATA SAMPLES
For showing historical data intuitively, 3 records of historical data are shown in Table 3. The distances are the real lengths of shortest path between cities. The expressways' names are the actual Chinese expressways which the paths pass. The long string numbers between them are the lengths of the expressways.
A pie chart is demonstrated in Figure 2 to show the difference among the quantities of commodity requirements between cities clearly. The demand points are divided into 8 intervals according to the ranking of commodity quantities. The chart contains the proportions of the point numbers of each interval in the total point number. The interval 34-25033  has highest proportion 78.98%. The ratio of the maximum quantity to the minimum quantity is 4889.

2) CASES RESULTS
Four solutions of region division are demonstrated based on four methods. The first scheme is the control group solution, which does not consider the constraints on the equilibrium of commodity requirements. The second scheme satisfies the equilibrium constraints and is obtained by the existing general two-level algorithm, in which the most common K -means clustering algorithm is used first, and the LKH algorithm is applied in the lower layer to solving TSP. The third scheme applies GA in the second stage while the first stage algorithm is the same K -means. The fourth scheme is the optimal solution of the proposed model and algorithm. Finally, this paper compares the objective function, equilibrium ratio of commodity requirements, and calculation time of the four solutions to test the scientificity of the proposed model and algorithm.
x Control group solution without equilibrium constraints (scheme 1) In the control group solution, the majority of cities are divided into the same sub region, whereas only a few cities are divided into the other five sub regions, and the visualization result is demonstrated in Figure 3(a). The objective function value is relatively excellent. However, due to the lack of the constraints on the equilibrium of commodity requirements, the unbalanced distribution of commodity requirements between sub regions is extremely serious.
y Control group solution with equilibrium constraints (scheme 2) Referring to the first part, the current research method for the region division problem under the large-scale network is mainly combined with clustering and TSP solving. The results show that the cities in a sub region after clustering often are fragmented to some extent rather than adjacent. Under the harsher setting of equilibrium constraints, with thresholds a and b of equation (6) closer to 1, the fragmentation is more serious, which will certainly lead to detour costs. The visualization result with thresholds a and b to be 0.8 and 1.2 is shown in Figure 3(b). Six colors mean six sub regions, and the lines in one color include the information of cities with ranking of the route in this sub region. The light gray lines in the background image are the boundaries of the provinces.
z Control group solution with genetic algorithm in the second stage (scheme 3) The second stage adopts the classical genetic algorithm to solve the path planning of the initial solution based on the clustering. The crossover probability is P c = 0.9, mutation probability is P m = 0.2. As shown in Figure 3(c), it can be found that there are obvious overlaps in the path of each region which lead to detour costs. It can prove that the genetic algorithm is inefficient in solving TSP problems on large-scale networks.
{ Optimal solution by the proposed two-stage algorithm (scheme 4) Same as scheme 2, take the solution with thresholds a and b of equation (6) to be 0.8 and 1.2 respectively for example. As shown in Figure 3(d), it can be found that the optimal solution obtained by the proposed model also divides the cities into six sub regions. Then, the fragmentation is less serious than scheme 2 and scheme 3, to be more specific, the sub regions represented by magenta lines in Figure 3(b) and Figure 3(c) are disintegrated into six sub regions in Figure 3(d). That is to say, scheme 4 is more likely to reduce detour cost for LTL freight.
Compare the objective function value, equilibrium ratio and computing time of each scheme, the results are as shown in Table 4. In the past scheme of the company, the total distance is 76460547, the equilibrium ratios are between 0.29-2.68. The total distance of the optimal solution of the proposed algorithm decreases by 16.8%, and the equilibrium of commodity requirements improves as 0.8-1.2 which seems more reasonable.
Comparing scheme 4 with scheme 1, the objective function values is about 26.2% worse. This value reveals the effect of adding the constraints on the equilibrium of commodity  requirements to the optimal solution. Aimed at scheme 2 and scheme 4, on the premise that the same equilibrium constraints is satisfied, scheme 4 takes longer to solve, but the objective function is optimized by 13.8%. In scheme 3, there is the same initial solution, and the second stage algorithm is GA. The results compared to other schemes are not ideal. GA has low efficiency in solving TSP problems in large-scale networks.
It could be concluded that the proposed algorithm might get the more optimized solution with shorter total distance than the existing general two-level algorithm.

3) SENSITIVITY ANALYSIS
x Parameters: During the experiment process, it can be found that the model solutions are not only determined by intuitive factors such as the road network and commodity requirements at demand points, but also influenced by several potential factors.
On the one hand, the number of sub regions, which is determined based on the actual number of logistics transport agents in general, can affect the proposed model optimal solution. When the number gets larger, the total distance in the solution gets smaller, and it takes the more computing time. On the other hand, the values of parameters a and b of equilibrium constraints have a great impact on the proposed model optimal solution. In the sensitivity analysis on equilibrium constraints thresholds, five tests are conducted. Table 5 shows that as the thresholds get closer to 1, the objective function value is on the rise. It can be concluded that absolute equilibrium is not conducive to reduce the total distance in the TSP solution. Therefore, an acceptable solution needs to be found by satisfying both a certain degree of equilibrium of delivery task allocation and a shorter total distance in the TSP solution.
In the practical application of this case, considering the potential differences in the capability of logistics transport agents, the equilibrium constraints of commodity requirements could be relaxed by adjusting the values of a and b appropriately. However, this adjustment of thresholds has a slight effect on the computing time.
y Scales of networks: The number of demand points will have a certain influence on the optimization effect, and the expansion of scales will also affect the calculation efficiency. In this part, Python is used to generate random networks with 100-700 points. The quantities of commodity requirements refer to the historical data in the preceding case. The results are shown in Table 6, and the improvement degree demonstrates the superiority of the solution by the algorithm in this paper compared with GA. With the increase of demand points, the computing time both increase. The proposed algorithm spends more running time to solve in networks with 100-500 points, but the degree of optimization of it is superior to the GA by 29.4%-41.3%. When the point scales expansion to 600, the time of GA begins to exceed the proposed algorithm. It shows that the search efficiency of GA algorithm is gradually reduced with the expansion of network scales, requiring more time to solve. However, the proposed algorithm in this paper can maintain stable search efficiency.

V. CONCLUSION
Facing the problem of modern logistics distribution under the large-scale network, there are more and more demand points. This requires multiple logistics transport agents to be responsible for the delivery tasks at the same time. Effective control of logistics costs is a major problem. Specifically, due to the obvious differences in the quantity, geographical location and traffic connectivity of each demand point, an unreasonable distribution region division may result in a high detour cost for LTL freight. In this paper, under the premise of considering the equilibrium of delivery task allocation, the scientific and reasonable region division in logistics distribution is carried out based on the TSP solving.
It will make logistics transport agents more likely to save delivery costs in the logistics distribution process, which has important practical significance. This paper establishes a region division model in logistics distribution that considers the reasonable delivery task allocation, and transforms it into an MTSP with the constraints on the equilibrium of commodity requirements in each sub region, which is different from the existing MTSP models with constraints on number of cities in each sub region or time window. Considering the approximate equilibrium of delivery task allocation may be a new constraints for the MTSP. Particularly, due to the differences in the commodity requirements at each demand point, the new constraints would better meet the actual needs of a region division planning in logistics distribution. It is conducive to balance the workload of the logistics transport agents to improve the overall operation efficiency of logistics distribution. In order to solve this model, this paper absorbs the previous twostage algorithm and heuristic algorithm to design a twostage algorithm. In the first stage, the K -means clustering method is used to obtain a highly feasible initial solution. Specifically, multiple cities on the same route are divided into one region as far as possible based on the route repetition, which can increase the possibility of LTL freight and reduce the detour cost. In the second stage, the swap algorithm is applied combining with the tabu search list. The demand points in adjacent sub regions are exchanged to optimize the clustering results in the first stage. The downhill operation of the tabu list is used to expand the search area. This method can effectively avoid premature fall into local optimization, which breaks through the weakness of the two-stage algorithms in the existing research which are difficult to find the global optimal solution. In the two-stage solution evaluation, the TSP solution based on the Lin-Kernighan algorithm is applied to obtain the sets of demand points with ranking of the route and the corresponding shortest distances for each divided sub region. So far, the optimization of region division in logistics distribution is realized.
The proposed model and algorithm are tested by two cases. Firstly, taking the case under a small-scale road network, the exhaustive method is used to calculate the accurate optimal solution to compare with the optimal solution obtained by the proposed model and algorithm. The result shows that the two optimal solutions are consistent, which illustrates that in small-scale road networks, the algorithm proposed in this paper can effectively calculate the optimal solution. Then, based on the actual case under a 301 cities road network, four solutions of region division are demonstrated and contrasted, including the control group solution without the constraints on the equilibrium of commodity requirements (scheme 1), the control group solution with the equilibrium constraints based on K -means clustering method (scheme 2), the control group solution with genetic algorithm in second stage (scheme 3) and the optimal solution based on the proposed two-stage algorithm (scheme 4). As a result, although scheme 4 takes the relatively more running time, it is better than the control groups to a certain extent, the delivery task allocation for logistics transport agents is more reasonable than scheme 1, and the total distance could be significantly reduced compared with scheme 2 and 3.
According to the current research content, the future research outlook is to add the constraints on distance equilibrium in each sub region for logistics distribution planning. Besides, aimed at delivery vehicle routing, the number of vehicles dispatched and the actual total mileage need to be calculated, which ensures that the delivery task can be assigned more rationally, and the logistics distribution efficiency can be improved.