A Hyper-Heuristic Algorithm for Time-Dependent Green Location Routing Problem With Time Windows

The Location Routing Problem (LRP), a branch of logistics management, has been addressed in many research papers. However, there are few papers on time-dependent LRP. And only a few of them take fuel consumption into consideration. To reduce the environmental pollution from vehicle emissions and the cost pressure on logistics, a novel model named the time-dependent green location routing problem with time windows (TDGLRP) is developed. Its objective is to minimize costs including opened depot costs, enabled vehicle costs and fuel consumption costs. In TDGLRP the speed and travel times are time-dependent function. A hyper-heuristic algorithm (HH) that consists of two levels, high-level heuristics (HLHs) and low-level heuristics (LLHs), is proposed to solve the TDLGRP. The Tabu Search (TS) algorithm is taken as the high-level selection mechanism, and the Greedy algorithm is taken as the acceptance mechanism. With reference to the Solomon benchmarks, we design a series of TDGLRP instances with 100 client nodes, and analyze the impact of client distribution characteristics on the path. Based on the TDGLRP model and HH, the end of the article gives the solution results of a large-scale instances with 1000 nodes.


I. INTRODUCTION
Urban logistics play an important role in the economy, society, environment, and citizens [1]. The Location Routing Problem (LRP) is a branch of logistics management [2] and conducts joint decision-making with regard to the locations of arbitrary types of facilities and the routing of vehicles [3]. As one of the most important optimization problems in logistics system planning [4], the LRP can be traced back to 1961 [5]. However, the importance of coordinating the location and vehicle route was not truly recognized until the 1970s [6], [7]. With further research, a number of LRP variants with different constraints have been developed, such as the LRP with time windows [8], the LRP with simultaneous pickup and delivery [5], [9], and the bi-objective LRP [10]. For more information on the variants of the LRP, the reader can refer to the following two surveys [11], [12]. In the last decade, consumers, businesses and governments have increased their attention to the environment [13].
The associate editor coordinating the review of this manuscript and approving it for publication was Zhiwu Li . City logistics, as a primary source of carbon emissions, should be formulated to initiate carbon emission reductions during related activities [14], [15]. As an essential tool in the supply chain, the LRP should consider fuel consumption and emission reductions. The LRP that considers environmental issues can be called the Green LRP (GLRP) [16].
In traditional LRP, that the travel time between two nodes are constant, and the Euclidean distances between nodes are often used as the travel times. However, in real road network, the travel speed and travel time are time dependent function. The time-dependent LRP problem is more instructive to the logistics.
The literature relating to time-dependent and green LRP are not very common. We searched the literature on GLRP in recent years and classified them by year. A. 2016 In this year, we found three papers on the GLRP. Koç et al. [1] investigated the combined impact of the depot location, fleet composition and routing decisions on vehicle emissions in city logistics. Its objective was to minimize the total costs including the depot costs, vehicle and routing costs and emissions costs. The feature of this paper is that the clients are located in nested zones with different speed limits. In addition, a new powerful adaptive large neighborhood search meta-heuristic was developed. Tang et al. [17] established a bi-objective model of costs and carbon emissions from the perspective of a sustainable supply chain network. In this paper, a multi-objective particle swarm optimization algorithm was used. A computational experiment and sensitivity analysis were conducted using data from the China National Petroleum Corporation. The results indicated that the research can be applied to actual supply chain operations. Based on the assumptions of time windows and split-delivery, Qazvini et al. [18] presented a mixed integer linear model for the so-called green routing problem. The model had been successfully applied to a light auto parts distribution chain.

B. 2017
Rabbani et al. [19] introduced a new variant of the Multiobjective Green Location Routing Problem (MOGLRP) to minimize the total traveled distances and the total costs including the vehicle fixed costs and variable travel costs and the CO 2 emissions. The fleet distribution is heterogeneous. Similar to Rabbani, Wang and Li [20] also considered the GLRP problem with a heterogeneous fleet, and considered the constraints of simultaneous pickup-delivery and time windows. The paper introduced the concept of temporal-spatial distance, which made the quality of the initial solution higher. Toro et al. [21] proposed a multi-objective GLRP model to minimize the operational costs and minimize the environmental effects. The model shows that the fuel consumption is related to the vehicle weight, speed, road slope and wind resistance. To simplify the calculation, the average speed and average slope were used in this paper. Zhang et al. [22] simplified fuel consumption as a function of load and distance.

C. 2018
In this year, many researches on GLRP mainly focused on its application and update solution algorithm. Chen et al. [15] and Wang et al. [23] successfully applied the GLRP model to cold chain logistics. Faraji and Nadjafi [24], Longlong et al. [25], Leng et al. [26], Zhao et al. [27] and Qian et al. [28] used the new hyper-heuristic algorithm to solve the GLRP. They proved the efficiency of the hyper-heuristic algorithm at solving the GLRP problems.

D. 2019-2020
From 2019 to the present, multi-objective GLRP problems remain the focus of research [14], [29]. At the same time, the influence of vehicle speed on fuel consumption and emissions has received increasing attention. For example, Leng et al. [4] and Koç et al. [1] divided the logistics distributed field into three speed zones and the speed of each zone was different.
The papers mentioned above have different optimization strategies for the fuel consumption and emissions. The optimization objectives of some papers are to minimize the costs, which includes the fuel consumption costs or emissions [1], [22]. One paper takes the fuel consumption as a constraint [18]. The other papers take the fuel consumption or emissions as one of several optimization objectives. Another difference between these papers is the method of calculating fuel consumption. As we all know, fuel consumption is affected by many factors, such as the vehicle's own parameters, weather factors, vehicle speed factors and so on [30], [31]. Among these factors, speed is the most critical one [14]. However, one part of the literature above did not consider the effect of vehicle speed and the other part did not consider the time variability of speed. In the actual road network, the vehicle speed is variable and dynamic because of the traffic flow, weather, accidents and other factors. It will lead to deviations of up to 20% in emissions for gasoline vehicles on an average day and up to 40% in congested traffic if a constant vehicle speed is used to calculate fuel consumption [32]. Therefore, this paper adopts a time-dependent GLRP model. The main work and contributions of this paper can be stated as follows.

E. PROBLEM AND MODEL
A mixed-integer programming model named TDGLRP is developed by using a comprehensive fuel consumption function. The objective is to minimize the total costs including the opened depots cost, fixed vehicle costs and fuel consumption costs. Compared to the traditional LRP, the delivery network is dynamic and the speed is a function of time. The schematic diagram of the TDGLRP problem is shown in FIGURE 1.

F. EFFICIENT OPTIMIZATION ALGORITHM
To solve the TDGLRP, a hyper-heuristic algorithm (HH) is proposed. There are two levels in the HH framework: highlevel heuristics (HLHs) and low-level heuristics (LLHs). The Tabu Search (TS) is taken as the high-level selection mechanism, and the Greedy algorithm is taken as the acceptance mechanism. Moreover, this paper applied an insertion method based on the travel time (IMTT) to generate the initial population, which is critical to the quality of the final solution.
The remainder of this paper is organized as follows. Section 2 focuses on the modeling. Section 3 introduces the IMTT and HH. Section 4 mainly introduces the results and analysis. The last section is a summary.

II. DESCRIPTION AND MATHEMATICAL MODEL
In this section, the fuel consumption model and the TDGLRP model will be introduced.

A. PROBLEM DESCRIPTION
The TDGLRP studied here is defined on an asymmetric directed graph G = (V ,E), where V is a set of nodes and E is a set of edges A hard time window means that if the vehicle arrives at client i earlier than a i, , it must wait until a i to start service. The capacity P j , the rent cost FD j and the coordinates (x j ,y j ) of the candidate depot j (j ∈ C) are known. Each vehicle h in the homogeneous fleet has the same capacity Q h and rent cost FV h . The symbol d ij represents the distance between the two nodes i and j (i = j ∈ V ). The travel time depends on the speed, which changes according to the departure time and the arc being traversed. The vehicle traversing different arcs has different travel speeds V (t), t ∈ S = {t0, t0 + , . . . , t0 + (M-1) },where t0 is the earliest time from any node in the network, is the time interval, M is the time zone.  There are two possibilities: one is that the vehicle arrives at B before T i+1 (see the solid line in FIGURE 2), and the other is that it arrives at B after T i+1 (see the dotted lines). For the second case, assuming that the vehicle reaches B in [T j−1 T j ]. The distance between A and B is d ij ,and vehicle speed at time t is expressed as V (t), then the travel time TT AB from A to B is calculated as follow formula:

C. FUEL CONSUMPTION MODEL
We estimate the fuel consumption and emissions that is based on the Comprehensive Modal Emission Model (CMEM) [33], which has been widely used in LRP. The CMEM can accurately predict the fuel consumption and emissions [1], [34]. if a vehicle h traveling from node i to node j, the fuel consumption over this segment can be obtained by (1) using the CMEM model: where P tract (kW) is the total traction power requirement; K ijh (kg) is the load of the vehicle h moving from node i to node j; P(kW) is the second-by-second engine power output; P acc is the engine power demand associated with the running losses of the engine (here, P acc = 0); FR (g/s) is the fuel consumption rate; v ijh (m/s) is the speed of the vehicle from node i to node j; d ij (m/s) is the distance between nodes i and j; F ijh (g) is the fuel consumption of the vehicle from node i to node j; and θ is the road angle, and here its value of each road is equal to 0 [2], [35]. TABLE 1(a) and (b) gives the parameters used in the above model for estimating the fuel consumption [36], [37].

D. MATHEMATICAL MODEL
Before providing the proposed model, several assumptions should be made: (1) each client must be served only once and must start being served within the time window; (2) each vehicle must return to the original depot, (3) The capacity of each vehicle and of each depot cannot be exceeded. The other  The proposed model seeks to minimize the total costs including the depot costs, vehicle costs and FC costs, and can be represented as follows.
Objective function: where: Distance function: Travel time function: The following constraints must be satisfied: i∈C j∈D The description and explanation are as follows. Equation (2) is the optimize objective of the TDGLRP;(3) is the costs of vehicles and depots;(4) is the fuel consumption function; (5) is total distances function; (6) is the total travel time function; Constraints (7) and (8) make sure that each client is served exactly once. Constraints (9) and (10) impose that each client is assigned to only one depot and one vehicle. Constraints (11)-(13) forbid unfeasible routes that do not return to the departure depot. Constraints (14) and (15) make sure that the demand of each client is met. Constraint (16) guarantees that the load of each selected depot must be less than its capacity. Constraint (17) is the dynamic equilibrium of the load of each vehicle after visiting client j. Constraints (18) and (19) guarantee that the vehicle capacity is not exceeded. Constraints (20)(21) are the time window constraints. Finally, the last three constraints are decision variables.

III. HYPEY HEURISTIC ALGORITHM
This section describes the hyper-heuristic algorithm that is used to solved the TDGLRP model. The hyper-heuristic(HH) algorithm system was defined as a ''heuristic selection heuristic'' algorithm [38]- [40]. The HHs algorithm have been widely used in operational optimization problems such as the Vehicle Routing Problem (VRP) [41], VRP with time windows (VRPTW) [42], LRP [43] and low carbon LRP [28]. HH is divided into two levels: low-level heuristics (LLHs) in the problem domain and high-level heuristics (HLHs) in the control domain. The LLHs provide a series of low-level heuristics and problem definition, objective function and other information. The HLHs automatically produce an adequate combination of the provided HLHs components to effectively solve the given problems [44]. The framework of hyper heuristic algorithm is shown in Figure 3. It is very important to design efficient selection strategies and acceptance criteria in the HH. There are many selection strategies (SA) include the simple random (SR) sampling, choice function (CF), genetic algorithm (GA) [45], [46], Tabu Search (TS) [47], [48], and quantum evolutionary algorithm (QEA) [25] strategies. SA falls into two categories: deterministic acceptance, which accepts the resultant solution based on the fitness or special rules; and non-deterministic acceptance, which accepts the resultant solution based on a threshold or probability [49].

A. LLH OPERATORS
LLHs can be viewed as a ''black box'', which are used to perturb the incumbent solution to guide the search by either intensifying or diversifying the search region [2]. In this paper, the LLHs are divided into three parts according to the object: inside one route pools, between two routes pools, and the depot pools. All the operators in the three pools are shown in TABLE 3.

B. HIGH-LEVEL HEURISTIC (HLH)
The Tabu Search (TS) is a basis for the development of HH, which guides the next search direction by establishing a tabu list that records the optimization process. Its advantage is that it can avoid falling into the local optimal solution. In this paper, the TS is adopted as the high-level strategy. The search process is as follows [28].
• Each LLH operator k has an initial score r k ∈ [r min , r max ].
• Select an operator with the highest score in the non-tabu list.
• Update r k via reinforcement learning method. If the operator k improves the current solution, the operator adds the value r k = r k + b (b > 0); otherwise, the operator subtracts the value r k = r k − b. When operator k cannot improve the current solution, it will be put into the tabu list with a fixed length L. In addition, another operator in the tabu list will be removed according to the principle of the ''first in, first out'' mechanism.

C. ACCEPTANCE CRITERION DESIGN
The acceptance criterion (AC), which is used to determine whether the child solution cf replaces the parent solution pf, directly affects the convergence speed and optimization accuracy of the HH. The HH proposed in this paper adopts a Greedy Algorithm acceptance mechanism. It accepts all the improved solutions and accepts the non-improved solutions with a certain probability pp, and it can be calculated using equation (25).
where f = 100 × (cf − pf )/pf . Q is the counter used to calculate the number of times the LLH operator continuously fails to improve the pf. As Q increases, the probability of replacing pf with cf is greater. If the selected operator improves the pf, reset Q to 0. The flow of the HH algorithm is as follows.
The In this section, we will expand the Solomon [50] benchmarks to obtain the TDGLRP experimental data. In the Solomon benchmarks, there are 56 instances with 100 client nodes, divided into 6 groups, namely C1, C2, R1, R2, RC1, and RC2. Each client is located in a 100 × 100 area. However, there is only one depot for each group of instances. In order to make the Solomon instances applicable to the TDGLRP problem, we randomly generate 10 depots within the range [100 100]. The data of 10 depots are shown in TABLE 4, where ''NUM'' is an abbreviation of number, ''cap'' is the capacity, ''coor''   is the coordinate, and ''costs'' is the fixed open cost. In each group of instances, the time windows of the 10 depots are the same (TABLE 5). And the data of all vehicles are shown in TABLE 6 [51]. These experimental instances are named TD-LRPTW.

B. THE TIME-DEPENDENT SPEED FUNCTION
In the TDGLRP, to simulate the urban road conditions and rush hour, the roads are divided into five types, and the total service time is divided into four equal time zones: the morning rush hour S1, the evening rush hour S3 and two off-rush hours S2 and S4. The road type is judged by equation (26), where Grade(i,j) is the type of road between nodes i and j, and Mod(a, b) is the remainder of the number a divided by the number b. For example, the type of the road between node 2 and node 5 is 3. The speed of each type of road at different time zones is shown in TABLE 7. For example,the speed is 1.8 if the vehicle is driving on the road 1 within the zone S1.

C. INITIAL SOLUTION
The initial solution is critical to the quality of the final solution. The initial solution generation process includes VOLUME 8, 2020 two steps: selecting the depots and arranging the routes. First, the 10 depots are sorted according to the center of gravity method. After selecting the first depot, the clients are assigned to this depot according to the construction algorithm. When the capacity of the distribution center is exceeded, the second depot is selected, and so on.

1) DEPOTS SORTING
The sorting steps are as follows: x Calculating the center coordinates: The center coordinates can be obtained by equation (26), where x i and y i are the respective coordinates of client i, and q i is the demand of client i. (26) y Calculating the distance matrix: The Euclidean distances between the center and each depot are denoted as DIS 0j , which can be obtain by equation (27): (27) z After that, the weighted value of each client is obtained by adding the European distance DIS 0j , the depot costs FD j, and the reciprocal of the depot capacity P j . The depots meeting the constraints will be chosen to open in order of A j from small to large. The value of A j is obtained by equation (29), where a 1 , a 2 , and a 3 are the coefficients.

2) CONSTRUCTION ALGORITHM
In this paper,one construction algorithm is proposed to generate initial solutions. The idea of this algorithm is to insert the clients one by one into the route according to certain conditions until all the clients are inserted into the route. Solomon's Insertion Algorithm [52] and the IMPACT Algorithm [53] are representative construction algorithms. However, they need to set many coefficients, and different problems need to set different coefficients in order to obtain better solutions. In this paper, we design one algorithm named the Insertion Method based on Travel Time (IMTT) to generate the initial solution. Below we will illustrate the solution process of the initial solution through a simple case. In this case, we assume that there is a depot (named depot0) serving five clients (named C 1, C 2 , C 3 , C 4 and C 5 ) with time window [a i e i ].
Proceed as follows: x First step: Randomly selecting a client node to generate a new route named ROU 1 : depot0-C 1 -depot0 (see FIGURE 4(a)).
y Second step: Calculating the travel time TT 01 from depot0 to C 1 by the method in section 2.2 Therefore, the time when the vehicle arrives at C 1 is AT 1 = max(TT 01 , a 1 ).
z Third step: Calculating the impact value which equals the total travel time of the new route when new client is

D. RESULTS AND ANALYSIS
The program is coded in MATLAB R2018a and executed on a computer with an Intel(R) Core (TM) i5-5200U CPU @2.20GHZ, 4GB of RAM and the Windows 7 operating system.
The depot selection parameters: a 1 = 100; a 2 = 1, a 3 = max(P 2 )/P j , where FD is the costs and P is the capacity of each depot.
The parameters of the HH:r k = 0, iteration = 200.

1) INFLUENCE OF THE INITIAL SOLUTION
In this section, the initial solutions of C1,which are generated by random method and IMTT algorithm, are compared. Then these initial solutions are further optimized by the HH algorithm. The results are shown in TABLE 8, where ''VE'' is the number of the vehicles,''TC'' is total costs, ''CPU'' is the computer times. It is obvious from the TABLE 8 that the solution results of IMTT algorithm is far superior to the random results. This shows that IMTT algorithm is very effective in solving TDGLRP problem.

2) OPTIMIZATION OBJECTIVES COMPARISON
Distances, costs or travel times are often taken as the optimization objectives in the traditional LRP or LRP with time windows. In this section, we will utilize the proposed IMTT and HH algorithm to solve instance C1 with three different objectives and compare the influence of the different optimization objectives on the solution. The three different objectives are min(f 3 ) (minimize distance,see equation (5)); min(f 4 ) (minimize travel times,see equation (6)); and f (minimize costs including vehicles costs, open depots costs and fuel cost,see equation (2)). The results are shown in the TABLE 9, where ''OB'' denotes the objective; ''BEN'' denotes the benchmark name; ''DP'' denotes the opened depots; ''VE'' denotes the number of enabled vehicles; ''TC'' denotes the total costs, which are calculated by f 1 + f 2 ; ''FC'' denotes the fuel consumption calculated using equation f2; ''TI'' denotes the travel time obtained using using equation f 4 and ''DIS'' denotes the travel distances calculated by equation f 4 . The bold fonts indicate the optimal value for the corresponding column.
In the above eight instances, the f objective obtains 4 optimal values in FC, 6 optimal values in TC,2 optimal values in TI,4 optimal values in DIS. The min(f4) objective obtains 4 optimal values in FC, 2 optimal values in TC,6 optimal values in TI,2 optimal values in DIS. The min(f3) objective only obtains 2 optimal values in DIS. FIGURE 6 shows the above results.
From the above results, it can be seen that the cost objective and the time objective are competitive. The cost objective can  save 2.96% cost than time objective, but it takes 1.74% more travel time. For logistic companies, it is more important to save costs. At the same time, the cost objective takes environmental factors into account. Therefore, the cost objective of TDGLRP proposed in this paper is relatively better.

3) INFLUENCES OF CLIENTS DISTRIBUTION
The Solomon instances have different characteristics, C1 and C2 are cluster, R1, R2 are random, and RC1,RC2 are semicluster. FIGURE 7 shows the characteristics of C2 and R2. In this section the influences of client distribution will be analyzed on the solution through C2 and R2 instances. For the sake of fairness, we set all the data (demands, time windows, depots data, vehicles data) of the 100 clients in the R2 to be the same as C2, except for the coordinates. The results,which are calculated under the cost objective, shown in TABLE 10 where dif = (X R1 − X C1 )/X C1 .  The values of R1 are far greater than C2 according to the results in the TABLE 10. At the same time,more vehicles are need in C2. That is to say, the clustered client distribution is more conducive to the route arrangement.

Instances:RC1 and RC2
Hypothesis:All the data (demands, coordinates,depots capacity,depots coordinates and vehicles data) of the clients in the RC2 are same to the RC1, except for the time windows of clients and depots. The results are shown in TABLE 11, where ''TIA'' is the means of the time windows, ''VE'' is the number of the vehicles,''VEA'' is the means of vehicles,''TC'' is the total costs,''TCA'' is the means of TC. The depot time window of RC1 is [0 240] which is much smaller than the RC2 ([0 960]). The average cost of the RC1 is 48763 and the average vehicle is 11; the average cost of the RC2 is 46969 and the average vehicle is 10. These data suggest that the wider the time window, the fewer vehicles needed and the lower the cost. The narrow time window needs more vehicles to meet the demands of all clients. In the RC1 instances, the average time window of the RC101 is the smallest, and 12 vehicles need to be activated, while the other 3 groups only need 11 vehicles. The total service time window [0 960] of the RC2 study is far greater than that of the RC1 study, so the four RC2 instances only need to activate 10 vehicles.

5) EXPERIMENTAL RESULTS OF ALL TD-LRPTW
The results of TD-LRPTW are shown in TABLE 12, where CPU is the average computational time.

E. LARGE-SCALE INSTANCES
Homberger [54] has extended the Solomon benchmarks to generate a large-scale instances with 1000 client nodes. In this section, we will use the IMTT and HH algorithm to solve these instances for reference by other researchers. Similarly, in order to adapt Homberger instances to the TDGLRP problem, the data of more depots and vehicles are needed.            1000 and the cost is 2500. The distribution of clients and depots are shown in FIGURE 10. The results are shown in TABLE 17.

5) S-RC1-1000 BENCHMARK INSTANCES AND SOLUTIONS
For the S-RC1-1000 example, the depot data are shown in

V. CONCLUSION
In this paper, the TDGLRP model considering environmental effects is proposed. Its optimization objective is to minimize costs including the fuel consumption costs, opened depot costs and enabled vehicle costs. Unlike the traditional LRP, the TDGLRP model sets the vehicle speed as a time-dependent function in a traffic network. In addition, the traffic network is divided into four periods and five road types. The initial solution is crucial to the convergence and speed of an optimization algorithm. To effectively solve the proposed problem, the IMTT algorithm is structured to generate an initial solution, and then the HH is utilized to optimize the initial solution. The high-level strategy of the HH adopts the TS method to select the low-level pools. The acceptance criterion is to accept all the improved solutions and accept the non-improved solutions with a certain probability. In this paper, 56 instances of the TDGLRP named TD-LRPTW are expanded on the basis of the Solomon benchmarks. These instances with different characteristics are structured to verify the effectiveness of the algorithm and model. The results indicate that the TDGLRP can effectively reduce the fuel consumption and the total costs. The actual urban road network is intricate and time-dependent. The solution results of several large-scale instances are given in Section 5.6 for reference.

AUTHOR CONTRIBUTIONS
C.Z. came up with the main idea of model and algorithm and carried out the simulations. Y.Z. helped with the structure and revisions of the paper. L.L. investigated the state-ofthe-art literature and pointed out some suggestions from the literature review.