Energy Efficient Order Picking Routing for a Pick Support Automated Guided Vehicle (PS-AGV)

Order picker routing refers to the process of collecting a set of products with the minimum travel time. Recently, a new generation of Automated Guided Vehicles (AGVs) has been developed to assist human order pickers and reduce their travel time. These vehicles use batteries as their energy source. However, the energy efficiency of their routing remains unexplored, even though any reduction in power consumption ultimately reduces the depth of discharge (DOD) of the battery and extends its lifespan. For example, in many real AGV applications, the effect of load mass has been neglected despite its importance. In most studies, the methodology proposed for the order picking routing problem allows neither the integration of the mass of each Stock Keeping Unit (SKU) nor the calculation of the associated energy costs. Those studies are generally limited to ensuring that all the items requested by an order are picked up with minimum travel time or distance. In this paper, an Energy Efficient Order Picking Routing algorithm named EE-OPR is proposed to produce an efficient AGV tour with an acceptable trade-off between energy preservation and travel time minimization. The proposed approach takes into account the mass of the loads and its accumulation throughout the pick tour, since this mass increases the rolling resistance losses on flat ground, especially at lower speeds. To this end, an optimization method based on a dynamic state graph is developed and applied to different warehouse layouts. The performance of the proposed algorithm is evaluated by comparing it with an approach that minimizes only travel time. Results show that the optimized tours produced by EE-OPR are effective and robust, with an 18% average saving on the total cost of the picking tour.


I. INTRODUCTION
With the rise of customization, the rapid growth of e-commerce, and labor shortages, the level of automation in warehouses and Distribution Centers (DCs) tends to increase in order to meet market requirements [1]. In some warehouses and DCs, order picking is carried out with Automated Guided Vehicles (AGVs) to minimize the pickers' unproductive walking time and improve picking efficiency and ergonomics in picker-to-parts setups. These systems are referred to as Pick Support Autonomous Guided Vehicles (PS-AGVs), AGV-assisted order picking systems, or simply Autonomous Mobile Robots (AMRs) [2]. An example of such a system is implemented by Fetch Robotics to optimize case picking [3]. Some of the existing research works focus on the economic aspects of PS-AGV applications. These works aim to study and provide PS-AGV solutions that support business growth, fit the operations they serve, and deliver a significant return on investment (ROI) [4], [5]. However, a crucial factor that is often overlooked in the literature and should be taken into account when using PS-AGVs is the energy cost of operating the vehicle, as well as the cost of battery maintenance and replacement, especially when the battery is dead. This factor is critical for the adoption of AGVs and, in particular, for reducing the deployment cost of this technology. In addition, the energy available in the battery is generally a limiting factor for the range and length of an AGV deployment [6]. Moreover, since an AGV uses several batteries throughout its life, the total cost of these batteries is significant [7].
The associate editor coordinating the review of this manuscript and approving it for publication was Adamu Murtala Zungeru.
Material handling consumes a notable amount of energy in warehouses [8]. Although electrification and green warehousing are receiving increasing attention due to environmental awareness [9], [10], additional efforts in terms of operational decisions should be considered. The number of operational decisions related to PS-AGV routing per day can notably influence the amount of required energy, particularly when AGVs are used in pallets or cases picking. These types of AGVs are known as order picking trucks or upgraded traditional forklifts [5]. It should be noted that forklifts are known as the most energy-consuming material handling equipment used in warehouses [11].
In order to enhance the energy efficiency of an AGV, its energy requirements for order picking must be studied. The process of collecting a set of products with the minimum cost is called the Order Picking Problem (OPP), the Picker Routing Problem (PRP), or the picking problem for short [12]. This problem is the most challenging concern related to warehousing operations since it accounts for about 50% to 75% of the total operating cost associated with labor and time [13], [14]. Most of the current research works aim to optimize the routing time or distance of the order picking tour [2], [15] while ignoring the energy aspect. However, to make a better choice, it is essential to understand and adequately quantify the energy required by the different possible paths. In addition to the distance traveled and the time spent, the energy consumption of an AGV depends on many factors, such as its speed, its weight, and the quantity of transported cargo. Hence, the heavier the payload, the more power an AGV requires [16].
Therefore, it is crucial to take into account the different weights of the requested items when planning a picking tour. It is also important to note that warehouses and DCs manage a large assortment of Stock Keeping Units (SKUs), varying in size and weight, on variable schedules [1], [17]. PS-AGVs often have to handle, on a daily basis, different quantities of heavy and bulky loads drawn from a wide variety of items with varying requirements. However, they are not limited to this kind of item [5]. In addition, the positions of the products in the DC may change, or a product may be replaced by a different one. This variation in the type of items and their masses has an impact on the energy required to complete a picking tour. In the case of an unexpected increase in the energy demand of a picking tour due to a growth in mass, the recharge time has to be updated to avoid aggressive discharge and extend the battery lifespan. In this work, the importance of a precise estimation of the vehicle energy consumption that considers cargo weights is highlighted in order to make better routing decisions.
Improving the energy efficiency of AGVs extends their productive operating time between recharging stops (autonomy). This improvement can also increase the number of items picked per day, resulting in short-term savings through lower electricity costs and greater AGV autonomy. Moreover, any improvement that reduces power consumption reduces the DOD of the battery and thus improves its lifespan [18], [19]. In addition, these improvements contribute to long-term savings by reducing the cost of maintenance and/or battery replacement and the number of robots required on the floor, since each vehicle has better autonomy.
Considering the increased interest in the concept of green warehousing and its potential for energy saving, the main focus of this work is to answer the following question: given an order sheet and start and end points, how can we route the AGV so that it collects all the requested items from the listed positions in the warehouse while minimizing energy and time simultaneously? It should be noted that, in some settings, the energy-saving path can be longer than the time-saving path. If a slightly longer trip saves energy, it can be worthwhile for long-term savings, especially in a large warehouse. Therefore, it is worth seeking a trade-off between time and energy savings. Besides, it must be possible to specify whether time has high priority in a particular context, such as periods with a large number of requests. Hence, this paper addresses the routing problem of an AGV along two dimensions, time and energy, for order picking movements in warehouses.
The rest of the paper is organized as follows. Section II provides the literature review. The methodology is explained in Section III. Section IV presents the empirical results and Section V discusses the benefits and the limitations of the proposed approach. Concluding remarks and some suggestions for future research directions are provided in Section VI.

II. LITERATURE REVIEW
The single PS-AGV routing problem deals with determining the path that the PS-AGV has to travel in order to collect a set of items requested by internal or external customers in a distribution warehouse. This well-known problem is referred to as the Order Picking Problem (OPP) in rectangular warehouses. Exact and heuristic methods have been widely used to deal with the OPP in the literature. The OPP is represented as a special case of the classical Traveling Salesman Problem (TSP), where the salesman is the AGV and the cities are the items to collect [20]. More precisely, it is a Steiner TSP (STSP), since only some of the cities are required (given the specific layout and the location of items in a particular structure) [21]. The STSP is described by a directed graph $G = (V, E)$, where $V$ is a set of vertices and $E$ is a set of edges. In this graph, $P \subseteq V$ represents the required vertices and $V \setminus P$ the Steiner points. A Steiner tour of $G$ is a closed walk that visits each vertex of $P$ at least once. Therefore, there are two differences between a Steiner tour and a traveling salesman tour. The first is that, in a Steiner tour, the Steiner points do not have to be visited. The second is that a Steiner tour may contain some vertices more than once [22]. In the classical STSP, the goal is to minimize the length of a Steiner tour in the digraph $G$. In our study, we instead aim to find the Steiner tour that minimizes energy and time consumption.
VOLUME 10, 2022
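To make the two differences between a Steiner tour and a traveling salesman tour concrete, the following minimal sketch checks whether a walk is a Steiner tour and computes its length. The function names `is_steiner_tour` and `walk_length`, the dictionary-based arc representation, and the three-vertex example are illustrative assumptions and are not taken from the cited formulations.

```python
def is_steiner_tour(arcs, required, walk):
    """Check that `walk` is a closed walk on the digraph visiting every required vertex.

    arcs     -- dict mapping (u, v) to the arc length (hypothetical representation)
    required -- iterable of required vertices P
    walk     -- list of vertices; the first and last must coincide
    """
    if len(walk) < 2 or walk[0] != walk[-1]:
        return False                       # not a closed walk
    if any((u, v) not in arcs for u, v in zip(walk, walk[1:])):
        return False                       # uses an arc that does not exist
    return set(required) <= set(walk)      # every required vertex is visited


def walk_length(arcs, walk):
    """Total length of a walk, counting a repeated arc each time it is used."""
    return sum(arcs[(u, v)] for u, v in zip(walk, walk[1:]))


# Toy digraph: a and b are required, s is a Steiner point between them.
arcs = {("a", "s"): 2.0, ("s", "a"): 2.0, ("s", "b"): 3.0, ("b", "s"): 3.0}
tour = ["a", "s", "b", "s", "a"]           # s is visited twice, which is allowed
print(is_steiner_tour(arcs, {"a", "b"}, tour), walk_length(arcs, tour))
```

In this toy instance the Steiner point `s` is crossed twice and no Hamiltonian cycle exists, yet the walk is a valid Steiner tour.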

2) ROUTING POLICIES FOR ORDER PICKING OPERATIONS
In 1983, Ratliff and Rosenthal (RR) [23] presented a polynomial-time exact algorithm for order picking in a single-block warehouse. The method proposed by RR was extended by Roodbergen and De Koster to a two-block warehouse [24]. Subsequently, Löffler et al. and Masae et al. [15], [25] dealt with order picking problems with arbitrary starting and ending points by applying the concepts of RR. A new solution to the routing problem was suggested by Scholz et al. [26] based on new mathematical formulations that take into account the specificity of the warehouse layout. Their formulation has a main constraint that imposes the picking of one unit on the first pass over a required vertex. However, if energy consumption is considered, that constraint can result in a sub-optimal solution. Pansart et al. [12] present two exact algorithms for the OPP. In the first algorithm, they demonstrate that the problem can be solved optimally with Mixed Integer Linear Programming (MILP) using a sparse formulation strengthened by pre-processing and valid inequalities. The problem is seen as an STSP, and the authors use a compact single-commodity flow formulation proposed by Letchford et al. [27]. Thus, the picker has to deposit a unit of items each time one is picked (flow principle). It is important to mention that the direction of traversal is crucial in this setting, yet the post-processing step that derives the picking tour sequence from the resulting tour sub-graph is not detailed. In their second algorithm, Pansart et al. [12] propose a dynamic programming approach that extends the well-known RR algorithm from two cross-aisles to any number of cross-aisles to deal with real-life applications. Nevertheless, this method cannot accommodate side constraints such as flow directions and precedence.
See [12], [28]. In addition, heuristic methods have been proposed in the literature for the same purpose: traversal (or S-shape), largest-gap, return, midpoint, and composite [29], [30]. Heuristics are mainly used for solving the OPP since the optimal route may seem illogical to a human operator [31]. However, in the case of a semi-autonomous or completely autonomous system where the robot is the leader, this is no longer a problem. Koster and Poort [32] present a practical comparison between exact algorithms and heuristics by comparing the S-shape method with dynamic programming. In the S-shape strategy, an aisle is traversed entirely if it contains products to pick; otherwise, it is skipped. For example, when the picker starts from the lower-left corner of the warehouse (depot) and enters and leaves aisles from different sides (front and rear), it returns to the depot after finishing picking, resulting in an S-shaped route. Koster and Poort conclude that, despite the ease of use of the S-shape strategy, the optimal algorithms bring better savings in travel time. This result motivates the use of exact algorithms. For more details, Masae et al. [33] present a systematic literature review of order picker routing in warehouses.
Note that S-shape routing is the most used method in PS-AGV systems. This method is near-optimal when the pick density is very high [14]. Löffler et al. [15] extend the RR algorithm to the problem of picking a single order with given start and end locations. They also present an adaptation of the S-shape and gap strategies to AGV-assisted order picking, where the start and end points can differ and are not limited to the depot location. However, this method, like other heuristics, is less suited to accommodating side constraints such as the energy consumption and the mass of the transported load.
It is important to mention that the main and common focus of the aforementioned studies is the travel time and/or distance. However, neither energy consumption nor the effect of mass on energy consumption has been considered and investigated.

B. AGV ENERGY PERFORMANCE
Due to environmental concerns, much research has been conducted on the energy performance of autonomous material handling vehicles in order to manage the greening process in factories and warehouses [34]. Some researchers have focused on the development of decision support tools for selecting the type of material handling vehicle (such as liquefied petroleum gas, diesel, or electric) in order to minimize the environmental impact of warehouse activities [35], [36]. Other research directions aimed to assess the factors that influence the energy needs of AGVs and to model these needs [37], [38]. Some researchers focused on charging optimization [39], [40], while others focused on improving the energy efficiency of mobile robots through motion planning and control [41], [42], [43], [44]. Finally, another line of research aims to improve energy efficiency through routing decisions. Since the main goal of our work is to solve the order picking routing problem of an AGV in warehouses and DCs in an energy efficient way, we start by presenting some research studies that consider the minimization of energy consumption for the resolution of the OPP. Then, we present some other related works that focus mainly on energy efficient routing in the context of order picking.

1) ENERGY EFFICIENT ROUTING: RELATED WORKS
In the context of flexible manufacturing, Barak et al. [45] proposed an approach to modeling operation scheduling, machine allocation, and AGV scheduling while minimizing energy consumption. In particular, they used an adapted multi-objective particle swarm optimization method that takes into account both distance and load. Zhang et al. [46] proposed an energy-efficient path planning method for a single-load AGV in a factory. Some other research works consider energy efficiency for Robotic Mobile Fulfillment Systems (RMFS), such as Li et al. [47], Xu et al. [48], and Zhou and Zhu [49]. Unlike in the picker-to-part system, in an RMFS it is the items that move and not the workers. The RMFS is typically arranged in a grid with storage zones of inventory pods, picking stations, and replenishment stations. Robots lift and carry square shelving units called inventory pods, together with their items, from storage locations to replenishment or picking stations. In such a setting, the workers can fill or pick items from the inventory pods. The research works described above consider the routing energy efficiency of different types of AGVs, and some of them consider the effect of mass. However, these works deal with problems different from ours.
In addition, in the transport field, PS-AGV routing is strongly related to the vehicle routing problem and to pickup and delivery problems. In particular, PS-AGV routing can be considered as a generalization of the TSP. The classical vehicle routing problem consists of finding the best routes for a fleet of vehicles, rather than for a single one, in order to reduce transport costs. Still within the framework of green logistics, some studies have been carried out to reduce energy consumption in the vehicle routing problem. For example, a novel load-based cost objective for the energy-minimizing vehicle routing problem is proposed by Kara et al. [50]. The problem is presented as a Capacitated Vehicle Routing Problem (CVRP) and defined using integer linear programming formulations for the delivery and collection cases.

2) ENERGY EFFICIENT ROUTING FOR ORDER PICKING
As mentioned in Section II-A2, the goal of most of the approaches related to the OPP in the literature is to reduce the travel time and/or distance while overlooking the environmental performance of the warehouse [35], [51]. However, some studies attempt to find a trade-off between travel time and energy consumption minimization in order to optimize order picking routing. For instance, Ene et al. [52] developed a genetic algorithm for the order picking problem in warehouses. This algorithm aims to minimize the service time as well as the energy consumption via order batching and picking routing optimization. The authors showed through examples that their approach yields significant energy savings. However, their work estimates energy consumption based only on the vertical and horizontal speeds and the traveled distance of the forklift. They also assume a constant energy consumption per unit time, overlooking load and resistance forces in their calculation. Rojanapitoon and Teeravaraprug [53] introduced a new mathematical model for picker routing that minimizes the travel time and energy consumption given a variation of the level of traffic in a rectangular warehouse. Their mathematical model is then used in computer simulation software presented in a previous work [54]. They compare their results with the time-staged model and validate them with a brute-force search strategy. Compared to the method that optimizes time only, the authors reported that their proposed model optimizes both time and energy and saves up to 17% energy. However, they excluded the possibility of picking up the requested item on the second pass (if there is one) and overlooked the direction of travel in the construction of a complete picking tour, which can lead to sub-optimal solutions. Lee et al. [55] developed an integrated dynamic algorithm as a solution to the electric forklift routing problem with respect to battery charging constraints. That is, the algorithm considers the electric forklifts' picking/put-away routes and battery charging schedules along with the number of electric forklifts. In addition, their algorithm takes into account the consumption of electricity in the warehouse. Makris et al. [56] also addressed the OPP from an energy efficiency point of view. They present a TSP-based routing algorithm in order to achieve a trade-off between travel time and energy consumption of order picking in the warehouse. The authors, however, exclude the weight from their energy consumption evaluation. Similarly, most research works do not consider mass as a critical factor for vehicle routing decisions, especially when considering energy saving. Elbert and Müller [57] investigate the impact of the transported item weight on the velocity of the order picker and on travel time in manual picker-to-parts order picking. In their work, they focus on the problem of storage assignment and propose new weight class-based storage assignment policies to reduce travel time.
Therefore, two important points are overlooked in the aforementioned research studies and are addressed in this study. We discuss them in what follows. First, in the majority of research papers related to the resolution of the OPP, the constraints applied in the mathematical formulation force the solution either to pass only once through each required picking location or to take the products of the order on the first pass. These formulations lead to an optimal or near-optimal result that minimizes the travel time of the order picking tour. However, the energy gain has not been considered in the optimization formulation. This gain can be obtained by passing through a required point without picking up the load (so as not to carry it along) and by picking it up on the way back. Figure 1a shows an example of the energy gain obtained by not lifting the required item on the first pass. In this illustration, we suppose a bidirectional graph with a set of vertices X, Y, Z and a set of edges (X, Y), (Y, Z). The wavy arrows represent the shortest paths between two nodes. The OPP here is to start from point X, visit points Y and Z, and then return to point X with minimal cost. In this example, the resulting shortest path passes twice through the same node Y. Solution 1 does not consider the energy gain and hence carries the load of Y all the way from Y to Z, and then from Z back to Y. Solution 2, which takes the energy gain into consideration, first picks the load in Z and then, on the way back, picks the load in Y.
Second, most of the discussed works focus on the search for an optimal sub-tour and take the construction of a full tour for granted. Nevertheless, a change of direction of the same sub-tour can bring an energy gain if the energy loss linked to the movement of heavy objects is delayed. Figure 1b gives an example of saving energy by changing the direction. We assume in this figure a different bidirectional graph with three vertices A, B, and C. Here, the OPP is to start from point A, pick up the loads at points B and C, and return to point A with minimal cost. In this example, the load in B is heavier than the load in C. Solution 1 does not consider the change of direction and therefore first picks the heavier load, carries it all the way to C, and goes back to A. Solution 2 first picks the lighter load in C, then picks the heavier one in B, and goes back to A. This change of direction can lead to energy savings.
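The two effects described above (deferring a pick to the return pass, and reversing the direction of a sub-tour) can be illustrated numerically. The sketch below is not the energy model of this paper, which is not given in this section. It assumes a simple rolling-resistance model on flat ground at constant speed, with energy equal to c_rr · (m0 + load) · g · d per arc. The coefficient `c_rr`, the vehicle mass `M0`, and all distances and case masses are hypothetical values chosen for illustration, and the triangle used for the Figure 1b style example is likewise an assumption.

```python
G_ACCEL = 9.81  # gravitational acceleration in m/s^2


def tour_energy(legs, m0, c_rr=0.02):
    """Rolling-resistance energy (J) of a tour on flat ground at constant speed.

    legs -- list of (distance_m, carried_load_kg) pairs, one per traversed arc
    m0   -- mass of the empty AGV in kg
    c_rr -- rolling resistance coefficient (assumed value)
    """
    return sum(c_rr * (m0 + load) * G_ACCEL * dist for dist, load in legs)


M0 = 100.0  # assumed empty-vehicle mass in kg

# Figure 1a style example: path X - Y - Z, tour X -> Y -> Z -> Y -> X.
d_xy, d_yz, m_y, m_z = 10.0, 20.0, 50.0, 10.0
fig1a_sol1 = [(d_xy, 0), (d_yz, m_y), (d_yz, m_y + m_z), (d_xy, m_y + m_z)]  # pick Y on first pass
fig1a_sol2 = [(d_xy, 0), (d_yz, 0), (d_yz, m_z), (d_xy, m_y + m_z)]          # pick Y on the way back

# Figure 1b style example: triangle A, B, C with a heavier load at B.
d, m_b, m_c = 10.0, 80.0, 10.0
fig1b_sol1 = [(d, 0), (d, m_b), (d, m_b + m_c)]   # A -> B -> C -> A, heavy load first
fig1b_sol2 = [(d, 0), (d, m_c), (d, m_b + m_c)]   # A -> C -> B -> A, heavy load last

for name, legs in [("1a/sol1", fig1a_sol1), ("1a/sol2", fig1a_sol2),
                   ("1b/sol1", fig1b_sol1), ("1b/sol2", fig1b_sol2)]:
    print(name, round(tour_energy(legs, M0), 1), "J")
```

In both pairs the traveled distance is identical, so a time-only objective cannot distinguish the two solutions, whereas the mass-aware energy term favors Solution 2.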
Hence, the main contribution of this work is to tackle these two not-well-studied points in the literature. Thus, in our approach called Energy Efficient Order Picking Routing (EE-OPR), we consider time, energy, and mass in the planning decision in order to achieve an efficient order picking tour for an autonomous material handling vehicle. Moreover, unlike classical RR-based approaches which are mainly based on the particular structure of rectangular parallel-aisle warehouses for creating the subproblems of a dynamic program, the proposed approach can be used for other warehouses with different arrangements and layouts. In fact, in order to solve the OPP while simultaneously minimizing time and energy, a dynamic program is developed. This program considers the problem as an eco-energetic STSP and transforms it into a shortest path problem (SPP) by creating an acyclic dynamic state graph and performing a graph search process.

III. METHODOLOGY
In this section, we present the methodology we followed for EE-OPR.

A. WAREHOUSE LAYOUT REPRESENTATION
A conventional single-block parallel-aisle warehouse with a single depot is considered as the classic configuration. Such a warehouse consists of $g$ vertical aisles and 2 horizontal cross-aisles. Figure 2 presents an example of this case with six aisles and two cross-aisles. Aisles contain products on both sides, while cross-aisles form intersections through which the AGV can navigate. The warehouse structure can be described by the graph $G_0(V_0, A_0)$, where $V_0 = \{v_0, \ldots, v_k\}$ is a set of $k$ vertices (yellow and blue circles in Figure 2a) and $A_0$ is a set of arcs denoting connections between vertices. In order to facilitate the selection of pick-list elements from their locations, the main set of vertices $V_0$ is divided into two subsets, denoted by $V_I$ and $V_L$. $V_I$ contains the intersection vertices in cross-aisles (blue circles in Figure 2). $V_L$ contains the other vertices, which account for possible picking locations (yellow circles in Figure 2a). An order given by a pick-list can be specified by another set $V_P \subseteq V_L$ with cardinality $p \geq 1$. This subset contains the vertices associated with $p$ cases, which are described by their SKU and location in the warehouse (pink vertices in Figure 2b). The pick-list contains the order lines of a single customer (pick-by-order) or of multiple customers (pick-by-batch). In the rest of the article, we use the term 'case' to denote the total number of items or batches that are picked at the same pick location. In addition, we designate these cases by the vertex located at the corresponding position. For example, the picking location of case $i$ is expressed by $v_i$. Furthermore, the masses of the cases are given by the set $Y = \{m_1, \ldots, m_p\}$, in which each element $m_i$ represents the mass of case $i$ in kilograms (kg). Also, $m_0$ denotes the mass of the AGV without load.
The OPP is considered as an eco-energetic STSP that is stated on a directed graph $G(V, A)$. Figure 2b presents an example of this graph with 16 vertices, of which 6 are required. This graph is derived from the warehouse graph shown in Figure 2a and contains only relevant locations. These locations comprise the Steiner vertices, $V_I$, and the pick-list vertices, $V_P$, and are represented by the set $V = \{v_0, \ldots, v_n\}$. The set $A$ contains the arcs that connect adjacent vertices. The weight of each arc corresponds to the Euclidean distance, $d_{ij}$, between the $i$-th and $j$-th vertices that it connects. Accordingly, the picking routing problem consists of optimizing the AGV tour to collect all products in the pick-list by minimizing time and energy consumption while traveling from the initial to the target position (the depot or another predefined position). The optimal tour also takes into account the mass of the pick-list elements.
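As an illustration of how such a reduced graph could be assembled, the following sketch builds the vertices and Euclidean-weighted arcs of a single-block warehouse that keeps only the cross-aisle intersections and the pick-list locations. The function `build_reduced_graph`, the vertex labels, and the geometric parameters `aisle_gap` and `slot_gap` are hypothetical and are not part of the source. The sketch also assumes bidirectional travel in aisles and cross-aisles.

```python
import math


def build_reduced_graph(n_aisles, slots_per_aisle, pick_slots,
                        aisle_gap=3.0, slot_gap=1.0):
    """Build the reduced graph of a single-block, two-cross-aisle warehouse.

    pick_slots -- set of (aisle, slot) pick-list locations, slot in 1..slots_per_aisle
    Returns (coords, arcs): coords maps vertex -> (x, y), arcs maps (u, v) -> distance.
    """
    coords = {}
    rear_y = (slots_per_aisle + 1) * slot_gap
    for a in range(n_aisles):                       # intersection (Steiner) vertices
        coords[("I", a, "front")] = (a * aisle_gap, 0.0)
        coords[("I", a, "rear")] = (a * aisle_gap, rear_y)
    for a, s in pick_slots:                         # required pick-list vertices
        coords[("P", a, s)] = (a * aisle_gap, s * slot_gap)

    arcs = {}

    def connect(u, v):
        dist = math.dist(coords[u], coords[v])      # Euclidean arc weight
        arcs[(u, v)] = dist
        arcs[(v, u)] = dist                         # two-way travel assumed

    for a in range(n_aisles):
        in_aisle = sorted(s for (aa, s) in pick_slots if aa == a)
        chain = [("I", a, "front")] + [("P", a, s) for s in in_aisle] + [("I", a, "rear")]
        for u, v in zip(chain, chain[1:]):          # adjacent vertices along the aisle
            connect(u, v)
        if a + 1 < n_aisles:                        # adjacent intersections in cross-aisles
            connect(("I", a, "front"), ("I", a + 1, "front"))
            connect(("I", a, "rear"), ("I", a + 1, "rear"))
    return coords, arcs


coords, arcs = build_reduced_graph(3, 5, {(0, 2), (2, 4)})
print(len(coords), "vertices,", len(arcs), "arcs")
```

Restricting an aisle to one-way travel would amount to omitting the reverse arc in `connect` for the aisle segments.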
Note that the objective is to find an eco-energetic tour that is not necessarily Hamiltonian, given that the problem is an eco-energetic Steiner TSP. This allows vertices and edges to be traversed more than once, if desired. To this end, a new scheme is proposed that is applicable to any warehouse layout. EE-OPR is based on the creation of dynamic states through the bitmasking method [58]. Additionally, it takes advantage of the dynamized Dijkstra algorithm for the graph search [59]. EE-OPR considers the energy consumption of the AGV and the effect of the transported cargo weight.

B. ASSUMPTIONS
In this section, we list the assumptions considered in EE-OPR.
• Only the energy consumption of the AGV travel movement is considered. The time and energy spent at picking stops are assumed to be constant or negligible. Moreover, depending on the type of robot, if it has to lift the cases itself, the energy loss due to this work is not taken into account in the mathematical modeling. This is because the energy required to overcome gravity when lifting the cases is almost the same regardless of the route and does not affect the routing decision.
• The minimization of the total travel time is equivalent to the minimization of the total tour length, given that the vehicle moves uniformly (constant speed $V$ along both coordinate axes). Therefore, the energy consumption caused by acceleration and deceleration is not considered.
• The definition of the pick-list is done beforehand so that the energy onboard is sufficient to finish the tour and that the volume and the mass of the cases do not exceed the AGV's capacity Q.
• The mass of all SKUs is available and by knowing the quantity required in each pick location, the mass of each case can be computed.
• The order picker continues to pick up all required items at the same storage location when the AGV stops for picking.
• The aisles are wide enough to allow two-way travel of AGVs. This assumption can be adapted to parallel closed-end picking aisles by simply modifying the graph edges, with unidirectional arcs representing the aisles and bidirectional arcs representing the cross-aisles.
• The depot is located in the lower-left corner of the warehouse for simplicity. However, this assumption can easily be changed for any layout.
• A single AGV is considered in this case study. This can be easily extended to accommodate multiple AGVs.
• We assume that the robot can pass more than once on the same vertex during a picking tour.

C. DYNAMIC GRAPH CREATION
The STSP is a combinatorial optimization problem that can be formulated as a shortest path problem [58]. The solution to this problem is a minimum-cost path that starts from the initial position with zero load and ends at the final position with all requested loads. EE-OPR is based on the incremental, dynamic creation of a state graph. Therefore, a new graph $G'(V', A')$, called the state graph, is created in addition to the spatial graph $G(V, A)$. The set $V'$ represents the state vertices and the set $A'$ defines the arcs between two successive states. $G'(V', A')$ is weighted by a cost function that considers the cost of the previous state and the direct transition cost. This function determines the cost in time and energy required to travel the distance $d_{ij}$ with a load $m$. The shortest path problem is solved by means of the Bellman principle. These notions are detailed in what follows.
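The Bellman principle mentioned above can be written compactly as the following recursion over the states of the state graph. The symbols $C(\cdot)$ and $c(\cdot,\cdot)$, and the use of $A'$ for the arc set of the state graph, are notation assumed here for illustration. This section of the source does not fix the exact form of the transition cost, so $c$ is left generic.

```latex
% Assumed notation (not from the source):
%   C(s)      minimum cumulated cost to reach state s from the start state
%   c(s, s')  direct transition cost of the arc (s, s') of the state graph,
%             a function of the travelled distance d_{ij} and the carried load m
%   A'        arc set of the state graph
\begin{align}
  C(s_{\text{start}}) &= 0, \\
  C(s') &= \min_{(s,\, s') \in A'} \bigl[\, C(s) + c(s, s') \,\bigr].
\end{align}
```

Provided the transition costs are non-negative, this recursion can be evaluated by a label-setting search such as Dijkstra's algorithm, which fits the graph search described in this section.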

1) STATE VERTEX DEFINITION
As a particular case of the TSP, the STSP can use a bitmask representation, in which each item of information is encoded by a binary digit (0 or 1) at a fixed position of a bit string (the bitmask) [58]. Let the pick-list contain $p$ vertices. For this list, a tuple $a_j = (a_1\, a_2 \ldots a_p)$ of $p$ binary digits is created in the same order. This tuple, called an arrangement, defines the cases that have been picked at a certain time by setting the binary value 1 at their corresponding positions. For instance, $a_j = (0\, 1 \ldots 0)$ shows that the second case of the pick-list has been picked by the robot, and $a_j = (1\, 1 \ldots 1)$ signifies that all cases have been picked. The total number of arrangements is $2^p$. Since the problem is stated on an incomplete network graph and only a subset of 'visiting vertices' [22] is covered by the tour, the vertex information is also required to formulate the picking problem. In fact, the robot can be positioned at any vertex $v_i \in V$ (required or not) and transport 0 to $p$ cases. Consequently, spatial and temporal information is paired and referred to as a state. A state is represented as a tuple $s_k = (v_i, a_j)$, in which $v_i$ is the vertex and $a_j$ is the arrangement. The space complexity of the state representation is $n 2^p$, where $n$ is the cardinality of $V$. Generally, the overall execution time of a TSP solved with bitmask Dynamic Programming (DP) for $n$ cities is $O(n^2 \cdot 2^n)$ [58]. Therefore, solving the Steiner TSP with bitmask DP takes $O(n^2 \cdot 2^p)$. Nevertheless, the running time of our approach can be lower thanks to the sparsity of the warehouse graph and the particularity of the Steiner points, because EE-OPR creates only the necessary states and edges.
From this perspective, the targeted case can be presented in terms of a shortest path problem where s start = (v start , a start ) and s target = (v target , a target ) are starting and ending states, respectively. For these states, v start , and v target stand for starting and ending vertices. In addition, a start = (a 1 , a 2 , . . . , a p ) where a i = 0 ∀i ∈ {1, ..., p} and a target = (a 1 , a 2 , . . . , a p ) where a i = 1 ∀i ∈ {1, ..., p} define starting arrangement and target arrangement, respectively. In the rest of the paper, we will refer to starting arrangement and target arrangement as 'empty arrangement' and 'full arrangement', respectively.  , 11), respectively. In order to avoid creating all possible states, a new Directed Acyclic Graph (DAG), G , is considered. This dynamic graph is initiated by the startstate vertex. Afterward, it is expanded by state vertices and transition arcs based on the spatial graph G, the exploration, and the current load arrangement.
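The state definition above can be illustrated with a minimal Python sketch. The names `State`, `full_arrangement` and `as_tuple`, as well as the two-item pick-list and the depot vertex "A", are hypothetical and chosen only for illustration; the arrangement is stored as an integer bitmask in which bit $y$ corresponds to the $(y+1)$-th case of the pick-list.

```python
from typing import NamedTuple, Tuple


class State(NamedTuple):
    vertex: str       # spatial vertex v_i of the warehouse graph G
    arrangement: int  # bitmask a_j: bit y is 1 once pick-list item y is on board


def full_arrangement(p: int) -> int:
    """Arrangement with all p bits set (every case picked)."""
    return (1 << p) - 1


def as_tuple(arrangement: int, p: int) -> Tuple[int, ...]:
    """Expand the bitmask into the tuple (a_1, ..., a_p) used in the text."""
    return tuple((arrangement >> y) & 1 for y in range(p))


pick_list = ["B", "D"]                      # hypothetical pick-list, p = 2
p = len(pick_list)
s_start = State("A", 0)                     # 'empty arrangement'
s_target = State("A", full_arrangement(p))  # 'full arrangement'
second_picked = 1 << 1                      # only the second case is on board

print(as_tuple(s_start.arrangement, p))   # (0, 0)
print(as_tuple(second_picked, p))         # (0, 1)
print(as_tuple(s_target.arrangement, p))  # (1, 1)
```

Storing the arrangement as an integer keeps each state hashable and compact, which is convenient when states are created on demand and looked up in dictionaries.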

2) TRANSITION ARC DEFINITION
During the picking tour, the robot can move between vertices while transporting 0 to $p$ cases. This movement corresponds to a transition between two states and is represented by an arc $a \in A'$. An adjacency matrix of size $n2^p \times n2^p$ could be used to define the possible transitions over all states. However, this matrix includes impossible transitions and unattainable states, namely:
• loading more than one item at the same time,
• decreasing the load during the tour,
• turning on a bit in the arrangement (zero → one) when its corresponding vertex does not exist in the pick-list,
• activating a bit in the arrangement (zero → one) when its position does not match that of the vertex in the pick-list.
Consequently, EE-OPR generates the states and arcs of $G'$ dynamically. Additionally, the edge relaxation technique is used to update the paths associated with the existing states. The generation is performed while traversing the spatial graph $G$ (the exploration phase).
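The four excluded cases listed above can be expressed as a small validity check on a pair of arrangements. This is only a sketch under the assumption that arrangements are integer bitmasks indexed by pick-list position; the function name `is_valid_transition` and the example pick-list are hypothetical.

```python
from typing import Sequence


def is_valid_transition(a_from: int, a_to: int, next_vertex: str,
                        pick_list: Sequence[str]) -> bool:
    """Return True if moving to next_vertex may change a_from into a_to."""
    changed = a_from ^ a_to
    if changed == 0:
        return True                      # plain move, arrangement unchanged
    if a_to & a_from != a_from:
        return False                     # a bit was cleared: load decreased
    if changed & (changed - 1):
        return False                     # more than one item loaded at once
    if next_vertex not in pick_list:
        return False                     # vertex is not a required vertex
    # The activated bit must sit at the pick-list position of next_vertex.
    return changed == 1 << pick_list.index(next_vertex)


pick_list = ["B", "D"]  # hypothetical pick-list
print(is_valid_transition(0b00, 0b01, "B", pick_list))  # True
print(is_valid_transition(0b00, 0b11, "B", pick_list))  # False
```

EE-OPR avoids these cases by construction rather than by filtering, so such a check would mainly serve as a test oracle for a state generator.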

3) STATE VERTICES AND TRANSITION ARCS CREATION
According to the above discussion, a state can be described by several attributes: name $(v_i, a)$, vertex $v$, arrangement $a$, mass $m$, predecessor $state_{i-1}$, and cost $c$ (of travelling from the starting vertex). The exploration phase creates states and transition arcs while passing through the graph $G$. At each step, a transition between two states consists of a change of position (vertex) and, if possible, an update of the arrangement. The vertex change can occur only between two adjacent vertices $v_i, v_j \in V$ with $(v_i, v_j) \in A$. The arrangement (state of loads) can change in one of two ways:
• $a_1$: the arrangement remains the same (not carrying $v_j$).
• $a_2$: the arrangement is updated by turning on the bit corresponding to $v_j$ (carrying $v_j$) (see Figure 4).
By pairing the arrangements $a_1$ and $a_2$ with a neighbor vertex, two possible states can be created:
• $(v_j, a_1)$: the AGV moves to the next vertex $v_j$ with the same load.
• $(v_j, a_2)$: the AGV moves to the next vertex $v_j$ with its current load and picks the load at $v_j$.
If $v_j$ belongs to the pick-list, both possibilities are allowed. Otherwise, only the first one is permitted. Note that only one bit can change at each arrangement update. Figure 5 illustrates the creation of successor states for the state (A, 00) in the graph of Figure 3. It can be deduced that the neighbor B is an item of the pick-list since both possibilities are shown. Each choice is expressed by a tuple in which the digit related to the neighbor vertex is 0 ($state_1$) or 1 ($state_2$). In this figure, the 'arrangement update' refers to the masking process described in Figure 4.
The state creation procedure is described in Algorithm 1 and Algorithm 2. The first algorithm defines the function CreateState1. It takes as input the current state (its vertex $u$, arrangement $a$, mass $m$, and current cost $J^*$) as well as the neighbor spatial vertex $v \in V$. As output, it generates a state $state_1 = (v, a)$ that pairs the successor vertex $v$ with the unchanged arrangement and mass, together with the cumulative cost of moving from the current state to $state_1$, computed with the transitionCost function defined in the next subsection. Furthermore, the predecessor attribute is set to the current state.
The second algorithm defines the function CreateState2, which takes the same inputs plus the pick-list. It first determines $y$, the position (index) of $v$ in the pick-list. Then, a mask is applied to the current arrangement to give a new arrangement $a_2$. This mask is a binary number of $p$ bits that are all zero except the $y$-th bit, which is set to one. $a_2$ is then paired with the successor vertex $v$ to create $state_2 = (v, a_2)$. In addition, a new mass $m_2$ is calculated as the sum of the current mass and the mass associated with the vertex $v$. The cost of $state_2$ is calculated from the current cost $J^*$ and the transitionCost function, which takes the current vertex $u$, the successor vertex $v$, and $m_2$. These two functions are used in the main EE-OPR algorithm.
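The two functions described above might be sketched in Python as follows. This is not the paper's Algorithm 1 or Algorithm 2 verbatim: the `StateRecord` dataclass, the argument names, the `item_mass` dictionary and the toy cost function are assumptions introduced for illustration, and the mass stored in a state is taken here to be the carried load only.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Sequence

# Hypothetical signature: cost of moving from u to v while carrying mass m.
TransitionCost = Callable[[str, str, float], float]


@dataclass
class StateRecord:
    vertex: str
    arrangement: int          # bitmask over the pick-list
    mass: float               # carried load in kg
    cost: float               # cumulative cost J* from the start state
    predecessor: Optional["StateRecord"] = None


def create_state_1(current: StateRecord, v: str,
                   transition_cost: TransitionCost) -> StateRecord:
    """Move to neighbor v without picking: same arrangement, same mass."""
    cost = current.cost + transition_cost(current.vertex, v, current.mass)
    return StateRecord(v, current.arrangement, current.mass, cost, current)


def create_state_2(current: StateRecord, v: str, pick_list: Sequence[str],
                   item_mass: Dict[str, float],
                   transition_cost: TransitionCost) -> StateRecord:
    """Move to neighbor v and pick its item: set bit y, add its mass."""
    y = pick_list.index(v)                # position of v in the pick-list
    a2 = current.arrangement | (1 << y)   # mask with only the y-th bit set
    m2 = current.mass + item_mass[v]
    # As in the text, the transition cost is evaluated with the new mass m2.
    cost = current.cost + transition_cost(current.vertex, v, m2)
    return StateRecord(v, a2, m2, cost, current)


# Toy cost function and data, for illustration only.
toy_cost: TransitionCost = lambda u, v, m: 1.0 + 0.01 * m
start = StateRecord("A", 0, 0.0, 0.0)
s1 = create_state_1(start, "B", toy_cost)
s2 = create_state_2(start, "B", ["B", "D"], {"B": 50.0, "D": 20.0}, toy_cost)
print(s1.arrangement, s1.mass, s2.arrangement, s2.mass)
```

Keeping the predecessor inside each record makes it possible to trace a tour back from the target state without a separate lookup table.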

4) COST OF THE TRANSITION IN TERMS OF TIME AND ENERGY
By defining state vertices and transition edges, a new state graph $G'(V', A')$ is created. This graph (which is a DAG) starts from the start state $s_{start} = (v_{start}, a_{start})$ and branches out into state vertices connected by transition arcs. The graph is weighted according to the criterion to be minimized (time only, energy only, or both time and energy). We call the methods that solve the OPP considering only the travel time Travel Time Minimization (TTM), and those that consider only the energy Energy Consumption Minimization (ECM).
FIGURE 6. Dynamic states graph $G'$.
The time required to move from state $i$ to state $j$ is denoted $t_{ij}$. The energy demand to move from state $i$ to state $j$, denoted $e_{ij}$, is derived from the vehicle's longitudinal dynamics when moving along any arc $(i, j)$, in which $F_T$, $M$, $V$, $\rho$, $A$, $C_x$, $v_w$, $g$, $\mu$ and $\theta$ stand for the traction force at the wheels, the total vehicle mass, the speed, the air density, the active aerodynamic surface of the vehicle, the drag coefficient, the wind speed, the gravitational constant, the rolling resistance coefficient and the ground slope, respectively. Since most warehouses have flat ground, we consider $\theta$ to be null [46]. In addition, the wind speed in a warehouse can be neglected. Furthermore, since the same vehicle moves along all arcs, the aerodynamic force $\frac{1}{2}\rho A C_x (V - v_w)^2$ is similar on every arc. Moreover, the acceleration is limited to avoid sudden motion, jerk and tip-over. Therefore, considering that the AGV's velocity is constant, friction is the major external force applied to the AGV in an indoor context. Let $M^R_i$ and $M^L_i$ be the total masses of the AGV when it reaches the vertex of state $i$ and when it leaves it, respectively. The total mass, in kg, includes the AGV's own mass $m_0$, the carried SKUs (cases), and the picker's mass (if the picker rides the vehicle).
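The reasoning above can be written out as a short derivation. The force balance below is the textbook form of the longitudinal dynamics and is assumed here rather than copied from the paper; likewise, the expressions for $e_{ij}$ and $t_{ij}$ are a plausible reading of the simplifications described in the text (flat ground, no wind, constant speed), not a confirmed transcription.

```latex
% Assumed textbook longitudinal force balance on an arc (i, j):
F_T = M \frac{dV}{dt} + \frac{1}{2}\rho A C_x (V - v_w)^2
      + M g \mu \cos\theta + M g \sin\theta

% Flat ground (\theta = 0), constant speed (dV/dt = 0), and an aerodynamic
% term that is the same on every arc leave rolling friction as the
% mass-dependent part:
F_T \approx \mu M g

% Assumed consequence for an arc of length d_{ij} travelled at speed V
% with leaving mass M^L_i:
e_{ij} \approx \mu \, g \, M^{L}_{i} \, d_{ij},
\qquad
t_{ij} \approx \frac{d_{ij}}{V}
```

Under these assumptions the energy on an arc grows linearly with the transported mass, which is why the order of pickups matters even when the travelled distance is unchanged.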
It is important to mention that minimizing only $e_{ij}$ (as ECMs do) can lead to longer and more time-consuming tours than the shortest ones (given by TTMs). For instance, consider a directed graph $C_n$ with 4 vertices A, C, D, and E, and two candidate paths $P_1$ and $P_2$, where $P_2$ outperforms $P_1$ in terms of energy but consumes more time than $P_1$. It is therefore important to take both criteria (time and energy) into account in order to reach a good trade-off between them, which is what EE-OPR aims for. Since $t_{ij}$ and $e_{ij}$ have different units, let us define $c_{ij}$, the total cost in dollars to move from $i$ to $j$, as follows:
$$c_{ij} = c_t \, t_{ij} + c_e \, e_{ij}$$
where $c_t$ and $c_e$ are two coefficients representing the time cost in dollars per second and the energy cost in dollars per joule, respectively. These two coefficients define the importance of each criterion. For example, when a warehouse faces a large number of orders, the time factor is crucial and it is thus recommended to increase the time cost $c_t$.
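As an illustration of how the two criteria combine into one cost, the following sketch evaluates a weighted sum of travel time and rolling-resistance energy for a single arc. The energy model (rolling resistance only, at constant speed) and all numeric values except the 1600 kg vehicle mass and the 1.2 m/s speed reported in the experiments are assumptions made for this example.

```python
G = 9.81  # gravitational acceleration in m/s^2


def travel_time(d_ij: float, speed: float) -> float:
    """Time in seconds to cover d_ij metres at constant speed."""
    return d_ij / speed


def rolling_energy(d_ij: float, total_mass: float, mu: float) -> float:
    """Assumed energy model in joules: rolling resistance over d_ij metres."""
    return mu * total_mass * G * d_ij


def transition_cost(d_ij: float, load_mass: float, *, vehicle_mass: float,
                    speed: float, mu: float, c_t: float, c_e: float) -> float:
    """Cost in dollars: c_t * t_ij + c_e * e_ij."""
    t_ij = travel_time(d_ij, speed)
    e_ij = rolling_energy(d_ij, vehicle_mass + load_mass, mu)
    return c_t * t_ij + c_e * e_ij


# mu, c_t and c_e are hypothetical values chosen only for illustration.
params = dict(vehicle_mass=1600.0, speed=1.2, mu=0.02, c_t=0.01, c_e=1e-6)
empty = transition_cost(12.0, 0.0, **params)
loaded = transition_cost(12.0, 400.0, **params)
print(f"empty: {empty:.6f} $, loaded: {loaded:.6f} $")
```

Raising `c_t` relative to `c_e` pushes the optimum towards the shortest tour, while raising `c_e` favours tours that carry heavy items over shorter distances.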

D. GRAPH SEARCH PROBLEM
In addition to the construction of the state graph, the cost associated with each state is updated following Bellman's Principle of Optimality [60]. This technique is inspired by the dynamized Dijkstra's algorithm [61], which is considered a dynamic programming successive approximation procedure. Considering the graph $G'(V', A')$, each arc $(i, j) \in A'$ is weighted by $c_{ij} \in \mathbb{R}_{\ge 0}$. Suppose that $s_{start}, s_{target} \in V'$ are the source and the destination vertices, respectively. We define $X$ as a directed path, i.e., a sequence of vertices $(v_1, v_2, \dots, v_m)$ such that $v_1 = s_{start}$, $v_m = s_{target}$ and $(v_i, v_{i+1}) \in A'$ for all $i \in \{1, \dots, m-1\}$; $c(X)$ is the path's cost, that is, the sum of the arcs' costs in $X$: $c(X) = \sum_{i=1}^{m-1} c_{v_i, v_{i+1}}$. The state vertices that constitute the resulting path are visited at most once. Note that this refers to state vertices and not to the spatial vertices belonging to $V$: a spatial vertex can be passed twice, but with different arrangements.
Given that the eco-energetic STSP is seen as a shortest path problem between the state $s_{start}$ and the state $s_{target}$, the standard integer programming formulation of [62] is adopted (shortest path here designates the optimal-cost path). In this formulation, $i$ and $j$ denote state vertices, the superscripts $+$ and $-$ on $i$ designate the sets of its successor and predecessor vertices, respectively, and $x_{ij}$ is the decision variable defining whether arc $(i, j)$ is part of the shortest path: it takes the value 1 if so and 0 otherwise. Constraints (1) and (4) specify that each vertex on the shortest path, other than the start state and the target state, must have the same number of incoming and outgoing arcs. In order to solve the presented eco-energetic STSP, EE-OPR is proposed. It proceeds as follows. Create a priority queue $Q$ and initialize it with the start state at a cost of 0. While the queue is not empty, select the state with minimal cost, take it as the current state and remove it from the queue. Then, determine its composition, i.e., its vertex $u$ and arrangement $a$. Explore the vertex $u$ by determining its neighboring vertices. These vertices are used to create $state_1$ and $state_2$, which are added to the queue if they do not already exist. Otherwise, check whether the cost already computed for the successor state can be decreased by reaching it through the current state. This is described by the RELAX function (Algorithm 4). In parallel, the cost list and the pred list store the added states with their costs and predecessors, respectively; the RELAX function updates them whenever it finds a cheaper path. The final cost of the target state is the optimal cost of the picking tour, and the pred list allows the optimal path to be traced back from the target state to the start state.
Note that EE-OPR prevents movement to the target vertex without having retrieved all the items in the pick-list. This is done during the exploration phase by the prohibition of the creation of states and transitions when the neighboring vertex is the target vertex but the arrangement is different from the 'full arrangement' (see the first if statement in the algorithm).
Therefore, if the target vertex is different from the start vertex, the only state that uses this vertex is the target state (the total number of possible states is thus less than $(n-1)2^p + 1$). Otherwise, if the depot is both the starting point and the ending point, the only two possible states at this position are those having the 'empty arrangement' (at the start) and the 'full arrangement' (at the end), and the total number of possible states is $(n-1)2^p + 2$. Consequently, only arcs leaving the start state and arcs entering the target state are kept. Moreover, when a neighboring vertex $v$ of the current vertex $u$ belongs to the pick-list and the bit corresponding to $v$ in the current arrangement is already equal to one, $state_2$ is equal to $state_1$. Hence, only $state_1$ is added.
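The search procedure and the two special cases above can be put together in a compact sketch. This is an illustrative Dijkstra-style implementation over lazily created states, not the paper's pseudocode: the function name `ee_opr`, the graph encoding (a dictionary of neighbor distances), the `pick_mass` dictionary and the toy cost function are all assumptions. Following the description of CreateState2, the arc into a picked vertex is charged with the updated mass.

```python
import heapq
from itertools import count


def ee_opr(graph, pick_mass, v_start, v_target, transition_cost):
    """graph: {u: {v: distance}}, pick_mass: {vertex: item mass}.
    transition_cost(distance, carried_mass) -> cost of one arc.
    Returns (optimal cost, list of states) or None if no tour exists."""
    index = {v: y for y, v in enumerate(pick_mass)}   # bit position per item
    full = (1 << len(index)) - 1
    start, target = (v_start, 0), (v_target, full)
    cost, mass, pred = {start: 0.0}, {start: 0.0}, {start: None}
    tie = count()                                     # heap tie-breaker
    queue = [(0.0, next(tie), start)]
    done = set()
    while queue:
        c, _, state = heapq.heappop(queue)            # minimal-cost state
        if state in done:
            continue                                  # stale queue entry
        done.add(state)
        if state == target:
            break
        u, a = state
        for v, d in graph[u].items():
            candidates = [(a, mass[state])]           # state_1: no pickup
            if v in index and not a & (1 << index[v]):
                # state_2: pick the item stored at v
                candidates.append((a | (1 << index[v]),
                                   mass[state] + pick_mass[v]))
            for a_next, m_next in candidates:
                if v == v_target and a_next != full:
                    continue      # target vertex only with full arrangement
                nxt = (v, a_next)
                new_cost = c + transition_cost(d, m_next)
                if new_cost < cost.get(nxt, float("inf")):   # RELAX step
                    cost[nxt], mass[nxt], pred[nxt] = new_cost, m_next, state
                    heapq.heappush(queue, (new_cost, next(tie), nxt))
    if target not in cost:
        return None
    path, s = [], target
    while s is not None:                              # trace predecessors
        path.append(s)
        s = pred[s]
    return cost[target], path[::-1]


# Toy corridor D - X - Y with a heavy item at X and a light item at Y.
graph = {"D": {"X": 1.0}, "X": {"D": 1.0, "Y": 1.0}, "Y": {"X": 1.0}}
pick_mass = {"X": 100.0, "Y": 10.0}
toy_cost = lambda d, m: d * (1.0 + 0.01 * m)
total, tour = ee_opr(graph, pick_mass, "D", "D", toy_cost)
print(round(total, 6), tour)
```

In this toy instance the sketch passes X without picking, collects the light item at Y, and picks the heavy item at X on the way back, which mirrors the delayed-pickup behavior discussed in the experiments.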
Now, we discuss the time complexity of EE-OPR. The priority queue $Q$ is represented as a binary heap, whose operations are performed in $O(\log q)$ time, where $q$ is the size of $Q$. Each extract-min operation therefore takes $O(\log|V'|)$ time. Moreover, iterating over the neighbors of all state vertices and updating their cost values is executed a total of $O(|V'|)$ times, since each state has a bounded number of successors in the sparse warehouse graph, and each priority update takes $O(\log|V'|)$ time. Consequently, both steps together take $O(|V'| \log|V'|)$ time [63].
As a result, the overall time complexity of EE-OPR is $O(|V'| \log|V'|)$. Given that the maximum number of states that can be created is less than $n2^p$, the overall time complexity of EE-OPR is $O(n2^p \log(n2^p))$.
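To give a feel for how these bounds grow with the pick-list size, the short sketch below evaluates the state-count bound and the $n2^p \log(n2^p)$ term. The function names and the example value $n = 2000$ are hypothetical; the figures are upper bounds, and the dynamic state creation of EE-OPR is expected to stay well below them.

```python
import math


def max_states(n: int, p: int, depot_is_target: bool = True) -> int:
    """Upper bound on the number of states: (n - 1) * 2^p + 2 (or + 1)."""
    return (n - 1) * 2 ** p + (2 if depot_is_target else 1)


def heap_work_bound(n: int, p: int) -> float:
    """Value of n * 2^p * log2(n * 2^p), the term inside the O(.) bound."""
    s = n * 2 ** p
    return s * math.log2(s)


n = 2000  # hypothetical number of spatial vertices
for p in (8, 12, 16):
    print(p, max_states(n, p), f"{heap_work_bound(n, p):.3e}")
```

Each additional pick location doubles the bound, which is consistent with the practical limit on the number of stops per tour mentioned in the discussion.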

IV. EMPIRICAL STUDIES
A. WAREHOUSE LAYOUT
As shown in Figure 7, we assume a single block warehouse with g aisles, for each of which l horizontal picking positions are considered. This warehouse graph is chosen for the evaluation of our proposed method as it is the most common structure studied in the literature. However, we note that EE-OPR can be adapted to any other warehouse configuration.
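A spatial graph for such a layout might be built as in the sketch below. The exact topology of Figure 7 is not reproduced here: the sketch assumes $g$ parallel aisles with $l$ pick positions each, joined by a front and a rear cross aisle, and uses 4 m between aisles and 1 m between pick positions as in the experiments. The function name and the vertex labels `(aisle, position)` are hypothetical.

```python
def build_single_block_warehouse(g, l, aisle_gap=4.0, slot_gap=1.0):
    """Return {vertex: {neighbor: distance}} for an assumed single-block layout.
    Vertex (a, k): aisle a in 0..g-1; k = 0 is the front cross aisle,
    k = 1..l are pick positions, k = l + 1 is the rear cross aisle."""
    graph = {}

    def add_edge(u, v, d):
        # Bidirectional travel is allowed in every segment.
        graph.setdefault(u, {})[v] = d
        graph.setdefault(v, {})[u] = d

    for a in range(g):
        for k in range(l + 1):
            add_edge((a, k), (a, k + 1), slot_gap)      # along the aisle
        if a + 1 < g:
            add_edge((a, 0), (a + 1, 0), aisle_gap)     # front cross aisle
            add_edge((a, l + 1), (a + 1, l + 1), aisle_gap)  # rear cross aisle
    return graph


warehouse = build_single_block_warehouse(g=3, l=4)
print(len(warehouse), warehouse[(0, 0)])
```

Because EE-OPR only needs an adjacency structure with distances, a different layout, or one-way aisles, would only change how this dictionary is filled.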

B. COMPUTATIONAL RESULTS
In this section, we explain how the simulations are performed. The implementation runs on an Intel Xeon W-2102 computer with 128 GB of DDR4 RAM, using Python as the programming language. The effectiveness of EE-OPR is evaluated against a method that we call the Travel Time Minimizing approach (TTM). TTM, proposed by Letchford et al. [27], aims only to minimize the travel time of picking tours. It is based on a compact single-commodity flow formulation. TTM is also used by Pansart et al. [12] for solving a mixed-integer linear program. It uses only the time criterion $t_{ij}$ in its objective function ($z^* = \min \sum_{(i,j) \in A} t_{ij} x_{ij}$). This technique is used to generate the shortest picking tour for each instance. Moreover, in order to compare it with EE-OPR, the costs of the arcs of the tour produced by each method are summed using the $c_{ij}$ function (5), giving the cost in dollars of executing each path (since TTM does not use $c_{ij}$ as its cost function, the cost in dollars of the TTM tours is calculated with $c_{ij}$ after running TTM). It is important to mention that TTM cannot integrate the energy consumption, since $e_{ij}$ is a function of $M^L_i$, which is not defined in the TTM formulation. On the other hand, TTM is representative of all exact algorithms that optimize time or distance without considering the energy aspect, through a dynamic programming formulation that is not suited to additional constraints [12]. More specifically, to evaluate the performance of our approach, we consider random demand scenarios with uniform demand distribution throughout the warehouse and random storage policies. This assessment is carried out by varying:
• the size of the pick-list,
• the required pick locations in the warehouse (pick-list),
• the masses of the items to be picked up at the different positions, and
• the shape of the warehouse.
As shown in Figure 7, the simulations are generated for four different configurations: Layout 1 = (40 × 50), Layout 2 = (25 × 80), Layout 3 = (20 × 100), and Layout 4 = (10 × 200). The distances between the aisles and between the pick locations are 4 meters and 1 meter, respectively, for all the setups. The choice of the warehouse shape is inspired by [31]. Each structure is explored under five scenarios with different pick-list sizes, namely 8, 10, 12, 14, and 16. For every scenario, 100 picking tours are simulated with arbitrary locations and masses in order to provide a reliable statistical analysis. For each tour, a pick-list of cases is chosen at random using a uniform distribution. In practice, cases can represent batches that group several orders in a pick-list, to be separated at the packing station [31]. Cases in a pick-list have various masses that are also generated at random between 10 kg and $M_{max}$ kg (the maximum vehicle capacity). The sum of the masses of the pick-list items should not exceed the maximum vehicle capacity, i.e., $\sum_{j} m_{s,i,j} \le M_{max}$, where $s$, $i$, $j$ represent the scenario, the instance (picking tour), and the pick-up location, respectively. The simulation performed in the present work is carried out for an AGV with a maximum speed of 1.2 m/s, a mass of 1600 kg, and a maximum supported load of 1200 kg.
Figure 8 presents the results of the comparative study of our approach and the TTM method, based on the mean and variance of the cost of each case. As shown in Figure 8, EE-OPR outperforms TTM in all twenty settings (four layouts, each with five different scenarios). In particular, the advantage of EE-OPR over TTM grows towards the fifth scenario for all layouts. In other words, the cost of EE-OPR increases more slowly as the size of the pick-list increases: Figure 8 shows a growing difference between the mean tour costs of the two methods across the scenarios of each layout.
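One way to generate instances that satisfy the capacity constraint is sketched below. The paper does not state how the constraint is enforced, so the scaling scheme used here (a minimum of 10 kg per case plus a random share of the remaining capacity) is purely an assumption, as are the function name and the seed.

```python
import random


def generate_pick_list(locations, p, m_min=10.0, m_max=1200.0, rng=None):
    """Draw p distinct pick locations uniformly and assign masses such that
    every mass is >= m_min and the total does not exceed m_max."""
    rng = rng or random.Random()
    spare = m_max - p * m_min          # capacity left above the minimum masses
    if spare < 0:
        raise ValueError("p items of m_min kg already exceed the capacity")
    picks = rng.sample(locations, p)   # uniform, without replacement
    weights = [rng.random() for _ in picks]
    total = sum(weights) or 1.0        # guard against an all-zero draw
    scale = rng.random() * spare / total
    return {v: m_min + w * scale for v, w in zip(picks, weights)}


rng = random.Random(0)                 # fixed seed for reproducibility
locations = list(range(2000))          # hypothetical location identifiers
instance = generate_pick_list(locations, 16, rng=rng)
print(len(instance), round(sum(instance.values()), 1))
```

Drawing 100 such instances per scenario and layout would reproduce the structure of the experimental design, although not necessarily the exact mass distribution used by the authors.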
This result stems from the ability of EE-OPR to minimize the energy, which tends to increase for larger pick-lists with a higher possibility of mass accumulation. Moreover, we observe a lower variance of the routing cost with EE-OPR than with TTM, especially for scenarios 4 and 5 (with a higher number of items). Thus, EE-OPR is less sensitive to variations in pick-list location and mass. Hence, the results show that minimizing energy and time simultaneously can effectively decrease the cost of a picking tour.
We also notice that there are situations where TTM and our approach lead to exactly the same result. This is logical, since our focus is on the effect of mass on energy consumption and routing decisions. If the mass variation of the items is low or zero, minimizing the energy consumed amounts to minimizing the distance covered. Likewise, if no position needs to be passed twice and the TTM tour happens to be traversed in the same direction, our approach offers no additional advantage.
To better illustrate how our approach outperforms TTM, an order picking problem on a simple graph is used as an example (Figure 9). The graph is composed of 20 vertices, of which 4 are required. These required vertices are the pick-list vertices and are shown as red circles in the graphs. The required vertices are 18, 19, 15, and 16, with masses of 100, 10, 70, and 20, respectively.
To solve this picking problem, TTM (Figure 9a) and EE-OPR (Figure 9b) are applied. The resulting tours of each method are plotted in the figure using red arrows. Note that the segments constituting the paths obtained by the two approaches are exactly the same. In other words, the resulting paths are identical in terms of distance traveled and shape. However, the two approaches differ in the direction of travel and in the order in which the items are picked at the required positions.
Let $v_i$ be the position of vertex $i$ and let $v_i \leadsto v_j$ be the shortest path between two vertices $v_i$ and $v_j$. We assume that an AGV starts from a position $v_0$, picks the loads at the required positions (red circles) and then goes back to $v_0$. Comparing the two resulting tours shows that, in contrast to EE-OPR, TTM drags unnecessary masses for longer because of its path direction. Furthermore, EE-OPR allows the vehicle to pass through a required position without the obligation of picking up the item(s) on the first pass. For instance, unlike TTM, EE-OPR passes through vertex 18 without collecting the load (so as not to drag it). Afterwards, it moves to vertex 19 to collect its corresponding load and then returns to vertex 18 to collect its load. In other words, the items located at vertex 18 are not picked on the first pass but on the second. It is also observed, in this example, that the path resulting from EE-OPR differs from that given by TTM (in direction and in the time at which items are picked) but has the same length and duration (shortest path). Moreover, the path resulting from EE-OPR has the same energy demand as that given by ECM (the lowest energy consumption). It is important to note that this situation (EE-OPR selecting a path that is both the most energy efficient and the shortest) is very common, especially when the size of the pick-list is small.
Consequently, applying EE-OPR leads to a 25% reduction in the total cost in dollars of the picking tour compared to TTM. This gain can be more significant when the shortest path between the vertices $v_{18}$ and $v_{19}$ spans kilometres (in large-area warehouses).
Hence, EE-OPR minimizes the energy loss associated with moving heavy objects. Whenever possible, it delays the pickup of an item in order to reduce the distance covered with this item, and therefore reduces the cumulative load carried on the subsequent segments, while respecting the travel time constraint.
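The effect of delaying a pickup can be checked with simple arithmetic on the load-distance product, which is proportional to the rolling-resistance energy spent on the cargo. The corridor geometry below (depot, then vertex 18 after 5 m, then vertex 19 after a further 4 m) is hypothetical and is not the geometry of Figure 9; only the masses of 100 and 10 are taken from the example.

```python
def load_distance(legs):
    """Sum of distance * carried load over the legs of a tour (kg.m)."""
    return sum(d * m for d, m in legs)


def tour_length(legs):
    """Total travelled distance of a tour (m)."""
    return sum(d for d, _ in legs)


# Each leg is (distance in m, load carried on that leg in kg).
# Tour 1: pick 100 at v18 on the first pass, then 10 at v19, then return.
pick_on_first_pass = [(5, 0), (4, 100), (4, 110), (5, 110)]
# Tour 2: pass v18, pick 10 at v19, pick 100 at v18 on the way back.
pick_on_return = [(5, 0), (4, 0), (4, 10), (5, 110)]

print(tour_length(pick_on_first_pass), load_distance(pick_on_first_pass))
print(tour_length(pick_on_return), load_distance(pick_on_return))
```

Both tours cover 18 m, but the second one moves 590 kg.m of cargo instead of 1390 kg.m, so the travel time is unchanged while the cargo-related energy drops.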

V. DISCUSSION
Applying our approach led to an average saving of 18% in the total picking-tour cost. However, the number of pick locations to visit in a single picking tour is assumed to be less than 18 because of computational limits. This implies that the number of stops can be up to 18 in a picking tour, while the quantity of required items at each location can vary according to order batching. Order batching is a technique for grouping a set of orders into batches [64]. We leave relaxing the limit of 18 stops per tour for future research.
However, this limitation is irrelevant for warehouses of heavy and bulky items. In such warehouses, items like large consumer electronics, carpets, or any other heavy items cannot be carried by the picker all the way back to the depot. Consequently, the size of the pick-list in these warehouses is often small, which makes the limitation of our approach on the number of stops per picking tour irrelevant.
It is important to note here that we assume that a single vehicle is used in the case study, which can be updated to accommodate multiple order pickers. We also assume that there are no obstacles on the travel path, thus enabling uninterrupted travel.
Note that the amount of the cost reduction achieved by our approach depends on many factors, such as the storage assignment policy, the type and the layout of the warehouse, the type and the weight of products, the size of the AGV, etc.
It is also important to mention that the routing, storage strategy, batching, zoning, and order release mode are components of the policy level which is highly dependent on the strategic level. In other words, the strategic level represents the system characteristics such as command cycle, mechanization level, warehouse dimensionality, and information availability [13].
Therefore, the efficiency of our routing algorithm depends on these characteristics. On the other hand, the computational efficiency of our algorithm depends on the locations to be visited (i.e., whether these locations are close, far or scattered), which affects the speed of finding the target state.
Moreover, our approach can be used for different warehouse layouts with arbitrary starting and ending points of a tour. It suffices to define a Steiner graph containing the possible passage segments (arcs and vertices), the starting point ($v_s$), the ending point ($v_t$), and the required points (pick-list). The starting state is then $(v_s, a_s)$ and the target state is $(v_t, a_t)$, where $a_s$ and $a_t$ are the 'empty arrangement' and the 'full arrangement', respectively.
On the other hand, given that the definition of the picking problem in this study allows bidirectional movement of the AGVs in an aisle, EE-OPR is suitable for low and medium throughput DCs; at higher throughput, congestion may arise in the aisles. If necessary, unidirectional movement can be imposed by simply changing the graph representation of the warehouse.

VI. CONCLUSION
Warehouses might represent a real threat to the environment as they might contribute to the rise of greenhouse gas emissions in supply chains. Consequently, many recent research works have been conducted to encourage the deployment of green and sustainable warehousing.
As typical material handling equipment in modern warehouses and distribution centers, PS-AGVs account for a major part of the warehouse's total energy waste. However, current research related to order picking routing focuses mainly on travel distance and time optimization, while the energy aspect is rarely considered. Yet in many situations, energy is as important as time, especially during a period of low demand. An effective way to reduce the consumption of AGVs is to improve their operational efficiency and routing. In this study, an eco-energetic routing approach for a PS-AGV was established. The approach, called EE-OPR, allows the robot to start from its depot, collect the items of orders from different locations in the storage area, and transport them to a defined position (the depot, a packing station, or another defined target) while minimizing travel time and energy consumption simultaneously. We focus specifically on the effect of the transported mass on the choice of the order picking route, in order to improve the AGV's energy efficiency without impacting the operating time. Moreover, unlike the RR-based methods widely used in the literature, which rely mainly on the rectangular configuration of the warehouse (possible movements in a parallel-aisle warehouse) to create their dynamic program, EE-OPR is suitable for any type of warehouse layout. It is based on the dynamic creation of a state graph that takes into account the energy demands of the vehicle and the weight of the transported cargo.
The solution to this problem is a path with the minimum cost that starts from the initial position with zero load and ends at the final position with all requested loads. First, an exploration phase creates states and transition arcs while passing through the spatial graph representing the warehouse. These states incorporate (a) the position of the AGV in the warehouse (vertex of location) and (b) the mass of the loads transported at each step, with details of the items already picked. Transition arcs are defined according to the possibility of picking up a neighboring vertex's item and are added based on the original graph of the warehouse. These arcs are weighted by a cost function based on the previous state and on the AGV's energy consumption model, in which the transported mass is one of the main factors. In parallel, a graph search phase solves the shortest path problem using the Bellman principle.
Results obtained through different simulations indicate that EE-OPR outperforms the approach based only on the minimization of travel time in all tested settings, with an average gain of 18%. In the medium and long term, these energy savings can translate into lower electricity costs, longer AGV operating time, and reduced battery maintenance or replacement costs. The choice of storage strategy and batching strategy certainly affects the gain that EE-OPR can provide compared to TTM. This effect can be studied in future research.