Data-Driven Simulation Modeling of the Checkout Process in Supermarkets: Insights for Decision Support in Retail Operations

We build a realistic agent-based model for simulating customer decisions of picking a checkout line at a supermarket that is calibrated to actual point of sale (POS) data from a major European retail chain. It is implemented on the open-access NetLogo simulation platform and is freely available to academics and practitioners interested in testing how different checkout zone layouts, as well as queue management and feedback strategies impact the overall efficiency of the checkout process. In particular, we show that when customers pick a line by minimizing the expected waiting time, not only is this choice beneficial for the customers themselves, as it leads to shorter waiting times in queues, but also for the supermarket management, since it yields shorter working times of the cashiers. As such, we provide guidance as to the feedback that could be provided to customers entering the checkout zone.


I. INTRODUCTION
Why do customers have to wait in queues in supermarkets?
The most general answer is that there are more customers than available points of sale (POS). A shortage of cashiers at a certain time and space constraints, which limit the number of checkouts that can be installed, are the most likely reasons. As the number of customers fluctuates throughout the day and their arrival to the checkout zone is not stable, managers have to balance between a shortage of available cashiers and the reduction of costly idle times.
The traditional approach to queuing at checkout is described as trying to answer the question ''how long must a customer wait'' [1]. The focus is on minimizing the time that a customer spends in a queue, and this time is the main measure of customer satisfaction. Other studies emphasize the importance of customers' subjective perception of the waiting time. The queuing environment, social justice measured by adherence to or violation of the first in first out (FIFO) rule, and feedback provided to customers in queues are the typical objects of interest [2]-[4].
The associate editor coordinating the review of this manuscript and approving it for publication was Yichuan Jiang.
In various areas of commerce and services, innovations have been introduced to reduce customer discomfort when queuing. Apart from rules of queue organization (single vs. multi lines) and different types of devices for registration of goods and payments (service vs. self-service checkouts), dedicated queue management systems can be used. Numerous solutions are offered, but those applicable in typical grocery supermarkets mainly address two aspects [5]-[8]:
• Forecasting the demand for cashiers to let managers optimize their operational decisions. Popular systems use detectors or video content analysis to count customers inside the store and predict the demand for checkouts using statistical or artificial intelligence methods.
• Providing feedback through checkout/customer information systems in the form of voice or visual messages about the current or future state of the checkout zone. The objective is to obtain a psychological effect (customers usually ''feel better'' about queuing when provided with an estimate of the waiting time) and/or a more efficient flow of customers.
In this paper we focus on the latter aspect. To this end, we build a realistic agent-based model (ABM) for simulating customer decisions of picking a checkout line at a supermarket that is flexible enough to allow for testing the impact of different checkout zone layouts, as well as queue management and feedback strategies, on the overall efficiency of the checkout process. The model is inspired by a checkout zone optimization problem faced by many grocery stores worldwide [9] and calibrated to actual POS data [10] from three grocery supermarkets located in a large city in Southern Poland.
The paper's contribution is threefold. Firstly, we extend the relatively sparse literature on checkout operations, which have been recently identified by Mou et al. [11] as one of the seven main operational decisions pertinent to store management. Interestingly, the authors argue that this topic (particularly the optimal configuration of the checkout zone and associated staffing decisions) will require more attention in the near future.
Secondly, by considering four different strategies of picking lines, we provide guidance as to the feedback that could be provided to customers entering the checkout zone. Observations of queuing behavior suggest that customers focus mostly on the length of the queue, but do not adjust enough for the speed at which the line moves [12]. If valuable feedback were provided to the customers via checkout/customer information systems, e.g., on the expected time spent in each queue, it could substantially increase the efficiency of the checkout process.
Thirdly, like in [13], [14], our model is implemented on the open-source NetLogo simulation platform [15], and hence can be used without restrictions by practitioners and academics alike. Furthermore, it allows for easy-to-implement changes in the checkout zone, e.g., for 5 to 20 service and 4 to 6 self-service checkouts, as well as in the parameters of the stochastic models defining customer arrival, basket sizes and cashier work schedules. Given that our model is calibrated to actual POS data [10], it provides a more realistic test ground for studying various layout configurations and associated staffing decisions than typical simulation models.
The remainder of the paper is structured as follows. In Section II we review the literature on the use of simulations for analyzing checkout operations. Next, in Section III we introduce our agent-based model, then present the simulation scheme and the considered types of feedback. In Section IV we discuss the obtained results and provide insights as to the impact of individual customer decisions of picking lines on the overall efficiency of the checkout process. Finally, in Section V we wrap up the results and conclude.

II. LITERATURE ON SIMULATING CHECKOUT OPERATIONS
In one of the earlier studies on checkout operations, Williams et al. [16] used discrete event simulation (DEVS) in SIMUL8 to check how different values of 'queue trigger length', i.e., the threshold of congestion warranting the opening and closing of cash-register lanes, influenced the waiting time and staff effort for a particular shop in the US. Alvarado and Pulido [17] simulated different combinations of cashier baggers in selected Colombian supermarkets in Promodel, with the goal of creating a framework that allowed for a better scheduling of the staff. Miwa and Takakuwa [18] used POS data to simulate customer flow in a convenience store in ARENA; they argued that the obtained results can be used for in-store merchandising. Zhang [19] implemented an ABM in NetLogo with the objective of using it to reduce customer waiting time and operation costs of supermarkets. Yamane et al. [20] used ABM to support decisions regarding the number and location of scanning and payment stations. They concluded that the spatial distribution of POS units was crucial when ''collisions'' between clients moving to and between the stations were taken into account.
More recently, Rossetti and Pham [21] built a DEVS model in Java based on a case study of a retail customer checkout area. Then, they extended it in two directions: the first examined customers' criteria when picking a checkout line at a supermarket, the second examined the checkout layout, in which the payment was separated from the checkout station. The results showed no significant difference in checkout time based on the line choice criteria. However, the average waiting time dropped significantly when payment was separated from the checkout area. Yang and Takakuwa [22] considered DEVS in Simio to simulate the checkout process in a retail store under various customer arrival conditions and queuing modes. In particular, they were interested in the results of the staff switching policy (i.e., rules) in the store. Kwak [23] investigated the effect of having express checkout lines in retail stores by comparing the waiting time and the queue length of two scenarios: universal checkout lines only and separated checkout lines with express counters. Utilizing DEVS in ARENA, they found that the average waiting time of the separated checkout lines was not necessarily shorter than that of universal lines. Mou and Robb [24] used ARENA to investigate Real-Time Labour Allocation (RTLA), i.e., allocating an employee working on shelf replenishment to open a checkout or vice versa. They reported a 6.6% increase in market share compared to a store without RTLA. Finally, Doniec et al. [25] built an ABM in Java to mimic customer activities in a supermarket. In their approach, an agent planned a trajectory and chose the checkout according to two criteria: the distance to reach the checkout line and the number of agents already waiting in this line. The authors argued that their model could be used to study customer flows, identify critical areas of congestion and optimize placement of new products.
The above literature review suggests that two types of simulation techniques, ABM and DEVS, are used for studying checkout operations in retail stores. In conventional DEVS, the state is changed only when an event occurs and the passage of time does not have a direct impact on the evolution of the system [26]. On the other hand, ABM does not have this limitation. In fact, this technique is the preferred tool for analyzing human behavior in service systems [27]. ABM provides an efficient way of dealing with the complexity encountered in real-world systems [28]. Its main advantage is that it does not rely on certain imposed global control mechanisms, but lets the system behavior at the macro scale emerge by emulating behavior and interactions at the micro, inter-agent scale [15], [29]. We follow this path and build a realistic ABM test ground for checkout operations that can be used to validate business support decisions in retail operations.

III. THE MODEL
We consider a queuing model inspired by a checkout zone optimization problem nowadays faced by many retail stores [9]. Based on supermarket layouts of a major European retail chain, we assume that the checkout zone contains 5 to 20 service and 4 to 6 self-service checkouts. Once a customer reaches the checkout zone, the customer has to pick the waiting line assigned to a particular checkout. Furthermore, we assume that each service checkout has a separate waiting line, while all self-service checkouts have one common line. We define the following agents:
• customers, who fill baskets with items (i.e., goods, articles) and pick waiting lines following some rules (e.g., by minimizing the number of customers in line, see Section III-E),
• cashiers, who staff service checkouts according to a prespecified work schedule,
• service checkouts, which serve customers when staffed by cashiers,
• self-service checkouts, which ''serve'' customers.
The checkout process, illustrated in Figure 1, starts with the collection of articles and arrival to the checkout zone. Next, the customer picks a waiting line. By doing so the customer selects the type of checkout: service or self-service. Note that in our model each customer is equally likely to select a given checkout type. This is in line with the recent empirical study [30], which found no significant demographic differences between users and non-users of self-service checkouts among 778 respondents in Singapore.
The next phase includes waiting in line as long as the checkout is busy. Note that although jockeying (i.e., switching the line in an effort to reduce the waiting time), balking (i.e., not entering the waiting line) and reneging (i.e., leaving the line before being served) undoubtedly take place, our own observations and interviews with line workers suggest that they are so incidental that they do not significantly affect the queuing process. Hence, we do not model these practices. In the final step the items are scanned by the cashier (service checkout) or by the customer (self-service checkout) and the payment is made. This stage also includes bagging and idle time between serving the current and the next customer in line.

A. CUSTOMER ARRIVAL TO THE CHECKOUT ZONE
Simulating a realistic arrival process is not a trivial task. Several approaches have been considered in the literature. Williams et al. [16] constructed a dataset of customer inter-arrival times by examining videotapes and by direct observation. Additionally, service-completion data was used to check if the recorded inter-arrival times were reasonable and to provide guidance on the variability throughout the day and the week. They came to the conclusion that the inter-arrival times were readily characterized by exponential distributions whose means varied with the time-of-the-day and the day-of-the-week. Kwak [23] and Qiu and Zhang [31] also assumed exponentially distributed inter-arrival times (but with a constant intensity, i.e., a homogeneous Poisson process, HPP, see Chapter 9 in [32]). Alvarado and Pulido [17] defined four day-of-the-month scenarios (regular working days, Saturdays, holidays, payment days) and counted the number of arrivals per hour for each scenario; this yielded four 'percentage arrival distributions' that were used in their simulations. Miwa and Takakuwa [18] modeled the movement of customers in a store. Unlike us, they simulated customer arrivals to the store, not to the checkout zone. But like us, they used actual POS data. More precisely, they extracted arrival times (to the store) from the transaction logs and a regression estimate of the time spent in the store based on a time-study.
To build a model of customer dynamics in the checkout zone we extract information from actual POS logs collected in three grocery supermarkets located in a large city in Southern Poland, equipped with manned (service) and self-service checkouts. A sample dataset (from one of the three locations) is freely available for download, see [10] for a detailed description. Here we use a 14-day period from 1 to 14 February 2018, i.e., before the introduction of regulations banning shopping on some Sundays. A limitation of the POS logs is that they do not contain information about the arrival of customers to the checkout zone, only timestamps for the beginning (registration of the first item in the basket) and the end time of each transaction. Nevertheless, such data can help to identify the general flow of customers and its variability throughout the day and the week.
Firstly, we can assume that within a time window of, say, 60 minutes, the number of customers that arrive to the checkout zone is very close to the number of transactions that started in this window. This assumption is backed by real-life observations made by one of the authors, that the waiting time of a single customer in these supermarkets rarely exceeds 15 minutes and that, even in periods of heavy traffic, long queues are discharged relatively fast.
Secondly, based on the results of [16], we can assume that within short periods, i.e., measured in minutes, customer inter-arrival times to the checkout zone are exponentially distributed with a time-varying intensity or rate λ(t); mathematically this corresponds to the non-homogeneous Poisson process (NHPP). Our procedure to model λ(t) is as follows:
1) extract the number of transactions per minute from the POS logs → red dots in Figure 2,
2) for each hour-of-the-day compute the average number of transactions in a time window ±30 minutes around this hour → blue step function in Figure 2,
3) linearly interpolate between the hourly intensities to yield λ(t) → green curve in Figure 2.
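As an illustration, the three steps above can be sketched in Python as follows. This is a minimal sketch, not the NetLogo implementation: it assumes that the per-minute transaction counts of a single day are already available as a non-empty list, and the function names `hourly_rates` and `intensity`, as well as the truncation of the ±30-minute window at the edges of the data, are assumptions made here for illustration only.

```python
def hourly_rates(counts_per_minute, half_window=30):
    """Average number of transactions per minute in a +/- half_window
    window around each full hour (window truncated at the data edges)."""
    n = len(counts_per_minute)
    rates = []
    for hour_start in range(0, n + 1, 60):
        lo = max(0, hour_start - half_window)
        hi = min(n, hour_start + half_window)
        window = counts_per_minute[lo:hi]
        rates.append(sum(window) / len(window))
    return rates


def intensity(t_minutes, rates):
    """Linear interpolation between hourly rates; rates[k] applies at
    minute 60 * k. Assumes t_minutes >= 0."""
    pos = t_minutes / 60.0
    k = int(pos)
    if k >= len(rates) - 1:
        return rates[-1]          # beyond the last knot: hold the last value
    frac = pos - k
    return (1 - frac) * rates[k] + frac * rates[k + 1]
```

For example, `intensity(30, hourly_rates(counts))` returns the interpolated rate (customers per minute) half an hour after the start of the data.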
Given a deterministic, but possibly seasonal, intensity function λ(t) and an upper bound $\bar{\lambda} \geq \lambda(t)$ for all t, the NHPP can be, and in our study is, simulated using the thinning algorithm [32].
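A minimal Python sketch of the thinning idea is given below for illustration. Candidate arrivals are generated from a homogeneous Poisson process with the bounding rate, and each candidate at time t is kept with probability λ(t) divided by that bound. The function name `simulate_nhpp` and its arguments are hypothetical; the sketch assumes that `intensity(t)` never exceeds `lam_max`.

```python
import random


def simulate_nhpp(intensity, lam_max, t_end, rng=None):
    """Arrival times on [0, t_end) of a non-homogeneous Poisson process,
    generated by thinning. Requires 0 <= intensity(t) <= lam_max."""
    rng = rng or random.Random()
    arrivals = []
    t = 0.0
    while True:
        # Candidate arrival from a homogeneous process with rate lam_max.
        t += rng.expovariate(lam_max)
        if t >= t_end:
            return arrivals
        # Keep the candidate with probability intensity(t) / lam_max.
        if rng.random() * lam_max < intensity(t):
            arrivals.append(t)
```

A tighter bound `lam_max` rejects fewer candidates and therefore makes the simulation faster, but any valid upper bound gives the same distribution of arrivals.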

B. BASKET SIZE
Like the customer arrival to the checkout zone, the basket size also has to be adequately simulated. Both parametric distributions, e.g., geometric [17] or lognormal [33], and POS data-driven approaches [18] have been considered in the literature.
We follow the latter approach and randomly sample the basket size from the available POS data for a given hour of the week; in our dataset the number of transactions rarely falls below 100 per hour. The basket sizes are the smallest on Monday mornings (ca. 5% of baskets exceed 20 items) and the largest on Saturday midday (depending on the store, ca. 18-30% of baskets exceed 20 items).
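The empirical resampling described above can be sketched as follows. This is an illustrative Python fragment under stated assumptions: the input format (tuples of weekday, hour and basket size) and the names `build_basket_pools` and `sample_basket_size` are hypothetical and not taken from the POS dataset [10].

```python
import random
from collections import defaultdict


def build_basket_pools(transactions):
    """Group observed basket sizes by hour of the week (0..167).
    transactions: iterable of (weekday 0-6, hour 0-23, basket_size)."""
    pools = defaultdict(list)
    for weekday, hour, size in transactions:
        pools[weekday * 24 + hour].append(size)
    return pools


def sample_basket_size(pools, weekday, hour, rng=random):
    """Draw one basket size uniformly from those observed in that hour
    of the week; raises IndexError if no transactions were observed."""
    return rng.choice(pools[weekday * 24 + hour])
```

Sampling directly from the observed values preserves the hour-specific shape of the basket size distribution without imposing a parametric form.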

C. SERVICE TIME
The checkout service generally consists of three separate activities: scanning or registration of articles, payment and bagging. However, due to data limitations or for the sake of simplicity, some authors do not distinguish particular steps and treat service as a single step that covers all three actions. Regarding simulations, supermarket models typically assume exponential service times [34], but other distributions, like triangular [35], lognormal [36] or phase-type (PH) [37], have been considered as well. If we want to be more precise, we can decompose the service time into two factors that drive it: work amount and cashier speed. The latter usually does not differ that much between the cashiers and is often ignored in simulations. However, the work amount largely depends on the basket size and some authors postulate a functional form for this relationship. For instance, Miwa and Takakuwa [18] use POS data to fit two linear functions of basket size to determine registration and bagging times. We follow the latter data-driven approach, but do not limit ourselves to linear dependence. The POS logs we have access to include timestamps for the beginning (registration of the first item in the basket; denoted by BeginDateTime) and the ending time of each transaction (denoted by EndDateTime). However, the latter may not exactly be the time when the operation is terminated, as it does not cover the activity of giving back the change to the customer (for a manned checkout). On the other hand, the times between transactions or 'break times' retrieved from POS logs include the idle times between consecutive operations, which actually are not part of the service activity, see Figure 2 in [10]. Nevertheless, given that idle times are very rare during peak hours, we can essentially eliminate their impact by analyzing only periods of high activity.
Namely, we select the 20 busiest hours (basically Thursday and Friday mornings and Saturday midday) characterized by the highest number of transactions per minute within the studied two-week period. Such a selection yields a sample of several thousand observations (transactions) per store. To further limit the impact of idle times, we exclude 'outliers', i.e., transactions with break times and transaction times that exceed 1.5-times the respective interquartile ranges; the latter two are computed for the n-th transaction as:
$$\mathrm{TransactionTime}(n) = \mathrm{EndDateTime}(n) - \mathrm{BeginDateTime}(n), \tag{1}$$
$$\mathrm{BreakTime}(n) = \mathrm{BeginDateTime}(n+1) - \mathrm{EndDateTime}(n). \tag{2}$$
Visual analyses of transaction time vs. basket size scatter-plots exhibit a similar pattern for both service and self-service checkouts: the larger the basket, the longer the transaction time, see Fig. 3. However, the relationship deviates from a linear fit and is better described by a power regression model of the form:
$$\mathrm{TransactionTime}(n) = \exp\bigl(a \log \mathrm{BasketSize}(n) + b\bigr), \tag{3}$$
where BasketSize(n) is the number of articles in the n-th basket, with a = 0.6984 and b = 2.1219 for service (R² = 0.71; top left panels in Fig. 3) and a = 0.6725 and b = 3.1223 for self-service checkouts (R² = 0.64; top right panels).
On the other hand, the break time vs. basket size scatter-plots do not yield such a clear-cut picture. For self-service checkouts, the power regression model yields an acceptable fit:
$$\mathrm{BreakTime}(n) = \exp\bigl(a \log \mathrm{BasketSize}(n) + b\bigr), \tag{4}$$
with a = 0.2251 and b = 3.5167 (R² = 0.14; bottom right panels in Fig. 3), but for service POSs no clear pattern is visible (R² < 0.01; bottom left panels). The longer break times for self-service checkouts likely result from the fact that after scanning each item has to be put on a scale which controls product weight, and all items are packed into bags only after paying. This significantly lengthens bagging compared to a service checkout. The longer transaction times, on the other hand, may be due to less efficient scanning of items by customers than by the cashiers.
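For illustration, a power regression of transaction (or break) time on basket size can be fitted by ordinary least squares on log-transformed data. The Python sketch below assumes this log-log formulation, with a the slope and b the intercept on the logarithmic scale; the function name `fit_power_regression` is hypothetical, and the sketch is not necessarily the estimation procedure used to obtain the coefficients reported above.

```python
import math


def fit_power_regression(basket_sizes, times):
    """Least-squares fit of log(time) = a * log(basket_size) + b.
    All basket sizes and times must be strictly positive."""
    xs = [math.log(s) for s in basket_sizes]
    ys = [math.log(t) for t in times]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx                 # slope: exponent of the power law
    b = mean_y - a * mean_x       # intercept on the log scale
    return a, b
```

Under this formulation, exp(b) is the predicted time for a one-item basket, and a value of a below 1 means that the time per item decreases as the basket grows.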
In our simulations the service time is the sum of the transaction and break times. The transaction times are generated according to Eqn. (3) with residuals randomly sampled from the respective distribution (see the histograms in Fig. 3). Similarly, the break times for self-service checkouts are generated according to Eqn. (4) with residuals randomly sampled from the respective distribution, while for service checkouts they are simply randomly sampled from BreakTime(n) for n spanning all transactions.
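A minimal Python sketch of this generation step for a service checkout is shown below. It rests on two assumptions that are not stated in the text: the regression is expressed on the logarithmic scale, and the resampled residual is added on that scale before exponentiating. The function names and argument layout are hypothetical.

```python
import math
import random


def transaction_time(basket_size, a, b, residuals, rng=random):
    """Power-regression prediction with a resampled residual
    (assumption: residuals live on the log scale)."""
    return math.exp(a * math.log(basket_size) + b + rng.choice(residuals))


def service_time(basket_size, a, b, residuals, break_times, rng=random):
    """Service time = transaction time + break time, where the break time
    is resampled from the observed break times (service checkout case)."""
    return (transaction_time(basket_size, a, b, residuals, rng)
            + rng.choice(break_times))
```

For a self-service checkout, the break time would instead be generated with a second call of the same regression form using the break-time coefficients and residuals.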

D. AVAILABILITY OF CASHIERS
In the simulations, the number of open POSs plays a crucial role. The number of active cashiers is not constant over time. Their arrival to the checkout zone is determined by a work schedule planned by a manager. However, most of the employees work not only as cashiers but also perform other tasks in the store, and are called to the checkout zone when queues appear. Usually it takes about one minute after a call to move from the back office to the checkout zone. On the other hand, when there are no customers waiting to be served, a cashier is obliged to close the POS and go to the back office. Closing the checkout means that no new customer can enter a particular line. However, a cashier cannot leave the workplace before serving all customers waiting in line. In Figure 4 we present the activities of cashiers that are relevant for the simulated process. To simulate the availability of cashiers at a given moment, it is necessary to know the work schedule. Unfortunately, due to the multitasking of employees in the analyzed stores, the historical attendance list cannot be used. The latter contains information for all employees, also those who did other tasks and were not available for checkout operations. However, the analyzed POS data contains information about the exact log-in and log-off times of all cashiers. Based on these logs, the actual cashier availability can be extracted using the following steps:
1) determine cashier availability $a_{d,h,m}$ for each day (d), hour (h) and minute (m) using log-in and log-off times,
2) calculate the available number of cashiers for each day and hour: $d_j = \max_m a_{d,h,m}$, where j = 1, 2, ..., n and n is the number of hours in the analyzed period,
3) find the optimal work schedule that meets requirement $d_j$, using linear programming for a work scheduling problem [38, chap. 2.4].
To simplify the computations we assume that shifts of all employees are equal and last four hours.
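Steps 1) and 2) above can be illustrated with the following Python sketch. It assumes that each cashier session is given as a pair of log-in and log-off minutes counted from the start of the analyzed period (log-off exclusive); this input format and the function name `hourly_demand` are assumptions made for illustration, not the structure of the actual POS logs.

```python
def hourly_demand(sessions, n_hours):
    """Return d[j], the maximum number of cashiers logged in during
    any minute of hour j, for j = 0, ..., n_hours - 1.
    sessions: list of (login_minute, logoff_minute) pairs."""
    per_minute = [0] * (n_hours * 60)
    for login, logoff in sessions:
        # Count the cashier as available in every minute of the session.
        for m in range(max(0, login), min(logoff, len(per_minute))):
            per_minute[m] += 1
    return [max(per_minute[60 * j:60 * (j + 1)]) for j in range(n_hours)]
```

For example, `hourly_demand([(0, 90), (30, 70), (100, 130)], 3)` gives `[2, 2, 1]`, since two cashiers overlap in the first and second hour and only one is logged in during the third.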
If we denote by $x_j$ the number of employees that start working in hour j, the problem of finding the optimal work schedule can be formulated as follows:
$$\min \sum_{j=1}^{n} x_j$$
subject to
$$\sum_{i=\max(1,\,j-3)}^{j} x_i \geq d_j, \qquad x_j \geq 0, \qquad j = 1, 2, \ldots, n.$$
To find the optimal solution of such a linear programming problem, we use the function lp() from the lpSolve package in R. In Figure 5 we plot the results for a sample store and day.
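For illustration only, the Python sketch below solves the same covering problem with a simple greedy rule instead of a linear programming solver: in each hour, just enough new shifts are started to cover the shortfall. For shifts of equal length this rule attains the minimum total number of shift starts, but it is a swapped-in technique, not the lpSolve-based procedure used in the study. The function name `schedule_shifts` is hypothetical.

```python
def schedule_shifts(demand, shift_len=4):
    """Return x, where x[j] is the number of employees starting in hour j,
    such that the staff on duty covers demand[j] in every hour."""
    x = [0] * len(demand)
    on_duty = 0
    for j, d in enumerate(demand):
        if j >= shift_len:
            on_duty -= x[j - shift_len]   # shifts started shift_len hours ago end
        if on_duty < d:
            x[j] = d - on_duty            # start only as many shifts as needed
            on_duty = d
    return x
```

For the hourly requirements `[2, 2, 1, 3, 4, 1]`, the sketch returns `[2, 0, 0, 1, 3, 0]`, i.e., six four-hour shifts in total.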

E. LINE PICKING SCENARIOS
When it comes to picking the checkout line, there is no consensus in the literature on how individuals actually do it. For instance, Kwak [23] assumed that customers always joined the shortest line. When considering express checkouts, Alvarado and Pulido [17] introduced an artificial index based on queue length (weight 60%) and the total number of items in line (40%). If the number of items in the basket was lower than a certain threshold, the customer would choose an express checkout if it was free. If not, the choice was made based on the value of this index. Rossetti and Pham [21] made a distinction between 'regular' and 'rush' customers, and studied two scenarios of picking lines: one based on the number of customers in line and one based on the number of items in baskets.
Since POS data cannot help in this case, we have to make some assumptions. Although we are aware that not all customer decisions have to be rational, in what follows, we consider line-picking scenarios where customers make rational choices, dependent on the available information sets. In the worst case, customers have no information about current lengths of queues and pick the lines randomly (scenario #0). In the best case they have at their disposal very accurate estimates of the waiting times in each of the lines (scenario #4). Scenarios #1-#3 represent various intermediate states of incomplete information; the higher the scenario number, the more complete the information set is:
#0 The line is picked randomly (uniform distribution).
#1 The line with the lowest number of customers is picked.
If there is more than one line satisfying any of the above conditions, e.g., two lines have the lowest number of customers, then the choice among them is made randomly, using a uniform distribution. Note that while each service checkout has a dedicated line, all self-service checkouts have one common waiting line.
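The line-picking rule with uniform tie-breaking can be sketched in Python as follows. The sketch covers only scenario #0 (random choice), scenario #1 (fewest customers) and scenario #4 (lowest waiting-time estimate); the representation of a line as a dictionary with the keys `customers` and `expected_wait`, and the function name `pick_line`, are assumptions made for illustration.

```python
import random


def pick_line(lines, scenario, rng=random):
    """Return the index of the chosen line.
    lines: list of dicts with 'customers' (int) and 'expected_wait' (seconds).
    Only scenarios 0, 1 and 4 are handled in this sketch."""
    if scenario == 0:
        return rng.randrange(len(lines))              # uniform random choice
    key = {1: "customers", 4: "expected_wait"}[scenario]
    best = min(line[key] for line in lines)
    # Ties are broken uniformly at random among the best lines.
    candidates = [i for i, line in enumerate(lines) if line[key] == best]
    return rng.choice(candidates)
```

With two equally short lines, repeated calls for scenario #1 return either of them with equal probability, which mirrors the tie-breaking rule described above.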

IV. RESULTS
A. SIMULATION SETUP
The simulations are conducted for three stores of the same supermarket chain. In Table 1 we present the basic characteristics for each store in terms of size, turnover and the number of customers. Actual POS data for a period of 14 days is used to obtain realistic estimates of the parameters. The agent-based model is implemented in NetLogo [15]. The graphical user interface for a sample simulation run is visualized in Fig. 6. The central window with black background illustrates the state of the system in real time: open checkouts are in yellow, closed in red. The windows to the right are used to plot the evolution of selected characteristics.
In order to determine the number of simulation runs, we assessed variance stability [39]. Fifty pilot runs were performed for all five scenarios, one store and four selected hours of the week: 20:00-20:59 on Saturday, 11:00-11:59 on Wednesday, 14:00-14:59 on Sunday and 11:00-11:59 on Saturday; the latter were chosen to represent the three quartiles and the maximum of the distribution of the number of customers per hour. The minimum number of simulation runs, $n_{\min}$, was chosen according to the variance stability criterion [39], where $c_V^m = \sigma / \mu$ is the coefficient of variation for run m, i.e., the ratio of the standard deviation to the mean of the sample, and E is the limit of that metric. Setting E = 0.1 yielded $n_{\min}$ = 4, E = 0.075 yielded $n_{\min}$ = 5, while E = 0.05 yielded $n_{\min}$ = 20. In general, the stability of $c_V^m$ was good already for 4-5 runs. Based on this analysis, we have decided to perform 10 simulation runs for all stores and scenarios considered.
FIGURE 7. Kernel density estimates of the waiting time for customers who had to wait in a queue in store 1. The dashed lines indicate the mean values reported in Table 2 in the two rightmost columns; note that the lines for scenarios #1 and #2 overlap.

B. TIME SPENT QUEUING
In Table 2 we report the means and standard deviations of the time spent queuing in each of the three stores, for the five considered scenarios. The values were calculated separately for all customers (i.e., including those served without delay, as there were no other customers at the chosen checkout) and only those that had to wait in a queue. The values cannot be compared directly between the stores because they depend on customer arrival, checkout/cashier availability and basket size specific for that particular store. Instead they should be considered as providing an overview of the results that can be obtained for different stores of this supermarket chain. On the other hand, the scenarios can be compared for each store.
Clearly, the random scenario #0 yields the longest mean queuing time, independent of the store. This scenario is also characterized by the highest standard deviation of the results. The remaining strategies seem very similar when the mean queuing time is computed across all customers. However, when it is calculated only for the customers in queues, there is a clear decreasing tendency (also in variability). The latter indicates that the larger the information set, i.e., the more accurate the waiting time estimate for each line, the better the outcome of the customers' decision to pick a line.
Kernel density estimates (KDEs) of the waiting time for customers who had to wait in a queue in store 1 are plotted in Figure 7; KDEs for stores 2 and 3 are qualitatively identical. The dashed lines indicate the mean values reported in Table 2. The KDEs were computed and plotted using the geom_density() function from the ggplot2 package in R, with default parameters (e.g., Gaussian smoothing kernel, bandwidth calculated using the rule-of-thumb estimator). Apparently, the random scenario #0 yields the most heavy-tailed distribution with a mode at ca. 25 seconds. On the other hand, scenario #4, which provides the most feedback to the customers, yields the most symmetric distribution of the waiting times with a mode at ca. 70 seconds. It also exhibits the least heavy right tail: the number of customers waiting in a queue more than 180 seconds (i.e., 3 minutes) is lower than for the other four strategies. Hence, the ratio of customers with a positive shopping experience is likely to be higher in stores providing feedback that allows them to act as suggested by scenario #4.
FIGURE 8. The probability of waiting in a queue more than 5 minutes in store 1, for the five considered scenarios.
To better illustrate this, in Figure 8 we plot the probability of waiting in a queue more than 5 minutes in store 1, for the five considered scenarios. Clearly, the random scenario #0 is suboptimal, while scenario #4 yields the shortest waiting times. The same is observed for store 2. However, scenario #4 fails to outperform scenarios #1-#3 during a few extreme hours for store 3, see Figure 9. On Friday (Feb. 9) afternoon and Sunday (Feb. 11) afternoon the probabilities of waiting are equally high in all except the random scenario, which suggests a shortage of cashiers and long queues. Apparently, under such extreme conditions the line picking scenario does not play as important a role as other factors.

C. WORKING TIME AND CHANGEOVERS
The workload of cashiers is not only dependent on the adopted work schedule, which is the same for each scenario, but also on the current state of the queues. As discussed in Section III-D, the cashier leaves the checkout when the number of customers is small, and returns when queues appear. The time spent in the back office is usually used for performing other activities necessary for the operation of the store. Hence, an important measure of the effectiveness of a scenario is the sum of the time spent working at the cash register and the time of changeovers (walking from the checkout zone to the back office or back).
In Table 3 we report the time cashiers spent at the checkout, i.e., the effective working time, and the total number of changeovers in the whole simulated period (14 days); these values are averages across all simulation runs. Assuming that the time needed for the transition is about one minute, the total working time of the cashiers can be calculated. In terms of the effective working time, scenario #4 outperforms scenarios #1-#3 by ca. 2-7%, and the benchmark scenario #0 by ca. 53-73%. The differences are smaller for stores 1 and 3, and larger for the smaller store 2. While scenario #4 yields the highest number of changeovers, taking into account that the time needed for the transition to/from the back office is rather short, it outperforms all other scenarios also in terms of the total working time. These results indicate that scenario #4 is better not only from the customers' perspective (shorter waiting times in queues), but also from the manager's point of view (lower effective and total working times of the cashiers).
FIGURE 9. The probability of waiting in a queue more than 5 minutes in store 3, for the five considered scenarios. Note the anomalies on Friday (Feb. 9) afternoon and Sunday (Feb. 11) afternoon, suggesting a shortage of cashiers.

V. CONCLUSION
With the objective of obtaining a more efficient flow of customers in the checkout zone [9], we have built a realistic agent-based model (ABM) for simulating customer decisions of picking a checkout line at a supermarket. It is calibrated to actual point of sale (POS) data [10] from three grocery supermarkets located in a large city in Southern Poland and is flexible enough to allow for testing of different checkout zone layouts, as well as queue management and feedback strategies on the overall efficiency of the checkout process.
Our contribution is threefold. Firstly, we extend the relatively sparse literature on checkout operations, which have been recently identified as one of the seven main operational decisions pertinent to store management [11]. Secondly, by considering different strategies of picking lines, we offer guidance as to the feedback that could be provided to customers entering the checkout zone. In particular, we show that when customers receive an accurate estimate of the expected time spent in each queue and pick the line with the lowest time (i.e., scenario #4), the resulting dynamics are beneficial not only for the customers themselves (→ shorter waiting times in queues), but also for the supermarket management (→ lower effective and total working times of the cashiers). Thirdly, since our model is implemented in the open-source NetLogo simulation platform [15], it can be used without restrictions by practitioners and academics alike. Changes in the checkout zone layout, e.g., the number of service and self-service checkouts, as well as in the parameters of the stochastic models defining customer arrival, basket sizes and cashier work schedules are easy to implement. As such, our model provides a platform for testing various layout configurations and associated staffing decisions. However, our model can be further developed. For instance, switching the line in an effort to reduce the waiting time is not taken into account. Although our own observations and interviews with line workers suggest that jockeying is incidental and does not significantly affect the queuing process, in the COVID-19 times it could be seen as a factor increasing the infection rate. Studying such practices might lead to queuing policies that minimize virus spread in supermarkets.