Edge-Cloud Continuum Solutions for Urban Mobility Prediction and Planning

In recent years, there has been an increase in the use of edge-cloud continuum solutions to efficiently collect and analyze data generated by IoT devices. In this paper, we investigate to what extent these solutions can manage tasks related to urban mobility, by combining real-time and low latency analysis offered by the edge with large computing and storage resources provided by the cloud. Our proposal is organized into three parts. The first part focuses on defining three application scenarios in which geotagged data generated by IoT objects, such as taxis, cars, and smartphones, are collected and analyzed through machine learning-based algorithms (i.e., next location prediction, location-based advertising, and points of interest recommendation). The second part is dedicated to modeling an edge-cloud continuum architecture capable of managing a large number of IoT devices and executing machine learning algorithms to analyze the data they generate. The third part analyzes the experimental results in which different design choices were evaluated, such as the number of devices and orchestration policies, to improve the performance of machine learning algorithms in terms of processing time, network delay, task failure, and computational resource utilization. The results highlight the potential benefits of edge and cloud cooperation in the three application scenarios, demonstrating that it significantly improves resource utilization and reduces the task failure rate compared to other widely adopted architectures, such as edge- or cloud-only architectures.


I. INTRODUCTION
The rapid spread of Internet of Things (IoT) devices is generating huge volumes of data at the network edge [1]. Managing this data flow using highly centralized solutions, such as those based on cloud platforms, is extremely ineffective in terms of response time, network traffic management, power consumption, and scalability [2]. Uploading such huge volumes of data directly to the cloud leads to significant consumption of bandwidth and requires the use of high-power computing solutions to manage the resulting workload. Furthermore, in many application fields such as medicine and security, it is essential to offer low-latency and privacy-preserving services, as data transfer delay or malicious data manipulation can cause significant disservices and even loss of life [3].
The associate editor coordinating the review of this manuscript and approving it for publication was Nitin Gupta.
In recent years, researchers and IT companies have proposed the adoption of the edge computing paradigm for processing data closer to where they are generated [4]. In this way, the following advantages can be achieved: i) low latency, since the computation takes place close to the data source; ii) energy saving, as battery-limited devices can offload computing tasks to edge servers to reduce energy consumption; iii) privacy preservation, since data are not necessarily uploaded to the cloud, but are processed and analyzed locally; and iv) scalability, as a strongly decentralized and distributed approach makes it possible to manage increasing workloads efficiently. The benefits of edge computing can be complemented by those of the cloud, which makes it possible to aggregate large amounts of data persistently and perform compute-intensive analyses using scalable computational resources.
Because of these benefits, edge-cloud continuum solutions are increasingly being proposed for new frontier application scenarios such as smart cities, industrial IoT and smart healthcare [5], [6]. In particular, in the field of urban mobility, the process of collecting, integrating and analyzing data generated from many sources can greatly benefit from scalable architectures and proximity solutions [7]. For example, tasks like driver assistance, collision avoidance and traffic sign recognition, which require real-time analysis and low response times, can benefit from edge computing [8]. By contrast, tasks like diagnostic data collection and analysis, route calculations and targeted advertising, which require a lot of computational resources and access to large datasets, can benefit from the use of cloud computing.
In this paper we analyze how the compute continuum can be exploited to efficiently manage tasks related to urban mobility in large-scale computing environments. In particular, an edge-cloud continuum architecture is exploited to analyze geotagged data generated at the network edge by the movements of IoT objects such as taxis, cars, and smartphones. Once collected, these data can be analyzed through machine learning algorithms in real-time to provide solutions to different problems in our daily life. For example, i) for taxis, discovering the location to which they will have to move to more likely find new passengers; ii) for cars, delivering targeted advertising based on the positions and interests of drivers; and iii) for tourists, recommending new points of interest to visit based on what they like. The main contributions of this work are (i) the description of three application scenarios, in which geotagged data, generated during the movements of IoT objects (e.g., taxis, cars, smartphones), are collected and processed by machine learning algorithms (i.e., next location prediction, location-based advertising, and points of interest recommendation); (ii) a modeling part that defines an edge-cloud continuum architecture able to manage a large number of IoT devices and to efficiently execute machine learning algorithms to analyze the data they generate; (iii) an experimental part in which different design choices are evaluated (e.g., number of devices, type of task, orchestration policies) to improve the performance of machine learning algorithms in terms of processing time, network delay, task failure and computational resource utilization. By evaluating different application scenarios in a real-world environment (the city of Rome) and using settings derived from actual data, we provide a complete and advanced understanding of the benefits of edge-cloud architectures for urban mobility management.
The achieved results showed that the use of an edge-cloud continuum architecture, supported by efficient orchestration policies (e.g., network- or utilization-based), improves resource utilization and ensures a lower task failure rate in comparison to the traditional cloud- or edge-only configurations, where data are entirely processed at the cloud or the edge, respectively. Specifically, for all the considered application scenarios, the orchestration policies obtained a significant reduction in processing time (up to 87% compared to the edge-only configuration), a drastic reduction in the number of failed tasks (up to 40% compared to both cloud- and edge-only configurations), and a marked reduction in resource utilization (up to 38% compared to the edge- and cloud-only configurations).
The structure of the paper is as follows. Section II discusses related work and introduces the problem statement. Section III describes the proposed edge-cloud continuum architecture. Section IV presents three application scenarios as case studies and a performance evaluation by using two orchestration policies. Finally, Section V concludes the paper.

II. RELATED WORK
Urban computing is a research field that focuses on the study and development of systems and methods for supporting decision-making in urban environments using data generated in cities [9]. In particular, urban mobility is a sub-field of urban computing that refers to the mobility of people and vehicles within cities, including the challenges and opportunities associated with the planning, management and optimization of urban transport systems [10]. The analysis of large amounts of geotagged data generated by IoT devices installed on means of transport and road infrastructures can be used for many purposes, including traffic flow monitoring and transport route planning, decision-making to improve the quality of urban life and the provision of location-based services to citizens [11].
In this scenario, the edge-cloud compute continuum has emerged as a solution to process and analyze the data generated by IoT devices efficiently and in real-time [12], [13]. However, effective resource allocation and orchestration strategies are critical for maximizing the benefits of edge-cloud computing, and for this reason, researchers have focused on optimizing the placement of tasks and data in edge-cloud systems, considering factors such as performance, energy efficiency, cost and reliability [14], [15]. To this end, in the literature different techniques have been proposed that make use of supervised/unsupervised machine learning, deep learning and reinforcement learning [16].
Designing and testing large-scale and multi-layer edge-cloud architectures are still open issues, especially for architectures composed of several components based on different technologies and software stacks [17], [18]. Using a large number of hardware devices for prototyping could be very expensive, and setting up real-world experiments could be logistically challenging [19]. For these reasons, simulating edge-cloud continuum solutions is important because it allows testing and evaluating system performance to identify and resolve problems or limitations before the deployment in a real context [20]. In particular, the simulation of edge-cloud architectures allows evaluating many aspects including i) the scalability of the system and its ability to manage large amounts of data generated by IoT devices; ii) the latency of the system, i.e., the time between data collection and processing, ensuring that the system provides results in real-time; iii) the ability to exploit both edge and cloud resources efficiently, to optimize data processing and transmission and to ensure energy sustainability as well. The main issues of modeling IoT systems and how simulation approaches can assist the design and validation of edge-cloud architectures are discussed in different research papers [21], [22], [23].
In terms of tools and software solutions, different open-source simulators have been proposed in recent years to simulate IoT environments, such as iFogSim [24], IoT-Sim [25] and EdgeCloudSim [26]. Several research works have made use of simulators to test the behavior of specific IoT applications on edge-cloud architectures [27], [28], [29], [30]. Unlike these, our work analyzes how a large-scale edge-cloud architecture can be leveraged to efficiently manage urban mobility applications based on machine learning. Through three application scenarios, we show how the data generated by different IoT devices can be efficiently managed and processed using an edge-cloud continuum architecture.

A. APPLICATION SCENARIOS
Urban mobility data can be used in multiple ways to improve people's quality of life and make cities more efficient and sustainable. For example, they can be used for traffic monitoring, route planning and transportation management, among others. We have decided to focus our attention on three different use cases where the geotagged data generated by three different types of IoT objects (i.e., taxis, cars and smartphones) are analyzed through machine learning algorithms. Specifically, the following cases are considered: i) the location to which taxis will have to move to more likely find new passengers; ii) targeted advertising based on the positions and interests of car drivers; and iii) suggestion of the next points to visit based on tourist preferences and behaviors.
Geotagged data generated by taxis can be used to predict their next destination, reducing route costs and traffic congestion. Unlike other forms of public transportation, taxis do not have fixed routes and plan their routes after a passenger is dropped off [31]. GPS trackers in taxis allow for real-time monitoring of the vehicle's location and trajectory analysis can be used to predict where a taxi will move next, known as the next location prediction problem [32], which can be modeled as a short-term or long-term prediction task. There are several methods in the literature for this problem, such as frequent patterns and association rules [33], [34], or machine learning-based methods like clustering and Markov chain-based framework [35] or neural network-based models [36].
Geotagged data from vehicles can be leveraged for location-based advertising, which can provide car drivers with relevant products, services and offers based on their habits while they are on the road. Popular approaches include using GPS data [37] from the driver's in-car navigation system to deliver location-based ads and offers (e.g., a driver passing by a restaurant might receive a coupon for a discounted meal), as well as using contextual data [38] such as time of day and traffic conditions (e.g., a driver stuck in traffic might receive an ad for a nearby coffee shop).
Similarly, geotagged data generated by people during their movements can be used to provide insights for destination planning, service design and marketing [39]. To achieve this, data mining algorithms are used to discover frequent patterns in user trajectories across interesting locations frequently visited by users, commonly referred to as Points-of-Interest (PoIs) [40]. A common application is related to PoI recommendation to suggest places to visit based on a tourist's route collected from the smartphone during a trip for improving touristic services [41], [42]. Machine learning and data mining models have been used in previous work to solve this problem, such as using a bidirectional LSTM neural network [43] or sequential pattern analysis [44].

B. MACHINE LEARNING SOLUTIONS FOR LARGE URBAN AREAS
With the growth of urban areas and the number of IoT devices, there is an increasing need to develop machine learning algorithms that can scale efficiently on distributed architectures such as those of the edge-cloud continuum. Federated learning, through the hierarchical aggregation of learning models, has emerged as a promising paradigm for overcoming the limitations of traditional centralized approaches, such as those related to bandwidth, latency, and centralized data processing and storage. In federated learning, data are kept on local devices and only model updates are shared, ensuring greater scalability and efficiency but also privacy preservation of sensitive data, making it suitable for large-scale IoT environments such as those of the Internet of Vehicles (IoV) and Intelligent Transportation Systems (ITS) [45].
Recently, several frameworks have been proposed in the IoV that use federated learning. For instance, Balasubramanian et al. [46] proposed a cooperative edge intelligence framework that uses a hybrid stacked autoencoder model called VeNet to perform anomaly detection and classification tasks among multiple edge devices in a decentralized manner. It consists of a local autoencoder that is trained on data collected by each edge device, and a global autoencoder that is trained on a subset of the data from all the edge devices. Similarly, Zhou et al. [47] introduced a novel two-layer federated learning framework for IoV that allows for aggregating models with different architectures and hyperparameters. This approach allows for more flexibility in model selection and greater performance of federated learning frameworks. Overall, the experimental evaluations on real-world and large-scale datasets demonstrate the scalability, efficiency, and potential benefits of using federated learning in urban computing contexts, making it suitable in large-scale IoT environments.

III. SYSTEM ARCHITECTURE
Although cloud computing provides high scalability with dynamic resource allocation, it may raise performance issues as a result of the centralization of data collection and processing [48], [49]. On the other hand, an edge-cloud continuum architecture might address these issues by enabling efficient and fast management of the massive volume of data generated by IoT devices. In particular, these architectures enhance computation capabilities and scalability while reducing network congestion and failed tasks. For these reasons, such architectures can also have a significant impact on urban mobility applications. Figure 1 shows a three-layer edge-cloud continuum architecture for supporting urban mobility.
The edge-cloud continuum leverages all the resources from the edge of the network (e.g., IoT devices) to the core (e.g., cloud data centers) [50]. Specifically:
• The device layer includes the components that are leveraged by vehicles and humans to share information during their movements across different urban cells, which define a partitioning of an urban area. These components (e.g., GPS, infotainment devices, on-board cameras) produce a very high volume of data in different formats and in real-time, which is sent to the edge server of the current cell. This data can be combined with the personal data of the users (e.g., preferences and behaviors) and information about the surrounding environment, to deliver advanced, customized and context-aware services.
• The edge layer includes heterogeneous hardware components (e.g., gateways, micro data centers), which serve as elements of the infrastructure that collect and partially process raw data generated at the device layer.
• The cloud layer provides access to a large set of computing and storage resources, which can be dynamically allocated for executing tasks that cannot be performed by edge servers. From the client's perspective, the cloud is an abstraction for remote and infinitely scalable computing and storage resources. For these reasons, it has emerged as an effective computing paradigm to meet the challenge of processing big data in a limited time and to provide an efficient data analysis environment.
The edge layer includes a key component called Edge Orchestrator (EO), which is responsible for managing and coordinating the execution of tasks, determining whether each task will run on the edge or cloud. It can be programmed to apply different orchestration policies to optimize the overall performance of the architecture. These policies can take into consideration many parameters, such as network congestion, data volume to be processed, and status and load level of both edge nodes and cloud. Two orchestration policies were employed in this work, namely network-based (edge/cloud-NB) and utilization-based (edge/cloud-UB), whose pseudocode is shown in Algorithm 1.

Algorithm 1 Edge Orchestrator
1: Initializing EO and orchestration policy p.
2: procedure GetServer(task, coord, θ1, θ2)
3:     cell ← getCell(coord)
4:     edgeS ← getEdgeServer(cell)
5:     layer ← null
6:     if p == Utilization_Based then
7:         edgeUtilization ← getEdgeUtilization()
8:         if edgeUtilization > θ1 then
9:             layer ← CLOUD
10:        else
11:            layer ← EDGE
12:        end if
13:    else
14:        wanDelay ← getUpDelay(task.getDevice(), CLOUD)
15:        wanUBW ← getBandwidthUtilization(wanDelay)
16:        if wanUBW < θ2 then
17:            layer ← CLOUD
18:        else
19:            layer ← EDGE
20:        end if
21:    end if
22:    return (layer == EDGE) ? edgeS : cloud
23: end procedure

In particular, for each task to be scheduled, the cell and the associated edge server are identified from the coordinates of the IoT object generating that task (lines 3-4). Then, the desired orchestration policy (i.e., utilization-based or network-based) is applied to decide where the incoming task must be executed. Specifically, the utilization-based policy schedules tasks based on the utilization of edge nodes (lines 6-12). If the average edge utilization is greater than a fixed threshold (i.e., θ1), the incoming task is offloaded to the cloud (lines 8-9); otherwise, it is assigned to the edge layer (lines 10-11). The network-based orchestration policy (lines 14-21) measures the network delay from the device that generated the task to the cloud (line 14). For deciding where it must be executed, a dummy task that uploads and downloads 1 MB of data is exploited. In detail, the algorithm measures the upload delay which includes both the transmission delay (i.e., the time required to transmit the data over the network) and the processing delay (i.e., the time required for the cloud to process the request). Particularly, the transmission delay includes both the time required to transmit the request and response over the network, which depends on the size of the data being transmitted and the available transmission rate, and propagation delay, which is due to the distance between the server and the cloud. Then, this delay is leveraged to determine the percentage of used bandwidth compared to the maximum bandwidth (line 15). If it is less than a fixed threshold (i.e., θ2), the incoming task is offloaded to the cloud (lines 16-17); otherwise, it is assigned to the edge layer (lines 18-19).
In the end, according to the chosen layer, the task is assigned to the cloud or the edge server of the current cell (line 22).
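The decision logic of Algorithm 1 can be illustrated with a minimal Python sketch. It is not the simulator's implementation: the class and method names, the two probe callables (standing in for getEdgeUtilization and for the bandwidth utilization derived from the measured WAN delay), the 10 x 10 grid of 1 km cells, and the string identifiers for servers are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Tuple


class Policy(Enum):
    UTILIZATION_BASED = "UB"
    NETWORK_BASED = "NB"


@dataclass
class EdgeOrchestrator:
    policy: Policy
    # Hypothetical probes returning a utilization value in [0, 1].
    edge_utilization: Callable[[], float]
    wan_bandwidth_utilization: Callable[[], float]
    grid_side: int = 10        # assumption: 100 cells laid out as 10 x 10
    cell_size_km: float = 1.0  # assumption: square cells, 1 km per side

    def get_cell(self, coord: Tuple[float, float]) -> int:
        """Map (x, y), in km from the grid origin, to a cell index."""
        last = self.grid_side - 1
        col = max(0, min(int(coord[0] // self.cell_size_km), last))
        row = max(0, min(int(coord[1] // self.cell_size_km), last))
        return row * self.grid_side + col

    def get_server(self, coord: Tuple[float, float],
                   theta1: float = 0.8, theta2: float = 0.8) -> str:
        """Return the target server, following lines 3-22 of Algorithm 1."""
        edge_server = f"edge-{self.get_cell(coord)}"
        if self.policy is Policy.UTILIZATION_BASED:
            # Offload to the cloud only when the edge is overloaded.
            to_cloud = self.edge_utilization() > theta1
        else:
            # Offload to the cloud only while the WAN is not congested.
            to_cloud = self.wan_bandwidth_utilization() < theta2
        return "cloud" if to_cloud else edge_server
```

For instance, with the utilization-based policy and an edge utilization of 0.9, a task generated at (2.5, 3.5) is sent to the cloud; with an edge utilization of 0.5 it stays on the edge server of its cell.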
In the experimental section, the values of thresholds were chosen according to conventions often used on cloud platforms. Indeed, in different technical reports [51], [52], [53], a threshold value is used to determine when to scale the computing resources (e.g., 80% of the total resources). This is because, if the percentage of resource utilization reaches the threshold value, it can indicate that such resources are under pressure and may not be able to handle any further requests.

IV. EXPERIMENTAL EVALUATION
To evaluate the performance of the proposed edge-cloud continuum architecture, we used the EdgeCloudSim simulator and considered three different urban mobility scenarios (taxis, cars and tourists with smartphones). Among the open-source simulators discussed in Section II, we have chosen EdgeCloudSim, which is particularly well-suited for modeling urban mobility scenarios, since it supports different architectures, devices, and device mobility [26]. Table 1 reports the main parameters required by the simulator along with their description.
The three different applications are concerned with urban mobility in which machine learning algorithms are used to analyze large sets of geotagged data generated during the movements of IoT objects. In particular:
1) Application scenario 1 is about the taxi destination prediction problem, aimed at establishing the next position where taxis will have to move to have a better chance of finding new customers.
2) Application scenario 2 models the problem of delivering location-based and targeted advertising to car drivers based on the position of the car and the interests of the driver.
3) Application scenario 3 concerns the next location recommendation problem applied to tourists, aimed at suggesting new points of interest to visit based on their movements collected by their smartphones.
For each scenario, we considered three common tasks:
• Data collection task: it consists in collecting and preprocessing the data generated at the device layer (e.g., data generated by IoT objects).
• Training task: it consists in training a machine learning model, which is regularly updated with new mobility patterns. In these experiments, we used a centralized approach for model updates. This is an appropriate choice given the size of the urban area and the number of devices considered, in contrast to approaches based on federated learning, which are more suitable for larger urban-area scenarios.
• Prediction task: it exploits the trained model for suggesting the next location where a taxi should move to find new passengers or to provide location-based advertising and suggestions to car drivers and tourists.
Table 2 reports the main parameters used to configure the simulations, which have been extracted from official reports of public administrations or scientific papers. In particular, we used Rome in Italy as the reference city, and according to the official report [54] we have defined the number of taxis, cars and tourists, i.e., 10K, 100K and 100K, respectively. The city has been divided into 100 cells covering about 1 km² each. The infrastructure is composed of an edge layer with 100 edge servers, each configured as a virtual machine (VM) having 4 cores, 4 GB of RAM and 64 GB of storage memory, and a cloud layer configured as a VM equipped with 8 cores, 32 GB of RAM and 1 TB of storage memory. In our simulations, IoT devices follow a Nomadic Mobility Model, in which the time a device remains in a cell before moving to a nearby one is drawn from an exponential distribution. The mean value of the exponential distribution is set to 140 seconds for application scenarios 1 and 2, and to 900 seconds for application scenario 3. In fact, considering that the average speed of a vehicle in the center of Rome is estimated at 26 km/h [55] and that the city area has been divided into cells covering areas of 1 km², vehicles at this speed cross a cell on average in about 2.3 minutes (i.e., 140 seconds). Instead, the average speed of a pedestrian walking at a slow pace is 4 km/h [56], therefore the time taken to pass from one cell to another is about 15 minutes (i.e., 900 seconds).
VOLUME 11, 2023
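The dwell-time figures above can be reproduced with a short Python sketch. It assumes that crossing a cell of about 1 km² corresponds to travelling 1 km (the side of a square cell), and the function names and the random seed are illustrative only.

```python
import random


def mean_dwell_seconds(cell_side_km: float, speed_kmh: float) -> float:
    """Average time, in seconds, to cross one cell at a constant speed."""
    return cell_side_km / speed_kmh * 3600.0


def sample_dwell_times(mean_s: float, n: int, seed: int = 0) -> list:
    """Draw n dwell times from an exponential distribution with mean mean_s."""
    rng = random.Random(seed)
    # expovariate takes the rate, i.e. the reciprocal of the mean.
    return [rng.expovariate(1.0 / mean_s) for _ in range(n)]


vehicle_mean = mean_dwell_seconds(1.0, 26.0)    # about 138.5 s, rounded to 140 s
pedestrian_mean = mean_dwell_seconds(1.0, 4.0)  # 900 s, i.e. 15 minutes
vehicle_dwells = sample_dwell_times(140.0, 20000)
```

Individual dwell times vary widely around the mean, since the exponential distribution has a standard deviation equal to its mean; only the average over many cell visits approaches 140 or 900 seconds.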
The training and inference times together with the information on the hardware characteristics reported in [35], [40], and [44], have been used to determine the type of tasks and their average length for application scenarios 1, 2 and 3, respectively. Moreover, for each task (i.e., data collection, training and prediction) in each application scenario, a set of parameters is required, including the Poisson interarrival time, the active/idle period time of tasks, and the amount of data that is downloaded and uploaded. The Poisson interarrival time is used to define the rate at which the devices generate the different types of tasks. Overall, for data collection and prediction tasks, the interarrival times were set to low values to reflect the high level of device activity in the urban area under consideration. Conversely, higher values were chosen for the training tasks to model that they are less recurring (e.g., periodic retraining of the machine learning model might occur once a day). The active and idle periods are used to control the amount of time a device spends actively generating or not generating a specific task, while the upload/download data sizes control the amount of data generated and transmitted by devices in the simulation. For example, large upload and download data sizes indicate a task with high data transmission requirements, such as the prediction and data collection tasks that involve significant data exchange. Among all the parameters described, the number of IoT objects, the Poisson interarrival time and the task length are the ones that most strongly drive the results of our simulations and, for this reason, we mainly focused on them to define the different tasks in the application scenarios. Other parameters required to configure and run the simulations (e.g., active/idle period and download/upload data size) were defined differently for each task but uniformly across scenarios.
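The interplay between the Poisson interarrival time and the active/idle periods can be sketched as follows. This is an illustrative model of the workload description above, not the simulator's own generator: the function name and all numeric values in the usage line are hypothetical, and each device is assumed to start in an active period.

```python
import random


def generate_task_times(sim_time_s: float, mean_interarrival_s: float,
                        active_s: float, idle_s: float, seed: int = 0) -> list:
    """Return task creation times (seconds) for one device.

    Tasks arrive as a Poisson process, i.e. with exponentially distributed
    interarrival times, but only during active periods; no task is
    generated during idle periods.
    """
    rng = random.Random(seed)
    times = []
    period_start = 0.0
    while period_start < sim_time_s:
        active_end = min(period_start + active_s, sim_time_s)
        t = period_start + rng.expovariate(1.0 / mean_interarrival_s)
        while t < active_end:
            times.append(t)
            t += rng.expovariate(1.0 / mean_interarrival_s)
        # Skip the idle period and start the next active one.
        period_start += active_s + idle_s
    return times


# Hypothetical values: one hour, a task every 5 s on average, 40 s active, 20 s idle.
task_times = generate_task_times(3600.0, 5.0, 40.0, 20.0)
```

A lower mean interarrival time yields more tasks per active period, which is how frequent data collection and prediction tasks differ from rare training tasks in this kind of model.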

A. PERFORMANCE EVALUATION
We carried out a large number of experiments to evaluate the edge-cloud continuum architecture. To make the simulation results more significant, we repeated the experiments 10 times for each input configuration and reported the mean values. The experiments are used to assess the behavior of the edge-cloud continuum solution compared to centralized ones that exploit only cloud or edge resources. Specifically, the four configurations we evaluated are the following:
• Cloud-only: tasks are performed exclusively on the cloud.
• Edge-only: tasks are performed directly on the edge.
• Edge/cloud-UB and edge/cloud-NB: tasks are performed locally on edge servers or remotely in the cloud based on the policy of the edge orchestrator (i.e., utilization-based and network-based, respectively).
Regarding the edge/cloud configurations, as described in Algorithm 1, the decision whether to offload a task to the cloud or perform it on the edge server is driven by two main parameters, i.e., the two thresholds θ1 and θ2. In the experimental evaluation of all application scenarios, the threshold for both the utilization- and network-based policies was set at 80% [51], [52], [53], which means that computing and network resources are kept from being used beyond 80% of their capacity to avoid saturation.
The configurations were evaluated and compared on four different metrics, which are the average processing time, percentage of failed tasks, network delay and VM utilization.

1) APPLICATION SCENARIO 1: NEXT LOCATION PREDICTION FOR TAXIS
In this section, we present the main results we obtained for the scenario related to taxi destination prediction. Figure 2 reports the performance metrics for each of the four configurations (i.e., cloud-only, edge-only, edge/cloud-NB, and edge/cloud-UB). As stated before, the application is modeled to simulate the city of Rome, which has around 10k taxi licenses according to official data [54]. However, we considered a variable number of taxis, ranging from 5k to 12.5k, to investigate how a different number of taxis can impact the performance metrics in the different configurations.
The average processing time obtained by the different configurations is shown in Figure 2(a). The achieved results are stable and consistent over the different runs and exhibit low variance. On average, the relative standard deviation is at most 4% compared to the mean value over the 10 runs. In particular, the edge-only configuration showed the worst results, with a significant drop in performance as the number of vehicles increased (the processing time increases from around 10 seconds with 5k vehicles up to around 70 seconds with 12.5k vehicles). Instead, the cloud-only configuration achieved a very low average processing time. However, it dramatically increases the number of failed tasks as the number of vehicles increases. In fact, as shown in Figure 2(b), the percentage of failed tasks for the cloud- and edge-only architectures increases rapidly as the number of vehicles increases. In particular, a steep increase can be observed when using more than 7.5k vehicles: this means that, as long as there are few vehicles, the cloud-only architecture can handle the incoming workload better than the other configurations, but as the number of devices increases it leads to a higher percentage of failed tasks. On the other hand, the use of the edge orchestrator leads to a lower task failure rate (on average 6.8% for edge/cloud-NB and 1.3% for edge/cloud-UB). This is a crucial aspect to be considered since in many contexts having a high number of failed tasks can compromise the usability of the IoT application. Also for this metric, the results obtained show a low variance, with a relative standard deviation that is at most 2% with respect to the mean value.
Figure 2(c) shows the average network delay. In particular, the cloud-only configuration generates a very high network delay because of data transfer from the edge layer to the cloud, resulting in a significant increase in communication delay (up to around 98% higher than the edge-only solution), while processing data locally at the edge does not produce significant effects. For the network delay, simulation results showed a negligible relative standard deviation, below 1%.
Figure 2(d) illustrates the average VM utilization obtained by the different simulated configurations, with a relative deviation from the mean value of at most 7%. The edge/cloud-NB configuration achieved the best result, showing a low utilization of resources while keeping, as discussed, a low processing time and a low percentage of failed tasks. Reducing the use of VMs is a crucial aspect in large-scale applications that involve large computational resources because it allows for optimizing costs and energy consumption. Additionally, reducing the risk of saturating computational resources allows for handling any unexpected workload peak that may occur. It should also be noted that the edge-only configuration produces a significant increase in the VM utilization for a high number of vehicles, but it still achieves a lower task failure rate than the cloud-only one (see Figure 2(b)). If we analyze in detail the percentage of VM utilization for the two edge/cloud configurations, we can get more details about the behavior of the edge orchestrator. In particular, Figure 3 shows the percentage of VM utilization on both cloud and edge when considering 12.5k vehicles. In this case, the utilization-based policy results in a higher utilization of the edge resources (73% compared to 49% of cloud), while the network-based policy produces a higher utilization of cloud resources (57% compared to 12% of edge).
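The relative standard deviation quoted for each metric of Figure 2 is the standard deviation over the repeated runs divided by their mean. A minimal Python sketch of this computation follows; it assumes the sample (n-1) standard deviation, which the text does not specify, and the input values are invented for illustration and are not measurements from the experiments.

```python
import statistics


def relative_std(values: list) -> float:
    """Sample standard deviation divided by the mean of the runs."""
    return statistics.stdev(values) / statistics.mean(values)


# Made-up processing times (seconds) standing in for repeated runs
# of a single configuration.
illustrative_runs = [9.0, 10.0, 11.0]
rsd = relative_std(illustrative_runs)  # stdev 1.0 over mean 10.0
```

A value of 0.04 returned by this function would correspond to the 4% bound reported for the average processing time.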
It is worth noting that a task can fail for one of three reasons: insufficient VM capacity, low network bandwidth, or mobility. In particular, if the utilization of a VM is too high, it may reject incoming tasks. Similarly, if too many vehicles connect to the same edge server, the network may become congested and tasks may fail. Finally, a task may fail because the vehicle moves from one cell to another, according to the Nomadic Mobility Model. As an example, if we analyze the failed tasks in the cloud-only configuration, we find that only 0.02% fail due to low computation capacity, as the cloud has enough computational resources, while almost all failed tasks are due to network congestion. On the other hand, the edge/cloud-NB configuration is able to balance data traffic between cloud and edge, avoiding sending traffic over the WAN when it is congested.
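The behavior of the two orchestration policies discussed in this section can be illustrated with a simplified sketch. It is not the simulator's implementation: the function names, the input quantities, and both threshold values are hypothetical and chosen only to show the kind of decision each policy makes (keep a task at the edge, or offload it to the cloud).

```python
EDGE = "edge"
CLOUD = "cloud"


def utilization_based(edge_vm_utilization, max_edge_utilization=80.0):
    """Keep tasks at the edge until edge VMs are too loaded, then offload to the cloud.

    edge_vm_utilization: average utilization of the edge VMs, in percent.
    max_edge_utilization: hypothetical threshold, in percent.
    """
    return CLOUD if edge_vm_utilization > max_edge_utilization else EDGE


def network_based(wan_bandwidth_mbps, min_wan_bandwidth_mbps=6.0):
    """Offload to the cloud only while the WAN link is not congested.

    wan_bandwidth_mbps: bandwidth currently observed on the WAN link.
    min_wan_bandwidth_mbps: hypothetical threshold below which the WAN
    is treated as congested.
    """
    return CLOUD if wan_bandwidth_mbps > min_wan_bandwidth_mbps else EDGE


# Lightly loaded edge: the utilization-based policy stays at the edge.
print(utilization_based(edge_vm_utilization=45.0))
# Congested WAN: the network-based policy avoids the cloud.
print(network_based(wan_bandwidth_mbps=2.0))
```

Under these assumptions, the utilization-based rule fills edge resources first, while the network-based rule prefers the cloud whenever the WAN allows it, which is consistent with the utilization split reported in Figure 3.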
Overall, the edge/cloud-UB and edge/cloud-NB configurations showed the best results, outperforming the conventional cloud- or edge-only architectures. Compared to the edge-only architecture, the use of the edge orchestrator leads to a drastic reduction in processing time, which ranges from 30% for edge/cloud-UB to 87% for edge/cloud-NB. In addition, compared to both cloud- and edge-only architectures, it reduces the number of failed tasks (up to 38% for edge/cloud-UB and 40% for edge/cloud-NB) and the VM utilization (up to 29% for edge/cloud-UB and 38% for edge/cloud-NB).
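For readers who want to recompute such comparisons from raw measurements, the sketch below shows one way to express a reduction with respect to a baseline configuration. It assumes the percentages are relative reductions; the text does not state whether they are relative values or absolute percentage-point differences. The function name and the numbers in the example are illustrative only.

```python
def percent_reduction(baseline, value):
    """Return how much lower `value` is than `baseline`, as a percentage of `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (baseline - value) / baseline


# Made-up example: a metric that drops from 200 units (baseline) to 50 units.
print(percent_reduction(200.0, 50.0))
```

A negative result would indicate that the evaluated configuration performs worse than the baseline on that metric.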

2) APPLICATION SCENARIOS 2 AND 3: LOCATION-BASED ADVERTISING FOR CAR DRIVERS AND POI RECOMMENDATION FOR TOURISTS
In this section, we present the main results obtained by simulating application scenarios 2 and 3, which model the problems of location-based advertising for car drivers and points of interest recommendation for tourists. For the sake of brevity, we did not vary the number of IoT objects (i.e., vehicles and people), but considered only the value closest to the real one. In particular, we considered 100k IoT objects for both scenarios, which is about the number of cars and tourists that move around the city of Rome every day [54]. Figure 4 reports the performance metrics for each of the four configurations, i.e., cloud-only, edge-only, edge/cloud-UB and edge/cloud-NB. Specifically, Figures 4(a), 4(b), 4(c) and 4(d) report the average processing time, percentage of failed tasks, network delay and VM utilization, respectively.
In both simulated scenarios, the edge/cloud-NB configuration shows a better processing time than both edge-only and edge/cloud-UB (up to 81% and 76% lower respectively). It should be noted that the computational resources of the cloud allow for a lower processing time. However, as for application scenario 1, the cloud-only configuration is affected by a high percentage of failed tasks and high network delay. Indeed, leveraging both edge and cloud resources, as long as the network is not congested and the WAN delay is negligible, reduces the time required to complete a task.
Concerning failed tasks, the edge/cloud configurations obtain very low failure rates (less than 4%). In particular, Figure 5 details the percentage of failed tasks due to network congestion, VM capacity and mobility for both configurations. In these two scenarios, mobility from one cell to another is the main cause of task failure. However, especially in application scenario 2, a considerable share of the tasks fails due to a lack of resources at the edge layer.
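A per-cause breakdown like the one in Figure 5 can be derived from a log of failed tasks. The sketch below assumes each failed task is recorded with exactly one of three cause labels; the label strings, the function name, and the sample log are hypothetical and do not come from the simulator's output format.

```python
from collections import Counter

# Hypothetical labels for the three failure causes discussed in the text.
CAUSES = ("network", "vm_capacity", "mobility")


def failure_breakdown(failure_causes, total_tasks):
    """Return, for each cause, the percentage of all tasks that failed for that cause."""
    if total_tasks <= 0:
        raise ValueError("total_tasks must be positive")
    counts = Counter(failure_causes)
    unknown = set(counts) - set(CAUSES)
    if unknown:
        raise ValueError(f"unknown failure causes: {sorted(unknown)}")
    return {cause: 100.0 * counts[cause] / total_tasks for cause in CAUSES}


# Made-up log: 4 failed tasks out of 200 generated tasks.
failures = ["mobility", "mobility", "mobility", "vm_capacity"]
breakdown = failure_breakdown(failures, total_tasks=200)
print(breakdown)
```

Summing the three values gives the overall task failure rate, so the breakdown can be checked against the aggregate percentage reported for each configuration.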
Regarding network delay, the simulation results did not reveal significant differences between the two orchestration policies in the edge-cloud continuum, while the cloud-only configuration is heavily affected by data transfer from the edge layer.
Finally, regarding VM utilization, the edge/cloud-NB and edge/cloud-UB configurations showed a lower percentage than the cloud- and edge-only configurations, reducing the risk of saturating computational resources and allowing for better management of the incoming workload. In particular, the average VM utilization of the edge/cloud-NB configuration is up to 22% and 48% lower than that of edge/cloud-UB in the two simulated scenarios. Overall, the edge/cloud-NB configuration performed better than the edge/cloud-UB configuration, reducing the processing time by 76%, the number of failed tasks by 3%, and the VM utilization by 27% on average.

V. CONCLUSION AND FUTURE WORK
With the pervasive diffusion of IoT devices, the edge-cloud continuum has been proposed to combine the advantages of edge computing in processing data closer to where they are generated with those of the cloud in supporting compute-intensive tasks. In this paper, we explored the use of edge-cloud architectures for supporting three urban mobility scenarios (i.e., next location prediction, location-based advertising, and points of interest recommendation), in which machine learning algorithms are used to analyze large sets of geotagged data generated during the movements of IoT objects (e.g., taxis, cars, smartphones).
Several experiments were carried out to assess the benefits of the edge-cloud continuum over the traditional cloud- or edge-only architectures. In particular, we exploited a simulation-based approach for designing and testing IoT applications by using an edge-cloud simulator and two orchestration policies, based on network (edge/cloud-NB) and computational resources (edge/cloud-UB) utilization. The achieved results demonstrated that the edge-cloud continuum architecture, coupled with the defined orchestration policies, outperforms traditional cloud- or edge-only architectures, obtaining a significant reduction in processing time, task failure rate, and resource utilization.
Future research efforts will be devoted to developing advanced orchestration policies that can exploit machine and deep reinforcement learning to improve task scheduling in the edge-cloud continuum. Such policies can be further tested using emulators instead of simulators to evaluate how software interacts with the underlying hardware. Furthermore, in ever-growing urban areas with ever-increasing numbers of IoT devices, it will also be necessary to think about how algorithms can scale efficiently on edge-cloud architectures. Hence, future work should evaluate how machine learning paradigms such as federated learning can overcome the limitations of centralized solutions in large-scale IoT environments.

DATA AND CODE AVAILABILITY STATEMENT
To reproduce the experiments reported in the paper, the open-source version of EdgeCloudSim is available at https://github.com/CagataySonmez/EdgeCloudSim, while all the parameters required to run the simulations are reported in the paper.

LORIS BELCASTRO received the Ph.D. degree in information and communication engineering from the University of Calabria, Italy. He is currently a Research Fellow of computer engineering with the University of Calabria. His research interests include cloud computing, social media and big data analysis, distributed knowledge discovery, and data mining. In 2012, he received a scholarship from the Institute of High-Performance Computing and Networking of the Italian National Research Council (ICAR-CNR).

Open Access funding provided by 'Università della Calabria' within the CRUI CARE Agreement.