Latency-Aware and Proactive Service Placement for Edge Computing

Smart IoT devices and applications in smart cities exchange important real-time information with their environment. However, a subset of these systems may face limitations in analyzing and processing the required large amounts of data to meet ultra-low-latency criteria. This limitation could be attributed to factors such as constrained CPU and battery resources. Thanks to the 5G and edge computing capabilities, a viable solution involves migrating a subset of these latency-sensitive and computationally intensive tasks to edge nodes and servers. This strategic service placement ensures a safe continuity of the application. In this paper, autonomous cars operating in smart cities, engaging in continuous data exchange with their external environment to meet real-time and latency-sensitive requirements, serve as an illustrative example of smart applications. The car’s decision service is strategically placed on edge nodes through a proactive (re)placement approach designed for dynamic and mobile environments. This approach uses a quality of service (QoS) metric prediction degradation module, which leverages Exponential smoothing methods to identify a suitable edge node for hosting the car’s decision module, with latency as a key criterion. Multiple configurations for outlier detection techniques are evaluated. A proof-of-concept validates the chosen model by comparing it to the AutoRegressive Integrated Moving Average (ARIMA) and the proposed proactive service (re)placement approach. This approach ensures the continuity of the placed module, suggesting the feasibility of locating non-critical modules on edge nodes.

Henda Sfaxi, Imene Lahyani, Sami Yangui, Member, IEEE, and Mouna Torjmen

Index Terms: ARIMA, edge computing, quality of service (QoS), service placement, time-series forecast.

I. INTRODUCTION
According to [1], the number of megacities with over 10 million inhabitants is projected to increase from the current 21 to 29 by the year 2025. The number of interconnected IoT devices is expected to increase by 16% by the end of 2023, resulting in 16 billion devices [2]. As mentioned in [3], the number of IoT connections is forecast to grow from 14.6 billion to 30.2 billion in 2027. Furthermore, the widespread implementation of advanced networks promotes the growth of smart cities and related services, including sectors such as smart health and smart agriculture. Several challenges frequently arise in this context, two of which are outlined below. Firstly, numerous connected devices have limited computational and battery capacities [4], [5]. Secondly, contemporary and future mobile and smart service applications, including but not limited to virtual/augmented reality (VR/AR) and connected cars, are latency-sensitive. These applications involve a substantial number of connected devices and sensors, generate a significant amount of data that needs real-time processing, may require considerable computational resources, and demand substantial storage capacity [6]. Despite its substantial storage and computational capabilities, cloud computing cannot guarantee the low-latency and real-time response these applications often call for, as it is typically deployed within the core network [1]. Edge computing, however, can meet these criteria, owing to the proximity of its nodes and servers at the edge of the network. Furthermore, studies [6] and [7] propose relying on citizens' connected devices as edge nodes to enhance resource availability within the edge network. Hence, smart and mobile applications may utilize cloud resources, edge resources, or a combination of both, depending on their requirements.

Henda Sfaxi, Imene Lahyani, and Mouna Torjmen are with the ReDCAD, ENIS, University of Sfax, Sfax 3038, Tunisia (e-mail: Henda.Sfaxi@enis.tn; Imen.Lahyani@enis.tn; Mouna.Torjmen@enis.tn).
Digital Object Identifier 10.1109/TNSM.2024.3375970
On the other hand, enhanced Mobile Broadband (eMBB) and Ultra-Reliable Low-Latency Communication (URLLC) represent two significant commitments in the era of 5G and 6G networks. These services, among others [8], promise to enhance location awareness [6], [9] and communication among connected devices. Specifically, these services are anticipated to streamline interactions between mobile and IoT-based applications and edge nodes and devices, facilitating the implementation of latency-sensitive use cases.
Applications can offload specific tasks to external nodes, both to harness the computational and storage capacities of cloud and edge computing and to compensate for the constrained computational capacity and battery life of specific connected devices. This practice is commonly referred to as service placement. Service placement serves various objectives, such as optimizing energy consumption within the system, achieving faster task execution than on the original devices, and fulfilling other intended benefits, as stated by [10]. Various metrics and parameters have to be considered when addressing the service placement problem (SPP). These include the dynamicity of the network, the mobility support of the edge nodes, and optimization strategies such as determining the Quality of Service (QoS) metric(s) to prioritize, as discussed in [7]. Additionally, the placed service is only effective if its uninterrupted functionality is guaranteed.
Enabling seamless mobility for both end-users and edge nodes while maintaining uninterrupted access to desired services and optimal performance stands out as a significant challenge in edge computing, particularly in the context of SPP. Frequent changes in end-user locations can lead to significant delays or packet loss. In such scenarios, services have to be seamlessly transferred from previous edge nodes to new ones or placed back on the original end-user device, ensuring continuous operation without interruptions.
Regarding mobility within SPP, the literature can be categorized into two primary approaches: reactive and proactive techniques. Reactive techniques detect deteriorating QoS, Quality of Experience (QoE), or mobility conditions among end-users and edge nodes and only then make service placement decisions. In contrast, proactive techniques focus on predicting the degradation of fixed QoS and QoE metrics or forecasting the mobility patterns of end-users and edge nodes to anticipate their movement behavior.
This work focuses on a proactive approach to address the service placement problem in edge computing, considering the network dynamicity and mobility of end-users and edge nodes.

It provides two key contributions:
• Contribution 1 (CONTR1): A prediction model for a QoS metric.
• Contribution 2 (CONTR2): A proactive service placement capability utilizing the prediction model (CONTR1) to detect QoS metric degradation in dynamic and mobile edge computing environments.
The proposed predictive module facilitates the selection of a suitable edge node for service (re)placement based on the predicted latency value as a QoS metric. The module selects the edge node with the lowest predicted latency for service placement while ensuring service continuity. As long as the edge node remains responsive and the predicted latency value remains low (i.e., when predictedLatencyValue 0), the service remains on the edge node. In case a degradation is predicted, the service is relocated to a more appropriate edge node if possible; otherwise, it is used within the car.
To achieve (CONTR1), Exponential smoothing models are used alongside a set of outlier detection techniques to construct the prediction model for latency QoS metric degradation. The selection of these techniques is based on specific evaluation criteria. Regarding (CONTR2), the chosen model is substantiated through a proof-of-concept, which involves a comparison with AutoRegressive Integrated Moving Average (ARIMA) and validates the proposed proactive service (re)placement approach. This proof-of-concept offloads a critical service for demonstration purposes only, showing that the car can operate safely and interact with all other nodes in its neighborhood while staying autonomous. While offloading critical services for next-generation autonomous vehicles is challenging to implement in real-world scenarios, mainly due to security reasons, the authors believe that it is indeed feasible for non-critical modules such as infotainment and multimedia applications.
The remainder of this paper is structured as follows. Section II introduces the motivating use case and its associated requirements. Section III provides a literature review on SPP in edge computing. Section IV gives insight into the essential background and fundamental concepts related to the time series forecast models. Section V introduces the proposed algorithm for latency forecasting and outlier detection. Section VI discusses the configurations for detecting outliers and their interpretation. Section VII details the developed prototype as a proof-of-concept. Finally, Section VIII draws the main conclusions of the paper and outlines directions for future research.

II. MOTIVATING USE CASE AND REQUIREMENTS
This section introduces the relevant use case, elucidates the underlying motivation, provides a formal problem statement, and outlines the work's requirements.

A. Autonomous Cars in 5G Networks
Connected autonomous cars are poised to replace conventional vehicles in the near future, potentially reducing traffic congestion and minimizing accidents resulting from human errors during driving. These vehicles can also be designed to accommodate seniors and individuals with reduced mobility. Operating within smart cities, autonomous cars continuously exchange vital information with their external environment, supporting the latency-sensitive embedded applications, particularly the decision-making system, in responding to incoming data. The vast data generated by the cars' embedded sensors may exceed the processing capabilities of the onboard systems, leading to challenges in meeting low-latency requirements [11].
5G and edge computing offer various features, including low-latency communication, location awareness, and substantial computing capacity, which can benefit connected devices [6], [9]. Enhanced Mobile Broadband (eMBB) and Ultra-Reliable Low-Latency Communication (URLLC) are key offerings in the 5G and 6G network era, particularly in smart cities. These services enhance communication between connected devices and streamline interactions between autonomous cars and edge nodes, enabling various latency-sensitive applications.
Given these capabilities, autonomous cars can optimize their energy consumption and leverage the computational resources of edge computing by offloading specific tasks to edge nodes [10], among other advantages.

B. Motivations and Research Issues
Autonomous vehicles, equipped with extensive sensor arrays, generate large volumes of real-time data during their journeys, which may exceed the capabilities of their on-board systems to process promptly under low-latency conditions [11]. To address this challenge, one viable scenario involves offloading specific computational tasks associated with vehicle control to external edge nodes. A relevant study [12] conducted within a smart city framework explored a similar concept. In its proposed architecture, the vehicle's operating area is divided into zones known as 'pops'. The autonomous vehicle determines the most suitable edge node for offloading specific computational tasks within each pop. The vehicle's essential modules encompass the decision module (enacting driving policies), the perception module (collecting navigation data), the navigation module (calculating optimal routes), the data collection module (collaborating with other devices to gather real-time information), and the storage module (archiving data generated by other modules). The first four modules are integrated within the autonomous vehicle, while the storage module resides in the 5G core network. Notably, a simulation assessed the viability of migrating the data collection module to various edge nodes within the nearest pop. The results indicated that relocating this module outside the vehicle might not be advantageous.
Given the substantial data volumes the autonomous vehicle generates, which on-board systems may struggle to analyze within low-latency constraints [11], exploring alternative strategies for relocating other modules while ensuring continuous functionality warrants consideration.

C. Problem Formalization
Considering a hypothetical scenario, this study focuses on migrating the autonomous car's decision module to an edge node. As the autonomous car moves from one location (pop) to another, as illustrated in Figure 1, it must identify an optimal placement, i.e., an edge node capable of temporarily hosting the latency-sensitive decision module. Some examples of the considered edge nodes in this study include servers, other autonomous cars, and devices owned by citizens.
The decision module plays a vital role in the vehicle's driving engine and must remain consistently available and operational during the car's journey.
Therefore, selecting an edge node with low-latency responsiveness is crucial to ensure the module's continuous availability. If the chosen edge node experiences delays or ceases to respond promptly (possibly due to relocation, low battery, or excessive workload), or when the car transitions to a new location, the migrated service should be swiftly relocated to another suitable edge node or back to the car itself.
To address this requirement, a migration engine with a predictive approach first selects an edge node within the current location based on low-latency Quality of Service (QoS) criteria for hosting the decision module. Subsequently, it monitors the QoS metric for potential degradation, preemptively searching for and transitioning to another available and suitable edge node. One of the migration engine's primary roles is to ensure the continuous operation of the decision module, which justifies using a predictive technique.
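The decision logic of such a migration engine can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `CAR` label, and the `max_latency_ms` acceptance threshold are hypothetical, and the predicted latencies are assumed to be supplied by a separate prediction service.

```python
from typing import Dict

CAR = "car"  # hypothetical label meaning "host the decision module on board"


def choose_host(predicted: Dict[str, float], max_latency_ms: float) -> str:
    """Pick the edge node with the lowest predicted latency.

    Falls back to the car when no node is predicted to stay within
    the (assumed) acceptance threshold.
    """
    candidates = {n: v for n, v in predicted.items() if v <= max_latency_ms}
    if not candidates:
        return CAR
    return min(candidates, key=candidates.get)


def next_host(current: str, predicted: Dict[str, float],
              max_latency_ms: float) -> str:
    """Keep the current host while no degradation is predicted.

    A node missing from `predicted` (e.g. it left the pop) is treated
    as degraded, which triggers a (re)placement.
    """
    still_ok = predicted.get(current, float("inf")) <= max_latency_ms
    if current != CAR and still_ok:
        return current
    return choose_host(predicted, max_latency_ms)
```

For example, `next_host("edge-a", {"edge-a": 35.0, "edge-b": 9.0}, 20.0)` moves the module to `"edge-b"`, while the same call with both predictions above the threshold returns `"car"`.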
In this study, "latency" refers specifically to network latency, encompassing edge nodes and edge servers collectively referred to as "edge nodes" without differentiation.

D. Requirements
To fulfill the paper's objectives, three distinct requirements are defined:
• Requirement 1 (REQ1): The placement service must accommodate the dynamic nature of the network, addressing its evolving topology during runtime.
• Requirement 2 (REQ2): The placement strategy should consider the potential mobility of both the computing edge node (where services are hosted and executed) and the end-users (e.g., autonomous vehicles in the use case).
• Requirement 3 (REQ3): The placement system should exhibit adaptability and possess the capability to recover from potential degradations in prediction quality.

III. LITERATURE REVIEW
The literature extensively addresses service placement problems in edge computing, proposing various solutions. Ensuring continuous service access and optimal user performance presents a significant challenge, particularly when factoring in end-user and edge node mobility. This section provides an event-based overview and classification of the reviewed literature on SPP. Events stem from the mobility of end-users, edge nodes, and network dynamicity. The classification, summarized in Figure 2, is divided into two main categories: one focusing on end-user and edge node mobility management and the other addressing network dynamicity management. Two primary approaches are considered within the mobility management category: reactive (triggered by metric degradation) and proactive (predictive metric degradation). The section also compares the paper's contributions with directly related works in the reviewed literature.

A. Edge/User Mobility Management in an Edge Computing Context
Ensuring seamless mobility for both end-users and edge nodes and maintaining uninterrupted access to desired services and optimal performance presents a considerable challenge within edge computing. Frequent relocations of end-users or edge nodes can lead to significant delays or packet loss. Therefore, a robust system must possess the capability to seamlessly (re)place services from the previous nodes to the new ones, ensuring continuous service delivery. In the realm of mobility, the literature review categorizes approaches into two main types: (1) reactive techniques and (2) proactive techniques. Proactive techniques aim to predict the degradation of fixed QoS and QoE metrics and anticipate the mobility of end-users and edge nodes.
This review covers relevant works in both reactive and proactive approaches within the context of two edge computing implementations [29], Fog Computing (FC) and Multi-access Edge Computing (MEC). These implementations differ notably, with FC emphasizing medium context awareness, while MEC leverages high context awareness due to the proximity of devices functioning as intermediate edge nodes.
1) Reactive Mobility Management Approaches: Various works aspire to tackle the issue of end-user mobility by offering dynamic placement strategies. The study of [7] classifies dynamic service placement approaches in the fog context according to whether or not they support mobility. Regarding the techniques used to ensure service placement with a reactive approach, the works can be classified into three categories.
a. Works using Machine Learning for SPP: Authors in [28] propose an Edge Cognitive Computing (ECC) architecture that ensures ultra-low latency and high user experience. They use a Q-learning method to find the optimal node to host the service.
b. Works using Metaheuristic for SPP: Other works, such as [27] and [26], solve the SPP in an online learning mode using the Multi-armed Bandit (MAB) theory. However, they did not consider the time dependency of sequential decisions when migrating the task from one server to another. In [25], the authors express the SPP as a contextual MAB problem. They can decouple the time dependency using the contextual feature vector to describe user behavior information. Furthermore, they incorporate user preference into the placement decision-making to achieve better personalized service performance.
c. Works using other techniques for SPP: In [23], the authors aim to maximize the QoS for end-users by studying joint service placement and model scheduling of edge intelligence (EI) services. They formulate the problem as an integer linear program. In [22], the authors focus on minimizing computation time and energy consumption for end-users and edge nodes using a mixed-integer non-linear programming problem. They consider the CPU frequencies of end-users and edge nodes and the uplink bandwidth to reach an optimal service placement. They present analytical expressions that enable low-complexity calculation of the most efficient resource allocation decisions. Without requiring any future user mobility as prior knowledge, several works, such as [24], utilize Lyapunov optimization to address the SPP by treating it as an online queue stability control problem.

2) Proactive Mobility Management Approaches:
Other works propose to deal with end-user mobility by providing a proactive approach that predicts selected QoS metrics and executes service migration once a QoS degradation is predicted. However, understanding the movement behavior of end devices or edge nodes may be helpful for efficient service placement. Thus, several works predict user traffic to proactively migrate the service based on the predicted trajectory. The surveyed literature is classified according to the technique used.
a. [...] problem by developing traffic prediction tools that utilize statistical, rule-based, or deep machine learning methods. The authors apply user-traffic prediction to enable proactive resource management and optimize service placement for increased efficiency.
b. Works using Metaheuristic for SPP: Authors in [18] and [19] address the problem by proposing a two-timescale framework that jointly optimizes service placement and request scheduling under storage, communication, computation, and budget constraints. Data placement needs to be adapted over time to serve time-varying demands while considering system stability and operation cost. In [16], researchers propose a predictive service placement in Mobile Edge Computing based on user mobility. They study predictive service placement with a limited short-term prediction based on a T-slot algorithm that employs a frame-based design to predict user mobility. To achieve this, they utilize the two-timescale Lyapunov optimization method. Similar to [16], the authors in [17] suggest a service placement algorithm that integrates user mobility prediction using a frame-based design.
c. Works using statistical models for SPP: Authors in [11] propose a service migration and resource manager for Vehicular Edge Computing that ensures the necessary vehicle resources. Authors in [15] introduce an intelligent service migration and resource management algorithm in edge-enabled environments that considers the latency constraint. In both works, the ARIMA model is used for prediction.

B. Edge Dynamicity Without Mobility Management in an Edge Computing Context
This section addresses the dynamic nature of the edge infrastructure without accounting for the mobility pattern of edge nodes. The edge network exhibits high dynamism due to entities frequently joining and departing from the network. In [7], the authors attribute this dynamism to factors such as network link instability, failures, and resource capability variations. Consequently, the challenge lies in devising a placement strategy to install, replace, or remove services efficiently while meeting QoS constraints. In [14], the authors aim to minimize resource costs and fulfill QoS requirements through dynamic service provisioning in the fog infrastructure. In [13], a strategy is proposed to dynamically identify host nodes and manage unexpected changes in deployed component frequencies.

C. The Paper's Contributions
Due to the substantial mobility of certain end-users and edge nodes, solutions must ensure continuous service availability and uninterrupted request fulfillment. Services need the capability to migrate across diverse edge nodes in sync with the mobility of end-users/edge nodes. While a limited number of works have addressed the mobility of end-users/edge nodes in their solutions, it is worth noting that most migration approaches focus solely on end-users' mobility. The work presented in this paper aligns with the previously described approach, employing a proactive method rooted in statistical Exponential smoothing models to forecast latency-related QoS degradation.
Table I provides a condensed summary of literature references that adopt proactive mobility management strategies based on the requirements outlined in Section II-D. As a reminder, the requirement (REQ1) pertains to the dynamic evolution of the network topology, (REQ2) addresses the potential mobility of edge nodes and end-users, and (REQ3) focuses on the capacity to rebound from possible degradation in prediction quality.
References [16], [19], [20] do not address network dynamicity and mobility of end-users and edge nodes. Authors in [20] guarantee recovery explicitly. References [17], [18], [21] consider network dynamicity but do not account for the mobility of end-users and edge nodes. It is unclear whether they address recovery through degradation prediction. Authors in [11] and [15] tackle network dynamicity and guarantee recovery but do not address the mobility of end-users and edge nodes. This paper distinguishes itself by assuming network dynamicity, focusing on QoS latency degradation, and addressing the mobility of end-users and edge nodes, including recovery through degradation prediction.

IV. BACKGROUND AND FUNDAMENTAL CONCEPTS
This section briefly introduces the concept of time series, including its definition and core components, and presents a selection of prediction models employed for univariate time series forecasting.

A. Time Series
A set of recorded observations $x_t$, each corresponding to a specific time $t$, can be referred to as a time series (TS) [30]. A time series can be univariate or multivariate. It is called univariate when it represents a single set of collected observations ordered in time. In this study, a discrete univariate time series is considered.

B. Time Series Decomposition
According to [31] and [32], a time series may include a possible:
• Trend (fall or rise in the mean): $T_t$
• Seasonality (recurring cycle): $S_t$
• Remaining random Residual: $R_t$
Depending on its model, i.e., whether it is additive, multiplicative, or combined [33], the TS can be considered as follows:
• Additive: $x_t = T_t + S_t + R_t$
• Multiplicative: $x_t = T_t \times S_t \times R_t$
• Combined (one common form): $x_t = T_t \times S_t + R_t$
The way to extract a component from a given TS is called decomposition. This decomposition allows the separation of the TS data based on its core components for different purposes.
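As an illustration of the additive case, the following sketch separates a series into trend, seasonal, and residual parts using a centered moving average. It is a classical textbook procedure, not necessarily the decomposition used in this paper; the function names are hypothetical, and the seasonal period is assumed to be known and odd so that the window can be centered.

```python
from statistics import mean


def moving_average(x, window):
    """Centered moving average; None where the (odd) window does not fit."""
    half = window // 2
    out = [None] * len(x)
    for t in range(half, len(x) - half):
        out[t] = mean(x[t - half:t + half + 1])
    return out


def decompose_additive(x, period):
    """Return (trend, seasonal, residual) with x = T + S + R where T exists.

    Assumes an odd `period` and at least two full periods of data.
    """
    if period % 2 == 0:
        raise ValueError("this sketch only handles an odd period")
    trend = moving_average(x, period)
    detrended = [xi - ti if ti is not None else None
                 for xi, ti in zip(x, trend)]
    # Average the detrended values that share the same phase of the cycle.
    phase_means = []
    for k in range(period):
        vals = [d for i, d in enumerate(detrended)
                if d is not None and i % period == k]
        phase_means.append(mean(vals))
    offset = mean(phase_means)  # center the seasonal component on zero
    seasonal = [phase_means[i % period] - offset for i in range(len(x))]
    residual = [xi - ti - si if ti is not None else None
                for xi, ti, si in zip(x, trend, seasonal)]
    return trend, seasonal, residual
```

On a synthetic series built as a linear trend plus a zero-sum pattern of period 3, the sketch recovers the pattern as the seasonal component and leaves a residual of zero wherever the trend is defined.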

C. Prediction Models for Time Series
There are different prediction methods for univariate time series, for example, linear methods such as Exponential smoothing and ARIMA and its variants, and non-linear methods such as Markov Switching [34] or ARCH models.
Exponential smoothing models were chosen for their proven efficiency in online and short-term time series forecasting, as demonstrated in previous studies [35], [36], and for their simplicity [37]. ARIMA is used as the benchmark model due to its robust forecasting performance and low complexity [11].
1) Exponential Smoothing: Exponential smoothing methods use weighted averages of time series data, emphasizing recent observations. The statistical models behind these methods are state space models, referred to as ETS (Error, Trend, Seasonal), each featuring measurement and state equations for observed and unobserved components (e.g., level, trend, seasonal). Each model comes in two variations, with additive or multiplicative errors, which produce the same point forecasts when the smoothing parameter values match. This paper focuses on two methods: Simple Exponential Smoothing (SES) for stationary time series, i.e., a TS with time-independent statistical properties [38]; and Holt's Damped Linear Trend method [39], [40] for non-stationary data. Given the exclusive interest in the point forecast, the model with additive (A) errors is utilized, making 'E' in the adopted ETS models stand for 'A'.
a. Simple Exponential Smoothing model: A simple exponential smoothing model with additive errors, or ETS(A, N, N), can be written as [38]:

$$y_t = l_{t-1} + \varepsilon_t \tag{1}$$
$$l_t = l_{t-1} + \alpha \varepsilon_t \tag{2}$$

Equations (1) and (2) are referred to as the measurement equation and state equation, respectively, where $y_t$ is the current observation, $l_{t-1}$ is the previous series level, i.e., the predictable portion of $y_t$, $\varepsilon_t$ is the random error, $l_t$ is the adjusted level, and $\alpha$ is the smoothing parameter.

b. Damped Holt's linear model: A damped Holt's linear method with additive errors, or ETS(A, A_d, N), can be written as [38]:

$$y_t = l_{t-1} + \phi b_{t-1} + \varepsilon_t \tag{3}$$
$$l_t = l_{t-1} + \phi b_{t-1} + \alpha \varepsilon_t \tag{4}$$
$$b_t = \phi b_{t-1} + \beta \varepsilon_t \tag{5}$$

where $y_t$, $l_{t-1}$, $\varepsilon_t$, $l_t$, and $\alpha$ remain the same as in equations (1) and (2), $b_t$ estimates the trend at time $t$, $\beta$ is the trend's smoothing parameter, and $\phi$ is the damping parameter.

2) ARIMA: ARIMA models may represent different types of time series, e.g., exclusive autoregressive (AR), which tries to predict the future values of the TS based on its own $p$ past behavior, i.e., lagged values; exclusive moving average (MA), which tries to predict the future values of the TS based on its own $q$ past forecast errors; and mixed AR and MA (ARMA) series. The integrated part I of order $d$ represents the number, i.e., $d$, of the differences needed to make the TS stationary.
An ARIMA model of orders $p$, $d$, and $q$, i.e., ARIMA(p, d, q), can be written as [38]:

$$y'_t = c + \phi_1 y'_{t-1} + \dots + \phi_p y'_{t-p} + \theta_1 \varepsilon_{t-1} + \dots + \theta_q \varepsilon_{t-q} + \varepsilon_t \tag{6}$$

where $y'_t$ is the differenced series, $c$ is the intercept, $\phi_1, \dots, \phi_p$ are the autoregressive coefficients, and $\theta_1, \dots, \theta_q$ are the moving-average coefficients.
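To make the two ETS models of Section IV-C1 concrete, the sketch below runs their standard error-correction recursions, ETS(A, N, N) and ETS(A, A_d, N), in plain Python. It is illustrative only: the function names are hypothetical, the smoothing and damping parameters are passed in rather than estimated, and initializing the level with the first observation and the trend with zero is an assumption of this sketch.

```python
def ses_forecast(y, alpha, level0=None):
    """One-step-ahead point forecast of ETS(A, N, N)."""
    level = y[0] if level0 is None else level0
    for obs in y:
        error = obs - level            # eps_t = y_t - l_{t-1}
        level = level + alpha * error  # l_t = l_{t-1} + alpha * eps_t
    return level


def damped_holt_forecast(y, alpha, beta, phi,
                         level0=None, trend0=0.0, horizon=1):
    """h-step-ahead point forecast of ETS(A, A_d, N)."""
    level = y[0] if level0 is None else level0
    trend = trend0
    for obs in y:
        predicted = level + phi * trend      # l_{t-1} + phi * b_{t-1}
        error = obs - predicted              # eps_t
        level = predicted + alpha * error    # new level l_t
        trend = phi * trend + beta * error   # new trend b_t
    # Forecast: l_t + (phi + phi^2 + ... + phi^h) * b_t
    damp = sum(phi ** k for k in range(1, horizon + 1))
    return level + damp * trend
```

With `beta = 0` and a zero initial trend, the damped model reduces to simple exponential smoothing, which gives a quick sanity check of the recursions.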

V. MODEL DESIGN AND WORKFLOW OVERVIEW
The vehicle's migration engine requires knowledge of the appropriate edge node for migrating the decision module. For this purpose, a predictive approach is pursued, involving the creation of a prediction service. This service assesses edge nodes based on specific criteria, with latency being the QoS metric. The edge node with the lowest latency is selected for hosting the decision module. Subsequent sections detail the prediction service and various techniques and combinations to improve the forecasting process.

A. Workflow Overview
The objective is to forecast latency values based on historical data. This process, depicted in Figure 3, comprises two primary stages, each consisting of several phases.
The first stage, the Preparation stage, encompasses the initial four phases. Its purpose is to collect TS data and ensure their stationarity while identifying and addressing outliers.
The second stage, the Prediction stage, involves the remaining two phases. Its goal is to identify an appropriate model for predicting new latency values and assess the quality of these predictions. This entire process is repeated for each incoming latency value.
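The per-value repetition of the Prediction stage can be pictured with a small rolling loop. This is a simplified sketch rather than the paper's workflow: it leaves out the Preparation stage, uses simple exponential smoothing with a fixed `alpha`, and scores the forecasts with the mean absolute error, which is an assumed quality measure. The function name and the `warmup` parameter are hypothetical.

```python
def rolling_one_step(latencies, alpha=0.5, warmup=3):
    """Forecast each incoming value from the values seen before it.

    Returns the list of one-step-ahead forecasts and their mean
    absolute error (MAE) against the values that actually arrived.
    """
    forecasts, errors = [], []
    for t in range(warmup, len(latencies)):
        history = latencies[:t]
        # Refit on the history available so far (here: plain SES).
        level = history[0]
        for obs in history:
            level += alpha * (obs - level)
        forecasts.append(level)
        errors.append(abs(latencies[t] - level))
    mae = sum(errors) / len(errors) if errors else float("nan")
    return forecasts, mae
```

A sudden jump in the incoming latency shows up as a large one-step error, which is the kind of signal a degradation check could act on.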

B. Algorithms & Flowcharts
This section outlines the stages and phases presented in the workflow overview of Figure 3, delving into the algorithms and approaches used in each.
The augmented Dickey-Fuller (ADF) unit-root test is utilized to verify the TS stationarity, ensuring that its statistical properties remain constant over time.
Under the assumption of a normal distribution, if the TS is stationary, the ETS(A, N, N) model is used; otherwise, the ETS(A, A_d, N) model is chosen.
c. Outliers' detection: An outlier, as explained in [41], is an observation that significantly deviates from others, suggesting a distinct mechanism might have generated it. This deviation, as discussed in [38], implies that the observation has an extreme value compared to other homogeneous observations in the time series (TS) [42].
Outliers can represent anomalies, errors, or rare events, and their nature can be assessed by experts in some cases. This work utilizes automated outlier detection, testing two scopes and techniques to identify the most suitable approach.
i. Outliers' detection scope: The following subsections describe two utilized scopes: detecting outliers across all time series or within their residual components.
• All the time series: This scope consists of detecting the outliers of the given time series without a decomposition phase, i.e., considering the TS as a whole.
• Boxplot technique: The Boxplot is a statistical and data analysis tool used to study distribution characteristics and detect outliers [42]. Although the Boxplot is a "visual" technique, it can be computed, and its results can be retrieved automatically.
• Moving mean technique: The moving mean, or rolling mean, technique, with an adapted 3σ rule criterion, also called "three standard deviations from the mean" [43], [44], can be used to detect outliers. It considers two thresholds: a minimum and a maximum. A value outside those thresholds is considered an outlier. The minimum threshold thr_min and the maximum threshold thr_max are computed by equations (7) and (8), respectively. The parameter p refers to the width of the moving mean. After some experimentation, p is fixed to 2.
If at least one outlier is detected, the process moves to the outliers processing phase; otherwise, it moves directly to the prediction phase.
d. Outliers processing: When outliers are detected, they are either removed or replaced by the median.
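The two detection techniques and the two processing policies can be illustrated with a short, self-contained sketch. It is not the paper's implementation: the Boxplot fences use the conventional 1.5 × IQR whiskers, and the moving mean thresholds are one possible reading of the adapted 3σ rule, namely the mean of the previous p values plus or minus k times the standard deviation of the whole series. The function names, the `whisker` and `k` parameters, and the sample latencies are assumptions made for illustration.

```python
from statistics import mean, median, pstdev, quantiles


def boxplot_outliers(series, whisker=1.5):
    """Indices of values outside the fences [Q1 - w*IQR, Q3 + w*IQR]."""
    q1, _, q3 = quantiles(series, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - whisker * iqr, q3 + whisker * iqr
    return [i for i, y in enumerate(series) if y < low or y > high]


def moving_mean_outliers(series, p=2, k=3.0):
    """Indices of values farther than k global standard deviations
    from the mean of the previous p values (assumed threshold form)."""
    sigma = pstdev(series)
    flagged = []
    for t in range(p, len(series)):
        window_mean = mean(series[t - p:t])
        thr_min = window_mean - k * sigma
        thr_max = window_mean + k * sigma
        if series[t] < thr_min or series[t] > thr_max:
            flagged.append(t)
    return flagged


def process_outliers(series, indices, policy="replace"):
    """Remove the flagged values, or replace them by the series median."""
    flagged = set(indices)
    if policy == "remove":
        return [y for i, y in enumerate(series) if i not in flagged]
    med = median(series)
    return [med if i in flagged else y for i, y in enumerate(series)]


if __name__ == "__main__":
    # Hypothetical latencies in milliseconds with one obvious spike.
    latencies = [10, 11, 10, 12, 11, 95, 10, 11, 12, 10]
    found = moving_mean_outliers(latencies, p=2)
    print(found, process_outliers(latencies, found, policy="replace"))
```

Removing a value shortens the series and shifts later timestamps, whereas replacing it with the median preserves the sampling grid; this is one practical difference between the two policies.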

VI. MODEL IMPLEMENTATION AND EVALUATION
This section builds upon the techniques discussed in the previous section regarding the prediction service. It presents and discusses the results of the tested prediction configurations, including various outlier detection and processing methods.

A. Configuration-Based Prediction Setup
The simulation scenarios ran on a single machine equipped with an Intel Core i5-4210U CPU @ 1.70 GHz 2.40 GHz processor, 12 GB RAM, and Windows 10 (64-bit). A series of experiments was conducted using Python, the statsmodels library [45], which supports ETS(A, N, N) and ETS(A, A_d, N), and Jupyter Notebook as the simulation environment.
The dataset under consideration was generated in a prior study involving an operational autonomous car prototype introduced in [12]. The dataset comprises 10,000 rows, with a 1-second interval between consecutive measurements. Each row provides information on the latency between the car and a neighboring edge node.
The experiments conducted on the dataset involved altering the TS size and the scopes, techniques, and policies used for detecting and processing outliers. Each distinct variation is referred to as a "configuration." The most relevant configurations are listed in Table II. For example, in the first configuration, C1, the outliers' detection is made directly on the TS, i.e., without decomposing the TS, using the Moving Mean technique and with a replacement policy. The prediction is made considering a sliding window size.
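To make the notion of a configuration concrete, the following sketch records its parameters in a small data class and produces one-step-ahead forecasts over either a sliding window or the whole history. The `Configuration` fields, the window length of 2, the fixed `alpha`, and the use of simple exponential smoothing as the forecaster are illustrative assumptions, not values taken from Table II.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Configuration:
    """Hypothetical record of one experimental configuration."""
    name: str
    scope: str               # "series" or "residuals"
    technique: str           # "boxplot" or "moving_mean"
    policy: str              # "replace" or "remove"
    window: Optional[int]    # sliding window length; None = whole history


def ses_forecast(history, alpha=0.5):
    """Simple exponential smoothing forecast of the next value."""
    level = history[0]
    for y in history[1:]:
        level = alpha * y + (1 - alpha) * level
    return level


def rolling_forecasts(series, config, alpha=0.5):
    """Forecast series[t] from the values preceding it, for t = 1..n-1.

    Only config.window is used here; the other fields just document
    which outlier handling would be applied before forecasting.
    """
    predictions = []
    for t in range(1, len(series)):
        start = 0 if config.window is None else max(0, t - config.window)
        predictions.append(ses_forecast(series[start:t], alpha))
    return predictions


if __name__ == "__main__":
    c_sliding = Configuration("sliding", "series", "moving_mean", "replace", 2)
    c_full = Configuration("full", "series", "moving_mean", "replace", None)
    data = [10.0, 12.0, 11.0, 13.0]
    print(rolling_forecasts(data, c_sliding))
    print(rolling_forecasts(data, c_full))
```

A sliding window bounds the work done per incoming value, which matters when the forecast is recomputed for every new latency measurement.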
QoS latency predictions are conducted with the different configurations, and the results are compared to identify the most suitable one for the case study. The following subsections present the findings.

B. Configuration-Based Prediction Results and Discussion
This subsection presents the results for execution time and evaluation metrics of the configuration-based prediction. It includes comparisons of selected configurations, where one parameter is altered at a time while keeping the others fixed. The subsection also provides an interpretation of the results.
the fluctuations in the real data, particularly for C2. Regarding evaluation metrics, C3 has slightly lower values than C2. However, C2 demonstrates an advantage in computational efficiency, with a lower average processing time than C3.
Considering the trade-off between accuracy and efficiency in a real-time context, C2 is slightly preferred over C3, making the sliding window size the recommended option.
ii. Outliers' detection scope: In this scenario, a comparison is made between two configurations: C4, which operates on the whole time series, and C5, which uses the residual part of the time series. Figure 5 presents prediction plots for C4 (in green) and C5 (in pink), along with the actual data (in blue). Both plots capture the fluctuations in the real data, with C5 following them more closely. Regarding evaluation metrics, C5 has slightly lower error values than C4. However, C4 has an advantage in computational efficiency, with a lower average processing time than C5. In this context, C4 is preferable to C5, so time series decomposition is not the recommended option.
iii. Outliers' detection technique: In this scenario, a comparison is made between two configurations: C1, employing the Boxplot technique, and C2, using the moving mean technique. Figure 6 displays prediction plots for C1 (in green) and C2 (in pink) alongside the actual data (in blue). Both plots replicate the fluctuations in the real data, with C1 following them more closely. This observation is supported by the RMSE values, where C1's RMSE is slightly lower than that of C2. Regarding computational efficiency, C2 exhibits a slight advantage, with a lower average processing time than C1. Considering the balance between accuracy and speed, C2 emerges as a slightly better choice than C1, making the moving mean technique the preferred option.
iv. Outliers' processing operation: In this scenario, a comparison is made between two configurations: C3, which replaces outliers with the median value, and C4, which removes outliers. Figure 7 presents prediction plots for C3 (in green) and C4 (in pink) alongside the actual data (in blue). Both plots replicate the fluctuations in the real data equally well. Regarding evaluation metrics, C3 and C4 have identical values. However, C4 exhibits an advantage in computational efficiency, with a lower average processing time than C3. C4 is therefore preferred over C3, making the removal of the outliers the recommended option.
2) Discussion: In the applied use case and based on this experiment, an effective latency prediction model, coupled with outlier detection and processing, would use the sliding window size. Outliers would be detected without time series decomposition, using the moving mean technique for detection and removal as the processing policy.

VII. PROOF-OF-CONCEPT
This section introduces the prototype validating the paper's contributions, namely the proposition of a QoS degradation prediction model (CONTR1) and a proactive decision module placement based on QoS degradation prediction (CONTR2).

A. Implementation Details
A global overview of the prototype architecture is depicted in Figure 8.
The configuration of the prototype involves:
• a Raspberry SunFounder PiCar-V representing the autonomous car,
• a Raspberry Pi 3 Model B (Quad Core CPU 1.2 GHz, 1 GB RAM) and a Raspberry Pi 2 Model B (Quad Core CPU 1.2 GHz, 1 GB RAM) acting as edge computing nodes.
The car and the edge nodes communicate through Wi-Fi. Flask, Postman, Putty, Docker, and DockerHub are employed to develop and test Dockerized Flask Web applications representing the car's prediction, decision, and perception modules. The code related to these modules can be found on GitHub. The prediction module uses the statsmodels [45] and pmdarima [46] libraries, which support ETS(A, N, N) and ETS(A, A_d, N), and ARIMA, respectively.

B. Experiment Results and Lessons Learned
This subsection presents and explains the experimental results contributing to the lessons learned.
Both models perform similarly in terms of accuracy, and both behave better when the outlier detection and processing phases are included in the prediction process. However, in terms of execution time, the ETS models demonstrate superior performance.
b. Proactive decision module based on QoS degradation prediction: Based on the QoS latency degradation prediction model discussed in Section VI and the preceding subsection, the proactive decision module initially selects an appropriate edge node to host the car's decision module. Upon detecting a predicted latency degradation, it re-places the decision module on an alternate edge node. If no alternate edge node is available, the module shifts operation to the car itself, ensuring continuous and uninterrupted functionality. Within the framework outlined in Section VII-A, the Raspberry SunFounder PiCar-V collects latencies from the two Raspberry Pi units while in motion, using the ping command. A prediction is generated for each unit, and the one with the lowest latency is chosen. For simplicity, the decision module is deployed on all nodes from the start of the experiment. When an edge node is chosen, its associated decision module is accessed as necessary. Latency predictions are computed every 20 seconds to detect degradation; if degradation is identified, the decision module on the car is activated. The process of identifying a suitable edge node is then repeated.
2) Lessons Learned: The ETS models demonstrated superior overall efficiency compared to the ARIMA model, and predictions improved when the outlier detection and processing phases were included.
In this hypothetical use case, placing the proactive decision module in the chosen edge node enables the detection of QoS degradation beforehand, ensuring secure and continuous car operation.This proactive approach, detailed in Section II, enables the prediction of QoS degradation.
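The proactive (re)placement logic described for the prototype can be sketched as a small decision function: keep the current host while its predicted latency is acceptable, otherwise move to the acceptable edge node with the lowest predicted latency, and fall back to the car when none qualifies. The degradation threshold `max_latency_ms`, the node names, and the function names are hypothetical, since the text does not specify how degradation is quantified.

```python
def select_edge_node(predicted_latency):
    """Return the node with the lowest predicted latency, or None if empty."""
    if not predicted_latency:
        return None
    return min(predicted_latency, key=predicted_latency.get)


def place_decision_module(current_host, predicted_latency, max_latency_ms):
    """Decide where the decision module should run until the next check.

    predicted_latency maps node name -> predicted latency in ms.
    max_latency_ms is an assumed degradation threshold.
    Returns a node name, or "car" when no edge node is acceptable.
    """
    acceptable = {
        node: latency
        for node, latency in predicted_latency.items()
        if latency <= max_latency_ms
    }
    if current_host in acceptable:
        return current_host  # no predicted degradation: stay in place
    candidate = select_edge_node(acceptable)
    return candidate if candidate is not None else "car"


if __name__ == "__main__":
    # Hypothetical predictions for two edge nodes, checked periodically.
    host = place_decision_module(None, {"pi3": 12.0, "pi2": 8.0}, 50.0)
    print(host)
    host = place_decision_module(host, {"pi3": 12.0, "pi2": 80.0}, 50.0)
    print(host)
    host = place_decision_module(host, {"pi3": 90.0, "pi2": 80.0}, 50.0)
    print(host)
```

In the prototype this check would run on each prediction cycle (every 20 seconds in the experiment), with the dictionary filled from the per-node latency forecasts.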
Regarding the support of the requirements outlined in Section II-D, the paper's contributions, defined in the introduction (Section I), meet these requirements:
• (REQ1) and (REQ2) are addressed by using a dataset containing latencies between a moving car and edge nodes in the simulation (Section VI-A) and in the prototype (Section II).
• (REQ3) is addressed by (CONTR1) and (CONTR2), as the system is designed for recovery.

VIII. CONCLUSION AND FUTURE WORK
This paper proposes a proactive service (re)placement approach based on the prediction of QoS degradation. The degradation prediction relies on Exponential Smoothing methods to determine a suitable edge node for placing the car's decision module, with the selection based on minimizing latency. Multiple outlier detection configurations were evaluated.
A proof-of-concept validates the chosen model, by comparing it to ARIMA, and the proposed proactive service (re)placement approach. This approach ensures the continuity of the placed module, suggesting the feasibility of locating non-critical modules on edge nodes.
The next step involves comparing Exponential Smoothing models, a deep learning model, and a hybrid model (Exponential Smoothing + Deep Learning) for improved prediction. Subsequently, additional criteria for QoS degradation prediction, such as the accessibility of resources (CPU/GPU capacity), availability, and the battery level of edge nodes, will be considered.

Manuscript received 14 December 2023; revised 22 February 2024; accepted 23 February 2024. Date of publication 11 March 2024; date of current version 21 August 2024. The associate editor coordinating the review of this article and approving it for publication was B. Martini. (Corresponding author: Henda Sfaxi.)

Fig. 1. Autonomous car riding and searching for edge nodes.
a. Works using Machine learning for SPP: Authors in [21] propose a framework based on Mobility-Aware Deep Reinforcement Learning to handle end-users' mobility and consider the latency constraint. By adapting Machine learning techniques, reference [20] tackles the

• Time series' residuals: This scope involves outlier detection based on the residual component of the time series. The time series is decomposed into its core components, as detailed in Section IV-B, focusing on the residual component R_t for outlier detection. The choice of decomposition depends on the nature of the time series, which can be either additive or multiplicative. This paper assumes that the time series follows either an additive or a multiplicative pattern, and experiments are conducted to identify this pattern computationally.
ii. Outliers' detection technique: For both scopes, two techniques are used to detect outliers: the Boxplot technique and the moving mean, or rolling mean, technique. These techniques are detailed in the following subsections.
The predicted latency value is compared to the real value, and a set of evaluation criteria, i.e., Mean Absolute Error (MAE), Mean Square Error (MSE), and Root Mean Square Error (RMSE), are processed based on these values.These metrics help select the most suitable approach for the paper's use case.
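For reference, the three evaluation criteria can be computed from paired actual and predicted latencies as in the short sketch below. The function names and the sample values are illustrative assumptions.

```python
from math import sqrt


def mae(actual, predicted):
    """Mean Absolute Error: average of |actual - predicted|."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mse(actual, predicted):
    """Mean Square Error: average of (actual - predicted)^2."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root Mean Square Error: square root of the MSE."""
    return sqrt(mse(actual, predicted))


if __name__ == "__main__":
    # Hypothetical latencies in milliseconds.
    real = [10.0, 12.0, 14.0]
    forecast = [11.0, 12.0, 12.0]
    print(mae(real, forecast), mse(real, forecast), rmse(real, forecast))
```

Because MSE and RMSE square the errors, they penalize large deviations, such as a missed latency spike, more heavily than MAE does.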
Using the dataset from the preceding Section IV-A, the prediction module, employing ETS models, selects a suitable edge node, with the ARIMA model serving as a benchmark. Experiments validate the chosen model and the outliers' detection and processing phases. Two sets of predictions are made: one includes detecting and processing outliers before the prediction process, and the other omits this step. The comparison is based on accuracy and execution time. Table V shows the RMSE, MSE, and MAE for latency predictions from the ETS and ARIMA models, with and without the outlier detection and processing phases. Values are rounded to four decimal places.

TABLE V. SUMMARY OF RESULTS OF VALIDATION CRITERIA FOR ETS AND ARIMA MODELS
TABLE VI. ETS AND ARIMA EXECUTION TIME STATISTICS

Table VI provides execution time statistics for the models.