Next-Generation Edge Computing Assisted Autonomous Driving Based Artificial Intelligence Algorithms

Edge Computing and Network Function Virtualization (NFV) concepts can improve network processing and multi-resource allocation when intelligent optimization algorithms are deployed. Multiservice offloading and allocation approaches pose interesting challenges in current and next-generation vehicle networks. State-of-the-art optimization approaches still formulate exact algorithms and tune approximation methods to obtain acceptable solutions. These approaches are data-centric and use heterogeneous data inputs to find near-optimal solutions. In the context of connected and autonomous vehicles (CAVs), these techniques exhibit exponential computation time and handle only small and medium scale networks. Therefore, we are motivated to use recent Deep Reinforcement Learning (DRL) techniques to learn the behavior of exact optimization algorithms while enhancing the Quality of Service (QoS) of network operators and satisfying the requirements of next-generation Autonomous Vehicles (AVs). DRL algorithms can improve AV service offloading and optimize edge resources. An Optimal Virtual Edge Autopilot Placement (OVEAP) algorithm is proposed using Integer Linear Programming (ILP). Moreover, an autopilot placement protocol is presented to support the algorithm. Optimal allocation and Virtual Network Function (VNF) placement and chaining of the autopilot are designed based on several new constraints such as computing and networking loads, network edge infrastructure, and placement cost. Further, a DRL approach is formulated to deal with dense Internet of Autonomous Vehicle (IoAV) networks. Extensive simulations and evaluations are carried out. Results show that the proposed allocation strategies outperform state-of-the-art solutions in terms of Total Edge Servers Utilization, Total Edge Servers Allocation Time, and Successfully Allocated autopilots.


I. INTRODUCTION
Autonomous Vehicles (AVs) [1]-[3] target the deployment of service chains including sensing, computer vision, localization, High Definition (HD) map building, path planning, and control. AV sensors include ultrasonic sensors, cameras, radar, LiDAR, and Dedicated Short-Range Communications (DSRC) devices. Moreover, AV sensors generate heterogeneous data and need intelligent data fusion methods and machine learning techniques, such as object classification, inference, and artificial intelligence models, that ease the autonomous driving process. In recent AV sensor deployments, vehicles use a wide variety of sensing capabilities. On average, a vehicle has seventy sensors, including ambient light sensors, accelerometers, gyroscopes, and moisture sensors. The combination of Internet of Autonomous Vehicle (IoAV) sensors with different situational awareness, failure, and real-time response requirements determines the complexity of AVs, which therefore need comprehensive software. Vehicles already have a number of functionalities for combining real-time sensing with perception and decision-making. A self-driving process needs to solve complex AV issues through sensing, real-time object detection, classification, and segmentation [4]. The key issues in autonomous driving include safety, low latency [5], [6], high data rate, software accuracy, map completeness and correctness, and augmented sensor fusion.
The associate editor coordinating the review of this manuscript and approving it for publication was Venkateshkumar M.
VOLUME 10, 2022. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
Generally speaking, an AV gathers a huge amount of raw data to perceive the world and incorporates data from other sensors, lasers, and radar to get a richer understanding of the environment. Then, it performs localization to determine its precise location. Path planning and control modules then chart a course to the destination and make appropriate driving decisions such as steering, accelerating, and braking.
Current Internet of Things (IoT) platforms for AVs do not enable low-latency, real-time data processing, and therefore require offloading data processing to nearby edge computing servers. In general, AVs connect to the edge platforms using 5G cellular networks to access real-time data analytics for the required applications [7]. Edge servers can collaborate with other edge nodes in the vicinity, thereby creating a local distributed peer-to-peer (P2P) network beneath the cloud. Recently, edge/cloud computing servers have hosted AV service chains and provided feedback and decisions to the physical infrastructure.
In this paper, we contribute by virtualizing the main autopilot functions using Network Function Virtualization/Software Defined Networking (NFV/SDN) techniques. This step generates virtualized instances that can be deployed as a service in the network operator edges.
Moreover, despite the importance of the optimization task in virtualized architectures, such functions are missing in the global network architecture. Therefore, in this paper, we contribute an exact Optimal Virtual Edge Autopilot Placement (OVEAP) algorithm that decides the optimal point of operations where the autopilot function should be hosted. Then, we propose a Deep Reinforcement Learning (DRL) based Virtual Edge Autopilot Placement (DVEAP) algorithm to deal with large scale networks. These tools help network operators manage their resources and collaborate efficiently with the Original Equipment Manufacturer (OEM).

A. PROBLEM STATEMENT
The main problem considered in this paper is offloading for AVs: autopilot functions such as computer vision, perception, localization, and planning are offloaded to nearby edge computing servers operated by a 5G network operator. The problem supports partial and/or full offloading, where the AV can offload one or multiple autopilot functions. It is stated as follows: maximize the number of successfully placed autopilot VNFs while minimizing the number of edge servers used, decide where to place the offloaded autopilot functions at the distributed network edge, and guarantee an optimal quality of driving in terms of edge server utilization and allocation, successfully allocated autopilots, average network delay, service time, and processing time.
We consider different types of constraints in our virtualization and offloading process. We introduce CPU, GPU, storage, RAM, Bandwidth, VNF chaining (without decomposition), minimum edge servers, and 5G link constraints. The proposed constraints are divided into different types: system type constraints, network type constraints, 5G link capacity, and Quality of Service/Quality of Experience (QoS/QoE) constraints. Moreover, edge type constraints include maximum number of simultaneous connections per server.
Recall that none of the previously described contributions considered the virtualization of a vehicle autopilot, nor QoS/QoE. OVEAP optimization maximizes the total number of offloaded autopilots for AVs connected over the 5G New Radio (NR) protocol, while minimizing the cost of computing resources (i.e., active Mobile Edge Computing (MEC) servers). The main goal of our study is to propose optimization algorithms for edge-assisted autonomous vehicles, together with the underlying Artificial Intelligence (AI) mechanisms, to optimize edge computing resources and vehicle networks.

B. MAIN CONTRIBUTIONS
The main contributions in this paper are as follows:
The rest of the paper is organized as follows. Section II reviews the background and related work in the field of edge-assisted autonomous driving architectures, communication protocols, and optimization algorithms. Section III describes our proposed Advanced Autonomous Driving (A2D) communication protocol. Section IV introduces the proposed mathematical programming approach for A2D. Section V enhances the optimization module with an Edge based Deep Reinforcement Learning (EDRL) approach. Section VI evaluates the proposed approaches, and Section VII concludes the work with some research directions.

II. RELATED WORK BACKGROUND
Different techniques, standards, and norms have been adopted by AI, 5G, V2X, IoV, and Edge/Cloud computing in the area of edge-assisted autonomous driving. Recent applications include the use of Artificial Intelligence (AI) techniques and Deep Learning (DL) models in self-driving cars, autonomous driving services management, edge computing assistance, and online optimization (e.g., Tesla AI [8], DeepRM, DeepMind, Deep Traffic). Popular Reinforcement Learning (RL) methods include Trust Region Policy Optimization (TRPO), Policy Gradient (PG), and Q-Learning (QL). More importantly, the rapid development of the Internet of Vehicle (IoV) is also making autonomous driving a reality [9]. This section groups the work in the AV domain into two categories as follows:

A. AUTONOMOUS VEHICLES AND EDGE COMPUTING CONVERGENCE
In [10], the authors present the state-of-the-art approaches that leverage the edge-computing paradigm in the autonomous driving field. However, the survey omits a discussion of current edge AI work and optimal resource allocation design.
Datta et al. [11] formulate some research and engineering challenges for developing a cloud-based environment for connected car services. The test-bed services run in a virtualized environment and are deployed as microservices. The test bed leverages edge servers for vehicular data annotation and local processing with actuation. The work does not cover the use case of autonomous vehicles. Further, the authors do not take into account recent communication technologies that might enhance the overall performance metrics such as end-to-end delay, latency, and safety.
Liu et al. [12] set up an end-to-end prototype, which supports Wi-Fi, LTE, and DSRC communication technologies. The authors evaluate the performance in terms of network latency, power dissipation, and system utilization. They propose different communication and networking schemes that connect On Board Units (OBU) nodes to network gateways. Moreover, they implement communication prototypes using ROS messages. However, the prototype does not include the communication between OBUs, gateways (RSUs), and edge servers.
In [13], the authors design and propose a low-power edge computing system for real-time autonomous robot and vehicle services. They propose an offloading strategy that decides when and where to offload autonomous driving tasks. Their work focuses only on minimizing the power consumption of the edge platform, whereas we consider additional relevant metrics such as server utilization, multi-resource utilization, QoS, and safety.
In [14], the authors address the issue of how to process large data volumes and still meet the objectives of connected and autonomous vehicle driving. They propose the introduction of edge and fog computing nodes as an assistance layer of processing. Further, they raise the problem of how to process large data volumes as quickly as possible. For this purpose, they propose moving machine learning models and functions to where data is generated, such as camera devices, rather than where it is collected. Still, the authors do not take into consideration virtualized architectures at the network edge that can add flexibility, programmability, and control of the compute-intensive embedded-autopilot modules.
In [15], the authors propose an end-to-end machine learning algorithm for the entire autonomous driving procedure. The approach uses deep neural network models to map collected IoAV sensory inputs, such as front-facing camera images, directly to driving actions such as the steering angle. The work is of practical interest; however, it lacks a virtualized edge computing facility that helps in machine learning training and IoAV resource allocation.
In [16], the authors propose an autonomous vehicle world model that represents the vehicle's view of its road environment. The model takes as input the heterogeneous information gathered from in-vehicle sensors, V2X communication, and a priori information (e.g., roads and intersections information) and outputs real-time events that trigger and feed the decision-making module. They have used Cyclab [17], an open-source 3D simulator, to simulate the proposed framework. The work benefits from neither edge computing nor AI techniques. Moreover, the authors used a static world model that does not take into consideration navigation safety, communication latency, and performance indicators to evaluate the world model.
In [18], the authors propose a three-tier architecture for the Vehicular Edge Computing (VEC) domain. It supports a high level of scalability, real-time data delivery, and mobility. The authors leverage SDN and NFV virtualization techniques to add more flexibility, control, and a global view of the moving vehicles. The vehicles use V2X communication technologies such as DSRC, LTE, and 5G to reach either the edge servers or the centralized cloud. The authors raise serious VEC technical issues and challenges such as latency, scheduling, load balancing, offloading, resource management, and security/privacy, and survey the main challenges and opportunities when vehicle networks meet edge computing. However, the work lacks an overall architecture that defines the most appropriate network modules and edge techniques. Moreover, it does not consider V2X communication protocols in the presence of the NFV/SDN paradigms, nor optimization techniques related to the network/edge convergence according to safety indicators when designing the allocation loop of autonomous vehicles.
In [19], the authors present an edge-cloud computing model for autonomous vehicles using the open-source software platform Autoware [20]. They report that their proposed edge-cloud computing model for Autoware-based autonomous vehicles reduces the execution time and the total number of deadline misses. Among the main missing modules in their platform, the work considers neither in-vehicle computing resource management nor Vehicle to Edge (V2E) communications.
In [21], the authors propose Surrogate, an edge architecture for self-driving cars based on OpenStack and ETSI open-source MANO. It aims at virtualizing the in-vehicle OBUs at the distributed edge platform and managing Multi-Access Edge Computing (MEC) layers that process real-time vehicle requests. The work lacks optimal virtual OBU (vOBU) management and orchestration algorithms at the virtualized edge surrogates. Moreover, the vOBU manager module needs to take into account solver instances related to the IoAV network scale and driving conditions.
In [22], the authors describe how to build a self-driving car, applying AI and ML techniques to train and test until the car drives safely. They collaborate with Waymo, whose cars have logged 4 million miles of driving and 2.5 billion simulated miles. Then, AI/ML models are fed with the gathered data for training and knowledge extraction. The work lacks edge computing assistance for efficient computing and scalable processing. We believe that the work is interesting and that autopilots can benefit from these results.
In [23], the authors studied the problem of V2X service placement. They proposed an ILP technique to decide where to offload services, taking into consideration the limited computing resources, available at the network edges. They introduce decision variables to indicate the optimal service placement. Then, they proposed a greedy approach to deal with large network scales. The main objective of the proposed approaches is to minimize the overall delay experienced by vehicles. The authors do not take into consideration the AV requirement in terms of reliability, latency, and data rate. Moreover, ILP formulations need heuristics to deal with large autonomous vehicle network scales. In our contribution, we virtualize the entire autopilot function while ensuring V2X services constraints.
In [24], the authors implement a unified autonomous driving cloud infrastructure that supports heterogeneous applications. It can efficiently gather huge amounts of raw data and perform distributed simulation in order to stress offloading algorithms. It can also perform offline DL model training and augmented HD map generation. The authors raise the issue of deploying heterogeneous applications (i.e., simulations, offline DL model training, HD map generation) that need different infrastructures and have orthogonal requirements. Indeed, deployed applications share all gathered data as inputs while meeting the required storage cost. Therefore, they propose a unified cloud infrastructure that supports heterogeneous AV applications. They introduce some design considerations in implementing the unified infrastructure, including Spark RDD, Alluxio, and heterogeneous computing substrates. Although the work is relevant, the authors do not take into consideration AV service placement, orchestration, and management, which represent key modules in edge intelligence over virtualized AI-IoT architectures.
In [25], the authors propose a cloud-based self-driving car that addresses in-vehicle data storage issues. They propose to free autonomous vehicles from all data and download everything from the cloud as needed for the trip. Their solution frees vehicles from raw data and relies on a centralized cloud infrastructure for the drive. The authors assume persistent network connectivity to the cloud and sufficient in-vehicle storage to back up the data in the case of limited network availability. The proposed cloud infrastructure is not clearly specified and needs to integrate scheduling algorithms that allocate gathered data to CPU cores and servers. Moreover, it lacks distributed edge computing servers that efficiently process sensitive application data.
In [26], the authors propose Carcel, a cloud-assisted system for autonomous driving. The cloud platform has access to data from AV sensors and the roadside infrastructure environment. It assists autonomous vehicles in detecting and avoiding obstacles, such as pedestrians and other vehicles, that may not be directly detected by the AV sensors. It also helps autonomous vehicles plan efficient paths. Then, the authors introduce a cloud-based planner module along with request, sender, and receiver modules. They implement the planner module within the cloud using the Robot Operating System (ROS). The cloud-assisted system tracks request messages from the cloud and accordingly transmits the sensor information to the cloud in the form of UDP packets. Their proposed metrics are the response time and the distance to the pedestrian. We believe that the work is of practical interest; however, it lacks virtualization techniques and a VNF manager module that ease the allocation procedure of the autopilot chain on the cloud. Moreover, the edge/fog facility is missing from the overall architecture.
In [27], the authors explore a distributed computing architecture that addresses the on-vehicle and off-vehicle computation needed to support connected and autonomous driving. They suggest local/edge computation rather than offloading to cloud servers in order to reduce the end-to-end latency. We believe that AI approaches may ease edge resource management to satisfy Connected Autonomous Vehicle (CAV) requirements in terms of safety, latency, and bandwidth.
Different efforts have been proposed for Vehicular Edge Computing, where the edge servers provide different services according to the application. For instance, Ye et al. [28] proposed a service offloading framework in a fog computing environment, where the vehicles are used as mobile fog servers to provide services to connected end-users and also execute offloaded tasks from roadside cloudlets. A Genetic Algorithm (GA) is used as an allocation strategy to achieve task offloading with the least cost. The proposed strategy achieves good results in terms of minimizing the data transmission cost. However, the impact of obstacles and congestion on communication is not studied in this work. Moreover, we are motivated to use fog and edge computing capabilities to support the vehicle's self-driving itself.

B. ARTIFICIAL INTELLIGENCE AND EDGE COMPUTING CONVERGENCE
In [29], the authors introduce the paradigm of edge intelligence, which describes the convergence of Edge Computing (EC) and Artificial Intelligence (AI). They categorize the use of Machine Learning (ML) on the wireless edge into three parts: resource management, networking, and localization. In [30], the authors study the convergence of edge computing and deep learning techniques. In [31], the authors propose the use of deep learning for the Internet of Things (IoT) with Edge Computing. In [32], the authors use artificial intelligence methods in recent 5G wireless networking scenarios. In [33], the authors explore the role of Artificial Intelligence (AI), Machine Learning (ML), and Deep Reinforcement Learning (DRL) in the evolution of smart cities. In [34], the authors introduce AI as a Service (AIaaS) on Software-Defined Infrastructure (SDI). In [35], the authors propose intelligent robust routing using artificial intelligence approaches. In [36], the authors integrate AI modules in the Network Simulator (NS) to simulate real environments and agent spaces. Recent studies [37], [38] focus on clarifying the convergence of AI, edge computing, DL, and network telecommunication with respect to 5G standards and the 6G vision. Besides, a huge effort is devoted to ensuring edge intelligence convergence through universal virtualized architectures and AI techniques [39], [40].
Literature lacks intelligent AI/DRL techniques that may improve the edge computing resources management in the autonomous driving context. Table 1 summarizes the state-of-the-art approaches in the autonomous vehicle field using edge-computing facility.
Hereafter, we propose our autopilot protocol along with a reference architecture. Then, exact and artificial intelligence based optimization algorithms for compute intensive autonomous driving services allocations are proposed according to a reference architecture.

III. PROPOSED AUTOPILOT PROTOCOL

A. SYSTEM DESIGN
Distributed edge computing enables the multiplexing of heterogeneous and virtualized networks over different AI-IoT architectures corresponding to the isolated tenants and domains to satisfy different application requirements.
The physical infrastructure in next generation networking architectures consists of isolated logical networks, including IoT, IoV, and IoAV which is the topic of this paper. It is worth mentioning that Unmanned Aerial Vehicles (UAV), Cloud-Radio Access Network (C-RAN), 5G New Radio (NR) [46], and SDN/NFV customers such as vCDN operators [47], Mobile Virtual Network Operator (MVNO) [48] represent also potential end-users in our proposed AI-driven networking architecture.

B. PROPOSED ADVANCED AUTONOMOUS DRIVING ARCHITECTURE USING EDGE ARTIFICIAL INTELLIGENCE
We propose an end-to-end, reliable, and low-latency communication architecture that allows compute-intensive autonomous driving services, in particular the autopilot, to share the resources of edge servers and improve the level of performance for autonomous vehicles. In Figure 1, we highlight the proposed Advanced Autonomous Driving (A2D) architecture for the proposed autopilot use case. The A2D architecture consists of three main layers/entities as follows:
• The Centralized Cloud Computing Layer: It acts as the cloud autopilot and is responsible for processing Non-Real-Time (NRT) autopilot VNFs.
• Computing Platform: It consists of a heterogeneous set of CPU, GPU, TPU, and FPGA substrates. We propose to leverage an edge computing infrastructure to virtualize the above integrated autopilot. The virtualized autopilot mainly includes OpenStack layers for VNF management [49] and a Kubernetes cluster for container orchestration [50]. It is easily deployed using automation services according to the resource availability at the network edge. When edge resources are insufficient, the cloud computing servers are used. It is a centralized entity that ensures autopilot application orchestration and multi-edge management.

D. PROPOSED ADVANCED AUTONOMOUS DRIVING COMMUNICATION PROTOCOL
The main steps of the proposed protocol are detailed hereafter (see Figure 1):
1) Autopilot Slicing: each autonomous vehicle can request the offloading of some autopilot functions. It sends the request to the nearest edge node, represented by a gNB or RSU (i.e., a 4G/5G base station), to enable local edge resource discovery and VNF allocation.
2) Resources Discovery in Connected Edge Nodes: when the access point receives an autopilot function offloading request, it generates VNF components or slices. Then, it selects a set of connected edge nodes that can satisfy each VNF's requirements in terms of CPU, GPU, RAM, and storage. The selected set of connected edge nodes is called the Virtual Edge Servers (VES). The resource discovery procedure is based on the computing and networking capabilities of the servers.
3) Autopilot VNFs Offloading/Allocation: when the VES is selected, the access point starts VNF offloading by allocating to each slice free device resources from the selected VES that can satisfy the slice's requirements. It is worth mentioning that an optimization algorithm is used to select the optimal points of operations where VNFs can be offloaded according to the aforementioned system and network requirements. Still, cloud computing may serve as a fallback when edge resources are insufficient. This case may occur when the access point cannot select a VES that can meet the demands of the set of autonomous vehicles.
4) VNF Components Graph: this is the optimization result of the allocation procedure, which indicates the placement of each VNF component. After launching the VNFs in the VES/cloud servers, optimal control commands are sent directly to the access point.
5) Results Forwarding: in the last step, the access point forwards optimal control commands to the autonomous vehicle while satisfying its requirements.
For the sake of clarity, we show in Figure 2 the main communication steps between the edge computing platform and the connected autonomous vehicle as follows:
• Connected autonomous vehicles send instantaneous states such as position, speed, and the next decision of the autopilot to the Edge/Cloud.
• The Edge/Cloud Autonomous Driving (AD) service collects the raw data, creates the world model for each section of the road, and communicates with the Cloud AD Autopilot.
• The Edge/Cloud AD Autopilot sends the global model and generates a high-level request for each autonomous vehicle node, such as a speed request or a lane positioning request.
• The Integrated Autopilot merges the Edge/Cloud autopilot inputs with the embedded/local inputs in order to anticipate and act locally.
As explained above, the A2D protocol needs some intelligent optimization algorithms that allocate autopilot VNFs to the optimal/near-optimal edge servers.
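As an illustration of steps 2 and 3 of the protocol, the following minimal Python sketch filters the connected edge nodes down to a VES and falls back to the cloud when no node qualifies. The names `EdgeNode`, `discover_ves`, and `allocate`, the resource figures, and the first-fit choice are all hypothetical; in the paper, the selection among VES candidates is made by the optimization algorithm, not by first fit.

```python
from dataclasses import dataclass

# Resource types named in the resource discovery step.
RESOURCES = ("cpu", "gpu", "ram", "storage")


@dataclass
class EdgeNode:
    name: str
    free: dict  # free capacity per resource (hypothetical units)


def discover_ves(nodes, demand):
    """Return the edge nodes whose free capacity covers the VNF demand."""
    return [n for n in nodes if all(n.free[r] >= demand[r] for r in RESOURCES)]


def allocate(nodes, demand):
    """Place one VNF slice on the first suitable node, else fall back to the cloud.

    First fit is only a stand-in for the optimization algorithm.
    """
    ves = discover_ves(nodes, demand)
    if not ves:
        return "cloud"
    target = ves[0]
    for r in RESOURCES:
        target.free[r] -= demand[r]
    return target.name


nodes = [
    EdgeNode("edge-1", {"cpu": 4, "gpu": 0, "ram": 8, "storage": 50}),
    EdgeNode("edge-2", {"cpu": 8, "gpu": 1, "ram": 16, "storage": 100}),
]
perception = {"cpu": 4, "gpu": 1, "ram": 8, "storage": 20}  # assumed demand
first = allocate(nodes, perception)   # only edge-2 has a free GPU
second = allocate(nodes, perception)  # no GPU left at the edge
```

In this toy run the first request lands on `edge-2` and the second is sent to the cloud, which mirrors the fallback case described in step 3.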

IV. OVEAP OPTIMIZATION MODEL
The optimization of autopilot VNF placement in edge computing architectures has attracted increasing attention. It is similar to the placement of Virtual Machines (VMs), where the VNFs are composed of containers or VMs that can execute the needed network functions. We propose a mathematical programming model based on Integer Linear Programming (ILP) in order to model the offloading of autopilot VNFs (i.e., a service chain: traffic flow automation between the instantiated functions) in the virtualized edge architecture. The algorithm takes as inputs the system capacity in terms of storage, networking, and computing. It then aims to optimally place autopilot VNFs on the virtual edges. Autopilot VNFs are offloaded in order to increase safety and decrease the load on end-user and network devices. The optimization algorithm for service offloading in VMEC is modeled, implemented, and evaluated in the next subsections.

A. OVEAP MULTI-RESOURCES AWARE MATHEMATICAL FORMULATION
To clarify the mathematical formulation, we first introduce the notation of the main parameters. Let MEC be the set of edge servers, which are homogeneous in terms of vCPU, vGPU, and vStorage resources. Let EA be the set of autopilots, each of which has a chain of VNFs to be processed periodically at the edge.
In this section, we specify the parameters and the constraints that are defined and proposed in formulating the optimization/analytic model. This formulation takes as input the multi-resources requirements of the autonomous vehicles and determines the placement of autopilot VNFs to the optimal location. OVEAP will speed up the processing of virtual functions by allocating the available resources while ensuring that it does not exceed the edge server capacity. We quote in Table 2 the main system parameters and decision variables of the proposed mathematical programming approach.
The binary variable $x$ indicates the placement of the autopilot VNF on the MEC server $mec \in MEC$. It represents a Service Instantiate Graph (SIG) that defines the optimal points.
Further, a binary variable $y$ is needed to track the MEC server's computing and networking utilization. It is formulated as follows:
The general ILP algorithm tries to maximize the number of successfully placed autopilot VNFs while minimizing the number of active MEC servers. Then, the objective function can be formulated as follows:
According to the proposed OVEAP protocol, the autopilot VNF placement procedure is constrained by system infrastructure, resources (CPU, GPU, RAM, Storage), and network constraints. Hereafter, we formulate the algorithm constraints related to the autonomous driving service offloading in virtual edge computing.
where $t_{ea,vnf}$ is the processing time matrix of the autopilot VNF $(ea, vnf)$.
• Non-negativity constraints: the variables $x$ and $y$ are binary, so that the autopilot VNF allocation is decided without relaxation.
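The displayed equations of the formulation are not reproduced in this text. As a reading aid only, the block below sketches one ILP that is consistent with the prose description (maximize placed VNFs, minimize active servers, respect server capacity). It is an assumed form, not the authors' exact model: the weight $\alpha$, the capacity symbol $C_{mec}$, and the index layout $x^{mec}_{ea,vnf}$ are hypothetical, and only the CPU constraint is shown, using the demand parameter $c_{ea,vnf}$.

```latex
% Assumed decision variables
x^{mec}_{ea,vnf} =
\begin{cases}
1 & \text{if VNF } vnf \text{ of autopilot } ea \text{ is placed on server } mec \\
0 & \text{otherwise}
\end{cases}
\qquad
y_{mec} =
\begin{cases}
1 & \text{if server } mec \text{ hosts at least one VNF} \\
0 & \text{otherwise}
\end{cases}

% Assumed objective: reward placed VNFs, penalize active servers
\max \; \sum_{ea \in EA} \sum_{vnf} \sum_{mec \in MEC} x^{mec}_{ea,vnf}
\; - \; \alpha \sum_{mec \in MEC} y_{mec}

% Each VNF is placed on at most one server
\sum_{mec \in MEC} x^{mec}_{ea,vnf} \le 1 \qquad \forall\, ea, vnf

% CPU capacity; the same pattern would apply to GPU, RAM, and storage
\sum_{ea \in EA} \sum_{vnf} c_{ea,vnf} \, x^{mec}_{ea,vnf} \le C_{mec} \, y_{mec}
\qquad \forall\, mec \in MEC
```

Multiplying the capacity by $y_{mec}$ on the right-hand side is a common modelling device: it forces $y_{mec} = 1$ whenever any VNF is placed on server $mec$, so the penalty term counts exactly the active servers.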

B. OVEAP: COMPLEXITY AND TRIGGERS

1) OVEAP COMPLEXITY
OVEAP is formulated as an ILP, and solving ILPs exactly is NP-hard in general. The number of candidate placements grows exponentially with the number of autopilot VNFs, so the exact approach is feasible only for a few small instances.
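To make the growth of the search space concrete, the sketch below exhaustively enumerates every assignment of VNFs to servers for a toy instance and keeps a feasible assignment that uses the fewest servers. It is a brute-force illustration under assumed inputs (a single CPU-like resource, identical server capacity, invented demands), not the OVEAP solver itself.

```python
from itertools import product


def brute_force_placement(demands, capacity, n_servers):
    """Try every assignment of VNFs to servers.

    Returns (servers_used, assignment) for a feasible assignment with the
    fewest servers, or None if no assignment respects the capacity.
    """
    best = None
    for assignment in product(range(n_servers), repeat=len(demands)):
        load = [0] * n_servers
        for vnf, server in enumerate(assignment):
            load[server] += demands[vnf]
        if any(l > capacity for l in load):
            continue  # violates the per-server capacity
        used = len(set(assignment))
        if best is None or used < best[0]:
            best = (used, assignment)
    return best


demands = [4, 3, 3, 2]  # hypothetical CPU demand per VNF
used, assignment = brute_force_placement(demands, capacity=6, n_servers=3)
search_space = 3 ** len(demands)  # candidate assignments that were enumerated
```

With 3 servers and 4 VNFs the loop visits 81 candidates; with the same 3 servers and 40 VNFs it would have to visit 3^40, which is why exact enumeration stops being practical as the network grows.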

2) OVEAP TRIGGERS
OVEAP algorithm is proposed to be executed in an autopilot manager entity with respect to ETSI standards. It has to control, manage, and orchestrate the VNFs running autopilot nodes. This offloading is executed after the following triggers:
• System resources (computing, storage, and memory) constraints through the $c_{ea,vnf}$, $g_{ea,vnf}$, $s_{ea,vnf}$, and $r_{ea,vnf}$ parameters.
• Networking constraint through L mec,av and b L ea av parameters.
• Autopilot offloading requests prediction: OVEAP predicts the incoming autopilots as well as the corresponding VNFs. Then, it makes the decision about the optimal points of placement.
Once the above requirements are satisfied, the algorithm is executed periodically.
Exact optimization algorithms such as OVEAP have a worst-case running time that grows exponentially with the problem size. In the following section, we propose artificial intelligence techniques to solve the above optimization problem. Autopilot VNFs are items to be offloaded onto distributed edge servers according to the resource discovery procedure.

V. DVEAP: DRL-BASED VIRTUAL EDGE AUTOPILOT PLACEMENT ALGORITHM
Artificial Intelligence defined optimization tries to replace the tedious process of ILP by recent AI techniques. We insert Deep Learning modules at the network edge that collect, process, and analyze the raw data. Then, online decisions are taken in order to self-organize the network. Still, data-centers are used to process the big data that does not require real-time optimization. The AI-driven placement of autopilot VNFs at the network edge provides a lot of benefits, including safety and efficient processing. This leads to a minimum latency and enhances the overall path planning and driving qualities.

A. DVEAP: THE PROPOSED RL MODEL
We consider distributed edge servers with multiple resources (CPU, GPU, RAM, Storage, and Bandwidth) that represent the network environment. Autopilot VNFs are jobs that arrive at the cluster in an online fashion at discrete time steps, as shown in Figure 3. At each time step, the cluster manager/scheduler chooses a VNF to place according to the Deep-Q-Network (DQN) agent, which predicts near-optimal actions using a Deep Neural Network. We assume that the resource demands of each VNF are known upon arrival.

1) THE STATE SPACE
We represent the state space $s_t$ as the current placement of autopilot VNFs on server slots.

2) THE ACTION SPACE
The action space $a_t$ is the placement of a computing-intensive autopilot VNF on a server slot. The placement takes into consideration the server capacity (in terms of slots): the agent will not place the autopilot VNF on a server slot that is already occupied by another running VNF. Actions are of a single type, and the agent processes VNFs one by one until all the incoming user requests have been processed.

3) THE REWARD SPACE
The proposed reward r is the placement cost of a VNF. It is measured as the number of servers in use after performing an action, and is formulated as follows:

$$ r_t = \begin{cases} i & \text{if } i \text{ edge servers are occupied} \\ 0 & \text{otherwise} \end{cases} \qquad (16) $$

As shown in Equation (16), the main objective of the agent is to minimize the total number of opened servers. In other words, the agent's objective is to minimize the total discounted cumulative reward.
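The state, action, and reward definitions above can be sketched as a small environment. This is a toy illustration under stated assumptions: the class name `EdgePlacementEnv`, the one-VNF-per-slot occupancy matrix, and the `(server, slot)` action encoding are hypothetical choices made for the example and are not taken from the paper's implementation.

```python
class EdgePlacementEnv:
    """Toy environment: edge servers with a fixed number of slots, one VNF per slot."""

    def __init__(self, n_servers, slots_per_server):
        self.n_servers = n_servers
        self.slots_per_server = slots_per_server
        self.reset()

    def reset(self):
        # state[s][k] is 1 if slot k of server s hosts a VNF, else 0.
        self.state = [[0] * self.slots_per_server for _ in range(self.n_servers)]
        return self.state

    def valid_actions(self):
        """All free (server, slot) pairs; occupied slots are never offered."""
        return [(s, k) for s in range(self.n_servers)
                for k in range(self.slots_per_server) if self.state[s][k] == 0]

    def step(self, action):
        """Place one VNF and return (new state, reward).

        The reward is the number of servers in use after the placement,
        so a lower value means a cheaper allocation.
        """
        server, slot = action
        if self.state[server][slot] == 1:
            raise ValueError("slot already occupied")
        self.state[server][slot] = 1
        reward = sum(1 for row in self.state if any(row))
        return self.state, reward


env = EdgePlacementEnv(n_servers=3, slots_per_server=2)
_, r1 = env.step((0, 0))  # opens the first server
_, r2 = env.step((0, 1))  # reuses the same server
_, r3 = env.step((1, 0))  # opens a second server
print(r1, r2, r3)
```

Filling the free slot of an already opened server leaves the reward unchanged, whereas opening a new server increases it, which is the behavior the agent is meant to avoid.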

B. DVEAP: THE PROPOSED DQN-DRL
In the DRL algorithm, we use Deep Neural Networks (DNNs) to approximate the above RL model while keeping the same state, action, and reward spaces. A succession of neural network layers maps the input state to the output action. In Algorithm 1, we describe the pseudocode of our proposed DRL algorithm.

7: Select a server $a$:
   • with probability $\epsilon$, select a random server
   • otherwise, select the server that has the $\max_{a} Q(s, a)$
8: Place the autopilot SFC's VNF on the Edge Computing Server $a$.
9: Observe the incurred allocation cost $r$ and the new edge state $s'$ (new allocation and another incoming autopilot SFC's VNFs).
10: Store the experience $\{s, a, r, s'\}$ in the replay memory.
11: Sample a random transition from the replay memory.
12: Calculate the target for each mini-batch transition: $r + \gamma \max_{a'} Q(s', a')$
13: Train the Q network using the following loss: $\text{Loss} = \frac{1}{2}\left(r + \gamma \max_{a'} Q(s', a') - Q(s, a)\right)^{2}$
14: $s = s'$
15: until no incoming autopilot VNFs remain from all the SFCs

We use the Stochastic Gradient Descent (SGD) algorithm [51] to train the Deep-Q-Network (DQN) agent. Then, we tune the main hyperparameters to select suitable DNN configurations, such as the number of epochs/iterations, the optimizer parameters, and the action selection strategy. As shown in Algorithm 1, Line 13, the SGD algorithm minimizes the loss function (squared error) between the target and current Q-values, where the target follows the Bellman equation:

$$ Q(s, a) = r + \gamma \max_{a'} Q(s', a') $$

Then, the DNN weights are updated using the back-propagation process. Training is an offline procedure performed before the DQN-DRL algorithm is deployed at the edge network. The trained algorithm is then used for real-time resource allocation. Network operators can update the pretrained model according to their needs. It is worth mentioning that the training time depends on the available resources and data types. In our case, we use structured data, which does not require significant training time.
The DRL-based approach reduces both the ILP solving time and the RL state space complexity by reducing the number of iterations considered in the optimization, while including more parameters.
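The arithmetic behind Lines 7, 12, and 13 of Algorithm 1 can be checked on a single transition. The sketch below uses plain Python lists in place of a neural network, so it only illustrates the selection rule, the target, and the loss; the function names and the numeric Q-values are hypothetical, and the greedy branch follows the max written in Algorithm 1.

```python
import random


def select_server(q_values, epsilon, rng):
    """Epsilon-greedy selection (Line 7): explore with probability epsilon."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def td_target(reward, next_q_values, gamma):
    """Target of Line 12: r + gamma * max over a' of Q(s', a')."""
    return reward + gamma * max(next_q_values)


def half_squared_loss(target, q_sa):
    """Loss of Line 13: 0.5 * (target - Q(s, a)) ** 2."""
    return 0.5 * (target - q_sa) ** 2


rng = random.Random(0)
q_s = [1.0, 3.0, 2.0]     # hypothetical Q(s, .) for three servers
q_next = [0.5, 4.0, 1.0]  # hypothetical Q(s', .)

a = select_server(q_s, epsilon=0.0, rng=rng)  # epsilon = 0 always picks the greedy server
y = td_target(reward=2.0, next_q_values=q_next, gamma=0.9)
loss = half_squared_loss(y, q_s[a])
print(a, y, loss)
```

In a full implementation, the loss would be averaged over a mini-batch sampled from the replay memory and minimized with SGD over the DNN weights; those parts are omitted here to keep the example self-contained.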

VI. PERFORMANCE EVALUATION: DRL-DVEAP VERSUS ILP-OVEAP
In this section, we evaluate our proposed solutions as follows:
• The optimal autopilot resources allocation strategy is evaluated in small-scale IoAV networks.
• To deal with large-scale networks, where many AV autopilots require offloading at the same time, the AI-DRL method is evaluated and compared with the optimal approach.
We evaluate the proposed algorithms (ILP and DRL) using different optimization tools. CPLEX [52] is used to evaluate the exact ILP model, while TensorFlow and Keras [53] are used to configure and implement the DRL algorithm. As explained above, the autopilot chain is composed of the following VNFs: vPerception, vLocalisation, vPlanner, and vControl. Recall that the objective is to place each autopilot VNF on an edge server while ensuring the chaining of all the VNFs of the same autopilot. Tables 3 and 4 show the different configurations used in the evaluation. We evaluated the proposed algorithms at small and large scale using simulation. We simulate the datacenter using the x86 architecture, the Linux operating system, and the Xen hypervisor. Network and system parameters are shown in Table 5. Recall that the simulation considers the offloading of sensor data to the edge. The edge creates the world model through the vPerception VNF. Then, vLocalisation adds the position of the vehicle in the environment. The vPlanner traces the path between the source and the destination, and the vControl executes the path by sending control commands to the embedded vehicle.
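The chaining requirement can be illustrated with a small data-structure sketch. The dictionary-based placement and the helper names are assumptions made for this example; only the four VNF names and their order come from the text.

```python
# Ordered autopilot service function chain, as listed in the text.
AUTOPILOT_CHAIN = ("vPerception", "vLocalisation", "vPlanner", "vControl")


def chain_is_placed(placement):
    """True if every VNF of the chain has been assigned to a server.

    placement: hypothetical dict mapping a VNF name to a server name.
    """
    return all(vnf in placement for vnf in AUTOPILOT_CHAIN)


def chain_hops(placement):
    """Return the (source, destination) server pairs between consecutive VNFs."""
    servers = [placement[vnf] for vnf in AUTOPILOT_CHAIN]
    return list(zip(servers, servers[1:]))


placement = {
    "vPerception": "edge1",
    "vLocalisation": "edge1",
    "vPlanner": "edge2",
    "vControl": "edge2",
}
print(chain_is_placed(placement), chain_hops(placement))
```

Hops whose source and destination differ correspond to traffic between edge servers, which is where the networking constraints of the placement apply.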

A. KEY PERFORMANCE INDICATORS (KPIs)
To assess the efficiency of the proposed approaches (OVEAP ILP and DVEAP DRL), we use the following KPIs:
• Successfully Allocated Autopilots: it represents the number of autopilots successfully offloaded to the network edge.
• Average Network Delay: It represents the average network delay between autonomous vehicles and edge computing servers.
• Service Time: it represents the end-to-end response time including autopilot VNF submission, resources discovery, VNF offloading, and computation results forwarding to the end-user. It is coupled with the availability of resources and the offloading decisions, where an efficient service is characterized by low execution time.
• Processing Time: this KPI represents the duration needed by the OVEAP controller to complete the execution of all submitted autopilot VNFs. In general, the availability of resources among virtual edge servers is an important metric that enhances the efficiency of processing through decreasing the processing time.
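As an illustration, the KPIs above can be computed from per-autopilot records. The record fields (`allocated`, `network_delay_ms`, `submitted_ms`, `finished_ms`) and the choice to average over allocated autopilots only are assumptions of this sketch, not details given in the evaluation setup.

```python
from statistics import mean


def compute_kpis(records):
    """Compute the four KPIs from a list of hypothetical per-autopilot records."""
    allocated = [r for r in records if r["allocated"]]
    if not allocated:
        return {"successfully_allocated": 0, "avg_network_delay_ms": None,
                "avg_service_time_ms": None, "processing_time_ms": None}
    return {
        # Number of autopilots offloaded to the edge.
        "successfully_allocated": len(allocated),
        # Mean AV-to-edge network delay.
        "avg_network_delay_ms": mean(r["network_delay_ms"] for r in allocated),
        # Mean end-to-end time from submission to result delivery.
        "avg_service_time_ms": mean(
            r["finished_ms"] - r["submitted_ms"] for r in allocated
        ),
        # Time to complete all submitted VNFs: last finish minus first submission.
        "processing_time_ms": max(r["finished_ms"] for r in allocated)
        - min(r["submitted_ms"] for r in allocated),
    }


records = [
    {"allocated": True, "network_delay_ms": 4.0, "submitted_ms": 0.0, "finished_ms": 20.0},
    {"allocated": True, "network_delay_ms": 6.0, "submitted_ms": 5.0, "finished_ms": 35.0},
    {"allocated": False, "network_delay_ms": None, "submitted_ms": 10.0, "finished_ms": None},
]
kpis = compute_kpis(records)
print(kpis)
```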

B. COMPUTING ARCHITECTURES BASED ANALYSIS
To quantify the behavior of OVEAP, we compare our proposed OVEAP optimization algorithm with state-of-the-art computing architectures. The relevant architectures are as follows:
1) Embedded Computing: This architecture allows the local execution of autopilot modules, while edge computing is still enabled to receive VNFs.
2) Edge Computing: This architecture prioritizes edge computing servers for autopilot service offloading.
3) Cloud Computing: This architecture allows only the use of centralized cloud computing.

C. OBTAINED RESULTS
In this section, we introduce two scenarios in order to evaluate the optimization algorithms. The first scenario targets small networks, while the second deals with large scale networks.

1) THE SMALL-SCALE SCENARIO
To select the appropriate allocation strategy (OVEAP or DVEAP), different network scales (i.e., small and large) are considered. In Figure 4, we show the total resource utilization at the network edge for Edge1 and Edge2, respectively, in a small-scale network.
In Figures 4a and 4c, we plot autonomous vehicle configurations against the TESU metric. Results show the efficiency of the proposed DRL-DVEAP algorithm, since it converges to the exact ILP OVEAP in terms of placement cost. In addition, Figures 4b and 4d show that the DVEAP algorithm has a negligible placement time (on the order of a few microseconds) compared to OVEAP, whose placement time nevertheless remains feasible.

2) THE LARGE-SCALE SCENARIO
We first considered the entire autopilot allocation. In Figure 5, we plot the main KPIs against the number of autonomous vehicles. We consider the exact ILP OVEAP approach taking into account the above constraints.
In Figure 5a, we plot the network average delay between autonomous vehicles and edge computing servers.
In Figure 5b, we show that edge computing reduces the computing load on the cloud as the number of autonomous vehicles increases.
In Figure 5c, we show that edge computing reduces the average service time compared with the embedded and cloud computing architectures. Indeed, the embedded autopilot offloads the heavy VNFs to the nearby edge for efficient edge processing.
In Figure 5d, we show that edge computing reduces the processing time compared with cloud computing.

a: THE IMPACT OF AUTOPILOT VNFs ALLOCATION ON THE PROCESSING TIME
In Figure 6, we show the processing time variation when increasing the number of autonomous vehicles. Results show the feasibility and the efficiency of the proposed edge computing architecture through the reduced service time. This result validates the edge-assisted offloading of autopilot functions, since this KPI reflects the total application time.
In Figure 8, we quantify the behavior of the DRL DVEAP algorithm in a large-scale network according to the different edge configurations. We plot the number of autonomous vehicles, ranging from 20 to 100, against TESU. Results show that increasing the computing capacity helps in better offloading autopilot functions.
In Figure 9, we show the limit of the DRL approach in a very dense network made up of a hundred autonomous vehicles requiring service offloading. Results show that most of the autopilots are successfully allocated on MEC servers.
In Figure 10, we evaluate the DRL approach in terms of VNF offloading time. The result shows the feasibility and the efficiency of the proposed algorithm at large scale.

VII. CONCLUSION AND FUTURE WORK
In this paper, we have proposed an Artificial Intelligence approach for autopilot placement (offloading) at the network edge. First, we have proposed an end-to-end architecture for edge computing assisted autonomous driving. Then, we have proposed an optimal allocation approach (OVEAP) that decides on the optimal placement of autopilot functions. Further, to deal with dense IoAV networks, a deep reinforcement learning approach (DRL-DVEAP) is formulated and implemented. Based on different configurations of edge environments, the proposed DRL achieves good results in terms of offloading cost and processing time. In future work, we will focus on autopilot VNF migration in the case of a near edge discovery scenario.
MOHAMMED LAROUI received the B.Sc. and M.Sc. degrees in computer science from Djillali Liabes University, Sidi Bel Abbés, Algeria, in 2015 and 2017, respectively. He is currently pursuing the Ph.D. degree in computer science from Paris University and Djillali Liabes University. He visited Telecom SudParis, Paris, France, in 2017, as a Researcher, where he worked on vehicular networks. His research interests include cloud/edge and fog computing, vehicular networks, IoT, next-generation networking, and internet. His research interests include wireless area body networking (WBAN) for medical and health applications, wireless sensor networking, QoS in WSN, middleware for 5G mobile, and sensor networks. He participated and still participates in several national and international research projects. He is with the Technical Program Committee of different ACM and IEEE conferences, including Globecom, ICC, WCNC, PIMRC, IWCMC, and chaired some of their sessions. He is also a reviewer on a regular basis for major international journals. He is a member of IEEE Communication Society.
HOSSAM AFIFI received the Ph.D. degree in computer science from Inria Sophia Antipolis, in 1992. He visited Washington University, St. Louis, as a Postdoctoral Researcher, where he worked on IP switching techniques. He was appointed as an Assistant Professor with ENST Bretagne (now, Mines Atlantic), in the field of high speed networking. He is currently a Professor with Télécom SudParis-Institut polytechnique de Paris, SAMOVAR Laboratory, where he works on machine learning for mobile and security protocols. After his tenure and a sabbatical with Nokia Research Labs, Mountain View, USA, he took the current position with Telecom SudParis since 2000. His current interests include vehicular and user-centric behaviors.
EMAD ABD-ELRAHMAN received the B.Sc. degree in electronics engineering from Mansoura University, Egypt, in 1999, the M.Sc. degree in electronics engineering from the Computers and Systems Department, Mansoura University and National Telecommunication Institute, Egypt, in 2004, and the Ph.D. degree in computer science and telecommunication from the University of UPMC and Telecom SudParis, in 2012. In 2008, he joined the University of UPMC-France (Paris-6) and IMT (Institute Mines-Telecom) Telecom SudParis. He spent three years as a Guest Researcher at the RST Department, Telecom SudParis (IMT)-CEA Saclay-France, from 2014 to 2016. He has been an Associate Professor with the National Telecommunication Institute, Cairo, Egypt, since 2018. His current research interests include networking, optimization, multimedia, multi-modal traffic in ITS, virtualization SDN/NFV, and cloud computing. He is involved in many European and French projects, including the UP-TO-US, DVD2C, and CA-ITS.