oneVFC—A Vehicular Fog Computation Platform for Artificial Intelligence in Internet of Vehicles

We are witnessing the evolution from the Internet of Things (IoT) to the Internet of Vehicles (IoV). Internet-connected vehicles can sense, communicate, analyze and make decisions. Rich vehicle-related data collection makes it possible to apply artificial intelligence (AI), such as machine learning and deep learning (DL), to develop advanced services in Intelligent Transportation Systems (ITS). However, AI/DL-based ITS applications require intensive computation, both for model training and deployment. Exploiting the huge computational power obtained by aggregating the resources present in individual vehicles and ITS infrastructure offers an efficient solution. In this work, oneVFC, a tangible vehicular fog computing (VFC) platform based on oneM2M, is proposed. It benefits from the oneM2M standard to facilitate interoperability as well as hierarchical resource organization. oneVFC manages the distributed resources, orchestrates information flows and computing tasks on vehicle fog nodes, and feeds back results to the application users. On a lab-scale model consisting of Raspberry Pi modules and laptops, we demonstrate how oneVFC manages AI-driven applications running on various machines and how it succeeds in significantly reducing application processing time, especially in cases with high workload or with requests arriving at a high rate. We also show how oneVFC facilitates the deployment of AI model training in Federated Learning (FL), an advanced privacy-preserving and communication-saving training approach. Our experiments, deployed in an outdoor environment with mobile fog nodes participating in the computation jobs, confirm the feasibility of oneVFC for IoV environments whenever the communication links among fog nodes are guaranteed by V2X technology.


I. INTRODUCTION
The number of vehicles used worldwide is expected to rise from one billion in 2010 to two billion in 2030. Vehicles have become sensor platforms able to sense, communicate, analyze, and make decisions [1], [2]. The Internet of Vehicles (IoV), a network allowing data and information exchange among vehicles, things such as roadside infrastructure, humans, and the environment, is becoming a reality thanks to Vehicle to Everything (V2X) technology, which is based on two pillars: 5G-LTE and Dedicated Short-Range Communications (DSRC) [3]. Those intelligent vehicles and networks give rise to Intelligent Transportation Systems (ITSs), providing services like forward accident alarms, collision avoidance, traffic congestion mitigation, vehicle platooning, autonomous driving, etc.
The associate editor coordinating the review of this manuscript and approving it for publication was Celimuge Wu.
Advanced ITS services are Artificial Intelligence (AI)-based applications whose efficiency is enhanced by rich data collection in the IoV environment [4]. Vehicle-related data obtained from sensors in vehicles, Global Positioning Systems (GPSs), electronic toll tags, and vehicle On Board Units (OBUs) rapidly increase in volume and variety. ITS data are also collected from other sources, such as ITS infrastructure (loop detectors, infra-red sensors, ultrasonic sensors, and closed-circuit television (CCTV) cameras) and travelers (who use web browsers, mobile apps, and social networks).
Such complex and abundant data resources help to train AI models with better accuracy. AI-based applications require rich data sets and intensive computation, in both the training and the deployment phase.
Previously, cloud computing played a major role in big data analytics platforms. However, the need to upload huge amounts of data to the cloud, together with the advent of latency-sensitive services, makes fog computing, which "extends the cloud computing to the edge", a welcome or even mandatory extension [5]-[8]. Distributed and parallel processing at the network edge can provide local view-based analytics, lower data and service delivery latencies, and enhanced data privacy. It is therefore well suited to the heterogeneous, dynamic, distributed IoV environment. However, as the computing and communication workload of ITS services varies over time and location, capacity planning of fog nodes is a challenge. To enhance resource utilization efficiency, vehicular fog computing (VFC) exploits the large computational power obtained by aggregating the resources of individual vehicles and of other devices in the ITS infrastructure [9], [10]. The feasibility of leveraging the computing and communication resources of slowly moving cars in cross-section regions, or of parked cars, to support advanced vehicular applications has been investigated.
Various deployment scenarios of VFC have been studied. In those scenarios, the fog nodes can be computing devices installed in buses or taxis that process data offloaded from client vehicles while those buses or taxis are travelling alongside the clients [6], [11]. The fog nodes can also be parked vehicles that take the role of static backbone nodes for fog computing. They can also be vehicles stuck in traffic congestion that form a cluster, or computing devices combined with V2X Road Side Units (RSUs). The mobility of vehicles can be leveraged as an effective way of organizing computing resource migration [12]. Other research focuses on task assignment and resource allocation, which are essential concerns in shared resource environments. The studies in [13]-[16] show how to assign computing tasks to be computed in parallel by a set of vehicular fog nodes so as to satisfy quality-of-service objectives, such as latency and image/video resolution, under constraints related to the communication bandwidth, computation capability and energy consumption of these nodes. In [17]-[19], a contract or auction-based approach is applied to stipulate or negotiate the provided resources together with the benefits obtained in terms of parking fee reduction when leveraging the computing units of parked cars.
In this paper, our aim is to realize the management of parallel service computations on various computing devices installed in vehicles. The platform, named oneVFC, adopts the 3-layer VFC architecture in which the role of management and orchestration is taken up by the fog layer [20]. We propose a hierarchical structure including fog manager nodes and fog worker nodes. The manager nodes can be seen as fixed nodes like V2X-RSUs, whereas the fog worker nodes are vehicle OBUs. A manager node manages the available computing resources of the worker nodes in its neighborhood and directs computing tasks to them according to a task assignment algorithm. This algorithm realizes specified objectives, such as minimizing the service processing time.
We propose to use the oneM2M standard for creating the oneVFC platform. oneM2M is an IoT middleware standard for realizing interoperability between heterogeneous Machine to Machine (M2M)/Internet of Things (IoT) systems active in different service domains, such as smart transportation, smart city, smart health etc. It is increasingly used in commercial deployments [21]. oneM2M is a joint effort of eight national and international standard organizations, and has more than 200 members including national telecom companies (telcos). Various big telcos have deployed oneM2M commercial platforms [22].
The choice of oneM2M is motivated as follows. Firstly, a oneM2M-based architecture for the VFC platform will facilitate the integration of already existing ITS systems, e.g., electronic toll collection systems, traffic lights, city parking systems, and CCTV cameras, which are often deployed using different protocol families. Secondly, the physical nodes, like vehicle OBUs and V2X-RSUs, and the information exchange among nodes can be represented as resources accessible through the publish/subscribe mechanism. Thirdly, a oneM2M node can support more than one underlying network interface, including innovative technologies that allow high-bandwidth and low-latency communication in V2X. For vehicle OBUs, both DSRC and 5G enable parallel Vehicle to Vehicle (V2V) and Vehicle to Infrastructure (V2I) communications. Based on the required Quality of Service (QoS), traffic belonging to a certain application may use the DSRC network while other traffic uses a 5G network.
We decompose the aggregated service flows of AI-based applications into several "operation primitives" that can be used to orchestrate the computing workloads distributed over various available computing resources. Then, we propose a communication and computation management scheme that can be used to optimize task assignment and resource allocation, considering the constraints of node capacity and availability, since most of the computation devices are installed in personal cars.
We evaluate the proposed oneVFC platform on a lab-scale testbed, for managing the deployment of AI-based applications on various machines. A significant reduction of application processing time is obtained, especially for high workloads or for service requests arriving at a high rate. The oneVFC platform also facilitates the deployment of AI model training in the Federated Learning approach. The data message structure and procedures in oneVFC make it possible to monitor the operation state of each computing node and to support adaptive task assignment and resource allocation. Deploying oneVFC on a testbed set up in a university's outdoor parking space demonstrates its efficiency in managing computing units attached to travelling vehicles that take on computation jobs to lower the overall service processing time. The benefits brought by the oneVFC platform for distributing computation over various fixed and mobile fog worker nodes are confirmed, and those benefits are expected to increase with the data rates offered by DSRC and 5G.
The paper is structured as follows. Section II presents our analysis of ITS applications and the challenges of VFC deployment in IoV environments. Section III examines the details of the oneVFC architecture, paying particular attention to the reasons for choosing the oneM2M standard for the VFC platform, and to the oneM2M-based functional structure of manager fog nodes and worker fog nodes. Section IV shows the resources and procedures used to manage information/data flows by means of the service primitives of the oneM2M standard. In Section V, the model of task assignment and resource allocation is proposed and the applicability of a particle swarm optimization algorithm for task assignment to worker nodes is discussed. Section VI presents two use cases for evaluating oneVFC on the available testbeds. The final section draws a conclusion and discusses future work.

II. VEHICULAR FOG COMPUTING ARCHITECTURE FOR AI-BASED APPLICATIONS
The AI-based services process huge amounts of data generated by CCTV cameras installed along the roads, in combination with data from sensors and cameras on vehicles. The service outcomes are announced to data/service users and to on-road vehicles. Recently, Deep Learning (DL) models, e.g., Yolo (You Only Look Once) [23], Convolutional Neural Networks (CNN) and their variations [24], have been widely used for object detection [25]. DL model-based programs can detect cars, trucks, buses, people, etc., according to the object classes defined in the model. They have become the main components of vision-based applications in many fields, including traffic congestion detection and management, autonomous vehicle safety, etc. Deep Learning models require large datasets and substantial computation for training, as well as high computational effort for their real-time execution on bundles of collected images. Note that deep learning is a subset of machine learning, and machine learning is a subset of AI.
Generally, AI/DL-based ITS applications generate two types of computing service requests:
1) AI-based model exploitation: this involves the real-time application of AI/DL algorithms or other techniques to data collected from the surroundings to solve actual problems in traffic management, road safety, etc. A timely response must be extracted from the rich dataset. It can be beneficial to split the data into smaller pieces, each of which can be processed by a different edge device. Hence, parallel data processing at various vehicular fog nodes could shorten the service response time.
2) AI model training: predictive model training, especially with DL algorithms, requires intensive computation and data storage, and is usually performed in the cloud using popular tools such as Google Colab, Kaggle, and Azure Notebooks [26], [27]. As the demand for training has grown faster than the increase in computing resources, distributing the computation across multiple machines has become mandatory. In addition to distributed learning, a new distributed training approach, named Federated Learning (FL), is being studied intensively [28]. In FL, the global model is found by aggregating the locally trained models. Those local models are trained with local data sets at local devices. To preserve data privacy, only model parameters, instead of data, are exchanged. In both distributed learning and federated learning, parallel model training on various machines will increase communication efficiency and data privacy.
To satisfy the rising demands of AI-based ITS applications, VFC exploits the large computational power obtained by aggregating the resources of individual vehicles and of other devices in the ITS infrastructure. The architecture of VFC follows a hierarchical 3-layer functional architecture consisting of a device layer, a fog layer and a cloud layer, as shown in Fig. 1.
The device layer includes "data owners", which are devices equipped with sensors/actuators that collect raw data of any kind, and "data users", which are end-user devices that generate service requests. The cloud layer consists of local system servers and has a connection to the Internet cloud. The local servers are mainly used to store large amounts of raw or processed data for further analysis and are occasionally used to process instant services. The middle layer, or fog layer, consists of fixed fog nodes, i.e., mini local servers attached to cellular base station (BTS) units or DSRC RSUs, and of mobile fog nodes, i.e., OBUs on vehicles. The fog layer performs computation tasks for services that need an instant response to service requests from data users. VFC must deal with sophisticated orchestration of heterogeneous resources, considering that their availability fluctuates due to resource mobility. Fog nodes may arbitrarily join or leave and offer variable communication quality in terms of bandwidth and reliability. Moreover, the interconnections among fog nodes, data owners and data users are based on heterogeneous transmission technologies and protocols. As such, VFC's architectural design must meet the following requirements:
- Connect, interoperate, and monitor heterogeneous resources,
- Manage the availability of distributed resources for efficient usage,
- Orchestrate the information flows from data owners, the computing tasks on fog nodes, as well as the feedback to data users.
In the following sections, we will analyze the suitability of oneM2M for realizing the VFC platform. We will provide the detailed design of oneM2M functions and services to build VFC functions, including the management of devices, resources, services, and data.
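As an illustration of the Federated Learning aggregation described earlier in this section, the following minimal sketch averages locally trained parameter vectors into a global model. It assumes a FedAvg-style weighting by local sample count; the weighting rule, the function name `federated_average`, and the flat-list model representation are assumptions made for illustration and are not taken from the oneVFC implementation.

```python
from typing import List, Sequence


def federated_average(local_weights: Sequence[Sequence[float]],
                      sample_counts: Sequence[int]) -> List[float]:
    """Aggregate local model parameters into a global model.

    Assumption: each local model is a flat list of floats, and its
    contribution is weighted by the number of local training samples.
    Only parameters are exchanged; raw data never leaves the workers.
    """
    if not local_weights or len(local_weights) != len(sample_counts):
        raise ValueError("need exactly one sample count per local model")
    total = sum(sample_counts)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(local_weights[0])
    global_weights = [0.0] * dim
    for weights, count in zip(local_weights, sample_counts):
        if len(weights) != dim:
            raise ValueError("all local models must have the same shape")
        for j, value in enumerate(weights):
            global_weights[j] += value * count / total
    return global_weights


if __name__ == "__main__":
    # Two workers; the second one trained on three times as many samples.
    print(federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3]))
```

In a VFC setting, the local training would run on the fog worker nodes, while a function of this kind would run on the node that collects the parameters.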

III. ONEM2M BASED ARCHITECTURE FOR VFC
oneM2M is an IoT middleware standard for the interoperability of heterogeneous M2M/IoT systems in different IoT service domains, such as smart transportation and smart city, and it is increasingly used in commercial deployments [21]. It supports a rich set of Application Programming Interfaces (APIs) for data exchange among things, machines, and humans. The role of a middleware in an IoT/IoV networking system is to provide storage and information processing services. A oneM2M-based architecture for the VFC platform will facilitate the integration of already existing ITS systems, and can leverage various authentication and authorization schemes, an important asset for providing a secure system [29].
oneM2M defines five types of nodes: infrastructure nodes (INs), middle nodes (MNs), application service nodes (ASNs), application dedicated nodes (ADNs) and non-oneM2M device nodes (NoDNs). A node's functions are classified into three layers, namely the application layer, the common service layer and the network service layer, but not every node has all layers. Each layer has its own entities:
- application entities (AEs) stand for the applications in devices, gateways, or servers, containing the business logic of service applications,
- the common service entity (CSE) is a set of common service functions for M2M/IoT services, allowing messages to be communicated coherently, regardless of the underlying network layer,
- network service entities (NSEs) are the underlying network services that are available to the CSE.
oneM2M defines two domains: the infrastructure domain and the field domain. INs reside in the infrastructure domain, while the field domain contains the other node types. INs correspond to cloud platforms or servers of the application system. MNs can be field gateways, which usually have sufficient resources and support both field protocols and IoT protocols. ASNs or ADNs can be smart objects or on-vehicle devices, both supporting general IoT protocols. ASNs differ from ADNs in two aspects. ASNs should have sufficiently rich resources, whereas ADNs have constrained resources. ASNs can relay information to other ASNs or ADNs, whereas ADNs cannot. Consequently, an ASN contains both the AE and CSE layers, whereas an ADN has only an AE layer. NoDNs correspond to devices with attached sensors and actuators, for which protocol conversion via a gateway (MN) is required to join a oneM2M-based IoV.
oneM2M adopts the Resource Oriented Architecture (ROA) model, where all devices and related information are handled as resources using a hierarchical structure. A resource, a uniquely addressed entity, can be transferred and manipulated using create, read, update and delete (CRUD), the basic operations of a RESTful architecture. Access to a resource is provided through different types of Common Service Functions (CSFs), like Subscription and Notification, and Discovery (possibly with predefined filter criteria). Any function/service in oneM2M is implemented as a resource and a procedure. A resource can be the data themselves, e.g., text files, images, or the energy level of a physical device, as well as the monitored states of an executed process/task, like execution time, completion time, etc. This enables a flexible representation for a wide range of data or information.
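The resource-oriented view above can be illustrated with a toy in-memory resource tree supporting CRUD and a prefix-based discovery. This is a simplified sketch only: the class `ResourceTree`, the slash-separated addressing, and the example resource names are assumptions for illustration and do not reproduce the oneM2M resource types or any oneM2M API.

```python
from typing import Any, Dict, List


class ResourceTree:
    """Toy hierarchical store: each resource has a unique path address."""

    def __init__(self) -> None:
        self._resources: Dict[str, Any] = {}

    def create(self, path: str, value: Any) -> None:
        if path in self._resources:
            raise KeyError(f"resource {path!r} already exists")
        parent, sep, _ = path.rpartition("/")
        if sep and parent not in self._resources:
            raise KeyError(f"parent resource {parent!r} does not exist")
        self._resources[path] = value

    def read(self, path: str) -> Any:
        return self._resources[path]

    def update(self, path: str, value: Any) -> None:
        if path not in self._resources:
            raise KeyError(path)
        self._resources[path] = value

    def delete(self, path: str) -> None:
        """Delete a resource together with all of its descendants."""
        if path not in self._resources:
            raise KeyError(path)
        doomed = [p for p in self._resources
                  if p == path or p.startswith(path + "/")]
        for p in doomed:
            del self._resources[p]

    def discover(self, prefix: str) -> List[str]:
        """Return the addresses of a resource and everything below it."""
        return sorted(p for p in self._resources
                      if p == prefix or p.startswith(prefix + "/"))


if __name__ == "__main__":
    tree = ResourceTree()
    tree.create("worker-cse", None)
    tree.create("worker-cse/CNT-MONITOR", None)
    tree.create("worker-cse/CNT-MONITOR/state", {"cpu": 0.4})
    tree.update("worker-cse/CNT-MONITOR/state", {"cpu": 0.7})
    print(tree.discover("worker-cse"))
```

The example stores a monitored state (a hypothetical CPU load) as a resource, which mirrors the statement that both data and process states can be represented as resources.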
A oneM2M-based system has a tree-based architecture rooted at the infrastructure node. Information exchange between two M2M nodes uses the transport and connectivity services of the underlying networks. The routing information of a CSE, the CSE-Point of Access (PoA), depends on the characteristics of the underlying networks and is provided by the CSE at the registration phase. The CSE-PoA is considered equivalent to the routable addresses of the targeted CSE. Besides, a oneM2M node can have more than one underlying network, such as low-latency DSRC or 5G for outdoor environments, or mmWave for indoor environments. For example, on vehicle OBUs, both DSRC and 5G are used to enable parallel V2V and V2I communications [30]. Multiple underlying networks can be used simultaneously to route QoS-differentiated traffic. The multiple differentiated CSE-PoAs will be defined during the development and implementation process.
IoT middleware platforms are classified by interoperability level: technical, syntactic and semantic [31]. Whereas oneM2M's focus is on technical and protocol interoperability, with some efforts to tackle semantic interoperability, FIWARE, Oracle Fusion, Azure IoT, and Amazon Web Services (AWS) [22], [32] are more focused on data models providing a semantic interoperability framework, so that applications and services may easily consume information and/or trigger actions in various systems, including oneM2M-based IoT systems [31]. They are cloud-based platforms, which do not fit into VFC, where data processing services are performed at edge nodes. However, they can be seen as layers above the oneM2M-based systems that provide processed data results for use in a wide range of applications.
VOLUME 9, 2021
Previously, oneM2M has been studied to set up a fog channel for data exchange between two fixed fog nodes. The data channel is based on another protocol that can help to deliver data more quickly [33]. In our work, several data channels between the data owner and several worker nodes are established in parallel and are coordinated to obtain service results which are submitted to the fog manager.

B. oneVFC - oneM2M-BASED ARCHITECTURE
The 3-layer VFC architecture is hierarchical. Various computing resource nodes, possibly reached over multiple hops, are rooted at an access gateway (a cellular BTS unit or a DSRC RSU), as shown in Fig. 1. The hierarchical architecture of the proposed oneVFC is scalable and appropriate for local view-based data analytics in ITS systems. The access gateways are called fog manager nodes, and the computing devices on vehicles are called fog worker nodes. A fog manager node is responsible for managing the worker nodes in a given geographical area. The management of computation and communication resources, the task assignment and the resource allocation are performed by the fog manager nodes. The worker nodes perform the computation jobs assigned by the manager nodes. Some worker nodes can relay information back and forth between the manager and other worker nodes. However, task deployment on far-away worker nodes is only suitable for delay-tolerant services.
The VFC structure shows similarities with the architecture of oneM2M in which a middle node (MN) can manage many end-devices (ASNs or ADNs) under its responsibility, while the system server at cloud layer can act as IN. As shown in Fig. 2, we present a oneM2M-based architecture for the VFC platform. In this architecture, a fog manager node takes the role of MN. Vehicular computing nodes, fog worker nodes, can be set up as ASNs. Since an ASN has a CSE layer, containing service functions for messages being exchanged among oneM2M nodes, a fog worker node can communicate control and data messages. Hence, a computing device attached to a bus can take the role of ASN since it travels regularly along some specified routes helping to fill the connectivity gap (by relaying information/data) in vehicle networks as well as to execute computation jobs.
Other end devices like cameras, traffic lights or IoT-based data collection systems are connected to the gateway-MN as ADNs or NoDNs to provide data for AI-based big data analytic applications. Fig. 2 presents the oneM2M-based functional structures of roadside gateways, computing devices on vehicles, data users like smartphone users, and data owners/generators like cameras, sensors, actuators in ITS applications.
Maintaining the communication links between any pair of nodes is the responsibility of the underlying networks which can be DSRC or 5G networks. To reduce the transmission latency, a fog manager node should be attached to a DSRC-RSU or a 5G BTS.

IV. RESOURCES AND PROCEDURES FOR DISTRIBUTED TASK COMPUTATION MANAGEMENT
In general, compute service requests can come from any node, but only the manager node is able to process a service request, assign subtasks to a number of fog worker nodes, and return the aggregated results to the data user who requested the service. From now on, the terms "service" and "task" are used interchangeably. A service request processing procedure is composed of 4 steps:
- Step 1: a data user sends a service request to the manager.
- Step 2: the manager processes the service request by
• Assigning subtasks to worker nodes: the manager finds the list of suitable computing nodes for doing the requested job. Note that this sub-step is optional: it is needed when the computing service has to be divided efficiently into subtasks for various worker nodes. In some applications, the group of worker nodes is already known from the business logic of the application, e.g., in the distributed learning or FL approach for AI model training.
• Sending notifications to workers: the manager sends notifications to the worker nodes about the data and the execution program.
- Step 3: the workers download the data, execute the computation task, and send back the results to the manager node. The data can be images, sensor-based measurements, or text files representing the AI model parameters, which are all considered input parameters for the computation task.
- Step 4: the manager aggregates the results and sends this aggregation to the data user who requested the service.
We design an AE-MANAGE, located at the manager node, to perform service allocation and coordination of multiple subtasks. We design an AE-COMPUTE at every worker node to call the specified service application programs whenever the information about the service is received at the worker node. The flow of control and data messages between manager and workers is supported by the CSE layer in these nodes. Fig. 3 depicts the AEs and containers inside each manager's CSE and worker's CSE. The resource trees in a manager node (left) and a worker node (right) consist of the essential containers for task execution and management, and for worker node resource monitoring. We propose the following containers in the CSE layer:
- CNT-SERVICE: It is responsible for service-related information exchange. It receives the service requests from, and feeds the results back to, the end users requesting services.
- CNT-MONITOR: It is responsible for updating the worker states to the manager, e.g., available CPU, RAM, and energy, or a representation of them in terms of processing time.
- CNT-EXECUTION: It is responsible for sending commands and data to the application programs at the worker nodes and for sending back the obtained sub-results from the worker to the manager.
- CNT-DATA: It is used for data storage.
The operation of a worker node contains two phases: the initial (registration) phase and the computation phase. In the initial phase, the worker-CSE must successfully register to the manager-CSE to notify the manager about the worker's state. After registration, the manager AE-MANAGE subscribes to all needed containers in the worker-CSEs (SERVICE, MONITOR, EXECUTION) through the publish/subscribe mechanism. Hence, whenever a container's content changes (e.g., service requests/responses, execution requests/responses, worker node states), notifications are sent to the AE-MANAGE for the purpose of task allocation and subtask coordination. The worker AE-COMPUTE subscribes to the CNT-EXECUTION at its CSE so that it will be notified about the subtask data and the subtask application program.
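The registration phase and the publish/subscribe mechanism described above can be sketched with plain Python objects. This is a minimal in-process illustration: the classes `Container` and `ManagerAE` and the function `register_worker` are hypothetical stand-ins, and no oneM2M messages or bindings are involved.

```python
from typing import Any, Callable, Dict, List, Tuple

Callback = Callable[[str, Any], None]


class Container:
    """Toy stand-in for a container whose content changes trigger notifications."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.instances: List[Any] = []
        self._subscribers: List[Callback] = []

    def subscribe(self, callback: Callback) -> None:
        self._subscribers.append(callback)

    def add_instance(self, content: Any) -> None:
        # Store the new content, then notify every subscriber.
        self.instances.append(content)
        for notify in self._subscribers:
            notify(self.name, content)


class ManagerAE:
    """Plays the role of AE-MANAGE: collects notifications from workers."""

    def __init__(self) -> None:
        self.inbox: List[Tuple[str, Any]] = []

    def on_notify(self, container_name: str, content: Any) -> None:
        self.inbox.append((container_name, content))


def register_worker(manager: ManagerAE,
                    containers: Dict[str, Container]) -> None:
    """After registration, the manager subscribes to the worker's containers."""
    for container in containers.values():
        container.subscribe(manager.on_notify)


if __name__ == "__main__":
    manager = ManagerAE()
    worker = {name: Container(name)
              for name in ("CNT-SERVICE", "CNT-MONITOR", "CNT-EXECUTION")}
    register_worker(manager, worker)
    worker["CNT-MONITOR"].add_instance({"cpu": 0.35, "ram_free_mb": 512})
    print(manager.inbox)
```

The state fields `cpu` and `ram_free_mb` are illustrative; the source only states that CPU, RAM and energy information are reported.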
In the computation phase, a 4-step procedure is activated whenever a service request arrives. In this procedure, the manager and worker nodes exchange control and data messages and execute the sub-service programs at the worker nodes. As shown in Fig. 4, the four steps are the following:
- The manager AE-MANAGE receives the notification about a service request from a data user AE. Based on that, it either extracts the payload or executes the task assignment and resource allocation (TARA) module to find the list of candidate computing nodes to process the subtasks and the corresponding subtask workloads. This information is sent to the data owner for data preparation.
- The manager AE-MANAGE sends to this group of worker nodes the notifications including the data resources and the execution program resources.
- The worker AE-COMPUTE receives the notifications, reads the information on the resources required for its part of the application program execution, and calls the service program. The worker node can download the data from the data owner if needed.
- The worker AE-COMPUTE executes the application program. After completion of the computation job, the worker nodes notify the results to the manager AE-MANAGE through the CNT-EXECUTION response.
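The split, compute and aggregate pattern behind these four steps can be sketched as follows. The sketch makes several assumptions that are not in the paper: workers are simulated by threads on one machine, the workload is split evenly in place of the TARA module, and the names `split_evenly` and `handle_service_request` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, List, Sequence


def split_evenly(data: Sequence[Any], n_workers: int) -> List[List[Any]]:
    """Split data into n_workers contiguous chunks whose sizes differ by at most one."""
    if n_workers <= 0:
        raise ValueError("n_workers must be positive")
    base, extra = divmod(len(data), n_workers)
    chunks, start = [], 0
    for i in range(n_workers):
        size = base + (1 if i < extra else 0)
        chunks.append(list(data[start:start + size]))
        start += size
    return chunks


def handle_service_request(data: Sequence[Any],
                           job: Callable[[List[Any]], Any],
                           aggregate: Callable[[List[Any]], Any],
                           n_workers: int) -> Any:
    """Manager side: assign subtasks, run them in parallel, aggregate results."""
    chunks = split_evenly(data, n_workers)           # subtask assignment
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        sub_results = list(pool.map(job, chunks))    # workers execute subtasks
    return aggregate(sub_results)                    # result aggregation


if __name__ == "__main__":
    # Toy job: each worker sums its chunk, the manager sums the sub-results.
    print(handle_service_request(list(range(10)), sum, sum, n_workers=3))
```

In oneVFC the subtasks run on separate fog worker nodes and the exchange goes through the CNT-EXECUTION containers; the thread pool only mimics the parallelism.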
Multiple types of services addressed by pairs of serv_ID and source_ID can be processed at the AE-MANAGE.
To load data for a computation job, a worker AE-COMPUTE can use a discovery mechanism with the uniform resource identifier (URI) link provided by the manager AE-MANAGE. Our platform supports two different kinds of data transfer: i) data can be piggybacked in oneM2M packets; ii) data link resources are posted in oneM2M packets, and the receivers download the data through other protocols.
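The two data transfer modes can be illustrated with a small message builder and loader. The message layout below is purely hypothetical (only the identifiers `serv_ID` and `source_ID` come from the text; the fields `data` and `data_link`, the base64 encoding, and both function names are assumptions), and the download step is passed in as a callable so that any protocol could be plugged in.

```python
import base64
from typing import Callable, Dict, Optional


def make_notification(serv_id: str, source_id: str, *,
                      payload: Optional[bytes] = None,
                      data_link: Optional[str] = None) -> Dict[str, str]:
    """Build a notification that either carries the data or a link to it."""
    if (payload is None) == (data_link is None):
        raise ValueError("provide exactly one of payload or data_link")
    message = {"serv_ID": serv_id, "source_ID": source_id}
    if payload is not None:
        # Mode i): data piggybacked in the message itself.
        message["data"] = base64.b64encode(payload).decode("ascii")
    else:
        # Mode ii): only a link is posted; data is fetched separately.
        message["data_link"] = data_link
    return message


def load_data(message: Dict[str, str],
              download: Callable[[str], bytes]) -> bytes:
    """Worker side: decode piggybacked data or download it via the link."""
    if "data" in message:
        return base64.b64decode(message["data"])
    return download(message["data_link"])


if __name__ == "__main__":
    inline = make_notification("detect", "cam-1", payload=b"abc")
    print(inline, load_data(inline, download=lambda url: b""))
```

Piggybacking avoids a second connection for small inputs, whereas posting a link keeps large inputs such as image bundles out of the control messages.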
The CNT-MONITOR regularly monitors the worker node's state and will communicate updates to the manager-AE-MANAGE through the publish/subscribe mechanism. Node state includes RAM, CPU, energy level, current processing workload, service processing time, etc. Each worker node has its own policies for sharing its resources when joining the oneVFC platform. The MONITOR containers will make the resource capability information available to the task assignment and resource allocation (TARA) algorithm.
The TARA algorithm is executed every time the manager node receives a new service request. The TARA execution results in the subtask assignments to a group of the worker nodes. The assignments are adapted to the status of worker nodes, e.g., the availability of nodes for a group of specified services, the workload being processed, the node configuration (such as CPU speed, RAM) etc., as well as to the target being pursued with regard to service processing latency, energy consumption, etc. Different resource allocation approaches proposed in literature can be adopted in the TARA algorithm. In the following section, we propose a system model of task assignment and resource allocation to minimize service processing latency subject to the limited computing resources.

V. TASK ASSIGNMENT AND RESOURCE ALLOCATION (TARA)
A. SYSTEM MODEL FOR TASK ASSIGNMENT AND RESOURCE ALLOCATION (TARA)
A fog manager node will "manage" a set of computing resources called worker nodes, denoted by $R$. Those worker nodes have previously registered to the fog manager. The number of available worker nodes in $R$ can vary because the manager must filter out worker nodes which are no longer available. The manager node learns this through regular updates on the status of the worker nodes in $R$. The manager tries to use the available computation resources of the worker nodes under its supervision (the nodes in $R$) to minimize the processing time of the requested service.
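Filtering out unavailable worker nodes on the basis of regular status updates can be sketched as below. The timeout rule is an assumption (the text only says that availability is learned from regular updates), and the `WorkerState` fields and the function name `available_workers` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class WorkerState:
    node_id: str
    last_update: float       # time of the last status update, in seconds
    pending_workload: float  # workload still being processed on the node


def available_workers(states: Sequence[WorkerState],
                      now: float,
                      timeout: float) -> List[WorkerState]:
    """Keep only the workers whose last update is at most `timeout` seconds old."""
    return [s for s in states if now - s.last_update <= timeout]


if __name__ == "__main__":
    registered = [
        WorkerState("obu-1", last_update=95.0, pending_workload=0.0),
        WorkerState("obu-2", last_update=60.0, pending_workload=20.0),
    ]
    # With a 10 s timeout at t = 100 s, only obu-1 remains available.
    print([s.node_id for s in available_workers(registered, 100.0, 10.0)])
```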
A data user $k$ can request the fog manager node to execute an aggregated task, defined as a bundle of data $W_k$ and a specified job to work on that data. Note that the service requesting node can be one of the worker nodes. The bundle of data can be subdivided to be processed in parallel at several nodes. In other words, the task can be divided into smaller subtasks $\{w_{ki} = p_{ki} W_k\}$ to be handled by the set of worker nodes $R = \{n_i\}$ managed by the manager node; the results are then aggregated by the manager and returned to the original requesting node $k$. Hence, the set $\{p_{ki}\}$ represents the partitions of the bundle of data requested by node $k$, which will be processed by the set of worker nodes $\{n_i\}$.
The subservice completion time at node $i$, denoted $t_{ki}$, consists of three components:
- $t_{ki}^{trans}$, the workload transmission delay, is the time to relay data from the data owner $k$ to the worker node $i$.
- $t_{ki}^{proc}$, the subtask processing delay, is the time to process the task at the worker node $i$.
- $t_{ki}^{result}$, the result response transmission delay, is the time to transfer the output analysis results to the manager or to the node that requested the service.
The system performance depends on the system configuration, e.g., the bandwidth of the transmission link for data/result transmission, and the CPU speed, RAM storage, etc. of the computing resource. These relationships are formulated in the following analysis.
(1) The subtask workload transmission time

The time needed to transfer the data of a subtask from data owner $k$ to worker node $i$ is the ratio of the amount of data of the subtask to the transmission rate $B_{ki}$ between the two nodes:

$$t^{trans}_{ki} = \frac{p_{ki} W_k}{B_{ki}}$$

Node $k$ can also perform the entire task itself; then $i = k$ and $B_{ki} = \infty$, so $t^{trans}_{ki} = 0$.

(2) The subtask computation time

$$t^{proc}_{ki} = \frac{p_{ki} W_k}{f_i} + \tau_{ki} \frac{N_i}{f_i}$$

Here, $\frac{p_{ki} W_k}{f_i}$ represents the processing time of the $w_{ki}$ (which equals $p_{ki} W_k$) data at node $i$, with $f_i$ being the processing speed. At any given time, node $i$ may still be processing a certain amount of data of a previously assigned task, denoted by $N_i$, for which the processing time is $\tau_{ki} \frac{N_i}{f_i}$, where $\tau_{ki}$ can be 0 or 1. If $\tau_{ki} = 1$, a previously scheduled task is currently being executed on node $i$, and the subtask computation time includes both the processing time of the newly assigned task and that of the task being processed. If $\tau_{ki} = 0$, there is no previous task, and the newly assigned task from node $k$ can be processed immediately.

(3) The subresult transmission time

After the subtask is completed, node $i$ sends the results back to node $k$, the node that originally issued the service request. This delay is the ratio of the result message size $R_{ik}$ to the transmission rate $B_{ik}$ between the two nodes:

$$t^{result}_{ki} = \frac{R_{ik}}{B_{ik}}$$

(4) The subtask completion time

Therefore, the completion time of a subtask requested by node $k$ and performed by node $i$ can be expressed analytically as follows:

$$t_{ki} = t^{trans}_{ki} + t^{proc}_{ki} + t^{result}_{ki} = \frac{p_{ki} W_k}{B_{ki}} + \frac{p_{ki} W_k + \tau_{ki} N_i}{f_i} + \frac{R_{ik}}{B_{ik}} \tag{5}$$

Whenever the manager node receives a service request with a workload of $W_k$, the TARA module is executed to determine the number of subtasks, the workload of each subtask, and the list of suitable worker nodes. Since the subtasks are performed independently, mostly in parallel, the completion time of the service requested by node $k$ is determined by the slowest subtask:

$$t_k = \max_{n_i \in R} t_{ki} \tag{6}$$

TARA, whose objective is to minimize the service processing time, yields the following min-max optimization problem:

$$\min_{\{p_{ki}\}} \left( \max \left( t_{k1}, t_{k2}, \ldots, t_{kn} \right) \right) \tag{7}$$

subject to the following constraints:

$$\sum_{n_i \in R} p_{ki} = 1 \tag{8}$$

$$p_{ki} W_k \le \overline{W} \tag{9}$$

$$p_{ki} W_k \ge \alpha W \tag{10}$$

Constraint (8) is needed to ensure that the whole data bundle is processed. Constraint (9), in which $\overline{W}$ denotes the upper bound on the subtask size, avoids sending too much workload to a single node, which would lead to overload or imbalance. We recommend that the size of a subtask be less than or equal to the average size of all tasks in the network. Note that the fog manager node regularly monitors the currently executed workload $N_j$ of the computing resources.
Constraint (10) prevents a subtask from being so small that its processing time is much lower than its transmission time. Such a subtask would make the task allocation ineffective, with a large communication overhead and no gain in request completion time. To address this, the workload sent to a node must be no less than $\alpha W$, where the parameter $\alpha$ reflects the relative order of processing time and transmission time and can be determined experimentally.
Hence, the output is the set $\{p_{ki}\}$, which determines the data portions $\{w_{ki}\}$ assigned to the fog nodes managed by the manager.
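The delay model and the constraints described above can be illustrated with a short Python sketch. It is not part of oneVFC: the `Worker` class, the function names, and the units (bytes and bytes per second) are assumptions made for illustration. Two further assumptions are that a worker with a zero share does not delay the service, and that the lower bound $\alpha W$ is taken relative to the requested workload $W_k$.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Worker:
    rate_in: float        # B_ki: rate from data owner k to this worker (bytes/s)
    rate_out: float       # B_ik: rate for returning the result (bytes/s)
    speed: float          # f_i: processing speed (bytes/s)
    backlog: float = 0.0  # N_i: workload still being processed (bytes); 0 means tau_ki = 0


def subtask_time(p: float, W: float, w: Worker, result_size: float) -> float:
    """Completion time t_ki of the share p of workload W on worker w."""
    if p == 0:
        return 0.0  # assumption: an unused worker does not contribute to the delay
    trans = p * W / w.rate_in              # workload transmission delay
    proc = (p * W + w.backlog) / w.speed   # backlog plus newly assigned subtask
    result = result_size / w.rate_out      # result transmission delay
    return trans + proc + result


def service_time(parts: List[float], W: float, workers: List[Worker],
                 result_size: float) -> float:
    """Service completion time: the slowest subtask determines the total."""
    return max(subtask_time(p, W, w, result_size) for p, w in zip(parts, workers))


def feasible(parts: List[float], W: float, max_size: float, alpha: float,
             tol: float = 1e-9) -> bool:
    """Check the three constraints: full coverage, upper bound, lower bound."""
    if abs(sum(parts) - 1.0) > tol:
        return False
    # assumption: the size bounds apply only to workers that receive a share
    return all(p == 0 or alpha * W <= p * W <= max_size for p in parts)


# The data owner processes locally (infinite rate); the second worker is remote.
workers = [Worker(float("inf"), float("inf"), 100.0),
           Worker(50.0, 50.0, 100.0, backlog=100.0)]
print(service_time([0.6, 0.4], 1000.0, workers, result_size=10.0))
```

In this hypothetical setting the remote worker dominates the service completion time, because it pays both the transmission delay and the delay of its backlog.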

B. TARA IMPLEMENTATION IN oneVFC
In the current version of oneVFC, we apply the particle swarm optimization algorithm [34] to the min-max optimization problem defined above and obtain the task assignments for the worker nodes. However, other approaches for the TARA model and other algorithms can be implemented in the manager node as well.
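To make the idea concrete, the following is a minimal particle swarm sketch for the min-max partitioning problem. It is not the oneVFC implementation: the function names, the inertia and acceleration coefficients (0.7 and 1.5), the swarm size, and the normalisation used to keep the shares summing to one are all assumptions. For brevity, the sketch neglects the result transmission time and does not enforce the size bounds on the subtasks.

```python
import random


def completion_time(parts, W, rates, speeds, backlogs):
    """Slowest subtask; result transmission is neglected (assumption)."""
    return max(p * W / b + (p * W + n) / f
               for p, b, f, n in zip(parts, rates, speeds, backlogs))


def normalise(x):
    """Map an arbitrary position to non-negative shares that sum to one."""
    x = [max(v, 1e-9) for v in x]
    s = sum(x)
    return [v / s for v in x]


def pso_partition(W, rates, speeds, backlogs, particles=30, iters=200, seed=0):
    rng = random.Random(seed)
    n = len(rates)

    def cost(x):
        return completion_time(normalise(x), W, rates, speeds, backlogs)

    # The first particle is the equal split, so the result is never worse than it.
    pos = [[1.0 / n] * n] + [[rng.random() for _ in range(n)]
                             for _ in range(particles - 1)]
    vel = [[0.0] * n for _ in range(particles)]
    best = [p[:] for p in pos]
    best_cost = [cost(p) for p in pos]
    g = min(range(particles), key=best_cost.__getitem__)
    gbest, gcost = best[g][:], best_cost[g]

    for _ in range(iters):
        for i in range(particles):
            for d in range(n):
                vel[i][d] = (0.7 * vel[i][d]
                             + 1.5 * rng.random() * (best[i][d] - pos[i][d])
                             + 1.5 * rng.random() * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            c = cost(pos[i])
            if c < best_cost[i]:
                best[i], best_cost[i] = pos[i][:], c
                if c < gcost:
                    gbest, gcost = pos[i][:], c
    return normalise(gbest), gcost


# Hypothetical example: the data owner (infinite rate) plus two remote workers.
shares, t = pso_partition(1000.0, [float("inf"), 50.0, 50.0],
                          [100.0, 100.0, 100.0], [0.0, 0.0, 0.0])
print(shares, t)
```

With these hypothetical parameters, the analytical optimum assigns 60% of the workload to the data owner and 20% to each remote worker, which equalises the three subtask completion times.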
The transmission rate B between any two nodes and the processing speed f (in Equation (5)) of a given piece of hardware are usually specified as nominal values, which in general do not reflect reality. Therefore, to estimate these parameters, which depend on the specific hardware (e.g., processors, network switch, cables), we run several preliminary tests using the oneVFC platform. In these tests, a task with the requested workload $W_k$ is deployed on one worker node, and the transmission and computation delays are measured. Approximation functions describing the relationship between the workload and the transmission and computation delays are obtained statistically. Based on these approximation functions, the transmission rate B and the processing speed f in Equation (5) are estimated experimentally.

VI. TESTBED EVALUATION AND RESULTS
To demonstrate that oneVFC functions properly, we deploy two use cases and analyze their performance:
- AI/DL-based model exploitation: an object detection application based on a CNN model;
- AI/DL model training: a CNN model training process following the FL approach.
The CNN model used is mAlexNet [35]. mAlexNet is a compact version of AlexNet [36], an early and well-known DL model for object detection and object recognition. mAlexNet has fewer convolutional layers and fewer parameters than AlexNet, trading accuracy for lower computation cost. It is therefore better suited to binary (two-class) object detection. We have chosen mAlexNet because of its reasonable computation load on the Raspberry Pi hardware in our testbed.
Both computing services require two types of input components: i) data sets, i.e., the collections of images, videos; and ii) the model, generally represented in text files containing weight values and model parameters. The computing services can be distributed to various machines of the VFC platform to be processed in parallel. Afterwards, the results are collected and combined.

A. TEST-BED SETUP AND PRELIMINARY TESTS
To verify the proposed architecture and design, a testbed was built as a scale model of reality. The testbed includes a fog manager node and five fog worker nodes. Due to the unavailability of 5G or DSRC communication, the underlying network supporting the communication among nodes is Wifi-based. The hardware and software configurations are described in Table 1. The deployment inside our laboratory room is shown in Fig. 5. We also run several experiments in an outdoor environment (a university parking space) with a Wifi MESH solution.
The transmission rate B between any two nodes and the processing speed f (in Equation (5)) of specific hardware are usually given as nominal values, which in general do not reflect real operation. Therefore, to better estimate these hardware-dependent parameters (which depend on, e.g., processors, network switch, cables), we run several preliminary tests on our oneVFC platform. A specific workload is requested from a given worker node (a Raspberry Pi kit), and the transmission and computation delays are measured. The final results are the average over 30 identical and independent service requests. The service interarrival interval is chosen large enough to allow the image processing to complete before a new service is requested. Fig. 6 shows the computation time on the worker nodes and the data preparation and transmission time as a function of the number of images processed by the mAlexNet-based detection application (varying from 100 to 500 images of about 200 KB each). From these measurements, estimates of the average transmission and computation delays as a function of workload are obtained. Based on those, the transmission rate B and the processing speed f in Equation (5) are estimated experimentally.
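One simple way to turn such delay measurements into an effective rate is a least-squares line fit of delay against workload, where the inverse of the slope gives the rate. The sketch below assumes a linear relationship with a constant offset; the function names and the sample values are hypothetical and are not the measurements reported in Fig. 6.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def estimate_rate(workloads, delays):
    """Effective rate in workload units per second (inverse of the slope)."""
    _, slope = fit_line(workloads, delays)
    return 1.0 / slope


# Hypothetical averaged measurements: number of images vs. delay in seconds.
images = [100, 200, 300, 400, 500]
computation_delay = [7.1, 12.2, 16.8, 22.1, 27.0]    # illustrative values only
transmission_delay = [2.4, 4.1, 6.2, 7.9, 10.1]      # illustrative values only

f_estimate = estimate_rate(images, computation_delay)    # processing speed (images/s)
B_estimate = estimate_rate(images, transmission_delay)   # transmission rate (images/s)
print(f_estimate, B_estimate)
```

Multiplying the estimated rates by the average image size converts them into bytes per second, if the delay model is expressed in data volume rather than in number of images.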

B. CNN-BASED IMAGE PROCESSING APPLICATION DEPLOYMENT ON VFC
We evaluate the performance of the vision-based object detection application on the lab-scale testbed. The image data set of a thousand images of about 2 MB each is located at worker node #1 depicted in Fig. 4. This data owner node sends its request to the manager node, and the manager node runs the TARA module to divide the requested task into subtasks to be performed jointly by several other worker nodes. The flowchart of request processing is depicted in Fig. 4. The procedures in a computing worker node are like those executed in worker #2 in Fig. 4.
The test scenarios are set up with the following parameters:
- The workload volume of a service request, represented as the number of processed images, varies from 100 to 500 images.
- The service request interarrival times are modelled through a uniform or exponential distribution with a mean ranging from 5 to 20 seconds.
- The total number of worker nodes taking up the computation jobs created by a data owner (itself a worker node) varies from 1 to 5 nodes.
The performance parameter under study is the service completion time, calculated as the time between the moment the service request arrives at the manager and the moment the manager receives the results. The performance measures are averaged over 30 sequential service requests. The service completion time consists of several components: task assignment, data preparation, control message transmission to the worker nodes, data transmission to the worker nodes, service computation on the worker nodes, and result message transmission to the manager node. These delay components are monitored and investigated in detail thanks to oneVFC's functionalities.
Fig. 7 clearly shows the reduction of the service completion time when service requests arrive at high pace. Indeed, when the service request interarrival times are exponentially distributed with a mean of 5 s and 10 s, the reduction is around 84% and 60%, respectively. When the service requests arrive at low pace, with a mean interarrival time of 20 s, the reduction of the service completion time is only 20%.
The delay components of the service completion time as a function of service request interarrival time and as a function of service workload (number of images processed) are depicted in Fig. 8. The service execution delay at the worker nodes has the largest impact, whereas the control information transmission between the manager and the worker nodes can be neglected. The other two delay components are the data transmission time from the data owner to the worker nodes and the time spent on data preparation (e.g., data compression before transmission). These two delay components are the so-called oneVFC overhead delays, incurred when a requested service is deployed in parallel on various computing devices. Fig. 9 shows the delay components of the service completion time for a series of 50 requests arriving at high pace. Each service request demands object detection on 300 images.
FIGURE 8. Service completion time and its components as a function of (a) service request interarrival time and (b) service workload (the number of images to be processed) with service request interarrival time equal to 10 seconds. Tests are performed on a network with 1 fog manager node and 3 fog worker nodes; the underlying network in the lab is Wifi.
Fig. 8(a) shows that when the service requests arrive at high pace, the data processing and the overhead time increase, resulting in a high response delay for the requested services. Besides, if the workload of each service request is increased gradually from 100 to 500 images per request, all delay components show an increasing trend, see Fig. 8(b). However, we observe that at rather high workload (requests of 400-500 images arriving at high pace), the data preparation time increases rather suddenly. The worker node that owns the data must do two jobs: prepare the data to be sent to the other worker nodes and execute the service program for the subtask assigned to itself. Hence, the data preparation time rises quickly as the service request workload grows. This effect is pronounced because vehicular computing devices usually have lower capability than server machines, and it should therefore be considered in task assignment algorithms.
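The influence of the request arrival pace can be illustrated with a deliberately simplified single-queue model: requests are served one after another, so a request arriving while the previous one is still being processed has to wait. The sketch below is not a model of the testbed; the function name, the fixed service times, and the exponential arrival process are assumptions used only to show why a shorter per-request service time matters most at high arrival pace.

```python
import random


def mean_completion_time(service_time, mean_interarrival, requests=30, seed=1):
    """Average time from request arrival to completion in a FIFO queue."""
    rng = random.Random(seed)
    arrival = 0.0
    free_at = 0.0   # time at which the computing resource becomes idle
    total = 0.0
    for _ in range(requests):
        arrival += rng.expovariate(1.0 / mean_interarrival)
        start = max(arrival, free_at)       # wait if the resource is still busy
        free_at = start + service_time
        total += free_at - arrival
    return total / requests


# Hypothetical service times: 12 s on one node, 4 s when shared among several nodes.
for mean_gap in (5.0, 10.0, 20.0):
    single = mean_completion_time(12.0, mean_gap)
    shared = mean_completion_time(4.0, mean_gap)
    print(mean_gap, round(100 * (1 - shared / single)), "% reduction")
```

When the mean interarrival time is shorter than the single-node service time, the backlog on the single node keeps growing, so the relative gain from distributing the work is largest in that regime.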
In our implementation, the transmission rate B and the processing speed f in (5) are estimated experimentally. Fig. 6 shows the preliminary tests used to obtain these parameters; these estimates do not take the service request arrival pace into account. The parameters might change when the service request workload gets higher or when service requests arrive very densely. They should therefore be estimated online so that they adapt to the current situation. This issue should be considered in task assignment and resource allocation when processing delay minimization is targeted (see section IV). Formulating the minimization problem under the assumption of constant bandwidth and processing speed may lead to suboptimal results. The ability of oneVFC to monitor and report the states of computing nodes as well as the delay components should be exploited to determine the relationship between overhead cost, service workload, and service request arrival rate.
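One possible way to estimate these parameters online is an exponentially weighted moving average that is updated after every completed subtask. This is only a sketch of the idea, not a feature of oneVFC: the class name, the smoothing factor, and the choice of a moving average are assumptions.

```python
class OnlineRateEstimator:
    """Exponentially weighted moving average of an observed rate."""

    def __init__(self, initial_rate, smoothing=0.2):
        if not 0.0 < smoothing <= 1.0:
            raise ValueError("smoothing must be in (0, 1]")
        self.rate = initial_rate      # e.g. the value from the preliminary tests
        self.smoothing = smoothing    # weight given to the newest observation

    def update(self, workload, delay):
        """Blend the rate observed for one subtask into the current estimate."""
        observed = workload / delay
        self.rate += self.smoothing * (observed - self.rate)
        return self.rate


# Hypothetical usage: start from an offline estimate, then track monitored delays.
f_estimate = OnlineRateEstimator(initial_rate=100.0, smoothing=0.5)
f_estimate.update(workload=50.0, delay=1.0)   # the node is slower than expected
print(f_estimate.rate)
```

A larger smoothing factor reacts faster to load changes but is more sensitive to measurement noise, so its value would have to be tuned on the monitored delay reports.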
When more nodes are willing to share their computation capabilities, the service completion time is clearly reduced, as shown in Fig. 10(a). Every delay component that contributes to the service completion time is lower. However, when 5 computing nodes perform the subtasks, the service completion time slightly increases because the data preparation and data transmission times are slightly higher. When more computing nodes join the service computation, the data owner (worker #1) must put more effort into data preparation and transmission to the other workers, resulting in a higher delay. The service deployment time on worker #1 is also higher, as can be seen in Fig. 10(b). This again confirms that the interplay between transmission bandwidth, hardware processing speed, and processed workload should be investigated carefully to achieve successful task assignment and resource allocation.
We also performed the experiment in an outdoor environment, in which two worker nodes are set up at fixed locations and one is mounted on a car travelling at an average velocity of 20 km/h in a limited area of our university campus. The underlying network is provided by a Wifi MESH solution, including 4 access points operating in the 5 GHz band. When a node is in the communication range of at least one access point, communication with the other nodes connected to other access points is ensured. The outdoor measurements are compared with the indoor Wifi ones in Fig. 11. The data transmission times to the mobile worker node increase substantially, to nearly 3 times the values measured in the indoor environment. As a result, the service completion time becomes larger than in the indoor setup; however, it is still much lower than when computing on one device. The outdoor setup is a proof of concept demonstrating that the oneVFC platform can support mobile computing nodes in outdoor environments whenever the network connections among fog nodes are guaranteed by the underlying network.

C. DEEP LEARNING MODEL TRAINING IN FEDERATED LEARNING APPROACH
Training a deep learning model can take a long time and usually requires a high-end or cloud-based server, especially when large training datasets are involved. In an FL approach, a so-called training round, depicted in Fig. 12(a), proceeds as follows: each edge node trains the model with its locally collected data to obtain a local model and sends the local model parameters to the server node; the server aggregates the received local models to improve the global model; and the server distributes the updated global model to the local devices [28], [37], [38]. These training rounds are repeated until the required model accuracy is achieved or until the maximum number of training rounds is reached.
An FL training approach does not require transferring local data to the server; only model parameters are exchanged, which preserves user data privacy, a major concern in IoT/IoV environments. Model aggregation can follow different approaches, e.g., Federated Averaging [28] or selective model aggregation depending on the quality of the data collected by travelling vehicles [39], [40]. These issues, which belong to the business logic of each FL-based training process, will be embedded in the application execution program, while the communication of model parameters between edge nodes and server will be managed by oneVFC. Possible application scenarios can involve several vehicle-worker nodes, which take the role of edge nodes and collect images or videos with their cameras while travelling on roads, and a gateway-manager working as the server node.
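The aggregation step of Federated Averaging can be sketched as a weighted mean of the local model parameters, with each local model weighted by the number of samples it was trained on. The sketch below is illustrative only: it represents a model as a flat list of floats, and the function name and the sample values are assumptions rather than the aggregation code used in the experiments.

```python
def federated_average(local_models, sample_counts):
    """Weighted average of local parameter vectors (Federated Averaging step)."""
    if len(local_models) != len(sample_counts) or not local_models:
        raise ValueError("need one sample count per local model")
    total = sum(sample_counts)
    size = len(local_models[0])
    return [
        sum(model[j] * n for model, n in zip(local_models, sample_counts)) / total
        for j in range(size)
    ]


# Hypothetical local models from two workers holding 1 and 3 samples respectively.
global_model = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
print(global_model)  # [2.5, 3.5]
```

In a training round, the server would apply this step to the parameter files received from the workers and then send the resulting global model back to them.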
We evaluate mAlexNet model training with FL, in which 3 worker nodes (Raspberry Pi kits) operate as local devices that store a local dataset of images and have the DL model program installed. The mAlexNet model parameter file is exchanged between the manager and the workers. The manager node directs the FL-based learning process and aggregates the feedback of the local devices to build the global model. The request for the FL training service contains the list of addresses of the worker nodes joining the training process and the dataset addresses in each node. At every training round, the Python code for model training is called at the workers' AE-COMPUTE, and the Python code for model aggregation is called at the manager's AE-MANAGE. The model parameter file is exchanged between the manager-CSE and the worker-CSEs through the EXECUTION containers, following the procedures described in Fig. 4. The parameters related to the experimental setup are shown in Table 2. The experimental measurements of the worker and manager operations are shown in Table 3. The computation time at the worker nodes is the largest component of a training round, whereas the data transmission time between the worker nodes and the manager is small. The learning curves in Fig. 12(b) present the accuracies of the local models and of the global model aggregated at the manager node. The accuracies gradually improve as the number of training rounds increases.

VII. CONCLUSION AND FUTURE WORK
In this work, we propose a vehicular fog computing (VFC) platform in which the computation nodes, i.e., vehicle OBUs, are exploited to support ITS services. V2X technology, with its two standards, DSRC and cellular V2X, which facilitate real-time information exchange between cars and everything else, is the main communication infrastructure for VFC. The proposed oneVFC platform is based on the oneM2M standard and has been designed to support AI/DL-based applications in ITS as well as AI/DL model training. oneVFC can manage the availability of distributed resources, orchestrate the information flows from data sources, control the computing tasks on vehicular nodes, and feed results back to the application users. oneVFC supports real-time monitoring and overhead cost assessment, which will help the task assignment algorithm adapt to the current state of the computing machines.
In lab-scale testbeds, we have demonstrated that oneVFC is able to manage the deployment of AI-based applications on various machines, resulting in a large reduction of application processing time, especially when the workload is high or when service requests arrive at high pace. Moreover, the oneVFC platform facilitates the deployment of AI/DL model training in the Federated Learning approach. Concerning VFC, this work is the first attempt to apply the oneM2M standard to design and implement a platform that coordinates various computation resources on vehicles to support AI-based applications.
oneVFC has also been tested with a mobile node participating in the computation jobs in an outdoor environment. The results show that the computation job shared with the mobile worker node is accomplished; however, the reduction of the service completion time is lower due to the larger data transmission times in the new environment and network setup. The underlying network is provided by an outdoor Wifi MESH solution due to the unavailability of DSRC or 5G-LTE. DSRC and 5G networks will soon be deployed, offering data rates of about 20-30 Mbps and over 100 Mbps, respectively, which means that the issue of large data transmission times will be solved. Hence, the efficiency of distributed computing on various worker nodes supported by our proposed oneVFC platform is confirmed.
For future work, several issues need further investigation. First, different schemes for task assignment and resource allocation should be integrated in oneVFC. Second, oneVFC should be evaluated in a professional vehicular network environment to assess the influence of variable communication conditions. Finally, the required security mechanisms to ensure data authentication and user privacy should be included, and their corresponding impact needs to be evaluated.

She has cooperated and coordinated more than 15 national and international projects. She is the (co)author of over 150 publications. Her current research interests include security and privacy protocols for the IoT, cloud and fog, blockchain, and 5G security. She has been a member of the program committee for numerous conferences and workshops and an editorial board member for Security and Communications Magazine. In addition, she has been an expert reviewer for several EU calls, since 2015.

KRIS STEENHAUT (Member, IEEE) received the master's degree in engineering sciences, the master's degree in applied computer sciences, and the Ph.D. degree in engineering sciences from Vrije Universiteit Brussel (VUB), Belgium, in 1984, 1986, and 1995, respectively. She is currently a Professor with the Department of Electronics and Informatics (ETRO) and the Department of Engineering Technology (INDI), Faculty of Engineering, VUB. She is currently involved in several national and international IoT/edge/cloud projects with industry and with academic partners in Europe, Vietnam, and Cuba. Her research interests include the design, implementation, and evaluation of wireless sensor and actuator networks for building automation, environmental monitoring, autonomous ground vehicle applications, mobility control, and smart grids, taking into account security and privacy aspects.