GeFL: Gradient Encryption-Aided Privacy Preserved Federated Learning for Autonomous Vehicles

Autonomous vehicles (AVs) are gaining popularity because of their usage in a wide range of applications, such as delivery systems, self-driving taxis, and ambulances. AVs utilize the power of machine learning (ML) and deep learning (DL) algorithms to improve their self-driving learning experiences. The sudden surge in the number of AVs raises the need for a distributed learning ecosystem to optimize their self-driving experiences at a rapid pace. Toward this goal, federated learning (FL) is beneficial, as it can create a distributed learning environment for AVs. However, traditional FL setups transfer the raw input data directly to a server, which leads to privacy concerns among end-users. The concept of blockchain helps to protect privacy, but it requires additional computational infrastructure, which increases the operational cost for the company handling and maintaining the AVs. Motivated by this, in this paper, the authors introduce the concept of gradient encryption in FL, which preserves the users' privacy without additional computation requirements. The computational power present in the edge devices helps to fine-tune the local model and encrypt the input data to preserve privacy without any drop in performance. For performance evaluation, the authors have built a German traffic sign recognition system using a convolutional neural network (CNN)-based classification system and GeFL. The simulation is carried out over a wide range of input parameters to analyze performance at scale. In the simulation results, GeFL outperforms the conventional FL-based algorithms in terms of accuracy, i.e., it is 2% higher. Also, the amount of data transferred among the devices in the network is nearly three times smaller in GeFL than in traditional FL.


I. INTRODUCTION
The surge in autonomous vehicle (AV) usage in diverse fields has increased research and development in computation hardware, sensors, communication infrastructure, and optimization algorithms. This has resulted in considerable computation and communication technology advancements, making the hardware both efficient and affordable. Simultaneously, the communication between nodes in the network has also become secure and fast. Due to this, AV companies are upgrading from the traditional centralized client-server architecture to edge-based distributed computation systems (as shown in FIGURE 1) [1]. AVs comprise many sensors through which they gather input from the environment, and an intelligent algorithm decides further actions, such as how much the steering wheel must be turned, how much to increase the speed, when to apply the brakes, and many more. From the study in [2], we can infer that AVs generate nearly 4000 GB of data daily, comprising the feed from sensors and cameras. Sensors like light detection and ranging (LIDAR) generate 10-70 MB of data per second, and a global positioning system (GPS) generates 50 KB of data per second. Sound navigation and ranging (SONAR) generates 10-100 KB of data per second, radio detection and ranging (RADAR) generates 10-100 KB of data per second, and the cameras generate 20-40 MB of data per second. A single car thus generates a vast amount of data. When we consider the whole system at scale, with nearly 10,000 AVs in an ecosystem, the traditional FL-based algorithms require highly sophisticated infrastructure to process the data and extract information. An AV's intelligence is based on the performance of its submodules [3]. Let us consider a scenario where the AV decides to increase or decrease its speed.
VOLUME 11, 2023. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
This decision is based on numerous factors. For example, if a RADAR sensor detects a pedestrian, the AV's intelligence system increases or decreases its speed based on the pedestrian's distance. In another scenario, the RADAR is used to decide whether to change the driving lane. Other sub-modules that classify traffic signs, such as bump ahead, speed limit, crossroad, and no overtaking, also contribute to determining the speed of the AV. Humans can easily make these decisions, but it is not as easy for machines, as they only understand bits and bytes. However, with the recent advancements in machine learning (ML) and deep learning (DL) algorithms, it has become feasible to design intelligent sub-modules that contribute to the central decision system and enhance the overall self-driving experience.
The popularization of AVs is reshaping the future of the automotive industry, due to which automakers have started rapid manufacturing of AVs to cope with market competition [4]. As a result, automakers produce AVs with partial intelligence, performing tasks like self-parking and applying brakes at the appropriate time to prevent accidents. Most automakers are currently at the prototype stage for fully autonomous vehicles [5], which perform every driving task without supervision. However, the partially autonomous AVs currently running on the streets face issues regarding the privacy of end-users, and the acceptance of AVs is deteriorating as a result. Furthermore, the algorithms presently working in AVs utilize end-user data comprising travel history, communication with co-passengers, and other critical information. Because of this, the data of end-users and companies are now at high risk of privacy violations.
Precautionary steps for achieving privacy in AVs, addressed in [6], can be helpful through a shared consensus mechanism. Another way to maintain privacy in the traditional AV infrastructure, where servers perform the computation, is to implement proper hashing with file encoding [7]. Other types of privacy invasion, and their solutions, are elaborated in detail by the authors of [8]. Motivated by this, the authors have identified two significant gaps: the first is handling the privacy concerns of users, and the second is decreasing the server-side computation by distributing some of the load among the edge devices.
The aforementioned privacy issues are resolved by moving the computation to the edge, which also reduces the computational overhead of the centralized server. Using this concept, the authors propose an algorithm in this paper. The data generated by the AVs are not shared with the servers and remain with the vehicles; instead, the inference computed on the data is sent to the servers. After proper data collection, the server processes the gathered data and shares the updated prediction model with the AVs to improve their overall performance. Through this, the amount of computation needed by the server decreases. Simultaneously, the proposed approach also reduces the amount of data to be transferred by the AVs to the server. A similar problem is solved in the field of personal movie recommender systems by the authors of [9].
To overcome the privacy issues and high computation in AVs, this paper proposes a model, i.e., GeFL. To demonstrate the benefits of GeFL, we simulated traffic sign classification using a convolutional neural network (CNN) and federated learning (FL) without violating privacy. GeFL is based on gradient encryption to improve the security and privacy of federated (distributed) learning algorithms [10]. As discussed above, many modules in AVs work simultaneously for efficient driving. Exploring all modules within one paper is challenging, so we have focused on the module that classifies traffic signs. For the simulation, we considered a dataset having 43 unique labels and over 50,000 images of German traffic signs [11]. These images are distributed randomly among the clients. The clients train the local model, create the gradient-encrypted images, sync up with the server, and retrieve the updated classification model. The small size of the classification model helps utilize the edge computation power, and gradient encryption maintains privacy without any overhead cost.

A. MOTIVATION
In traditional FL, local devices fine-tune the models and share data among their peers to achieve outstanding performance globally [12]. The problem with this approach is the data shared among peers: the end user's privacy is not maintained when user data are shared. Existing advanced security mechanisms like blockchain require advanced implementation skills with added computational cost. Based on these factors, the authors are motivated to put forward GeFL, which is highly efficient and does not require additional computation or implementation skills. GeFL encrypts data using the classification model itself; the output from the last convolutional layer is used as the encrypted representation to achieve robust privacy among peers. Since the model weights keep changing, retrieving the original data using algorithms like GANs becomes difficult. To demonstrate GeFL, the authors have utilized the AV submodule that classifies traffic signs.

B. CONTRIBUTIONS
Following are the major contributions of the paper.
• We propose a hybrid framework, GeFL, consisting of FL and a CNN model for privacy-aided optimization of AVs. To demonstrate the working of GeFL, we designed a decentralized traffic sign recognition system. Furthermore, to preserve end-user privacy, we utilized gradient encryption: the proposed framework encrypts images as the last convolutional layer's output.
• The amount of data transferred in the decentralized network and the accuracy of the classification model are the essential performance metrics in the simulation of GeFL. The proposed GeFL decreases the amount of data transferred threefold compared to the traditional FL model. Moreover, the accuracy of the AVs' prediction model is 2% higher in GeFL than in models using centralized training. The decrease in data transfer reduces the company's operational cost and its carbon footprint. Thus, GeFL is sustainable and economical.
• The proposed GeFL framework maintains the privacy of the end-users without a performance drop. Therefore, the proposed framework can offer the best services to its end-users without any privacy concerns.

C. ORGANIZATION
The paper consists of eight sections. Section II presents a literature review. Section III presents the system model and problem formulation. Section IV presents the CNN- and FL-based proposed GeFL model. Section V describes the threat model. Section VI defines the environment under which the simulation of GeFL for traffic sign recognition is carried out. Section VII presents the results and discussion, and finally, Section VIII concludes the paper.

II. LITERATURE REVIEW
This section discusses various state-of-the-art works carried out by researchers in the field of AVs. In recent years, numerous advancements have been implemented in AVs to improve their functional ability. In the last couple of years, researchers across the globe have done great work in the field of AVs by introducing the concept of decentralized learning, also known as FL, and have found promising results. Using FL, researchers have found solutions to problems like secure data sharing, efficient and effective driver recommendation systems, privacy preservation during disasters, fault tolerance, and traffic classification.
In 2020, the authors in [13] proposed a blockchain-empowered asynchronous FL scheme for secure data sharing on the internet of vehicles (IoV), using a directed acyclic graph and a permissioned blockchain to resolve problems like security and reliability. They introduced asynchronous FL using a node selection technique, with the adoption of deep reinforcement learning for better accuracy. Further, the authors in [14] demonstrated another use of a decentralized system in which they proposed an image classification model using FL. They utilized a binary image classification scheme on a balanced dataset and found better results than centralized system models. Then, [15] proposed a decentralized system for unique driver recommendation. They established the relationships between drivers' stress, their behavior, and driving skills using long short-term memory (LSTM)- and CNN-based models to indicate the best driver for the subsequent trip. Their proposed system is based on FL and implemented on the UAH-DriveSet dataset. They achieved an accuracy of 95% with the proposed FL-based LSTM and CNN model, enhancing the driving experience and quality for driver and passenger.
Lui et al. have proposed a data-driven pedestrian trajectory prediction method for public buildings in [16]. The authors first evaluate destination-driven pedestrian trajectory prediction (DDPTP), estimating the most likely destinations of a pedestrian with a destination classifier (DC) based on samples of trajectory images. The model then predicts future trajectories with a destination-specific trajectory model (DTM). The authors' solution was evaluated on different datasets, such as NYGC and ATC, and outperforms the existing state-of-the-art models.
In 2021, the authors of [17] focused on the fault tolerance of AV systems and proposed a byzantine fault-tolerant scheme for a decentralized AV system. According to the authors, AVs use ML techniques to improve their ability to self-drive. Therefore, they proposed a novel scheme, called byzantine-fault-tolerant decentralized federated learning (BDFL), for AVs that focuses on privacy preservation with FL together with a fault tolerance system. Further, in [18], the authors proposed a decentralized FL approach for connected AVs. They considered blockchain- and FL-based learning approaches for privacy preservation and efficient vehicular communication networking. The proposed model implements an on-vehicle local machine learning model whose updates are verified and exchanged in a distributed fashion.
The authors in [19] proposed an end-to-end FL scheme for AVs, where they validated their methodology by anticipating the steering wheel angle using on-device AI with FL. Their results show promising accuracy compared to the traditional deployment of ML/DL in the real world, without any adverse effects. Then, in [20], the authors focus on an important problem for AVs, namely traffic management. According to the authors, DL-based traffic classification models are widely used and in demand due to their ability to classify accurately, while an FL-based model additionally offers privacy preservation with collaboratively trained learning models. The authors proposed a cross-silo horizontal FL scheme that outperforms on the traffic classification task.
The authors in [21] used a DL-based system to classify traffic signs in AVs. They used the you only look once (YOLOv3) model to recognize traffic signs and achieved 100% accuracy. However, the proposed model takes an exceptionally long time to distinguish traffic signs, i.e., 36.907457, and it utilizes the conventional centralized strategy rather than distributed learning. Another shortcoming is that fewer traffic signs are considered for classification, which restricts the model's learning. Then, in [22], the authors proposed a driving control system using end-to-end DL for autonomous driving. They explored NVIDIA's CNN work and proposed an LSTM- and CNN-based model. The authors also used a Hadoop distributed file system (HDFS)-based cloud server to fetch Euro Truck simulator driving data and proposed a system that provides a driving command after processing the images. Further, the authors of [23] conducted a literature survey on direction analysis and steering angle prediction using DL for autonomous systems. They collected their own dataset, used the Udacity self-driving challenge dataset, and compared the results of the proposed deep neural network and its neurons using DeepTest with the Rambo and Chauffeur models. The authors' evaluation provides meaningful insights and clear evidence regarding the improvement of the challenges present in a self-driving car. However, the authors did not focus on security- and privacy-related issues. In [24], the authors present a blockchain-based FL system for autonomous and connected vehicles. Their blockchain approach enhances data protection but is insufficient for preserving the data privacy of the AVs due to the lack of gradient encryption. Therefore, they proposed a distributed learning scheme where the local model is trained on local data and then shared with the global model for better performance.
For the blockchain, a unique consensus algorithm, i.e., proof of federated learning (PoFL), is introduced to resist potential adversaries. Then, the authors of [25] presented an autonomous driving system using DL models. They proposed extensive use of neural networks, DL, and deep reinforcement learning methods to improve the learning ability of self-driving cars. In addition, they introduced a module-level architecture to implement self-driving cars using AI. The paper also discusses the advantages and disadvantages of AI in AVs.
The authors of [26] proposed an optimized quantum-based federated learning framework to defend against adversarial attacks in intelligent transportation systems. The proposed quantum-behaved particle swarm optimization system performed promisingly under different attacks on the FL system. The authors also discussed the experimental results in detail and conducted an extensive study of the achieved results, finding strong applications of FL in the AV domain with decent outcomes. Pokhrel et al. analyzed the design challenges of federated learning with blockchain for autonomous vehicles in [27]. They focus on privacy and efficient communications networking for AVs and find a blockchain-based FL model to be the most suitable option: an optimal, decentralized blockchain-based federated learning model that performs well for the defined objectives of privacy and communication. They also list interesting future challenges. In [28], Zeng et al. use a federated learning-based model for on-road autonomous controller design for connected and autonomous vehicles (CAVs). The authors focus on designing a control system that can accurately make real-time decisions such as frequent speed changes, stop-and-go driving, and highway merging. In the proposed framework, the learning model is trained collaboratively among a group of CAV controllers. They propose a novel dynamic federated proximal (DFP) algorithm that accounts for the mobility of CAVs, the wireless fading channels, and the unbalanced and non-independent and identically distributed data across CAVs.

III. SYSTEM MODEL AND PROBLEM FORMULATION
This section lays out the traffic sign recognition system model and the problem formulation with a privacy preservation system.
A. SYSTEM MODEL
FIGURE 2 elaborates the working of the system model. All the AVs and the server are pre-equipped with an initially trained model of the same configuration. All AVs use this model to predict traffic signs in real time. Whenever the model predicts a traffic sign incorrectly, the passenger notifies the system of the correct label. The system within the AV thus acts similarly to feedback-based training [29]. Through this feedback mechanism, an internal pipeline that invokes training takes the subsequent actions. The pipeline initiates the pre-processing of the wrongly labeled images and stores them as key-value pairs, with the image path as the key and the corresponding label from the feedback as the value. The processing of incorrectly labeled images is a batch process. After a certain number of batches have accumulated in the system, the pipeline initiates a training session. If there is a significant change in the performance metrics of the classification model, the pipeline passes the processed images into the local model, i.e., the classification model, for gradient encryption. The output of the last convolutional layer for a specific image is considered its gradient-encrypted image. Finally, the weights of the classification model, the gradient-encrypted images, and the labels are uploaded to the server for global optimization. The newly updated local model is not replicated directly into the AV; the AVs have to wait for the updated model from the server and use the previous model until then.
The role of the server is to receive the clients' updates, process them, and transfer the result to all the AVs. The server defines the new model by averaging the updated weights uploaded by the AVs. Then, the server processes the key-value pairs containing the gradient-encrypted images and the corresponding labels. After the completion of a training session on these inputs, the newly obtained weights are downloaded by every AV. The whole flow is an asynchronous process involving several listeners that act on events.
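The server-side averaging described above can be sketched in a few lines. This is a minimal, illustrative FedAvg-style mean over client weight lists, not the authors' code; the layer shapes and client count are assumptions.

```python
# Minimal sketch of the server defining the new model as the average of
# the client-uploaded weights, layer by layer (shapes are illustrative).
import numpy as np

def average_weights(client_weights):
    """Average a list of per-client weight lists, layer by layer."""
    n_layers = len(client_weights[0])
    return [np.mean([cw[i] for cw in client_weights], axis=0)
            for i in range(n_layers)]

# Example: three clients, each holding two weight tensors.
clients = [[np.full((2, 2), c), np.full((3,), c)] for c in (1.0, 2.0, 3.0)]
new_global = average_weights(clients)
```

The averaged weights then become the new global model that every AV downloads.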

B. PROBLEM FORMULATION
In GeFL, the input data comprise a range of data streams from the AVs, such as engine temperature, camera feed, sensor readings, and driver drowsiness. The optimization of the AI system is carried out at two places: the first is within the AV, and the other is at the server. The server gathers the gradient-encrypted data from the AVs, then performs the training task and shares the optimized model among its peers. The authors decided to mimic a traffic sign recognition system to simulate the proposed system model, i.e., GeFL. Initially, pre-processing is required on the input data, where the authors resized each image into a 60×60 pixel matrix through bi-linear interpolation [30]. So, let us consider x as an input image with h as its height and w as its width, and x_pre as a zero-valued 60×60 matrix.
where w_ratio and h_ratio are the resizing factors. Variables i and j are iteration indices, and p, q, r, and s are the values used in the interpolation process to create the pre-processed image x_pre.
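The resizing step above can be sketched as follows. This is a hedged rendering of the bilinear interpolation of Eqs. 1-10: each output pixel (i, j) is interpolated from its four nearest input pixels; the names w_ratio, h_ratio, p, q, r, and s follow the text, while the exact boundary handling is an assumption.

```python
# Sketch of the bi-linear resize to a 60x60 matrix x_pre.
import numpy as np

def bilinear_resize(x, out_h=60, out_w=60):
    h, w = x.shape[:2]
    h_ratio = (h - 1) / (out_h - 1)        # vertical resizing factor
    w_ratio = (w - 1) / (out_w - 1)        # horizontal resizing factor
    x_pre = np.zeros((out_h, out_w) + x.shape[2:], dtype=float)
    for i in range(out_h):
        for j in range(out_w):
            y, xc = i * h_ratio, j * w_ratio
            p, q = int(np.floor(y)), min(int(np.floor(y)) + 1, h - 1)
            r, s = int(np.floor(xc)), min(int(np.floor(xc)) + 1, w - 1)
            dy, dx = y - p, xc - r
            # weighted sum of the four neighbouring input pixels
            x_pre[i, j] = ((1 - dy) * (1 - dx) * x[p, r] +
                           (1 - dy) * dx * x[p, s] +
                           dy * (1 - dx) * x[q, r] +
                           dy * dx * x[q, s])
    return x_pre
```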
After pre-processing, the model present in each AV starts its training and optimizes itself. To do so, let us consider the parameter space of the CNN at the n-th AV as:
where W_i represents the weight matrix of the i-th layer. The output of the t-th layer for the q-th iteration is represented with ReLU [31] as the activation function:
where o_q represents the output matrix for the q-th iteration. The last layer of the prediction model is a dense layer with sigmoid [32] as the activation function, which gives the output from the last layer. The loss function [33] for the n-th AV is then computed, and after the loss is calculated, the weights are updated at the q-th iteration accordingly.
After completion of training at the AV, for gradient encryption, the input data are replaced by the output of the last convolutional layer of the CNN model. Then, the gradient-encrypted data, together with the labels and the model weights, are the inputs to the server. For example, for an image x and the model m with n as the last CNN layer, the gradient-encrypted data is denoted x_enc. The x_enc is used as an input to the dense model whose weights are updated at the server. This process is done in batches, so the training session is carried out once a specific amount of data has been collected from the AVs. The mathematical equation for the input is given as follows.
where inp is the input for the server model, W_i are the weights of the i-th model, and x_enc_i and l_i are the lists of gradient-encrypted images and corresponding labels. The weights of the dense model are updated on the server in the same way as in the AV. The mathematical equations for this are elaborated in Eqs. 11 to 18.
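The gradient-encryption step above can be illustrated concretely: the "encrypted" image is simply the activation of the model's convolutional stack m applied to x. A toy stack of two 3×3 valid convolutions with ReLU stands in for the real CNN here; the kernels and sizes are illustrative assumptions, not the authors' architecture.

```python
# Sketch of x_enc = output of the last convolutional layer for image x.
import numpy as np

def conv2d_valid(x, k):
    """Single-channel 'valid' 2-D convolution (no padding)."""
    kh, kw = k.shape
    out = np.zeros((x.shape[0] - kh + 1, x.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

def gradient_encrypt(x, kernels):
    """Run x through the conv layers only; the last layer's output is
    what the AV uploads in place of the raw image."""
    a = x
    for k in kernels:
        a = np.maximum(conv2d_valid(a, k), 0.0)  # ReLU activation
    return a

rng = np.random.default_rng(0)
kernels = [rng.standard_normal((3, 3)) for _ in range(2)]
x_enc = gradient_encrypt(rng.standard_normal((60, 60)), kernels)
```

Because the kernels change with every fine-tuning round, the mapping from x to x_enc is a moving target, which is the basis of the privacy argument in Section V.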

IV. GeFL: THE PROPOSED MODEL
In this section, the authors put forward the working of GeFL and the algorithms behind it. The solution to the defined problem is partitioned into two parts, as described in the following points.
• Federated Learning algorithm: It helps confront the AVs' privacy concerns by not sharing the real-time data collected by the AVs with the server. Instead, the encrypted data gradients are shared with the server for global optimization. In addition, the traffic sign dataset is fully distributed, covering distinct road scenarios and traffic signs of different countries. This covers different real-time road scenarios and optimizes the efficiency of individual AVs across the globe.
• CNN-based model: For traffic sign classification, a CNN-based model is considered. The model is small in size with good performance, as discussed in subsection VI-C. The model defined in subsection VI-C is used to predict traffic signs and also to perform gradient encryption of the input images. The gathered gradients are put into the optimization process to improve the model's performance. When the training session at the server is completed, each AV in the network replicates the weights present at the server.

A. CLIENT SIDE WORKING ALGORITHM
This subsection presents the algorithmic flow of the simulation on the client side, i.e., the AV. AVs are equipped with different cameras and sensors to collect data about the surrounding environment, comprising the number of pedestrians, the count of vehicles, and the road terrain. The feed from the cameras is utilized as input to the various AI sub-modules of the AV. When all sub-modules provide their inferences, the AI decides the action during the driving task. All submodules have their own intelligence system, helping them infer results independently. It is hard to design all modules while elaborating the concept of GeFL, so we designed one of the submodules to demonstrate the working of GeFL.
To simulate the GeFL model, we developed a traffic sign recognition module utilizing a standard traffic sign dataset. This module uses the feed from the front dash camera of the AV. The images from the front camera and the updated model from the server are considered the inputs to this module. If there is an update from the server, the new weights are assigned to the local model. Then, the local prediction model predicts the results; however, if the user notices that a prediction is wrong, another training session is carried out within the AV. If there is a significant improvement compared to the previously known weights, then the gradients of the images, i.e., the output of the last CNN layer, their labels, and the weights of the new model are transferred to the server. Moreover, to maintain integrity, the updated local weights are only replicated when the server sends back the updated weights. Until then, the AV uses the last known weights. A detailed algorithmic elaboration of this approach is shown in Algorithm 1.
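The client-side flow can be condensed into a small state machine: predict, collect user-corrected samples, retrain when a batch fills up, queue the upload, and hold back the new weights until the server replies. The class below is a hypothetical rendering of Algorithm 1 with the model and training stubbed out as simple counters; names like `AVClient` and `batch_size` are our own.

```python
# Stubbed sketch of the client-side (AV) loop from Algorithm 1.
class AVClient:
    def __init__(self, server_weights, batch_size=4):
        self.active_weights = server_weights   # weights used for driving
        self.pending = []                      # wrongly-labelled samples
        self.batch_size = batch_size
        self.upload_queue = []                 # (weights, batch) uploads

    def feedback(self, image, correct_label):
        """Passenger flags a wrong prediction with the correct label."""
        self.pending.append((image, correct_label))
        if len(self.pending) >= self.batch_size:
            # local training session on the corrected batch (stubbed)
            new_weights = self.active_weights + 1
            # upload new weights + encrypted batch; keep old model active
            self.upload_queue.append((new_weights, list(self.pending)))
            self.pending.clear()

    def receive_global(self, weights):
        # only now does the AV switch to the updated model
        self.active_weights = weights

client = AVClient(server_weights=0)
for i in range(4):
    client.feedback(f"img{i}", "stop_sign")
```

Note that `active_weights` stays unchanged after the upload, matching the integrity rule that the AV keeps its last known weights until the server responds.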

B. SERVER SIDE WORKING ALGORITHM
This subsection gives insights into the processing carried out on the data received by the server for fulfilling the update requests made by the client-side algorithm and stored in the buffer. The data sent by the AVs are the weights of their models, the gradient-encrypted images, and the labels specified by the users. The data stored in the buffer are processed in batches by the server after a specific amount of time. Initially, the server assigns the new model weights as the average of the parameters sent by the AVs. Then, in the second step, the server uses the specified model weights and creates an array containing the gradient-encrypted data and the labels shared by all the AVs. Once the list is created, the training session is initiated. After training, if there is a significant change in the result, the server updates the parameters and propagates them to the AVs. In this process, the server does not see the input images because the images are shared through gradient encryption; hence, privacy is preserved without compromising performance or adding computational overhead. A summarized explanation of this approach is given in Algorithm 2.
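A hypothetical sketch of one server round follows: drain the buffer, average the client weights, gather the gradient-encrypted batch for dense-layer training (stubbed here), and broadcast only if the change exceeds a threshold U. The threshold value and the buffer layout are our assumptions.

```python
# Stubbed sketch of the server-side round from Algorithm 2.
import numpy as np

def server_round(buffer, global_weights, U=0.1):
    """buffer entries: (client_weights, x_enc_list, label_list)."""
    if not buffer:
        return global_weights, False
    # 1) new weights = average of the client-uploaded parameters
    new_weights = np.mean([w for w, _, _ in buffer], axis=0)
    # 2) gather gradient-encrypted images and labels for dense training
    x_enc = [x for _, xs, _ in buffer for x in xs]
    labels = [l for _, _, ls in buffer for l in ls]
    assert len(x_enc) == len(labels)  # paired inputs for the session
    # 3) propagate only on a significant change (threshold U)
    changed = np.abs(new_weights - global_weights).sum() >= U
    return (new_weights, True) if changed else (global_weights, False)

buf = [(np.array([1.0, 1.0]), ["e1"], ["stop"]),
       (np.array([3.0, 3.0]), ["e2"], ["yield"])]
weights, updated = server_round(buf, np.array([0.0, 0.0]))
```

The server only ever touches `x_enc`, never raw images, which is the privacy property the text claims.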

V. THREAT MODEL
The GeFL framework puts forward a unique encryption scheme that keeps changing continuously whenever fine-tuning of the classification model takes place. The main threat to any encryption algorithm is how easily the original data can be recovered from the encrypted form. In GeFL, the parameters of the convolutional layers keep changing based on the data passed through them. Moreover, the last convolutional layer extracts complex features from the data, so the reverse computation required to decrypt the data also becomes complex. Advanced algorithms like GANs have the capability to reconstruct images from gradient-encrypted data. However, GeFL works in a distributed environment where the classification models keep maturing, which makes it hard to decrypt the data. Based on the above modeling, the main threat to GeFL is mitigated.

VI. SIMULATION ENVIRONMENT
This section focuses on the experimental setup to analyze the performance of GeFL. To demonstrate the working of GeFL in AVs, the authors simulate it on a traffic sign recognition system.

A. DATASET DESCRIPTION
The dataset used for the proposed model is the German traffic sign benchmark dataset used in IJCNN 2011 [11]. The dataset is for a multi-class, single-image classification task. It consists of over 50,000 RGB images distributed among 43 unique classes. Of these, nearly 39,000 images are used for training and 11,000 for testing. Since the dataset is to be distributed among the clients for FL, the authors have made a few tweaks for data distribution after pre-processing. In the subsections below, the algorithms for pre-processing and data distribution are elaborated.

B. PRE-PROCESSING
Since the dataset contains images of irregular dimensions, we transform the images into a uniform size. To do so, we use bi-linear interpolation as shown in Eqs. 1 to 10, which simplifies model training. In Algorithm 3, λ stores the resized image of dimension 60×60. For normalization, we divide each pixel value by 256 because these are 8-bit images; in Algorithm 3, nested for loops perform this division. The normalization confines the pixel values to a small range, i.e., 0 to 1, which makes the model converge faster [34]. The image with newly normalized pixel values is stored in Img_pre for all n images in the input dataset Img. Algorithm 3 returns Img_pre, which holds the pre-processed images transformed to 60×60 size with normalized pixel values for all channels. Pre-processing all n RGB images with Algorithm 3 requires time proportional to n × a × b, where a × b is the new image size; in our case, the time complexity of Algorithm 3 is n × 60 × 60.
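The normalization step of Algorithm 3 is a one-liner in vectorized form: each 8-bit pixel is divided by 256 so the values fall in [0, 1). The vectorized division below replaces the nested loops of the algorithm; it is a sketch, not the authors' implementation.

```python
# Sketch of Algorithm 3's per-pixel normalization of 8-bit images.
import numpy as np

def normalize(img_resized):
    """Map 8-bit pixel values into [0, 1) by dividing by 256."""
    return img_resized.astype(float) / 256.0

img = np.array([[0, 128], [255, 64]], dtype=np.uint8)
img_norm = normalize(img)
```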

C. CNN TRAFFIC RECOGNITION MODEL
This subsection describes the CNN-based traffic recognition model, with the aim of preserving privacy in AVs. The authors explored several state-of-the-art pre-trained models, for instance, InceptionNet and ResNet, for model training. However, these models have millions of parameters: InceptionNet has 24 million parameters, and ResNet has 23 million trainable parameters. Incorporating such architectures for model training increases the computational overhead of the proposed model. The smallest pre-trained model among those considered is MobileNet, with more than 3 million parameters. However, our model needs to be small in size while efficiently preserving privacy in the AV. As the model has to be deployed on the AVs, the number of parameters should be as low as possible. For this reason, we define a CNN model in GeFL having nearly 418,500 parameters. The architecture of the CNN model used during the simulation is described in FIGURE 3, and the layer-wise parameters are defined in Table 1.
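The paper's exact layer sizes are given in Table 1. As a hedged illustration of how such parameter budgets are computed (and why a small custom CNN can sit far below MobileNet's 3 million parameters), the helper below counts trainable parameters for a hypothetical conv + dense stack; the stack itself is NOT the authors' Table 1 configuration.

```python
# Parameter counting for a hypothetical small CNN (illustrative only).
def conv_params(k, c_in, c_out):
    """k x k kernels over c_in channels, plus one bias per filter."""
    return (k * k * c_in + 1) * c_out

def dense_params(n_in, n_out):
    """Fully connected layer: weights plus one bias per output unit."""
    return (n_in + 1) * n_out

# Hypothetical stack: two conv layers, a hidden dense layer, and a
# 43-way output matching the dataset's 43 classes.
total = (conv_params(3, 3, 32) + conv_params(3, 32, 64)
         + dense_params(64 * 13 * 13, 32) + dense_params(32, 43))
```

Even this rough stack lands in the hundreds of thousands of parameters, the same order of magnitude as the ~418,500 parameters the authors report.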

D. DATA DISTRIBUTION AMONG CLIENTS
To simulate GeFL, we created virtual clients, trained them independently, propagated the weights and gradient-encrypted images to the server, and trained the server model. This process includes distributing data among the clients, keeping track of their training, and propagating information among them. Algorithm 4 gives a detailed description of the data distribution used in GeFL, which gives readers more insight into GeFL. The authors distribute the data among C clients. Given a total of n images, each client receives n/C images for training. A for loop then iterates over C and, depending on the number of images per client (size), strips out a slice and stores it in α_x. From the set of labels L, the labels corresponding to the images in α_x are stored in α_y. Next, D stores the image and corresponding label pairs of each client. At the end of Algorithm 4, D[i] holds the list of image-label pairs of client i for model training. Therefore, distributing the pre-processed images among C clients with Algorithm 4 requires time of the order of C.
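The even split described above can be sketched as follows; this is an illustrative re-implementation of the idea behind Algorithm 4, not the paper's exact pseudocode, and the function name is ours.

```python
def distribute(images, labels, C):
    """Partition n (image, label) pairs evenly across C clients.

    Each client i receives a contiguous slice of size n // C, mirroring
    the alpha_x / alpha_y slicing of Algorithm 4; any remainder images
    beyond C * (n // C) are simply left unassigned.
    """
    n = len(images)
    size = n // C
    D = []
    for i in range(C):
        alpha_x = images[i * size:(i + 1) * size]   # client i's images
        alpha_y = labels[i * size:(i + 1) * size]   # matching labels
        D.append(list(zip(alpha_x, alpha_y)))
    return D
```

Since the loop body does constant-work slicing per client, the running time grows with C, matching the stated order-C complexity.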

E. COMPLETE SIMULATION ALGORITHM
This subsection puts together all the above-mentioned algorithms in one place to demonstrate the working of the entire simulation environment. FIGURE 4 represents the sequential flow of GeFL. In that view, Algorithm 5 displays the core of the entire simulation process for GeFL, taking the original images from the dataset, their corresponding labels, and the number of clients as input. A model with random-valued weights is initially assigned to a global variable named S. Then, the input images and the class labels are zipped together to form data pairs. The amount of data allocated to each client is represented by size, having the value n/C, where n is the number of input images. During data allocation to each client, the images are pre-processed using the steps shown in Algorithm 3. The pre-processing has a time complexity of the order of 1, and the data distribution has a time complexity of the order of C. After allocating data, local training at every AV starts in parallel.
Every client trains itself for e iterations on m images. Typically, the value of e is minimal compared to m, making the time complexity of the order of m. After training, the client initiates the process of generating gradient-encryption images, which takes time of the order of m. Then, the client creates an update request to the server by sending the gradient-encryption images, labels, and trained weights. The time complexity of Algorithm 1 is of the order of m. The fragment of Algorithm 5 recovered from the source is reproduced below (its opening lines and lines 13-15 are missing):

Algorithm 5 FL Simulation
Input: Img ∈ {RGB images}, L ∈ {classes corresponding to each image}, C ∈ {number of clients}, S ∈ {server model}
Output: Array of images and their labels for each client.
    ζ ← []                      ▷ ζ is an empty list to store clients' gradient-encryption data
11: ω ← []                      ▷ ω is an empty list to store weights
12: for i = 1, 2, . . . , C do
13-15: …                        (not recovered from the source)
16:     ω → append(Md_i)
17: end for
18: S ← SERVER_UPDATION(ζ, ω)   ▷ SERVER_UPDATION is the server-side algorithm
19: for i = 1, 2, . . . , C do
20:     Md_i ← S
21: end for
22: end procedure

When all the clients have sent their update requests to the server, the server initializes its weights to the average of the clients' weights. The server then starts a training session of e iterations over m × C images; the value of m × C equals n. After training, the weights are transferred back to the clients. So, the overall time complexity of Algorithm 5 is of the order of n.
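A minimal sketch of this communication-round loop is given below, assuming a FedAvg-style average at the server. The local training and gradient-encryption steps are stubbed out, and the server's fine-tuning pass is reduced to plain weight averaging, so this illustrates only the control flow, not GeFL's actual server-side retraining.

```python
import numpy as np

def local_train(weights, data, epochs):
    # Stub for client-side training; a real client would run `epochs`
    # passes of SGD/Adam over its local images here.
    rng = np.random.default_rng(len(data))
    return weights + 0.1 * rng.standard_normal(weights.shape)

def server_update(client_weights):
    # The server initializes its weights to the clients' average; in
    # GeFL it would then fine-tune on the gradient-encrypted images.
    return np.mean(client_weights, axis=0)

def simulate(client_data, rounds=3, dim=4):
    S = np.zeros(dim)                       # global model S
    for _ in range(rounds):                 # one communication round (CR)
        omega = [local_train(S.copy(), d, epochs=5) for d in client_data]
        S = server_update(omega)            # aggregate ...
    return S                                # ... and redistribute to clients
```

Each pass of the outer loop corresponds to one CR: every client trains locally, sends its weights, and receives the averaged global model back.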

VII. SIMULATION RESULT
This section presents the simulation results to analyze the performance of the proposed FL-based gradient-encryption image classification for AVs. In the simulation, we varied several parameters, such as the number of clients, the number of communication rounds, the optimizer, and many more, to check the robustness of the proposed model. Table 2 shows all the parameters used while training the FL model.

A. PERFORMANCE EVALUATION WITH ADAM AND SGD
The primary performance metrics in the simulation are local accuracy and global accuracy. Based on the performance of the SGD and Adam optimizers, the authors selected one optimizer to evaluate the framework further. FIGURE 5a and FIGURE 6a represent the average accuracy among all clients after local training is completed with Adam and SGD, respectively, while FIGURE 5b and FIGURE 6b show the average accuracy among all clients after global training with Adam and SGD. The y-axis represents the local accuracy in FIGURE 5a and FIGURE 6a, and the global accuracy in FIGURE 5b and FIGURE 6b, while the x-axis represents the number of clients (number of AVs) involved, and the z-axis represents the communication rounds (CR). A CR refers to one cycle in which the local model trains itself and shares data with the server. From FIGURE 5a and FIGURE 6a, we can see that for a small number of CRs, the accuracy is significantly lower; evidently, the model has not matured within the given time constraint. However, as the number of CRs increases, there is a sudden gain in accuracy because the model has had sufficient time to update itself, as represented in FIGURE 5a and FIGURE 6a. With 25 clients and CR = 11, we achieve local accuracies of 88.34% and 89.30% with the Adam and SGD optimizers, respectively. Moreover, FIGURE 6a makes clear that the bars for low CRs show lower accuracy for SGD than for Adam, which signifies that Adam converges faster than SGD.
FIGURE 5b and FIGURE 6b show the global accuracy achieved by the proposed model using the Adam and SGD optimizers. In both cases, the global accuracy reaches above 90.2% on the test data when plotted against the number of clients and CRs. However, there are a few shortcomings in the graph, which indicate slight overfitting on the set of data allocated to each client for Adam, as shown in FIGURE 5b. Unlike Adam, FIGURE 6b shows no such overfitting, but it indicates lower accuracy for SGD. The reason is that SGD's updates are noisier, making its convergence less stable than Adam's. A smaller number of clients yields higher accuracy, but that is not realistic in a real-life scenario. When the server fine-tunes the weights, an increase in the number of clients shows its advantage: with fewer clients there is a chance of overfitting, whereas more clients bring a variety of gradients and prevent overfitting.

B. BEST PERFORMING PARAMETER
After simulating the data over different input parameters, we found that Adam is the best optimizer in the given scenario. The Adam optimizer with 17 clients and 7 CRs gave a test accuracy of nearly 98%. We also inferred results for SGD: its best scenario was 24 clients with 8 CRs, which gave a test accuracy of nearly 95%. In terms of test accuracy as well, Adam outperforms SGD. The secondary performance metrics are loss, precision, recall, and f1-score. From the above analysis, we narrow our hyperparameters down to 17 clients with 7 CRs for Adam and 24 clients with 8 CRs for SGD. The comparison of the training history over 100 epochs with accuracy, loss, precision, recall, and f1-score for both optimizers is shown in FIGURE 7b, FIGURE 7a, and FIGURE 8a, respectively. The best input parameter values, i.e., 17 and 24 clients, suggest that FL is advantageous compared with traditional deep learning or machine learning approaches [35]. The final result is better than traditional deep learning. The reason behind this improvement is the increase in randomness from several machines, which decreases the chance of bias and helps avoid overfitting the training data.
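For reference, the secondary metrics reduce to simple ratios of per-class confusion counts; the helper below (our own, not from the paper) shows the arithmetic that produces the precision, recall, and f1-score curves plotted in the figures.

```python
def precision_recall_f1(tp, fp, fn):
    """Per-class metrics from raw confusion counts.

    precision = TP / (TP + FP): how many predicted positives are correct.
    recall    = TP / (TP + FN): how many actual positives are found.
    f1        = harmonic mean of precision and recall.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

For a multi-class task such as the 43-class traffic-sign problem, these per-class values are typically macro-averaged across classes.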

C. DATA COMPRESSION
Another important metric for the proposed framework is the compression ratio. Gradient encryption helps the AV achieve data compression, since the output of the convolutional layer significantly reduces the data size. For instance, if traditional FL is employed on the proposed model, the data transfer per image is 10.8 kB, whereas with gradient encryption it is 3.2 kB. Comparing the data transferred per image, the GeFL framework thus achieves roughly three times more compression than traditional FL. FIGURE 8b represents the original data and the gradient encryption, where 3,200 bytes are extracted from the original data as the gradient encryption.
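The reported per-image payloads imply the compression factor directly; the sizes below are the ones quoted in the text.

```python
# Per-image payload sizes reported in the paper (kilobytes).
raw_kb = 10.8        # traditional FL: full pre-processed image
encrypted_kb = 3.2   # GeFL: gradient-encrypted representation

ratio = raw_kb / encrypted_kb   # ≈ 3.4x less data per image
```

This per-image ratio is what yields the "nearly three times less data transferred" figure for the network as a whole.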

VIII. CONCLUSION
In this paper, we explored the existing state-of-the-art approaches that provide intelligence to AVs using FL; however, privacy concerns raise security loopholes in the AV system and degrade its performance. To solve this problem with minimum computation requirements, we proposed a model, GeFL, which uses the gradients of the last CNN layer as the source of encoding, where the gradients (weights) change continuously even after model training. This weight obfuscation makes it difficult for an attacker to extract or acquire the original data from the data stream. The proposed model is developed for an efficient and reliable traffic recognition system for AVs. Further, we decreased the data transfer size by three times without any decrease in the performance of the proposed model. The empirical results revealed that GeFL outperforms the traditional machine learning approaches by 2%.
For future work, the amalgamation of blockchain with robust cryptography will be used to mitigate modern-day attacks, and efficient DL algorithms will be used to reduce the system's complexity.

ABDULLAH ALHARBI received the M.Sc. degree in information technology from the Rochester Institute of Technology, Rochester, NY, USA, a second master's degree in information assurance and cybersecurity, and the Ph.D. degree in computer science from the Florida Institute of Technology, Melbourne, FL, USA, where he also earned the Information Assurance and Cybersecurity Graduate Certificate. He is currently an Assistant Professor of computer science with King Saud University, Riyadh, Saudi Arabia, and the Dean of the College of Applied Computer Sciences, King Saud University (Muzahmiyah Branch). He is the CEO of the Information Security Association (Hemaya), a non-profit organization in Saudi Arabia, and a Research Fellow with the Center of Excellence for Information Assurance, King Saud University. Previously, he was the Chair of the Department of Administrative Sciences, Community College, King Saud University. His research interests include wearable device security, transparent and continuous security, alternative authentication, usable security, and behavioral biometrics.
MARIA SIMONA RABOACA is currently working as a Researcher with the Department of Hydrogen and Fuel Cell, National Research and Development Institute for Cryogenics and Isotopic Technologies-ICSI Rm. Valcea. Her Ph.D. thesis, ''Theoretical and Practical Contribution Regarding to Sustain with Hybrid Energy a Passive House,'' was completed in the Faculty of Building Services Engineering, Technical University of Cluj-Napoca, Romania. She is currently the Project Manager at ICSI of the project ''Smart conductive charging station, fixed and mobile, for electric propulsion transportation (SMiLE-EV),'' which proposes the deployment of fixed and mobile EV and PHEV charging stations to meet the mobility needs of tomorrow's society and to prepare active and potential industrial partners for knowledge and technology transfer at the component or system level, in preparation for launching new products. She has been contributing to the fields of renewable energy, green buildings, the passive house concept, hydrogen energy, and stationary and mobile applications. She has published 68 papers, of which 40 are listed in the Web of Science Core Collection (WoS); her 40 WoS papers have received 436 citations, and she has authored 12 chapters, giving a Hirsch index of nine. The published papers cover subjects related to digital control systems using dSPACE and FPGA HIL testing models, embedded systems, and real-time applications.