Federated deep learning for cyber security in the internet of things: Concepts, applications, and experimental analysis

In this article, we present a comprehensive study with an experimental analysis of federated deep learning approaches for cyber security in Internet of Things (IoT) applications. Specifically, we first provide a review of the federated learning-based security and privacy systems for several types of IoT applications, including Industrial IoT, Edge Computing, Internet of Drones, Internet of Healthcare Things, and Internet of Vehicles, among others. Second, the use of federated learning with blockchain and malware/intrusion detection systems for IoT applications is discussed. Then, we review the vulnerabilities in federated learning-based security and privacy systems. Finally, we provide an experimental analysis of federated deep learning with three deep learning approaches, namely, Recurrent Neural Network (RNN), Convolutional Neural Network (CNN), and Deep Neural Network (DNN). For each deep learning model, we study the performance of centralized and federated learning under three new real IoT traffic datasets, namely, the Bot-IoT dataset, the MQTTset dataset, and the TON_IoT dataset. The goal of this article is to provide important information on federated deep learning approaches with emerging technologies for cyber security. In addition, it demonstrates that federated deep learning approaches outperform the classic/centralized versions of machine learning (non-federated learning) in assuring the privacy of IoT device data and provide higher accuracy in detecting attacks.

INDEX TERMS Federated learning, intrusion detection, deep learning, cyber security, the IoT, blockchain.


I. INTRODUCTION
The Internet of Things (IoT) is defined as the use of communication protocols and sensing equipment, such as sensors, laser scanners, and radio frequency identification, to enable control system devices to be connected to the Internet. During the last few years, IoT technology has been widely used in the following areas: Internet of Vehicles, manufacturing industry, Internet of Drones, Internet of Healthcare Things, Mobile Crowdsensing, cyber-physical systems, agriculture, etc. [1]. As IoT technology develops rapidly, there are millions of embedded physical devices, each of which is interconnected and exposes data that can potentially affect the privacy and personal well-being of its users. In the absence of credible security defense systems implemented on IoT devices, these devices can be attacked by hackers [2] and represent a large attack surface that is actively exploited.
The associate editor coordinating the review of this manuscript and approving it for publication was Vicente Alarcon-Aquino.
Modern Machine Learning (ML) is gaining more attention than ever before for its potential to extract useful and complex data models using large datasets from a central location [3]. With traditional machine learning, the learning data is collected on a centralized server, without addressing privacy concerns or reducing data transmission cost. In addition to other security measures, such as blockchain and authentication [4], [5], machine learning techniques can be used by intrusion detection systems to distinguish normal from malicious actions [6], [7]. The term privacy-preserving machine learning has become popular nowadays [8]. The idea of federated learning was proposed by Google [9] to overcome data privacy issues by leveraging collaborative learning across a wide range of devices (i.e., IoT devices). However, there are various limitations to the application of traditional federated learning in IoT applications, including the reliability of the learning model as well as of the central server. If the central server (i.e., Edge server) crashes or maliciously modifies the global model, the accuracy of the updates to all local models at IoT devices will be significantly affected [10]-[16]. The power constraint of IoT devices is another major issue for the deployment of federated learning. This resource limitation requires that energy consumption be optimized for the implementation of federated learning [17].
Federated learning has achieved great success and is widely used in many fields, e.g., mobile edge network optimization [18], Google keyboard query suggestions and prediction [19], [20], COVID-19 detection [21]-[23], vehicle communications [24], Internet of Drones [25], augmented reality [26], and intrusion detection [27]-[29]. As a result, many cyber security researchers have difficulty in finding the best learning type (i.e., centralized or federated learning) to test and evaluate their proposed security methods in IoT applications, and selecting an appropriate federated deep learning method is an essential issue in this field. Hence, we are motivated to conduct a comprehensive study with an experimental analysis of the use of federated deep learning for cyber security in the Internet of Things.

1) CENTRALIZED LEARNING
Machine learning for IoT applications has conventionally been performed by uploading all the data from each IoT device connected with the cloud servers to build a standard model which can be shared and implemented across devices. The main benefit of centralized learning is the ability of the model to perform generalization using data from a cluster of IoT devices and then work with other relevant IoT devices instantaneously. However, there are some issues for traditional centralized learning such as privacy, latency, bandwidth, and connectivity.

2) FEDERATED LEARNING
The core concept of federated learning is to create machine learning models that are built on distributed datasets across different devices while avoiding the leakage of data. Specifically, federated learning is a new technique where the current model is downloaded and an updated model is computed on IoT devices using the local IoT data. These locally trained models are then returned from the IoT devices to the central server for aggregation (e.g., the weights are averaged), and then a combined and enhanced single global model is returned to IoT devices. The distribution of data is important in terms of federated learning deployment and the associated practical and technical challenges. There are currently the following three federated learning types, as presented in Fig 2:
• Horizontal federated learning: This type is implemented in situations in which the data sets share the same feature space but differ in the sampling space.
• Vertical federated learning: This type is implemented in situations in which the data sets differ in the feature space but share the same sampling space.
• Federated transfer learning: This type is implemented in situations where the data sets have different feature spaces as well as different sampling spaces.
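The three types above differ only in how much the feature space and the sampling space of two data holders overlap. As a rough illustration, the sketch below labels a pair of datasets from those two overlaps. The function name `classify_partition`, the Jaccard overlap measure, and the 0.5 threshold are assumptions made for this example and are not taken from the surveyed works.

```python
def classify_partition(features_a, features_b, samples_a, samples_b, threshold=0.5):
    """Label the federated setting of two data holders (illustrative heuristic).

    The Jaccard overlap and the threshold are assumptions of this sketch.
    """
    def overlap(x, y):
        x, y = set(x), set(y)
        union = x | y
        return len(x & y) / len(union) if union else 0.0

    feature_overlap = overlap(features_a, features_b)
    sample_overlap = overlap(samples_a, samples_b)

    if feature_overlap >= threshold and sample_overlap < threshold:
        return "horizontal"   # same feature space, different samples
    if feature_overlap < threshold and sample_overlap >= threshold:
        return "vertical"     # different feature space, same samples
    if feature_overlap < threshold and sample_overlap < threshold:
        return "transfer"     # both spaces differ
    return "same-space"       # both overlap: no partitioning problem to solve


# Two gateways logging the same traffic features for different devices.
print(classify_partition(["pkt_len", "proto"], ["pkt_len", "proto"],
                         ["dev1", "dev2"], ["dev3", "dev4"]))
```

In this example the two gateways record identical features for disjoint sets of devices, so the pair is labeled as a horizontal setting.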

B. RELATED SURVEYS AND OUR CONTRIBUTIONS
There are many surveys in the literature that have covered different aspects of federated learning-based frameworks for IoT. As shown in Tab. 1, we classify the federated learning surveys based on the following dimensions:
• IoT application: It indicates whether the survey presented a taxonomy for federated learning-based frameworks for cyber security in the internet of things.
• Federated learning-based IDS: It reports whether the study provided a taxonomy for federated learning-based cyber security intrusion detection systems for the IoT.
• Federated learning-based blockchain: It indicates whether the survey reviewed federated learning-based frameworks coupled with blockchain technology for cyber security in the internet of things.
• Threat models in federated learning: It indicates whether the survey considered threat models in federated learning-based frameworks for cyber security in IoT.
• Experimental analysis in IoT: It indicates whether the survey provided an experimental analysis of federated deep learning for cyber security in IoT.
Almost all of the surveys on federated learning for IoT applications present security and privacy countermeasures without focusing on an experimental analysis. Yang et al. [30] presented a review of a secure federated learning framework, which includes federated transfer learning, vertical federated learning, and horizontal federated learning. Aledhari et al. [32] presented a review of federated learning algorithms, which includes use-cases, real-life applications, and hardware platforms. Liu et al. [33] provided an introduction to the integration of federated learning in the context of 6G communications. Jiang et al. [34] presented the challenges and opportunities of the application of federated learning in smart city sensing. Mothukuri et al. [36] provided a comprehensive survey on privacy threats of federated learning, but without an experimental analysis in IoT networks. Kholod et al. [37] analyzed the open-source federated learning frameworks for IoT applications without focusing on cyber security. Rahman et al. [38] provided comprehensive taxonomies covering privacy and security, resource management, application areas, system models and designs. Nguyen et al. [39] provided a comprehensive survey of the recent advances in federated learning and IoT applications. Wahab et al. [41] presented a multi-level classification of federated machine learning in communication and networking systems. Ali et al. [42] provided an overview of the integration of federated learning and blockchain for IoT applications. Imteaj et al. [43] analyzed the implementation challenges of federated learning algorithms for resource-constrained IoT devices. Nguyen et al. [40] provided an overview of the essential notions of the integration of federated learning and blockchain in mobile edge computing networks.
None of these related surveys covers the application of federated deep learning for cyber security in IoT applications with a focus on experimental analysis.
Lyu et al. [35] provided a brief introduction to FL, alongside a classification of threat models into two major attacks: poisoning and inference attacks. The study points out the insights and the core techniques, together with the fundamental assumptions embraced by the different attacks. The FL context brings an additional threat, model poisoning, which is distinct from traditional data poisoning. The goal is to make the global model incorrectly classify a given set of inputs. To explore this issue, Bhagoji et al. [44] conducted a range of attack scenarios, including targeted model poisoning by intensifying the malicious agent update, improving attack stealth through the use of an alternating minimization strategy, and bypassing Byzantine-resistant aggregation strategies, which validated the vulnerability of FL-based settings to model poisoning attacks. Xu et al. [45] proposed an FL-based privacy preservation scheme, called VerifyNet, which manages the verification of the training process, with homomorphic encryption, pseudo-random technology, and a double-masking protocol to ensure user privacy, verifiability, and confidentiality during the FL process. Results from experiments with real-world data have shown that VerifyNet is practical.
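To make the idea of intensifying a malicious update concrete, the following minimal sketch shows why plain averaging is sensitive to a single scaled update, and how a coordinate-wise median (one common Byzantine-resistant aggregator) dampens this naive form of the attack. The function names, the two-dimensional toy updates, and the choice of scaling by the number of agents are assumptions for illustration. The sketch does not reproduce the stealthier strategies of [44], which are reported to bypass such aggregators.

```python
from statistics import median


def mean_aggregate(updates):
    """Plain averaging of the agents' update vectors."""
    n = len(updates)
    return [sum(col) / n for col in zip(*updates)]


def median_aggregate(updates):
    """Coordinate-wise median, a simple robust alternative to the mean."""
    return [median(col) for col in zip(*updates)]


def boosted_update(target_delta, num_agents):
    """Scale the attacker's desired change so that it survives averaging."""
    return [num_agents * d for d in target_delta]


# Four benign agents and one attacker (toy values, assumed for this sketch).
benign = [[0.10, -0.20], [0.12, -0.18], [0.09, -0.21], [0.11, -0.19]]
malicious = boosted_update([1.0, 1.0], num_agents=5)
updates = benign + [malicious]

print("mean  :", mean_aggregate(updates))    # pulled toward the attacker's target
print("median:", median_aggregate(updates))  # stays close to the benign updates
```

With averaging, the attacker's boosted vector dominates the aggregate, whereas the median stays within the range of the benign updates.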
A notable exception is the recent work of Gao et al. [46], which reviews split and federated learning approaches with respect to their communication overheads and conducts an experimental evaluation on two established datasets for Speech Command and ECG in a Raspberry Pi setup. In this context, we highlight the following research questions (i.e., Fig 17) that need to be solved:
• Q1. What are the applications of federated deep learning in IoT networks?
• Q2. How is federated learning used for intrusion and malware detection?
• Q3. What characteristics do the federated learning approaches with blockchain technology have for each of the IoT applications?
• Q4. What are potential vulnerabilities that can be exploited by adversaries in federated learning-based systems for IoT networks?
• Q5. What is currently the best solution between federated deep learning approaches and the classic/centralized versions of machine learning (non-federated learning) in assuring the privacy of IoT device data and providing the highest accuracy in detecting attacks?
To answer the previous questions, the main contributions of this work are:
• We review the federated learning-based security and privacy systems for several types of IoT applications.
• We review the federated learning-based cyber security intrusion detection systems.
• We present the use of federated learning with blockchain for IoT applications.
• We review vulnerabilities that can be exploited by adversaries in federated learning-based security and privacy systems.
• We provide an experimental analysis of federated deep learning with three deep learning approaches, namely, RNN, CNN, and DNN. For each deep learning model, we study the performance of centralized and federated learning under three new real IoT traffic datasets, namely, the Bot-IoT dataset, the MQTTset dataset, and the TON_IoT dataset.
The rest of this paper is organized as follows. Section II presents the federated learning-based security and privacy systems for several types of IoT applications. In Section III, we provide the federated learning-based cybersecurity intrusion detection systems. In Section IV, we clearly highlight the use of federated learning with blockchain for IoT applications. Then, we review vulnerabilities that can be exploited by adversaries in federated learning-based security and privacy systems in Section V. Section VI provides an experimental analysis of federated deep learning with three deep learning approaches. Section VII highlights the importance of the study and discusses the significance of our research on the future of the IoT and its applications, together with current open challenges. Lastly, Section VIII presents our conclusions.
VOLUME 9, 2021

Fig 4 shows the federated learning-based cybersecurity for IoT. Tab. 2 provides the acronyms used in this study. Tab. 3 presents the federated learning-based solutions for cybersecurity in IoT applications.

A. DETECTING COMPROMISED IoT DEVICES
IoT devices are being increasingly deployed in everyday life. Many of those devices, however, are susceptible to attack through unsafe design, deployment, and configurations. Accordingly, many existing systems already contain vulnerable IoT devices that are open to being compromised, which is especially harmful in sensitive tasks such as surveillance, as shown by the work of Ciuonzo et al. [59], which focused on the issue of distributed detection of a non-cooperative object in a wireless sensor network.
Centralized learning-based intrusion detection approaches have been successful, including hybrid hierarchical and autoencoder techniques. For example, Bovenzi et al. [60] provided a two-tier hierarchical network-based IDS that performs anomaly detection with a multimodal deep autoencoder and soft-output classifiers, and Mirsky et al. [61] provided Kitsune, a network-based plug-and-play IDS that can efficiently classify attacks on the local network without supervision. However, data privacy, network latency, and similar issues of centralized learning are not considered in these approaches.
To identify compromised IoT devices, Nguyen et al. [57] proposed an autonomous self-learning distributed scheme, named DIOT, which is based on a federated learning approach. The Flask and Flask-SocketIO libraries are used in the implementation of the federated learning algorithm. The performance evaluation shows that the DIOT scheme is able to detect 95.6% of attacks in an average of 257 milliseconds. Zhao et al. [62] developed a federated learning-based intrusion detection system, which can be used for detecting compromised IoT devices. In the proposed system, an initial global long short-term memory model is distributed among all user servers. Then, the user servers train their own unique models and upload their model parameters to the central server. Last, the central server aggregates the model parameters to form a new global model and then sends it to the user servers. The results of simulation on the SEA dataset (i.e., produced by the AT&T Shannon Lab) demonstrate that the proposed system reaches better accuracy and coherence compared to conventional systems. To find the best candidate clients and solve the issue of accuracy optimization in federated learning, Mohammed et al. [63] introduced an online stateful heuristic based on federated learning combined with an IoT client alarm application, which can be used to notify clients of any unauthorized IoT devices in the IoT environment. The results of simulation on a real data set demonstrate that the suggested system surpasses the online randomized algorithm with up to 27% gain in terms of accuracy.
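The distribute, train, upload, and aggregate loop described for the system of Zhao et al. [62] follows the general federated averaging pattern. The sketch below illustrates that loop only. It assumes a one-parameter linear model y = w * x in place of the long short-term memory model, sample-count weighting at the server, and toy client data, none of which come from the cited work.

```python
def local_train(w, data, lr=0.01, epochs=5):
    """A few epochs of gradient descent on squared error for y = w * x."""
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def fed_avg(local_weights, sizes):
    """Server-side aggregation: average weighted by local sample counts."""
    total = sum(sizes)
    return sum(w * n for w, n in zip(local_weights, sizes)) / total


def run_rounds(clients, rounds=50, w0=0.0):
    """Repeat: send the global model, train locally, upload, aggregate."""
    w_global = w0
    for _ in range(rounds):
        local_weights = [local_train(w_global, data) for data in clients]
        w_global = fed_avg(local_weights, [len(data) for data in clients])
    return w_global


# Three clients whose private data all follow y = 2x (toy data, assumed).
clients = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(3.0, 6.0)],
    [(0.5, 1.0), (1.5, 3.0), (2.5, 5.0)],
]
print(run_rounds(clients))  # approaches 2.0; raw data never leaves a client
```

Only the scalar weight crosses the network in each round, which is the property that the privacy argument for federated learning rests on.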

B. SECURE INDUSTRIAL INTERNET OF THINGS
The small size, low cost, and limited energy consumption of Internet of Things (IoT) devices have led to their wide adoption in smart factories to supervise machinery, guide automatic processes, or help create a virtual representation of systems for advanced simulation purposes using digital twins [64]. To provide tensor-based data mining while guaranteeing data security in the industrial internet of things, Kong et al. [65] proposed a Federated Tensor Mining framework, named FTM, which is based on homomorphic encryption methods. The FTM framework is claimed to achieve high accuracy due to the homomorphic attribution. Khoa et al. [66] presented an IDS based on collaborative learning which can be applied effectively in the Industrial IoT and Industry 4.0. The proposed system builds intelligent "filters" for deployment at IoT gateways to quickly identify and prevent cyberattacks. Specifically, each filter utilizes the data collected in its own network to train its model for cyberattack detection through a deep learning system. Afterward, the trained model is distributed to other IoT gateways to increase the accuracy of intrusion detection throughout the overall system.
Rehman et al. [27] proposed a fully decentralized cross-device federated learning system, named TrustFed, which uses Industrial IoT devices as federated learning candidates. To maintain participants' reputations, the proposed TrustFed system uses smart contract technology and the Ethereum blockchain. TrustFed can identify and eliminate outliers in the training distributions prior to combining the model updates. The results of the simulation on the Turbofan Engine Degradation simulation dataset (released by NASA) demonstrate that the proposed system achieves lower loss irrespective of the population size. Sun et al. [47] introduced a new framework based on digital twins to assist federated learning in Industrial IoT. The digital twins are used for capturing the characteristics of industrial devices. Hao et al. [58] developed a privacy-enhanced federated learning system, named PEFL, for industrial artificial intelligence, which is based on an Augmented Learning with Error (A-LWE) term embedded with the homomorphic ciphertext of private gradients. To provide differential privacy, the PEFL system adopts a distributed Gaussian mechanism. The performance evaluation on the MNIST dataset demonstrates that the PEFL system performs well in terms of accuracy as well as communication and computation costs. To reduce the communication burden on the federated learning server, a proxy server can be used, as proposed by Zhao et al. [67], which also achieves anonymity of participants.
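As a rough illustration of a distributed Gaussian mechanism, the sketch below clips each participant's gradient and lets each of the n participants add Gaussian noise with standard deviation sigma * max_norm / sqrt(n), so that the summed noise has standard deviation sigma * max_norm. This is a generic construction assumed for the example. The parameter names, the clipping step, and the noise split are not taken from the PEFL paper, and no formal privacy accounting is included.

```python
import math
import random


def clip(grad, max_norm):
    """Rescale the gradient so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def privatize(grad, max_norm, sigma, num_clients, rng):
    """Clip locally, then add this client's share of the Gaussian noise."""
    local_std = sigma * max_norm / math.sqrt(num_clients)
    return [g + rng.gauss(0.0, local_std) for g in clip(grad, max_norm)]


def aggregate_private(grads, max_norm=1.0, sigma=0.5, rng=None):
    """Sum the noisy clipped gradients; total noise std is sigma * max_norm."""
    rng = rng or random.Random()
    n = len(grads)
    noisy = [privatize(g, max_norm, sigma, n, rng) for g in grads]
    return [sum(col) for col in zip(*noisy)]


# Four participants with toy gradients (values assumed for this sketch).
grads = [[3.0, 4.0], [0.3, 0.4], [-0.6, 0.8], [0.0, 0.5]]
print(aggregate_private(grads, rng=random.Random(0)))
```

Splitting the noise across participants means that no single party, including the server, ever sees an un-noised sum, while the aggregate still carries only one "unit" of noise.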

C. SECURE EDGE COMPUTING
Newly emerging technologies such as Mobile Edge Computing (MEC) and new generation communication technologies are essential to support the fast development and deployment of IoT networks. As IoT networks grow in scale, determining the optimal allocation of limited resources to deliver high-quality IoT services is a critical challenge. Edge computing involves the processing of data at the edge of a network, as opposed to processing in the cloud or on a remote server. To provide privacy and data security, Taïk and Cherkaoui [68] designed a system model based on federated learning and edge computing. The edge devices are used to train models by federated learning, which can minimize security issues. Lu et al. [53] designed a new system, named DITEN, that integrates blockchain and federated learning in edge networks. The proposed DITEN system uses Deep Neural Networks (DNN) as a strategy scheduler to ensure data privacy of users and enhance learning security. The experimental results on two real-world datasets, namely, MNIST and Fashion-MNIST, show that the proposed DITEN system is efficient compared to conventional federated learning in terms of learning accuracy, learning loss, and communication time cost. Qian et al. [69] developed a privacy-preserving data analytic system, where federated learning is performed at the centralized fog devices. The proposed system uses active learning in edge devices, which can harvest the potential privacy benefits as well as reduce latency and communication overhead.
To provide joint IoT network and edge server optimization, Xiao et al. [70] proposed a federated edge intelligence framework, named FEI. The FEI framework consists of a group of edge servers that train a shared model using the data collected and uploaded from IoT devices. Cui et al. [71] introduced a secure and decentralized platform, named SAPE, for securing edge computing. The SAPE platform enables users to submit their tasks, which are then scheduled to the relevant edge nodes to reduce the time it takes to complete them. To prevent attacks, the SAPE platform uses federated deep reinforcement learning (DRL). The reliability of the federated training process is improved by a blockchain-based verification scheme. The findings demonstrate that SAPE overcomes some of the shortcomings of conventional schemes in defending against adversarial attacks.

D. SECURE INTERNET OF DRONES
The combination of unmanned aerial vehicles (UAVs) and artificial intelligence (AI) technology has created opportunities for existing ground-based mobile crowdsensing platforms to achieve more difficult missions. More precisely, drones enable autonomous crowdsensing at any time and any place due to their remarkable benefits of lower cost, faster operational deployment, and more flexible movement, as presented by Motlagh et al. [72], who demonstrated the use of drones for crowd surveillance through face recognition. Federated learning can provide significant privacy protection by allowing a collection of UAVs to train a shared AI model collaboratively while keeping the training data (i.e., sensed data) on their devices at the local level. Fig 5 illustrates the federated learning-based cybersecurity for the internet of drones. For secure and efficient AI model training in UAV-assisted mobile crowdsensing, Wang et al. [25] designed a practical federated learning framework, named SFAC, which is based on three technologies, namely, blockchain, local differential privacy, and reinforcement learning. Blockchain technology is used to preserve data training and contribution verification between drones, whereas reinforcement learning is used to achieve optimal strategies. Their performance evaluation using the MNIST dataset showed that the SFAC framework enhanced the quality of the local model update (QoLM) metric in the federated learning process, compared with conventional frameworks. To defend against jamming attacks, Mowla et al. [73] introduced an adaptive federated reinforcement learning system, which can be applied to flying ad-hoc networks. The simulation results indicated a 39.9% improvement in the average accuracy of the federated jamming detection scheme used in the defense mechanism.
To counteract eavesdropping in a fog-aided IoD network, Yao et al. [74] proposed a secure federated learning scheme. The main idea of the proposed scheme is to monitor the energy of all the unmanned aerial vehicles (UAVs) in order to optimize the safety rate of the federated learning system, subject to the UAV battery capacity and the Quality of Service (QoS) constraint. The performance evaluation of the proposed scheme shows that it performs better than two existing related algorithms with a small federated learning training time. Yazdinejadna et al. [75] designed an authentication system based on federated learning using drones' Radio Frequency (RF) features. The proposed authentication system uses a Deep Neural Network (DNN) and Homomorphic Encryption (HE). The DNN is implemented locally on drones with Stochastic Gradient Descent (SGD) optimization, while the HE system is used to secure model parameters. According to the experimental findings, the proposed authentication system obtains a high true positive rate when authenticating drones and improved performance in comparison to alternative machine learning-based systems.

E. SECURE INTERNET OF HEALTHCARE THINGS
The management of health has emerged as a major issue and challenge as new complex types of diseases and symptoms, such as COVID-19, appear. Fig 6 presents how the healthcare sector can use federated learning techniques in order to maintain patients' data privacy, while benefiting from other hospitals' knowledge. Thwal et al. [49] designed a deep learning-based clinical decision support solution, which is trained and managed in a federated learning model. The proposed solution focuses on ensuring patients' privacy and addressing the threat of cyberattacks while allowing for the mining of clinical data at a large scale. Based on a federated learning model, the proposed solution can exploit rich clinical data to train every local neural network with no requirement to share private patient data.
To decrease energy consumption in the federated learning process, Hao et al. [76] designed a new scheme, which separates the model into three sections and offloads the central section, which has a high computational cost, to the cloud server. To perform gradient aggregation in the ciphertext domain, the proposed scheme applies homomorphic encryption, which can resist several existing deep learning privacy attacks. For securing wearable healthcare, Chen et al. [77] proposed a federated transfer learning framework, named FedHealth. The FedHealth framework combines different organizations' data without compromising information privacy and performs relatively personalized learning of models using knowledge transfer.
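Gradient aggregation in the ciphertext domain relies on an additively homomorphic scheme, in which multiplying ciphertexts corresponds to adding plaintexts. The sketch below uses a toy Paillier cryptosystem to show this property. Paillier is chosen here only as a well-known example, since the text does not state which scheme the surveyed works use. The small primes, the integer-quantized gradients, and the function names are assumptions, and keys of this size offer no real security.

```python
import math
import random


def keygen(p=1789, q=1861):
    """Toy Paillier key pair with generator g = n + 1 (insecure key size)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # modular inverse, needs Python 3.8+
    return n, (lam, mu)


def encrypt(n, m, rng=random):
    """c = (1 + n)^m * r^n mod n^2, using (1 + n)^m = 1 + m*n mod n^2."""
    n2 = n * n
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (1 + m * n) * pow(r, n, n2) % n2


def decrypt(n, priv, c):
    """m = L(c^lam mod n^2) * mu mod n, with L(x) = (x - 1) // n."""
    lam, mu = priv
    return (pow(c, lam, n * n) - 1) // n * mu % n


def add_encrypted(n, ciphertexts):
    """The product of ciphertexts decrypts to the sum of the plaintexts."""
    total = 1
    for c in ciphertexts:
        total = total * c % (n * n)
    return total


n, priv = keygen()
quantized_gradients = [12, 7, 30]  # one non-negative integer per client
ciphertexts = [encrypt(n, g) for g in quantized_gradients]
print(decrypt(n, priv, add_encrypted(n, ciphertexts)))  # 49
```

The aggregator in this sketch only multiplies ciphertexts and never sees an individual gradient. Real gradients would need a fixed-point encoding that also handles negative values, which is omitted here.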
The COVID-19 pandemic triggered a global crisis that required collaborative efforts to combat it. A critical factor in evaluating and responding to COVID-19 is the effective identification of infected patients, and AI is a key part of this. However, conventional centralized AI requires sharing data among hospitals around the world, which raises many privacy issues that FL can address. Zhang et al. [23] proposed a dynamic fusion-based FL system to analyze medical diagnostic images such as CT scans and X-rays, which decides dynamically which clients participate according to the performance of their local models and plans the fusion of models depending on the training time. The results demonstrated that the system is practical in terms of performance, communication, and failure tolerance. Kumar et al. [21] proposed a blockchain-based FL system for COVID-19 detection, which was trained and evaluated on real COVID-19 patient data that was collected and publicly published from various hospitals with different types of CT scanners, as well as a data normalization strategy. Liu et al. [22] proposed an FL-based model for learning COVID-19 data. The authors evaluated the performance of popular models, including MobileNet, ResNet18, and COVIDNet, with and without the FL framework. The authors concluded that ResNeXt shows the highest efficiency on images with COVID-19 labels, whereas MobileNet has the lowest number of parameters. Hence, the work suggests that ResNeXt and ResNet18 are the better choices for COVID-19 identification among the models used.

F. SECURE CLOUD COMPUTING
While conventional machine learning training models share data centrally in the cloud, an increasing number of customers are not interested in participating in data sharing due to privacy or peer competition issues. Federated learning has been suggested as a distributed platform to overcome these limitations, where multiple customers collectively train a machine learning model without sharing their individual datasets. Fang et al. [50] designed a federated learning scheme with strong privacy preservation, named HFWP, for securing cloud computing. Based on a lightweight encryption protocol, the HFWP scheme is robust against colluding parties and an honest-but-curious server. The experimental results on two real-world datasets, namely, MNIST and the UCI Human Activity Recognition Dataset, show that HFWP achieves the highest accuracy compared to other existing works. Zhang et al. [78] introduced a federated learning scheme that takes into account the local characteristics of AI IoT applications, which can enhance the prediction accuracy of any individual AI IoT-enabled device.
For enhancing cloud computing-based 5G heterogeneous networks, Wei et al. [79] designed a federated learning scheme based on end-edge-cloud cooperation. Within this scheme, the nodes that are equipped with mechanisms for attack detection are deployed in the end, edge, and cloud of the 5G heterogeneous network. To reduce the negative impacts due to heterogeneity in a cloud-edge architecture, Wu et al. [80] proposed a personalized federated learning scheme, in which the power of edge computing is used for high throughput and low latency.

G. DATA COLLABORATIONS IN IoTs
As IoT technologies are rapidly emerging, network applications require cross-domain collaborative computational processing, which necessitates the aggregation and cooperation of a large number of network data sources. Different data owned by various stakeholders and having distinct properties will be combined into the network applications within these processes. The information that is revealed to the providers of applications results in an inevitable risk of losing control over data privacy. To enable the secure collaboration of massive data sources, Yin et al. [81] designed a secure data collaboration scheme, called FDC, which can be applied in an IoT environment. The FDC scheme uses three parties: a blockchain system, a public data center, and a private data center. The blockchain system is used to sustain flexibility and access control, while the private data center is used for registration, management, storage, and IoT data collection. The performance evaluation on wearable sensor data shows that the proposed FDC scheme achieves good accuracy and loss.

H. SECURE 5G-ENABLED IoT
IoT network environments are time-varying, and the heterogeneous resources of network devices make it difficult to provide reliable, secure, and real-time communications among the network devices and their service servers, especially in the 5G-enabled IoT. Yu et al. [52] proposed a federated learning-based distributed model, named UDEC, in order to address the following three challenges: 1) privacy and security-preserving services, 2) dynamic and low-cost scheduling, and 3) full use of system resources. The UDEC model trains deep reinforcement learning agents to secure critical users' service request data at the edge nodes. Their performance evaluation shows the effectiveness of the UDEC model in terms of energy consumption.

I. SECURE INTERNET OF VEHICLES
Vehicular IoT provides a safer travel environment and better on-board experience, leading us to a smart and self-driving automotive future. In particular, there are a number of applications that can be found in the field of automotive IoT, including autonomous vehicles, driver assistance, vehicle telematics, and predictive automotive maintenance. A federated learning approach is implemented in the field of data-driven navigation, which uses the data that mobile users collect and embedded processing resources. A scheme named FedLoc can secure updates to locally trained models, providing robust support for participant fluctuation. The FedLoc scheme is robust against malicious unauthorized participants by employing the limited Laplace mechanism as well as the homomorphic threshold encryption mechanism. Lu et al. [56] designed a collaborative edge learning framework, named CLONE, using a real-world data set captured from a large electric vehicle (EV) manufacturing enterprise. The CLONE framework is based on long short-term memory networks and a federated learning algorithm to provide latency savings, privacy enforcement, safety preservation, and effective driver personalization. The CLONE framework takes the failure of an EV battery and related hardware as a case study to demonstrate that it can predict failures accurately to achieve collaborative and reliable driving.

J. SECURE MOBILE CROWDSENSING
Mobile Crowdsensing is an emerging key element of IoT, which is a model that employs individuals wearing smart devices, called ''workers'', to conduct different sensing activities. To resolve two challenges for mobile crowdsensing, namely, user dropout and forced aggregation, Liu et al. [54] proposed a federated extreme gradient boosting framework, named FEDXGB, which involves two kinds of parties: a central cloud server and a set of users. FEDXGB performs the following process. The central server iteratively invokes a sequence of secure schemes to construct the XGBoost classification and regression tree. Within the schemes, the FEDXGB framework uses a secure aggregation protocol to aggregate user gradients. Through a combination of Bresson's cryptosystem and Shamir's secret sharing, FEDXGB allows the central server to perform constrained aggregation on the gradients and is able to recover dropout users' data. The performance evaluation on both the ADULT and MNIST datasets shows that the FEDXGB framework can reduce computation and communication costs with negligible performance loss.
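The dropout-recovery idea mentioned above relies on threshold secret sharing. The following is a minimal sketch of Shamir's (t, n) scheme only, not of the FEDXGB protocol itself; the field modulus `PRIME`, the function names, and the example secret are illustrative assumptions. A secret (for instance, a user's masking key) is split into n shares so that any t of them reconstruct it, which is what lets a server recover the contribution of users who drop out.

```python
import random

# Field modulus: the Mersenne prime 2^61 - 1 (an illustrative choice).
PRIME = 2**61 - 1


def split_secret(secret, threshold, num_shares, rng):
    """Split `secret` into `num_shares` points of a random polynomial
    of degree `threshold - 1` whose constant term is the secret."""
    coeffs = [secret % PRIME] + [rng.randrange(PRIME) for _ in range(threshold - 1)]
    shares = []
    for x in range(1, num_shares + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation modulo PRIME
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def recover_secret(shares):
    """Recover the secret by Lagrange interpolation at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * (-xj)) % PRIME
                den = (den * (xi - xj)) % PRIME
        # Modular inverse via Fermat's little theorem (PRIME is prime).
        secret = (secret + yi * num * pow(den, PRIME - 2, PRIME)) % PRIME
    return secret


# Hypothetical usage: 5 users hold shares, any 3 can rebuild the key,
# so up to 2 users may drop out.
shares = split_secret(123456789, threshold=3, num_shares=5, rng=random.Random(7))
print(recover_secret([shares[0], shares[2], shares[4]]))
```

Fewer than `threshold` shares reveal nothing about the secret, which is why the scheme is a common building block in secure aggregation protocols.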
Data aggregation techniques based on homomorphic encryption have been well studied for improving the privacy of FL systems. Zhang et al. [82] proposed a secure data aggregation system, named FedSky, for federated mobile crowdsensing, which is based on an effective worker selection mechanism. Instead of choosing a random cluster of users, the FedSky system chooses a cluster of users based on the size of the users' local data and the computing power of their mobile devices. Compared to the conventional FedAvg approach [83], the proposed system can significantly reduce the computation time of the users as well as the latency of the system. The performance evaluation on the MNIST dataset shows that the maximum training time can be as high as 6 hours under the experimental setting of sd = 15 and k = 100 (sd: the standard deviation for computational power; k: the number of selected workers).

K. CYBER PHYSICAL SYSTEMS
Cyber physical systems process multi-source and large-scale data in various domains of application. These data are generally composed of private personal and incomplete information, usually distributed across various devices and locations. Federated learning is proposed as an efficient approach for ensuring the privacy of cyber physical systems. Based on a Gaussian mechanism and an optimized federated soft-impute algorithm, Yang et al. [48] introduced a privacy-preserving tensor completion method. Through a formal recovery error bound, the proposed privacy-preserving tensor completion method is proven to provide a privacy guarantee with high accuracy.
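For readers unfamiliar with the Gaussian mechanism mentioned above, the sketch below shows the generic mechanism only, not the tensor completion method of [48]. It assumes the classic calibration of the noise scale, sigma = sqrt(2 ln(1.25/delta)) * sensitivity / epsilon, which holds for 0 < epsilon < 1; the function names and parameter values are illustrative.

```python
import math
import random


def gaussian_sigma(sensitivity, epsilon, delta):
    """Noise scale of the classic Gaussian mechanism.

    `sensitivity` is the L2 sensitivity of the released quantity.
    The calibration used here is only valid for 0 < epsilon < 1.
    """
    if not (0.0 < epsilon < 1.0) or not (0.0 < delta < 1.0):
        raise ValueError("requires 0 < epsilon < 1 and 0 < delta < 1")
    return math.sqrt(2.0 * math.log(1.25 / delta)) * sensitivity / epsilon


def gaussian_mechanism(values, sensitivity, epsilon, delta, rng):
    """Return a noisy copy of `values` with i.i.d. Gaussian noise added."""
    sigma = gaussian_sigma(sensitivity, epsilon, delta)
    return [v + rng.gauss(0.0, sigma) for v in values]


# Hypothetical usage: perturb a local update before sharing it.
local_update = [0.12, -0.40, 0.07]
noisy_update = gaussian_mechanism(local_update, sensitivity=1.0,
                                  epsilon=0.5, delta=1e-5,
                                  rng=random.Random(0))
print(noisy_update)
```

A smaller epsilon or delta gives a stronger privacy guarantee at the price of a larger sigma, which is the accuracy trade-off that a recovery error bound has to account for.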

III. FEDERATED LEARNING-BASED CYBER SECURITY INTRUSION DETECTION
Tab. 4 presents the federated learning-based systems for intrusion and malware detection in IoT applications.

A. FEDERATED LEARNING-BASED ANOMALY DETECTION
Federated learning is a decentralized machine learning approach that exploits the computing power of edge devices with no explicit exchange of user data patterns. The local models are trained on user data on the device, and those models are forwarded to a central server. Since it is trained on sensitive user data, federated learning can suffer from machine learning attacks against the locally created models. To overcome this problem, Al-Marri et al. [89] proposed an IDS based on federated mimic learning. The proposed system is implemented and evaluated using Python on Google Colab with the real-world NSL-KDD dataset; the results show 98.11% detection accuracy with federated mimic learning compared to centralized machine learning-based IDSs. To address the need for securing traffic and maintaining privacy in heterogeneous networks, Li et al. [90] designed a distributed IDS based on federated learning for satellite-terrestrial integrated networks for analyzing and blocking harmful traffic, especially distributed denial of service (DDoS) attacks. The proposed IDS uses two technologies, namely, 1) homomorphic encryption to provide secure multi-party computing in federated learning and 2) a convolutional neural network for achieving higher recognition accuracy.
To detect various types of cyber threats against industrial cyber physical systems, Li et al. [92] designed an IDS based on federated learning with a convolutional neural network and a gated recurrent unit. The proposed IDS employs the Paillier public-key cryptosystem to ensure that the model parameters remain secure and private throughout the training process. The performance evaluation on the gas pipeline system dataset shows the following results: F-score = 98.14 %, recall = 97.47 %, precision = 98.85 %, and accuracy = 99.20 %, which are better than those of three related works [57], [95], and [77]. Mothukuri et al. [85] use a Gated Recurrent Unit (GRU)-based anomaly detection approach to provide real-time proactive recognition of intrusions in IoT networks through the use of decentralized device data. The proposed IDS can preserve the integrity of data stored on local IoT devices by sharing only the learned weights with the federated learning's central server. Huong et al. [86] designed an IDS, named LocKedge, for IoT networks. The LocKedge system performs the detection task directly at the edge layer with high accuracy. The detection system is based on two modules: feature extraction and classification. The feature extraction stage focuses on reducing the number of features from the input samples that are fed to the detection stage. The performance evaluation on the BoT-IoT dataset shows that the federated learning results are lower than those of its centralized counterpart. Chen et al. [94] proposed a federated deep autoencoding Gaussian mixture model, named FDAGMM, for network anomaly detection. In the performance evaluation on the network intrusion detection dataset KDDCUP 99, the results show that the FDAGMM model is more efficient in three metrics, namely, F1-Score, Precision, and Recall, than the deep autoencoding Gaussian mixture model.
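To make concrete why an additively homomorphic cryptosystem such as Paillier suits federated aggregation, the following toy sketch lets a server sum encrypted integer updates without decrypting any single one. It is not the protocol of [92]: the primes are far too small to be secure, the generator is fixed to g = n + 1, real-valued weights are assumed to have been converted to integers by a fixed-point encoding, and all function names are illustrative.

```python
import math
import random


def keygen(p, q):
    """Toy Paillier key generation with generator g = n + 1."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # valid because g = n + 1 (needs Python 3.8+)
    return n, (lam, mu)


def encrypt(n, m, rng):
    """Encrypt integer m (negative values are encoded modulo n)."""
    n2 = n * n
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    # (n + 1)^m mod n^2 equals 1 + m*n mod n^2.
    return ((1 + (m % n) * n) % n2) * pow(r, n, n2) % n2


def add_encrypted(n, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts."""
    return (c1 * c2) % (n * n)


def decrypt(n, private_key, c):
    lam, mu = private_key
    u = pow(c, lam, n * n)
    m = ((u - 1) // n) * mu % n
    return m - n if m > n // 2 else m  # map back to a signed integer


# Hypothetical usage: three clients send encrypted fixed-point updates.
n, private_key = keygen(2**31 - 1, 2**61 - 1)  # insecure demo primes
rng = random.Random(0)
ciphertexts = [encrypt(n, update, rng) for update in (12, -5, 30)]
aggregate = ciphertexts[0]
for c in ciphertexts[1:]:
    aggregate = add_encrypted(n, aggregate, c)
print(decrypt(n, private_key, aggregate))
```

In a deployment the decryption key would be held by the clients or split among several parties, so that the aggregating server only ever handles ciphertexts.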
Based on on-device inference of detection models and local training, Rahman et al. [91] proposed a federated learning-based system for detecting IoT intrusion, which can preserve data privacy. The IoT devices can take advantage of the knowledge of their peers by sharing only the updates with a remote server. Then, the remote server aggregates the updates and exchanges an enhanced detection framework with the collaborating devices. The performance evaluation on the NSL-KDD dataset shows that the proposed system has an accuracy fluctuating around 83.09 %. Cetin et al. [96] proposed an IDS, named FedAGRU, which is based on federated learning. For collaborative training, FedAGRU takes advantage of the computing resources of edge devices and local datasets for training the model and then uploads the settings to a server. In the performance evaluation on three network intrusion detection datasets, namely, the KDD CUP 99 data set, the CICIDS2017 data set, and the WSN-DS wireless network data set, the results show that the FedAGRU system provides less communication overhead with higher detection accuracy. McElwee et al. [97] proposed a federated analysis security triage tool, named FASTT, for prioritizing and responding to IDS alerts. The FASTT tool resolves the issue of the high volume of intrusion detection threats that need to be reviewed by security analysts in a manual process. Based on the TensorFlow deep neural network approach, FASTT can categorize intrusion detection alerts and identify which types of security analysts are to review the threats.
To construct a generalized model for anomaly detection in the industrial internet of things, Wang et al. [28] proposed hierarchical federated learning, where every local model is trained by a deep reinforcement learning algorithm. As the local datasets are not needed during federated learning, the privacy leakage risk is minimized. Moreover, through injecting a degree of privacy leakage and an interaction function into the anomaly detection concept, the proposed system can significantly increase the accuracy of detection.
Based on a boosting method of logistic model trees, Cvitic et al. [29] proposed a DDoS traffic detection approach for different IoT device classes. For collecting federated data from heterogeneous sources in IoT networks, Moustafa et al. [98] introduced the TON_IoT testbed datasets for Windows operating systems, which are deployed in three layers: edge, fog, and cloud. The edge layer includes IoT devices, the fog layer includes gateways and virtual machines, and the cloud layer includes cloud services connected to the other two layers. The TON_IoT datasets cover the following nine attack families: 1) Man-In-The-Middle (MITM) attack, 2) Password attack, 3) Cross-site Scripting (XSS) attack, 4) Injection attack, 5) Backdoor attack, 6) Ransomware attack, 7) Distributed Denial of Service (DDoS) attack, 8) Denial of Service (DoS) attack, and 9) Scanning attack. To provide wireless edge network security in IoT networks, Chen et al. [88] proposed a federated learning-based intrusion detection system, named FedAGRU, which employs gated recurrent unit (GRU) models. Specifically, the proposed FedAGRU system differs from the existing centralized learning approaches by providing updates to the global learning model rather than sharing the original data directly between the central server and edge devices. Based on three datasets, namely, the KDD CUP 99 data set, the CICIDS2017 data set, and the WSN-DS wireless network data set, the results demonstrate that FedAGRU increases the accuracy of detection by around 8% compared to other centralized learning approaches. Moreover, the communication cost of FedAGRU is about 70% lower than that of other federated learning approaches.

B. FEDERATED LEARNING-BASED MALWARE DETECTION
There are billions of IoT devices without suitable protection measures which have been developed and deployed in the last few years. The susceptibility of these devices to malware has increased the requirement for effective detection technologies to identify devices that are compromised by malware inside the network. Taheri et al. [87] proposed a federated learning-based system, named Fed-IIoT, for Android malware detection. To impersonate the environment of a poisoned sample, the Fed-IIoT system employs a generative adversarial network. The performance evaluation on three IoT datasets (the Contagio dataset, Drebin dataset, and Genome dataset) using different features shows that the Fed-IIoT system performs significantly better than other local adversarial training mechanisms. To perform malware detection in cloud computing environments, Payne and Kundu [93] proposed a hierarchical approach towards deep federated defences. Their proposed approach formalized malware detection as a graph and hypergraph learning problem.

IV. FEDERATED LEARNING WITH BLOCKCHAIN
Blockchain is a decentralized, provenance-preserving, immutable ledger technique. It provides an efficient method to remove a central server that is prone to attacks in an untrusted computing environment [110], [111]. To alleviate the security problems that involve a central server in federated learning, the blockchain model can be integrated with federated learning, as shown in Fig 8 [112]-[118]. Tab. 5 presents works on blockchain and federated learning-based solutions for cyber security in IoT applications.

A. PERMISSIONED BLOCKCHAIN-BASED SOLUTIONS
The implementation of distributed multi-party data sharing in IoT applications is challenged by several issues. Based on permissioned blockchain, Lu et al. [106] developed a differential private multi-party data model sharing mechanism, which is combined with federated learning. The proposed mechanism can reduce the threat of data leakage, which enables data owners to have more control over the access to stored and shared data. The simulation results on two real-world data sets (i.e., Reuters dataset and 20 newsgroups dataset) show that the proposed system can guarantee the quality of shared data as well as differential privacy.
To enhance the security of federated learning, Majeed and Hong [107] developed a blockchain-based solution, named FLchain, which can be applied in multi-access edge computing. The FLchain solution uses two ideas, namely, 1) the channels for learning multiple global models and 2) the global model state tree. Specifically, the aggregation of local model updates is updated and stored in the blockchain network.
Połap et al. [104] developed a privacy-preserving federated learning scheme, which is based on blockchain technology for securing the Internet of Medical Things. The use of the blockchain technology here provides security to updates of local data, which are critical for the aggregation of federated learning, and are derived from trusted devices with authenticity. Furthermore, the local updates can be stored as transactions in the blockchain network. The simulation results on the Tuberculosis Chest X-ray Image Data Sets with a convolutional neural network as a learning classifier show that the proposed scheme achieves an average effectiveness of 73.7%. Based on a multi-agent system, Połap et al. [99] developed a security architecture that combines the implementation of blockchain technology and federated learning for securing the Internet of Medical Things (IoMT). The proposed architecture enables assigning specific tasks to separate agent units as well as sharing and protecting private data using blockchain technology. The performance evaluation on the Skin Cancer MNIST dataset with a 70:30 ratio between training and validation shows that the proposed architecture achieved an accuracy of 80 % after 25 iterations.
Lugan et al. [108] introduced a scalable security architecture by deriving a new paradigm of trusted coalitions with a high degree of trustworthiness which provides privacy-preserving of data as well as motivation for coalition participation in the absence of a central authority. The proposed architecture is based on permissioned blockchains, which enable deep learning that is distributed with rising degrees of security and privacy. Lu et al. [18] proposed a permissioned blockchain empowered federated learning scheme, using digital twins to support long-distance communication between edge servers and end users in edge computing. The performance evaluations on the CIFAR10 dataset show that the learning loss of the proposed scheme is improved through the optimization process.
Through a shared machine learning model, Doku et al. [109] proposed a federated learning scheme, named iFLBC, which is based on blockchain technology. The iFLBC scheme generates a shared model based on the aggregation of the trained models. The aggregated model is then used by IoT users to provide edge intelligence to end users. The Proof of Common Interest (PoCI) is used by the iFLBC scheme as a consensus algorithm to determine relevant data.
To perform authentication and trust management of federated nodes as well as the edge training model, Rahman et al. [103] introduced a hybrid lightweight federated learning platform that uses smart blockchain contracts for securing the Internet of Health Things (IoHT). Their platform is designed to enable inference process model learning, and the complete encryption of a dataset. Here a blockchain is used to aggregate the updated model parameters using multiplicative encryption, while the additive encryption operation is performed by each federated edge node.

B. PERMISSIONLESS BLOCKCHAIN-BASED SOLUTIONS
Permissionless blockchains (a.k.a. public blockchains) enable any person to perform operations and to join as a validator. Li et al. [100] introduced a crowdsourcing protocol, called CrowdSFL, which is based on federated learning and blockchain technology. The CrowdSFL protocol uses a re-encryption algorithm based on Elgamal to provide higher security with less overhead. The simulation results show that the proposed CrowdSFL protocol can resist the following malicious behaviors: Malicious miners, Malicious workers, and Malicious requesters. To resist poisoning attacks as well as membership inference attacks in 5G networks, Liu et al. [101] developed a blockchain-based federated learning protocol. The proposed protocol preserves data privacy based on the local differential privacy technology. The performance evaluation using two datasets, namely, the MNIST dataset and the CIFAR-10 dataset, shows that the proposed protocol can deter poisoning attacks.
Wang et al. [102] proposed a secure decentralized multiparty learning scheme, named BEMA, for edge computing-based IoT applications. Specifically, each party in the BEMA scheme distributes its local model while processing the models received from other parties on its local dataset and identifying the models that require certification. According to the BEMA scheme, each party broadcasts the certification message to the corresponding parties. Thanks to the certification messages, the parties are not required to exchange their datasets with any other parties. The simulation results on the MNIST dataset show that the BEMA scheme is efficient in terms of prediction accuracy under attacks compared to the baseline models.
Based on the features of blockchain technology and federated learning, Sharma et al. [105] proposed a distributed computing defence scheme for securing the Internet of Battle Things. The proposed system is composed of four different layers: data layer, edge layer, fog layer, and cloud layer. The performance evaluation shows that the proposed scheme achieved an accuracy rate of more than 92.7 %.

V. THREAT MODELS IN FEDERATED LEARNING
As federated learning is based on the collaborative action of all edge devices to build a machine learning model, a machine learning model can be faked when only a couple of edge devices are operating incorrectly [137]. Tab. 6 presents the vulnerabilities that can be exploited by adversaries in federated learning-based systems for IoT networks.

A. INFORMATION LEAKAGE
The problem of information leakage from collaborative deep learning is addressed by Hitaj et al. [120], where the authors proposed an attack that leverages the real-time nature of the learning process, enabling the adversary to train a generative adversarial network (GAN) that generates samples of the targeted training data that was meant to be protected from the adversary. Based on the analysis of the privacy leakage of TernGrad [138], Dong et al. [51] proposed a secure and robust federated learning protocol, named EaSTFLy, which can be applied in IoT networks. The EaSTFLy protocol uses privacy-preserving technologies, namely, Paillier homomorphic encryption (PHE) and Shamir's threshold secret sharing (TSS), in order to solve arising privacy issues. The performance evaluation on two datasets, namely, MNIST and SVHN, shows that the EaSTFLy protocol can resist semi-honest adversaries.
Training a deep neural network over a large dataset can consume significant time and resources. One popular approach to scaling is to fragment the training dataset, simultaneously train different networks on each of these subsets, and then share parameters via a parameter server. When training, a local model retrieves the parameters from the server, computes any required changes from its own training dataset, and then sends these changes directly back to the server, which updates the overall parameters. Melis et al. [136] found that the leakage of unintended features exposes collaborative learning to powerful inference attacks.

B. POISONING ATTACK
Poisoning attacks focus on degrading the accuracy of a machine learning model by falsifying the aggregation through the use of poisoned model updates, as shown in Fig 9. Tan et al. [137] categorized poisoning attacks by the source of the poisoned model updates into two types, namely, model poisoning and data poisoning. Data poisoning is performed by changing the training data in the compromised edge devices, while model poisoning uses some predefined rules to generate poisoned model updates. Zhao et al. [127] proposed a defense security system against poisoning attacks using the concept of generative adversarial networks. The proposed system removes adversaries using auditing data that is generated by generative adversarial networks. Based on microaggregation and Gaussian mixture models, Singh et al. [126] designed a security system, where the clients of the system self-identify as members of a minority group and advertise relevant features to their peers. Even with a low proportion of malicious edge servers, data poisoning attacks can significantly decrease recall and classification accuracy, as discussed by Tolpegin et al. [139]. Fang et al. [125] proposed a new idea to defend against local model poisoning attacks based on two concepts, namely, Reject on Negative Impact (RONI) and TRIM. RONI evaluates the influence of every training instance on the learned model's error rate and deletes the training instances that have a significant negative influence. Ma et al. [128] proposed a secure federated learning mechanism based on trimmed optimization with multiple keys, which can resist a range of poisoning attacks. Taheri et al. [87] use two concepts, namely, Federated Generative Adversarial Network (FedGAN) and Generative Adversarial Network (GAN), to create an architecture based on federated learning, named Fed-IIoT. The proposed Fed-IIoT architecture can resist dynamic poisoning attacks in the server-side components.
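To illustrate how a robust aggregation rule can blunt poisoned model updates, the sketch below implements a coordinate-wise trimmed mean, a commonly used Byzantine-robust aggregator. It is a generic example and should not be read as the specific defense of [125] or [128]; the function name, the `trim` parameter, and the example updates are illustrative assumptions.

```python
def trimmed_mean_aggregate(updates, trim):
    """Coordinate-wise trimmed mean of client updates.

    For every coordinate, the `trim` smallest and `trim` largest values
    are discarded and the remaining values are averaged, which bounds
    the influence of up to `trim` outlying clients per coordinate.
    """
    num_clients = len(updates)
    if num_clients == 0 or 2 * trim >= num_clients:
        raise ValueError("need more than 2 * trim client updates")
    dim = len(updates[0])
    aggregated = []
    for j in range(dim):
        column = sorted(update[j] for update in updates)
        kept = column[trim:num_clients - trim]
        aggregated.append(sum(kept) / len(kept))
    return aggregated


# Hypothetical usage: three honest clients and one poisoned update.
client_updates = [
    [1.0, 2.0],
    [1.2, 1.8],
    [0.8, 2.2],
    [100.0, -100.0],  # poisoned update
]
plain_mean = [sum(col) / len(col) for col in zip(*client_updates)]
print(plain_mean)
print(trimmed_mean_aggregate(client_updates, trim=1))
```

The plain mean is dragged far from the honest values by the single outlier, whereas the trimmed mean stays close to them. The price is that some honest information is also discarded, and the rule only helps when the number of poisoned clients does not exceed `trim`.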

C. JAMMING ATTACK
Adversaries can initiate a jamming attack against federated learning-based security and privacy systems, where the intruder's intention is to maliciously interrupt the victim network's communication by interfering or colliding at the recipient's side. Mowla et al. [129] proposed a security architecture using federated learning for the detection of cognitive jamming attacks. Based on a Dempster-Shafer theory-based client group prioritization technique, the detection can be performed on the device while taking into account the unbalanced sensory data characteristics of the environment under training.

D. BYZANTINE ATTACK
An attacker distributes a local malicious model to other participants to modify the result of the classification of the maxmodel predictor. This attacker can induce errors in their local model update process. Wang et al. [102] designed a secure federated learning system based on blockchain technology that can defend against Byzantine attacks. Jebreel et al. [130] designed a novel concept against Byzantine attacks where the basic concept is the analysis of a small fraction of the updates, instead of analyzing the whole updates. Sun et al. [47] proposed adaptive federated learning with digital twin, which is based on the concept of interaction records and learning quality that rely on the use of malicious updates to mitigate the malicious data threat.

E. ADVERSARIAL ATTACK
When an adversary is able to compromise an IoT device without being detected, it can attempt to ''poison'' the system's training operation by falsifying packets as adversarial samples that are designed to influence the model's learning in a manner that prevents the malicious activity from being detected [140], [141]. Hitaj et al. [120] use differential privacy at different granularities against generative adversarial networks. Song et al. [123] proposed a federated defense against adversarial attacks using deep neural networks.
Qiu et al. [124] proposed an adversarial attack against deep learning-based network intrusion detection systems, targeting the state-of-the-art Kitsune system [61]. The proposed attack uses saliency maps to identify the critical features. Ibitoye et al. [121] showed the impact of adversarial samples on an intrusion detection system based on a deep learning approach in the environment of an IoT network. Specifically, the study uses two deep learning approaches, namely, a typical Feed-forward Neural Network (FNN) and a Self-normalizing Neural Network (SNN). The performance results on the BoT-IoT dataset show that an intrusion detection system based on an FNN performs better than one based on an SNN.
The concept of Generative Adversarial Network (GAN) was introduced by Goodfellow et al. [142], and is used by Hassan et al. [122] to generate adversarial attack data and to attempt to classify these generated data. The GAN is composed of two components, namely, 1) a generator and 2) a discriminator. Fig 10 illustrates GAN with FL-based IoT for cyber security [143]. To improve the reliability of the attack/non-attack detection system for a non-noisy as well as an adversarial setting, the authors proposed a robust decision boundary optimization approach. To train the downsampler, the proposed system uses a novel cooperative training algorithm, which provides an improved delivery for noisy examples with the real distribution. In the performance evaluation on a SCADA dataset, the results show that the proposed system can classify with a binary cross-entropy loss score of 0.47 and an accuracy of 95.55 %.
Recently, Rosenberg et al. [144] proposed a taxonomy for the adversarial attacks in cyber security based on the following seven distinct attack characteristics:
• Attack's output: It indicates two types of attacks that aim to modify a feature's values, namely, feature vector attack and end-to-end attack.
• Perturbed features: This characteristic of the attack consists of the features being added or modified.
• Attacker's goals: This characteristic of the attack consists of the security goals that the attack aims to violate, such as authentication, confidentiality, privacy, integrity, and availability.
• Attack's targeting: It indicates three types, including, label indiscriminate attack, label-targeted attack, and feature-targeted attack.
• Attacker's training set access: It indicates the type of the adversary's access to the training set used by the classifier.
• Attacker's knowledge: This characteristic of the attack is based on the amount of knowledge of the attacker regarding the classifier.
• Targeted phase: It indicates two phases, including, training phase attack and inference phase attack.

F. PRIVACY LEAKAGE ATTACK
In a distributed learning approach, the parameters of an updated local model on IoT devices can keep disclosing some information regarding data that has been employed during training. Furthermore, attackers can deduce whether an IoT device has been involved in some mission from its local model updates via differential attacks. As each task has specified detection positions, the privacy of the location of the IoT devices involved can be leaked. To resist such privacy leakage attacks, Wang et al. [25] proposed a framework that uses three technologies, namely, blockchain, local differential privacy, and reinforcement learning. Fig 11 illustrates a privacy leakage attack in federated learning where a malicious actor compromises the aggregation server and leaks the data of participating entities.

G. SHILLING ATTACK
Shilling attackers attempt to affect recommendation systems by producing many malicious user profiles and rating target items with extreme ratings to increase or decrease their popularity. Jiang et al. [131] proposed designing four features from the gradient matrices in order to detect shilling attackers. Specifically, the proposed approach trains a semi-supervised Bayes classifier. The performance evaluation on two real-world datasets, namely, MovieLens and Netflix, demonstrates that the proposed approach can not only identify shilling attackers but also improve the performance of recommendations significantly.

H. INFERENCE ATTACK
An inference attack is a data mining technique that is conducted by examining data to obtain illegitimate knowledge regarding a specific topic or database. Hao et al. [58] proposed a privacy-enhanced federated learning scheme that can ensure the privacy of training data during and after the training process as well as resist model inversion attacks and membership inference attacks. Liu et al. [54] proposed a federated extreme gradient boosting scheme that is based on differential privacy and the homomorphism of the Paillier cryptosystem against the inference attack. Liu et al. [101] proposed secure federated learning for detecting poisoning and membership inference attacks using the local differential privacy technology.

I. OTHER ATTACKS
There are other offensive strategies that can be used to attack ML models, such as white-box, black-box, and gray-box attacks. Black-box attacks only provide the ability to query the network's output, or even have no network knowledge at all, while white-box attacks assume that the target model is fully available to the attacker [119]. Gray-box attacks train a generative model to produce adversarial examples and assume access to the target model only in the training phase [134], [135]. These three methods are generally categorized as adversarial attack methods.

VI. EXPERIMENTATION
We train three deep federated learning-based IDS models for cyber attack detection in IoT, namely, RNN, CNN, and DNN. We chose the Sherpa.AI framework for its advantages compared to other frameworks [145]. The source code for the experimental evaluation of this article is available upon request. 1

1) FEDERATED LEARNING PROCESS
In Fig 12 we illustrate the learning process applied in our deep federated learning-based IDS model. Alg. 1 shows pseudocode for the steps taken to train the various client sets, which is adapted from [9]. At the beginning, a fraction C of the K clients is picked by the aggregation server to join the FL workflow and carry out computations for R federated learning rounds. The aggregation server produces a generic model with a random set of initial weights $w$. Next, each client $k$ retrieves the generic model from the aggregation server. Every client retrains the generic model locally with its private data and calculates a new local set of weights $w_{t+1}^{k}$ for the freshly generated local model. The clients share their updated models with the server. Then, the server aggregates the parameters of all clients as $w_{t+1} = \sum_{k=1}^{K} \frac{n_k}{n} w_{t+1}^{k}$, where $n_k$ is the number of samples held by client $k$ and $n$ is the total number of samples. After that, the aggregation server sends the updated global model to the clients, where each client applies the updated parameters to improve the global model. These steps are repeated until the model converges.
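The round structure described above can be sketched in a few lines. The following is a minimal illustration of federated averaging, not the Sherpa.AI-based implementation used in the experiments: the local model is a hypothetical one-dimensional linear regressor trained by gradient descent, the helper names (`local_update`, `fed_avg`, `run_federated`) and all hyperparameters are assumptions, and the weighted average is taken over the clients selected in each round.

```python
import random


def local_update(weights, data, lr=0.1, epochs=5):
    """Hypothetical local training: gradient descent on y = w*x + b
    with a mean squared error loss over the client's private data."""
    w, b = weights
    for _ in range(epochs):
        grad_w = sum(2 * ((w * x + b) - y) * x for x, y in data) / len(data)
        grad_b = sum(2 * ((w * x + b) - y) for x, y in data) / len(data)
        w, b = w - lr * grad_w, b - lr * grad_b
    return [w, b]


def fed_avg(client_weights, client_sizes):
    """Weighted average of client weights: sum_k (n_k / n) * w_k."""
    n = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(n_k / n * w_k[j] for w_k, n_k in zip(client_weights, client_sizes))
        for j in range(dim)
    ]


def run_federated(clients, rounds, fraction, rng):
    """Run `rounds` of FedAvg, sampling a `fraction` of clients per round."""
    global_weights = [rng.uniform(-1, 1), rng.uniform(-1, 1)]  # random init
    num_selected = max(1, int(fraction * len(clients)))
    for _ in range(rounds):
        selected = rng.sample(clients, num_selected)
        local_weights = [local_update(list(global_weights), d) for d in selected]
        sizes = [len(d) for d in selected]
        global_weights = fed_avg(local_weights, sizes)
    return global_weights


# Hypothetical usage: three clients observe different ranges of x
# drawn from the same underlying relation y = 2x + 1.
clients = [
    [(x / 10, 2 * x / 10 + 1) for x in range(0, 4)],
    [(x / 10, 2 * x / 10 + 1) for x in range(4, 8)],
    [(x / 10, 2 * x / 10 + 1) for x in range(8, 11)],
]
print(run_federated(clients, rounds=300, fraction=1.0, rng=random.Random(0)))
```

Only the weight lists cross the client boundary in this sketch; the `(x, y)` samples never leave `local_update`, which is the property the federated setting is meant to preserve.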

B. DATASETS DESCRIPTION AND PRE-PROCESSING
Datasets are mandatory for training and evaluating IDSs in IoT networks. The selection of the appropriate datasets for a specific task is also of great importance. The datasets that can be used in the performance evaluation of FL approaches for IoT networks are reviewed in Tab. 7. There are three datasets, namely, MNIST [146], Fed. EMNIST [147], and CIFAR-10 [151], that can be used as real object classification tasks for evaluating adaptive FL for Industrial IoT. However, these datasets are not suitable for evaluating federated learning-based IoT intrusion detection systems. Security researchers use cyber security datasets such as NSL-KDD [152] and CICIDS 2017-2018 [153] for the performance evaluation of federated learning-based intrusion detection systems [159]. These two datasets do not contain IoT and IIoT traffic. In addition, NSL-KDD [152] is obsolete in the age of IoT networks (i.e., Fog, Edge, Cloud, Virtualization, 6G, etc.). For evaluating FL-based cyber security solutions in IoT networks, the security research community uses the following three datasets: TON_IoT [149], Bot-IoT [154], and MQTTset [156]. They are chosen specifically because they are built from heterogeneous data sources as well as collected from IoT and IIoT sensor telemetry datasets.
1 https://github.com/Ferrag/FLCYBERSECURITYIOT
FL-based tasks require the data distribution to be Non-Independent and Identically Distributed (Non-IID) and unbalanced, which reflects the properties of real-world scenarios. However, due to the lack of FL-specific datasets, any pre-existing public dataset with engineered partitions can be used to mimic data federations, as done in our experiment. Based on the dataset review presented in Tab. 7, we selected and used three real traffic IoT-based datasets, namely: the BoT-IoT dataset, the MQTTset dataset, and the TON_IoT dataset. Tab. 8 provides a list of flow types and sample counts for each dataset. The description and pre-processing of each dataset are as follows:

1) BoT-IoT DATASET
The BoT-IoT dataset was produced at the Cyber Range Lab at UNSW Canberra as a result of building a real-life network environment integrating a mix of normal and botnet traffic [154], [160]-[164]. The captured PCAP files total 69.3 GB, with over 72 million records. The dataset is available in a variety of file formats, including PCAP, generated argus files, and CSV files. We used the CSV files for our experimental evaluations. The dataset includes various types of cyber attacks, including:
• DDoS & DoS attacks: The purpose of these attacks is to make services inaccessible to legitimate users by using a group of compromised botnets. Both the DDoS and DoS attacks over TCP and UDP were carried out using the Hping3 tool.
• Reconnaissance: Also known as probing attacks, this is a type of malicious behavior that collects user data by scanning remote systems. The dataset contains two types of such attacks, namely: port scanning using Hping3, and operating system fingerprinting using the Nmap and Xprobe2 tools.
• Theft: The objective of these cyber attacks is to compromise sensitive data. The dataset contains two types of such attacks, namely Keylogging and Data theft attacks, both of which are carried out using the Metasploit framework.
After dropping missing values, we also dropped the 'pkSeqID', 'saddr', 'sport', and 'daddr' features in order to prevent overfitting, and we encoded the 'proto' feature with one-hot encoding. Then, we normalized the other numerical features with the Z-Score normalization strategy as follows:
$$z = \frac{x - \mu}{\sigma}$$
where x denotes the value of the feature, µ denotes the mean, and σ denotes the standard deviation.
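The two pre-processing steps named above, one-hot encoding of a categorical feature and Z-Score normalization of a numerical feature, can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the pipeline used in the experiments: the helper names `one_hot` and `z_score` are hypothetical, the population standard deviation is assumed (the text does not say whether the population or sample estimate was used), and a constant feature is mapped to zeros to avoid division by zero.

```python
from statistics import mean, pstdev
from typing import List, Sequence, Tuple


def one_hot(values: Sequence[str]) -> Tuple[List[str], List[List[int]]]:
    """One-hot encode a categorical column; returns (categories, encoded rows)."""
    categories = sorted(set(values))
    position = {category: i for i, category in enumerate(categories)}
    rows = []
    for value in values:
        row = [0] * len(categories)
        row[position[value]] = 1  # exactly one active column per sample
        rows.append(row)
    return categories, rows


def z_score(values: Sequence[float]) -> List[float]:
    """Normalize a numerical column as (x - mu) / sigma."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (assumption)
    if sigma == 0:
        return [0.0 for _ in values]  # constant feature carries no information
    return [(x - mu) / sigma for x in values]


# Example with made-up values for a protocol column and a numerical column.
categories, encoded = one_hot(["tcp", "udp", "tcp"])
normalized = z_score([2.0, 4.0, 6.0])
```

In a federated setting, µ and σ would have to be computed either per client or agreed on globally; the text does not state which was done, so the sketch leaves that choice open.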

2) MQTTset DATASET
The MQTTset dataset was introduced by Vaccari et al. [156] to address the lack of support for specific protocols that IoT environments are currently using. It consists of Message Queue Telemetry Transport (MQTT) protocol-based traffic between various IoT devices to imitate a smart IoT environment. It comprises real-world attacks tailored to target the IoT environment, including:
• DoS: This attack was conducted using the MQTTmalaria tool.
• Brute Force: This type of attack tries to recover the user credentials used by MQTT, using the MQTTSA tool.
• Malformed data: this type of attack is designed to trigger several malformed packets and send them to the broker, attempting to raise exceptions on the selected service.
• SlowITe: the Slow DoS against IoT Environments attack is a new DoS approach that targets the MQTT protocol, which generates a huge number of connections to the MQTT broker.
• MQTT Publish Flood: This attack, carried out using the IoT-Flock tool, seeks to overload the system through a single connection rather than by instantiating multiple connections.

3) TON_IoT DATASET
This dataset was introduced by the IoT Lab of UNSW Canberra Cyber, the School of Engineering and Information Technology (SEIT), UNSW Canberra at the Australian Defence Force Academy (ADFA) [150] for the collection and analysis of mixed data sources from IoT and Industrial IoT (IIoT). The benchmark was conducted using several virtual machines that included multiple operating systems to address the cross-layer connectivity between the three tiers: IIoT, Cloud, and Edge/Fog systems. Parallel processing was used to assemble the datasets and gather diverse benign and attack traffic for the IoT telemetry data service. It includes different attack techniques, such as:
• Password Cracking: This type of attack is intended to allow the attacker to overcome authentication schemes in order to compromise the IIoT devices. It was conducted using the CeWL and Hydra toolkits.
• Backdoor: With this kind of attack, it is possible for attackers to obtain non-authorized remote access to IIoT devices affected by a backdoor malware. The framework used for these attacks is the Metasploitable3 framework.
• Injection: With this attack, the adversary aims to inject malicious data into the IIoT applications.
• XSS: the adversary frequently tries to run malicious commands in IIoT applications through a web server.
• Scanning: scanning tools, such as Nmap and Nessus tools, allow the attacker to perform scanning attacks against the IoT/IIoT devices and MQTT broker in a public network.
To prevent overfitting, we dropped the 'date' and 'saddr' features. Then, we used the Z-Score normalization strategy for numerical features.

C. USE CASES AND PERFORMANCE METRICS
To evaluate our experiment, we employed two use cases, namely:
• Centralized learning approach: The data is located at a single location, where three well-known deep learning classifiers, i.e., DNN, CNN, and RNN, are trained.
• Federated learning approach: The data is located across different clients, and an aggregation server is used to aggregate the models of the clients. We also used the same classifiers as in the previous approach.
We used three sets of client distributions: K = 5, K = 10, and K = 15, with two data distribution methods: 1) independent and identically distributed (IID) and 2) non-independent and identically distributed (Non-IID), over 50 federated learning rounds. Tab. 9 shows the different parameters used in the three deep learning models for the centralized and federated learning approaches.
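To make the two data distribution methods concrete, the sketch below partitions sample indices across K clients in an IID and a Non-IID fashion. The text does not specify how the partitions were engineered, so the scheme here is purely an assumption: the IID split shuffles all samples and deals them out evenly, while the Non-IID split restricts each client to a small number of classes (`classes_per_client`, a hypothetical parameter). Both function names are hypothetical.

```python
import random
from collections import defaultdict
from typing import Hashable, List, Sequence


def partition_iid(labels: Sequence[Hashable], num_clients: int,
                  seed: int = 0) -> List[List[int]]:
    """Shuffle all sample indices and deal them out evenly to the clients."""
    rng = random.Random(seed)
    indices = list(range(len(labels)))
    rng.shuffle(indices)
    return [indices[k::num_clients] for k in range(num_clients)]


def partition_non_iid(labels: Sequence[Hashable], num_clients: int,
                      classes_per_client: int = 2,
                      seed: int = 0) -> List[List[int]]:
    """Give each client samples from only a few classes (label skew).

    Note: if num_clients is smaller than the number of classes, classes that
    no client owns are left unassigned in this simple scheme.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    classes = sorted(by_class)
    # Record which clients own each class (classes assigned cyclically).
    owners = defaultdict(list)
    for k in range(num_clients):
        for j in range(classes_per_client):
            owners[classes[(k + j) % len(classes)]].append(k)
    parts: List[List[int]] = [[] for _ in range(num_clients)]
    for label, clients in owners.items():
        samples = by_class[label][:]
        rng.shuffle(samples)
        # Split the samples of this class among the clients that own it.
        for position, index in enumerate(samples):
            parts[clients[position % len(clients)]].append(index)
    return parts


# Toy label vector with three classes of ten samples each.
toy_labels = [0] * 10 + [1] * 10 + [2] * 10
iid_parts = partition_iid(toy_labels, num_clients=3)
non_iid_parts = partition_non_iid(toy_labels, num_clients=3)
```

With this scheme, every IID client is likely to see all classes, whereas each Non-IID client sees at most `classes_per_client` of them, which mirrors the situation in which a client has limited knowledge of some attack classes.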
When conducting intrusion detection performance analysis, the most common metrics used are:
• True Positive (TP): is used to determine the number of attack patterns that are properly classified as attacks.
• False Positive (FP): is used to determine the number of normal patterns that are wrongly classified as attacks.
• True Negative (TN): is used to determine the number of normal patterns that are properly classified as normal.
• False Negative (FN): is used to determine the number of attack patterns that are wrongly classified as normal.
• Accuracy: is used to determine the proportion of correct classifications to the total number of entries, which is given by:
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
• Precision: denotes the proportion of correct intrusion classes to the total amount of predicted intrusion results, which can be given by:
$$\text{Precision} = \frac{TP}{TP + FP}$$
• Recall: denotes the proportion of proper attack classifications relative to the overall count of all samples that ought to have been identified as attacks, which is given by:
$$\text{Recall} = \frac{TP}{TP + FN}$$
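The metric definitions above translate directly into code. The following sketch assumes a binary setting in which the label 1 denotes an attack and 0 denotes normal traffic; the function names are hypothetical, and returning 0.0 for an empty denominator is a convention chosen here, not something stated in the text.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int],
                     attack: int = 1) -> Tuple[int, int, int, int]:
    """Count TP, FP, TN, FN, treating `attack` as the positive class."""
    tp = fp = tn = fn = 0
    for truth, predicted in zip(y_true, y_pred):
        if predicted == attack and truth == attack:
            tp += 1  # attack correctly flagged as attack
        elif predicted == attack:
            fp += 1  # normal traffic wrongly flagged as attack
        elif truth == attack:
            fn += 1  # attack wrongly classified as normal
        else:
            tn += 1  # normal traffic correctly classified as normal
    return tp, fp, tn, fn


def accuracy(tp: int, fp: int, tn: int, fn: int) -> float:
    total = tp + fp + tn + fn
    return (tp + tn) / total if total else 0.0


def precision(tp: int, fp: int) -> float:
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp: int, fn: int) -> float:
    return tp / (tp + fn) if (tp + fn) else 0.0


# Made-up ground truth and predictions for eight flows.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 1, 0, 1, 0]
tp, fp, tn, fn = confusion_counts(y_true, y_pred)
```

For the multi-class attack labels found in the three datasets, the same counts would be computed per class (one class against the rest) and then averaged; the averaging scheme is not specified in the text.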

2) FEDERATED LEARNING MODELS
In this experimental setup, rather than locating all data in one location and conducting the learning from there, a federated deep learning approach is used, where the data never leaves the client side and only the shared knowledge goes back and forth between the aggregation server and the participating clients. Fig 17 reports the validation accuracy for each global model against the centralized model across all datasets and all classifiers. Fig 17 (a) plots the validation accuracy achieved by the federated deep learning classifiers (DNN, CNN, RNN) with both the IID and Non-IID data distribution strategies for the Bot-IoT dataset. For the IID data distribution strategy, the federated deep learning global models were able to approximate the performance of the centralized learning models. For the Non-IID data distribution strategy, the global models struggled somewhat to match their IID performance, which is expected since, in this setting, not all clients hold samples from all classes; however, after 50 FL rounds, the overall performance was good. Fig 17 (b) and Fig 17 (c) illustrate the validation accuracy obtained by the federated deep learning classifiers with the IID and Non-IID data distribution strategies for the MQTTset and TON_IoT datasets, respectively. The same observations as for the first dataset apply to these two experiments.
Tab. 11 presents a detailed side-by-side comparison of the accuracies obtained by all global models and the highest and lowest accuracies of the best and worst clients in every set, at the first and the 50th rounds of federated deep learning. The first observation is that in the IID data distribution strategy, the Best, Worst, and Global models are consistently close to each other across all settings and datasets, even though the clients are trained on different class samples. The reason is that all clients can learn from all classes. The second observation is that at the 50th round of federated deep learning, the performance of all global models managed to approach the performance of the centralized model.
In the Non-IID case, at the first FL round, the Best, Worst, and Global models are nowhere near one another, which is expected since not all clients were trained on all classes. A good example is the Bot-IoT dataset with the CNN classifier and K = 15, where the worst client accuracy was 1.00%, but after 50 federated deep learning rounds, this same client reached an accuracy of 52.98% and the global model achieved 90.35%. This means that this client was able to benefit from the federated learning approach even though it has very limited knowledge of the attack classes in its local private data.

3) COMPARISON
The centralized intrusion detection approaches are capable of detecting intrusions with high accuracy. However, there are problems with these practices. First, and most importantly, there are privacy issues: since the data must be collected at a single entity, an attacker can target a single location for all data, and if that entity is compromised, all sensitive data will be breached. Second, given the huge flow of data coming from the end devices to that single entity, latency and processing are major concerns that must be addressed.
Federated learning-based intrusion detection systems, on the other hand, significantly reduce these issues while maintaining decent detection accuracy, which in many cases approached the performance of a centralized approach, as we showed with our federated deep learning models. Furthermore, given that the field of federated learning is still in its developmental stage, we expect that in the future, federated learning will replace centralized and traditional learning approaches in many machine learning-based domains, especially in areas where data privacy is a real concern.

VII. IMPORTANCE OF THE STUDY AND OPEN CHALLENGES
Federated learning is an emerging research area that is still in its developmental stage. Although it has a lot of potential in different IoT-based application areas, the practical implementation of federated learning presents several open challenges, as discussed below.

A. IMPORTANCE OF THE STUDY

1) IoT APPLICATIONS
The study shows that the federated deep learning-based security and privacy systems can be applied for several types of IoT applications, including, Industrial Internet of Things, Edge Computing, Internet of Drones, Internet of Healthcare Things, Cloud Computing, 5G-enabled IoT, Internet of Vehicles, Mobile Crowdsensing, etc.

2) INTRUSION AND MALWARE DETECTION
The study presents the importance of using federated deep learning in intrusion detection systems and malware detection systems as a decentralized machine learning approach for detecting cyber security attacks in IoT networks.

3) WHEN FEDERATED LEARNING MEETS BLOCKCHAIN
The study shows that blockchain technology can be integrated with federated deep learning for cyber security in IoT networks. This combination reduces the threat of data leakage and enables data owners to have more control over the access to stored and shared data.

4) VULNERABILITIES OF FEDERATED DEEP LEARNING
The study presents the importance of defending against the vulnerabilities that can be exploited by adversaries in federated deep learning-based systems for IoT networks. These adversaries can use cyber security attacks such as adversarial attacks or poisoning attacks to degrade the accuracy of a machine learning model or to deduce, from local model updates, whether an IoT device has been involved in some mission.

5) FEDERATED DEEP LEARNING VERSUS CLASSICAL MACHINE LEARNING
The primary motivation for conducting this study was to investigate the effectiveness of federated deep learning versus conventional machine learning for cybersecurity in IoT networks. Based on the performance evaluation under three new real IoT traffic datasets, namely, the Bot-IoT dataset, the MQTTset dataset, and the TON_IoT dataset, the study demonstrates that federated deep learning approaches (i.e., CNN, RNN, and DNN) outperform the classic/centralized versions of machine learning (non-federated learning) in assuring the privacy of IoT device data and provide higher accuracy in detecting attacks.

B. OPEN CHALLENGES AND CONSIDERATIONS

1) SECURITY AND PRIVACY CHALLENGES
Federated learning promises to protect the privacy of local user data; however, recent studies have shown that the involvement of specific participants can still be revealed by analyzing the global model [165]. Although some techniques have been used to overcome this problem, including differential privacy [166], these approaches degrade model performance or impose additional requirements, especially high computing power, that are not suitable for IoT networks [167]. Therefore, efficiently implemented federated approaches that provide high performance and preserve privacy without additional computational overhead are strongly required for IoT networks and applications.
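To illustrate why differential privacy trades accuracy for protection, the sketch below shows one common way of privatizing a client update: clipping its L2 norm and adding Gaussian noise. This is a generic illustration and not the mechanism of [166]; the function name `clip_and_noise` and the parameters `clip_norm` and `noise_multiplier` are hypothetical, and no privacy accounting is performed.

```python
import math
import random
from typing import List, Optional, Sequence


def clip_and_noise(update: Sequence[float], clip_norm: float,
                   noise_multiplier: float,
                   rng: Optional[random.Random] = None) -> List[float]:
    """Clip an update to L2 norm `clip_norm`, then add Gaussian noise.

    The noise standard deviation is noise_multiplier * clip_norm (assumption).
    """
    rng = rng or random.Random()
    norm = math.sqrt(sum(u * u for u in update))
    # Scale the update down only if its norm exceeds the clipping bound.
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    clipped = [u * scale for u in update]
    sigma = noise_multiplier * clip_norm
    # Larger sigma hides individual contributions better but distorts the update.
    return [c + rng.gauss(0.0, sigma) for c in clipped]


# With no noise, only the clipping step is visible.
clipped_only = clip_and_noise([3.0, 4.0], clip_norm=1.0, noise_multiplier=0.0,
                              rng=random.Random(0))
noisy = clip_and_noise([3.0, 4.0], clip_norm=1.0, noise_multiplier=1.0,
                       rng=random.Random(0))
```

Both steps distort the update that reaches the aggregation server, which is one source of the performance degradation mentioned above.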

2) IoT NETWORK SETTINGS CHALLENGES
The robustness of the federated deep learning system should be considered since users and aggregators are required to exchange parameters over the IoT network. In addition, communication channels and computational power are constrained in terms of capacity, and the network is subject to various issues such as limited bandwidth, interference, and noise [167]. Hence, client access and limited network reliability are significant research challenges in developing a federated deep learning system for cyber security in IoT applications.

3) DATA-RELATED CHALLENGES
The issue of identifying and eliminating bias of all kinds (cognitive, sampling, reporting, and confirmation) in the data generation process is a serious concern for ML research in general. However, it is more complicated in FL because the data is distributed over multiple parties. For example, if IoT devices have varying data sizes, the FL-based system may give more importance to the contributions of the populations with larger data sizes. In addition, if the global model update depends on the latency of the IoT network, then populations with slower devices or networks may be under-represented [168]. The most important question that may arise is how to develop a new FL-based strategy that can resist the vulnerabilities (poisoning attacks, jamming attacks, adversarial attacks, etc.) while considering the practicability of deploying the solution, particularly in the context of low-resource IoT devices.

4) FL PLATFORMS CHALLENGES
Many IoT-based applications can benefit from FL due to the amazing performance of collaborative learning in the appropriate domains. Although there are various emerging frameworks for FL in general, designing a specific IoT framework based on FL is still an important research topic that needs to take into account the underlying IoT infrastructure.

VIII. CONCLUSION
In this article, we conducted a comparative study with an experimental analysis of federated deep learning approaches for cybersecurity in IoT applications. Specifically, we analyzed the federated learning-based security and privacy systems for several types of IoT applications, including Industrial IoT, Edge Computing, Internet of Drones, Internet of Healthcare Things, Internet of Vehicles, etc. Then, we reviewed the federated learning systems with blockchain and malware/intrusion detection systems for IoT applications. We reviewed the vulnerabilities that can be exploited by adversaries in the federated learning-based security and privacy systems. We provided an experimental analysis of federated deep learning with three deep learning approaches, namely, RNN, CNN, and DNN. For each deep learning model, we studied the performance of centralized and federated learning under three IoT traffic datasets, namely, the Bot-IoT dataset, the MQTTset dataset, and the TON_IoT dataset. The results demonstrate that federated deep learning approaches can outperform the classic/centralized versions of machine learning (non-federated learning) in assuring the privacy of IoT device data and provide higher accuracy in detecting attacks.