A Two-fold Machine Learning Approach to Prevent and Detect IoT Botnet Attacks

The botnet attack is a multi-stage cyber-attack and the most prevalent one in the Internet of Things (IoT) environment; it begins with scanning activity and ends with a distributed denial of service (DDoS) attack. Existing studies mostly focus on detecting botnet attacks after the IoT devices have been compromised and have started performing the DDoS attack. Moreover, the performance of most existing machine learning-based botnet detection models is limited to the specific dataset on which they were trained. As a consequence, these solutions do not perform well on other datasets due to the diversity of attack patterns. Therefore, in this work, we first produce a generic scanning and DDoS attack dataset by generating 33 types of scan and 60 types of DDoS attacks. In addition, we partially integrated the scan and DDoS attack samples from three publicly-available datasets for maximum attack coverage to better train the machine learning algorithms. Afterwards, we propose a two-fold machine learning approach to prevent and detect IoT botnet attacks. In the first fold, we trained a state-of-the-art deep learning model, i.e., ResNet-18, to detect the scanning activity in the premature attack stage and thereby prevent IoT botnet attacks. In the second fold, we trained another ResNet-18 model for DDoS attack identification to detect IoT botnet attacks. Overall, the proposed two-fold approach achieves 98.89% accuracy, 99.01% precision, 98.74% recall, and 98.87% F1-score in preventing and detecting IoT botnet attacks. To demonstrate the effectiveness of the proposed two-fold approach, we trained three other ResNet-18 models over three different datasets for detecting scan and DDoS attacks and compared their performance with the proposed two-fold approach. The experimental results show that the proposed two-fold approach can prevent and detect botnet attacks more efficiently than the other trained models.


I. INTRODUCTION
The Internet of Things (IoT) has revolutionized technology by enabling real-world objects/things to connect and communicate with each other over the internet to make human life more comfortable [1], [2]. Over the past few years, the adoption of smart IoT devices such as smart cameras, smart TVs, smart wearables, smart toys, and smart bulbs has been increasing exponentially in our daily life [3], [4]. This emerging trend in computing has empowered everyday objects to connect and communicate with each other without human intervention. Although IoT devices help us in many areas, they have negligible or very limited security features [3]. Furthermore, many IoT devices come with a fixed key or hard-coded default username and password, which a user cannot change [5]. These security pitfalls make it easy for hackers to exploit insecure IoT devices and gain control over them [4].
Recent trends reveal that cyber-attacks are increasing day by day with the rapid increase of insecure IoT devices [6]. Among the recently reported cyber-attacks, botnet and distributed denial of service (DDoS) attacks are the most prevalent, and they have increased in both frequency and magnitude over the last decade [4], [6]. A botnet attack is a cyber-attack in which an attacker first scans a network to look for weakly secured or vulnerable (IoT) devices. After analysing the scanning information, the attacker targets vulnerable (IoT) devices to install a bot program on them through malware [7].
The installed bot program connects the infected devices to a central server or a peer network, from which further commands are sent to them to perform malicious activities such as sending spam and launching flooding DDoS attacks [6], [8] from a large number of infected IoT devices against a target server, website, etc. Once an IoT device is infected and becomes part of a botnet, the attacker uses it to perform DDoS attacks.
The botnet attack is not only a serious threat to insecure IoT devices but also a crucial threat to the whole internet [6]. Since the advent of the Mirai botnet attack in 2016, IoT botnet attacks have been continuously escalating [9]. After the public disclosure of the Mirai botnet source code, many variants and imitators of the Mirai botnet have evolved [9]. These new variants and imitators have infected millions of IoT devices [3], [9] and have launched ever larger and more catastrophic DDoS attacks, such as those against GitHub [10] and AWS [11], over the past few years.
Nowadays, attackers can easily locate insecure IoT devices via online services such as Shodan [12] and Censys [13]. These online search engine services provide a huge amount of information that can be used to attack insecure IoT devices [9]. By compromising insecure IoT devices, an attacker can perform several cyber-attacks, such as spamming, phishing, and DDoS [6], [8], [9], to wreak havoc on other resources on the Internet. Some recent studies showed that IoT devices are highly prone to botnet and DDoS attacks, as a wide range of DDoS attacks are performed by compromised IoT devices [14], [15]. Likewise, Gartner recently predicted that 25% of cyber-attacks would be due to insecure IoT devices [16].
To keep insecure IoT devices from becoming bots and performing DDoS attacks, there must be an efficient security system to detect IoT bots. The existing botnet and DDoS attack detection techniques are divided into two categories, i.e., host-based techniques and network-based techniques [17]. Due to the resource-constrained nature (i.e., limited memory, battery, and compute power) of IoT devices, host-based solutions are not feasible for them [1], [17]. A network-based solution is therefore a better way to protect the IoT devices and network from these devastating cyber-attacks. The network-based techniques are subdivided into three main types [18]-[22]:
1) Signature-based detection method: relies on matching the network traffic with specific rules defined in the rule database to detect and prevent potential attacks.
2) Anomaly-based detection method: analyses the normal behaviour of network traffic and builds a baseline profile of each device communicating in the network. Any significant deviation from the baseline is considered an anomaly. The anomaly-based detection method is further classified into three subtypes:
• Statistics-based detection method: detects anomalies based on a statistical distribution of intrusions.
• Machine learning-based detection method: detects abnormalities based on packet and payload features. These methods mainly detect and prevent potential attacks using machine learning models.
• Knowledge-based detection method: detects anomalies based on the profile or previous knowledge of a network. The profile or previous knowledge of the network is generated under different test cases to detect abnormalities in the network [22].
3) Specification-based detection method: performs intrusion detection based on the specifications or rules defined by a user [22].
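The baseline-and-deviation idea behind anomaly-based detection can be illustrated with a minimal Python sketch. This is not the detection method used in this work: the traffic metric (packets per second), the sample values, and the z-score threshold of 3 are all hypothetical choices made only for illustration.

```python
from statistics import mean, pstdev
from typing import Sequence, Tuple


def build_baseline(samples: Sequence[float]) -> Tuple[float, float]:
    """Summarize a device's normal traffic metric as (mean, standard deviation)."""
    return mean(samples), pstdev(samples)


def is_anomalous(value: float, baseline: Tuple[float, float],
                 z_threshold: float = 3.0) -> bool:
    """Flag a value that deviates from the baseline by more than z_threshold sigmas."""
    mu, sigma = baseline
    if sigma == 0:
        # Degenerate baseline: any change at all counts as a deviation.
        return value != mu
    return abs(value - mu) / sigma > z_threshold


# Hypothetical packets-per-second readings observed during normal operation.
normal_pps = [10, 12, 11, 9, 10, 13, 11, 10]
baseline = build_baseline(normal_pps)

print(is_anomalous(12, baseline))   # close to the baseline
print(is_anomalous(500, baseline))  # far from the baseline
```

A single-metric threshold like this is far simpler than the learned models discussed in this paper; it only shows what "deviation from a baseline profile" means operationally.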
The major drawback of the signature-based detection method is that it only detects known threats for which rules are available in its rule database [20], [21]. The stateful protocol-based detection methods, on the other hand, have limited ability to inspect encrypted traffic. In contrast, traffic behaviour analysis, i.e., anomaly detection, is very effective both in analysing encrypted traffic and in detecting unknown attacks [19]. Among the anomaly detection methods, the machine learning approach has shown strong performance in recent years. The machine learning-based detection methods are trained on datasets to learn and distinguish the behaviour and patterns of normal and attack traffic [20], [21]. By learning the normal and attack traffic patterns, machine learning models can therefore detect new botnet and DDoS attacks that are variants or imitators of existing ones. However, the existing botnet attack detection methods detect the botnet only after the IoT devices have been compromised by malware and have started performing malicious activities as directed by the bot-master. Moreover, the performance of most existing machine learning-based botnet detection models is limited to the specific dataset on which they were trained [6]. This is because different datasets contain different types of botnet attacks, and the features used for detecting botnet attacks in one dataset are not adequate to efficiently detect the botnet attacks in other datasets [6]. As a consequence, these solutions do not perform well on other datasets [6]. Hence, to protect the IoT devices from being compromised, there is a crucial need for a protection mechanism that safeguards the IoT devices from botnet and DDoS attacks during the premature stage (i.e., scanning) of the botnet attack.
Therefore, in this work, we propose a novel two-fold approach to prevent a botnet attack during the premature stage (i.e., scanning attack) and to detect the DDoS attack in the IoT network in case an attacker compromises an IoT device and starts performing a DDoS attack. As discussed earlier, an attacker can use bot-infected IoT devices to perform different malicious activities such as sending spam emails and launching flooding DDoS attacks [6], [8]; in this work, however, we focus on detecting DDoS attacks performed by bot-infected IoT devices. The proposed two-fold approach uses a state-of-the-art deep learning model, i.e., ResNet, which is first trained for detecting the scanning activity and then trained for detecting the DDoS attack performed by the attacker or compromised IoT devices towards or outside the network.
To protect the IoT devices and network from IoT botnet attacks, in the first fold, we trained the ResNet-18 [23] model for scanning attack detection so that it can detect the premature attack stage and notify about the malicious attempt before an attacker takes further steps to compromise the IoT devices. In the second fold, we trained the ResNet-18 [23] model for DDoS attack detection to detect and mitigate the botnet attack in case an attacker evades the scanning attack detection model, installs malware on IoT devices, and starts performing DDoS attacks. The key contributions of this work are as follows:
• We analysed the frequently used scanning and DDoS attack techniques and produced a generic dataset by generating 33 types of scan and 60 types of DDoS attacks. In addition, we partially integrated the scan and DDoS attack samples from three publicly-available datasets for maximum attack coverage for better training of machine learning algorithms.
• We proposed a two-fold machine learning approach to prevent and detect both inbound and outbound botnet attacks in the IoT network environment. The proposed two-fold approach prevents IoT botnet attacks by detecting the scanning activity, while it detects the IoT botnet attack by identifying the DDoS attack.
• To demonstrate the effectiveness of the proposed two-fold approach, we trained three ResNet-18 [23] models over three different datasets and compared their performance with the proposed two-fold approach for detecting and preventing IoT botnet attacks.
The rest of the paper is organized as follows: Section II presents a review of some existing work for botnet attack detection. Section III describes some background knowledge needed to understand the botnet attacks basics. Section IV explains the proposed methodology to prevent and detect IoT botnet and DDoS attacks. Section V discusses the experimental setup and results of the proposed two-fold approach  [4]. In the graph-based botnet detection techniques, all the communication nodes of a network are analysed to detect the anomalies which communicate differently as compared to the neighbour nodes [24]. On the other hand, in the flow-based botnet detection approach, both the inbound and outbound traffic statistics, i.e., features are monitored by the machine learning algorithms which detects the botnet attacks based on the traffic pattern resemblance.

VOLUME 4, 2021
Nguyen et al. [16] proposed a graph-based approach to detect IoT botnets via printable string information (PSI) graphs. The authors used PSI graphs to get high-level features from the function call graph and then trained a convolutional neural network (CNN), a deep learning model, over the generated graphs for IoT botnet detection. Likewise, Wang et al. [24] proposed an automated model named BotMark. Their proposed model detects botnet attacks based on a hybrid analysis of flow-based and graph-based network traffic behaviours. The flow-based detection is performed by k-means, which calculates the similarity and stability scores between flows, while the graph-based detection uses the least-square technique and local outlier factor (LOF), which measure anomaly scores. Similarly, Yassin et al. [25] proposed a novel method that comprises a series of approaches such as the utilization of the frequency process against registry information, graph visualization, and rules generation. The authors investigated the Mirai attacks using a graph-theoretical approach and used directed graphs to identify similar and dissimilar Mirai patterns. The proposed approach only focuses on the Mirai attack.
Almutairi et al. [27] proposed a hybrid botnet detection technique that detects new botnets and is implemented at three levels, i.e., host level, network level, and a combination of both. The authors focused on HTTP, P2P, IRC, and DNS botnet traffic. The proposed technique consists of three components: host analyser, network analyser, and detection report. The authors used two machine learning algorithms, i.e., Naïve Bayes and a decision tree, for traffic classification. Similarly, Blaise et al. [28] proposed a bot detection technique named BotFP, for bot fingerprinting. The proposed BotFP framework has two variants: BotFP-Clus, which groups similar traffic instances using clustering algorithms, and BotFP-ML, which is designed to learn from the signatures and identify new bots using two supervised ML algorithms, i.e., SVM and MLP. Likewise, Soe et al. [30] developed a machine learning-based IoT botnet attack detection model. The proposed model consists of two stages: a model builder and an attack detector. In the model builder stage, data collection, data categorization, model training, and feature selection are performed step by step. In the attack detector stage, the packets are first decoded and then the features are extracted in the same way as in the model builder stage. Finally, the features are passed to the attack detector engine, where artificial neural network (ANN), J48 decision tree, and Naïve Bayes machine learning models are used for botnet attack detection.
Sriram et al. [31] proposed a deep learning-based IoT botnet attack detection framework. The proposed solution specifically considers network traffic flows, which are converted into feature records and then passed to a deep neural network (DNN) model for IoT botnet attack detection. Nugraha et al. [32] evaluated the performance of four deep learning models for botnet attack detection by performing a couple of experiments. The experimental results revealed that CNN-LSTM outperformed all the other deep learning models for botnet attack detection.
Parra et al. [33] proposed a distributed deep learning framework based on cloud computing. Their framework is designed to detect phishing and IoT botnet attacks. It consists of two machine learning models: (1) a distributed CNN (DCNN) for detecting URL-based attacks directed at a client's IoT devices, and (2) a recurrent neural network (RNN) and an LSTM network model for detecting botnet attacks at the backend. Pektaş et al. [34] performed botnet detection using deep learning on network flow traffic. The proposed deep neural network was deployed to classify the traffic as benign or malicious. To improve the performance of the model, the authors investigated the number of hidden layers and neurons. The proposed model achieved 99% accuracy. Likewise, Ahmed et al. [35] proposed a deep learning model for botnet attack detection.
Maeda et al. [36] proposed a botnet attack detection method based on deep learning in a software-defined network (SDN). For botnet detection, the authors trained the deep learning model on flow-based traffic data collected from the botnets and then evaluated the detection accuracy. The authors used a multi-layer perceptron (MLP), a deep learning model, to detect infected IoT devices. Similarly, Meidan et al. [18] developed a novel IoT botnet attack detection technique via deep auto-encoders. To obtain the malicious network traffic, nine IoT devices were infected with the well-known IoT botnets Mirai and BASHLITE. The authors trained deep auto-encoders separately for each IoT device on both benign and attack traffic.
Bovenzi et al. [37] proposed a hybrid two-stage intrusion detection system (IDS) for the IoT environment. In the first stage, their approach detects anomalies in the network traffic, while in the second stage it classifies the anomalies into attack classes. The authors used a multi-modal deep auto-encoder for anomaly detection and three machine learning classifiers to classify the anomalies detected in the first stage. Likewise, Mirsky et al. [39] also used autoencoders and proposed a plug-and-play network IDS, i.e., Kitsune, to detect anomalies in local network traffic using an unsupervised learning approach. The authors used a self-generated botnet attack dataset and evaluated the performance in both online and offline modes. Their proposed solution achieved good performance, comparable to offline anomaly detectors. Table 2 summarizes the distinctive characteristics of the works discussed above. It can be observed that most of the existing botnet detection approaches used traffic flow-based machine learning approaches for botnet attack detection. Moreover, most of these solutions do not perform well on other datasets due to the diversity of attack patterns [6]. In addition, these existing techniques only detect the botnet attack after the IoT devices get compromised and start performing malicious activities like DDoS attacks. In order to prevent the IoT devices from being compromised, in this study, we propose a two-fold approach that detects the attacker's malicious activities at the premature stage (i.e., scanning) of

A. COMPONENTS OF AN IOT BOTNET
In general, a botnet comprises four components. These components include the bot program, zombie device, bot-master and command and control (C&C) server. The botnet attack starts with the bot-master, which can be an attacker itself or an automated program written by the attacker. The bot-master scans the target IoT devices connected over the internet. Based on the scanning results, the bot-master exploits the vulnerable IoT devices and installs a bot program in vulnerable devices. The bot program establishes a connection with a bot-master or C&C server to receive the instructions for performing malicious activities. A brief description of each component is given in the following subsections:

1) Bot Program
A bot program is a piece of malware installed on an infected device by an attacker. The bot program resides in the victim IoT device and establishes a connection with the C&C server or bot-master to receive the instructions for performing malicious activities such as sending spam and performing flooding attacks [6], [8].

2) Zombie Device
A physical victim device, on which a bot program is installed by the attacker, is called a zombie device. In the IoT scenario, these devices include smart cameras, smart TVs, smart wearables, etc.

3) Bot-master
A bot-master or a bot herder is the main controller of a botnet. It can be a hacker or an automated program controlled by the hacker to organize the botnet attacks. The bot-master works as the main operator of a botnet and issues commands to the C&C server (in a client-server architecture) or to specific bots (in a peer-to-peer architecture).

4) C&C Server
The C&C server is the central computer that controls the zombie devices based on the control signals received from the bot-master. The C&C server is not a compulsory component of every botnet. In the case of a peer-to-peer network, the bot-master directly sends control signals to zombies.

B. LIFE-CYCLE OF AN IOT BOTNET
The botnet attack is a multi-stage attack [9]. A vulnerable IoT device passes through five stages to become a bot that performs malicious activities such as sending spam and performing DDoS attacks [9]. These stages are also referred to as the botnet life cycle. They include scanning, malware injection, botnet connection, command execution, and maintenance & up-gradation, as shown in Fig. 1. Scanning is the initial stage of a botnet life cycle, in which the attacker collects information to proceed with further steps. The information collected by scanning helps an attacker to exploit a vulnerability that allows him/her to inject malware into the target device. Afterwards, the injected malware establishes the connection with the bot-master and executes the instructions received from the bot-master. Finally, the bot-master maintains and upgrades the infected devices in order to perpetuate the infection for future use. All these stages are defined in the following subsections:

1) Scanning
Scanning is the preliminary step of a botnet life cycle. In this stage, the attacker/bot-master scans the target network or target device to collect the initial information about the services, protocols, OS, etc. of the target device. The attackers use different techniques for scanning an IoT network or target device. Some popular tools used for scanning include Nmap [40], ZMap [41], Masscan [42], and OpenVAS [43].

2) Malware Injection
Once the attacker has collected information about the target network/device, he/she applies different exploitation methods to find the vulnerabilities of the target device/network. A successful exploitation allows an attacker to inject the malware into a target device. Besides exploiting vulnerabilities, the attacker can trap the victim by sending malware through phishing, email attachments, etc. The victim unknowingly downloads the malicious software from the phishing website or email attachment, which helps the attacker to proceed with further steps. At first, the attacker installs a shellcode on the victim device. This step is also called the initial infection. The running shellcode fetches some more details about the victim device and sends them to a central server, from which a bot program binary is downloaded along with some additional configurations. Afterwards, the bot program is installed according to the target device properties [44], [45]. This step is also called the secondary injection. Once the bot program is installed on the victim device, the device becomes a 'zombie' [45].

3) Botnet Connection
When the bot program starts running, it establishes a communication channel with the bot-master or C&C server in order to be deemed a valid bot. The initial connection attempts made by the zombie to the C&C server to receive further commands from the bot-master are referred to as the rally [45]. Once the malware is injected, it starts executing as per the attacker's strategy. The attacker can code the malware to run as a Trojan and trigger it based on some event. The running malware connects the target machine with the bot-master, from which it receives the commands for further steps.

4) Command Execution
Once an infected device gets connected with a bot-master or C&C server, it becomes part of a botnet. Afterwards, it waits for the C&C server's commands to perform malicious activities as instructed by the bot-master. This phase is also called the waiting phase. The malicious activities include scanning for new bots, sending spam, performing DoS attacks [6], [8], [46], etc. When many infected devices are connected with the bot-master, the bot-master sends commands to these infected devices to launch flooding/DDoS attacks on a target server/network.

5) Upgradation & Maintenance
A bot program installed in a victim machine needs to be updated and maintained with time in order to remain undetected in the victim machine. This step has great importance for an attacker to perpetuate the malware infection so that the infected machine can be used in future attacks/malicious activities as well.
The life cycles of traditional and IoT botnets are similar [5]. The only difference is in the target devices. In traditional botnet attacks, the attacker aims to victimize computers, servers, etc., while in IoT botnet attacks, the attacker aims to victimize IoT devices such as smart cameras and smart TVs [5].

IV. PROPOSED METHODOLOGY
In this work, we propose a novel two-fold machine learning approach to prevent and detect botnet attacks in IoT networks. In the first fold, we trained a state-of-the-art deep learning model, i.e., ResNet-18 [23], for detecting the (pre-attack stage) scanning activity to protect the IoT network from botnet attacks. In the second fold, we trained another ResNet-18 [23] model for detecting the DDoS attack that attackers perform after compromising the weakly-secured IoT devices.
As discussed earlier, the life cycle of a botnet attack consists of five stages, i.e., scanning, malware injection, botnet connection, command execution, and maintenance & up-gradation. Scanning is the initial, premature stage of the botnet attack. The proposed methodology stops an attacker during the scanning activity so that the attacker cannot proceed to further attack stages. Thus, the proposed methodology prevents botnet attacks by detecting the scanning attack.
Each fold of the proposed approach passes through five major stages for scanning and DDoS attack detection, as illustrated in Fig. 2. In the first stage, we generated and captured the scanning and DDoS attack traffic (in .pcap format) in order to use it for training the machine learning models. In the second stage, we converted these network packet traces (.pcap files) into flows, then extracted the features and stored them in .csv files. Further, we labelled the dataset with labels such as 'normal' for benign traffic, 'scan' for scanning traffic, and 'DDoS' for the DDoS attack traffic. In the third stage, we applied the Logistic Regression (LR) feature selection technique to optimize the performance of the machine learning model using a minimum number of unique features. We used the LR feature selection technique due to its efficient performance in the existing literature [6], [8], [17]. Moreover, it is fast, simple, and has low complexity as compared to other feature selection techniques [6], [8], [17]. In the fourth stage, we trained two ResNet-18 [23] models over the resultant feature vector, one for scanning and one for DDoS attack detection. Finally, in the fifth stage, we tested the trained machine learning models in order to validate their performance for real-time attack scenarios.
As discussed earlier, an IoT botnet attack begins with the scanning activity and ends with the DDoS attack. Therefore, the proposed framework consists of two machine learning models, i.e., one for preventing the IoT botnet attacks and the other for detecting the DDoS attacks. For preventing the IoT devices and network from IoT botnet attacks, we trained the ResNet-18 [ Finally, we integrated both the ResNetScan-1 and ResNetDDoS-1 models to classify the incoming network data as scan, DDoS, or normal as shown in Fig. 2. The following subsections describe all the steps followed to train and test the proposed machine learning models for preventing and detecting IoT botnet attacks.
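The integration of the two models into a single three-way decision (scan, DDoS, or normal) can be sketched in Python as follows. This is only an illustrative sketch: the order in which the two detectors are consulted is our assumption, and the toy threshold-based detectors are hypothetical stand-ins for the trained ResNetScan-1 and ResNetDDoS-1 models.

```python
from typing import Callable, Sequence

Features = Sequence[float]
Detector = Callable[[Features], bool]


def classify_flow(features: Features, is_scan: Detector, is_ddos: Detector) -> str:
    """Combine two binary detectors into one of the labels 'scan', 'DDoS', 'normal'."""
    # Assumption: the scan detector is consulted first. The precedence
    # between the two models when both fire is not taken from the source.
    if is_scan(features):
        return "scan"
    if is_ddos(features):
        return "DDoS"
    return "normal"


# Hypothetical stand-ins for the two trained models.
def toy_scan_detector(features: Features) -> bool:
    return features[0] > 0.5


def toy_ddos_detector(features: Features) -> bool:
    return features[1] > 0.5


print(classify_flow([0.9, 0.1], toy_scan_detector, toy_ddos_detector))
print(classify_flow([0.1, 0.9], toy_scan_detector, toy_ddos_detector))
print(classify_flow([0.1, 0.1], toy_scan_detector, toy_ddos_detector))
```

In a real deployment, each detector would wrap a model's prediction on the image built from the selected flow features.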

A. SCANNING ATTACK DETECTION
In the first fold, we trained a ResNet-18 [23] model for scanning attack detection by following the five steps as mentioned previously. These steps are described in the following sections.

1) Data Collection
The data collection is the preliminary step of the proposed methodology for scanning attack detection. In this step, we first analysed some existing techniques and approaches [47]- [51] that the attackers commonly use for scanning the IoT network and devices to collect the information.
Based on the literature review [47]-[51], we selected eleven different scanning methods that attackers widely use for gathering information about vulnerable IoT devices during the premature attack stage. These scanning techniques include SYN scan, FIN scan, ACK scan, NULL scan, SYN-ACK scan, FIN-ACK scan, XMAS scan, UDP scan, TCP Window scan, TCP connect scan, and Banner grabbing, and are performed using three widely used scanning tools, i.e., Nmap [40], Hping3 [52], and Dmitry [53]. In order to generate the scan traffic for this work, we performed these scanning attacks on two of the lab servers using three different scanning approaches: horizontal scan, vertical scan, and box scan. So, a total of (11 x 3) = 33 types of scanning attacks were performed to generate and collect scanning traffic. We installed the three scanning tools on an Ubuntu machine with a Core i7 processor and 8 GB RAM and wrote Python scripts to execute the different scanning commands. While performing the scanning attacks, we captured the network packets into .pcap files using the Wireshark tool [54]. We call this self-generated dataset the ScanLab dataset.
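The count of 33 scanning attack types follows from pairing each of the eleven scanning methods with each of the three scanning approaches, which the following Python sketch enumerates. The `targets` helper reflects the common meaning of the three approaches (one port across many hosts, many ports on one host, and many ports across many hosts); this interpretation, the helper itself, and the example addresses and ports are our assumptions and are not taken from the scripts used in this work.

```python
from itertools import product
from typing import List, Sequence, Tuple

SCAN_METHODS = [
    "SYN", "FIN", "ACK", "NULL", "SYN-ACK", "FIN-ACK", "XMAS",
    "UDP", "TCP Window", "TCP connect", "Banner grabbing",
]
SCAN_APPROACHES = ["horizontal", "vertical", "box"]

# Every (method, approach) pair is one type of scanning attack.
scan_types = list(product(SCAN_METHODS, SCAN_APPROACHES))
print(len(scan_types))


def targets(approach: str, hosts: Sequence[str],
            ports: Sequence[int]) -> List[Tuple[str, int]]:
    """Return the (host, port) pairs probed under a given scanning approach."""
    if approach == "horizontal":   # assumed: one port across many hosts
        return [(host, ports[0]) for host in hosts]
    if approach == "vertical":     # assumed: many ports on one host
        return [(hosts[0], port) for port in ports]
    if approach == "box":          # assumed: many ports across many hosts
        return list(product(hosts, ports))
    raise ValueError(f"unknown approach: {approach}")


# Hypothetical lab targets (documentation address range) and ports.
hosts = ["192.0.2.10", "192.0.2.11"]
ports = [22, 80, 443]
print(targets("horizontal", hosts, ports))
print(targets("vertical", hosts, ports))
print(len(targets("box", hosts, ports)))
```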
Besides generating and collecting the network packets from our experimental setup, we also obtained scan and normal traffic samples from three publicly-available datasets which include CICIDS-19 [55], CICIDS-17 [56] and Bot-IoT [38] dataset. The scan and normal samples of these three publicly-available datasets are acquired to compare the performance of the ResNet-18 [23] model trained over the ScanLab dataset, with ResNet-18 [23] models trained over other datasets for scanning attack detection.

2) Data Pre-processing
After capturing the scanning traffic, we need to pre-process the data. In the pre-processing step, we first extracted the features from the captured .pcap files of the ScanLab dataset using the CICFlowmeter [57] tool. The CICFlowmeter [57] tool reads a given .pcap file and extracts more than 60 flow features for each flow, which is identified by a five-tuple consisting of source IP, destination IP, source port, destination port, and protocol. The details of the features extracted by CICFlowmeter [57] are given in [58]. The CICFlowmeter [57] tool produces a .csv file which consists of the flow features of a given .pcap file. The resultant data is unlabelled. So, based on the IP addresses used for scanning, we labelled the corresponding flows as scan, while the rest of the network traffic is labelled as normal traffic.
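The five-tuple grouping and IP-based labelling described above can be sketched in Python as follows. This is a simplified illustration and not the CICFlowmeter implementation: it treats each direction as a separate flow, computes only two illustrative features, and labels a flow as scan when either endpoint is a known scanner address. The `Packet` type, the helper names, and the example addresses are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Set, Tuple


class Packet(NamedTuple):
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: str
    length: int


FlowKey = Tuple[str, str, int, int, str]


def group_into_flows(packets: Iterable[Packet]) -> Dict[FlowKey, List[Packet]]:
    """Group packets that share the same five-tuple into one flow."""
    flows: Dict[FlowKey, List[Packet]] = defaultdict(list)
    for pkt in packets:
        key = (pkt.src_ip, pkt.dst_ip, pkt.src_port, pkt.dst_port, pkt.protocol)
        flows[key].append(pkt)
    return dict(flows)


def flow_record(key: FlowKey, pkts: List[Packet], scanner_ips: Set[str]) -> dict:
    """Build one labelled row with two illustrative flow features."""
    # Assumption: a flow is 'scan' if either endpoint is a scanner address.
    label = "scan" if key[0] in scanner_ips or key[1] in scanner_ips else "normal"
    return {
        "packet_count": len(pkts),
        "total_bytes": sum(p.length for p in pkts),
        "label": label,
    }


# Hypothetical packets; 192.0.2.10 plays the role of the scanning machine.
packets = [
    Packet("192.0.2.10", "192.0.2.20", 40000, 22, "TCP", 60),
    Packet("192.0.2.10", "192.0.2.20", 40000, 22, "TCP", 60),
    Packet("192.0.2.30", "192.0.2.20", 51000, 80, "TCP", 500),
]
records = [flow_record(key, pkts, {"192.0.2.10"})
           for key, pkts in group_into_flows(packets).items()]
for record in records:
    print(record)
```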
Similarly, we pre-processed the CICIDS-19 [55], CICIDS-17 [56], and Bot-IoT [38] datasets and labelled the resultant .csv files according to the description of these datasets. Eventually, we partially integrated the scan attack samples (by randomly selecting 50K samples from these three datasets) with the ScanLab dataset for maximum attack coverage and better training of machine learning algorithms.

3) Features Selection
Once we extracted the features from all .pcap files, the next step is to select the useful features that can better help a Based on the frequency analysis of the features selected by the LR algorithm, we found the 15 most frequently selected features and named them features set 1 (FS-1), as displayed in Fig. 2 and listed in Table 3. The 15 features listed in Table 3 are selected from each dataset in order to use them in the subsequent stages of scanning attack detection.
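The frequency analysis mentioned above can be sketched in Python as a simple count of how often each feature is selected, keeping the most frequent ones. This is a hedged illustration only: we assume that the LR-based selection produces one list of selected features per dataset, and the dataset keys and feature names (`f1` to `f4`) are hypothetical placeholders rather than the features listed in Table 3.

```python
from collections import Counter
from typing import Dict, List


def most_frequent_features(selections: Dict[str, List[str]], k: int = 15) -> List[str]:
    """Return the k features selected most often across all selection lists."""
    counts = Counter(
        feature
        for selected in selections.values()
        for feature in set(selected)  # count each feature once per list
    )
    # Sort by descending count, then by name, so that ties are deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ranked[:k]]


# Hypothetical selections; a small k is used so the example stays readable.
selections = {
    "dataset_A": ["f1", "f2", "f3"],
    "dataset_B": ["f1", "f3", "f4"],
    "dataset_C": ["f1", "f2", "f3"],
}
print(most_frequent_features(selections, k=2))
```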

4) Training ML Model for Scan Detection
After selecting the useful features, we split each dataset into train, validation, and test sets. For this purpose, we randomly selected 60% of the data for training, 20% for validation, and 20% for testing, in order to avoid overfitting and to train the ML model efficiently. Both the training set and the validation set are used during the training phase. The training set is used to fit the weights of the machine learning model, while the validation set is used to evaluate the model after each epoch so that its performance on held-out data can be monitored during training. Finally, when the ML model completes its training, we test its performance over unseen data, i.e., the test set.
As mentioned earlier, we used the ResNet-18 [23] model and first trained it over the train set of the ScanLab dataset. The ResNet-18 [23] model was originally designed for image processing and computer vision problems [4], whose inputs are images, i.e., high dimensional arrays. Since the ResNet-18 [23] model is prone to overfitting on low dimensional data [4], [59], we first converted the whole ScanLab dataset into high dimensional arrays of size 15x15x1 and saved them as images by following the method described in [4]. Similarly, we converted the other three datasets into greyscale images of size 15x15x1.
Furthermore, we need to specify the hyperparameters, i.e., learning rate, batch size, number of epochs, and optimizer. The learning rate controls the size of the updates applied to the model weights in response to the estimated error. The number of epochs is the number of complete passes over the training set. The batch size is the number of samples processed together before the weights are updated, so it divides the dataset into small chunks for faster and better training. The optimizer is the algorithm that adjusts the weights of the neural network to improve the speed and performance of training. We set the learning rate to 0.01, the batch size to 100, and the number of epochs to 12, and selected the stochastic gradient descent (SGD) optimizer.
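A minimal sketch of the 60/20/20 split and of how batch size and epochs translate into weight updates is given below. It is written with the standard library only; the sample count of 1000, the fixed seed, and the helper names are assumptions for illustration, while the hyperparameter values are those stated above.

```python
import math
import random

# Hyperparameter values as stated in the text for the scan detection model.
HYPERPARAMETERS = {
    "learning_rate": 0.01,
    "batch_size": 100,
    "epochs": 12,
    "optimizer": "SGD",
}


def split_dataset(samples, train_pct=60, val_pct=20, seed=0):
    """Randomly split samples into train, validation, and test lists."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * train_pct // 100
    n_val = len(shuffled) * val_pct // 100
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # the remaining 20%
    return train, val, test


def total_weight_updates(n_train, batch_size, epochs):
    """Number of mini-batch updates: batches per epoch times epochs."""
    return math.ceil(n_train / batch_size) * epochs


# 1000 dummy sample indices stand in for the converted images.
train_set, val_set, test_set = split_dataset(range(1000))
updates = total_weight_updates(
    len(train_set), HYPERPARAMETERS["batch_size"], HYPERPARAMETERS["epochs"]
)
print(len(train_set), len(val_set), len(test_set), updates)
```

With 600 training samples and a batch size of 100, each epoch performs 6 updates, so 12 epochs give 72 updates in this toy setting.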

5) Testing and Affirmation
Once the trained models are saved, we test the performance of the trained ResNet model over the test set. The test set consists of samples that are separated before training the ML model. As the test set is unseen by the trained model, we evaluated the performance of the trained model over the test set using four commonly used performance metrics. These performance metrics are described in Section V. The results of the trained model over the test set for scanning attack detection are also given in Section V.
In order to affirm the effectiveness of the proposed methodology, we test the performance of all scan detection models in four phases. In the first phase, we test the proposed ResNetScan-1 model over the three datasets that were not used in its training, i.e., the CICIDS-19 [55], CICIDS-17 [56], and Bot-IoT [38] datasets. Similarly, in the second phase, we test the ResNetScan-2 model over the three datasets that were not used in its training, i.e., the ScanLab, CICIDS-17 [56], and Bot-IoT [38] datasets. Likewise, in the remaining phases, the other two ResNetScan models, i.e., ResNetScan-3 and ResNetScan-4, are tested over the datasets that were not used in their training. Finally, we compared the performance of all ResNetScan models on each dataset.
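The four-phase evaluation amounts to testing every model on each dataset except the one it was trained on. The sketch below enumerates those pairs. The mapping of ResNetScan-1 and ResNetScan-2 to their training datasets is stated in the results section; the mapping for ResNetScan-3 and ResNetScan-4 is inferred here from the datasets they are tested on, and should be read as an assumption.

```python
# Training dataset of each scan detection model (the last two are inferred).
TRAINED_ON = {
    "ResNetScan-1": "ScanLab",
    "ResNetScan-2": "CICIDS-19",
    "ResNetScan-3": "CICIDS-17",
    "ResNetScan-4": "Bot-IoT",
}


def cross_dataset_plan(trained_on):
    """Map each model to the datasets that were not used in its training."""
    datasets = list(trained_on.values())
    return {
        model: [name for name in datasets if name != home]
        for model, home in trained_on.items()
    }


plan = cross_dataset_plan(TRAINED_ON)
for model, targets in plan.items():
    print(model, "->", ", ".join(targets))
```

The same enumeration applies to the ResNetDDoS models once ScanLab is replaced by DDoSLab.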

B. DDOS ATTACK DETECTION
In the second fold of the proposed approach, we trained a ResNet-18 [23] model for DDoS attack detection. As mentioned earlier, the DDoS attack detection model is proposed to detect DDoS attacks in case an attacker evades the scanning attack detection model, installs malware on IoT devices, and starts performing DDoS attacks.
In order to develop the DDoS detection model, we followed the five steps as mentioned earlier. These steps are described in the following sections.

1) Data Collection
Like the scanning attacks, in this step, we first analyzed the existing DDoS attack techniques that are commonly used by attackers. The literature on DDoS attacks is vast compared to that on scan attacks; therefore, we reviewed some recent studies [60]-[62] on DDoS attacks to analyze the DDoS attack types.
Based on the analysis of different DDoS attack techniques, we performed 60 different types of DDoS attacks: all 57 TCP flag based attacks given in Table 4, plus UDP, ICMP, and HTTP flooding attacks. All these DDoS attacks were performed using the Hping3 tool [52]. We wrote Python scripts that execute different Hping3 [52] commands to perform DDoS attacks on two of our lab servers. While performing the DDoS attacks, we captured the network packets into .pcap files using the Wireshark tool [54]. We named this self-generated dataset the DDoSLab dataset.
Besides generating and collecting network packets from our experimental setup, we also obtained DDoS and normal traffic samples from three publicly-available datasets: CICIDS-19 [55], CICIDS-17 [56], and Bot-IoT [38]. These datasets are used to compare the performance of the ResNet-18 [23] model trained over the self-generated DDoSLab dataset with that of ResNet-18 [23] models trained over the other datasets for DDoS attack detection.

2) Data Pre-processing
After capturing the DDoS traffic into .pcap files, we extracted the features from the captured .pcap files of the DDoSLab dataset using the CICFlowmeter [57] tool. CICFlowmeter [57] produces an unlabelled .csv file that contains the flow features of the given .pcap file. Therefore, we labelled the flows associated with the IP addresses used for the DDoS attack as DDoS and the rest of the network traffic as normal.
Likewise, we pre-processed the DDoS and normal traffic of the CICIDS-19 [55], CICIDS-17 [56], and Bot-IoT [38] datasets and labelled the resultant .csv files according to the descriptions of these datasets. Further, we partially integrated the DDoS attack samples (by randomly selecting 50K samples from these three datasets) with the DDoSLab dataset to maximize attack coverage and better train the machine learning algorithms.

3) Features Selection
Once the features are extracted and the dataset is labelled, we applied the LR algorithm to the labelled dataset in order to select the useful features that can better help a machine learning model distinguish normal and DDoS traffic. Using the LR algorithm, we first selected the top 20 features from each dataset, i.e., the DDoSLab dataset and the three other selected datasets. Afterwards, we performed a frequency analysis (as done in [6]) of the features selected by the LR algorithm from DDoSLab and all three selected datasets. Finally, based on this frequency analysis, we found the 15 most frequently selected features across DDoSLab and the other three datasets and named them feature set 2 (FS-2), as displayed in Figure 2 and listed in Table 3.

4) Training ML Model for DDoS Detection
After selecting the useful features, we divided the dataset into train, validation, and test sets. We randomly selected 60% of the data for training, 20% for validation, and 20% for testing, in order to avoid overfitting and to train the ML model efficiently. To train the ML model over the DDoS dataset, we selected the ResNet-18 [23] model due to its better performance in existing research works.
Before starting the training, we converted the whole DDoSLab dataset into high dimensional arrays of size 15x15x1 and saved them as images by following the method described in [4]. Similarly, we converted the other three datasets into greyscale images of size 15x15x1. Furthermore, we set the hyperparameters as follows: a learning rate of 0.01, a batch size of 64, 12 epochs, and the SGD optimizer.

5) Testing and Affirmation
Once the training of the ResNet-18 [23] model is complete, we test the performance of the trained model over the test set. Since the test set is unseen by the trained model, we evaluated the performance of the trained model over the test set using four commonly used performance metrics. These performance metrics are described in Section V. The results of the trained model over the test set for DDoS attack detection are also given in Section V.
In order to validate the effectiveness of the proposed methodology, we evaluated the predictions of all the trained models in four phases. In the first phase, we test the proposed ResNetDDoS-1 model over the three datasets that were not used in its training, i.e., the CICIDS-19 [55], CICIDS-17 [56], and Bot-IoT [38] datasets. Likewise, we cross-validated the performance of the other saved models over the datasets that were not used during their training. Finally, we compared the performance of all ResNetDDoS models on each dataset.

A. EXPERIMENTAL SETUP
As mentioned earlier, in this study we first analysed the existing scanning and DDoS attack techniques. Based on the analysis, we generated traffic for 33 types of scanning attacks and 60 types of DDoS attacks using three different network traffic generator tools, i.e., Nmap [40], Hping3 [52], and Dmitry [53]. All these tools were installed on a Core i7 machine with 8 GB RAM running the Ubuntu-18 operating system.
The generated network traffic was captured using the Wireshark tool [54] in .pcap format. After that, we extracted features from these .pcap files and performed labelling according to the IP addresses of the machines used in the experiment. Afterwards, we applied the feature selection techniques and split the dataset into train, validation, and test sets to proceed with the further steps of the proposed methodology as described in Section IV. The proposed two-fold approach detects two cyber-attacks, i.e., the scan attack and the DDoS attack. These two attacks differ in nature. Similarly, the best features resulting from the feature selection are different for the two attacks, as listed in Table 3. Therefore, we trained two separate ResNet-18 [23] models, one for each attack.
The ResNet-18 [23] is basically a convolutional network with eighteen layers which consists of 10 convolution layers and 8 pooling layers. Fig. 3 shows the architecture of the ResNet-18 [23] model used in this work. In total, our ResNet-18 [23] model had 11,185,666 parameters, out of which 11,176,066 were trainable while 9,600 were non-trainable. In order to train and test the ResNet-18 [23] model for this work, we used Python3 and the TensorFlow v2.2 library in the Google Colab environment.
Table 5 lists the execution time spent while training and testing the two separate ResNet-18 [23] models in the proposed two-fold approach. Based on the testing statistics shown in Table 5, the ResNetDDoS-1 model takes 49.45 s / 63668 = 776.685 µs to test an image, while the ResNetScan-1 model takes 5.75 s / 12546 = 458.313 µs to test an image. So, taking ResNetDDoS-1 as the single primary model, adding the second model results in 1.59 times the delay of a single detection model.
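The per-image testing times and the 1.59 factor can be reproduced from the totals quoted above. In this sketch the factor is interpreted as the combined per-image time of both models divided by the per-image time of ResNetDDoS-1 alone; that reading is an assumption, although it matches the reported value.

```python
def per_image_microseconds(total_seconds, num_images):
    """Average testing time per image, in microseconds."""
    return total_seconds / num_images * 1_000_000


# Testing totals quoted in the text (seconds, number of test images).
ddos_us = per_image_microseconds(49.45, 63668)
scan_us = per_image_microseconds(5.75, 12546)

# Delay of running both models relative to running ResNetDDoS-1 alone.
relative_delay = (ddos_us + scan_us) / ddos_us

print(f"ResNetDDoS-1: {ddos_us:.3f} us per image")
print(f"ResNetScan-1: {scan_us:.3f} us per image")
print(f"Two-fold delay factor: {relative_delay:.2f}")
```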

B. PERFORMANCE EVALUATION
The proposed two-fold approach for preventing and detecting IoT botnet attacks is evaluated using four commonly used performance metrics: accuracy, precision, recall, and F1-score. In order to calculate these metrics, we first define the following terms. A true positive (TP) is an attack flow correctly classified as an 'attack flow'. A true negative (TN) is a normal traffic flow correctly classified as a 'normal flow'. A false positive (FP) is a normal traffic flow incorrectly classified as an 'attack flow'. A false negative (FN) is an attack flow incorrectly classified as a 'normal flow'.
1) Accuracy: It is defined as the ratio of correctly classified flows, i.e., attack flows classified as 'attack flow' (TP) and normal traffic flows classified as 'normal flow' (TN), to the total number of flows. Mathematically, it is defined as (1):
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \qquad (1)$$
2) Precision: It tells how many of the predicted attack flows were correct. Mathematically, it is described as (2):
$$\text{Precision} = \frac{TP}{TP + FP} \qquad (2)$$
3) Recall: It defines the ability of the system to correctly detect an attack when an actual attack occurs. It is also called sensitivity. Mathematically, it is expressed as (3):
$$\text{Recall} = \frac{TP}{TP + FN} \qquad (3)$$
4) F1-Score: It is defined as the harmonic mean of precision and recall. Mathematically, it is defined as (4):
$$\text{F1-score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \qquad (4)$$

1) Test Scenario 1: When each ResNetScan Model is Trained and Tested over Similar Dataset
Table 6 summarizes the testing results of each ResNetScan model when tested over the test-set of the same dataset on which it was trained for detecting scanning attack traffic. The first row of Table 6 shows the performance of the ResNetScan-1 model, which is obtained by training the ResNet-18 [23] model over the train-set extracted from the ScanLab dataset. When the ResNetScan-1 model is tested over the dataset on which it was trained (i.e., the ScanLab dataset), it achieves 99.20% accuracy, 99.39% precision, 99.05% recall, and 99.22% f1-score for classifying normal and scanning attack traffic. Similarly, the ResNetScan-2 model is obtained by training the ResNet-18 [23] model over the CICIDS-19 [55] dataset. When the ResNetScan-2 model is tested over the test-set of its respective dataset, i.e., the CICIDS-19 [55] dataset, it shows 99.91% accuracy, 100% precision, 99.83% recall, and 99.92% f1-score for detecting normal and scanning attack traffic.
Likewise, the performance of the other ResNetScan models for detecting normal and scanning traffic over their respective datasets is shown in Table 6. It can be observed that all the ResNetScan models performed well, with accuracy, precision, recall, and f1-score above 98% when trained and tested for scanning attack detection. Overall, the ResNetScan-2 model outperformed all other models in correctly classifying normal and scan traffic when we evaluated its performance over the test-set of the dataset on which it was trained.
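The four performance metrics used in this section can be computed from confusion-matrix counts as sketched below. The counts are toy values chosen for illustration and are not taken from the paper; the sketch also assumes non-zero denominators. As a consistency check, the F1-score is recomputed from the precision and recall reported for ResNetScan-1.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall, and F1-score from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Toy counts (not from the paper): 100 attack flows and 100 normal flows.
metrics = classification_metrics(tp=90, tn=95, fp=5, fn=10)
print({name: round(value, 4) for name, value in metrics.items()})

# Reported ResNetScan-1 precision and recall, in percent.
scan1_f1 = round(f1_from(99.39, 99.05), 2)
print(scan1_f1)
```

The recomputed value agrees with the 99.22% f1-score reported for ResNetScan-1.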

2) Test Scenario 2: When each ResNetScan Model is Tested over Other Datasets
In this test scenario, we performed experiments to cross-validate the performance of each ResNetScan model over the test-sets of the other datasets. In the second experiment, we tested the performance of the ResNetScan-2 model over the test-sets of the ScanLab, CICIDS-17 [56], and Bot-IoT [38] datasets. The experimental results reveal that, on average, the ResNetScan-2 model achieved 54.41% accuracy, 47.36% precision, 35.01% recall, and 40.23% f1-score, as shown in Table 7. As a whole, the average performance of the ResNetScan-2 model over the other scan datasets decreased by 45.50, 52.64, 64.82, and 59.69 percentage points in accuracy, precision, recall, and f1-score, respectively. In a nutshell, the performance of the ResNetScan-2 model degraded significantly compared to its performance when trained and tested over the same dataset.
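The reported decreases for ResNetScan-2 are differences between its scores on its own test-set and its average scores on the other datasets. A short sketch of that arithmetic, using only the figures quoted in the text:

```python
# ResNetScan-2 scores in percent: own test-set vs. average over other datasets.
same_dataset = {"accuracy": 99.91, "precision": 100.0, "recall": 99.83, "f1": 99.92}
cross_dataset = {"accuracy": 54.41, "precision": 47.36, "recall": 35.01, "f1": 40.23}

# Decrease in percentage points for each metric.
drops = {
    metric: round(same_dataset[metric] - cross_dataset[metric], 2)
    for metric in same_dataset
}
print(drops)
```

The differences match the reported decreases, which confirms that they are expressed in percentage points rather than as relative changes.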
In like manner, in the third experiment, we tested the performance of the ResNetScan-3 model over the test-sets of the ScanLab, CICIDS-19 [55], and Bot-IoT [38] datasets. The experimental results illustrate that, on average, the ResNetScan-3 model demonstrated 64.30% accuracy, 74.94% precision, 11.33% recall, and 17.02% f1-score. These results indicate that the ResNetScan-3 model detected the normal traffic samples well but failed to detect most of the scan samples, due to which the average f1-score is drastically lower than the average precision.
Finally, in the fourth experiment, we tested the performance of the ResNetScan-4 model over the test-sets of the ScanLab, CICIDS-17 [56], and CICIDS-19 [55] datasets. The experimental results summarized in Table 7 show that the ResNetScan-4 model exhibited 48.75% accuracy, 44.12% precision, 5.88% recall, and 9.87% f1-score. These results also show that the ResNetScan-4 model detected few of the scan attack samples, owing to which the average f1-score is far lower than the average precision.
From Fig. 4 (a)-(d), it can be noticed that the highest performance is achieved when a ResNetScan model is tested over the test-set of the same dataset on which it was trained. However, Fig. 4 (a)-(d) also shows that the proposed ResNetScan model, i.e., the ResNetScan-1 model, accomplished the second-highest performance scores and outperformed all other ResNetScan models with remarkable accuracy, precision, recall, and f1-score. Thus, the experimental results show that the proposed ResNetScan-1 model outperformed all other ResNetScan models when each ResNetScan model is tested over the datasets on which it was not trained.
If an attacker evades the scan detection stage, compromises an IoT device, and starts performing a DDoS attack, then the attacker can be detected in the second stage. Further, if an IoT device or network becomes the victim of a DDoS attack (i.e., the device is under attack from outside the network), then the DDoS attack detection model will detect the inbound DDoS attack to stop the attacker from further damaging the network.

3) Test Scenario 3: When each ResNetDDoS Model is Trained and Tested over Similar Dataset
In this scenario, we performed experiments to compare the performance of each ResNetDDoS model when it is tested over the same dataset on which it was trained. Table 8 summarizes the testing results of each ResNetDDoS model, including the model trained and tested over the CICIDS-19 [55] dataset. Likewise, the performance of the other ResNetDDoS models for detecting normal and DDoS traffic over their respective datasets is shown in Table 8. It can be noticed that all the ResNetDDoS models performed well in terms of accuracy, precision, recall, and f1-score. Overall, the ResNetDDoS-2 model outperformed all other models in correctly classifying normal and DDoS traffic when we evaluated its performance over the test-set of the dataset on which it was trained.

4) Test Scenario 4: When each ResNetDDoS Model is Tested over Other Datasets
In this test scenario, we performed experiments to cross-validate the performance of each ResNetDDoS model over the test-sets of the other datasets. Table 9 summarizes the results of all experiments in which we test each ResNetDDoS model for normal and DDoS attack traffic detection. In the first experiment, we tested the ResNetDDoS-1 model (obtained by training the ResNet-18 [23] model over the DDoSLab dataset) over the test-sets of the CICIDS-19 [55], CICIDS-17 [56], and Bot-IoT [38] datasets. The experimental results show that the ResNetDDoS-1 model performed well over all three datasets in correctly classifying normal and DDoS attack traffic. On average, the ResNetDDoS-1 model achieved 98.70% accuracy, 97.53% precision, 97.96% recall, and 97.74% f1-score, which is comparable to its performance over the test-set of the dataset on which it was trained.
Likewise, in the second experiment, we tested the performance of the ResNetDDoS-2 model over the test-sets of the DDoSLab, CICIDS-17 [56], and Bot-IoT [38] datasets. The experimental results showed that, on average, the ResNetDDoS-2 model achieved 72.40% accuracy, 84.41% precision, 19.80% recall, and 30.13% f1-score, as shown in Table 9. As a whole, the average performance of the ResNetDDoS-2 model over the other DDoS datasets decreased by 27.09, 14.99, 79.83, and 69.39 percentage points in accuracy, precision, recall, and f1-score, respectively. In summary, the performance of the ResNetDDoS-2 model degraded significantly compared to its performance when trained and tested over the same dataset.
In the same way, in the third experiment, we tested the performance of the ResNetDDoS-3 model over the test-sets of the DDoSLab, CICIDS-19 [55], and Bot-IoT [38] datasets. The experimental results illustrate that, on average, the ResNetDDoS-3 model demonstrated 58.77% accuracy, 82.28% precision, 11.53% recall, and 19.36% f1-score. These results indicate that the ResNetDDoS-3 model detected the normal traffic samples fairly well but failed to detect most of the DDoS traffic, due to which the average f1-score is drastically lower than the average precision.
Finally, in the fourth experiment, we tested the performance of the ResNetDDoS-4 model over the test-sets of the DDoSLab, CICIDS-17 [56], and CICIDS-19 [55] datasets. The experimental results summarized in Table 9 show that the ResNetDDoS-4 model achieved 65.81% accuracy, 75.50% precision, 14.29% recall, and 20.25% f1-score. These results also show that the ResNetDDoS-4 model detected few of the DDoS attack samples, owing to which the average f1-score is far lower than the average precision.
From Fig. 5 (a)-(d), it can be noticed that the highest performance is achieved when a ResNetDDoS model is tested over the test-set of the same dataset on which it was trained. However, Fig. 5 (a)-(d) also shows that the proposed ResNetDDoS-1 model maintained high performance over the datasets on which it was not trained.
The experimental results of all the above experiments performed in four test scenarios reveal that all the ResNetScan and ResNetDDoS models efficiently detect scan and DDoS attacks when they are tested over the test-set of the same dataset on which they were trained. However, the performance of all ResNetScan and ResNetDDoS models except the ResNetScan-1 and ResNetDDoS-1 models dropped substantially when these models were tested over the test-sets of other datasets on which they were not trained. Fig. 6 (a)-(b) illustrates the average performance of all ResNetScan and ResNetDDoS models over the test-sets of the datasets on which they were not trained. It can be seen that the average performance of both the proposed ResNetScan-1 and ResNetDDoS-1 models remained the highest among all models. Hence, the proposed ResNetScan-1 and ResNetDDoS-1 models outperformed all others in detecting scan and DDoS attacks. Overall, the proposed two-fold approach achieved 98.89% accuracy, 99.01% precision, 98.74% recall, and 98.87% f1-score to prevent and detect IoT botnet attacks, as displayed in Fig. 7.
The testing results and comparative analysis of the proposed two-fold approach on both self-generated datasets and three publicly-available datasets affirm that the proposed two-fold approach is efficient and robust to prevent and detect IoT botnet attacks with large attack patterns coverage.
Fig. 7. Overall performance of the proposed two-fold machine learning approach to prevent and detect IoT botnet attacks.

VI. CONCLUSION
In this work, we proposed a two-fold machine learning approach to prevent and detect IoT botnet attacks. In the first fold, we trained a state-of-the-art deep learning model, i.e., ResNet-18, for scanning attack detection and named it the ResNetScan-1 model. In the second fold, we trained another ResNet-18 model (named the ResNetDDoS-1 model) in order to detect the DDoS attack in case the scanning detection model fails to prevent a botnet attack. In order to validate the performance of the proposed ResNetScan-1 and ResNetDDoS-1 models, we performed a set of experiments in which we took the scan and DDoS traffic samples from three publicly-available datasets, trained the ResNet-18 model over these datasets, and saved the resultant ResNetScan and ResNetDDoS models. We then tested each resultant ResNetScan and ResNetDDoS model over the test-sets of the other datasets on which it was not trained. The experimental results revealed that the performance of all ResNetScan and ResNetDDoS models except the proposed ResNetScan-1 and ResNetDDoS-1 models dropped substantially when tested over the datasets on which they were not trained. Furthermore, the experimental results showed that the proposed ResNetScan-1 and ResNetDDoS-1 models maintained their performance and outperformed all other models in detecting scan and DDoS attacks. Hence, the proposed two-fold approach is efficient and robust in preventing and detecting IoT botnet attacks with large attack pattern coverage.
The current work only covers 33 types of scanning and 60 types of DDoS attacks. In future, we aim to cover more scanning and DDoS attacks techniques in order to well train the proposed framework for more efficient prevention and detection of IoT botnet and DDoS attacks. Further, we can deploy the proposed two-fold approach in an IDS to investigate its effectiveness on live network traffic.