Cyber Threats Detection in Smart Environments using SDN-enabled DNN-LSTM Hybrid Framework

The Internet of Things (IoT) is a rapidly growing communication technology that is reshaping conventional network communication. IoT applications now reach into most areas of daily life, and integrating IoT with other technologies widens this application spectrum further. However, this openness also exposes IoT to a broad range of serious security threats that require effective countermeasures. In this study, we propose a Deep Learning (DL) driven, Software Defined Networking (SDN) enabled Intrusion Detection System (IDS) to combat emerging cyber threats in IoT. The proposed model (DNNLSTM) can detect a wide class of common as well as less frequently occurring cyber threats in IoT communications. The model is trained on the CICIDS 2018 dataset, and its performance is evaluated on several standard metrics, i.e., Accuracy, Precision, Recall, and F1-Score. Furthermore, the framework is compared with two related classifiers, DNNGRU and BLSTM, for validation. A performance comparison is also conducted between the proposed system and several prominent solutions from the literature. The proposed design outperforms the existing literature, achieving 99.55% accuracy, 99.36% precision, 99.44% recall, and 99.42% F1-score.

security concerns. The involvement of the internet in large-scale IoT environments calls for cyber security solutions that can cope with these dynamic threats. A number of cutting-edge technologies are therefore being combined to secure IoT environments against internal and external threats [6]. Software-Defined Networking (SDN) based solutions are considered particularly promising for achieving these security objectives [7]. Artificial Intelligence (AI) and Machine Learning (ML) are other prominent technologies being applied toward the same goals [8]. These technologies can be combined to provide a unified response to a diverse variety of security threats in IoT. Over the past decade, the combination of ML with SDN-based approaches has emerged as a prominent tool for detecting security threats in IoT communication [9]. SDN-based approaches contribute to the identification of anomalous activities, whereas ML-based approaches strengthen the robustness of the detection mechanism [10,11]. The programmable features of SDN also leave ample room for AI, and AI-based algorithms combined with SDN-based frameworks are regarded as an effective solution for countering security threats in IoT [12,13]. A conventional SDN framework is divided into three planes: the control plane, the data plane, and the application plane [14]. The control plane is fully configurable and can integrate external networks such as IoT with the data plane. The data plane then ensures a smooth flow of data between both participants under the rules set by the control plane [15]. In other words, the control plane can govern the internal communication infrastructure of IoT by taking central control of the combined system.
All the heterogeneous nodes in the IoT network are dynamically supervised through the control plane, where cyber threats can be monitored effectively [16,17]. The DL-based approach offers considerable strengths in the analysis of traffic patterns. A typical deep learning-based framework is first trained on a comprehensive dataset that exposes it to a wide range of security threats. The system is then deployed in the actual communication environment, where it can promptly identify malicious entities in the network [18]. These considerations motivated us to propose a deep learning-driven, SDN-based intrusion detection system for IoT communication environments.

A. CONTRIBUTION
Our major contributions in this study are as follows:
• We propose a deep learning-driven, SDN-enabled intrusion detection system, labelled Cu-DNNLSTM, to detect emerging cyber threats in IoT environments.
• The CICIDS2018 dataset is used to train the proposed model and enhance its threat detection capabilities.
• The framework is compared with the Cu-DNNGRU and Cu-BLSTM classifiers, which are trained on the same dataset.
• The performance of the designed model is compared with existing solutions from the literature.
• Simulation results support the proposed model in terms of efficient threat detection, high accuracy, high precision, low resource consumption, and low computational overhead.
• Finally, 10-fold cross-validation is conducted to show unbiased results.

B. ORGANIZATION
This study is organized as follows. Section 2 presents the background and the relevant literature. Section 3 describes the proposed methodology, including the dataset and algorithms. Section 4 describes the performance evaluation setup used to validate the proposed model. The results are discussed in Section 5, and the study is concluded in Section 6.

II. BACKGROUND AND RELATED WORK
A. IOT AND SDN
IoT is a rapidly evolving communication technology that is transforming long-established means of communication. Synchronized and automated connectivity among heterogeneous devices is the core strength of IoT [19]. IoT applications touch every facet of our lives, and their scope is still expanding. IoT can also be integrated with other state-of-the-art technologies to pursue shared objectives [20]. These technologies include machine learning, SDN, fog computing, etc. Moreover, cloud sharing, big data analysis, and blockchain are other third-party technologies that can be combined with IoT [21]. SDN is integrated with IoT for many reasons, as it can enhance the effectiveness of IoT in several ways. SDN comprises three basic layers that govern its communication architecture: the control plane, the data plane, and the application plane [22,23]. The application plane differs from the other planes in that it only implements the commands issued by them [24]. The control plane has programmable features that link external communication technologies such as IoT to the data plane [25]. The control plane can further take control of the communication nodes of IoT. All data traffic transmitted over an IoT network can be dynamically analysed through the SDN control plane. In this way, SDN offers combined services, i.e., customization, scalability, and security, in IoT [26].
Figure 1: Proposed Network Framework

B. RELATED WORK
The wide range of IoT applications exposes it to a variety of security threats that need to be countered. SDN-based solutions are considered a suitable choice for ensuring secure and reliable communication in IoT environments. Many research efforts have been made in this regard, and we review some of the notable contributions here.
Researchers proposed a Long Short-Term Memory (LSTM) approach to detect security threats in IoT. The whole traffic stream is analysed, and suspicious entities are predicted to reduce the chance of security breaches. Several datasets, i.e., CIDCC-15, UNSW-NB15, and NSL-KDD, are used to train and evaluate the model. A bio-inspired Firefly Swarm Optimization (FSO) is further integrated with the proposed system to reduce the computational overhead [27]. A comprehensive feature set containing abnormal traffic patterns acts as an essential component for investigating the anomalous behaviour of malicious entries.
A similar attempt is made in [28], where the authors proposed an Intrusion Detection System (IDS) that retains an ideal feature set for threat detection. LightGBM is used for feature screening, while PPO2 and ReLU are used to strengthen the threat detection mechanism. A deep learning-driven IDS designed mainly to investigate common attacks such as DoS Slowloris, DoS Hulk, and port scanning is described in detail in [29]. The system is trained on the CICIDS2017 dataset. The designed model is then compared with existing solutions and shows superior performance, with an attack detection accuracy of 98%. Another DL-based intrusion detection scheme, built on Convolutional Neural Networks (CNN), is presented in [30]. The authors claim to detect and further categorize critical security threats in IoT. CICIDS2017 and UNSW-NB15 are used to enhance the attack detection capabilities of the proposed system. However, significantly high resource consumption is observed, which makes this scheme unsuitable for resource-constrained networks. Another DL-driven intrusion detection scheme is designed in [31]. A binary classifier and a multiclass classifier are employed together with the BOT-IoT dataset. The designed scheme can identify abnormal traffic with an accuracy of 99%. Text-CNN and Gated Recurrent Unit (GRU) classifiers are regarded as a suitable choice for extracting sequential data in the manner of a language model. This improves the selection of the best features, which tends to raise the F1 score. The authors employed these classifiers with the KDD99 and ADFA-LD datasets to monitor abnormal activities in an IoT communication environment with 98% precision [32].
CNN is employed in another anomaly detection mechanism together with the BOT-IoT and MQTT-IoT-IDS2020 datasets. The core purpose is to evaluate traffic patterns to discover suspicious events in large-scale networks [33]. A hybrid feature selection model is used to extract the most commonly used features for attack detection by segmenting TCP/IP packets. NSL-KDD and UNSW-NB15 are further used to strengthen the proposed system. The performance is evaluated in an industrial scenario, where the proposed model outperforms existing solutions with 97% precision [34]. A combination of a Single-hidden Layer Feedforward Neural Network (SLFN) and an LSTM classifier is considered a practical choice for selecting the features most likely to be useful in threat detection. The authors adopted these two classifiers in a multi-layer threat classification approach. The IoT-ID20 dataset is used for training [35]. The proposed framework produces strong results for threat detection and classification. However, the system appears to consume substantial network resources. A deep learning-based malicious packet filtering approach is proposed for SDN-based IoT communication scenarios in [36]. The Mirai dataset and a manually constructed video injection dataset are used together to achieve the desired filtering target. A DNN classifier controls the entire processing infrastructure. The proposed system can only deal with DDoS and port scan attacks. A multi-CNN based approach is adopted together with the NSL-KDD dataset [37]. The authors aim to detect adversaries in industrial IoT. Simulation results demonstrate the suitability of the designed framework; however, notable complexity is also experienced in large-scale networks.
DoS attacks slow down the overall performance of a system by exhausting its central resources. Researchers aim to design a competent detection mechanism to identify the compromised nodes that are used to launch DoS and DDoS attacks [38]. DoS attacks are handled in a hierarchical pattern by the approach presented in [39]. To this end, the researchers incorporated three generic classifiers that are known for their ability to categorize traffic streams. The CICIDS2017 and BOT-IoT datasets are used for training. The designed framework detects DoS attacks with 99% accuracy and notable precision. Another DL-driven IDS is presented in [40], which is trained on a dataset customised by the researchers. Decision Tree (DT), Multilayer Perceptron (MLP), and LSTM are the classifiers employed to boost the detection capability of the proposed framework. Adversaries are detected with a comparatively higher accuracy of 98%. Keylogging and data exfiltration attacks are becoming increasingly common in SDN-based IoT communication networks. The authors have constructed a robust IDS to detect these attacks in IoT. C5 and SVM classifiers are used to design this framework, and BoT-IoT is used for training. The proposed system achieves a high accuracy of 99% for attack detection; however, communication delays are experienced while evaluating the designed model [41]. User-to-Root (U2R), Probe, and Remote-to-Local (R2L) attacks are regarded as serious threats to the integrity of a communication system. Researchers have used the Spider Monkey Optimization (SMO) algorithm and the Stacked Deep Polynomial Network (SDPN) algorithm to design a detection mechanism for such threats.
NSL-KDD is used to train the system, and in the evaluation the proposed model shows 97% accuracy for attack detection with a precision of 95% [42]. Man-in-the-Middle (MITM), reconnaissance, and spoofing attacks are also major security threats for IoT. Researchers have designed an IDS that integrates SVM, Naïve Bayes, and MLP classifiers. The system is trained on the NSL-KDD dataset, and the performance is evaluated in a scalable virtual simulation environment. The proposed system shows 98% accuracy for attack detection with high precision [43]. In [62], the authors used a novel approach, "CANintelliIDS", for intrusion detection on a Controller Area Network (CAN). The authors combined CNN and GRU and claimed that this combination improves detection performance. They achieved an F1-score of 93.79%, 93.69% precision, and 93.91% recall. The authors in [63] used a temporal weighted averaging algorithm for asynchronous federated learning (AFL) to simulate an intrusion detection environment. They trained and tested the proposed model on the NSL-KDD dataset and achieved an accuracy of 99.50%. The authors of [64] proposed a hybrid model based on DNN that combines Principal Component Analysis (PCA) and the Grey-Wolf Optimizer (GWO) for efficient and effective threat detection in the Internet of Medical Things (IoMT) environment. They claimed that their model outperformed existing ML techniques with a 15% increase in detection accuracy and a 32% decrease in time complexity. The related work is summarized in Table 1.

A. PROPOSED NETWORK MODEL
SDN is widely acknowledged as a way to enhance the capabilities of a dynamic heterogeneous network [44]. Scalability, heterogeneous connectivity, customizable communication, surveillance, and security are other notable characteristics of SDN [45,46]. Its main strength lies in its core architecture, which comprises two processing layers and one interface layer. The interface layer is only responsible for implementing and reflecting the decisions made by the processing layers [47]. The processing layers, namely the control plane and the data plane, actively participate in decision making and support other integrated technologies. The control plane introduces a fully programmable architecture that provides customizable administration of the network [48]. It can further admit IoT devices into the data plane. We propose a DNNLSTM model to counter emerging cyber threats in industrial IoT. The designed model is embedded in the control plane of SDN for several reasons.
The control plane of SDN offers a programmable interface that helps control the fundamental operations of IoT. It therefore regulates the communication mechanism in IoT networks and provides heterogeneous connectivity, dynamic scalability, and centralized governance. The data plane comprises the IoT devices that transmit data across the network, and this data is linked to the control plane through OpenFlow switches. The control plane can thus bring IoT devices into its data plane, which enables data filtering, traffic monitoring, and general inspection of communication streams. By integrating SDN with IoT, emerging cyber threats and other suspicious adversaries can therefore be countered effectively.
For a thorough performance evaluation, a comparison is conducted with two closely related classifiers, i.e., the DNNGRU classifier and the BLSTM classifier. The DNNGRU classifier holds DNN layers with 500 and 300 neurons respectively, and one GRU layer with 200 neurons. The BLSTM classifier uses BLSTM layers with 500, 300, and 200 neurons respectively. Table 2 lists detailed information on the proposed model and the other classifiers.

C. DATASET DESCRIPTION
The dataset is an integral component of every DL-driven intrusion detection scheme. The selection of an adequate and representative dataset strengthens the threat detection scheme [49]. A variety of datasets are available to support such intrusion detection schemes. UNSW-NB15 [50], NSL-KDD [51], BOT-IoT [52], and ADFA-LD [53] are some of the commonly used ones. However, along with their benefits, these datasets also have shortcomings. Lack of appropriate features for IoT, use of malicious scripts for attack detection, and susceptibility to external cyber malfunctions are some of them [54]. We have adopted the CICIDS2018 dataset, which is known for its wide range of features relevant to IoT communications [55,56]. This dataset covers seven categories with up to 14 contemporary cyber threats (e.g., brute force, heartleech attack, DDoS, infiltration attack, and port scanning attacks) [57]. More than 80 traffic scenarios are embedded in this dataset [58]. In our proposed work, we have included all features of the CICIDS2018 dataset; its classes and their instance counts are listed in Table 3.

D. DATASET PREPROCESSING
The CICIDS2018 dataset provides its data in heterogeneous forms. Training a classifier directly on this raw data does not yield meaningful results, so the data must be cleaned first. The first step was to remove any records that contained blank or NaN values, as they can degrade the quality of the data and of the evaluated model. We used the label encoder from sklearn to convert all non-numeric values to numeric values, because DL algorithms primarily analyse numeric input. Additionally, the output label has been one-hot encoded, because an implied category ordering can have a negative impact while validating the performance of a proposed model.
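The three preprocessing steps described here (dropping incomplete records, label-encoding non-numeric values, one-hot encoding the output label) can be illustrated with a small dependency-free sketch. It is not the authors' pipeline: the records and column names below are hypothetical, and in practice the same steps would typically be done with pandas and the scikit-learn label encoder mentioned in the text.

```python
import math


def drop_incomplete(rows):
    """Keep only records with no missing value (None, empty string, or NaN)."""
    def missing(value):
        return (
            value is None
            or value == ""
            or (isinstance(value, float) and math.isnan(value))
        )
    return [row for row in rows if not any(missing(v) for v in row.values())]


def label_encode(values):
    """Map each distinct value to an integer code (codes follow sorted order)."""
    classes = sorted(set(values))
    index = {c: i for i, c in enumerate(classes)}
    return [index[v] for v in values], classes


def one_hot(codes, n_classes):
    """Turn integer class codes into one-hot vectors."""
    return [[1 if code == k else 0 for k in range(n_classes)] for code in codes]


# Hypothetical toy records; the column names are illustrative only.
rows = [
    {"protocol": "tcp", "duration": 1.5, "label": "Benign"},
    {"protocol": "udp", "duration": float("nan"), "label": "DDoS"},
    {"protocol": "udp", "duration": 0.2, "label": "DDoS"},
    {"protocol": "tcp", "duration": 3.0, "label": "Bot"},
]

clean = drop_incomplete(rows)  # the NaN record is removed
protocol_codes, protocol_classes = label_encode([r["protocol"] for r in clean])
label_codes, label_classes = label_encode([r["label"] for r in clean])
targets = one_hot(label_codes, len(label_classes))
```

One-hot targets avoid suggesting to the model that, for example, class code 2 is "larger" than class code 0, which is the ordering concern raised in the text.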

E. DATA NORMALIZATION
For the numeric columns of a dataset, normalisation refers to translating their values to a common scale without distorting the differences in their value ranges. Not every dataset requires normalisation for machine learning; it is necessary only when features have widely differing ranges of values. To normalise CICIDS2018, we have used the Min-Max scaler function. In this approach, the data is scaled to a fixed range, usually between 0 and 1. A normalized dataset improves the effectiveness of the proposed model and yields better outcomes.
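Min-Max scaling maps each value x of a feature to (x - min) / (max - min), so the smallest value becomes 0 and the largest becomes 1. The following is a minimal sketch of that formula for a single column; the function name and the sample values are illustrative assumptions, and a library scaler would normally be used on the full dataset.

```python
def min_max_scale(column, low=0.0, high=1.0):
    """Rescale a list of numbers linearly into the range [low, high]."""
    col_min, col_max = min(column), max(column)
    if col_max == col_min:
        # A constant feature carries no information; map it to the lower bound.
        return [low for _ in column]
    span = col_max - col_min
    return [low + (x - col_min) * (high - low) / span for x in column]


# Hypothetical feature column (e.g. packets per flow).
packet_counts = [10, 20, 15, 30]
scaled = min_max_scale(packet_counts)
```

A common design choice, assumed here rather than stated in the text, is to compute the minimum and maximum on the training split only and reuse them for the test split, so that no information leaks from test data into training.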

A. EXPERIMENTAL SETUP
The performance of our proposed framework is validated through simulations on an Intel Core i7-7700 processor accompanied by a Graphical Processing Unit (GPU). During experimentation, we used several libraries, such as NumPy, TensorFlow, Pandas, and Keras. The proposed model is trained with Keras on Python 3.8.

B. EVALUATION METRICS
To validate the performance of an intrusion detection framework, the evaluation metrics should be generic and should cover all relevant attributes of the targeted framework. Although there is no standardized set of performance metrics, Accuracy, Recall, Precision, and F1-score are very frequently used. We have adopted these metrics to examine our proposed DNNLSTM framework. Here TP, TN, FP, and FN denote the numbers of true positives, true negatives, false positives, and false negatives. The accuracy of a model is the ratio of correctly classified instances to all classified instances, as given in Equation 1.
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{1}$$
Recall is considered a core parameter for determining the performance of an IDS. It indicates the proportion of actual attacks correctly identified by an algorithm. It is the ratio of TP to the sum of TP and FN, as given in Equation 2.
$$\text{Recall} = \frac{TP}{TP + FN} \tag{2}$$
Precision is sometimes confused with recall; it expresses the proportion of the results flagged by the system that are actually relevant. Equation 3 gives precision as the ratio of TP to the sum of TP and FP.
$$\text{Precision} = \frac{TP}{TP + FP} \tag{3}$$
The F1-score is obtained by dividing twice TP by the sum of twice TP, FP, and FN. The following equation can be used to calculate this score.
$$\text{F1-score} = \frac{2\,TP}{2\,TP + FP + FN}$$
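As a worked example, the four metrics of this section can be computed directly from the counts TP, TN, FP, and FN. The function name and the counts below are illustrative assumptions, not results from the paper.

```python
def binary_metrics(tp, tn, fp, fn):
    """Compute accuracy, recall, precision and F1 from confusion counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f1_denominator = 2 * tp + fp + fn
    f1 = 2 * tp / f1_denominator if f1_denominator else 0.0
    return {
        "accuracy": accuracy,
        "recall": recall,
        "precision": precision,
        "f1": f1,
    }


# Hypothetical counts: 100 attack flows and 100 benign flows.
metrics = binary_metrics(tp=90, tn=95, fp=5, fn=10)
```

With these counts, accuracy is 185/200, recall is 90/100, precision is 90/95, and the F1-score is 180/195. The F1 form used here is algebraically the same as the harmonic mean of precision and recall.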

V. RESULTS AND DISCUSSION
This section discusses the outcomes obtained after a systematic performance evaluation of our proposed framework. For a complete comparison, the proposed scheme (DNNLSTM) is compared with two classifiers, DNNGRU and BLSTM, along with existing literature in Table 5.

A. DISCUSSION
We used the cuDNNLSTM model for effective and efficient threat detection in the IoT environment. The proposed model (cuDNNLSTM) can detect brute-force, bot, infiltration, and DDoS attacks. It is trained and evaluated on the CICIDS2018 dataset, with DNN layers of 500 and 300 neurons and an LSTM layer of 200 neurons. Because IoT devices are heterogeneous, resource-constrained, and designed for specific user purposes, it is hard to devise a common solution for all of them. The proposed work uses an SDN-based threat detection framework for IoT because SDN adapts efficiently to network heterogeneity. The integration of SDN and IoT therefore supports accurate monitoring of network traffic to detect suspicious activities. The proposed model is easy to implement and deploy in IoT environments to detect sophisticated threats. However, the proposed model is vulnerable to insider threats. A complete discussion of the results is provided in the following sections.

B. CROSS-VALIDATION
Every DL-based IDS has the potential to detect malicious entities. However, cross-validation is a well-suited way to determine the reliability of a system. Our proposed system is validated through 10-fold cross-validation on Accuracy, Precision, Recall, and F1-score. The results favour our proposed model over the other classifiers included in this comparison. In terms of accuracy, DNNLSTM achieves 99.45% at the first fold. This exceeds the values achieved by DNNGRU and BLSTM, and the same pattern holds until the 10th fold. The same behaviour can be observed for Recall, where the proposed scheme reaches 98.97%, exceeding the results of the other schemes. The same trend is observed until the final fold. Furthermore, DNNLSTM obtains an F1-score of 99.56% at the 1st fold and 98.54% at the 10th fold, where the other schemes obtain lower F1-scores. For Precision, DNNLSTM again outperforms the competing schemes throughout the 10-fold evaluation. The complete analysis of the 10-fold cross-validation is given in Table 4.
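In k-fold cross-validation the data is split into k parts; each part serves once as the test set while the remaining k - 1 parts are used for training, and the metrics are reported per fold. The sketch below shows only the index bookkeeping for k = 10. The text does not say whether the folds were shuffled or stratified by class, so the shuffling with a fixed seed is an assumption, and the function name is hypothetical.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # assumed: shuffle once before splitting
    # Deal the shuffled indices round-robin into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


# Example with 25 samples: every sample is tested exactly once across folds.
splits = list(k_fold_indices(25, k=10))
```

In a full experiment, a fresh model would be trained on each `train_idx` and scored on the matching `test_idx`, giving one row of per-fold metrics for each classifier.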

C. CONFUSION MATRIX ANALYSIS
A confusion matrix is a technique for evaluating the performance of a DL-based IDS. Our proposed model is evaluated on this basis as well and is further compared with DNNGRU and BLSTM. Figure 3 shows that the proposed DNNLSTM performs better than DNNGRU and BLSTM.
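A confusion matrix counts, for every true class, how often each class was predicted; for a multi-class IDS the per-class TP, TN, FP, and FN values can then be read off in a one-vs-rest fashion. The following minimal sketch uses hypothetical labels and predictions (rows are true classes, columns are predicted classes) and is not taken from the paper's experiments.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Build a matrix where entry [i][j] counts true class i predicted as j."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for truth, pred in zip(y_true, y_pred):
        matrix[index[truth]][index[pred]] += 1
    return matrix


def per_class_counts(matrix, k):
    """Return (TP, TN, FP, FN) for class k, treating all other classes as negative."""
    tp = matrix[k][k]
    fn = sum(matrix[k]) - tp                    # class k predicted as something else
    fp = sum(row[k] for row in matrix) - tp     # other classes predicted as k
    tn = sum(sum(row) for row in matrix) - tp - fn - fp
    return tp, tn, fp, fn


# Hypothetical ground truth and predictions for three traffic classes.
labels = ["Benign", "DDoS", "Bot"]
y_true = ["Benign", "Benign", "DDoS", "DDoS", "Bot", "Bot"]
y_pred = ["Benign", "DDoS", "DDoS", "DDoS", "Bot", "Benign"]

cm = confusion_matrix(y_true, y_pred, labels)
ddos_counts = per_class_counts(cm, labels.index("DDoS"))
```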

D. ROC CURVE ANALYSIS
The Receiver Operating Characteristic (ROC) curve is important when validating a security mechanism. The True Positive Rate (TPR), also known as sensitivity or recall, is a metric used in machine learning to quantify the percentage of correctly detected positive events.
Conversely, the True Negative Rate (TNR) is the percentage of negative events that the model correctly predicts as negative. The ROC curve plots TPR against the False Positive Rate (FPR = 1 - TNR) as the decision threshold varies, so it gives a threshold-independent view of the effectiveness of an IDS. The proposed DNNLSTM performs better on the ROC curve than DNNGRU and BLSTM, as can be seen in Figure 4.
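An ROC curve is traced by sweeping the decision threshold over the classifier's scores and recording the true positive rate against the false positive rate (which equals 1 - TNR) at each step; the area under the curve (AUC) summarizes it in one number. The sketch below is a minimal binary-class illustration with hypothetical labels and scores, not the procedure used to produce the paper's figure.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) points from sweeping the threshold high to low.

    y_true holds 1 for attack and 0 for benign; both classes must be present.
    """
    positives = sum(y_true)
    negatives = len(y_true) - positives
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Process all samples tied at this score before emitting a point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the curve using the trapezoidal rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Hypothetical labels and model scores (higher score = more likely an attack).
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

curve = roc_points(y_true, scores)
area = auc(curve)
```

For this toy input, 8 of the 9 attack/benign pairs are ranked correctly, so the area is 8/9; a perfect detector would reach 1.0 and random scoring about 0.5.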

E. ACCURACY, PRECISION, RECALL AND F1-SCORE
Accuracy gives an overall assessment of the performance of a classifier. Precision measures how many of the events predicted as attacks are actual attacks. The term "Recall" can be used interchangeably with TPR, and it measures how many of the actual attacks are detected. The F1 score combines precision and recall into a single indicator of the strength of an intrusion detection framework. The proposed DNNLSTM is evaluated on all the above-mentioned performance indicators. Its performance in comparison with DNNGRU and BLSTM makes it a strong choice for countering cyber threats in IIoT. The proposed model achieved an accuracy of 99.55% with precision, recall, and F1-score of 99.36%, 99.44%, and 99.42% respectively. The whole performance analysis is shown in Figure 5.

H. TIME EFFICIENCY
The time that a system takes to learn from its training data is referred to as the training time, and it is considered an important measure of the performance of a system. The proposed DNNLSTM has a training time of 14.39ms, which is low compared with DNNGRU and BLSTM, with training times of 29.54ms and 21.44ms respectively, as shown in Figure 8.
evaluation is drawn on Accuracy, Precision, Recall, and F1-score. All of these algorithms are analysed in terms of these parameters, and DNNLSTM shows the best performance. A 10-fold evaluation is conducted to obtain more reliable results. Our proposed model performs strongly in comparison with the other benchmark algorithms. This comparison is detailed in Table 4. To extend the validation of DNNLSTM, a comprehensive performance comparison is further drawn between the proposed model and some state-of-the-art frameworks from the literature. On all the above-mentioned performance parameters, DNNLSTM outperforms the existing literature. The comparison is given in Table 5.

VI. CONCLUSION
This study addresses intrusion detection in IoT, where we have proposed a DL-based, SDN-enabled intrusion detection mechanism to combat emerging cyber threats in IoT.
The proposed system (DNNLSTM) can counter a range of potential security threats including DoS, DDoS, MITM, botnet attacks, infiltration attacks, brute force attacks, port scanning attacks, etc. The performance of the proposed model is evaluated on several standard metrics, i.e., accuracy, precision, recall, and F1-score. For validation, the designed framework is compared with two benchmark classifiers, i.e., DNNGRU and BLSTM. For a more comprehensive analysis, DNNLSTM is also compared with state-of-the-art intrusion detection schemes in the same domain. The proposed framework outperforms the existing literature in attack detection, achieving 99.55% accuracy, 99.36% precision, 99.44% recall, and 99.42% F1-score, which makes it a strong choice for detecting malicious entities in IoT environments.