Machine and Deep Learning Solutions for Intrusion Detection and Prevention in IoTs: A Survey

The increasing number of connected devices in the era of the Internet of Things (IoT) has also increased the number of intrusions. An Intrusion Detection System (IDS) is a secondary intelligent system that monitors, detects, and alerts on malicious activities; an Intrusion Prevention System (IPS) extends a detection system by proactively triggering a relevant action when an attack is suspected. Both IDS and IPS are significant and useful for developing a security model. Several studies review detection and prevention models; however, a coherent account of the opportunities and advancements in these models is missing. Besides, the existing models have limitations that need to be surveyed before new security models can be developed. Our survey is the first to present a risk factor analysis using a mapping technique and to propose a hybrid framework for an efficient security model for intrusion detection and/or prevention. We explore the importance of various Artificial Intelligence (AI)-based techniques, tools, and methods used for detection and/or prevention systems in IoTs. More specifically, we emphasize Machine Learning (ML) and Deep Learning (DL) techniques for intrusion detection and prevention systems and provide a comparative analysis focusing on feasibility, compatibility, challenges, and real-time issues. This survey helps industry and academia categorize the challenges and issues in current security models and opens new directions for developing security frameworks with efficient ML or DL methods.


I. INTRODUCTION
The past decades have seen a revolution in computing with advanced technologies and smart device communication.
The Internet of Things (IoT) establishes internal communication using sensor devices. It is the most preferred technology for day-to-day activities in this era [1]. IoT devices transfer huge volumes of data over a network with minimum human interaction, using the Internet as a central communication medium. Global connectivity and the exchange of data have had a major impact on education, business, health care, military capabilities, international trade, agriculture, and home applications. Massive connectivity among heterogeneous devices, unsafe network architecture, and exposure of global data raise critical security issues in IoTs [1]. Cyber security is the major concern in this digital world: it ensures protection from malicious activities that aim to corrupt or steal data and interrupt an organization's systems through unauthorized access. At the same time, IoT has become a major channel for the spread of dangerous malware attacks. Unpatched and less secured devices are targets for botnet operators, who capture the system and gain control over the devices. Strong security services that control the access mechanism with a robust authentication framework are essential. An Intrusion Detection System (IDS) is a suitable solution for handling security issues and mitigating the effects of attacks. IDS has become an essential part of security management in network and host systems. It detects intrusions or misuse of a network or system, reports them to the administrators, and files a record for further investigation. It handles suspicious events without interrupting regular activities during a malicious outbreak [2].
The associate editor coordinating the review of this manuscript and approving it for publication was Sotirios Goudos.
VOLUME 10, 2022. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
Many tools and techniques are available to counter the threat of these attacks. Strong firewall protection is essential, as existing firewalls cannot classify behavior-based or anomaly attacks. Antivirus software has limited scope in recognizing new virus patterns. Intrusion detection triggers an alert after an attack enters the network but does nothing to stop the attack. Currently available IDS have several limitations, such as a lack of flexibility and scalability [2]. An Intrusion Prevention System (IPS) is a proactive method that prevents a security attack by examining the patterns of data (network traffic) and recognizing abnormal behavior from stored data records (signatures). An IPS blocks the offending data when an attack is detected [3]. We consider an IDS the second line of defense; however, it faces difficulties in providing secure access control [10]. On the other hand, an IPS integrated with a firewall and an IDS can provide preventive measures, with alerts for attacks, in a protected network area. Artificial Intelligence (AI) technologies such as Machine Learning (ML), Natural Language Processing (NLP), and Neural Networks (NN) can provide rapid insights by identifying and mitigating the effects of an attack, with daily alerts, using a smart Intrusion Detection and Prevention System (IDPS) [11]. Figure 1 presents the architecture of an IDPS for an IoT network. The functionality of an IDS and an IPS is almost the same; the added capabilities of an IPS are perimeter defense, gateway monitoring, network packet inspection, and blocking of suspicious activity by comparison with known patterns. Both systems are designed to recognize potential security violations in the network [3].
A basic detection system relies on one of two principles, behavior analysis or pattern recognition, whereas a prevention system uses a signature mechanism to monitor suspicious network traffic and blocks inbound and outbound packets before they reach other resources. An IPS is an integrated component that combines firewall protection with multi-layer support and detection functionality [4].

A. CONTRIBUTION
Our present survey focuses on Machine Learning (ML) and Deep Learning (DL) approaches for IDPS. Our main contributions are as follows.
• Comprehensive taxonomy: Our study provides a detailed taxonomy of intrusion detection and prevention systems in IoT that use machine learning and deep learning techniques, based on a systematic literature review.
• Performance analysis: We analyze the performance of the latest ML- and DL-based IDPS models in terms of accuracy and note their limitations.
• Prevention techniques: Our study explores various prevention techniques, mitigation strategies, and the methods implemented for IPS in IoT.
• Risk analysis: We propose a risk factor analyser that identifies the level of risk and triggers a countermeasure, mitigating the effects by improving security control in the manufacturing unit.
• Hybrid framework: We propose a hybrid framework that avoids the disadvantages of anomaly-based and signature-based techniques and applies the risk factor according to the complication levels.

B. ORGANIZATION OF THE PAPER
We organize the rest of the paper in the following sections. Section II highlights the importance of security in IoT application with a focus on issues, attacks, vulnerabilities caused and the relevant measures. Section III discusses the detailed taxonomy of the detection systems with their pros and cons in the real time applications. Section IV reviews various detection techniques developed using machine learning techniques in the recent years highlighting their features, techniques, and performance. Section V shows some recent IPS models based on ML techniques. Section VI explores various detection models developed using deep learning techniques proposed for IoTs; it focuses on various supervised and unsupervised techniques and discusses the issues extending to future scope. In Section VII, we conduct a systematic literature review on prevention models. Further, we explore the detailed analysis on various prevention techniques developed using ML and DL methods. Section VIII provides a risk factor analyser, using mapping technique, and a hybrid IDPS framework. Section IX provides a comparative study on the available techniques, and existing surveys in the direction of IDS. Finally, we draw the logical conclusion in Section X.

II. IoT-SECURITY
With the increasing number of devices and the growing sophistication of attack tools, hacking and security breaches have grown without limit. Burgeoning technologies such as the public cloud, IoT, and artificial intelligence have paralyzed standard security measures [5]. IoT connects anything and anyone, at any place, and provides smart services over a secured network platform. IoT applications span a wide range, including smart health monitoring, traffic congestion, smart cities, waste management, logistics and emergency services, and smart industrial and retail controls.
IoT establishes a heterogeneous, pervasive network of smart devices. Some complex IoT devices interact with hostile interfaces, are developed on uncontrolled platforms, and expose the individual systems of the integrated network to vulnerabilities [6]. A lack of interoperability and accessibility in this vast heterogeneous landscape results in poor monitoring of the security mechanism in IoT networks. We list various IoT attacks and countermeasures in Table 1, together with the device vulnerabilities and suggested measures for each. Scalable solutions minimize the use of resources and improve performance, enabling effective decisions that mitigate anomalies in the system [7]. A strong security system is required to protect the system from unexpected threats, maintain confidentiality, stabilize the network connection, control network traffic, and avoid attacks on vulnerabilities. Three major security problems of IoT, namely taking control, stealing information, and disrupting services, create dangerous issues and data threats for IoT users.
IoT connects devices to the Internet backbone, and many interactive and efficient applications use this connectivity to enhance network services [8]. Depending on the application and implementation, these devices collect huge volumes of private and confidential data in multiple categories. Ensuring high-level security for the data sent by IoT devices, both in transit and at rest, is the primary intention of a security control system in an IoT. Wurm et al. [8] highlight the security vulnerabilities associated with industrial and consumer IoT devices. The perception layer is expected to carry the highest security risk because its environment is more hostile and open than that of the other network layers [9]. In the next section, we describe the security risks targeting IoT devices and suggest some potential mitigation measures, which can help manufacturers strengthen their security design in the future. Risk analysis also serves to mitigate the actions performed by intruders and create a secured network framework [10]. Common steps for creating a risk analysis model are attack and risk identification, prioritization of the categories, selection of suitable mitigation strategies, and adaptation of a mitigation solution to the problem [10]. We mention some of the solutions for avoiding intrusion activities below.

III. IDS FOR IoTs
The open network architecture, heterogeneous device structure, and the drastic use of smart connected devices, in our daily life are leading to serious security and privacy issues [11]. The destruction of water utility pumps in industrial IoT, personal data theft [12], generating false messages as the legitimate users [13], unauthorized control over power stations, smart cars, smart restaurants, and manipulation of private information to block regular services are some of the examples of dangerous threats created in the IoT environment in the recent past [14]. Therefore, a comprehensive and distinct security mechanism is very much required to protect the digital world and secure it from serious security threats [15]. Several research proposals are available in different dimensions for securing the IoT devices, some of them include secured frameworks, privacy protection models, and authentication techniques [14], [15], [16]. However, to address these challenges and ensure effectiveness and applicability two major factors can be considered [17]. First, to identify and authenticate the devices and limit the controls for external access for sophisticated security management with real-time monitoring. Second, to coordinate the open network connectivity and ensure the security in a collaborative network [15].

A. IDS TAXONOMY
IDS is an intelligent security system for coordinating host and network activities. It analyzes the packets transferred through the network, finds suspicious events, and raises an alert notification. Figure 2 displays the IDS classification into two major categories: network-based and host-based detection systems. IPS has similar classifications, network-based and host-based prevention, and additionally uses Network Behaviour Analysis (NBA) and monitors abnormal activities using a Wireless Intrusion Prevention System (WIPS). The taxonomy focuses on the IDS and IPS techniques used to detect malware, namely anomaly-based and signature-based detection methods. Figure 2 also shows the machine learning and deep learning techniques suitable for each IDS category, giving an idea of the models developed in recent times.
IDS have gained immense attention with multiple notable models proposed for creating an intense security structure [16], [17], [18] due to ever-increasing zero-day patterns of network traffic and their heterogeneity. In this context, our study investigates the novel challenges to explore potential solutions to address the issues in the detection models. In specific, we emphasize the challenges of the available detection systems concerning performance, bandwidth utilization, time taken for detection, overload of processors, etc. The study also focuses on the accuracy, false positive, and negative rates of the proposed models by highlighting the future directions.
The implementation of IDS on real-time devices is limited by the applications used, the data transferred, and the area of the network [19]. IDS has numerous advantages over traditional firewall protection but has a critical weakness in reducing false alarm rates. At the same time, not all IDS procedures are alike; each category has unique qualities in tracking and defending against policy breaches [19]. Machine learning and deep learning techniques are grouped under supervised and unsupervised learning models. These techniques are mostly used for fraud detection, risk assessment, image classification, and spam filtering [38], [39], [40].

B. NETWORK BASED INTRUSION DETECTION SYSTEM (NIDS)
Generally, a NIDS is placed near a firewall, with an independent sensor device dedicated to monitoring local network traffic. It identifies malicious events in incoming packets, such as denial-of-service attacks and port scans on the network. The system resides at the network ports and works with a firewall for better protection against known attacks [20]. NIDS takes two forms: network-node-based NIDS and promiscuous-mode-based NIDS. A node-based NIDS uses distributed agents, each analyzing the packets bound for a single destination. A promiscuous-mode-based NIDS, on the other hand, sniffs all packets across the network traffic and analyzes them for suspicious attempts, with a single sensor on each segment [20]. A NIDS is set up at a selected point, such as a subnet within the network, to examine and match the passing traffic. It then analyzes the packets and raises an alert if a violation is found [21]. The sensors activate the interfaces for managing, controlling, and receiving alerts and forward them to the central server. NIDS applications are attached to the network through two interfaces: one monitors the network conversation, and the other controls and generates a report of the activity [21]. See Table 2.

C. HOST-BASED INTRUSION DETECTION SYSTEM (HIDS)
HIDS is an intelligent detection system that acts as an agent to inspect and report suspicious activities attempted on a host device. Its primary function is continuous observation of the dynamic behavior and state of the system, the storage area, the internal configuration, the network packets targeting the host, the programs executed, and the resources accessed [22]. Apart from this, the system analyzes the log files available on the host (kernel, system, server, and network), monitors file access and configuration changes at run time, and finally compares its findings with previous attacks stored on the server. IDS models developed for host-based detection are listed in Table 2.

D. SIGNATURE INTRUSION DETECTION SYSTEM (SIDS)
The signature-based detection technique looks for evidence known to indicate misuse, based on defined patterns [32], [33]. The SIDS procedure searches for a specific payload in a data packet, matches it against the existing patterns generated by the NIDS/HIDS, and registers it as a signature of misuse. The major limitation of this method is that it misses newly launched attacks, for which no signature exists yet. Intruders can easily deceive this method because the signatures are based on regular expressions: matching string content suits only fixed behavioral patterns.
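The matching step and its main limitation can be illustrated with a minimal sketch. The signature names and regular expressions below are hypothetical and chosen only for illustration; they are not taken from any of the surveyed systems.

```python
import re
from typing import List

# Hypothetical signature database (names and patterns are illustrative only).
SIGNATURES = {
    "sql_injection": re.compile(rb"union\s+select", re.IGNORECASE),
    "path_traversal": re.compile(rb"\.\./\.\./"),
}


def match_signatures(payload: bytes) -> List[str]:
    """Return the names of every signature whose pattern occurs in the payload."""
    return [name for name, pattern in SIGNATURES.items() if pattern.search(payload)]


# A payload that matches a stored signature is reported.
known_attack = match_signatures(b"GET /item?id=1 UNION SELECT password FROM users")

# A trivially obfuscated variant matches no stored pattern and slips through.
evasive_attack = match_signatures(b"GET /item?id=1 UNION/**/SELECT password FROM users")
```

The second call returns an empty list even though the request is equally malicious, which is the fixed-pattern weakness described above.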

E. ANOMALY INTRUSION DETECTION SYSTEM (AIDS)
Anomaly detection is based on observing behavior or activity and measuring its deviation from a normal baseline [34]. In a NIDS, an anomaly detection system detects intrusions at the physical network after traffic passes the firewall; in a HIDS, it is the last layer of protection at the endpoint and allows fine-tuned protection at the application level [35]. A major drawback of anomaly-based IDS is its high false-positive rate. A detection engine that handles multiple protocols must understand how each of them operates [36]. Although protocol analysis is expensive, it helps reduce the false-positive alarm rate. The research community is working to integrate many advanced techniques, such as statistical, cognition-based, machine learning, deep learning, and data mining-based methods, to develop better detection models [37]. Anomaly-based and signature-based detection are considered the two primary techniques for developing detection and prevention models. We explain the opportunities and challenges of each category in Table 3.
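As a minimal sketch of baseline deviation, the following example flags a traffic rate whose z-score against a learned baseline exceeds a threshold. The feature (packets per second), the sample values, and the threshold of 3 standard deviations are assumptions made for illustration; real anomaly detectors use richer features and models.

```python
from statistics import mean, stdev


def fit_baseline(samples):
    """Learn the normal baseline as (mean, sample standard deviation)."""
    return mean(samples), stdev(samples)


def is_anomalous(value, baseline, threshold=3.0):
    """Flag a value that deviates from the baseline by more than `threshold` sigmas."""
    mu, sigma = baseline
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > threshold


# Hypothetical packets-per-second readings observed during normal operation.
normal_rates = [98, 102, 101, 97, 100, 103, 99, 100]
baseline = fit_baseline(normal_rates)

flood = is_anomalous(500, baseline)        # far above the baseline
small_burst = is_anomalous(107, baseline)  # harmless burst, still flagged
typical = is_anomalous(104, baseline)      # within the normal range
```

The harmless burst is flagged as well, which illustrates why anomaly-based systems tend to produce false positives when the baseline is narrow.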

IV. MACHINE LEARNING TECHNIQUES SUITABLE FOR INTRUSION DETECTION
The main aim of ML is to allow computers to learn automatically, without human intervention or assistance, and to adjust their actions accordingly. Machine learning is used for large-scale data processing and is well suited to complex datasets with huge numbers of variables and features. The ML process begins by accepting training data and making observations on it, through direct experience or instruction, and results in output values. The algorithm should be selected so that it captures the data patterns, improves analytic and predictive power, and makes better decisions on future data. Machine learning techniques are broadly categorized as supervised learning, unsupervised learning, and reinforcement learning.
Supervised algorithms train on fully class-labeled data and establish the relation between the input and target units. Classification and regression are the two major categories of supervised learning. Some of the popular classification algorithms are Support Vector Machine (SVM) [38], Naïve Bayes [39], Nearest Neighbour [49], Neural Network [44], Discriminant Analysis, and Logistic Regression [40]. The algorithms in the regression category most prominently used for intrusion detection analysis include Linear Regression, Support Vector Regression (SVR), Ensemble methods, Decision Tree (DT) [50], and Random Forest [51]. Unsupervised learning techniques find the hidden structure in unlabelled data without labelled training examples. Reduction and clustering are the two major techniques used to form relevant groups for comparison and compression with unique identification. Some of the popular clustering algorithms are K-Means and C-Means. Singular Value Decomposition (SVD) and Principal Component Analysis (PCA) are the popular feature reduction techniques.
We list the properties, advantages, and issues of machine learning approaches in IDS in Table 4. It emphasizes the need for and importance of each technique in the detection process. The table also gives a view of the trend in machine learning approaches to help future IDS developers choose the appropriate technique.
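The difference between the two families can be sketched on a toy dataset. The example below pairs a k-nearest-neighbour classifier (supervised, needs labels) with a basic K-Means loop (unsupervised, uses only the points). The two features (packets per second, mean packet size), the sample values, and the initial centroids are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, x, k=3):
    """Supervised: majority label among the k labelled points nearest to x."""
    nearest = sorted(zip(train_X, train_y), key=lambda pair: math.dist(pair[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def kmeans(points, centroids, iterations=10):
    """Unsupervised: group unlabelled points around the given initial centroids."""
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            idx = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            clusters[idx].append(p)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty).
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster)) if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, clusters


# Hypothetical flows: (packets per second, mean packet size in bytes).
train_X = [(10, 200), (12, 220), (11, 210), (300, 60), (320, 64), (310, 58)]
train_y = ["normal", "normal", "normal", "attack", "attack", "attack"]

label_fast = knn_predict(train_X, train_y, (305, 62))
label_slow = knn_predict(train_X, train_y, (9, 205))
centroids, clusters = kmeans(train_X, centroids=[train_X[0], train_X[3]])
```

K-Means recovers the same two groups without seeing `train_y`, but it cannot name them; deciding which cluster is the attack class still requires outside knowledge.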

V. REVIEW ON ML-BASED IDS MODELS FOR IoT
The machine learning algorithms that most often achieve good results in detecting suspicious activities in IDS are decision trees, random forest, SVM, and neural networks. The accuracy of the models and the efficiency of the algorithms depend on the application and the type of attack detected. Some of the proposed models perform well only for binary class detection, and some are good at identifying multi-class attacks [37]. Many researchers focus only on the overall detection accuracy, but the detection effect for small-scale data is often very low. Considering the imbalance between the research done and real-time applications, we present some of the popular machine learning models for IDS. Many of the traditional techniques are evaluated on popular intrusion datasets such as KDD99, NSL-KDD, UNSW-NB15, and CSIDS. A single-view model results in incomplete pattern identification, especially for large datasets. As multi-view learning models are highly popular for detection, Dinesh chowdary et al. [43] propose Multi-View Federated-based Learning for Intrusion Detection (MV-FLID). It learns from different data views and delivers the most distinguished prediction. Federated learning benefits peer learning and protects profile aggregation. The authors in [45] propose seven traffic-based pre-processing techniques for ML algorithms, evaluated using scaling and normalization functions. They apply the models to four feature categories: content, statistical properties, basic, and traffic connectivity. The results of the study show that applying the categorical study enhances performance to 45%, comparatively. This helps in proper classification based on the parameters related to possible attacks. Dhanke JyotiAtul et al. [47] propose the Energy Aware Smart Home (EASH) framework, tested on real-time sensor data for selected IoT devices.
The study experiments with J48, Naive Bayes, Multi-Layer Perceptron (MLP), and multinomial logistic regression for classification and detection of anomalies. Among all the techniques, MLP has the highest accuracy, with the capability of self-learning and recognizing minute factors. We discuss some of the popular models developed in recent years for mitigating intrusion issues in the IoT environment in Table 5.
All the above-mentioned techniques are evaluated under two scenarios; first, under the assumption that both the training and testing data are of the same source and second, the testing samples are new and unknown patterns. This type of process helps us to understand the patterns of IDS in handling new malicious patterns. Testing on unknown patterns is very essential for new IDS models and helps in tracing the intruders who escape from the security control. The results in Table 5 show that the supervised ML techniques have better accuracy than the unsupervised models in some cases. Among these algorithms, decision tree and random forest have achieved the best results with 99% accuracy and low false rates. If there are unseen attacks in the test data, then the detection rate of supervised models decreases, as the patterns are not registered while training the data. This is where the unsupervised models have a better hold in performance as they do not show a significant difference in accuracy for known and unknown patterns.
According to the results in Table 5, random forest and K-Nearest Neighbour (KNN) models show high accuracy compared to the other classification techniques [42], [49]. Many of the models integrated with federated learning and/or self-learning methods show competitive performance against the traditional methods [43], [47]. Multi-layer frameworks [52], [55] with different levels of testing have more impact, because the data is filtered multiple times and the identification becomes much stronger with clustering techniques [52], [54]. The recent research trend is to experiment with multiple models and identify the most suitable one. Following this concept, Verma et al. [38] experiment with six machine learning techniques for intrusion detection: AdaBoost, random forests, gradient boosted machine, extremely randomized trees, classification and regression trees, and multi-layer perceptron. All these models are tested on the CIDDS-001, UNSW-NB15, and NSL-KDD datasets, and the results show that supervised techniques achieve better performance. Jinxin Liu [39] examines eleven machine learning techniques, including Decision Tree, Matthews correlation coefficient (MCC), XGBoost, Bagging Tree, Random Forest, Bayes Net, Support Vector Machine, Naïve Bayes, AdaBoost, Expectation-Maximization, DBSCAN, and K-Means. The study focuses on seven attack categories: SynFlood, Land, UDP Flood, Ping of Death, Smurf, IP sweeping, and Port Scan. The XGBoost model achieves the highest performance, with 0.970 accuracy and 0.968 recall. Bagging and SVM rank next and perform better than RF and DT. NB classification has the lowest result among the examined techniques, with 0.452 accuracy.
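The accuracy, recall, and false rates quoted in this section all derive from the same four confusion-matrix counts. The sketch below shows the standard definitions; the label encoding (1 for attack, 0 for normal) and the sample vectors are assumptions for illustration and do not reproduce any result from Table 5.

```python
def detection_metrics(y_true, y_pred):
    """Compute common IDS metrics from binary labels (1 = attack, 0 = normal)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # attacks detected
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # normal traffic passed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed attacks
    attacks, normals = tp + fn, fp + tn
    return {
        "accuracy": (tp + tn) / len(pairs),
        "recall": tp / attacks if attacks else 0.0,
        "false_positive_rate": fp / normals if normals else 0.0,
        "false_negative_rate": fn / attacks if attacks else 0.0,
    }


# Hypothetical ground truth and predictions for eight flows.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
metrics = detection_metrics(y_true, y_pred)
```

Here accuracy is 0.75 while one attack in three is missed, which shows why overall accuracy alone can hide poor detection of a minority attack class.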

VI. DEEP LEARNING BASED INTRUSION DETECTION SYSTEM
For security applications, deep learning techniques, with their remarkable capacity for self-learning, are beneficial for developing intrusion detection models. These models yield low false rates and high accuracy compared to traditional machine learning techniques. The standard Neural Network (NN) architecture is a multi-layer perceptron built as a linear stack of layers. We show a simple NN designed with input, hidden, and output layers in Figure 3. Raw data in the form of numbers, images, or audio are fed into the neurons as inputs, represented by $x_1, x_2, x_3, \ldots, x_n$. Each input is multiplied by a weight ($w_1, w_2, w_3, \ldots, w_n$) and passed to an activation function. An activation function (a step function in the simplest case) maps the input signals to an output signal, which the neural network needs in order to function. A fully connected network model with more than three hidden layers is considered a Deep Neural Network (DNN). The feed-forward algorithm begins at the input layer and moves forward, updating the state of each unit by multiplying the inputs by the weights and adding the bias, and terminates at the output layer when all units are updated.
$$z = f\left(\sum_{i=1}^{n} w_i x_i + b\right) \qquad (1)$$
In Equation 1, x represents the inputs, w represents the weight applied to each input, z is the output, b represents the bias, and f represents the activation function. The model adjusts the weights and repeats the task to improve the accuracy using back propagation. GAN are generative DL techniques, and the combination of both is considered as an ensemble technique. We discuss some of the DL techniques, their importance for IDS, and the issues in Table 6.
Yazan et al. [56] propose a Spider Monkey Optimization (SMO) algorithm for dimensionality reduction, evaluated on popular IoT datasets including KDD99, NSL-KDD, BoT-IoT, and CICIDS-2017. It achieves higher accuracy than several existing approaches. Thamilarasu G. et al. [64] propose a three-layer framework with a network connection phase, an anomaly detection phase, and a mitigation phase to identify, analyse, and reduce the risk factor using CNN techniques.
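The feed-forward step, built from the weighted sum, bias, and activation function described for Equation 1, can be sketched in a few lines. The sigmoid activation, the layer sizes, and all weight values below are assumptions chosen so that the arithmetic is easy to follow; they do not come from any surveyed model.

```python
import math


def sigmoid(v):
    """A common smooth activation function; used here as the assumed f."""
    return 1.0 / (1.0 + math.exp(-v))


def neuron(x, w, b, f=sigmoid):
    """One unit: z = f(sum_i w_i * x_i + b)."""
    return f(sum(wi * xi for wi, xi in zip(w, x)) + b)


def feed_forward(x, layers, f=sigmoid):
    """Propagate x through layers given as (weights, biases), one weight row per unit."""
    for weights, biases in layers:
        x = [neuron(x, w, b, f) for w, b in zip(weights, biases)]
    return x


# Hypothetical network: 2 inputs -> 2 hidden units -> 1 output unit.
layers = [
    ([[0.5, -0.25], [1.0, 1.0]], [0.0, -3.0]),  # hidden layer
    ([[1.0, -1.0]], [0.0]),                     # output layer
]
output = feed_forward([1.0, 2.0], layers)
```

With these values every weighted sum is zero, so each unit outputs sigmoid(0) = 0.5. Training by back propagation would adjust `layers` to reduce the prediction error; that step is omitted here.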

VII. LITERATURE REVIEW ON INTRUSION PREVENTION SYSTEM
An Intrusion Prevention System (IPS) monitors the network and identifies abnormal activity with traditional techniques. An IPS prevents similar attacks from recurring by closing access points, terminating the TCP session, reprogramming firewalls, and removing traces of the attack from payloads, headers, and infected files. IPS applies signature-based, anomaly-based, and stateful protocol analysis to network-based and host-based intrusion identification. From an implementation perspective, IDS and IPS are generally configured together and complement each other, forming an Intrusion Detection and Prevention System (IDPS). Available IDPS techniques fall short in detecting dynamic attacks on complex network structures. Probabilistic learning [77], fuzzy logic for high-density attacks [78], risk factor analysis with C4.5 Decision Tree algorithms [79], genetic techniques [80], clustering [81], and regression-based analysis of features and their impact [82] are some of the approaches used in intrusion prevention models. All these techniques are used to build a data-driven prediction model or a robust detection model that lets a feasible network prevent intrusion and security breaches.

A. ML-BASED PREVENTION MODELS FOR IoT
A recent work experiments with interception, injection, and denial-of-service attacks and finds its IPS to be immune to them [83]. It uses K-Means after removing the outliers and integrates the Local Outlier Factor (LOF) algorithm to compute a score reflecting the abnormality of the observations. Tree Automata based on Automatic Approximations for the Analysis of Security Protocols, abbreviated as TA4SP, processes the intruder knowledge using a regular tree language [84]. Nikhil et al. [85] propose an integrated technique for prediction and prevention in the agriculture sector with smart connected devices. The experiment is conducted on real-time agriculture data collected by sensor devices and processed using machine learning and deep learning techniques. It uses Support Vector Clustering (SVC) to analyse and predict crop suitability based on soil condition, weather, rain estimation, ultrasonic, and infrared rays. A CNN is trained with three sample animal images to prevent the damage that physical intrusion causes to the crops. USB camera inputs are compared with the existing images using signature-based detection, and an email notification with an alarm is raised to avoid harm to the ecosystem [85]. Seo et al. propose a two-level hybrid detection and prevention technique [86]. It uses the random forest method and evaluates the decision tree for statistical analysis. If the ratio is less than zero, the packets are forwarded; otherwise, the packets are dropped. The best features identified in level one pass to the next level, where anomaly detection is applied, suspicious events are traced, and the corresponding packets are dropped. The experiment is conducted on the UNSW-NB15 and CICIDS2017 datasets. The model achieves 99.80% accuracy in the second level of detection. Werth et al. [87] propose a layer-based prevention technique that simulates a physical system based on the payloads of the packets.
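The LOF score mentioned for [83] compares the local density of a point with the densities of its neighbours. The sketch below is a simplified, textbook-style LOF written for illustration, not the implementation used in [83]: it takes exactly k neighbours per point (ties are not expanded), assumes there are more than k points, and does not handle duplicate points. The sample coordinates are hypothetical.

```python
import math


def k_nearest(points, i, k):
    """Indices of the k points closest to points[i], excluding i itself."""
    others = [j for j in range(len(points)) if j != i]
    return sorted(others, key=lambda j: math.dist(points[i], points[j]))[:k]


def lof_scores(points, k=2):
    """Return one LOF score per point; values well above 1 suggest an outlier."""
    n = len(points)
    neighbours = [k_nearest(points, i, k) for i in range(n)]
    # k-distance: distance from each point to its k-th nearest neighbour.
    k_dist = [math.dist(points[i], points[neighbours[i][-1]]) for i in range(n)]

    def reach_dist(i, j):
        # Reachability distance of point i with respect to neighbour j.
        return max(k_dist[j], math.dist(points[i], points[j]))

    # Local reachability density: inverse of the mean reachability distance.
    lrd = [k / sum(reach_dist(i, j) for j in neighbours[i]) for i in range(n)]
    # LOF: mean neighbour density divided by the point's own density.
    return [sum(lrd[j] for j in neighbours[i]) / (k * lrd[i]) for i in range(n)]


# Four observations forming a tight group and one far-away observation.
observations = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
scores = lof_scores(observations, k=2)
```

The four grouped observations score 1.0, while the isolated one scores far above 1, so a threshold on the score separates it as abnormal.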
An additional contribution of Werth et al. [87] is an exploration of various threat models and their consequences. The technique uses three layers: layer zero for physical devices, layer one for the ladder logic program, and layer two to activate the internal states of the ladder logic program. A change of pattern in a layer indicates malicious activity [87]. Serial connectivity in the network is characteristic of a prevention system; this may lead to potential communication issues. Hui li et al. [88] introduce an ML technique using SVM in the Snort IDS to minimize the error rate and improve performance. Combining this model with a firewall yields a high defensive ability. The proposed IPS is implemented as a two-floor classification: the first floor identifies the possibility of an intrusive event; if suspicious activity is registered, the packet passes to the second floor, which classifies the category of the attack; otherwise, the system moves on to the next packet. Built-in resources such as Netfilter/iptables are used to build the prevention system for inline Snort.
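The control flow of such a two-floor design can be sketched as follows. This is only an outline of the staging logic: the simple threshold rules stand in for the trained SVM classifiers of [88], and the packet fields, thresholds, and category names are hypothetical.

```python
from typing import Optional


def first_floor_is_suspicious(packet: dict) -> bool:
    """Stand-in for the first classifier: is an intrusive event possible?"""
    return packet["packets_per_second"] > 1000 or packet["failed_logins"] > 5


def second_floor_category(packet: dict) -> str:
    """Stand-in for the second classifier: which attack category is it?"""
    if packet["failed_logins"] > 5:
        return "brute_force"
    return "flooding"


def inspect(packet: dict) -> Optional[str]:
    """Return an attack category, or None when the packet passes the first floor."""
    if not first_floor_is_suspicious(packet):
        return None
    return second_floor_category(packet)


def filter_stream(packets):
    """Forward clean packets and drop suspicious ones together with their category."""
    forwarded, dropped = [], []
    for packet in packets:
        category = inspect(packet)
        if category is None:
            forwarded.append(packet)
        else:
            dropped.append((packet, category))
    return forwarded, dropped


stream = [
    {"packets_per_second": 40, "failed_logins": 0},
    {"packets_per_second": 5000, "failed_logins": 0},
    {"packets_per_second": 30, "failed_logins": 12},
]
forwarded, dropped = filter_stream(stream)
```

Only packets flagged on the first floor pay the cost of the second classification, which keeps the inline path short for normal traffic.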
A generic IDPS based on the M2M standard, using an edge ML technique with a three-level detection and prevention module, is proposed by Chaabouni et al. [89]. The first level acquires the data and selects the best features; the second level classifies the packets based on known patterns to identify the normal and attack classes. In the final level, the attack packets are classified into the flooding or amplification class so that relevant actions can be taken and the patterns in the database updated. Constantinides et al. discuss a prevention framework with incremental phases based on the input levels, named Self-Organizing and Incremental Neural Network Winner-Takes-All Support Vector Machine (n-SOINN-WTA-SVM) [91]. After initializing the weights and biases, the model finds the nearest input value and determines the first and second winners. The signature patterns are matched and inserted between the classes, and the availability of the second winner is checked. If no traces are found, the process is restarted; otherwise, the old edges are deleted and the model proceeds to multi-class classification. Chandre Pankah et al. [92] propose a classification-based prevention technique using five machine learning techniques and one deep learning technique. It uses Support Vector Machine, random forest, k-nearest neighbors, Naïve Bayes, and Decision Tree from the machine learning category and, for comparison, a Convolutional Neural Network (CNN). The CNN outperforms the SVM, as neural network models are comparatively more capable on larger datasets. Apart from the techniques mentioned above, numerous other security solutions are available to prevent network intrusion or illegal access in IoT environments. The work in [93] proposes a biometric-based smart locking system that allows only authorized people into the house. It can also be used to gain access if keys are lost, or by people with disabilities. 
A circuit-based Secure Vehicle Operating System is proposed in [94]; it provides monitoring and control through mobile tilting and sends messages via Google Assistant for network authentication. Raghavendra et al. propose a Least square Bolster-based support vector machine prevention technique with two segments [95]. A half and half component is used to remove the redundant information in the upper level. It uses the wrapper method to select the relevant features for the classification in the lower level. After the attack is classified, the features with a high impact on the classification are observed in order to block the related entries and prevent intrusions. Akhil et al. propose a multi-layer perceptron with SVM for the detection of DoS, Probe, R2L, and U2R attacks [96]. An internal script considers features such as the IP address and the port number to prevent the attacks. A Discriminate Deep Belief Network (DDBN) based detection and prevention technique with local and non-local regularization is proposed in [97]. The model is tested on two popular datasets with Hopfield, SVM, generative adversarial network (GAN), and Deep Belief Network-Random Forest (DBN-RFS) classifiers. Various parameters are changed while developing the prevention technique to reduce the time taken to detect the attack category. It is observed that the running time decreases as the number of hidden layers in the model increases. Balamurugan et al. propose a two-phase detection and prevention technique for a real-time cloud dataset using three elements: Cloud Controller (CC), Trust Authority (TA), and Virtual Machine Management (VMM) [98]. The CC monitors the packets, migrates them to idle cloudlets if the traffic is heavy, and scrutinizes each packet based on arrival time confidence levels and the packet count using header information. A Normalized K-means (NK) Recurrent Neural Network model (NK-RNN) is used to classify the intruder packets available in the VMM. 
A queue modelling technique is used to discard the intruder packets. Finally, these packets are blocked from the network to avoid future intrusions [98].
A Software Defined Network (SDN) based IDPS for IoT networks proposed by Amir Ali et al. [99] uses a three-tier framework. The first tier performs user validation in the IoT layer, and the second tier performs packet validation in the data plane layer using fuzzy filtering methods to classify the attack records. Finally, the third tier performs flow validation in the control plane layer for detection and prevention. The control layer is integrated with a CNN and Deep Packet Inspection (DPI) for detecting and predicting the attack values. The model is compared with SVM, ANN, fuzzy, and other ML techniques and results in a false rate of 1%. A hybrid model for MANETs, combining the Bootstrapped Optimistic Algorithm for Tree Construction (BOAT) and an Artificial Neural Network for classification with a One Way Hash Chain (SHA-256) for prevention, is proposed in [100]. The major components of the model are a packet analyzer using a fuzzy controller, data pre-processing using logarithmic and linear normalization, feature extraction using a mutual information function to select the optimum feature set, and classification using an Association Rule Tree (ART) [100]. The input data is considered based on the breaches caused by three test cases framed on confidentiality, authentication, and access control [101]. A risk analysis model is proposed by James et al. to prevent attacks at various levels [101]. The initial level identifies risk based on the event and the defined relations. The model then prioritizes the events and evaluates and ranks the risk factors. It chooses a mitigation strategy based on the risk connection and the common cause of the threat. Finally, it checks the feasibility and implements a suitable solution, tracking the performance with regular monitoring.
We summarize various machine learning and deep learning techniques for IPS in Table 8. The table also lists the datasets on which the techniques are evaluated, the mitigation strategies, and the results in terms of prevention time and detection accuracy.

VIII. OPPORTUNISTIC SOLUTIONS
Continuous network monitoring and defense are essential to network security for predicting and avoiding malicious activity. Traditional detection systems monitor and alert when a suspicious event occurs, whereas prevention systems take a relevant action when malware is detected. Based on the models and theories developed for detection, and anticipating the importance of assessing risk and taking significant actions, we propose a mapping technique. It evaluates the event type, analyzes the risk factor, and suggests a mitigation strategy. Identifying intrusions, providing early warning, and thwarting the next action are essential for IoT network structures. The system must actively classify and analyze the risk factor to distinguish suspicious packets and trigger the prevention technique. An IPS is an inline product that focuses on identifying and blocking attacks in real time. Considering this, we propose a risk factor analysis using a mapping technique to identify and classify suspicious and malicious events and rate the level of risk, as described in Section VIII-A.

A. RISK FACTOR ANALYSIS
The proposed approach is expected to increase the accuracy of the model, with three strategic layers for detection, prediction, and mitigation. Furthermore, we combine our mapping technique with a hybrid IDPS framework for accurate identification and recognition of the threat. The mapping is divided into three phases, as defined in Figure 4. In Figure 4, the data flow for normal packets is indicated with plain arrows, the suspicious event flow with dashed arrows, and unknown patterns with dark arrow lines.
In phase one, the detection phase, changes in behavior patterns are captured and packets are classified as suspicious or malicious. In phase two, risk factors are analyzed by matching the packets with the known attack patterns, and the packets are then classified as normal, known, or unknown attack types. Phase three, the mitigation strategy, rates the risk factor as high, medium, or low. Thus, the active response to the event is used to analyze the network traffic in real time. This triggers an action (block, allow, or log) to mitigate the network complication or to block the process associated with the event.
Overall, the risk factor identification helps in summarizing the following solutions for three cases:
1) Case one: A suspicious event that shows no further attack variations is considered normal activity with a low risk rate and is allowed to proceed.
2) Case two: A suspicious event traced to known signature patterns is assigned a medium risk rate, and logging is applied to recheck the authentication of the user.
3) Case three: A suspicious or malicious event that remains undetermined in the detection process and is categorized as unknown results in False Positive (FP) or False Negative (FN) values. Such cases carry a high risk factor and lead to blocking the process and mitigating the effects of the attack.
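The three cases can be read as a small decision table. The sketch below is one possible encoding, given only for illustration: the two boolean inputs, the type names, and the order in which the checks are applied are assumptions and are not specified in the mapping itself.

```python
from enum import Enum
from typing import NamedTuple


class Risk(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


class Action(Enum):
    ALLOW = "allow"
    LOG = "log"
    BLOCK = "block"


class Verdict(NamedTuple):
    risk: Risk
    action: Action


def map_risk(matches_known_signature: bool, has_attack_variation: bool) -> Verdict:
    """Map a suspicious event to a risk level and a response (hypothetical inputs)."""
    if matches_known_signature:
        # Case two: known pattern, medium risk, log and recheck authentication.
        return Verdict(Risk.MEDIUM, Action.LOG)
    if not has_attack_variation:
        # Case one: no further attack variation, treated as normal activity.
        return Verdict(Risk.LOW, Action.ALLOW)
    # Case three: unknown pattern, high risk, block the associated process.
    return Verdict(Risk.HIGH, Action.BLOCK)
```

For example, `map_risk(False, True)` corresponds to case three and yields a high risk rating with a block action.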

B. FRAMEWORK FOR FOUR LEVEL SECURITY STRUCTURE
The features required to develop an effective IDPS model are high application-level analysis, active threat identification, and an integrated prevention model with sophisticated response capability. The research community is keen on providing multiple detection models and frameworks to mitigate external threats, and many of the models focus on signature-based detection and prevention methods. Many of the methods discussed above fall short in identifying unknown patterns and handle zero-day attacks poorly; they also fail to prevent insider intrusion threats. Recent research indicates that deploying a hybrid model for detection and prevention results in better performance. Figure 5 presents a four-level security framework model that combines anomaly and misuse-based detection. This approach extends the research proposed by Stiwan et al. [2]. The study enhances the mapping procedure and briefly covers the hybrid techniques. Another hybrid detection model, proposed by Yu et al. [102], combines the immune system with neural network techniques. The study places more emphasis on accurate detection with self-learning techniques. All the above-discussed models improve performance and accuracy but fall short in reducing the false rate. Considering this, our framework integrates detection, prevention, and risk factor analysis. The main aim of the framework is to integrate both anomaly and signature-based detection, to handle zero-day attacks, and to avoid insider intrusions with behavioral matching strategies. The framework has four key elements to avoid security violations. The first level of security authenticates the network packets with credentials and proceeds to the pre-processing techniques. This level normalizes the data packets and extracts the required features using dimensionality reduction techniques. 
A two-level detection is implemented in this process using anomaly and signature-based detection methods. The complete dataset with all collected features is observed for variation in behavior using anomaly detection techniques. At the same time, selected features are matched with predefined signature patterns to find malicious activity under level two. Finally, if any suspicious event is observed, the risk factor analysis is activated and performs the required action based on the level of risk identified. If no threat is detected, the packet is sent back to the network for the regular procedure.
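The packet flow through the framework can be sketched as a single function. This is a toy illustration under stated assumptions, not the framework's implementation: the token set, the signature names, the baseline rate, and the threshold are hypothetical values, and a simple rate check stands in for the anomaly detection technique.

```python
from typing import Any, Dict, Set

# Hypothetical values chosen only for illustration.
TRUSTED_TOKENS: Set[str] = {"device-a", "device-b"}
KNOWN_SIGNATURES: Set[str] = {"syn-flood", "port-scan"}
BASELINE_RATE = 50.0      # assumed normal packets per second
ANOMALY_THRESHOLD = 3.0   # assumed multiple of the baseline treated as anomalous


def process_packet(packet: Dict[str, Any]) -> str:
    """Return 'reject', 'forward', 'log', or 'block' for one packet."""
    # Authentication: packets without valid credentials go no further.
    if packet.get("token") not in TRUSTED_TOKENS:
        return "reject"
    # Pre-processing: normalize the rate feature against the baseline.
    rate_ratio = float(packet.get("rate", 0.0)) / BASELINE_RATE
    # Detection: behavior check (anomaly) and pattern check (signature).
    anomalous = rate_ratio > ANOMALY_THRESHOLD
    known = packet.get("pattern") in KNOWN_SIGNATURES
    if not anomalous and not known:
        return "forward"  # no threat detected, back to the network
    # Risk factor analysis: known pattern is logged, unknown anomaly is blocked.
    return "log" if known else "block"
```

The last step reuses the risk mapping of Section VIII-A, where known patterns lead to logging and unknown patterns lead to blocking.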

IX. SYNOPSIS OF ML-BASED AND DL-BASED IDS/IPS METHODS
ML and DL techniques reduce human intervention and automate detection in a short time. DL models are more compatible with large datasets and complex structures than ML techniques. ML techniques are mostly used for signature-based intrusion detection, which acts according to stored patterns. On the other hand, DL is capable of self-learning; hence, it is better suited to anomaly detection. Analyzing and detecting attacks based on behavior helps in handling zero-day vulnerabilities. Although ML techniques require less computational power, DL techniques are faster than ML techniques.
The multidimensional compatibility of DL techniques, which can be trained and tested on image, audio, video, and sequential data, gives them a unique priority for developing new innovations. Figure 6 provides an overall summary of the current study. This study covers only the most recent methods developed using ML and DL techniques between 2018 and 2021. In Figure 6, we first present the various malware attacks and mitigation techniques drawn from the literature review. Because IDPS is the primary goal of the study, we then summarize the various IDS and IPS techniques discussed in the study. Finally, we list the ML and DL techniques discussed in the review in Section VI. Figure 6 also gives a brief overview of the vulnerabilities caused by attack variants, along with a list of available solutions, which is required to develop a unique model in the future. Our present study emphasizes various ML and DL techniques and the mitigation strategies evaluated from the models as a road map for future research. In the following, we compare the existing surveys on IDS/IPS, note the highlights of our study, and pose some research questions for researchers to address.
A. COMPARISON WITH EXISTING SURVEYS
Table 9 and Table 10 provide a comparative summary of the parameters covered by recent research articles on IDS/IPS. We use Y in the tables to indicate that the given study describes the specific category, and N to indicate that it does not.
From the comparison, we see that most of the available studies provide a detailed IDS taxonomy that describes the types of IDS; they also provide a sub-classification based on area and application. Our study revolves around various categories of IDS with ML-based and DL-based techniques suitable for developing detection or prevention models.

B. HIGHLIGHTS OF CURRENT STUDY
Our work differs from the above-mentioned surveys in the following points.
• The present survey provides a detailed taxonomy of IDS and compares IDS with security services, whereas the above-mentioned surveys present the taxonomy and describe only selected modules with comparative analysis.
• Our survey explores the various techniques, methods, models, and frameworks proposed for IDS, along with their performance and accuracy. The existing surveys, on the other hand, provide either a comparative analysis of attacks and methods or the glitches faced by available methods over a limited period.
• Our study emphasizes various ML and DL proposals and models of IDS and IPS for IoT. The existing surveys are specific to issues such as data storage, physical (vehicle) security, and network-based IoT and IDS implementation.
• The study examines various intrusion prevention techniques and mitigation strategies with respect to machine learning and deep learning techniques. We observe that there are very few review articles on prevention techniques; all the above-mentioned articles are limited to techniques and models. Our study emphasizes the mitigation techniques.
• We propose a mapping technique for analyzing the level of risk and develop an effective prediction model framework to be used as a blueprint for future developments.
• We propose an integrated multilevel hybrid framework that combines signature and anomaly detection with risk factor mapping to identify all types of security threats. This framework is beneficial for the future development of IDS/IPS.

C. RESEARCH QUESTIONS
Developing accurate detection models and enhancing the security of IoT and its allied domains are prominent research directions at present. Our present survey explores more than 100 research papers related to IoT security. These papers propose different classifiers for intrusion detection. Our survey also presents a reasonable perspective of each model and provides a comparison of works in this field. We pose some research questions to provide insight into the future development of IDS/IPS.
• RQ-1: Are the available datasets compatible for research? Solution: The available datasets for intrusion detection do not follow standard features. Each dataset has different attributes depending on the network and application. Applying a common feature selection technique to all models before classification yields better results.
• RQ-2: What is the importance of feature reduction? Solution: A strong feature extraction technique should be implemented to remove irrelevant and redundant features during training; this improves the model performance.
To build a prevention model, it is very important to know the relations between the features and to analyze the behavior in order to control zero-day attacks.
• RQ-3: Which is the most suitable technique for feature extraction? Solution: Machine learning models are effective in feature selection, and deep learning models are effective in feature reduction. According to the study, the deep learning auto-encoder is a popular feature reduction technique. Apart from this, integrating multiple feature selection algorithms and working with the best possible features helps achieve accurate classification.
• RQ-4: Which is the best classifier, single or multiple? Solution: Single or baseline classifiers used in performance measurement can be replaced by hybrid or ensemble classifiers.
• RQ-5: What is the risk factor after applying the available models? Solution: Existing models are limited to binary or limited attack classification; the majority of the models use pattern recognition and signature-based techniques. Extending detection to a wide range of attacks would make it feasible to identify zero-day vulnerabilities, which has to be duly considered.
FIGURE 6. Synopsis of intrusion detection and prevention models.
• RQ-6: Which method is the most suitable for IoT? Solution: Lightweight and resource-compatible ad hoc network IDS are required that do not degrade the security requirements.
• RQ-7: How can the problem of false rates of the model be solved? Solution: Detection delays decrease the performance of the underlying networks and generate false rates. To achieve desirable detection accuracy with effective performance time, researchers should focus on model compression techniques.
• RQ-8: What is the impact of the models on real-time data? Solution: Real-time detection models provide early warning through alert messages and protect the system from threats and suspected activities. The existing detection models fall short in identifying zero-day attacks, produce high false alarm rates, and affect the response time of the model.
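The false rates referred to in RQ-7 and RQ-8 are commonly measured as the false positive rate, FP / (FP + TN), and the false negative rate, FN / (FN + TP). The following is a minimal sketch of that computation; the function name and the label encoding (1 for attack, 0 for normal) are assumptions made for illustration.

```python
from typing import Sequence, Tuple


def false_rates(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (false positive rate, false negative rate); 1 = attack, 0 = normal."""
    pairs = list(zip(y_true, y_pred))
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # normal flagged as attack
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # normal passed as normal
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # attack missed
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # attack detected
    # Guard against empty classes to avoid division by zero.
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    fnr = fn / (fn + tp) if (fn + tp) else 0.0
    return fpr, fnr
```

With four normal and two attack samples, one false alarm and one missed attack give a false positive rate of 0.25 and a false negative rate of 0.5.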

X. CONCLUSION
Our survey focuses on various research works revolving around IDS and IPS. We elaborate on the categories of intrusion detection and prevention based on methodologies and techniques, and provide a detailed analysis of each of the models. The use of machine learning and deep learning methods in IDS has also enhanced its performance. The presented survey analyzes the pros and cons of the methods to provide a pathway for researchers in this domain. We discuss the basis of IDS in various categories depending on architecture, positions, and functions. The various solutions for IDS are also classified based on the latest research works. We have proposed a risk factor analysis using mapping techniques with mitigation methods. Such a survey with a framework and prevention model is not yet available; therefore, our survey helps IDS and IPS designers conceptualize the progress path of IDS/IPS methods and technologies. A state-of-the-art comparison of IDS models is also given in the paper. Each ML and DL model is compared and explained through detailed tables. Finally, we have pointed out some research issues and proposed some solutions as research directions.