An Explainable and Resilient Intrusion Detection System for Industry 5.0

Industry 5.0 is an emerging transformative model that aims to develop a hyperconnected, automated, and data-driven industrial ecosystem. This digital transformation will boost productivity and efficiency throughout the production process but will be more prone to new sophisticated cyber-attacks. Deep learning-based Intrusion Detection Systems (IDS) have the potential to recognize intrusions with high accuracy. However, these models are complex and are treated as a black box by developers and security analysts because the decisions they make cannot be interpreted. Motivated by these challenges, this paper presents an explainable and resilient IDS for Industry 5.0. The proposed IDS is designed by combining bidirectional long short-term memory networks (BiLSTM), a bidirectional-gated recurrent unit (Bi-GRU), fully connected layers, and a softmax classifier to enhance the intrusion detection process in Industry 5.0. We employ the SHapley Additive exPlanations (SHAP) mechanism to interpret and understand the features that contributed the most to the decisions of the proposed cyber-resilient IDS. Evaluating the proposed model through explainability helps ensure that the model is working as expected. The experimental results based on the CICDDoS2019 dataset confirm the superiority of the proposed IDS over some recent approaches.


Danish Javeed, Student Member, IEEE, Tianhan Gao, Prabhat Kumar, Member, IEEE, and Alireza Jolfaei, Senior Member, IEEE

I. INTRODUCTION
THE FIFTH industrial revolution, also known as Industry 5.0, is considered a next-level advancement. Its goal is to combine the human expert's creativity with effective, intuitive, and explicit machinery to bring forth manufacturing solutions that are more user-friendly and resource-efficient than those of Industry 4.0 [1]. It provides a narrative intent to facilitate the users and organizations. Industry 5.0 is expected to have a huge impact on consumer technology, accelerating innovation and revolutionizing the way products are conceived, manufactured, and supplied to customers. Consumer Electronic (CE) devices play a crucial role in such an industry by collecting data from sensors and machines and by monitoring and controlling them remotely [2]. Consumer-grade sensors and cameras are used to capture environmental data, e.g., temperature, air quality, and humidity, which can then be utilized to optimize industrial operations. Additionally, CE devices like smartphones, wearables, and tablets function as interfaces between machines and operators, offering real-time data and alarms on the status of the equipment. As a result, academia, industry, and individuals are endeavoring to integrate rapid commercialization flow while paying slight attention to the safety and security of Industry 5.0 devices and networks [3]. For instance, autonomous robots in a manufacturing plant can be remotely hijacked and controlled by cybercriminals to intimidate the company.
Even with the availability of traditional security measures like authentication, encryption, access control, and data confidentiality, Industry 5.0 networks have proven vulnerable to network attacks, necessitating an extra layer of security. One commonly used strategy is to develop and deploy Intrusion Detection Systems (IDSs) for connected Industry 5.0 systems [4]. However, the variety of cyber attacks makes traditional IDS less effective. Thus, it is crucial to design an effective and reliable system in line with contemporary criteria. The IDS tracks online activity in real time and spots unusual behavior. In recent years, Deep Learning (DL)-based IDS has become a trending research topic worldwide, and researchers have proposed numerous DL-based IDSs to protect such industries against cyber threats. The authors of [5] proposed an IDS to identify threats and safeguard the network from them. However, for early detection, the proposed IDS must be updated frequently and should include the patterns and characteristics of new potential attacks.
DL-based IDS provides efficient performance. However, these models lack explainability and interpretability, i.e., an understanding of the underlying evidence in the data behind the prediction decisions of the designed model [6]. Consequently, their decisions lack trust, and their output cannot be further used to optimize the behavior and reasoning offered by the sophisticated algorithm. An Explainable AI (XAI)-based IDS gives methodical and comprehensible justifications for its behavior that users can follow. For instance, the authors of [7] designed a comprehensible architecture for IoT setups to track customer sentiment. They base their approach on merging enterprise data and IoT to model consumer sentiment, which improves customer prioritization and aids in problem-solving. Likewise, the authors in [8] proposed an XDL-based model to design an efficient IDS for Internet of Medical Things (IoMT) networks. Research on XDL-based IDS is still in its infancy, especially for IoT-enabled industrial networks. Therefore, the proposed work designs an explainable and resilient IDS to protect such industries against evolving threats.

A. Contribution
The contributions of this research are as follows:
• A novel explainable and cyber-resilient IDS is designed by combining bidirectional long short-term memory networks (BiLSTM), a bidirectional-gated recurrent unit (Bi-GRU), fully connected layers, and a softmax classifier to enhance the attack detection process in Industry 5.0.
• The SHapley Additive exPlanations (SHAP) mechanism is employed to interpret and understand the decisions made by the proposed DL-based IDS. As a result, the explanation will help the security analyst to interpret the traffic features from the CICDDoS2019 dataset, and the output can be further used to optimize and develop new algorithms for DL-based IDS. Furthermore, the experimental results based on the CICDDoS2019 dataset confirm the superiority of the designed IDS over some recent threat detection techniques.
The remainder of this work is structured as follows. Section II discusses the related work. The proposed detection scheme is elaborated in Section III. The experimental details are provided in Section IV. Section V discusses the result analysis. Finally, the conclusion along with future remarks is presented in Section VI.

II. RELATED WORK
Over the last decade, DL- and ML-based approaches have demonstrated their utility in detecting anomalous entities in traditional IoT-based networks. The authors of [9] proposed a DL-based IDS capable of detecting the presence of threats in IoT networks. The model is based on a CNN classifier to achieve the desired security objective. The authors trained and evaluated their framework on the BoT-IoT dataset, which covers a wide variety of security threats and is considered an ideal choice for training an IDS. The system achieved 92.46% accuracy when evaluated on diverse performance metrics. The authors of [10] proposed a Stacked Denoising Auto-encoder Support Vector Machine (SDAE-SVM)-based model to detect threats in large-scale industrial networks. The authors used the KDD-CUP99 dataset for training and testing. Their system shows competitive strength against a diverse variety of potential security threats and achieved 97.83% accuracy. In [11], the authors employed Natural Language Processing (NLP) and a Multi-Layer Perceptron (MLP) to differentiate between crucial and non-crucial posts on the Dark Web.
Another Deep Neural Network (DNN)-based DDoS attack detection framework is presented in [12]. The authors employed the CICDDoS2019 dataset for experimentation and achieved an accuracy of 94.57%. Further, a cognitive computing-based IDS is proposed in [13]. The authors combined a Gated Recurrent Unit (GRU) and Binary Bacterial Foraging Optimization (BBFO) for efficient intrusion detection. Their scheme is trained and evaluated with the CICIDS2017 dataset and achieved an accuracy of 98.45%. The authors of [14] designed a Generative Adversarial Network (GAN)-based IDS. They trained their model using the CICIDS2017 dataset and achieved 88.70% accuracy. Another intrusion detection scheme using the Aquila Optimizer (AQO) is proposed in [15] to combat botnet attacks in IoT-based smart environments. The NSL-KDD and CICIDS2017 datasets are used for model training, and the system proves effective in terms of threat detection.
A DNN-based model is presented in [16] that introduces a pixel drop method to eliminate anomalies in medium to large-scale IoT-based smart networks. The framework analyzes the traffic streams to investigate suspicious entities and, based on threat impressions, highlights malicious traffic. The authors of [17] proposed an IDS for industrial environments. They used the power system and UNSW-NB15 datasets to evaluate the performance of their beta mixture-hidden Markov model (MHMM)-based approach. The authors reduced the size of these datasets using Independent Component Analysis (ICA).
The authors of [18] proposed a model to detect intrusions in the IIoT network. They used the UNSW-NB15 and BoT-IoT datasets for experimentation and achieved an accuracy of 91.25% for UNSW-NB15 and 98.10% for BoT-IoT. A hybrid DL autoencoder-MLP with automatic feature extraction is employed by the authors in [19]. They used the CICDDoS2019 dataset for experimentation and achieved a detection rate of 98.34%. Another DL-based IDS for SDN-based IoT networks is proposed in [20]. The authors utilized a DNN with GRU-RNN to detect threats in such a network. Their model achieved 80.70% and 90% accuracy. However, their scheme has a high FPR of 0.78%. The authors of [21] employed LSTM with fully connected layers along with a hyper-parameter tuning method to identify normal and malicious events. They used six datasets to evaluate binary and multi-class intrusion detection scenarios. Moreover, in [22], the authors used a DNN-based scheme for identifying fraudulent activity in different IoT devices. They evaluated their scheme on the UNSW-NB15 and NSL-KDD datasets and achieved 92.40% and 98.60% detection rates, respectively.

III. PROPOSED INTRUSION DETECTION SYSTEM
In this section, we discuss the main components of the proposed explainable and resilient-centric deep learning-based IDS for the Industry 5.0 network. We first describe the proposed DL-based cyber threat detection scheme, followed by the connected layers and classifier. We further describe Explainable AI. Finally, we present the Proposed Network Model. The notations used in this work are listed in Table I.
A. Proposed DL-Based Cyber Threat Detection Scheme
1) BiLSTM: A BiLSTM is almost identical to its unidirectional counterpart (LSTM). The sole distinction is that the BiLSTM network connects with both the past and the future. For example, a unidirectional LSTM with recurrent connections can be trained to make predictions as the sequence is fed in one element at a time. A BiLSTM additionally processes the sequence in reverse, supplying the elements that follow the current position and thereby giving access to future information [23]. The BiLSTM consists of three gates, namely an input gate ($Ip_t$), a forget gate ($Fg_t$), and an output gate ($Op_t$), along with a cell state ($Z_t$) and a candidate for the cell state ($C_t$). The $Ip_t$ keeps the state of the cell updated. The operations that update $C_t$ for the forward (→) and backward (←) processes follow [23]. The $Fg_t$ takes the current input ($X_t$) and the previous hidden state ($H_{t-1}$) as inputs and uses the sigmoid function ($\sigma$) to output a value.
The $Op_t$ determines the next-timestep hidden state ($H_t$), which comprises all the information of the prior inputs and is thus required to make predictions. This process requires two steps to find the next hidden state.
2) BiGRU: A BiGRU consists of two GRUs, one processing the information in the forward direction and the other processing it backward. It consists of update and reset gates ($Up_t$, $Re_t$) along with a candidate state ($C_t$) and a final state ($H_t$). To prevent the RNN gradient from vanishing or exploding, the gate structure can choose to retain context information. The GRU has a simpler structure than the LSTM and trains more quickly. The BiGRU transition functions for the forward process (→) follow [24].
Here, $\sigma$ is the sigmoid operator, $\odot$ is the element-wise multiplication between two vectors, and tanh represents the non-linear point-wise activation function. The transition functions for the backward process (←) are computed in the same manner on the reversed sequence. Finally, the forward (→) and backward (←) hidden states are combined by concatenation (⊕).
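For reference, the gate descriptions above correspond to the standard LSTM and GRU updates, written here in this section's notation. This is the textbook formulation and may differ in detail from the paper's Equations (1) to (21). The weight matrices $W_{\ast}$ and the bias subscripts are notational assumptions, $[\cdot,\cdot]$ denotes vector concatenation, and some implementations swap the roles of $Up_t$ and $1 - Up_t$ in the final GRU state.

```latex
% LSTM step for one direction; the backward pass applies the same
% updates to the reversed sequence with its own parameters.
\begin{align*}
Fg_t &= \sigma\left(W_{fg}\,[H_{t-1}, X_t] + Bs_{fg}\right) \\
Ip_t &= \sigma\left(W_{ip}\,[H_{t-1}, X_t] + Bs_{ip}\right) \\
C_t  &= \tanh\left(W_{c}\,[H_{t-1}, X_t] + Bs_{c}\right) \\
Z_t  &= Fg_t \odot Z_{t-1} + Ip_t \odot C_t \\
Op_t &= \sigma\left(W_{op}\,[H_{t-1}, X_t] + Bs_{op}\right) \\
H_t  &= Op_t \odot \tanh(Z_t)
\end{align*}

% GRU step for one direction.
\begin{align*}
Up_t &= \sigma\left(W_{up}\,[H_{t-1}, X_t] + Bs_{up}\right) \\
Re_t &= \sigma\left(W_{re}\,[H_{t-1}, X_t] + Bs_{re}\right) \\
C_t  &= \tanh\left(W_{c}\,[Re_t \odot H_{t-1}, X_t] + Bs_{c}\right) \\
H_t  &= (1 - Up_t) \odot H_{t-1} + Up_t \odot C_t
\end{align*}

% Bidirectional output: concatenate the forward and backward states.
\begin{equation*}
H_t = \overrightarrow{H_t} \oplus \overleftarrow{H_t}
\end{equation*}
```

The last two LSTM lines are the two steps mentioned for the output gate: $Op_t$ is computed first and then applied to the squashed cell state to give $H_t$.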

B. Connected Layers and Classifier for Threat Detection
The proposed threat detection module comprises two BiLSTM layers with 200 and 100 neurons and a 0.2% dropout rate to avoid overfitting, followed by a dense layer of 30 neurons. We further employed two BiGRU layers with 100 and 50 neurons, respectively. We adopt ADAM as the optimizer, with ReLU as the activation function and categorical cross-entropy (CC-E) as the loss function. The complete architecture of the proposed scheme is shown in Fig. 1, and the complete procedure of the proposed IDS is explained in Algorithm 1.

Algorithm 1 (surviving steps)
    Perform encoding and decoding
    Add BiLSTM layers and perform encoding using Equations (1) to (12)
    Build model using BiGRU and Softmax classifier using Equations (13) to (21)
    Add softmax layer
    Perform testing using CDS19 Test
    Evaluate performance using various metrics
    Use SHAP library to analyze the features
    end procedure

Finally, in the output layer, we use a Softmax classifier for attack classification, computed as

$$\sigma(D)_i = \frac{e^{D_i}}{\sum_{z=1}^{K} e^{D_z}}$$

where $\sigma$ is the softmax, $D_i$ is the input vector, $e^{D_i}$ is the standard exponential function for $D_i$, $K$ represents the number of classes, and $e^{D_z}$ is the standard exponential function for the output vector. Finally, we calculate the loss with the categorical cross-entropy, in which $y_c$ is the actual and $\hat{y}_c$ the predicted output, $x$ is the pattern of the input sequence, $n$ is the number of observations, and $p$ belongs to a specific threat type $y$.
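The softmax output and the categorical cross-entropy described above can be sketched in a few lines of plain Python. This is an illustrative sketch, not the authors' implementation: the function names, the example scores, and the averaging over $n$ observations are assumptions, and the loss uses the common form $-\frac{1}{n}\sum_x \sum_c y_c \log \hat{y}_c$.

```python
import math

def softmax(logits):
    """Map a vector of K real scores to a probability distribution."""
    shift = max(logits)  # subtracting the max avoids overflow in exp
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def categorical_cross_entropy(y_true, y_pred, eps=1e-12):
    """Mean over n observations of -sum_c y_c * log(yhat_c)."""
    n = len(y_true)
    loss = 0.0
    for target, pred in zip(y_true, y_pred):
        # eps guards against log(0) when a predicted probability is zero
        loss -= sum(t * math.log(max(p, eps)) for t, p in zip(target, pred))
    return loss / n

# Hypothetical scores for K = 3 classes and two observations.
scores = [[2.0, 1.0, 0.1], [0.5, 2.5, 0.3]]
one_hot = [[1, 0, 0], [0, 1, 0]]  # actual classes, one-hot encoded
probs = [softmax(row) for row in scores]
loss = categorical_cross_entropy(one_hot, probs)
```

Each row of `probs` sums to one, and the loss shrinks as the probability assigned to the actual class approaches one.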

C. Explainable AI
DL-based models are gaining popularity in safety-critical IoT applications, and the demand for justifications of their predictions is rising [25]. XAI provides methodical and comprehensible justifications for a model's behavior that human users can follow. Many ML-based models, e.g., NB, LR, and DT, are fundamentally understandable on a modular level [26]. Unlike these ML-based models, DL models provide superior performance but are unable to explain their predictions. Understanding the rationale behind a model's decision helps users and stakeholders build trust and confirms that the model is solving an issue securely and robustly. One of the reasons for the hesitant acceptance of "black-box" DL models in many safety-critical sectors is their lack of transparency. Thus, scholars have been looking into numerous explainability methods to aid users in interpreting the decisions of black-box models. Some of them are as follows:
1) Text Explanations: By computing a relevance score for the model's controlled variables, this method is utilized to explain the intricate internal workings of the model.
2) Local Explanations: This method measures a model's reaction to small modifications in order to build explanations.
3) Explanations using representative examples: The training data and its effect on a model's decision are better understood using this method.
4) Visual Explanations: The model's behavior is visualized using the visual explanation technique. It is used to provide captions for images that explain why they belong to a certain class in image classification tasks.
SHAP is one of the approaches that has been proposed for relevance explanations [27]. In this paper, we explain the importance of features in the decisions of the proposed DL-based IDS by employing the SHAP framework.

TABLE II
SYSTEM SPECIFICATIONS FOR EXPERIMENTATION

IV. EXPERIMENTAL SETUP
This section presents the experimental design followed by the dataset details, pre-processing, and evaluation metrics.
1) Experimental Design: The experiments are performed using a Legion PC with a 2.60 GHz hexa-core Coffee Lake CPU, 32 GB RAM, and a GeForce RTX 2060 Max-Q 8 GB GPU. The proposed threat detection scheme is developed with the Keras library of TensorFlow. Further, we have employed Python to run the implementation scripts. Complete details are provided in Table II.
3) Dataset Pre-Processing: First, we removed all rows with NaN and Infinity values because they could affect the model's performance. We further used the Scikit-learn label encoder to convert all non-numerical values to numerical values. The only non-numerical feature in the dataset is the 'Label', which we converted to binary using the label encoder. Moreover, the MinMax scaler function is employed for data normalization [29].
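The three pre-processing steps can be illustrated with a dependency-free sketch. The paper uses Scikit-learn's `LabelEncoder` and `MinMaxScaler`; the functions below are simplified stand-ins written for illustration, and the sample records, feature values, and helper names are hypothetical.

```python
import math

def drop_invalid_rows(rows):
    """Keep only rows whose numeric features are all finite (no NaN/Infinity)."""
    return [(feats, label) for feats, label in rows
            if all(math.isfinite(v) for v in feats)]

def encode_labels(labels):
    """Map each distinct label to an integer, assigned in sorted order."""
    classes = sorted(set(labels))
    index = {name: i for i, name in enumerate(classes)}
    return [index[name] for name in labels], classes

def min_max_scale(matrix):
    """Rescale every feature column to the range [0, 1]."""
    columns = list(zip(*matrix))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    scaled = []
    for row in matrix:
        scaled.append([
            (v - lo) / (hi - lo) if hi > lo else 0.0  # constant column maps to 0
            for v, lo, hi in zip(row, lows, highs)
        ])
    return scaled

# Hypothetical flow records: (feature vector, label).
raw = [
    ([10.0, 200.0], "BENIGN"),
    ([float("nan"), 50.0], "Syn"),
    ([30.0, float("inf")], "UDP"),
    ([20.0, 100.0], "Syn"),
    ([40.0, 400.0], "UDP"),
]
clean = drop_invalid_rows(raw)
features = min_max_scale([feats for feats, _ in clean])
targets, classes = encode_labels([label for _, label in clean])
```

In practice the scaling bounds should be fitted on the training split only and reused for the test split, so that no information from the test data leaks into training.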
4) Evaluation Metrics: This work evaluates the proposed threat detection scheme using standard evaluation metrics, namely the Receiver Operating Characteristic (ROC) curve, Confusion Matrix (CM), Accuracy (ACC), Recall (RE), Precision (PR), and F1-score (F1), and extended evaluation metrics, i.e., TPR, NPV, TNR, FPR, FNR, FDR, and FOR. The extended evaluation metrics are defined in Table III, while the values of ACC, PR, RE, and F1 are computed as follows.
1) ACC: The correctly predicted instances over the total number of instances, $ACC = \frac{TP + TN}{TP + TN + FP + FN}$.
2) PR: The proportion of predicted positives that are true positives, $PR = \frac{TP}{TP + FP}$.

V. RESULT ANALYSIS
In this section, we discuss the simulation results and performance analysis of the proposed intrusion detection scheme.

A. Performance Analysis of Proposed Threat Detection Framework
In this subsection, we discuss the efficiency of the proposed IDS. The proposed DL-based threat detection model has learned efficiently from the dataset, as shown by the accuracy and loss curves in Fig. 2. The model achieved a validation ACC of 99.77% with a validation loss of 0.0055% over 10 epochs. We also measure the class-wise performance of the proposed IDS in terms of ACC, PR, TPR, and FNR. The model has learned the normal and attack signatures well and achieved ACC, PR, and TPR values between 92% and 100%, except for the DDoS class, where the model achieved 78.63% TPR, as depicted in Table IV. Further, the model achieved FNR values of 0.00012% to 0.0213% for the respective classes. We further provide the CM and ROC of the proposed model to demonstrate its efficacy. Table V depicts the CM of the model, which shows how the instances of the dataset are categorized into their respective classes. Similarly, the ROC curves derived for the various attack classes are illustrated in Fig. 3, which shows that the scores for all the classes are almost equal to one. Moreover, the proposed model achieved macro- and micro-averages of 0.99 and 1.00, respectively.
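The class-wise figures above derive from one-vs-rest confusion-matrix counts. The sketch below uses the standard definitions of the metrics, which are assumed to match those in Table III, on a hypothetical three-class confusion matrix. It also contrasts macro- and micro-averaging, illustrated here on recall rather than on the ROC scores reported above; all function names are illustrative.

```python
def rates(tp, tn, fp, fn):
    """Standard and extended metrics from binary (one-vs-rest) counts."""
    def ratio(num, den):
        return num / den if den else 0.0  # avoid division by zero
    pr = ratio(tp, tp + fp)
    re = ratio(tp, tp + fn)
    return {
        "ACC": ratio(tp + tn, tp + tn + fp + fn),
        "PR": pr,
        "RE": re,  # recall is the same quantity as TPR
        "F1": ratio(2 * pr * re, pr + re),
        "TPR": re,
        "TNR": ratio(tn, tn + fp),
        "NPV": ratio(tn, tn + fn),
        "FPR": ratio(fp, fp + tn),
        "FNR": ratio(fn, fn + tp),
        "FDR": ratio(fp, fp + tp),
        "FOR": ratio(fn, fn + tn),
    }

def one_vs_rest_counts(cm, k):
    """TP, TN, FP, FN for class k of a matrix indexed cm[actual][predicted]."""
    total = sum(sum(row) for row in cm)
    tp = cm[k][k]
    fn = sum(cm[k]) - tp
    fp = sum(row[k] for row in cm) - tp
    tn = total - tp - fn - fp
    return tp, tn, fp, fn

# Hypothetical 3-class confusion matrix (rows: actual, columns: predicted).
cm = [[50, 2, 0],
      [3, 40, 1],
      [0, 4, 100]]
counts = [one_vs_rest_counts(cm, k) for k in range(len(cm))]
per_class = [rates(*c) for c in counts]

# Macro-averaging takes the mean of the per-class values.
macro_recall = sum(m["RE"] for m in per_class) / len(per_class)
# Micro-averaging pools the counts of all classes before taking the ratio.
pooled = [sum(col) for col in zip(*counts)]
micro_recall = rates(*pooled)["RE"]
```

Because micro-averaging weights every instance equally, it is dominated by the large classes, whereas macro-averaging gives a weak minority class the same weight as any other.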

B. XAI Interpretation for CICDDoS2019 Dataset
In this subsection, we discuss the XAI interpretation of the dataset. The decision-making of complex models is illustrated via SHAP decision graphs. SHAP provides a number of plots, e.g., the Decision Plot (DP), Waterfall Plot (WP), and Summary Plot (SP). The DP is given in Fig. 4. The explainer's expected value is used to center the DP.
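The attributions that these SHAP plots visualize are Shapley values. As a minimal sketch of the idea, the code below computes exact Shapley values by brute-force enumeration for a hypothetical three-feature linear scoring function. It does not use the SHAP library, which relies on approximations for deep models; the weights, instance, baseline, and function names are all illustrative assumptions.

```python
from itertools import combinations
from math import factorial

def shapley_values(value_fn, n_features):
    """Exact Shapley values: weighted average marginal contribution per feature."""
    phi = [0.0] * n_features
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        for size in range(n_features):
            # Probability weight of a coalition of this size preceding feature i
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for subset in combinations(others, size):
                gain = value_fn(set(subset) | {i}) - value_fn(set(subset))
                phi[i] += weight * gain
    return phi

# Hypothetical toy model over three traffic features. Features absent from
# a coalition are replaced by a baseline (for example, a background mean).
weights = [0.5, -0.2, 0.1]
instance = [4.0, 1.0, 10.0]
baseline = [1.0, 1.0, 2.0]

def model_output(present):
    x = [instance[j] if j in present else baseline[j] for j in range(3)]
    return sum(w * v for w, v in zip(weights, x))

phi = shapley_values(model_output, 3)
expected_value = model_output(set())  # output when no feature is known
```

The expected value plus the sum of the attributions equals the model output for the instance. This additivity is what allows a decision plot to start at the expected value and accumulate one feature contribution at a time until it reaches the prediction.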

C. Comparison With Baseline Detection Schemes
In this subsection, we compare the performance of the proposed model against the baseline detection schemes. We have used the results obtained from the proposed threat detection scheme for comparison with GRU and LSTM. Fig. 7 depicts the values of ACC, PR, RE, and F1 achieved by the Proposed IDS as 99.77%, 98.79%, 98.42%, and 99.41%, respectively.
Table VI depicts the class-wise detection ACC comparison. The Proposed IDS achieved a detection ACC of 97% to 100% for MSSQL, SSDP, DDoS, Portmap, UDP, UDP-Lag, and Benign. However, it obtained 95.56% detection ACC for the SYN and 96.59% for the WebDDoS attack classes. Fig. 8 depicts the values of TPR, TNR, and NPV, where the Proposed IDS achieved a TPR of 97.42% and 99.88% for both TNR and NPV. The GRU and LSTM show less significant performance, which demonstrates the superiority of the Proposed IDS over the baseline detection schemes. Finally, Fig. 9 depicts the comparison in terms of FPR, FNR, FDR, and FOR. The Proposed IDS achieved FPR, FDR, and FOR values of 0.0041%, 0.0025%, and 0.0004%, respectively, with an FNR of 0.0485%. These values are considerably lower than those of the other detection models, and the lower error rates demonstrate the efficiency of the Proposed IDS.

D. Comparison With Recent State-of-the-Art Detection Frameworks
Lastly, we compare the Proposed IDS's performance with recent threat detection schemes from the current literature, i.e., [10], [11], [12], [17], and [19], to further validate its efficacy. Table VII depicts the comparison in terms of ACC. Some of the recent works either used old datasets, i.e., Power system and KDD-CUP99, which have less practical value for IoT, or achieved less significant outcomes. We adopt CICDDoS2019, which contains network flow-based instances and is an IoT-based dataset. The Proposed IDS outperformed the recent detection frameworks, achieving an ACC that is more than 2% higher.

Overall comparison of Proposed IDS against baseline detection schemes.

TABLE VII
COMPARISON OF PROPOSED IDS WITH RECENT FRAMEWORKS
VI. CONCLUSION
An intrusion detection system is one of the most important security tools for industrial networks. However, most of the existing approaches based on ML and DL techniques are treated as a black box by security analysts and developers. In this article, we have designed a new explainable and resilient intrusion detection system for Industry 5.0 that combines bidirectional long short-term memory networks, a bidirectional-gated recurrent unit, a fully connected layer, and a softmax classifier for attack detection. Furthermore, the proposed framework adopts the SHAP technique to understand the importance of the features that contributed the most to attack detection using the CICDDoS2019 dataset. The experimental results confirm the superiority of the proposed approach over some existing state-of-the-art schemes. However, the proposed IDS has some limitations; in particular, it is vulnerable to insider threats, where intruders can disrupt the network without interfering with the flow between the industrial network and the Internet. Future research will include integrating blockchain with the proposed framework to enhance decentralization in Industry 5.0.

Fig. 9. Comparison of the Proposed IDS against baseline detection schemes in terms of FPR, FNR, FDR, and FOR.

TABLE I
TABLE OF NOTATION
Notations used in this work. $Bs_{op}$ denotes the respective biases, $X_t$ represents the current input, and the Hadamard product is denoted by $\odot$.

TABLE III
DETAILS OF TPR, TNR, FPR & FNR

3) RE: The ratio of TP to the sum of TP and FN, $RE = \frac{TP}{TP + FN}$.
4) F1: The harmonic mean of RE and PR, $F1 = \frac{2 \times PR \times RE}{PR + RE}$.

TABLE VI
CLASS-WISE DETECTION ACCURACY COMPARISON OF THE PROPOSED IDS AGAINST BASELINE DETECTION SCHEMES