Robust Network Intrusion Detection System Based on Machine-Learning With Early Classification

Network Intrusion Detection Systems (NIDSs) using pattern matching have a fatal weakness: they cannot detect new attacks because they only learn existing patterns and use them to detect those attacks. To solve this problem, the machine learning-based NIDS (ML-NIDS) was proposed, which detects anomalies through ML algorithms by analyzing the behavior of protocols. However, the ML-NIDS learns the characteristics of attack traffic from training data, so it, too, is inevitably vulnerable to attacks that have not been learned, just like the pattern-matching NIDS. Therefore, in this study, by analyzing the characteristics of learning using representative features, we show that a network intrusion outside the scope of the learned data in the feature space can bypass the ML-NIDS. To prevent this, we design the ML-NIDS to classify an active session early, before it goes outside the detection range of the training dataset. Various experiments confirmed that the proposed method can detect intrusion sessions early (before sessions terminate), significantly improving the robustness of the existing ML-NIDS. The proposed approach provides more robust and more accurate classification than existing approaches with the same datasets, so we expect it to serve as one feasible solution for overcoming the weaknesses and limitations of existing ML-NIDSs.


I. INTRODUCTION
It is very important to detect a network intrusion quickly and accurately for stable operation of the network. For this purpose, a dedicated security device called the Network Intrusion Detection System (NIDS) was proposed [1], [2]. The initial NIDS generated patterns from existing attacks and detected intrusions very quickly and accurately through pattern matching with the received packets [3]- [5]. However, a method that relies on patterns for existing attacks has a disadvantage in that it is impossible to detect an attack that is not known beforehand, and the network is easily penetrated by a variant of an existing attack.
To solve this problem, various methods have been proposed and applied to the NIDS [6], [7]. The machine learning-based NIDS (ML-NIDS), which has recently received the most attention, was evaluated as an alternative that can significantly improve the shortcomings of the pattern matching NIDS (PM-NIDS). The ML-NIDS analyzes the characteristics of existing network intrusions using ML and detects the intrusions using overall behavioral characteristics. Therefore, while the PM-NIDS can be penetrated by simply changing the pattern of an intrusion session, the ML-NIDS can detect intrusions even if some characteristics are changed as long as the overall behavior is maintained. Therefore, it is well known through the results of several studies that the ML-NIDS provides higher robustness for intrusion detection than the PM-NIDS.
The associate editor coordinating the review of this manuscript and approving it for publication was Hayder Al-Hraishawi.
However, the ML-NIDS learns the overall behavior of an intrusion using a training dataset, so just like the PM-NIDS, it strongly depends on having the pattern of the existing attack, and its detection ability depends on the training dataset. In other words, like the PM-NIDS, the ML-NIDS can have a very low probability of detecting an intrusion that does not exist in the training data. Nevertheless, such limitations have received little research attention. Instead, various methods of evading the ML-NIDS by modifying features in the feature space are being studied, and recently, studies that supplement the robustness of training datasets through a generative adversarial network (GAN) and other deep learning techniques have been proposed [21]-[23]. However, these studies do not directly analyze the dependence of the ML-NIDS on the training dataset, so there are limitations in understanding the characteristics of that dependence.
In this paper, we directly analyze these characteristics and propose a method to provide robustness to the ML-NIDS training dataset without increasing its size. The proposed method analyzes the characteristics of the training dataset for the ML-NIDS and uses discovered characteristics to significantly improve intrusion detection performance without major changes in the system. To that end, the method proposed in this paper increases the detection range of the training dataset by analyzing the existing session-based dataset.
The main contributions of this study are as follows.
First, we show that the ML-NIDS fails to detect an existing intrusion whose behavior has been modified simply by adding packets. By analyzing ML-NIDS datasets, we find that dependence on the training dataset is very high, so weaknesses similar to those of the PM-NIDS exist. In particular, we show that the influence of this dependency can differ considerably depending on the ML algorithm.
Second, to alleviate strong dependency on the training dataset in terms of packet count, we present a method for selecting when the ML-NIDS optimally detects an intrusion. Through this, even too short or too long sessions that cannot be detected by the existing ML-NIDS can be detected with very high accuracy. In particular, compared to the existing PM-NIDS, early attack detection is possible on a similar hardware platform, so it is advantageous in keeping the network safe.
Third, since the proposed method is light enough to be implemented on an existing ML-NIDS platform rather than a high-cost, high-performance hardware platform, the proposed approach is economically feasible.

II. PREVIOUS WORK
ML-NIDSs are divided into packet-based methods, which use packet data directly as features, and session-based methods, which use as features statistical data for a logical group of packets called a session. The packet-based method can be classified in two ways: one detection method uses a single packet to detect a pattern for malicious data in every packet received, and the other detection method uses multiple packets, storing and combining packets belonging to the same session into one dataset that is used for detection [2]-[5], [9], [35], [36]. Both the single-packet detection method and the multi-packet detection method search for malicious code or patterns in the packet payloads [10]. Owing to the high accuracy of the pattern-matching algorithm, it can detect malicious traffic while maintaining a very low false positive rate (FPR). However, attacks exploiting normal packets, like a distributed denial-of-service (DDoS) attack, are hard to detect with the packet-based method, and the pattern-matching algorithm can easily be bypassed by adding random data to the payload. Therefore, the pattern-based method is not used alone.
In order to solve the shortcomings of the packet-based method, studies have been proposed to extract session features and detect an intrusion through them [13]- [18]. When using session features, it is impossible to bypass the NIDS just by adding some dummy data. In addition, regardless of the packet size or the length of the session, the size of the entire feature is always the same, so the session-based method is more advantageous than the packet-based method for handling large volumes of traffic.
The NIDS using session features mostly uses machine learning algorithms to classify the received traffic. So far, various ML-NIDSs have been developed and are expected to overcome the weaknesses of the PM-NIDS. Inevitably, malicious users are developing various methods to bypass the ML-NIDS (largely divided into white box, gray box, and black box methods), depending on what information can be used. The white box method is a way to bypass the NIDS when the attacker knows all information about it [19]- [23]. This is an ideal and unrealistic case, because information about the dataset, the machine learning model, and the feature set used for learning has to be available. On the other hand, the gray box method is where a malicious user knows minimal information, such as the algorithm for extracting features [24], [25]. The black box method, however, finds a way to bypass the NIDS without having any information in advance [26]. Therefore, although it is the most realistic, it is technically quite difficult to implement, because it is necessary to actively collect the necessary information and indirectly identify the characteristics of the NIDS.
Although various approaches and many related studies exist, we can see that the accuracy of a commonly learned classification model is greatly affected by a small number of features [19], [25]. Therefore, in the white box method, the corresponding features can easily be found, and the NIDS can easily be bypassed by generating an attack that exceeds the learning range of these features. Of course, research to alleviate these weaknesses is also being conducted. Various methods, such as removing some of the most influential features and training a classification model, have been proposed. However, the method of removing some features is not a fundamental solution in that it can inevitably affect the performance of the ML model. In the end, it is urgent to make a robust ML model by lowering the dependency on, and sensitivity to, features that affect the learning model while maintaining classification accuracy at the same time [26], [27].
Fundamentally, it is unreasonable to assume that the NIDS equipped with a model created using pre-built training data can learn all the characteristics of all sessions received from an actual network in advance. In fact, the training dataset size is limited, so it is possible for a session to exist where the values for some specific features exceed the range of the learning values from the training dataset. If the corresponding feature is one that greatly affects the performance of the learning model described above, the sessions cannot be accurately classified by ML-NIDS. Therefore, this is a problem that must be solved in order to develop a system to defend against attacks. Nevertheless, research on this is not being conducted. Table 1 summarizes the pros and cons of each type of ML-NIDS, including the proposed approach.

III. THE PROPOSED APPROACH
We present a new method to improve the ML-NIDS in order to handle intrusion sessions with feature values that exceed the range of learned values. Therefore, an ML model combined with our proposed approach can detect intrusions that exceed the classification range of the training dataset in the feature space with high probability, so it is expected to not only make the ML-NIDS robust, but will also help prevent existing adversarial attacks.

A. MOTIVATION
Since the training dataset determines the performance of the ML-NIDS, it is most important to build a training dataset that includes as much rich, non-redundant data on network intrusions as possible. However, since the size of the training dataset is finite, the area of the feature space learned with the training dataset is inevitably limited. In order to confirm this in more detail, it is necessary to analyze the effect when the learning range of the training dataset and the range of the test dataset do not overlap in the feature space. For further explanation, let us define some notation as follows. A session S consisting of k packets is defined by $S = \{P_1, P_2, \ldots, P_k\}$. Let $src(P_i)$, $size(P_i)$, and $rtime(P_i)$ denote the source IP of $P_i$, the size of $P_i$, and the reception time of $P_i$, respectively. Then the forward packet count and the total data rate are defined by $|S_{forward}|$ and $\sum_{i=1}^{k} size(P_i) / (rtime(P_k) - rtime(P_1))$, respectively, where $S_{forward} = \{P_i \in S \mid src(P_i) = src(P_1)\}$. In previous studies, the forward packet count and the total data rate are known to be very important features in the ML-NIDS [19], [25]. In this experiment, sessions were divided according to the value of the forward packet count feature: a training dataset was created from sessions with values no larger than a threshold, a test dataset was created from sessions with values larger than the threshold, and classification performance was measured using them. Since the influence of the forward packet count feature may be different for each class, the following experiment was conducted to analyze it. With the data for the i-th class among the entire dataset, only sessions with a forward packet count value less than or equal to a threshold value (the maximum forward packet count value for class i, $\theta_i$) were selected to create the training dataset, and only sessions where the forward packet count value was greater than $\theta_i$ were used to construct the test dataset. Here, $\theta_i$ is chosen so that the training-to-test ratio for the corresponding class is 7:3.
For other classes, the training and test datasets were randomly configured, regardless of the $\theta_i$ value.
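The threshold-based split described above can be sketched as follows. This is only an illustrative sketch, not the authors' code: the session representation (dicts with the hypothetical keys `label` and `fwd_count`), the function names, and the way the threshold is picked from the sorted counts are all assumptions.

```python
import math
import random


def choose_theta(forward_counts, train_ratio=0.7):
    """Pick a threshold so that roughly train_ratio of the sessions are <= it."""
    ordered = sorted(forward_counts)
    # Index of the last session that should still fall in the training part.
    cut = max(0, math.ceil(train_ratio * len(ordered)) - 1)
    return ordered[cut]


def split_by_forward_count(sessions, target_class, train_ratio=0.7, seed=0):
    """Split target_class by threshold; split all other classes at random.

    sessions: list of dicts with hypothetical keys 'label' and 'fwd_count'.
    Returns (theta, train, test).
    """
    target = [s for s in sessions if s["label"] == target_class]
    others = [s for s in sessions if s["label"] != target_class]

    theta = choose_theta([s["fwd_count"] for s in target], train_ratio)
    train = [s for s in target if s["fwd_count"] <= theta]
    test = [s for s in target if s["fwd_count"] > theta]

    # Other classes ignore theta and are divided randomly at the same ratio.
    rng = random.Random(seed)
    rng.shuffle(others)
    k = int(train_ratio * len(others))
    return theta, train + others[:k], test + others[k:]
```

When many sessions share the same forward packet count, the resulting ratio can only approximate 7:3, which matches the statement that the threshold is set as close to that ratio as possible.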
$\theta_i$ is set to a value close to the ratio of 7:3 because sufficiently large training and test datasets are required to obtain accurate classification performance. That is, if $\theta_i$ is too large, the test dataset becomes too small to accurately measure classification performance. On the contrary, if $\theta_i$ is too small, the training dataset becomes too small to train the ML model, causing degraded classification performance. Figure 1 shows F1-scores of some selected classes according to the dataset ratio. From the figure, we can see that the ratio should not be too small or too large according to the class. Thus, each $\theta_i$ was set as close to 7:3 as possible. Tables 2 to 5 show the experimental results from configuring the training dataset and the test dataset based on $\theta_i$ for Brute Force-SSH, DDoS, DoS-HTTP, and Infiltration using the ISCX2012 dataset. For Brute Force-SSH, when the ML model is trained only with forward packet count values smaller than $\theta_i$ (as shown in Table 2), the class cannot be detected at all using the model. On the other hand, as shown in Tables 3 to 5, when the training and test datasets are randomly configured for Brute Force-SSH, more than 98.5% were detected. As for the detection rates of the other classes (DDoS, DoS-HTTP, and Infiltration) shown in Tables 3, 4, and 5, only 0.01%, 5.2%, and 3% were detected, respectively, and all classes showed similar results.
TABLE 2. Confusion matrix where the training dataset and test dataset, respectively, are composed of sessions with small forward packet count values and sessions with large forward packet count values for Brute Force-SSH, whereas training and test datasets for other classes are randomly composed. Columns and rows of the matrix represent instances of actual and predicted classes, respectively.
In the end, it was confirmed that when the range of the value of the forward packet count configured in the training dataset and the range of the value of the forward packet count configured in the test dataset were different, the classification accuracy was significantly affected.
As shown in Tables 2 to 5, sessions whose forward packet count exceeds the maximum forward packet count in the training dataset are hardly ever detected, regardless of the class type.
In an actual network, the forward packet count value simply increases as the attack continues. That is, compared to other features, it is relatively easy to push this value outside the range covered by the training data, and the impact on the existing ML-NIDS is very high.
One solution is to collect a wide variety of sessions, ranging from those with a very small forward packet count to those with a very large count. However, this approach not only makes the training dataset too large, but also makes it quite difficult to obtain sufficient training data covering every value, because the range of forward packet count values is large. In addition, as the training dataset grows, training time increases greatly, and detection speed may decrease as the complexity of the trained model increases. Therefore, enlarging the training dataset to cover larger forward packet count values cannot be a fundamental solution. In the end, malicious users can disarm the existing ML-NIDS by performing an attack that increases the forward packet count, and they can easily bypass detection regardless of the dataset size, but an effective method to prevent this has not yet been presented.
This study tries to effectively solve this problem by adjusting the detection timing, instead of expanding the detection range of the training dataset in the feature space. Figure 2 is a conceptual diagram showing two ranges within which the ML model can and cannot classify sessions when the model is trained on a dataset consisting of two features. According to Figure 2 (a), session X can be classified, so the ML classifier can determine whether it is an intrusion or a benign session. On the other hand, it is impossible to classify session Y using the ML classifier because it is located in an area that cannot be classified. Now let us assume that session Y consists of four packets. We also assume that whenever each packet is received, the NIDS cumulatively creates from the first packet an intermediate session feature using the currently received packet, and plots it in the feature space as shown in Figure 2 (b). The number on the path shown in Figure 2 (b) represents the number of packets used to create the feature. For example, 2 in Figure 2 (b) indicates a session feature created using the first and second packets.
In Figure 2 (b), the features when the first and fourth packets are received are located in the unclassifiable range in the feature space, whereas the features when the second and third packets are received are located within the classifiable range. Therefore, if we find the right timing at which the corresponding session can be accurately classified, instead of classifying it when the session is terminated, we can classify the session correctly before the end of the session. Now let us discuss in detail how this idea can be implemented.
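The cumulative construction of intermediate session features can be sketched as below. It is a minimal illustration under stated assumptions, not the authors' implementation: the `Packet` and `SessionState` names are hypothetical, a forward packet is assumed to be one whose source IP equals that of the first packet, and the data rate is assumed to be total bytes divided by the elapsed time between the first and latest packets.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    src: str      # source IP address
    size: int     # packet size in bytes
    rtime: float  # reception time in seconds


class SessionState:
    """Running totals for one session, updated in constant time per packet."""

    def __init__(self):
        self.initiator = None   # source IP of the first packet (assumed forward side)
        self.first_time = None
        self.last_time = None
        self.forward_count = 0
        self.total_bytes = 0

    def update(self, pkt):
        if self.initiator is None:
            self.initiator = pkt.src
            self.first_time = pkt.rtime
        if pkt.src == self.initiator:
            self.forward_count += 1
        self.total_bytes += pkt.size
        self.last_time = pkt.rtime

    def features(self):
        """Return (forward packet count, total data rate) for the packets seen so far."""
        duration = self.last_time - self.first_time
        rate = self.total_bytes / duration if duration > 0 else 0.0
        return (self.forward_count, rate)


def feature_path(packets):
    """Intermediate feature after each packet, i.e. the path traced in feature space."""
    state = SessionState()
    path = []
    for pkt in packets:
        state.update(pkt)
        path.append(state.features())
    return path
```

Keeping only running totals means the intermediate feature is available after every packet without storing the packets themselves.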

B. THE PROPOSED ALGORITHM
The algorithm should classify a session when the intermediate session feature exists in the classifiable area even before the session is terminated. However, determining if the session feature is in an unclassifiable area for the currently on-going session is difficult. Of course, if intermediate session features are created, and if classification is performed using them on every received packet, it may be possible to classify the session when the intermediate session feature exists in a classifiable region in the feature space even before the session is terminated. However, this incurs very high calculation and memory costs, which means it requires a very expensive, high-end platform that far exceeds the performance of the currently existing NIDS. As a result, in terms of cost, it is infeasible to classify every received packet.
To solve this problem, the proposed method uses the following approach. For a specific feature, a range of values that can be well classified for each class is determined in advance, and classification is attempted only when the intermediate session feature for the currently received session is included within the range. Here, the range of the feature for each class is determined, because the range of the training data for the corresponding feature may be different for each class. In this case, it is advantageous to select a feature type that can be easily calculated and that has great influence on classification accuracy. In this paper, we chose the forward packet count as the feature for the decision, since it meets all conditions.
In the proposed method, the learning process is the same as that of the existing ML-NIDS. Also, it is the same when the session ends in that the corresponding session feature is created to perform classification. However, the classification process differs from that of the existing classification in the following aspect.
By analyzing the training dataset, the maximum forward packet count value ($\theta_i$) for each class except the benign class is precalculated. Thus, we have at most N − 1 distinct values (fewer when some values coincide), where N is the total number of classes. If the forward packet count of the currently received packet matches one of these values, an intermediate session feature of the session the packet belongs to is created and classified. At this time, if the classification result is a class in which the maximum forward packet count has the same value as the packet count of the currently received packet, the packet will be processed according to the classification result. For example, the packet will be dropped and the result will be logged or notified to an administrator. Otherwise, the current classification result is ignored, and classification is performed again whenever the packet number matches one of the maximum forward packet counts, or when the session ends. Figure 3 shows how to obtain the training dataset from the original dataset using the precalculated $\theta_i$. For the training dataset, each session of a class i is normalized according to $\theta_i$. Such a normalized dataset greatly helps to avoid classification in the unclassifiable range of the class.
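The precalculation step can be illustrated with a short sketch. The input format (pairs of class label and forward packet count), the `benign` label string, and the function names are assumptions made for this example only.

```python
def max_forward_counts(training_sessions, benign_label="benign"):
    """Return {class: theta}, the maximum forward packet count per non-benign class.

    training_sessions: iterable of (label, forward_packet_count) pairs.
    """
    theta = {}
    for label, fwd_count in training_sessions:
        if label == benign_label:
            continue  # the benign class has no trigger value
        theta[label] = max(theta.get(label, 0), fwd_count)
    return theta


def trigger_points(theta):
    """Distinct forward packet counts at which classification is attempted."""
    return set(theta.values())
```

Because two classes can share the same maximum, the set of trigger points may contain fewer than N − 1 values.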
In this method, each class undergoes classification within the classifiable feature values, while at the same time the classification timing is adjusted so that each class is best classified. Here, since the intrusion class tends to be classified as benign, if the result of an intermediate classification is benign, the result is ignored. Only when the session has ended and is classified as benign is the packet processed as benign. The detailed operation of the proposed method is given in Algorithm 1. Because classification is performed only when P is the last packet of the session or $n(P) \in \Theta$, where $\Theta$ denotes the set of precalculated $\theta_i$ values, the computational complexity of Algorithm 1 is $O(N)$, where N is the number of classes. Usually, the number of classes is smaller than the session length, which means that the algorithm has lower complexity than the per-packet detection approach.
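The complexity argument can be made concrete with a small counting sketch. It assumes, as a simplification not stated in the text, that classification is attempted at most once per distinct trigger value plus once at session end; the function names are hypothetical.

```python
def max_classifications(theta):
    """Upper bound on classifier calls for one session under the trigger rule.

    theta: {class: maximum forward packet count}, benign class excluded.
    One call per distinct trigger value, plus one call when the session ends.
    """
    return len(set(theta.values())) + 1


def per_packet_classifications(session_length):
    """Classifier calls if every received packet were classified."""
    return session_length
```

For instance, with three intrusion classes of which two share a maximum, a 200-packet session needs at most 3 classifier calls under this assumption, compared with 200 for per-packet classification.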
To make it easier to understand the operation of the proposed algorithm, Figure 4 illustrates two cases in which the proposed method finally obtains the classification result. As shown in the figure, classification is performed only when $\theta(C_i)$ and the forward packet count are the same, thus reducing the overall classification overhead and increasing the possibility of completing classification before the forward packet count becomes too large. By doing so, the proposed approach improves classification speed and classification accuracy simultaneously. Figure 5 shows the entire procedure of the proposed algorithm. It calculates the maximum forward packet count for each class and builds the training set using the counts. It then tries to classify the received packet to infer the class of the session that the packet belongs to.

Algorithm 1
Input: received packet P, with n(P) the forward packet count of its session and F(P) the session feature built from the packets received so far; Θ, the set of precalculated maximum forward packet counts θ(C_i)
IF P is the last packet of the session THEN
    C_est = classifier(F(P))
    Return C_est
ELSE
    IF n(P) ∈ Θ THEN
        C_est = classifier(F(P))
        IF θ(C_est) == n(P) THEN
            Return C_est
        ELSE
            Postpone the decision until the next packet is received.
        ENDIF
    ENDIF
ENDIF
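The per-packet decision rule can be sketched in Python as follows. This is an illustrative reading of the procedure rather than the authors' code: `classify_packet`, `classify_session`, and the representation of a session as a list of (forward packet count, feature) pairs are assumptions, and the classifier is any callable that maps a feature to a class label.

```python
def classify_packet(n_p, is_last, features, classifier, theta):
    """One step of the early-classification rule.

    n_p:      forward packet count of the session after this packet
    is_last:  True if this packet terminates the session
    theta:    {class: maximum forward packet count}, benign class excluded
    Returns a class label, or None to postpone the decision.
    """
    if is_last:
        return classifier(features)
    if n_p in set(theta.values()):
        c_est = classifier(features)
        # Accept only if the predicted class owns this trigger value.
        # A benign prediction has no entry in theta, so it is always postponed.
        if theta.get(c_est) == n_p:
            return c_est
    return None


def classify_session(path, classifier, theta):
    """Run the rule over a session.

    path: list of (forward_packet_count, features), one entry per packet.
    Returns (label, packets_used, classifier_calls).
    """
    calls = 0

    def counting_classifier(features):
        nonlocal calls
        calls += 1
        return classifier(features)

    for i, (n_p, features) in enumerate(path):
        is_last = i == len(path) - 1
        label = classify_packet(n_p, is_last, features, counting_classifier, theta)
        if label is not None:
            return label, i + 1, calls
    raise ValueError("empty session")
```

In this sketch a backward packet that leaves the forward packet count at a trigger value would cause another attempt; a real implementation might trigger only on the forward packet that reaches the value.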

IV. PERFORMANCE EVALUATION
In order to accurately evaluate and analyze the proposed method, various datasets and several classification algorithms were used to analyze its performance in various environments. For the evaluation, six algorithms were selected: random forest (RF) [28], Adaboost decision tree (DT) [29], XGBoost [30], extreme learning machine (ELM) [31], deep neural network (DNN) [32], and convolutional neural network (CNN) [17]. By covering methods ranging from deep learning to DT-based algorithms, we compare how the proposed method affects performance when applied to various algorithms.

A. EVALUATION ENVIRONMENT
It is important to use multiple datasets, because characteristics within the same class may differ, depending on the network environment in which data are collected. In this experiment, three datasets were used: ISCX2012, CIC-IDS2017, and CSE-CIC-IDS2018 [33], [34]. Here, minor classes were excluded. In addition, classes having only one forward packet count value were excluded because there is no need to apply the proposed method. For example, in PortScan from CIC-IDS2017, there is no need to apply the proposed method because sessions with one forward packet count comprise 99.5% of the total data. For the same reason, FTP-Brute Force and DoS-SlowHTTPTest with only one forward packet count value were excluded from CSE-CIC-IDS2018. The total numbers of classes of ISCX2012, CIC-IDS2017, and CSE-CIC-IDS2018 are 6, 9, and 8 respectively.
To measure the performance of the proposed method, it is necessary to create a training dataset consisting of small forward packet counts and a test dataset consisting of large counts. To this end, in the distribution according to the forward packet count size for each class, all the session data were divided at a ratio of 7:3 to create training and test datasets. As an exception, if the distribution of forward packet counts is U-shaped, training and test datasets were built by dividing all the data at the minimum point. For the benign class only, training and test datasets were created by randomly dividing the data at 7:3, regardless of the forward packet count value. The data sizes for the classes are shown in Tables 6 to 8. Each classification model was evaluated with metrics computed from TP, TN, FP, and FN, which stand for the numbers of true positives, true negatives, false positives, and false negatives, respectively.
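For reference, the standard definitions of metrics built from these four counts can be sketched as below. The choice of metrics here (precision, recall, F1-score, error rate) is an assumption based on the quantities reported later in this section, and the function names are illustrative.

```python
def precision(tp, fp):
    """Fraction of predicted positives that are actually positive."""
    return tp / (tp + fp) if tp + fp else 0.0


def recall(tp, fn):
    """Fraction of actual positives that are detected."""
    return tp / (tp + fn) if tp + fn else 0.0


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall."""
    p, r = precision(tp, fp), recall(tp, fn)
    return 2 * p * r / (p + r) if p + r else 0.0


def error_rate(tp, tn, fp, fn):
    """Fraction of all instances that are misclassified."""
    total = tp + tn + fp + fn
    return (fp + fn) / total if total else 0.0
```

In a multi-class setting these counts are usually taken per class in a one-versus-rest manner and then averaged.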

B. DETECTION RATE
The results of applying the proposed method to the three datasets and six machine learning algorithms are shown in Figures 6 to 8. Comparing the results with and without the proposed method for each algorithm, we see that performance improved in most cases when the proposed method was applied. Among a total of 18 test cases, the only case where the performance did not improve (based on the F1-score) was the DNN with the proposed algorithm using the CSE-CIC-IDS2018 dataset. Also, in Figures 6 to 8, we found that the overall sensitivity of the deep learning-based algorithms to the forward packet count feature was higher than that of the DT-based algorithms. Even when the proposed method was applied, the ELM, DNN, and CNN all had F1-scores of 30% to 50%, whereas RF, Adaboost DT, and XGBoost showed F1-scores higher than 75%. Therefore, it can be argued that deep learning algorithms have great difficulty in improving performance if sufficient training datasets are not collected.
RF, Adaboost DT, and XGBoost also had very low F1-scores if the proposed method was not applied. As an exception, XGBoost on CSE-CIC-IDS2018 shows a high F1-score of 86.15% even without the proposed method. However, on the other datasets, the F1-score is only about 10% higher than that of the deep learning algorithms.
When the proposed method was applied to DT-based algorithms the F1-score improved by up to 32%. The highest F1-score was obtained by applying the proposed method to XGBoost with ISCX2012, which achieved 80.22%. With CIC-IDS2017, applying the proposed method to XGBoost achieved 94.21%, and with CSE-CIC-IDS2018, the F1-score reached 90.49% when applying the proposed method to RF. Depending on the dataset, the optimal algorithm may be different, but we confirmed it is essential to apply the proposed method.
Additionally, Figure 9 shows the error rates of the proposed and original approaches. In most cases, our algorithm shows a smaller error rate than the original one, regardless of the type of ML algorithm and dataset.
For more detail, it is necessary to analyze the performance of each class when the proposed method is applied. We chose the case where the proposed method is applied to RF with the CSE-CIC-IDS2018 dataset for detailed analysis. According to Figure 10, the F1-score improved, or was at least maintained, after applying the proposed method to all classes except for Brute Force-XSS. In particular, for Brute Force-WEB, DoS-GoldenEye, and DoS-Slowloris, extremely low F1-scores of 7%, 0%, and 1.6% improved significantly to 95.7%, 64.7%, and 96.1% after applying the proposed method. Table 9 shows the confusion matrix corresponding to Figure 10 for a more accurate analysis. When the proposed method is not applied, intrusions are often falsely detected as benign. On the other hand, when the proposed method is used, as shown in Table 10, the number of cases in which intrusions are falsely detected as benign is greatly reduced. For example, 96% of the Brute Force-WEB class was falsely detected originally, but when the proposed method was applied, Brute Force-WEB was detected with 100% accuracy. In addition, after applying the proposed method, the F1-score decreased by 5.2% for Brute Force-XSS, but looking at the confusion matrix, 10% of Brute Force-XSS sessions were misclassified as Brute Force-WEB, so they were still detected as intrusions. Therefore, the proposed method not only improved the average F1-score, but also significantly increased the intrusion detection rate for each session.

C. AVERAGE SESSION LENGTH REQUIRED FOR DETECTION
The most important performance metric in the NIDS is detection rate. Detection speed is also a very important metric: the faster the NIDS detects network intrusions, the more it helps keep the network safe. The proposed method is not designed to increase detection speed; nevertheless, it is important to check whether the speed is increased or decreased by it. Therefore, in this experiment, we analyzed detection speed before and after using the proposed method. In order to measure detection speed, we measured how many packets were received in each session until detection was completed. The experimental results for each dataset are shown in Tables 11 to 13. As seen in Algorithm 1, the proposed method includes both packet-based detection and detection of session-based behaviors; therefore, we measured the number of packets required for both. With ISCX2012, the proposed method detected intrusions slightly faster than the existing method. For a DDoS, the proposed method can reduce the number of packets required for detection by 40%, and across all classes, it can reduce the number by 28% on average. Although the proposed method is not designed to focus on improving detection speed, it significantly improved speed, which shows that it is of great help in improving the performance of an existing NIDS.
With the CIC-IDS2017 dataset, there was a significant improvement for some classes, such as a 36% performance improvement against a DDoS, and a 33% performance improvement against SSH-Patator.
Unlike the other two datasets, the CSE-CIC-IDS2018 dataset showed significant performance improvement for classes with a very long detection length. For example, if the proposed method is not used, detecting Brute Force-WEB and Brute Force-XSS required 151.2 and 202.7 packets, respectively, whereas using the proposed method, only 38 and 78.5 packets needed to be received, improving performance by 75% and 61%.
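The quoted improvements follow directly from the packet counts. The short check below reproduces the arithmetic; the function name is illustrative.

```python
def reduction_percent(before, after):
    """Relative reduction in packets needed for detection, in percent."""
    return 100.0 * (before - after) / before


# Brute Force-WEB: 151.2 packets without the method, 38 with it.
web = reduction_percent(151.2, 38)
# Brute Force-XSS: 202.7 packets without the method, 78.5 with it.
xss = reduction_percent(202.7, 78.5)
print(round(web), round(xss))  # prints: 75 61
```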
In the results from the three datasets, detection speed for most classes improved, and only a few classes showed the same performance. In particular, the longer the session is, the greater the improvement in detection speed, which can be a great advantage. Long sessions usually consume a lot of NIDS memory, because the NIDS requires all the data for each packet to create a feature after the session ends. Because the proposed method can classify such long sessions well before session termination, it can significantly reduce the amount of memory used for the sessions and improve detection performance at the same time.

D. TOTAL CLASSIFICATION NUMBER REQUIRED FOR DETECTION
Unlike the existing ML-NIDS, the proposed method may make multiple classifications until one session is successfully classified. Now, let the number of classifications be defined as "the number of classifications required until one session is classified." This is important because more classifications require more processing power to classify one session, so a hardware platform with better performance is required. In the end, the closer the number of classifications is to one, the higher the possibility of implementing the proposed algorithm on the existing session-based NIDS hardware platform. As shown in Tables 14 to 16, the average number of classifications for each dataset differed greatly for each class. This is because the total session length for each class and the size of $\theta_i$ for each class were different. However, in Tables 14 to 16, the average number of classifications for the entire dataset does not exceed 3. In particular, the average numbers of classifications with the CIC-IDS2017 and ISCX2012 datasets were 1.29 and 1.58, respectively, which is not a significant increase compared to the existing session-based classification. Therefore, even if the proposed method is implemented on the existing hardware platform, there is no significant performance degradation.

V. CONCLUSION
The most important thing in the ML-NIDS is the training dataset used to create the classifier model. However, it is impossible to obtain a training dataset including all network intrusions that occur in the wild. Rather, it is important to find a way to accurately detect an intrusion by utilizing an existing dataset, even if the intrusion data it contains are insufficient. In this paper, a new approach to solve this problem is presented. Using various datasets, we showed that the proposed method can greatly mitigate the weaknesses of the existing ML-NIDS. Of course, there is still much room for improvement in the proposed method. For example, it may not be sufficient to determine whether the learning range is exceeded using only the forward packet count feature. However, if multiple features are considered, the number of sessions that can be processed per second decreases. Also, for some classes, the improvement in detection rate is small. Despite these weaknesses, it is a great advantage to be able to broadly expand the classification range in the feature space by using a dataset consisting of limited data. In addition, classification speed can also be improved, so it is expected that the proposed method, when installed in actual NIDS equipment, will be of great help in keeping large networks safe. In future work, we will focus on how to extend the current result to support multiple features. If such a solution is found, the ML-NIDS can maximize the detection rate without deteriorating the classification speed.
TAEHOON KIM received the B.S. degree in information and communication engineering from Yeungnam University, in 2018, where he is currently pursuing the M.S. degree. His current research interests include high-speed network intrusion detection and prevention based on machine-learning.
WOOGUIL PAK (Member, IEEE) received the B.S. and M.S. degrees in electrical engineering and the Ph.D. degree in electrical engineering and computer science from Seoul National University, in 1999, 2001, and 2009, respectively. In 2010, he joined the Jangwee Research Institute for National Defence as a Research Professor, and Keimyung University, Daegu, South Korea, in 2013. He is currently an Associate Professor at Yeungnam University, Gyeongsan, South Korea. His current research interests include network and system security, blockchain, user behavior analytics based on machine learning, and network security for high speed networks.
VOLUME 10, 2022