Introduction
With the growing popularity of networks, more and more enterprises and individuals store important data online, which increases network traffic; at the same time, the traffic becomes more complicated, posing a greater challenge to traffic analysis. Abnormal traffic on a network occupies bandwidth, causes congestion, and consumes large amounts of memory, Central Processing Unit (CPU) resources, etc., preventing the network from providing normal services. Monitoring and identifying abnormal network traffic is therefore a very important task for network security [1]. By monitoring and identifying abnormal traffic, anomalies can be detected in a timely manner to better ensure the transmission of useful packets, thus keeping the network running smoothly. With the development of technology, many mathematical models have been applied to the monitoring and identification of abnormal network traffic [2]. Based on big-data research, abnormal traffic is monitored and identified using neural networks, machine learning, deep learning, and other methods. However, most current studies on monitoring and identifying abnormal network traffic focus on a single method and on improving algorithm performance through refinement and optimization. For example, Singh et al. [3] designed an online sequential extreme learning machine (OS-ELM) approach, conducted experiments on NSL-KDD 2009, and found that the algorithm achieved an accuracy of 98.66% with a false alarm rate of 1.74%. Liu et al. [4] designed a mathematical method for partial matching of immune elements, which can evolve in parts, and found through experiments that the method had good adaptive performance. Roselin et al. [5] used an optimized deep clustering (ODC) algorithm combined with a deep autoencoder to detect malicious network traffic and found through experiments that the method performed well. Nie et al.
[6] proposed a method combining convolutional neural networks and reinforcement learning for anomaly detection in vehicular ad hoc networks and found through experiments that it was effective. Li et al. [7] proposed a model integrating temporal and spatial features using a three-layer parallel network structure, conducted experiments on the ISCX-IDS 2012 and CICIDS 2017 datasets, and found that the method improved detection accuracy. Ma et al. [8] combined multi-scale Deep-CapsNet and adversarial reconstruction, optimized Deep-CapsNet with multi-scale convolution capsules, reduced noise interference with an adversarial training strategy, and found through experiments that the method showed better accuracy in both two-class and multi-class classification. Zhang et al. [9] designed a parallel cross convolutional neural network integrating two branch convolutional neural networks to detect unbalanced abnormal traffic and found through experiments that the method required less detection time and achieved better accuracy. Pan et al. [10] proposed a density- and distance-based K-means algorithm and found through experiments that the method was feasible and stable. Lei et al. [11] studied low-rate distributed denial-of-service (LDDoS) attacks and designed a signal processing technique based on the wavelet transform; they found that the technique effectively identified LDDoS attacks. Li et al. [12] proposed an active defense-based router anomaly traffic detection strategy to address the problem of single-router anomaly arbitration information in mimic defense and found through experiments that the method effectively detected network attacks. Ding et al. [13] designed an efficient bi-directional simple recurrent unit (BiSRU) and compressed the original high-dimensional features with a stacked sparse autoencoder (sSAE); they found through experiments that it was advantageous in terms of accuracy and training time. Liu et al.
[14] proposed a leaf node density ratio-based detection method for unknown anomalous network traffic data and found, by comparing it with methods such as the extended isolation forest, that the method had good accuracy and efficiency. Compared with the recent literature, in addition to improving traditional methods (BPNN and Elman are improved with the grasshopper optimization algorithm (GOA) for monitoring and identifying anomalous network traffic), this paper also compares multiple methods to demonstrate the advantages of a method more convincingly, providing a better reference for network monitoring.
Monitoring Identification Methods for Abnormal Traffic on the Network
The sources of abnormal traffic in a network can be divided into two types: one is caused by the network itself, such as an unreasonable network structure or equipment unavailability; the other is caused by attacks on the network, which are the object of abnormal traffic monitoring and identification. Currently, common attacks are as follows.
Distributed Denial of Service (DDoS) [15]: distributed, large-scale network attacks launched by multiple attackers against one or more hosts, causing the hosts to overload and crash. It is highly hazardous.
Remote to Local (R2L): exploits system vulnerabilities to log in to a host remotely and gain privileges for illegal operations.
Probe: Scan the target's IP, port, etc., and conduct targeted attacks on the network.
User to Root (U2R): Exploit system vulnerabilities to gain access and steal important information.
In the network, there are large differences in the characteristics of abnormal traffic and normal traffic, and these characteristics can be learned through mathematical models to distinguish them.
Different Mathematical Models
The dimensionality of network traffic data is high [16]. During monitoring and recognition, using all features in the calculation makes recognition inefficient; therefore, before applying a mathematical model for monitoring and recognition, this paper first uses the mutual information (MI) method [17] to select features. For a feature set $F$, let the target class be $C$ and the set of already selected features be $S$; the next feature is selected by \begin{equation*}
I=\arg\max_{f_{i}\in F}\left(I(C;f_{i})-\beta\sum_{f_{s}\in S}I(f_{i};f_{s})\right)\tag{1}\end{equation*}
\begin{equation*}
I(C;f_{i})=\int_{C}\int_{f_{i}}p(C, f_{i})\log\frac{p(C,f_{i})}{p(C)p(f_{i})}\,dC\,df_{i}\tag{2}\end{equation*}
\begin{equation*}
I(f_{i};f_{s})=\int_{f_{i}}\int_{f_{s}}p(f_{i}, f_{s})\log\frac{p(f_{i},f_{s})}{p(f_{i})p(f_{s})}\,df_{i}\,df_{s}\tag{3}\end{equation*}
The mutual information value of every feature is calculated, the results are ranked, and the top $k$ features with the largest values are selected to reconstruct the dataset. The network traffic is then monitored and identified using a mathematical model.
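The ranking step can be sketched as follows. This is a minimal sketch assuming discrete feature values and dropping the redundancy term of Equation (1) (i.e., $\beta = 0$); the helper names `mutual_information` and `select_top_k` are illustrative, not from the paper.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Empirical mutual information I(X;Y) of two discrete sequences (in nats)."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        p_joint = c / n
        mi += p_joint * math.log(p_joint / ((px[x] / n) * (py[y] / n)))
    return mi

def select_top_k(features, labels, k):
    """Rank feature columns by I(C; f_i) and keep the k highest."""
    scores = [(i, mutual_information(col, labels)) for i, col in enumerate(features)]
    scores.sort(key=lambda t: t[1], reverse=True)
    return [i for i, _ in scores[:k]]

labels      = [0, 0, 1, 1, 0, 1, 0, 1]
informative = [0, 0, 1, 1, 0, 1, 0, 1]   # copies the label: maximal MI
noisy       = [0, 1, 0, 1, 0, 1, 0, 1]   # weakly related to the label: low MI
print(select_top_k([informative, noisy], labels, 1))  # feature 0 is kept
```

On real traffic data, continuous features would first need to be discretized (or the integral form of Equations (2)-(3) estimated), which this sketch omits.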
For the monitoring and recognition of network traffic, this paper compares several different mathematical models, as follows.
(1) KNN Mathematical Model
The KNN model is a commonly used pattern recognition algorithm [18] that distinguishes anomalies by comparing distances and has been widely used in information retrieval and data classification [19]. Its principle is as follows: for a datum to be recognized, the $K$ data closest to it are found, and their labels are compared to determine whether the datum is anomalous. The distance is usually the Euclidean distance:
\begin{equation*}
d(x, y)=\sqrt{\sum_{k=1}^{n}(x_{k}-y_{k})^{2}}\tag{4}\end{equation*}
When KNN is used for monitoring and recognizing network traffic, it is assumed that there are a number of labeled training samples; for every sample to be recognized, its distances to all training samples are computed, and its class is determined by a majority vote among the $K$ nearest samples.
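The KNN principle above can be sketched as follows (a minimal illustration with hypothetical two-dimensional samples; the labels `normal`/`attack` are placeholders, not the dataset's classes):

```python
import math
from collections import Counter

def euclidean(x, y):
    # Equation (4): d(x, y) = sqrt(sum_k (x_k - y_k)^2)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def knn_predict(train_x, train_y, query, k):
    """Label the query by majority vote among its k nearest training samples."""
    nearest = sorted(zip(train_x, train_y), key=lambda p: euclidean(p[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

train_x = [(0.1, 0.2), (0.2, 0.1), (0.9, 0.8), (0.8, 0.9)]
train_y = ["normal", "normal", "attack", "attack"]
print(knn_predict(train_x, train_y, (0.85, 0.85), k=3))  # "attack"
```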
(2) BPNN Mathematical Model
The BPNN model is one of the most widely used neural network models [20], with strong self-learning and fault-tolerance capabilities. The classical BPNN has a three-layer structure. For a given training sample, let $a_{i}$ ($i=1,2,\cdots,n$) denote the input, $w_{ij}$ the weight between input node $i$ and hidden node $j$, and $\theta_{j}$ the threshold of hidden node $j$; the output $b_{j}$ of hidden node $j$ is \begin{align*}s_{j}& =\sum_{i=1}^{n}w_{ij}a_{i}-\theta_{j}\tag{5}\\b_{j}& =f(s_{j})=\frac{1}{1+e^{-s_{j}}}\tag{6}\end{align*}
For output node $t$, the input $l_{t}$ and output $y_{t}$ are \begin{align*}
& l_{t}=\sum_{j=1}^{p}v_{jt}b_{j}-\gamma_{t}\tag{7}\\
& y_{t}=f(l_{t})\tag{8}\end{align*}
where $v_{jt}$ is the weight between hidden node $j$ and output node $t$, $\gamma_{t}$ is the threshold of output node $t$, and $p$ is the number of hidden nodes.
The BPNN model adjusts the weights and thresholds by back-propagation of the error to bring the error to the target value. The error is calculated by the following formula:
\begin{equation*}
E_{k}=\sum_{t=1}^{q}\frac{(y_{t}-\hat{y}_{t})^{2}}{2}\tag{9}\end{equation*}
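The forward pass and error of Equations (5)-(9) can be traced with a minimal pure-Python sketch (the weights, thresholds, and sample values below are hypothetical, not the trained model from the experiments):

```python
import math

def sigmoid(s):
    # Equation (6): f(s) = 1 / (1 + e^{-s})
    return 1.0 / (1.0 + math.exp(-s))

def bpnn_forward(a, w, theta, v, gamma):
    """Forward pass of a three-layer BPNN following Equations (5)-(8)."""
    # hidden layer: s_j = sum_i w_ij * a_i - theta_j, then b_j = f(s_j)
    b = [sigmoid(sum(w[i][j] * a[i] for i in range(len(a))) - theta[j])
         for j in range(len(theta))]
    # output layer: l_t = sum_j v_jt * b_j - gamma_t, then y_t = f(l_t)
    return [sigmoid(sum(v[j][t] * b[j] for j in range(len(b))) - gamma[t])
            for t in range(len(gamma))]

def error(y, y_hat):
    # Equation (9): E_k = sum_t (y_t - y_hat_t)^2 / 2
    return sum((yt - yh) ** 2 for yt, yh in zip(y, y_hat)) / 2

a = [0.5, 0.8]                   # one input sample
w = [[0.1, 0.4], [0.3, 0.2]]     # input->hidden weights w_ij
theta = [0.1, 0.1]               # hidden-layer thresholds
v = [[0.6], [0.9]]               # hidden->output weights v_jt
gamma = [0.2]                    # output-layer thresholds
y = bpnn_forward(a, w, theta, v, gamma)
print(error(y, [1.0]))
```

Training would then back-propagate this error to adjust $w_{ij}$, $v_{jt}$, $\theta_{j}$, and $\gamma_{t}$, which the sketch omits.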
(3) GOA-BPNN Mathematical Model
The main drawbacks of the BPNN model are that it easily falls into local minima and converges slowly. This paper uses the GOA to find the optimal weights and thresholds of the BPNN, thus optimizing it into the GOA-BPNN model.
The GOA is inspired by the predatory behavior of grasshoppers [21]. The search for food is divided into two phases, exploration and exploitation, corresponding to global and local search. Assuming that the population size is $N$, the social interaction term $s_{i}$ of the $i$-th grasshopper is \begin{equation*}s_{i}=\sum_{j=1,j\neq i}^{N}s(d_{ij})\hat{d}_{ij}\tag{10}\end{equation*} where $d_{ij}$ is the distance between grasshoppers $i$ and $j$, $\hat{d}_{ij}$ is the corresponding unit vector, and $s(\cdot)$ defines the strength of the social force.
To simplify the model, it is assumed that the wind always blows toward the target and that the grasshoppers' gravity is negligible. With a decreasing scaling factor $c$ to balance exploration and exploitation, the position of the $i$-th grasshopper in the $d$-th dimension is updated by \begin{gather*}
x_{i}^{d}=c\left(\sum_{j=1,j\neq i}^{N}c\frac{ub_{d}-lb_{d}}{2}s(|x_{j}^{d}-x_{i}^{d}|)\frac{x_{j}-x_{i}}{d_{ij}}\right)+\hat{T}_{d}\tag{11}\\
c=c_{\max}-k\frac{c_{\max}-c_{\min}}{K}\tag{12}\end{gather*}
where $ub_{d}$ and $lb_{d}$ are the upper and lower bounds of the $d$-th dimension, $\hat{T}_{d}$ is the $d$-th dimension of the target (best) position, $k$ is the current iteration, and $K$ is the maximum number of iterations.
In the GOA-BPNN model, the GOA is first initialized: the population size and number of iterations are set, and the fitness function is defined as the error of the BPNN. The GOA is then used to optimize the parameters of the BPNN, and the optimized parameters are input into the BPNN to obtain the GOA-BPNN model, which is trained on the data.
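To illustrate how the GOA serves as a parameter optimizer, the following sketch applies a simplified per-dimension form of Equations (11)-(12) to a toy fitness function standing in for the BPNN training error. The population settings mirror Section 4 (20 grasshoppers, 100 iterations), but everything else — the bounds, the $s(r)$ constants $f=0.5$, $l=1.5$, and the toy fitness — is an assumption for illustration only.

```python
import math, random

def s(r):
    # social force s(r) = f * e^{-r/l} - e^{-r}, with the common settings f=0.5, l=1.5
    return 0.5 * math.exp(-r / 1.5) - math.exp(-r)

def goa_minimize(fitness, dim, n=20, iters=100, lb=-1.0, ub=1.0):
    """Simplified GOA: positions updated per Equations (11)-(12), target = best so far."""
    random.seed(0)
    pop = [[random.uniform(lb, ub) for _ in range(dim)] for _ in range(n)]
    best = min(pop, key=fitness)[:]
    c_max, c_min = 1.0, 0.00004
    for k in range(1, iters + 1):
        c = c_max - k * (c_max - c_min) / iters       # Equation (12)
        new_pop = []
        for i in range(n):
            x_new = []
            for d in range(dim):
                social = 0.0
                for j in range(n):
                    if j == i:
                        continue
                    dist = abs(pop[j][d] - pop[i][d]) + 1e-12
                    unit = (pop[j][d] - pop[i][d]) / dist
                    social += c * (ub - lb) / 2 * s(dist) * unit
                xd = c * social + best[d]             # Equation (11), per dimension
                x_new.append(min(ub, max(lb, xd)))
            new_pop.append(x_new)
        pop = new_pop
        cand = min(pop, key=fitness)
        if fitness(cand) < fitness(best):
            best = cand[:]
    return best

# stand-in fitness: in GOA-BPNN this would be the BPNN training error E_k
sphere = lambda x: sum(v * v for v in x)
best = goa_minimize(sphere, dim=2)
print(sphere(best))
```

In the actual GOA-BPNN model, `fitness` would flatten a candidate position into BPNN weights and thresholds and return the training error of Equation (9).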
(4) Elman Mathematical Model
The Elman neural network [22] is obtained by adding a context (association) layer to a BPNN; this layer provides storage and delay and improves the network's ability to process information. For Elman, let $u(k-1)$ be the input at time $k-1$, and let $x_{o}(k)$, $x_{c}(k)$, and $y(k)$ be the outputs of the hidden layer, the context layer, and the network at time $k$; then \begin{gather*}
y(k)=g(w_{3}x_{o}(k))\tag{13}\\
x_{o}(k)=f\{w_{2}x_{c}(k)+w_{1}[u(k-1)]\}\tag{14}\\
x_{c}(k)=x_{o}(k-1)\tag{15}\end{gather*}
where $w_{1}$, $w_{2}$, and $w_{3}$ are the weights from the input layer to the hidden layer, from the context layer to the hidden layer, and from the hidden layer to the output layer, respectively, and $f(\cdot)$ and $g(\cdot)$ are activation functions.
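Equations (13)-(15) can be traced with a scalar sketch (an illustration only: the weights are scalars, and $g$ is taken to be the same sigmoid as $f$, which the paper does not specify):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def elman_step(u_prev, x_c, w1, w2, w3):
    """One Elman time step per Equations (13)-(15), with scalar weights."""
    x_o = sigmoid(w2 * x_c + w1 * u_prev)   # Eq. (14): hidden-layer output
    y = sigmoid(w3 * x_o)                   # Eq. (13): network output (g = f assumed)
    return y, x_o                           # x_o becomes x_c at the next step, Eq. (15)

x_c = 0.0                                   # context starts empty
for u in [0.2, 0.5, 0.9]:                   # feed a short input sequence
    y, x_c = elman_step(u, x_c, w1=0.8, w2=0.5, w3=1.2)
print(y)
```

The feedback through `x_c` is what gives Elman its memory of earlier inputs, which a plain BPNN lacks.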
(5) GOA-Elman Mathematical Model
Referring to the GOA-BPNN model, the parameters of the Elman model are optimized by the GOA. First, the population is initialized and fitness values are calculated. Then, the optimal parameters of the Elman model are found through continuous updating, and the obtained parameters are used as the initial parameters of the Elman model, yielding the GOA-Elman mathematical model for monitoring and identifying abnormal network traffic.
Experiment and Analysis
4.1 Experimental Setup
The experimental environment was an Intel(R) Core(TM) i7-3770 3.40 GHz CPU, 4 GB of memory, and the Windows 10 operating system. The experiments were conducted in the MATLAB R2013a environment, and programming was performed in Eclipse using the Java language. The experimental dataset was UNSW-NB15 [23], which includes 2,540,044 records; it is shown in Table 1. In addition to normal traffic, nine types of attacks are included, and every record has 49 features, as shown in Table 2. When monitoring and identifying abnormal traffic with a model, feature selection was performed first, and the selected features were used as input to the mathematical model for training. The performance of the model was then tested on the test set.
In the neural network models, a three-layer structure was used. The number of input-layer nodes equaled the feature dimension, the number of output-layer nodes equaled the number of traffic types, and the number of hidden-layer nodes was determined by the formula:
\begin{gather*}
N_{hidden}=\sqrt{N_{in}+N_{out}}+a\tag{16}\\
a=2\tag{17}\end{gather*}
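For the dataset used here, Equation (16) can be evaluated as below (rounding the result to an integer is an assumption; the paper does not state how the non-integer value is handled):

```python
import math

def hidden_nodes(n_in, n_out, a=2):
    # Equation (16): N_hidden = sqrt(N_in + N_out) + a, rounded to an integer
    return round(math.sqrt(n_in + n_out) + a)

print(hidden_nodes(10, 10))  # sqrt(20) + 2 ≈ 6.47 -> 6
```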
In the GOA, the population size was set as 20, and the maximum number of iterations was 100. Before monitoring and recognition, features were selected using the mutual information method, and the top ten ranked features were used as the input to the mathematical models.
4.2 Evaluation Indicators
The evaluation of different algorithmic models was based on the confusion matrix (Table 3). The evaluation indicators are shown in Table 4.
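The indicators derived from a binary confusion matrix are conventionally computed as follows (a sketch with made-up counts, since Tables 3 and 4 are not reproduced here; TP, FP, FN, and TN denote true positives, false positives, false negatives, and true negatives):

```python
def indicators(tp, fp, fn, tn):
    """Standard evaluation indicators from a binary confusion matrix."""
    accuracy  = (tp + tn) / (tp + fp + fn + tn)   # fraction of all correct decisions
    precision = tp / (tp + fp)                    # correctness of positive predictions
    recall    = tp / (tp + fn)                    # coverage of actual positives
    f1        = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

acc, p, r, f1 = indicators(tp=90, fp=5, fn=10, tn=95)
print(round(acc, 4), round(p, 4), round(r, 4), round(f1, 4))  # 0.925 0.9474 0.9 0.9231
```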
4.3 Experimental Results
When using mutual information for feature selection, the top k features with the largest value of mutual information were retained. The KNN algorithm was used as the basis for comparing the accuracy of the algorithm when the number of retained features varied. The results are shown in Figure 1.
It was seen from Figure 1 that the accuracy of the KNN algorithm gradually increased as more features were retained; when the retained feature dimension reached ten, the accuracy of the KNN algorithm reached 79.64% and remained stable thereafter. This result indicated that the algorithm could achieve high accuracy with ten retained features; therefore, the feature dimension was set as ten in subsequent experiments.
The operation times before and after feature selection were compared across the different mathematical models. The inputs of the models were the 49-dimensional features without feature selection and the ten-dimensional features with feature selection. The comparison of the operation times is shown in Figure 2.
It was seen from Figure 2 that the operation time with the 49-dimensional features as model input was significantly longer than with the ten-dimensional features. Taking the KNN algorithm as an example, the operation time was 564.36 s with the 49-dimensional features and 376.84 s with the ten-dimensional features, i.e., the latter saved 33.23% of the time, verifying that feature selection effectively improved operation efficiency. With the ten-dimensional features as model input, the KNN model had the longest operation time, 376.84 s, followed by the BPNN model (292.33 s). Compared with the BPNN model, the Elman model had a shorter operation time, 268.45 s, indicating that the Elman model converged faster than the BPNN model. After GOA optimization, the operation times of both neural network models were reduced significantly: the operation time of the GOA-BPNN model was 231.46 s, 20.82% less than that of the BPNN model, and the operation time of the GOA-Elman model was 136.79 s, 49.04% less than that of the Elman model. In conclusion, the GOA-Elman model had the greatest advantage in terms of operation time.
The performance of different mathematical models for monitoring and identifying abnormal network traffic was compared, and the results are shown in Figure 3.
It was seen from Figure 3 that, in general, all the indicators of the KNN model were below 80%, indicating average performance in monitoring and identifying abnormal traffic. The comparison of the four neural network models showed that the BPNN model performed worse than the Elman model: their accuracies were 82.33% and 86.77%, respectively, i.e., the accuracy of the Elman model was 4.44% higher. The BPNN model also performed worse than the GOA-BPNN model, and the Elman model worse than the GOA-Elman model, indicating that the performance of both models was improved after GOA optimization. The accuracy of the GOA-Elman model was 97.33%, 10.56% higher than that of the Elman model; its recall rate was 95.64%, 14.52% higher than that of the Elman model; its precision was 98.36%, 10.42% higher than that of the Elman model; and its F1 value was 95.78%, 10.11% higher than that of the Elman model. The results suggested that the GOA-Elman model was reliable in monitoring and recognizing abnormal traffic.
Figure 3 Comparison results of abnormal network traffic monitoring and recognition performance.
The monitoring and recognition results of different types of traffic with the GOA-Elman model are shown in Figure 4.
It was seen from Figure 4 that all the indicators of the model were above 90%; overall, the model performed well in monitoring and identifying different types of traffic. Taking Normal as an example, the accuracy was 99.87% and the F1 value was 97.64%. In comparison, the model performed slightly worse on Shellcode and Worms, with F1 values of 92.36% and 93.61%, respectively, which may be because the small numbers of samples for these two types made the model training inadequate. In general, the GOA-Elman mathematical model could monitor and identify different types of traffic accurately.
Figure 4 The monitoring and recognition results of different traffic types with the GOA-Elman mathematical model.
Analysis
Network traffic is an important reflection of the current network situation. Analyzing network traffic helps network managers detect attacks in a timely manner and take appropriate interception measures, thereby maintaining network security. Therefore, monitoring and identifying abnormal network traffic is very important. Many mathematical models have been applied to this task, and this paper mainly compared and analyzed the performance of several of them.
First, the experimental results showed that feature selection affected the performance of the mathematical models. This paper selected features with the mutual information-based method. Reducing the feature dimension from 49 to 10 improved not only the computational efficiency but also the performance of the mathematical models. The comparison of different mathematical models demonstrated that improving neural network models with the optimization algorithm enhanced both the computational efficiency and the monitoring and recognition performance. Among all the compared algorithms, the KNN model performed poorly, reflected in a long operation time and low monitoring and recognition accuracy. The pairwise comparison showed that the Elman model performed better than the BPNN model, and accordingly the GOA-Elman model also outperformed the GOA-BPNN model. As seen from Figure 3, the accuracy, recall rate, precision, and F1 value of the GOA-Elman model were 97.33%, 95.64%, 98.36%, and 95.78%, respectively, significantly better than the other three mathematical models.
The results of monitoring and recognizing different types of traffic in the dataset with the GOA-Elman model showed that the model performed best on normal traffic and relatively poorly on Shellcode and Worms, which have small sample sizes; however, the overall accuracy and precision were above 90%, which can satisfy the needs of abnormal traffic monitoring and recognition in practice.
This paper obtained some outcomes, but there are also some shortcomings. In future research, we need to:
compare more new mathematical models,
conduct experiments on more datasets to further understand the performance of mathematical models,
apply mathematical models in real network environments to understand their value in practice.
Conclusion
This paper introduced several mathematical models for monitoring and recognizing abnormal network traffic and compared them on the UNSW-NB15 dataset. It was found that the KNN model had the longest operation time; the Elman model outperformed the BPNN model; and after GOA optimization, the performance of both the BPNN and Elman models was greatly improved, with the GOA-Elman model performing the best. This research verified the reliability of the GOA-Elman model in monitoring and recognizing abnormal network traffic through comparison; the model can be further promoted and applied in practice.