Application Layer DDoS Attack Detection Using Cuckoo Search Algorithm-Trained Radial Basis Function

In an application-layer distributed denial of service (App-DDoS) attack, zombie computers bring down the victim server with valid requests. Intrusion detection systems (IDS) cannot identify these requests because they take the legitimate form of standard TCP connections. Researchers have suggested several techniques for detecting App-DDoS traffic. There is, however, no clear distinction between legitimate and attack traffic. In this paper, we go a step further and propose a Machine Learning (ML) solution that combines the Radial Basis Function (RBF) neural network with the cuckoo search algorithm to detect App-DDoS traffic. We begin by collecting and cleaning the training data, then normalize the data and find an optimal subset of features using the Genetic Algorithm (GA). Next, an RBF neural network is trained on the optimal subset of features using the cuckoo search optimizer. Finally, we compare our proposed technique to the well-known k-nearest neighbor (k-NN), Bootstrap Aggregation (Bagging), Support Vector Machine (SVM), Multi-layer Perceptron (MLP), and Recurrent Neural Network (RNN) methods. Our technique outperforms these standard and well-known ML techniques, as it has the lowest error rate according to the error metrics. Moreover, according to standard performance metrics, the experimental results demonstrate that our proposed technique detects App-DDoS traffic more accurately than previous techniques.


I. INTRODUCTION
In application-layer DDoS attacks (App-DDoS), the attackers send legitimate packets toward the victim server to bring down the server [5], [11]. As the malicious packets mimic the behavior of legitimate users and contain genuine source IP addresses, neither the victim server nor the IDS (Intrusion Detection System) can distinguish the packets of attackers from those of legitimate users [5], [11]. The main goal of DDoS attackers, regardless of their type, is to force victim servers to respond so slowly as to be unusable or to shut down completely. To achieve this goal, they overwhelm the bottleneck resources of the victim server. The desired bottleneck resources for application-layer DDoS attacks are TCP/IP stacks, CPU cycles, memory, I/O bandwidth, and disk/database bandwidth. In App-DDoS attacks, the adversary overwhelms the bottleneck resources via legitimate requests. To launch such an attack, every bot machine that wants to participate in the attack establishes a TCP connection with the victim server, which requires a genuine IP address.
The associate editor coordinating the review of this manuscript and approving it for publication was Zheng Yan.
The main aim of a defense technique against App-DDoS attacks is to detect attack traffic and distinguish it from legitimate traffic. This is difficult because attackers purposely fabricate App-DDoS traffic to look like legitimate traffic. Moreover, professional attackers continuously change their toolkits and develop more sophisticated App-DDoS traffic; hence, detecting attack traffic becomes more difficult. However, once the attack traffic is detected, the victim server can block any traffic coming from attack sources [5], [11].
Several classic, heuristic, and more recently machine-learning (ML) based techniques have been proposed during the last decade to detect App-DDoS traffic and distinguish it from legitimate traffic. Previous classic and heuristic techniques, which we call traditional techniques, detect App-DDoS traffic by rules (e.g., statistical, challenge/response, and time series) programmed by traffic engineers. Traditional approaches have had limited success in catching App-DDoS traffic, as they suffer from low accuracy due to the dynamic and evolving nature of App-DDoS attacks. Some well-known traditional techniques are reviewed in Section II.
Recently, academia and industry have been exploring artificial intelligence techniques, especially ML techniques, to detect App-DDoS traffic. ML techniques such as KNN, SVM, random forest, logistic regression, and Naïve Bayes can accurately classify binary data (data belonging to class YES or class NO) given a large volume of labeled samples (samples with labels YES or NO). In these techniques, human experts select the classification features. In deep-learning (DL) approaches such as CNN and RNN, even feature selection can be done by the machine without human intervention, through a series of nonlinear processing layers. Next, an ML model is customized and trained with the labeled samples via the selected features. The trained ML model is then used to predict the class of incoming data. Therefore, ML techniques can be a powerful tool for detecting App-DDoS traffic. Section II discusses some recent ML techniques that detect App-DDoS traffic.
This paper proposes a novel technique based on machine learning to detect the traffic of App-DDoS attacks. A Radial Basis Function (RBF) neural network is used at the heart of the technique. The Cuckoo Search optimizer Algorithm (CSA) is applied to the RBF network to enhance the detection power of the network. Through the Genetic Algorithm (GA), the most valuable features of network traffic, which play the main role in detecting App-DDoS traffic, are determined and used to train the RBF network. The trained RBF network accurately detects attack traffic and distinguishes it from legitimate traffic. Experimental results show that our proposed technique improves detection accuracy on average by 3% and 6% compared to the Bootstrap Aggregation (Bagging) technique and k-nearest neighbor (KNN), respectively. The proposed technique also performs better than well-known machine learning techniques such as support vector machine (SVM), multi-layer perceptron (MLP), and recurrent neural network (RNN).
The main contributions of the proposed technique are as follows.
1) Utilizing a Radial Basis Function (RBF) neural network to distinguish App-DDoS traffic from legitimate traffic.
2) Utilizing the Genetic Algorithm (GA) to identify the most valuable features of the dataset in order to increase the accuracy of attack detection.
3) Applying the Cuckoo Search Algorithm (CSA) to the RBF neural network to better train the network.
4) Showing experimentally that the combination of CSA + GA + RBF outperforms other well-known machine learning-based techniques such as KNN, Bagging, SVM, RNN, and MLP.
The rest of the paper is organized as follows: Sections II, III, IV, and V present the related work, the proposed technique, the experimental results, and the conclusion, respectively.

II. RELATED WORK
Several App-DDoS traffic detection techniques have been proposed in the last decade. In this section, a few traditional techniques are discussed first, followed by the techniques based on machine learning and deep learning.

A. TRADITIONAL TECHNIQUES
CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) is an elegant mechanism to separate human users from automated software tools [9], [25]. CAPTCHA is the most trusted technique against App-DDoS attacks; however, it faces various challenges such as annoying users, technical difficulties with some browsers, not being robust against image recognition techniques, and third-party users.
In the Whitelisting technique [19], a list of source IP addresses called very important IP addresses (VIPs) is prepared before the attack. When the victim server is under an App-DDoS attack, priority is given to traffic that belongs to the VIP list. The main challenges with this technique are that (1) it is not transparent to users, (2) it has problems with mobile users, and (3) zombie machines and loyal clients that reside behind a NAT (network address translator) have the same privilege to access the server.
Trust Management Helmet (TMH) [20] is a mechanism that distinguishes legitimate users from malicious users using trust. During an App-DDoS attack, instead of identifying all bot machines, priority for establishing sessions with the victim server is given to users with higher trust values. TMH depends on the browser's cookies; however, in many companies and institutes, cookies are deleted due to privacy and security issues.
In the ConnectionScore technique [4], every session is scored based on its history and on statistical analysis performed during normal connections. Sessions that receive lower scores are blocked, and bottleneck resources are reclaimed from them. However, the computational overhead imposed on the server can be large during the attack, as it computes a score for every session.
Authors in [10] proposed a probability-based detection mechanism based on the central limit theorem and the windowing method. Through the probabilistic model, [10] trains an engine offline using distributions of six TCP flags of incoming SYN packets of the DARPA dataset. In the online mode, the trained engine detects malicious traffic that mimics normal packets with spoofed information. [10] shows that their probabilistic technique is better than the entropy-based strategy in terms of false-negative rate (FNR) in various attack circumstances. The main problem with the technique is that the distribution of six TCP flags is not a robust criterion for App-DDoS traffic detection. Due to this limitation, the authors only considered the SYN flood attack in their experiments.

B. MACHINE-LEARNING BASED TECHNIQUES
Authors in [21] use machine learning and deep learning to identify transport and application layer DDoS attacks in software-defined networks (SDNs). [21] includes four modules: flow collector, preprocessing, detection, and flow manager. The flow collector module gathers the data packets from the network, creates the flows, and then forwards them to the preprocessing module. In the preprocessing module, data is cleaned. The detection module utilizes a trained model based on various machine learning and deep learning techniques to classify the input as suspicious or benign. The model is trained based on KNN, SVM, RF (Random Forest), MLP, CNN (Convolutional Neural Network), GRU (Gated Recurrent Units), and LSTM (Long Short-Term Memory) classifiers. Two recent security datasets, CICDoS2017 and CICDoS2019, are used to train the model. Finally, the flow manager module informs the controller about the suspicious traffic.
Authors in [15] trained a model based on the decision tree (DT) approach for an imbalanced dataset in which normal traffic dominates attack traffic. Most classification techniques are normally biased toward the majority class (i.e., normal traffic). [15] proposed an ensemble-based method using K-Means, RUSBoost, and DT approaches to mitigate the class imbalance problem. The approach selected DT as the best classifier after optimizing the hyper-parameters in terms of accuracy, fast training time, and improved average prediction rate.
Feng et al. [8] propose a reinforcement-learning-based model to detect and mitigate App-DDoS attacks. The model is continuously trained with various metrics related to the server's load, clients' dynamic behaviors, and the victim's network load. The model utilizes the Markov decision process to construct the attack classification model. The reward function of the reinforcement-learning model is a multi-objective function that guides a reinforcement-learning agent to learn the most suitable action in mitigating App-DDoS attacks.
Banerjee et al. [3] propose two modules that work sequentially to detect and mitigate DDoS attacks. The first module, called the Signature IDS, uses machine learning algorithms like Naive Bayes, KNN, and K-Means to classify incoming packets as normal or anomalous. Hence, all the possible intruders in the incoming traffic are predicted beforehand. Among various ML techniques, the one giving the best possible outcome is implemented in the Signature IDS. The second module performs a three-way handshake to identify the exact host that is the intruder.
Authors in [16] propose a DDoS attack detection architecture that integrates Bi-Directional Long Short-Term Memory (Bi-LSTM), a Gaussian Mixture Model (GMM), and incremental learning. Unknown traffic is captured by the GMM unit and is then labeled by data engineers for discrimination. The labeled traffic is then fed back to the Bi-LSTM and the GMM for incremental learning. The Bi-LSTM is used to discriminate malicious traffic from legitimate traffic.
Alghazzawi et al. [2] investigate a hybrid model of CNN and Bi-LSTM for DDoS attack classification. The chi-squared ($\chi^2$) test is used to identify highly related features. Next, a CNN network is used to extract the high-rated features. These features are fed to the Bi-LSTM model to distinguish attack packets from normal packets.
A hybrid model of random forest (RF) and XGBoost is presented in [12] to predict DDoS attack traffic. The benchmark data is pre-processed to handle irrelevant data. Next, features are extracted from the data, and the data is labeled. The hybrid model is trained with the extracted features and labeled data.
Table 1 summarizes and compares previous machine-learning based techniques across various parameters. In the table, the '+' operator is used when two (or more) techniques are used in parallel, and '→' is used when two (or more) techniques are used in series (the output of one technique becomes the input of the next one).

III. THE PROPOSED TECHNIQUE
The proposed algorithm includes data gathering, data cleaning, data normalization, feature selection based on the genetic algorithm (GA), and training of the Radial Basis Function (RBF) neural network based on the cuckoo search optimizer. Fig. 1 shows the general flowchart of the proposed technique. The details of each step are explained as follows.

A. DATA GATHERING
The data is taken from the NSL-KDD dataset [17]. The NSL-KDD dataset is derived from the KDD Cup 99 dataset, one of the most widely available datasets for the research community. The NSL-KDD dataset includes legitimate traffic (48%), DoS traffic (35%), Probe attacks (10.55%), U2R attacks (0.32%), and R2L attacks (6.13%). Each dataset record is made up of 41 qualitative and quantitative features. In our study, we excluded all categories except DoS and legitimate. After excluding all other types of attacks, our dataset has 57% normal traffic and 43% DoS traffic. The dataset was then reduced to 3495 samples while keeping the proportions of normal and DoS traffic unchanged (i.e., 1992 samples are normal and 1503 samples are DoS traffic). We utilized 2796 samples for the training set and 699 samples for the testing set, which constitute 80% and 20%, respectively. Finally, 233 extra samples are randomly selected from the dataset as the holdout dataset, again keeping the proportions of normal data and attack data unchanged. The holdout dataset is used to check the model for overfitting.
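The split sizes quoted above can be reproduced with a class-preserving (stratified) split. The following is a minimal sketch in Python rather than the authors' MATLAB code; the function name `stratified_split`, the 0/1 label encoding, and the seed are illustrative assumptions.

```python
import random

def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx) with each class's share kept in both parts."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)

# Class counts reported in the text: 1992 normal (0) and 1503 DoS (1) samples.
labels = [0] * 1992 + [1] * 1503
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))  # 2796 699
```

Splitting per class, rather than over the pooled samples, is what keeps the roughly 57%/43% ratio the same in the training and testing sets.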

B. DATA CLEANING
In many real-world data mining applications, even with large amounts of data and adequate storage space, some values may be missing from existing samples. This becomes a problem when individual values are too informative to be ignored. Organizations have access to vast volumes of data that affect their business decisions. Data gathered from multiple sources is often dirty, which affects the accuracy of the prediction result. Data cleaning improves data quality, allowing organizations to ensure that their data is suitable for analysis. One solution is to replace the missing data with fixed values [13]. In this work, each missing value is replaced with the average value of the corresponding feature.
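Mean substitution as described above can be sketched as follows. This is an illustrative Python sketch, not the authors' implementation; the helper name `impute_column_means`, the use of `None` as the missing-value marker, and the 0.0 fallback for a fully missing column are assumptions.

```python
def impute_column_means(rows, missing=None):
    """Replace missing entries with the mean of the observed values in the same column."""
    n_cols = len(rows[0])
    means = []
    for c in range(n_cols):
        observed = [row[c] for row in rows if row[c] is not missing]
        # Assumption: a column with no observed values falls back to 0.0.
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [[means[c] if row[c] is missing else row[c] for c in range(n_cols)]
            for row in rows]

rows = [[1.0, None], [3.0, 4.0], [None, 8.0]]
cleaned = impute_column_means(rows)
print(cleaned)  # [[1.0, 6.0], [3.0, 4.0], [2.0, 8.0]]
```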

C. DATA NORMALIZATION
Because features have non-uniform ranges and different units, larger values have a higher influence on the model, but this does not always imply that they are more significant. Feature normalization is a technique for keeping all values within predefined ranges. However, choosing the right normalization technique is crucial because applying normalization to the input might affect the data's structure. The goal of the feature normalization technique is to compensate for the impacts of mismatch in the environment [1]. In this work, data normalization is applied to overcome this problem. The data of every feature are adjusted to the range [−1, 1] using linear normalization [22]:
$$X = 2\,\frac{x - \min(x)}{\max(x) - \min(x)} - 1 \tag{1}$$
where $\min(x)$ is the minimum of the input $x$, $\max(x)$ is the maximum of the input $x$, and $X$ is the normalized $x$.
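The linear mapping of a feature onto [−1, 1] described above can be sketched per feature column as below. The function name `normalize_feature` is hypothetical, and sending a constant feature to 0.0 is an assumption the text does not address.

```python
def normalize_feature(values):
    """Linearly rescale one feature column so min -> -1 and max -> +1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant feature carries no information; map it to 0.
        return [0.0 for _ in values]
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in values]

print(normalize_feature([0.0, 5.0, 10.0]))  # [-1.0, 0.0, 1.0]
```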

D. FEATURE SELECTION
Feature selection is a fundamental topic in ML that significantly influences the model's performance. The data features used to train ML models significantly affect the results that can be obtained. Model performance can be harmed by features that are irrelevant or only partially relevant [7]. Since CSA is a nature-inspired algorithm, we also use a nature-inspired algorithm, the Genetic Algorithm (GA), for feature selection. GA is a stochastic, heuristic optimization method based on the rules of natural genetics and evolution. GA finds the best answer after a series of repeated calculations based on Darwin's "Survival of the Fittest" theory. We use GA for feature selection for the following reasons: (1) GAs commonly outperform conventional feature selection algorithms [24]. (2) For the mortality prediction problem, the GA-selected feature subset for one classifier can be used for others while still obtaining high performance [24]. (3) GAs are capable of managing data sets with a variety of features [24]. (4) GA achieves comparable or even superior predictive outcomes with far fewer features, saving time and cost, making it more advantageous [18], [24]. (5) GAs do not require specific information about the problem under study [18].
Our feature selection procedure based on GA is as follows. The GA technique produces a binary vector, with each bit corresponding to a feature. If the $i$th bit of this vector equals 1, the $i$th feature is permitted to participate in classification; if the bit equals 0, the feature is not permitted to participate. The starting population is produced at random from the sample space of feature sets. A score is assigned to each member of the starting population, which measures the performance of the provided estimator. The estimator used in this study is the Mean Squared Error (MSE).
A tournament selection is conducted to decide which members will continue to the following generation. We set the tournament size to three, which determines the number of members who compete against one another based on the score criterion. The tournament winner is chosen as a parent for the following generation. The child then inherits genetic material from both parents. This operation is called crossover, and we set its probability to 0.5. In addition to the crossover, a random mutation is applied in each generation; we set the mutation probability to 0.2. Finally, we set the population size to 20 and the maximum number of iterations to 100, and we implemented the GA so that the search stops early, before reaching the maximum iteration, if the population's best member has not improved over numerous generations, on the assumption that the search has already yielded an optimal result.
Using the above procedure, we select nine features out of the 41 features. The features selected by GA are listed in Table 2.
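The GA procedure above can be sketched as follows, using the stated settings (population 20, at most 100 generations, tournament size 3, crossover probability 0.5, mutation probability 0.2). This is an illustrative Python sketch, not the authors' code. One-point crossover, single-bit-flip mutation, the `patience` value for early stopping, and the toy cost function (a stand-in for the MSE of a trained estimator) are all assumptions.

```python
import random

def ga_select(n_features, cost, pop_size=20, max_gen=100, tour_size=3,
              p_cross=0.5, p_mut=0.2, patience=10, seed=0):
    """Search binary feature masks for a low-cost subset (lower cost is better)."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    best, stale = min(pop, key=cost), 0

    def tournament():
        # The best of `tour_size` randomly drawn members becomes a parent.
        return min(rng.sample(pop, tour_size), key=cost)

    for _ in range(max_gen):
        children = []
        while len(children) < pop_size:
            a, b = tournament()[:], tournament()[:]
            if rng.random() < p_cross:          # one-point crossover (assumed form)
                cut = rng.randrange(1, n_features)
                a, b = a[:cut] + b[cut:], b[:cut] + a[cut:]
            for child in (a, b):
                if rng.random() < p_mut:        # flip one random bit (assumed form)
                    i = rng.randrange(n_features)
                    child[i] = 1 - child[i]
                children.append(child)
        pop = children[:pop_size]
        gen_best = min(pop, key=cost)
        if cost(gen_best) < cost(best):
            best, stale = gen_best, 0
        else:
            stale += 1
            if stale >= patience:               # early stop: no recent improvement
                break
    return best

# Hypothetical cost: penalize missing informative features and extra noisy ones.
INFORMATIVE = {0, 3, 7}

def toy_cost(mask):
    missed = sum(1 for i in INFORMATIVE if mask[i] == 0)
    extra = sum(1 for i, bit in enumerate(mask) if bit and i not in INFORMATIVE)
    return missed + 0.1 * extra

mask = ga_select(10, toy_cost)
print(mask, toy_cost(mask))
```

In the setting of the paper, `cost(mask)` would train and score the estimator on the features whose bit is 1. The toy cost only illustrates the interface.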

E. RADIAL BASIS FUNCTION NEURAL NETWORK ARCHITECTURE
The Radial Basis Function (RBF) neural network is widely used for the nonparametric estimation of a multidimensional function through a limited set of features. As Fig. 2 shows, these networks include three layers: input layer, hidden layer, and output layer. The duty of the input layer is to assign a neuron to each input feature of the traffic and then feed the hidden layer with the features of the input layer without any change (there is no weight between the input layer and the hidden layer). The hidden layer establishes a nonlinear correspondence between the input space and a space with usually a larger dimension (i.e., the nonlinear RBF transfer functions). Finally, the output layer generates a weighted sum with a linear output. In the case of classification, as in our work, the activation function of the output layer becomes the sigmoid activation function.
The neurons of the hidden layer use a Gaussian function as the RBF, as illustrated in equation 2.
where $X_{C_j}$, $\sigma_j$, and $\phi_j(X)$ denote the center (centroid), the spread width (stretch constant), and the response of the $j$th hidden neuron corresponding to input $X$, respectively. In this equation, the term $\|X - X_{C_j}\|$ is the Euclidean distance between the input vector $X$ and the corresponding Gaussian centroid $X_{C_j}$, which can be calculated as follows:
$$\|X - X_{C_j}\| = \sqrt{\sum_{i=1}^{n} \left(x_i - x_{C_j,i}\right)^2} \tag{3}$$
where $X = [x_1, x_2, \ldots, x_n]$, $x_i$ is feature $i$ of the input layer, $x_{C_j,i}$ is the $i$th component of $X_{C_j}$, and $n$ is the number of features in the input layer. The important point in designing RBF networks is that the nonlinear function of the hidden layer neurons should cover all significant areas of the input vector space.
The output values of the hidden-layer neurons are multiplied by the corresponding weights of each neuron and passed to the output neurons. Each output neuron adds up the weighted values. The final summation passes through the sigmoid activation function to classify the input data as DDoS or normal traffic. Equation 4 shows the output function.
In equation 4, φ is the sigmoid activation function. The RBF neural network is used in the proposed technique because (1) the RBF neural network is fast, which helps us detect DDoS traffic quickly, and (2) the RBF neural network is suitable for cases where the data forms clusters. As App-DDoS requests form clusters from big to small [4], the RBF neural network is a good choice.
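The forward pass described in this section (Gaussian hidden units, weighted sum, sigmoid output) can be sketched for a single output neuron as below. This is an illustrative Python sketch under stated assumptions: the Gaussian is taken in the common form exp(−d² / (2σ²)), whose exact scaling the text does not confirm, no bias term is used, and the name `rbf_forward` and the example parameters are hypothetical.

```python
import math

def rbf_forward(x, centers, sigmas, weights):
    """Single-output RBF network: Gaussian hidden layer followed by a sigmoid."""
    hidden = []
    for center, sigma in zip(centers, sigmas):
        # Squared Euclidean distance between the input and this centroid.
        dist_sq = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
        # Assumed Gaussian form: exp(-d^2 / (2 * sigma^2)).
        hidden.append(math.exp(-dist_sq / (2.0 * sigma ** 2)))
    z = sum(w * h for w, h in zip(weights, hidden))
    return 1.0 / (1.0 + math.exp(-z))   # sigmoid squashes the sum into (0, 1)

# Two input features and three hidden neurons, mirroring the small example in the text.
centers = [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]]
sigmas = [1.0, 1.0, 1.0]
weights = [2.0, -1.0, 0.5]
score = rbf_forward([0.0, 0.0], centers, sigmas, weights)
print(round(score, 3))
```

A score above a threshold such as 0.5 would be read as attack traffic and a lower score as normal traffic. The threshold itself is an assumption, since the text does not state one.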

F. TRAINING RBF NEURAL NETWORK USING CUCKOO SEARCH ALGORITHM
In our RBF neural network, the centroid, spread width, and weight of each hidden-layer neuron are the parameters that should be well tuned during the training procedure of the network. To train the network well, improve its performance in terms of accuracy, and make it converge fast, the Cuckoo Search Algorithm (CSA) is utilized. More specifically, (1) CSA can find the global optimum solution with higher success rates [6]. (2) CSA improves performance (accuracy) by utilizing Levy flight. (3) CSA leads to a faster convergence rate.
CSA, which was created by Xin-She Yang and Suash Deb in 2009, is based on some cuckoo species' aggressive brood parasitism and egg-laying technique [14]. CSA is inspired by the behavior of cuckoo birds that lay their eggs in the nests of birds of other species. With a probability of $P_a$, the host bird discovers the alien eggs and either throws them away or abandons the nest. In CSA, each egg placed in a nest is a solution. A better solution corresponds to placing the eggs in safer nests (i.e., nests where the host bird does not notice the cuckoo's eggs). The goal of cuckoos over different generations is to find better solutions. Each nest represents a set of solutions to the problem in which each egg is a solution. In general, CSA is based on three rules:
1) Each cuckoo lays exactly one egg at a time and places it randomly in one of the nests.
2) The nests with high-quality eggs (i.e., the best solutions) are used for the next generation.
3) The number of available nests is constant, and the probability of each cuckoo egg being detected by the host bird is $P_a$.
CSA starts by initializing a random population of $n$ host nests. Each host nest includes one egg of a cuckoo bird. Some of these eggs will grow and become adult cuckoos. Other eggs are detected and destroyed by the host bird with probability $P_a$. The number of eggs that grow indicates the suitability of the nests in that area; hence, cuckoos are interested in migrating to those areas. The situation in which the largest number of eggs survives is the quantity that the cuckoo algorithm intends to optimize. To migrate to the area with the best nests, cuckoos balance the global random walk and the local random walk to improve search ability and find a better solution. The global random walk is modeled mathematically as equation 5.
$$X_i^{t+1} = X_i^t + \alpha \otimes \mathrm{Levy}(\lambda) \tag{5}$$
where $X_i^{t+1}$ and $X_i^t$ indicate the next and current positions of cuckoo $i$, respectively, $\alpha > 0$ denotes the step size, which is normally set to one, $\otimes$ is entry-wise multiplication, and $\mathrm{Levy}(\lambda)$ is the Levy distribution with rate $\lambda$. In our algorithm, equation 6 models equation 5.
where $r$ is the deviation parameter, a vector of random values drawn from a uniform distribution in the range [0, 1]. To model the local random walk of cuckoo $i$, two cuckoos with indices $j$ and $k$ are selected randomly among all cuckoos, and the next position of cuckoo $i$ is calculated as equation 7.
In equation 7, is a random value generated based on the normal distribution.
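The two walks can be sketched as below. This is an illustrative Python sketch, not the authors' equations 6 and 7, whose exact forms are not reproduced here. Drawing the Levy step with Mantegna's algorithm (with exponent `beta` = 1.5) and writing the local walk as the current position plus a normally distributed factor times the difference of two randomly chosen cuckoos are both assumptions. All function names are hypothetical.

```python
import math
import random

def levy_step(rng, beta=1.5):
    """One heavy-tailed step length via Mantegna's algorithm (an assumed choice)."""
    num = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    den = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    sigma_u = (num / den) ** (1 / beta)
    u = rng.gauss(0.0, sigma_u)
    v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)

def global_walk(x, rng, alpha=1.0):
    """Global random walk: add an independent Levy step to every coordinate."""
    return [xi + alpha * levy_step(rng) for xi in x]

def local_walk(x_i, x_j, x_k, rng):
    """Local random walk: move along the difference of two other cuckoos."""
    eps = rng.gauss(0.0, 1.0)   # normally distributed scaling factor
    return [a + eps * (b - c) for a, b, c in zip(x_i, x_j, x_k)]

rng = random.Random(1)
print(global_walk([0.0, 0.0, 0.0], rng))
print(local_walk([1.0, 2.0], [0.5, 0.0], [0.0, 0.5], rng))
```

The heavy tail of the Levy step occasionally produces long jumps, which is what lets the global walk leave a local optimum, while the local walk refines positions using information already in the population.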

G. THE PSEUDOCODE OF THE PROPOSED ALGORITHM
Algorithm 1 shows the pseudocode of the proposed algorithm. Let us explain the algorithm with a motivational example. Consider an RBF network that has two input neurons (i.e., two features), three neurons in the hidden layer (the centroid ($X_c$) and the spread width ($\sigma$) of each neuron should be trained), and two output neurons. Fig. 3 illustrates the position vector of a cuckoo. The vector includes $X_{11}$, $X_{21}$, $\sigma_1$, $X_{12}$, $X_{22}$, $\sigma_2$, $X_{13}$, $X_{23}$, $\sigma_3$, $w_1$, $w_2$, and $w_3$, where $X_{ij} = x_i - X_{C_j}$ is the distance between neuron $x_i$ of the input layer and the centroid $X_{C_j}$ of neuron $j$ of the hidden layer.

1) THE INITIALIZATION STEP
In the initialization step, the position vectors of all cuckoos are randomly assigned. The RBF function for each neuron of the hidden layer is calculated according to equation 2. Next, the output function is calculated as in equation 4. Then, the mean squared error (MSE) is calculated for the output as the cost function. All cuckoos are sorted based on their MSE value in descending order. The cuckoo with the minimum MSE value is considered the best cuckoo.

2) THE ITERATION STEP
In this step, the new position of each cuckoo is calculated with both the global random walk (Levy flight) and the local random walk according to equations 6 and 7, respectively. Fig. 4 illustrates the global random walk for this example.
We measure the MSE for three positions: the current position, the new position after the global random walk, and the new position after the local random walk. The position that results in the minimum MSE is selected as the best position of the cuckoo for the current time $t$. The best position is saved in $X_{best}^t$ for the next iterations.

3) ALGORITHM STOP CONDITION
The algorithm continues until either the number of iterations reaches $t_{max}$ (the upper bound on the number of iterations) or the MSE achieved in the previous step becomes smaller than epsilon. The algorithm's output is the cuckoo population member with the best values of the centroids $X_c$, spread widths $\sigma$, and weights of the hidden-layer neurons of the RBF neural network, so that the MSE is the lowest possible. The RBF neural network is trained with the best solution. Next, the trained RBF neural network is used to classify traffic packets as App-DDoS or legitimate.
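The initialization, iteration, and stop steps can be put together in a compact loop such as the one below. This is a simplified Python sketch, not Algorithm 1 itself. It keeps the best of three candidate positions per cuckoo and stops at `t_max` or when the cost drops below `eps`, as described above. It omits any explicit nest-abandonment step driven by $P_a$, uses Mantegna's algorithm for the Levy step, and minimizes a toy sphere function in place of the network's MSE. All of these are assumptions.

```python
import math
import random

def levy(rng, beta=1.5):
    # Mantegna's algorithm for a Levy-distributed step (assumed choice).
    s = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
         / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    return rng.gauss(0, s) / (abs(rng.gauss(0, 1)) + 1e-12) ** (1 / beta)

def cuckoo_search(cost, dim, n=50, t_max=500, eps=1e-6, alpha=1.0, seed=0):
    """Minimize `cost` over `dim`-dimensional position vectors."""
    rng = random.Random(seed)
    # Initialization step: random position vector for every cuckoo.
    nests = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n)]
    best = min(nests, key=cost)
    for _ in range(t_max):
        for i in range(n):
            j, k = rng.sample(range(n), 2)
            e = rng.gauss(0, 1)
            glob = [x + alpha * levy(rng) for x in nests[i]]
            loc = [x + e * (a - b) for x, a, b in zip(nests[i], nests[j], nests[k])]
            # Keep whichever of the three positions has the lowest cost.
            nests[i] = min((nests[i], glob, loc), key=cost)
        best = min(nests + [best], key=cost)
        if cost(best) < eps:            # stop condition on the cost
            break
    return best, cost(best)

def sphere(p):
    """Toy stand-in for the MSE of an RBF network parameterized by p."""
    return sum(v * v for v in p)

best, best_cost = cuckoo_search(sphere, dim=3, n=15, t_max=50)
print(best_cost)
```

To train the network, `cost` would decode a position vector into centroids, spread widths, and weights, run the RBF forward pass on the training samples, and return the resulting MSE.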

H. MITIGATION TECHNIQUE
When attack traffic is detected, the victim server can shut down all connections belonging to the attack traffic. The victim server can also send CAPTCHA puzzles to the attack sources. Sources that cannot solve the CAPTCHA are treated as bot machines, and their connections are terminated.

IV. EXPERIMENTAL RESULTS

A. SIMULATION SETUP
We use the MATLAB 2020 software environment to implement the simulation stages of our experiments. In the following experiments, the population size of the CSA algorithm is set to 50 in the proposed method. The maximum number of iterations in the proposed method is set to 500, and $P_a$ is set to 0.25.

B. PERFORMANCE METRICS
In order to evaluate the effectiveness of the proposed technique, we compare its outcomes with those of other methods. Table 4 shows the standard performance metrics. As can be seen, the proposed technique outperforms the Bagging algorithm and the k-NN technique except in SP and PPV. In terms of precision metrics, we compared our proposed technique with Support Vector Machine (SVM), Multilayer Perceptron (MLP), and Recurrent Neural Network (RNN). Figure 11 compares the accuracy of the proposed technique, SVM, MLP, and RNN. As can be seen, our proposed technique enhances the accuracy on average by 0.5%, 3.6%, and 3.5% compared with SVM, MLP, and RNN, respectively.
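The metrics used in the comparison can be computed from predictions as sketched below. This Python sketch assumes that SP and PPV carry their usual meanings (specificity and positive predictive value, i.e., precision), that label 1 denotes attack traffic, and that the error metrics are taken between the 0/1 labels and the model outputs. The function names and the small example are illustrative, not results from the paper.

```python
import math

def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity (SP), and PPV from binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    ratio = lambda a, b: a / b if b else 0.0   # guard against empty classes
    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
    }

def error_metrics(y_true, y_out):
    """MSE, RMSE, and MAE between target labels and model outputs."""
    n = len(y_true)
    mse = sum((t - o) ** 2 for t, o in zip(y_true, y_out)) / n
    mae = sum(abs(t - o) for t, o in zip(y_true, y_out)) / n
    return {"mse": mse, "rmse": math.sqrt(mse), "mae": mae}

y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
print(classification_metrics(y_true, y_pred))
print(error_metrics(y_true, y_pred))
```

Since RMSE is the square root of MSE, the two error metrics always rank methods in the same order; MAE can differ because it weights large and small errors equally.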

Zeeshan et al. [23] propose a Protocol Based Deep Intrusion Detection (PB-DID) architecture in which a dataset of packets from IoT traffic is created based on flow and Transmission Control Protocol (TCP). Next, an LSTM-based unsupervised deep learning model is utilized to detect DDoS attack traffic. The model is trained on all the data available in two famous benchmarks, namely the UNSW-NB15 and BotIoT data sets, to cover the maximum possible packet types. In this model, the number of features for training is reduced to almost half.

Algorithm 1: The Pseudocode of the Proposed Algorithm
Initialization phase:
Create an initial population of N solutions
for i = 1 to N do (N is the number of cuckoos)
  for j = 1 to m do (m is the number of neurons of the hidden layer)

FIGURE 3. The position of a cuckoo in the presented example.

FIGURE 4. Global random walk (Levy flight) for the presented example.

FIGURE 5. The comparison between CSA-trained RBF, Bagging and k-NN based on error metric, per training set.

FIGURE 8. The comparison between CSA-trained RBF, Bagging and k-NN based on standard performance metrics, per training set.

FIGURE 9. The comparison between CSA-trained RBF, Bagging and k-NN based on standard performance metrics, per testing set.

TABLE 4. The comparison between CSA-trained RBF, Bagging, and K-NN based on standard performance metrics.

FIGURE 10. The comparison between CSA-trained RBF, Bagging and k-NN based on standard performance metrics, per the whole set.

FIGURE 11. The comparison between CSA-trained RBF, SVM, MLP and RNN, per testing set based on Precision metric.
V. CONCLUSION
This paper proposes a technique based on machine learning to cope with App-DDoS attacks. The proposed technique is a hybrid method of the Radial Basis Function (RBF) neural network and the Cuckoo Search Algorithm (CSA). The RBF neural network is used for classification, while CSA is used to train the RBF network. Feature selection on the dataset (NSL-KDD) is done with the Genetic Algorithm (GA). Several experiments are conducted to evaluate the proposed technique and compare it with the Bagging algorithm, k-NN classifier, SVM, MLP, and RNN. The simulation results, i.e., accuracy = 96.9%, MSE = 0.134, RMSE = 0.366, and MAE = 0.067, clearly indicate the superiority of the proposed technique over standard machine learning techniques that are used to detect App-DDoS attacks. The proposed technique enhances the accuracy by 2% on average compared with all mentioned machine learning-based techniques across all datasets (training dataset, testing dataset, and the whole dataset).

TABLE 1. Summarizing and comparing previous machine-learning based techniques.

TABLE 2. The selected features via GA and their descriptions.

TABLE 3. The comparison between CSA-trained RBF, Bagging, and K-NN based on error metrics.

FIGURE 6. The comparison between CSA-trained RBF, Bagging and k-NN based on error metric, per testing set.

FIGURE 7. The comparison between CSA-trained RBF, Bagging and k-NN based on error metric, per the whole set.

Figures 5, 6, and 7 compare the error metrics for the training dataset, testing dataset, and the whole dataset, respectively. Table 3 shows the same result in tabular form. As can be seen, the proposed technique improves the MSE error metric on average by 22% and 57% in comparison with Bagging and k-NN, respectively. The improvement for the RMSE error metric is 10% and 24% compared to Bagging and k-NN, respectively.