An Adaptive Active Queue Management Based on Model Predictive Control

In recent years, active queue management (AQM) has gained increasing attention as an important part of network congestion control. Although many AQM algorithms exist, they show weaknesses in detecting and controlling congestion because of the complexity and dynamics of networks. Hence, this paper proposes a new AQM algorithm based on model predictive control (MPC) theory, which has been widely applied to nonlinear and time-delay systems. To adjust the parameters of the MPC-based AQM algorithm adaptively as the network scenario varies, an adaptive mechanism based on the Hebb learning rules from neural network control theory is introduced into the new algorithm, named PHAQM. The simulation results show that the algorithm is effective in avoiding network congestion. Compared with traditional AQM schemes such as the PI, REM, and GPC algorithms, PHAQM has a faster convergence rate and smaller queue length fluctuations, and it performs especially well under dynamically changing network conditions.


I. INTRODUCTION
Congestion control in transmission control protocol (TCP) networks is an important tool for improving quality of service (QoS): it can prevent network collapse, avoid lockout behavior, and effectively reduce the probability of control-loop synchronization [1]. To assist TCP in managing network performance, the active queue management (AQM) mechanism was introduced so that routers take part in congestion control. As an effective congestion control approach, AQM has become an attractive research topic [2]. In fact, an AQM mechanism implemented in the router brings significant performance improvements to the network, for instance, improving network utilization, reducing packet drops, and maintaining best-effort service with low delay [3].
Much research has focused on AQM since the first AQM algorithm, Random Early Detection (RED) [4], was proposed, including adaptive RED [5], BLUE [6], and YELLOW [7]. However, these algorithms are heuristic, which makes them too sensitive to parameter configuration. Fortunately, a fluid-flow model of the congestion control process in TCP networks was established by using stochastic theory [8], which helps researchers design and understand the behavior of Internet systems better. Based on the fluid-flow model, [9] first analyzed the TCP/AQM system as a feedback system with control theory, which gave a feedback control system depiction of AQM. The action of an AQM control law is to mark packets (with some probability) as a function of the measured queue length. Generally, when the TCP/AQM system is analyzed as a feedback system, the probability of packet mark/drop in the router is the control signal and the queue length is the controlled variable. Afterwards, control-theory-based approaches built on this model were used to solve the network congestion problem, for instance, PI [10], robust control [11], prescribed performance control [12], and a state-feedback congestion controller [13]. These controllers improve the performance of congestion control systems over a wide range of network scenarios.
The associate editor coordinating the review of this manuscript and approving it for publication was Dipankar Deb.
Among the control theories, model predictive control (MPC) has become a mature and advanced control strategy, which has been widely applied to nonlinear and time-delay systems. As the network congestion control system is a typical nonlinear and time-delay system, MPC-based AQM algorithms were developed to determine the optimal control signal at each sampling time by predicting the future system dynamics. Generalized predictive control (GPC) was the first MPC method used for network congestion control [14]. Then, predictive function control (PFC) was developed as a privileged AQM method for high-speed networks [15]. The MPAQM controller drops packets early at the router according to the predicted future queue length in the data buffer [16]. Data-AQM is based on data-driven predictive control, which obtains the prediction directly from input-output data alone, without any explicit model of the system [17]. These studies demonstrate that MPC is able to handle system delay with a low computational load and that MPC-based AQM algorithms outperform traditional AQM algorithms.
In practice, the parameters of the Internet, such as the load factor, round-trip time, and link capacity, are time-varying. However, traditional control parameters configured for a particular network scenario cannot be adjusted, which makes the performance sensitive to network scenario variations. Hence, researchers have paid more attention to adaptive techniques [18]-[23], which are a powerful method for uncertain systems; see [24]-[26], to name a few. Although there are some results on AQM schemes based on adaptive control, designing an adaptive scheme for MPC-based AQM remains a challenge because the environment of the Internet is complex.
VOLUME 8, 2020. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
Fortunately, neural network control has been developed as a useful tool for designing adaptive sampled-data systems [27], [28]. As the Hebb-learning rule is an unsupervised learning rule for neural networks, it can extract the statistical characteristics of the training set and is suitable for solving the adaptive problem for MPC-based AQM. In order to adjust the parameters of MPC-based AQM schemes adaptively, this paper investigates adaptive Hebb-learning rules for TCP/AQM systems with MPC control and proposes a new algorithm called PHAQM.
The remainder of this paper is organized as follows. The fluid model and the Laplace transfer function are presented in Section II. In Section III, a new AQM algorithm named PHAQM is proposed based on MPC and Hebb-learning rules. Section IV gives a simulation study to verify the proposed method. Finally, a conclusion is provided in Section V.

II. SYSTEM MODELS
Reference [8] developed a dynamic model of TCP behavior by using fluid-flow and stochastic differential equation analysis. This fluid-flow model has been widely used in the design of AQM algorithms. It consists of coupled, nonlinear differential equations in which $\dot{x}(t)$ denotes the time derivative of $x$, $W$ is the expected TCP window size, $N$ is the load factor, $C$ is the link capacity, $q$ is the expected queue length, and $R$ is the round-trip time.
Note that the round-trip time can be expressed as $R(t) = T_p + q(t)/C$, where $T_p$ is the propagation delay (in seconds), and that $p$ is the probability of packet mark/drop. The queue length $q$ and window size $W$ are positive, bounded quantities, i.e., $q \in [0, \bar{q}]$ and $W \in [0, \bar{W}]$, where $\bar{q}$ and $\bar{W}$ denote the buffer capacity and the maximum window size, respectively. Also, the marking probability satisfies $p \in [0, 1]$.
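For reference, the fluid-flow model of [8] is usually written in the AQM literature in the form below. This is a sketch under the assumption that the model intended here is that standard form; the symbols are those defined above, and the delayed arguments $t - R(t)$ reflect the fact that a mark or drop is felt by the sender one round-trip time later.

```latex
\begin{align}
\dot{W}(t) &= \frac{1}{R(t)} - \frac{W(t)\,W(t - R(t))}{2\,R(t - R(t))}\,p(t - R(t)), \\
\dot{q}(t) &= \frac{W(t)}{R(t)}\,N(t) - C, \qquad R(t) = T_p + \frac{q(t)}{C}.
\end{align}
```

The first equation captures the additive-increase, multiplicative-decrease behavior of the TCP window, and the second describes the queue as the difference between the aggregate arrival rate and the link capacity.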
In order to design the feedback control (AQM), [9] approximates these dynamics by their small-signal linearization about an operating point $(W_0, q_0, p_0)$. In the linearized model, $\delta W = W - W_0$, $\delta q = q - q_0$, and $\delta p = p - p_0$ represent the perturbed variables about the operating point, and the constraints on $\delta W$, $\delta q$, and $\delta p$ are ignored for simplicity. The operating point $(W_0, q_0, p_0)$ is the equilibrium of the fluid-flow model. Performing a Laplace transform on the linearized differential equation (2), the transfer function $G_p(s)$, which relates $\delta p$ to $\delta q$, is obtained.
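As an illustration of how an operating point and the linearized plant parameters can be computed, the sketch below uses the relations commonly given for the fluid-flow model in the literature following [9]: $R_0 = T_p + q_0/C$, $W_0 = R_0 C/N$, $p_0 = 2/W_0^2$, and a plant of the form $G_p(s) = \frac{C^2/(2N)}{(s + 2N/(R_0^2 C))(s + 1/R_0)}\,e^{-sR_0}$. These formulas are assumed here rather than taken from this paper's equations, $C$ is assumed to be expressed in packets per second, and the names `OperatingPoint` and `operating_point` are hypothetical.

```python
from typing import NamedTuple


class OperatingPoint(NamedTuple):
    r0: float          # equilibrium round-trip time R0 (s)
    w0: float          # equilibrium window size W0 (packets)
    p0: float          # equilibrium mark/drop probability
    gain: float        # plant numerator C^2 / (2N)
    pole_tcp: float     # 2N / (R0^2 C), pole from the window dynamics
    pole_queue: float   # 1 / R0, pole from the queue dynamics


def operating_point(n: float, c: float, tp: float, q0: float) -> OperatingPoint:
    """Equilibrium and linearized-plant parameters (assumed standard formulas).

    n: number of TCP flows, c: link capacity (packets/s),
    tp: propagation delay (s), q0: reference queue length (packets).
    """
    r0 = tp + q0 / c              # R0 = Tp + q0 / C
    w0 = r0 * c / n               # from dq/dt = 0
    p0 = 2.0 / (w0 * w0)          # from dW/dt = 0, i.e. W0^2 * p0 = 2
    return OperatingPoint(
        r0=r0,
        w0=w0,
        p0=p0,
        gain=c * c / (2.0 * n),
        pole_tcp=2.0 * n / (r0 * r0 * c),
        pole_queue=1.0 / r0,
    )
```

For example, `operating_point(n=100, c=1000, tp=0.1, q0=100)` describes a hypothetical link on which 100 flows share 1000 packets/s. Note that $p_0$ must remain in $[0, 1]$, which requires $W_0 \ge \sqrt{2}$.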

III. PHAQM ALGORITHM
A. AN AQM ALGORITHM BASED ON MPC
To design an AQM algorithm based on MPC, the traditional zero-order hold (ZOH) is used to discretize (4). Suppose the values of $\delta p$ and $\delta q$ are sampled at each interval $T_s$. The resulting discrete-time transfer function relating $\delta p$ to $\delta q$ contains the system delay $d = R_0/T_s$ and the coefficients $m_1$, $m_2$, $n_1$, $n_2$, which are determined by the network parameters and $T_s$.
Define the values of $\delta p$ and $\delta q$ as the input $u$ and the output $y$, respectively. The dynamic model of TCP behavior can then be described by the input-output difference equation (7), which is written with polynomials in the backward-shift operator $z^{-1}$, including $A(z^{-1})$. In order to derive the predicted value of $\delta q$ after the $j$th interval, i.e., $jT_s$ seconds later, a Diophantine equation is introduced, where $E(z^{-1})$ and $F(z^{-1})$ are polynomials determined by $A(z^{-1})$ and $j$. Applying this identity to both sides of (7) yields (9). Letting $j = d + 1$ and transferring (9) from the $z$ domain to the time domain gives the predictor (10), whose coefficients are denoted $g_i$.
Note that $y(t) = \delta q(t) = q(t) - q_0$ and that the reference queue length is $q(t + d + 1) = q_0$, so $y(t + d + 1) = 0$. Then (10) can be simplified to the control law (11), whose coefficients are $h_1, \ldots, h_{d+3}$, with $h_{d+3} = -g_{d+2}/g_1$.
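The derivation above can be sketched numerically as follows. The sketch assumes a discrete model of the form $A(z^{-1})\,y(t) = z^{-d} B(z^{-1})\,u(t)$ with $A = 1 + n_1 z^{-1} + n_2 z^{-2}$ and $B = m_1 z^{-1} + m_2 z^{-2}$, a continuous plant $K/((s+a)(s+b))$ with two distinct real poles, and the Diophantine identity $1 = E(z^{-1})A(z^{-1}) + z^{-j}F(z^{-1})$ with $j = d + 1$. Under these assumptions the predictor is $y(t+d+1) = f_0 y(t) + f_1 y(t-1) + g_1 u(t) + \cdots + g_{d+2} u(t-d-1)$, and setting it to zero gives $h_1 = -f_0/g_1$, $h_2 = -f_1/g_1$, $h_{i+1} = -g_i/g_1$ for $i = 2, \ldots, d+2$. The function names and the symbols $f_0$, $f_1$ are hypothetical and chosen for illustration; only $h_{d+3} = -g_{d+2}/g_1$ is stated in the text.

```python
import math
from typing import List, Tuple


def zoh_two_pole(gain: float, a: float, b: float, ts: float) -> Tuple[float, float, float, float]:
    """ZOH discretization of gain / ((s + a)(s + b)) for a != b.

    Returns (m1, m2, n1, n2) of (m1 z^-1 + m2 z^-2) / (1 + n1 z^-1 + n2 z^-2).
    """
    alpha, beta = math.exp(-a * ts), math.exp(-b * ts)
    c1 = gain * (1.0 - alpha) / (a * (b - a))   # partial-fraction term of pole a
    c2 = gain * (1.0 - beta) / (b * (b - a))    # partial-fraction term of pole b
    return c1 - c2, c2 * alpha - c1 * beta, -(alpha + beta), alpha * beta


def diophantine(a_poly: List[float], j: int) -> Tuple[List[float], List[float]]:
    """Solve 1 = E*A + z^-j * F by j steps of long division of 1 by A.

    a_poly holds the coefficients of A in powers of z^-1 and must start with 1.
    Returns (E, F) with len(E) == j and len(F) == len(a_poly) - 1.
    """
    n = len(a_poly) - 1
    rem = [1.0] + [0.0] * n
    e = []
    for _ in range(j):
        coef = rem[0]
        e.append(coef)
        # Subtract coef * A from the remainder, then shift by one power of z^-1.
        rem = [rem[k + 1] - coef * a_poly[k + 1] for k in range(n)] + [0.0]
    return e, rem[:n]


def poly_mul(p: List[float], q: List[float]) -> List[float]:
    """Product of two polynomials given by coefficient lists."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for k, qk in enumerate(q):
            out[i + k] += pi * qk
    return out


def control_gains(m1: float, m2: float, n1: float, n2: float, d: int) -> List[float]:
    """Gains h_1..h_{d+3} of the law u(t) = h1*y(t) + h2*y(t-1) + h3*u(t-1) + ..."""
    e, f = diophantine([1.0, n1, n2], d + 1)
    g = poly_mul(e, [m1, m2])                   # g_1 .. g_{d+2}; g_1 = m1 must be non-zero
    return [-fi / g[0] for fi in f] + [-gi / g[0] for gi in g[1:]]
```

A typical call chain is `m1, m2, n1, n2 = zoh_two_pole(gain, a, b, ts)` followed by `control_gains(m1, m2, n1, n2, d)` with `d` rounded to an integer number of samples. Because this law cancels the zero of $B$, it is only a reasonable starting point when that zero lies inside the unit circle.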

B. ADAPTIVE SCHEME BASED ON HEBB-LEARNING THEORY
Although many studies address AQM schemes using the fluid-flow model, most existing works assume fixed network parameters. Hence, the parameters of the corresponding AQM algorithm are fixed. However, the parameters of real networks are time-varying, which makes traditional AQM algorithms perform unsatisfactorily in real networks. Besides, the fluid-flow model is not precise, because it cannot account for every situation that arises in real networks. So, an AQM algorithm derived from the fluid-flow model alone is not reliable enough to deliver good performance in real networks. To overcome these problems of conventional AQM algorithms, this paper introduces an adaptive scheme based on Hebb-learning theory to adjust the parameters $h_i$ ($i = 1, 2, \ldots, d + 3$) in (11).
To make the control law easier to interpret, (11) is rewritten in terms of a proportional coefficient $K > 0$ and adaptively tuned parameters $w_i(t)$. Using Hebb-learning theory, the adaptively tuned parameters are updated at each sampling instant, where $\eta_i > 0$ is the learning rate and $e(t - 1) = q(t - 1) - q_0$ is the queue-length error. In practice, as the parameters of real networks are time-varying, the values of $h_i$ are hard to estimate. Fortunately, the initial values of $w_i(t)$ can be set to small non-zero values, and the simulation results show that the proposed AQM algorithm, i.e., PHAQM, is still effective in avoiding network congestion.
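A minimal sketch of such an adaptive law is given below. It assumes the control signal has the form $u(t) = K \sum_i w_i(t)\,x_i(t)$, where $x(t) = [\delta q(t), \delta q(t-1), \delta p(t-1), \ldots, \delta p(t-d-1)]$ is the regressor of (11), and it assumes the supervised Hebb rule often used in single-neuron adaptive controllers, $w_i(t) = w_i(t-1) + \eta_i\,e(t-1)\,u(t-1)\,x_i(t-1)$. Both forms, the absence of weight normalization, and the class name `HebbTunedController` are assumptions made for illustration and may differ from the update equations of PHAQM.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class HebbTunedController:
    """Linear control law u = K * sum_i w_i * x_i with Hebb-style weight tuning."""

    weights: List[float]   # w_i, initialised to small non-zero values
    rates: List[float]     # learning rates eta_i > 0
    k: float = 1.0         # proportional coefficient K > 0

    def control(self, x: List[float]) -> float:
        """Control signal (delta p) for the regressor x."""
        return self.k * sum(w * xi for w, xi in zip(self.weights, x))

    def update(self, error: float, u_prev: float, x_prev: List[float]) -> None:
        """Assumed rule: w_i <- w_i + eta_i * e(t-1) * u(t-1) * x_i(t-1)."""
        self.weights = [
            w + eta * error * u_prev * xi
            for w, eta, xi in zip(self.weights, self.rates, x_prev)
        ]
```

In a sampling loop, `update` would be called with the previous error, control signal, and regressor before `control` is evaluated on the current regressor; the resulting probability $p_0 + u(t)$ would then be clipped to $[0, 1]$.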

IV. SIMULATION
In this section, we evaluate the performance and robustness of the proposed PHAQM algorithm through several simulations. The network topology used in the simulation is shown in Fig. 1.

A. CONSTANT NUMBER OF TCP CONNECTIONS
Load sizes vary in real networks, so congestion control algorithms should be able to adapt to different numbers of TCP connections.

B. PERFORMANCE FOR DIFFERENT REFERENCE QUEUE LENGTH
The reference queue length should be set so as to reduce the round-trip delay while avoiding an empty buffer, which improves the utilization of the bottleneck link. The smaller the reference queue length is, the smaller the round-trip delay is, but the higher the probability of an empty buffer is. Therefore, it is necessary to weigh the pros and cons carefully to select a reasonable reference queue length, which requires that the algorithm can stabilize the queue length at different values to suit different networks. This group of experiments investigates the ability of the PHAQM algorithm to stabilize the queue length at different values. The simulation results for reference queue lengths of 100 and 900 are shown in Fig. 3. It can be seen that, whatever the target queue length is, PHAQM can stabilize the queue near the target value, with fast convergence and small queue length oscillation.
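The delay side of this trade-off can be quantified with the relation $R(t) = T_p + q(t)/C$ from Section II: a queue held at $q_0$ adds $q_0/C$ seconds to the round-trip time. The sketch below evaluates this for the two reference queue lengths. The link capacity of 45 Mb/s is the value listed with the GPC parameters in Section IV-C, while the packet size of 1000 bytes and the function name are hypothetical, since the packet size used in the simulations is not stated.

```python
def queueing_delay(q0_packets: float, capacity_bps: float, packet_bytes: float) -> float:
    """Queueing delay q0 / C in seconds, with C converted to packets per second."""
    capacity_pps = capacity_bps / (8.0 * packet_bytes)
    return q0_packets / capacity_pps


# Assumed values: 45 Mb/s link, 1000-byte packets (hypothetical packet size).
delay_small = queueing_delay(100, 45e6, 1000)
delay_large = queueing_delay(900, 45e6, 1000)
```

Under these assumptions the queueing delay is roughly 18 ms for a reference of 100 packets and 160 ms for a reference of 900 packets, which illustrates why a smaller reference is preferable for delay even though it leaves less margin against an empty buffer.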

C. COMPARISON WITH OTHER AQM SCHEMES
In a real network, new data flows may enter the network in any time period, and existing data flows may leave it. Therefore, the network load changes dynamically. A good active queue management algorithm should be able to withstand these load dynamics. This section compares the performance of PI, REM, GPC, and PHAQM when the number of TCP connections changes dynamically, using two sets of environments. The main control parameters of each algorithm are as follows: PI parameters, a = 0.00001822, b = 0.00001816; REM parameters, γ = 0.001, = 0.001; GPC parameters, C = 45 Mb/s, N = 500, R = 0.14 s.
In the first set of experiments, we compare the control effects of each AQM algorithm when the network load changes suddenly. In this experimental environment, 200 TCP connections pass through the bottleneck link at the initial time, and then 100 connections are added every 10 seconds until the 100th second, i.e., 900 TCP connections are added in 100 seconds. Then 100 connections are removed every 10 seconds, until a total of 900 connections have been removed. The results are shown in Fig. 4. It can be seen that the performance of PI and REM degrades noticeably when the network load changes suddenly, which means that PI and REM are very sensitive to abrupt load changes. The queue length variation is largest for REM, and REM takes longer to regulate the queue length to the target value, especially when the network load suddenly decreases. Compared with PI and REM, GPC, which is based on MPC, is less sensitive to abrupt load changes, but its queue length oscillation is large. As an MPC-based algorithm, PHAQM can also absorb burst data streams well in a dramatically changing network environment, and it additionally maintains a small queue length oscillation. In fact, PHAQM gives less overshoot, a shorter response time, and better stability than the other algorithms. In summary, PHAQM is able to adapt quickly to an environment involving sudden load changes while maintaining a small queue oscillation.
In the second set of experiments, we focus on the performance of the different AQM algorithms when the network load changes uniformly. In this experimental environment, 200 TCP connections pass through the bottleneck link at the initial moment. Between 10 s and 90 s, 800 connections are added evenly, and between 110 s and 190 s, 800 connections are removed evenly. The results are shown in Fig. 5.
It can be seen that in a slowly changing network environment, although PI maintains a small queue length oscillation, the queue length does not stabilize near the expected value. REM is very sensitive to changes in the network load, and its queue length is difficult to stabilize. The GPC algorithm behaves acceptably when the load decreases uniformly, but it performs poorly when the load increases uniformly: GPC shows noticeable oscillations in the first 100 s. By contrast, PHAQM stabilizes the queue length near the expected value with small queue length oscillation, whether the load increases or decreases.

D. MULTIPLE BOTTLENECKS
To evaluate the performance of PHAQM with multiple bottleneck links, this section gives simulation results for a network with two bottleneck links. The network topology is presented in Fig. 6. The link between R1 and R2 and the link between R3 and R4 are the two bottleneck links. The settings of the environment are shown in Fig. 6. PHAQM is employed as the AQM algorithm on R1 and R3. There are 300 TCP sources connected to R1, with 100 sessions each destined for R2, R3, and R4. Besides, there are 200 TCP sources connected to R2, with 100 sessions each destined for R3 and R4. Finally, 100 TCP sessions travel from R3 to R4. The results are shown in Fig. 7, where (a) and (b) are the queue lengths on R1 and R3, respectively. In Fig. 7, the queue length dynamics exhibit good robustness and fast system response. It can be seen that PHAQM can also effectively stabilize the queue length near the reference value in a complex network.

V. CONCLUSION
Much research has been devoted to improving the performance of congestion control in the Internet. However, detecting and controlling congestion remains a challenge because of the complexity and dynamics of networks. Hence, this paper aims to address this problem by applying MPC theory together with an adaptive scheme. In this paper, a TCP/AQM dynamic model is used to obtain a prediction model. An adaptive AQM algorithm, named PHAQM, is proposed based on this model by using MPC theory, and the adaptive scheme is designed with Hebb learning rules.
A large number of simulation experiments demonstrate the effectiveness of the PHAQM algorithm. The developed controller not only guarantees that the queue length tracks the desired queue length but also adapts to network scenario variations. The experiments show that PHAQM has better robustness and significant advantages in a dynamic network environment compared with traditional algorithms such as PI, REM, and GPC. In particular, PHAQM can regulate the queue length around the expected value and achieve a good control effect in scenarios with multiple bottleneck links.
Future work will cover the theoretical analysis of the stability and convergence speed of PHAQM and the extension of the evaluation from numerical simulations to real network experiments.
BIN XU received the bachelor's degree in electrical engineering from Donghua University, Shanghai, China, in 2011, and the master's degree in engineering management from Shanghai Jiaotong University, in 2017. He is currently an Engineer with Shanghai Shenzhou Engineering Company Ltd., where he is engaged in project management in the financial network industry. His current research interests include data, network congestion control, and quality control.