Queue Management Algorithm for Satellite Networks Based on Traffic Prediction

To address the effect of the self-similar characteristic of satellite network service traffic on queueing performance, a prediction model with optimised triple exponential smoothing is first established in this paper. This model performs network traffic prediction based on the dynamic triple exponential smoothing model and optimises the smoothing coefficient of the model through the differential evolution algorithm; a cubic function based on traffic prediction is further proposed to improve the adaptive random early detection (ARED) queue management algorithm. Based on the network traffic prediction results and the ARED, this algorithm uses the cubic function to perform nonlinear processing on the packet drop probability function. The simulation results show that the prediction model with optimised triple exponential smoothing has a high prediction accuracy and that, in the presence of data bursts in self-similar traffic, the improved ARED algorithm based on the cubic function of traffic prediction can effectively reduce the packet loss rate and improve the throughput, so as to better control the network congestion caused by self-similar traffic in satellite networks.


I. INTRODUCTION
In recent years, satellite networks have realised seamless global information transmission in real time and are widely applied to numerous fields such as national defence and military affairs, meteorological environment detection, positioning and navigation, and telemedicine. Large amounts of multimedia information such as images, audio, video, and streaming broadcast are generated and forwarded in satellite networks and, at the network aggregation nodes, the data generated by the satellite itself are aggregated with the data coming from other nodes.
The satellite network studied in this paper is an integrated high-capacity information network comprising different types of satellites, constellations, space stations and other spacecraft with different orbits and performance, together with the corresponding ground facilities, connected through intersatellite and satellite-ground links. It is capable of accurate information acquisition, rapid information processing and efficient information transmission by utilising high-speed on-satellite processing, switching and routing technologies. Among the spacecraft included in the network, geosynchronous satellites are used as aggregation nodes of the network, and other spacecraft as access nodes. The services transmitted by the backbone nodes include services generated by the aggregation nodes themselves and services from other spacecraft and from the ground network, with high data rates and high burst rates. In cases of data burst, the queueing performance of the network aggregation nodes is greatly affected. Applying an active queue management mechanism on the satellite aggregation nodes can effectively avoid congestion: by dropping a small number of packets, it reports the network congestion information to the source end, which then reduces its sending rate.
The associate editor coordinating the review of this manuscript and approving it for publication was Diego Oliva.
Research has shown that the service traffic of the terrestrial Internet generally exhibits self-similarity [1]; the self-similar services of the terrestrial network enter the satellite link, and the traffic after aggregation and propagation in the satellite network still shows self-similarity. That is, the self-similarity of the traffic is also ubiquitous and is propagated in the satellite network, making the self-similarity of the network traffic an unavoidable inherent characteristic [2]. The self-similarity of satellite network services is embodied in the long-range dependence of the traffic, which puts the aggregation nodes in a burst state for a long period of time and greatly affects queueing performance. Self-similar data flow is predictable, and network traffic prediction can perceive the burst state of queueing in advance. Therefore, establishing a high-precision traffic prediction model and studying a queue management algorithm for satellite networks based on traffic prediction make it possible to limit the queue length in advance and control network congestion.
VOLUME 10, 2022. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
To solve the aforementioned problems, in this paper, the approximate self-similar traffic with a Pareto distribution generated from the aggregation of a large number of ON/OFF sources is analysed as the basic model, a queue management algorithm for satellite networks based on traffic prediction is proposed, traffic prediction is conducted using a dynamic triple exponential smoothing model, and the smoothing coefficient of the model is optimised through the differential evolution (DE) algorithm to increase prediction accuracy and reduce complexity. Furthermore, based on the adaptive random early detection (ARED) algorithm, the cubic function is used to perform nonlinear processing on the packet drop probability function, and the results of the traffic prediction are substituted into the queue management algorithm to propose an improved ARED algorithm based on the cubic function of traffic prediction. This algorithm can effectively control queue length, reduce the packet drop rate, increase throughput, and make the channel transmission more stable.
The main algorithm in this paper is as follows. First, a dynamic triple exponential smoothing (Holt-Winters) model is constructed for the self-similar traffic in the satellite network. The smoothing coefficient α in the dynamic model is not a traditional fixed value, so the model can respond in a timely manner when faced with traffic data that fluctuate greatly.
Second, the selection of α directly determines the prediction accuracy. In this paper, based on the dynamic triple exponential smoothing model, the DE algorithm is used to optimise α, and a traffic prediction model with optimised triple exponential smoothing is proposed to increase prediction accuracy.
Third, the optimised traffic prediction model is introduced to queue management, and the new cubic function is used to perform nonlinear processing on the packet drop probability function in ARED.
Finally, the overall process for the queue management of satellite networks based on traffic prediction is designed, and the packet drop probability obtained is used to control the average queue length, packet drop rate, and throughput in channel transmission to effectively prevent channel congestion.
The remainder of this paper is organised as follows. The relevant prediction models and queue management models are analysed in Section II. The service model of the satellite network is described in Section III. Based on the dynamic triple exponential smoothing model, the DE algorithm is employed to optimise the traffic prediction model with triple exponential smoothing in Section IV. Section V focuses on improving the packet drop probability function in connexion with the ARED algorithm using the cubic curve, followed by the introduction of the traffic prediction model into queue management. The simulation results and performance analysis are discussed in Section VI.

II. RELATED WORK
Researchers have studied network traffic prediction models extensively. The traditional Markov model is a typical short-range dependence model that fails to satisfy the demands of long-range dependence network prediction [3]. With the development of machine learning, prediction models such as the back propagation (BP) neural network and the support vector machine (SVM) have emerged. Although the BP neural network can depict the long-range and short-range dependence of the network traffic very well, it suffers from difficult parameter selection, slow convergence, and susceptibility to premature convergence, making it unsuitable for real-time prediction of network traffic [4]. The SVM model is a solution only for machine learning with small samples and is difficult to implement for large-scale training samples [5]. The most widely applied prediction models are the time series models, which are divided into the moving average model, the stationary time series model, and the exponential smoothing model [6], as briefly described below. The moving average model can effectively eliminate random fluctuations in prediction, but it uses only the mean value of the latest set of actual data and does not consider later trends, making it suitable for instant prediction in which the data neither increase nor decrease rapidly. The stationary time series model has a simple structure but requires that the predicted data be stable, as this model cannot capture the pattern of bursty data. The exponential smoothing model combines the advantages of the above two types of time series models and is adapted to long-range dependent, self-similar traffic data; this model gives the past data a gradually diminishing degree of influence as they move away, calculates the corresponding exponential smoothing value, and then acts in concert with the optimised model to greatly improve the prediction accuracy.
The existing studies are mainly based on neural network flow models, multi-fractal wavelet models, multiple fractal finite shock response neural network prediction models, etc. This line of research is in its early phase: the prediction accuracy is not guaranteed, and the condition variables in the models have not been discussed in detail. The most commonly used traffic prediction algorithms are either neural network or combined prediction algorithms. In this paper, a time series model that combines the triple exponential smoothing algorithm with the differential evolution algorithm is used; it can predict self-similar traffic accurately and be linked seamlessly to the subsequent queue management algorithm.
Research on network traffic characteristics is based on the theory of time series analysis and is aimed at modelling for traffic prediction. As one of the more comprehensive time series algorithms, the triple exponential smoothing method is suitable for analysing almost all time series application problems. A time series is an objective record of the historical behaviour of the system under study; by analysing these records, one finds the statistical dependence between data, grasps the structural characteristics and operating rules of the system, and predicts its future behaviour. Research shows that network traffic is statistically self-similar; a self-similar model can therefore describe the network traffic accurately, establish functions and dependencies on the existing data, analyse the inherent patterns of the data, and control and predict future data. The improved triple exponential smoothing method smooths the burst data stream three times, after which the α coefficient and the traffic can be predicted accurately; this has important reference value for eliminating network congestion and for the management and planning of complex networks. The triple exponential smoothing algorithm adapts well to the self-similarity and irregularity of satellite network traffic. It reduces the impact of burst traffic on the overall prediction results and achieves high accuracy in satellite network traffic prediction.
Queue management algorithms can be passive or active [7]. The typical representative of passive queue management is the tail drop algorithm, which is simple in principle but suffers from deadlock, full-queue, and global synchronisation problems. The random-drop and drop-front algorithms were subsequently derived to avoid the deadlock problem, but the full-queue and global synchronisation problems remained unaddressed. Later, active queue management algorithms were proposed [8], represented by the random early detection (RED) algorithm, the random exponential marking (REM) algorithm based on optimisation theory, and the ARED algorithm. The RED algorithm proposed by Floyd et al. is the earliest active queue management algorithm; it greatly improves the operating quality of the network but is highly sensitive to parameter settings [9]. Steven et al. combined optimisation theory with the REM algorithm, which, however, has drawbacks such as slow convergence and slow reaction to bursts of traffic [10]. Nichols and Jacobson proposed a new active queue management algorithm named Controlled Delay (CoDel) [11], which provides a partial solution to the router's over-buffering problem. CoDel uses the delay of the packets in the queue as the evidence for determining congestion and strictly controls that delay. When each packet arrives, the queue length is checked; if it is larger than the maximum queue length $min_{th}$, the packet is discarded; otherwise, it enters the queue and a time stamp is added to its header. If the current queue delay is longer than the expected queue delay, the queue enters the discard state and packets are discarded continuously; when the queue delay drops below the expected queue delay, the queue enters the non-discard state.
The CoDel algorithm needs to check the queue size, record the time when each packet enters the queue, and obtain the delay of the packet in the queue as the difference between the current time and its enqueue time. The performance of the ARED algorithm can then be analysed comprehensively in terms of the average queue length, the queue threshold, the transmission rate, the link utilisation, and the packet loss rate. The ARED algorithm itself contains a discreet adaptive mechanism that can improve the stability of the algorithm.
The ARED algorithm is capable of adjusting its parameters automatically, thereby significantly increasing the system channel utilisation and greatly decreasing the packet drop rate of the system; however, the packet drop probability function in ARED is a linear function, and the overly rapid linear growth rate easily leads to queue oscillation [12]. For this reason, the S-shaped ascending half-Cauchy distribution function is used in some studies to perform nonlinear adjustment of the packet drop probability function of the ARED algorithm; consequently, the packet drop rate decreases and the queue management system becomes stable, but changes in the queue length are not considered in advance.
In connexion with the self-similar characteristic of the satellite network traffic, an algorithm is proposed in this paper to introduce traffic prediction in queue management for satellite networks.

III. TRAFFIC MODEL OF SATELLITE NETWORK
The satellite network communication system consists of space and ground segments, as shown in Figure 1. The space segment mainly refers to the orbiting satellites and the ground stations that control the orbiting satellites. These ground stations mainly implement tracking, remote sensing, and remote control functions to provide necessary satellite management and control and hence ensure the normal in-orbit operation of the satellites. The ground segment mainly refers to the user terminals that conduct communication through the satellites, including fixed terminals, mobile terminals, and movable terminals.
The services at the satellite communication aggregation nodes comprise self-generated services and services generated at the satellite or ground segments. Each node provides various types of services, and the services of multiple nodes aggregate at the nodes of the satellite network, leading to bursts of traffic on those nodes. To describe the characteristics of satellite network traffic faithfully, an accurate and reliable model of network traffic is needed. Previous researchers have proposed many models for generating self-similar network traffic, such as fractional Brownian motion (FBM), fractional Gaussian noise (FGN), fractional autoregressive integrated moving average (FARIMA), multi-fractal models (MFM), the multifractal wavelet model (MWM), and the ON/OFF model. Among these, the FBM, FGN and FARIMA models have specific modelling formulas, which make it easy to analyse model performance. Their disadvantage is that they carry no physical meaning and thus cannot indicate the causes of traffic generation. The ON/OFF model explains the cause of network traffic self-similarity by aggregating a large number of independent ON/OFF sources to generate auto-correlated aggregated traffic. When the number of ON/OFF sources tends to infinity, the aggregated traffic converges to fractional Brownian motion, which explains the monofractal behaviour at long time scales. Nonetheless, this model cannot explain the multifractal characteristics observed at short time scales. Multifractal models, mainly MFM and MWM, usually assume a random distribution and iterate from top to bottom to obtain a multifractal traffic flow. However, this approach does not carry much physical significance either.
The ON/OFF model is used in this paper. The durations for which a single service stays in the ON state and in the OFF state are typically independent and identically distributed random variables. The ON cycle random variable A satisfies E(A) = 1/ν, and the OFF cycle random variable B satisfies E(B) = 1/µ; then, the average rate of the ON/OFF source is νµ/(ν + µ). The duration of the ON/OFF state has a finite mean and infinite variance. Mathematical derivation shows that when the ON and OFF time intervals follow a heavy-tailed distribution and the ON and OFF cycles alternate strictly, the random process generated by the superposition of a large number of ON/OFF sources is self-similar.
The strictly alternating ON and OFF cycles are used to describe the alternation of the data sources of the satellite network nodes between the sending and non-sending states [13]. Network traffic with self-similar characteristics is generated through the superposition of multiple independent ON/OFF service sources whose ON and OFF durations follow a heavy-tailed distribution. The Pareto distribution is the most commonly used heavy-tailed distribution, and its distribution function is
$$F(x) = 1 - \left(\frac{k}{x}\right)^{\lambda}, \quad x \ge k. \tag{1}$$
The parameter k determines the minimum value that the random variable can take, while the parameter λ determines the mean and variance of the random variable. If λ ≤ 2, the distribution has infinite variance; if λ ≤ 1, the distribution has both infinite mean and infinite variance. The superimposed services generate self-similar traffic as long as the ON or OFF cycle follows a heavy-tailed distribution. Therefore, in this paper, the traffic of the satellite network aggregation nodes is considered the joint action of a series of superimposed ON/OFF sources with Pareto-distributed durations, and the network traffic generated by multiple ON/OFF service sources is sent to the receiving end through the aggregation nodes, as shown in Figure 2.
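The superposition described above can be illustrated with a minimal Python sketch. It assumes discrete time slots, unit sending rate while a source is ON, and inverse-transform sampling from the Pareto distribution function; the function names and the parameter values (k = 1, λ = 1.5, 20 sources) are hypothetical choices for illustration and are not the simulation settings of this paper.

```python
import random


def pareto_sample(k, lam, rng):
    """Draw one value from F(x) = 1 - (k / x) ** lam, x >= k, by inverse transform."""
    u = 1.0 - rng.random()          # uniform on (0, 1], avoids division by zero
    return k / u ** (1.0 / lam)


def on_off_source(n_slots, k, lam, rng):
    """Return a list of 0/1 values: 1 while the source is ON, 0 while it is OFF."""
    slots = []
    is_on = rng.random() < 0.5      # assumed: random initial state
    while len(slots) < n_slots:
        duration = pareto_sample(k, lam, rng)
        # Round to whole slots and never generate more slots than still needed.
        length = min(n_slots - len(slots), max(1, round(duration)))
        slots.extend([1 if is_on else 0] * length)
        is_on = not is_on           # strictly alternating ON and OFF cycles
    return slots


def aggregate_traffic(n_sources, n_slots, k=1.0, lam=1.5, seed=0):
    """Superimpose independent ON/OFF sources; entry i is the number of ON sources in slot i."""
    rng = random.Random(seed)
    total = [0] * n_slots
    for _ in range(n_sources):
        for i, value in enumerate(on_off_source(n_slots, k, lam, rng)):
            total[i] += value
    return total


if __name__ == "__main__":
    traffic = aggregate_traffic(n_sources=20, n_slots=500)
    print(traffic[:20])
```

With 1 < λ < 2 the durations have a finite mean and infinite variance, which is the regime in which the aggregated series is expected to show long-range dependence.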

IV. TRAFFIC PREDICTION MODEL WITH OPTIMISED TRIPLE EXPONENTIAL SMOOTHING
A. PREDICTION MODEL WITH DYNAMIC TRIPLE EXPONENTIAL SMOOTHING
The exponential smoothing model for time series was first proposed by Holt in 1958. The principle is that the exponential smoothing value of any period is the weighted average of the actual observed value of the current period and the exponential smoothing value of the previous period. The smoothing technique is used to weaken the effect of short-term random fluctuations on the series and to smooth the series, thereby obtaining the time series smoothing value as the prediction parameter for the short term [14]. This algorithm is broadly used in production forecasts and in forecasts of short- and medium-term economic development trends. Exponential smoothing models differ in the number of smoothing passes. In particular, the prediction model with triple exponential smoothing (Holt-Winters) mainly corrects for the nonlinear trend in the time series and is able to adapt to the nonlinear, self-similar, and long-range dependent changing trends of the satellite network traffic [5]. The Holt-Winters algorithm includes three smoothing equations and one prediction equation. It calculates the weighted average of the observed values from each period according to the time sequence, and the result is used as the traffic prediction value, so that the influence of historical data on future values decreases with time [16].
Suppose the time series is $y_1, y_2, \cdots, y_t, \cdots$, the smoothing coefficient is $\alpha$, $0 < \alpha < 1$, and the triple exponential smoothing equations are
$$S_t^{(1)} = \alpha y_t + (1-\alpha) S_{t-1}^{(1)}, \quad S_t^{(2)} = \alpha S_t^{(1)} + (1-\alpha) S_{t-1}^{(2)}, \quad S_t^{(3)} = \alpha S_t^{(2)} + (1-\alpha) S_{t-1}^{(3)}, \tag{2}$$
where $S_t^{(1)}$, $S_t^{(2)}$, and $S_t^{(3)}$ are the single, double, and triple exponential smoothing values, respectively. $S_t^{(1)}$ is expanded successively as follows:
$$S_t^{(1)} = \alpha y_t + \alpha(1-\alpha) y_{t-1} + \alpha(1-\alpha)^2 y_{t-2} + \cdots \tag{3}$$
Equation (3) indicates that $S_t^{(1)}$ is the weighted average of all historical data, using the weighting coefficients $\alpha$, $\alpha(1-\alpha)$, $\alpha(1-\alpha)^2, \cdots$, respectively. It can be seen that the weight of the observed value n periods before period t is scaled by $(1-\alpha)^n$; that is, the observed values closer to the prediction period are assigned larger weights, and the observed values farther away are given smaller weights, with the weights decreasing exponentially from near to far. The principle for expanding $S_t^{(2)}$ and $S_t^{(3)}$ is the same as above [17].
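As a small worked illustration of these exponentially decreasing weights, the following Python sketch lists the first few weighting coefficients for a given smoothing coefficient. The function name `smoothing_weights` is hypothetical and introduced only for this example.

```python
def smoothing_weights(alpha, n):
    """Weights alpha * (1 - alpha) ** j given to the observation j periods in the past."""
    return [alpha * (1 - alpha) ** j for j in range(n)]


if __name__ == "__main__":
    weights = smoothing_weights(0.5, 3)
    print(weights)          # most recent observation first
    print(sum(weights))     # approaches 1 as more terms are included
```

A larger α concentrates the weight on recent observations, so the smoothed value reacts faster to bursts; a smaller α spreads the weight over a longer history.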
The smoothing coefficient α of the traditional triple exponential smoothing model takes a fixed value. This model is applied mainly to stable data. In medium- and long-term data prediction, a fixed coefficient cannot be adjusted in time when the data change significantly, which leads to large prediction errors. For this reason, the traditional triple exponential smoothing method was improved to construct a prediction model with dynamic triple exponential smoothing [18], in which dynamic smoothing coefficients are introduced into the traditional algorithm and the parameters are updated through iteration. This algorithm can reduce the prediction error in the medium- and long-term prediction of complex data and has better stability.
The smoothing coefficient is denoted as $\alpha_{m,n}$, where m is the predicted traffic data and n is the time point of the data prediction. The dynamic triple exponential smoothing equations keep the recursive form of the traditional model, with the fixed $\alpha$ replaced by $\alpha_{m,n}$. The first three values of the overall data are selected for $X_{N-3}$, $X_{N-2}$, and $X_{N-1}$ in Equation (5). Then, every time the predicted value is calculated, the initial values $S^{(1)}$, $S^{(2)}$, and $S^{(3)}$ are updated.
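Suppose the horizontal parameter is $a_t$, the trend parameter is $b_t$, the seasonality parameter is $c_t$, and the predicted value of period T in the future is $\hat{y}_{t+T}$. Then, the prediction equation is
$$\hat{y}_{t+T} = a_t + b_t T + c_t T^2. \tag{6}$$
The smoothing coefficient and parameters of each time point in the dynamic triple exponential smoothing model change dynamically. Hence, iterative calculation is required for each period as follows:
$$a_t = 3S_t^{(1)} - 3S_t^{(2)} + S_t^{(3)}, \tag{7}$$
$$b_t = \frac{\alpha_{m,n}}{2(1-\alpha_{m,n})^2}\left[(6-5\alpha_{m,n})S_t^{(1)} - 2(5-4\alpha_{m,n})S_t^{(2)} + (4-3\alpha_{m,n})S_t^{(3)}\right], \tag{8}$$
$$c_t = \frac{\alpha_{m,n}^2}{2(1-\alpha_{m,n})^2}\left[S_t^{(1)} - 2S_t^{(2)} + S_t^{(3)}\right]. \tag{9}$$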
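The smoothing and prediction steps of this subsection can be sketched in Python as follows. This is a minimal sketch under stated assumptions rather than the implementation used in this paper: it uses the standard (Brown) form of the triple exponential smoothing parameters, keeps the smoothing coefficient fixed over the series instead of updating it at every time point, and initialises all three smoothing values with the mean of the first three observations. The function name `triple_exp_forecast` is hypothetical.

```python
def triple_exp_forecast(y, alpha, horizon=1):
    """Forecast y[t + horizon] from the series y with triple exponential smoothing.

    Assumptions: 0 < alpha < 1 is fixed, and the initial smoothing values are
    the mean of the first three observations.
    """
    s1 = s2 = s3 = sum(y[:3]) / 3.0
    for obs in y:
        s1 = alpha * obs + (1 - alpha) * s1   # single smoothing
        s2 = alpha * s1 + (1 - alpha) * s2    # double smoothing
        s3 = alpha * s2 + (1 - alpha) * s3    # triple smoothing
    scale = alpha / (2 * (1 - alpha) ** 2)
    a = 3 * s1 - 3 * s2 + s3
    b = scale * ((6 - 5 * alpha) * s1 - 2 * (5 - 4 * alpha) * s2 + (4 - 3 * alpha) * s3)
    c = scale * alpha * (s1 - 2 * s2 + s3)
    return a + b * horizon + c * horizon ** 2


if __name__ == "__main__":
    series = [float(t) for t in range(200)]   # a purely linear series
    print(triple_exp_forecast(series, alpha=0.5, horizon=1))
```

On a constant series the forecast equals the constant, and on a long linear series it continues the line, which gives a quick sanity check of the three parameters.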

B. OPTIMISATION OF THE SMOOTHING COEFFICIENT
In the dynamic triple exponential smoothing model, determining the appropriate smoothing coefficient $\alpha_{m,n}$ is the key to the entire model. The DE algorithm is an efficient global optimisation algorithm, which is used mainly to solve global optimisation problems with continuous variables. In this paper, the DE algorithm is used to optimise the smoothing coefficient $\alpha_{m,n}$ to increase the prediction accuracy of the dynamic triple exponential smoothing model. Through continuous iterative calculations, the DE algorithm keeps good individuals, eliminates inferior individuals, and guides the search process towards the global optimal solution [19]. The memory ability and selectivity of the DE algorithm enable it to track the current search situation dynamically, adjust its search strategy, and optimise the smoothing coefficient $\alpha_{m,n}$ of the dynamic triple exponential smoothing model at every moment. Optimising $\alpha_{m,n}$ with the DE algorithm overcomes the low efficiency and lack of self-adaptation in the parameter selection of the traditional Holt-Winters model, thus increasing the prediction accuracy and efficiency of the triple exponential smoothing model. The fitness function is established with the minimum mean squared error (MSE) of the samples as the optimisation criterion. The MSE is expressed as follows, with $S_i$, $x_i$, and $I$ representing the predicted value, the true value, and the total number of traffic data, respectively:
$$\mathrm{MSE} = \frac{1}{I}\sum_{i=1}^{I}\left(S_i - x_i\right)^2. \tag{10}$$
The problem of obtaining the optimal solution for α is thus transformed into finding, at each iteration, the minimum MSE between the predicted and actual values and taking the corresponding smoothing coefficient as the optimal value for that iteration. The smaller the MSE is, the higher the accuracy is.
As shown in Figure 3, the individuals in the population are repeatedly subject to iterative adjustments of crossover, mutation, and selection until the maximum number of iterations is reached or the termination condition of the fitness function is satisfied, and the optimal smoothing coefficient $\alpha_{m,n}$ and the corresponding predicted value at this time are used as the output.
Let each individual in the population be a candidate solution vector; the evaluation function $f(\cdot)$ is used as the fitness function, and the DE algorithm is used to solve for the optimal individual. After M individuals have undergone G generations of evolution, the predicted value of the output is substituted into Equation (10) to solve for the optimal value at which the function $f(\cdot)$ takes its minimum. This problem is the global optimal solution problem of a multivariate function. The process of optimising the smoothing coefficient with the DE algorithm is shown in Table 1.
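The optimisation loop described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not a reproduction of Table 1: a single smoothing coefficient is optimised over a whole window rather than one coefficient per time point, the fitness is the MSE of one-step-ahead forecasts from triple exponential smoothing in its standard form, and a DE/rand/1 scheme is used. The function names and the values of the population size, scale factor F and bounds are hypothetical.

```python
import random


def one_step_mse(y, alpha):
    """MSE between one-step-ahead triple-smoothing forecasts and the observations."""
    s1 = s2 = s3 = sum(y[:3]) / 3.0          # assumed initialisation
    scale = alpha / (2 * (1 - alpha) ** 2)
    squared_error, count = 0.0, 0
    for t, obs in enumerate(y):
        if t >= 3:                           # forecast obs from data up to t - 1
            a = 3 * s1 - 3 * s2 + s3
            b = scale * ((6 - 5 * alpha) * s1 - 2 * (5 - 4 * alpha) * s2
                         + (4 - 3 * alpha) * s3)
            c = scale * alpha * (s1 - 2 * s2 + s3)
            squared_error += (a + b + c - obs) ** 2
            count += 1
        s1 = alpha * obs + (1 - alpha) * s1
        s2 = alpha * s1 + (1 - alpha) * s2
        s3 = alpha * s2 + (1 - alpha) * s3
    return squared_error / count


def de_optimise_alpha(y, pop_size=20, generations=60, F=0.5,
                      lo=0.01, hi=0.99, seed=0):
    """Search for the alpha in [lo, hi] that minimises one_step_mse with DE/rand/1."""
    rng = random.Random(seed)
    pop = [rng.uniform(lo, hi) for _ in range(pop_size)]
    fit = [one_step_mse(y, alpha) for alpha in pop]
    for _ in range(generations):
        for i in range(pop_size):
            others = [j for j in range(pop_size) if j != i]
            r1, r2, r3 = rng.sample(others, 3)
            # Mutation. With a single decision variable, binomial crossover
            # always passes the mutant on, so the trial equals the mutant.
            trial = pop[r1] + F * (pop[r2] - pop[r3])
            trial = min(hi, max(lo, trial))  # keep alpha inside its bounds
            f_trial = one_step_mse(y, trial)
            if f_trial <= fit[i]:            # selection: keep the better one
                pop[i], fit[i] = trial, f_trial
    best = min(range(pop_size), key=fit.__getitem__)
    return pop[best], fit[best]


if __name__ == "__main__":
    import math
    data = [10 + 0.5 * t + 3 * math.sin(t / 5) for t in range(80)]
    print(de_optimise_alpha(data))
```

Because selection never replaces an individual with a worse trial, the best fitness in the population is non-increasing over the generations.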

C. TRAFFIC PREDICTION PROCESS WITH OPTIMISED TRIPLE EXPONENTIAL SMOOTHING
The smoothing coefficient and parameters of each time point in the dynamic triple exponential smoothing model change dynamically, requiring iterative calculation in each period. The smoothing value of the data is calculated first, and then the optimal value obtained in Algorithm 1 is designated the dynamic smoothing coefficient $\alpha_{m,n}$, which is substituted into Equations (7), (8), and (9) to obtain the predicted value using Equation (6). The process of traffic prediction performed with the optimised triple exponential smoothing model is shown in Table 2, where m is the predicted traffic data and n is the time point of the data set prediction.

V. IMPROVED ARED ALGORITHM BASED ON THE CUBIC FUNCTION OF TRAFFIC PREDICTION
In queue management, when too many user devices are served through the same buffer, excessive buffering (bufferbloat) causes congestion. Several approaches, including protocol modifications such as delay-based congestion control, can address this problem, but they may have the opposite effect because of network jitter, access devices that change dynamically with buffer size, or a combination of both. Active Queue Management (AQM) mechanisms, which have many other potential benefits, can solve the bufferbloat problem. Buffer management policies other than first-in-first-out (FIFO) with tail drop are considered active, and the corresponding method used by routers to manage queues is called active queue management (AQM). AQM signals congestion by marking data packets at the routers, thereby conveying the status of routers and switches to the end systems. The classical AQM algorithm is RED, together with improved algorithms derived from it. The Random Early Detection (RED) mechanism detects the onset of congestion in advance and controls data packet marking. RED gateways implement a management algorithm that measures the average occupied queue length. If the average occupied queue length exceeds the minimum $min_{th}$ and is less than the maximum $max_{th}$, the packet is marked with an increasing probability value. If the average occupied queue length exceeds the maximum $max_{th}$, the packet is marked with a configurable maximum probability value of $max_p$. When $max_p$ is 1, the RED algorithm drops these packets instead of marking them. There are many versions of the improved RED algorithm, which are supported by many routers and switches. However, the RED algorithm cannot be used for comprehensive networks. This paper adopts Adaptive Random Early Detection (ARED) to cope with the difficulties in setting the RED algorithm parameters.
The self-similarity of network services makes queue management research more difficult. The queue management mechanism of ARED alone cannot solve the traffic burst problem, but self-similarity suggests a new idea: we introduce a traffic prediction mechanism that forecasts traffic bursts by anticipating the network traffic condition in advance. The predicted queue length is thus brought into the ARED algorithm as a parameter, and the triple exponential smoothing method and the differential evolution algorithm are used to optimise the prediction accuracy in order to better cope with traffic bursts.

A. ARED PACKET DROP PROBABILITY BASED ON THE CUBIC FUNCTION
In the ARED algorithm, when the average queue length $Q_{avg}$ is between the lower limit $min_{th}$ and the upper limit $max_{th}$ of the threshold, the packet drop probability $P_b$ is linearly related to $Q_{avg}$ [20]. However, when $Q_{avg}$ is in the vicinity of $min_{th}$, there are fewer data packets in the queue, the network is comparatively idle, and $P_b$ can be slowly increased or decreased at this moment to increase the throughput of the network; when $Q_{avg}$ is in the vicinity of the maximum threshold $max_{th}$, there are more data packets in the queue, and $P_b$ should be quickly adjusted at this moment to avoid network congestion. In this paper, the cubic curve function is used to conduct nonlinear smoothing processing on the packet drop probability function of ARED to more efficiently avoid network congestion and increase channel utilisation.

1) PACKET DROP PROBABILITY BASED ON THE CUBIC FUNCTION
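The cubic curve is expressed as
$$s(t) = k_0 + k_1 t + k_2 t^2 + k_3 t^3.$$
The constraint conditions for the start time 0 and the end time T are set as $s(0) = 0$, $s(T) = 1$, $s'(0) = 0$, $s'(T) = 0$.
The following are separately obtained:
$$k_0 = 0, \quad k_1 = 0, \quad k_2 = \frac{3}{T^2}, \quad k_3 = -\frac{2}{T^3}.$$
Therefore,
$$s(t) = \frac{3}{T^2}t^2 - \frac{2}{T^3}t^3,$$
which is translated to the right by a units to obtain
$$s(t) = \frac{3}{T^2}(t-a)^2 - \frac{2}{T^3}(t-a)^3. \tag{13}$$
The cubic function curve after translation is shown in Figure 4.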
In Equation (13), let t be the average queue length, a be the lower limit of the threshold $min_{th}$, and T + a be the upper limit of the threshold $max_{th}$. This equation is used to perform nonlinear processing on the part $min_{th} \le Q_{avg} \le max_{th}$ in the ARED packet drop probability function to obtain the ARED packet drop probability function improved by the cubic function, as shown in Equation (14):
$$P_b = \begin{cases} 0, & Q_{avg} < min_{th} \\ max_p\left[\dfrac{3\,(Q_{avg}-min_{th})^2}{(max_{th}-min_{th})^2} - \dfrac{2\,(Q_{avg}-min_{th})^3}{(max_{th}-min_{th})^3}\right], & min_{th} \le Q_{avg} \le max_{th} \\ 1, & Q_{avg} > max_{th} \end{cases} \tag{14}$$
where $max_p$ is the maximum drop probability of the part $min_{th} \le Q_{avg} \le max_{th}$. When $Q_{avg} = max_{th}$, $P_b = max_p$. Based on the idea of the ARED algorithm, the target queue length, target, is set with a value range of $[min_{th} + 0.4\,(max_{th} - min_{th}),\ min_{th} + 0.6\,(max_{th} - min_{th})]$, and the relationship between target and $Q_{avg}$ is used to adaptively adjust $max_p$ [21]. If $Q_{avg}$ is in the vicinity of $min_{th}$ and $Q_{avg} <$ target, the congestion adjustment is too active, and hence, the value of $max_p$ must be decreased; if $Q_{avg}$ is in the vicinity of $max_{th}$ and $Q_{avg} >$ target, the congestion adjustment is too conservative, and the value of $max_p$ must be increased [22]. Let $max_p^{+}$ and $max_p^{-}$ represent the maximum drop probabilities obtained after a radical increase and a conservative decrease, respectively, which are expressed as follows:
$$max_p^{+} = max_p + \alpha, \qquad max_p^{-} = max_p \times \beta,$$
where $\alpha$ is the increase factor, $\alpha = \min(0.01, max_p/4)$, and $\beta$ is the decrease factor, which usually takes a value of 0.9. Different drop probability curves are obtained with different values of $max_p$. Figure 5 shows the variation curve of the packet drop probability $P_b$ with $Q_{avg}$ when $max_p = 0.5$. $P_b = 0$ when $Q_{avg} < min_{th}$, and $P_b = 1$ when $Q_{avg} > max_{th}$. When $min_{th} \le Q_{avg} \le max_{th}$, $P_b$ varies nonlinearly with $Q_{avg}$. If there are fewer data packets in the queue, $Q_{avg}$ is small, and thus, $P_b$ increases relatively slowly; if the data packets in the queue increase continuously, $Q_{avg}$ is large, and thus, $P_b$ increases fast.
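The drop probability and the adaptation of the maximum drop probability described above can be sketched in Python as follows. The function names are hypothetical. Keeping the maximum drop probability within [0.01, 0.5] is an assumption borrowed from the original ARED proposal and is not stated in this paper; the threshold values in the usage lines are purely illustrative.

```python
def cubic_drop_probability(q_avg, min_th, max_th, max_p):
    """Packet drop probability with a cubic curve between the two thresholds."""
    if q_avg < min_th:
        return 0.0
    if q_avg > max_th:
        return 1.0
    u = (q_avg - min_th) / (max_th - min_th)   # normalised position in [0, 1]
    return max_p * (3 * u ** 2 - 2 * u ** 3)


def adapt_max_p(q_avg, min_th, max_th, max_p, beta=0.9):
    """One adaptation step of max_p based on the target queue length range."""
    span = max_th - min_th
    target_low = min_th + 0.4 * span
    target_high = min_th + 0.6 * span
    if q_avg > target_high and max_p <= 0.5:   # assumed upper bound (original ARED)
        max_p = max_p + min(0.01, max_p / 4)   # too conservative: increase
    elif q_avg < target_low and max_p >= 0.01: # assumed lower bound (original ARED)
        max_p = max_p * beta                   # too active: decrease
    return max_p


if __name__ == "__main__":
    for q in (4, 5, 8, 10, 12, 15, 16):
        print(q, cubic_drop_probability(q, min_th=5, max_th=15, max_p=0.5))
    print(adapt_max_p(14, 5, 15, 0.1), adapt_max_p(6, 5, 15, 0.1))
```

At the midpoint between the thresholds the cubic curve returns half of the maximum drop probability, and its slope is zero at both thresholds, so the probability changes most quickly in the middle of the range.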
Building on the ARED algorithm, the algorithm in this paper uses a nonlinear packet drop probability function. This avoids an excessively fast increase in the packet drop rate, is less prone to queue oscillation, makes traffic transmission more stable, increases channel utilisation, and addresses the parameter sensitivity of RED, thereby improving control of network congestion.
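The shape of the drop probability function and the adaptation of maxp described above can be sketched as follows. This is a minimal illustration, not the paper's implementation. The cubic used in the middle segment is a hypothetical stand-in chosen only to match the stated behaviour (zero below the lower threshold, maxp at the upper threshold, slow growth for small Qavg and fast growth for large Qavg); it is not Equation (14). The bounds 0.01 and 0.5 on maxp are an assumption taken from the original ARED proposal, and the function names are illustrative.

```python
def drop_probability(q_avg, min_th, max_th, maxp):
    """Piecewise drop probability with a convex cubic middle segment.

    ASSUMPTION: the cubic below is an illustrative stand-in, not Equation (14).
    """
    if q_avg < min_th:
        return 0.0
    if q_avg > max_th:
        return 1.0
    x = (q_avg - min_th) / (max_th - min_th)  # normalised position in [0, 1]
    return maxp * x ** 3                      # equals maxp when q_avg == max_th


def adapt_maxp(q_avg, maxp, min_th, max_th, beta=0.9):
    """One ARED-style adaptation step for maxp.

    ASSUMPTION: the limits 0.01 <= maxp <= 0.5 follow the original ARED proposal.
    """
    span = max_th - min_th
    target_low = min_th + 0.4 * span
    target_high = min_th + 0.6 * span
    if q_avg > target_high and maxp <= 0.5:
        alpha = min(0.01, maxp / 4)  # additive increase factor
        return maxp + alpha
    if q_avg < target_low and maxp >= 0.01:
        return maxp * beta           # multiplicative decrease
    return maxp                      # inside the target band: leave unchanged
```

With hypothetical thresholds min_th = 5 and max_th = 15 and maxp = 0.5, a queue at the midpoint (Qavg = 10) is dropped with probability 0.5 × 0.5³ = 0.0625, whereas a linear ramp would already give 0.25, which illustrates the slower initial growth.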

B. PROCESS OF THE IMPROVED ARED ALGORITHM BASED ON THE CUBIC FUNCTION OF TRAFFIC PREDICTION
In this paper, the DE algorithm is first used to optimise the smoothing coefficient α in the dynamic triple exponential smoothing model, yielding the predicted traffic. Then, in the queue management algorithm, which is based on ARED, the cubic function is employed to perform nonlinear processing on the packet drop probability function. Finally, the predicted traffic is fed into the queue management algorithm, giving the improved ARED algorithm based on the cubic function of traffic prediction. Figure 6 shows the flow chart of the algorithm, and Table 3 lists its steps. Here, q is the predicted value in Algorithm 2, Qavg is the average queue length, the weight is wq = 0.05, and interval is the time interval between changes in the drop probability.
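How the predicted value q enters the average queue length can be sketched as below. The sketch assumes the standard RED exponentially weighted moving average, with the predicted value used in place of the instantaneous queue length; the exact update used in Table 3 may differ, and the function names are illustrative.

```python
def update_avg_queue(q_avg, q_pred, wq=0.05):
    """RED-style moving average fed with the predicted queue value.

    ASSUMPTION: standard RED update, Qavg <- (1 - wq) * Qavg + wq * q.
    """
    return (1 - wq) * q_avg + wq * q_pred


def smooth_trace(predictions, q_avg=0.0, wq=0.05):
    """Apply the update to a sequence of predicted values and return every Qavg."""
    history = []
    for q_pred in predictions:
        q_avg = update_avg_queue(q_avg, q_pred, wq)
        history.append(q_avg)
    return history
```

With wq = 0.05, a single predicted burst moves Qavg by only 5% of the gap between the prediction and the current average, so the drop probability reacts to sustained trends rather than to individual spikes.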

VI. PERFORMANCE SIMULATION AND ANALYSIS
A. SIMULATION OF THE TRAFFIC PREDICTION MODEL WITH OPTIMISED TRIPLE EXPONENTIAL SMOOTHING
The degree of self-similarity of the traffic is represented by the Hurst parameter H. A value of 0 < H ≤ 0.5 means that the network traffic has short-range dependence; 0.5 < H < 1 indicates that the network traffic has self-similarity and long-range dependence, and a larger H indicates higher dependence and burstiness in the data stream [23]. In this paper, the original data stream of the satellite network was simulated in Matlab. The R/S analysis method gave an estimated Hurst parameter of H = 0.835 for the original data, showing that the original data stream had high self-similarity and long-range dependence [25]. The prediction model with optimised triple exponential smoothing was used to predict the satellite network traffic and was compared with the autoregressive moving average (ARMA) model and the prediction model with traditional triple exponential smoothing [26].
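The R/S estimate of H can be illustrated with the following sketch. It is a textbook form of rescaled range analysis written in Python rather than the Matlab code used in the paper; the block sizes (powers of two starting at 8) and the use of the population standard deviation are assumptions made here for brevity.

```python
import math


def rescaled_range(block):
    """R/S of one block: range of cumulative mean deviations over the standard deviation."""
    n = len(block)
    mean = sum(block) / n
    cumulative, running = [], 0.0
    for x in block:
        running += x - mean
        cumulative.append(running)
    spread = max(cumulative) - min(cumulative)
    std = math.sqrt(sum((x - mean) ** 2 for x in block) / n)
    return spread / std if std > 0 else 0.0


def hurst_rs(series, min_block=8):
    """Estimate H as the least-squares slope of log(R/S) against log(block size)."""
    log_n, log_rs = [], []
    size = min_block
    while size <= len(series) // 2:
        # Non-overlapping blocks of the current size
        blocks = [series[i:i + size] for i in range(0, len(series) - size + 1, size)]
        values = [v for v in (rescaled_range(b) for b in blocks) if v > 0]
        if values:
            log_n.append(math.log(size))
            log_rs.append(math.log(sum(values) / len(values)))
        size *= 2
    if len(log_n) < 2:
        raise ValueError("series too short for R/S analysis")
    mean_x = sum(log_n) / len(log_n)
    mean_y = sum(log_rs) / len(log_rs)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(log_n, log_rs))
    den = sum((x - mean_x) ** 2 for x in log_n)
    return num / den
```

For uncorrelated noise this estimator returns values near 0.5 (with a known upward bias on short series), so an estimate such as H = 0.835 lies well inside the long-range dependent regime.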
The fitting results of the traffic predicted by the different models are shown in Figure 7. The prediction model with optimised triple exponential smoothing proposed in this paper significantly outperforms the other two prediction models in fitting the original data and is thus able to predict the magnitude and trend of the traffic more accurately.
The MSEs of the three traffic prediction models are compared in Figure 8. The prediction error of the ARMA model is the highest, followed by that of the prediction model with traditional triple exponential smoothing; the error of the prediction model with optimised triple exponential smoothing is significantly lower than that of the other two models, showing that it has a high prediction accuracy. Table 4 lists the MSEs of the three algorithms. The MSE of the algorithm proposed in this paper is smaller than that of the other two algorithms; hence, the proposed algorithm has the highest accuracy.
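For reference, the MSE used to compare the models is the mean of the squared differences between the original and predicted traffic. A minimal sketch follows; the function name is illustrative and not taken from the paper.

```python
def mean_squared_error(actual, predicted):
    """Mean of squared differences between two equally long sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
```

Given the per-model errors, the most accurate model is simply the one with the smallest value, for example `min(errors, key=errors.get)` for a hypothetical dictionary `errors` mapping model names to MSEs.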

B. SIMULATION WITH THE IMPROVED ARED ALGORITHM BASED ON THE CUBIC FUNCTION OF TRAFFIC PREDICTION
Simulation is performed using the improved ARED algorithm based on the cubic function of traffic prediction, and the results are compared with those of the ARED algorithm and the ARED algorithm based on traffic prediction. The simulation parameters are shown in Table 5. Figure 9 shows the curves of the average queue lengths for the three algorithms. Between 9 s and 14 s, the average queue length of the algorithm proposed in this paper is significantly longer than that of the other two algorithms. This is because data bursts occur in the self-similar traffic on the network within this time range, and the proposed algorithm is able to maintain a long queue length during bursts in self-similar traffic to stabilise the transmission performance of the network.

1) AVERAGE QUEUE LENGTH
The mean and standard deviation of the average queue length for each of the three algorithms are provided in Table 6.
Because the proposed algorithm maintains a longer queue during data bursts in the self-similar traffic in order to stabilise the network transmission performance, the mean and standard deviation of its average queue length are larger than those of the other two algorithms.

2) PACKET DROP RATE
The packet drop rate is the ratio of the amount of data dropped to the total amount of data sent during data transmission. The curves of the packet drop rates for the three algorithms are shown in Figure 10. Between 9 s and 14 s, the ARED algorithm and the ARED algorithm based on traffic prediction both have a high packet drop rate, while the packet drop rate of the algorithm proposed in this paper is significantly lower. This is because data bursts occur in the self-similar traffic on the network within this time range, and the proposed algorithm is able to maintain a low packet drop rate in the presence of such bursts.
The mean and standard deviation of the packet drop rate for each of the three algorithms are listed in Table 7. Both are largest for the ARED algorithm, followed by the ARED algorithm based on traffic prediction, and both are significantly smaller for the algorithm proposed in this paper. This indicates that the improved ARED algorithm based on the cubic function of traffic prediction has a lower packet drop rate and better congestion control performance.

3) THROUGHPUT
Throughput is the number of data packets transmitted by the link per unit time. It is an important indicator of the network transmission capacity. Figure 11 shows the curves of the throughputs for the three algorithms. Between 9 s and 14 s, the ARED algorithm and the ARED algorithm based on traffic prediction have low throughputs, whereas the algorithm proposed in this paper has a high throughput. The reason for this difference is that data bursts occur in the self-similar traffic on the network within that time range, and the proposed algorithm is able to maintain a high throughput in the presence of bursts in self-similar traffic.
The mean and standard deviation of the throughput for each of the three algorithms are given in Table 8. The mean of the throughput for the algorithm proposed in this paper is larger than that of the other two algorithms. Because the proposed algorithm is able to maintain a higher throughput in the presence of data bursts in self-similar traffic, the standard deviation of its throughput is larger than that of the other two algorithms, indicating that the improved ARED algorithm based on the cubic function of traffic prediction proposed in this paper can better control the network congestion caused by self-similar traffic.
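The metrics used in this subsection follow directly from their definitions and can be computed as in the sketch below. The function names are illustrative, and the use of the population standard deviation is an assumption, since the paper does not state which form is used in Tables 6 to 8.

```python
import statistics


def drop_rate(dropped, sent):
    """Ratio of data dropped to total data sent (0 if nothing was sent)."""
    return dropped / sent if sent else 0.0


def throughput(delivered_packets, interval_s):
    """Packets transmitted by the link per unit time."""
    return delivered_packets / interval_s


def summarise(samples):
    """Mean and population standard deviation of a metric sampled over time.

    ASSUMPTION: population form (pstdev); the sample form would use statistics.stdev.
    """
    return statistics.mean(samples), statistics.pstdev(samples)
```

Applying `summarise` to the per-interval queue length, drop rate, or throughput samples of each algorithm yields one (mean, standard deviation) pair per algorithm, which is the form of comparison reported in the tables.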

VII. CONCLUSION
In this paper, a prediction model with optimised triple exponential smoothing is established, and the results of network traffic prediction are introduced into the queue management algorithm to propose an improved ARED algorithm based on the cubic function of traffic prediction. The proposed prediction model has a higher prediction accuracy than the ARMA model and the prediction model with traditional triple exponential smoothing. In the presence of data bursts in self-similar traffic, the improved ARED algorithm based on the cubic function of traffic prediction outperforms the ARED algorithm and the ARED algorithm based on traffic prediction in terms of packet drop rate and throughput, thereby effectively controlling the network congestion caused by self-similar traffic.