Waiting Time in a General Active Queue Management Scheme

We derive the waiting time in a queueing scheme in which an arriving job can be denied service with a probability that depends on the queue size. Such a scheme is a generalization of the tail-drop queue, in which a job is denied service when the buffer (waiting room) is full, and can be found in computer networking, call centers and other everyday applications of queueing systems. To make the model very general, we use an arrival process that allows arbitrary shaping of the job interarrival time distribution and the interarrival time autocorrelation, together with general distributions of the service time and general job rejection probabilities. For this model, we prove theorems on the waiting time in the transient case, i.e. as a function of time, as well as in the stationary case. The theoretical results are illustrated via numerical examples, in which the dependence of the behaviour of the system on various parameters is depicted. Among other things, it is demonstrated that the assumed job rejection mechanism may induce rather unexpected waiting times when combined with strong autocorrelation of the arrival process.


I. INTRODUCTION
We analyze a queue in which every arriving job (customer) can be rejected, i.e. denied access to the queue and to service. Such a job leaves the system unserved and never returns. Moreover, the decision whether a job is admitted to the system is probabilistic, and the probability of rejection depends on the queue size at the moment the job arrives.
The most important area of application of such systems is networking. Algorithms in which the packets arriving to a router's buffer are deleted with probability growing with the queue size have been known and studied via simulations for a long time (see e.g. [1], [2], [3], [4], [5], [6], [7], [8], [9]). Recently, these algorithms were implemented in a networking device and studied in a real network of a university, [10]. The main reason why these algorithms are postulated is the necessity to eliminate high buffer occupancies (bufferbloat), typical in contemporary networks, [11], [12].
Networking is not necessarily the only area of application of the queueing model with rejection probability based on the queue size. For instance, it can be used for modelling a call center, which exploits an answering machine to inform a new caller about the number of callers waiting for the service before him (quite common nowadays). It can be conjectured that the probability that a new caller leaves the queue immediately, without service, is a function of the size of the queue ahead of him. In fact, the same reasoning can be applied to any everyday life queue, provided that a customer can see the size of the queue upon arrival.
The associate editor coordinating the review of this manuscript and approving it for publication was Tiago Cruz.
The queueing scheme described above can be perceived as a generalization of the tail-drop queueing scheme, which is well known and used in many computer and electronic systems. In the tail-drop scheme, a buffer of limited capacity, N, is used to store jobs/tasks/packets before service. When the buffer becomes full, a newly arriving job is rejected. It is easy to see that the tail-drop scheme is a special case of the scheme described above: the rejection probability is 0 when the queue size is below N, and 1 when the queue size is N.
To make the model considered herein general, we assume that the job arrival process can have correlated interarrival times. Such autocorrelation is typical in networking, [13], [14], but can also be found in other applications of queueing systems, including everyday life queues. For instance, a cafe next to a train station may experience autocorrelated traffic, caused by train arrivals or departures. When autocorrelation is present in a real system, it is crucial to take it into account in the model of that system. It has been demonstrated that performance predictions based on a model with omitted autocorrelation can be wrong by several orders of magnitude, even if all other parameters of the model are accurate.
When characterizing any queueing system, one of the most important characteristics is the mean waiting time, i.e. the mean time spent in the queue by a job before entering service.
Therefore, we derive herein the mean waiting time in the model described above. Both the transient and the stationary case are solved. Namely, we first derive the time-dependent characteristic, i.e. the mean waiting time assuming that a hypothetical job arrives at an arbitrary time t (Theorem 1). Then, letting t → ∞, we derive the mean waiting time in the stationary regime (Theorem 2). Note that, having the solution for an arbitrary t, we may use small values of t to study the evolution of the system just after it has been activated, i.e. to study the influence of its initial state on the short-time operation of the system.
To illustrate theoretical results, we present numerical examples with diversified parameters, including different autocorrelations of the arrival process, job rejection probabilities and initial states of the system. In these examples we can see how the mentioned parameters influence the operation of the queue.
As the model of the arrival process we use the Markov-modulated Poisson process, [15]. It combines superb modelling capabilities with moderate analytical difficulties. In particular, using this process we can mimic accurately any practically useful shape of the autocorrelation function, together with any practically useful shape of the interarrival time distribution (see, e.g. [16]).
Finally, both the distribution of the service time and the function assigning rejection probabilities are general and can have arbitrary forms. This makes the model as general as possible.
The mathematical approach used herein exploits regeneration points in the evolution of the system. They make it possible to formulate a system of integral equations using the total probability law. This method can be used to derive not only the classic performance parameters, e.g. the mean queue size and waiting time, but also some less frequently used characteristics, like the duration of the overflow period or the burst ratio (see [17], [18]).
The remaining part of the article is organized as follows. Section II outlines the related work. In Section III, the model of the queue is specified, together with the model of the arrival process and its main characteristics. Then, the main results of the paper are shown in Section IV. Firstly, the time-dependent mean waiting time is derived and presented in Theorem 1. Then, as a corollary, the stationary mean waiting time is obtained in Theorem 2. In Section V, numerical examples are gathered. Three parameterizations of MMPP are used with different rejection probabilities and initial conditions to illustrate the waiting time evolution. Finally, conclusions are presented in Section VI.

II. RELATED WORK
As far as the author knows, the results of this article are new.
Finally, autocorrelation of traffic in a queue with probabilistic rejections was taken into account in [29] and [30]. However, these papers were devoted to different queueing characteristics, namely the queue size distribution, [29], and the burst ratio, [30].

III. MODEL OF THE SYSTEM
The following queueing model is analyzed herein. Jobs arrive at the service station according to the Markov-modulated Poisson process, which is defined below. At the service station, they are served in the order of arrival. The distribution of the service time is general, with distribution function F(t).
An arriving job, if allowed, joins the queue of jobs waiting for service. A job is rejected with probability d(n), where n is the number of jobs present in the system upon the new job arrival. A rejected job leaves the system unserved, immediately after arrival. Consequently, with probability 1 − d(n), the new job is allowed to join the queue.
The capacity of the system is finite and equal to N . Namely, if upon a job arrival there are N jobs in the system, the new job is rejected with probability 1.
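The admission rule described above can be sketched in a few lines of Python. This is only an illustration: the function names and the linear example function are hypothetical and are not taken from the paper, and d(n) is treated as the rejection probability, in line with the tail-drop special case.

```python
import random

def tail_drop(n, capacity):
    """Tail-drop rejection probability: reject only when the system is full."""
    return 1.0 if n >= capacity else 0.0

def linear_drop(n, capacity):
    """A hypothetical linear rejection function, used only for illustration."""
    return min(1.0, n / capacity)

def is_rejected(n, capacity, d, rng=random):
    """Decide the fate of a job that finds n jobs in the system on arrival.

    d(n, capacity) is the rejection probability; a full system always rejects.
    """
    if n >= capacity:
        return True
    return rng.random() < d(n, capacity)
```

With `tail_drop`, a job is rejected only when it finds N jobs in the system; any other function with values in [0, 1] can be passed in its place.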
By X(t) we denote the system occupancy (queue size) at time t. The service position, if occupied, is included in the queue size X(t). By M we denote the mean duration of the service time, while the system load is:
$$\rho = \lambda M,$$
where λ is the rate of the Markov-modulated Poisson process, given below in (5).
We assume that a new service begins at t = 0 if X(0) > 0, which makes the time origin a service completion time. This is a technical assumption, which does not cause any loss of generality of the model.
It is clear that the presented model generalizes the tail-drop queueing scheme. In the tail-drop model we simply have d(n) = 0 if n < N and d(n) = 1 if n = N. In the model analyzed here, the function d(n) may have an arbitrary form.
Now, the Markov-modulated Poisson process (MMPP), [15], is defined using an auxiliary modulating process, i.e. a continuous-time Markov chain. The modulating process has m states {1, ..., m}. Its rate matrix is denoted by Q, and the modulating state at time t by J(t). Arrivals in an MMPP occur according to a time-inhomogeneous Poisson process, such that the instantaneous arrival rate at time t is λ_{J(t)}. Hence, to parameterize an MMPP, we need m arrival rates, λ_1, ..., λ_m, in addition to the matrix Q. In calculations, it is often convenient to use these rates in the form of a square matrix:
$$\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_m).$$
The main characteristic of an MMPP is its total rate (intensity), denoted by λ. To calculate λ, the stationary distribution of the modulating chain, π, is needed. It can be computed using the set of linear equations:
$$\pi Q = \mathbf{0}, \qquad \pi \mathbf{1} = 1,$$
where 1 denotes a column vector of ones. Then, the rate of an MMPP is:
$$\lambda = \pi \Lambda \mathbf{1}. \tag{5}$$
The interarrival time density in an MMPP is expressed by a matrix exponential. Namely, we have: where P denotes probability and τ_k is the k-th arrival time. The variance, V, of the interarrival time in an MMPP equals:
In this paper, an important role is played by the autocorrelation function. The k-lag autocorrelation of an MMPP is equal to:
MMPP enables fitting both the interarrival time distribution and the autocorrelation function to an observed arrival process. Several methods for fitting the matrices Q and Λ were proposed, see e.g. [16], [31], [32], [33], and [34].
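As an illustration of how the stationary distribution π and the total rate λ of an MMPP can be obtained numerically, the following sketch solves πQ = 0 with the normalization π1 = 1 by replacing one balance equation with the normalization. The two-state matrix Q and the rates used in the example are hypothetical and are not the parameterization used in the paper; the helper names are likewise illustrative.

```python
def solve_linear(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def mmpp_stationary(Q, rates):
    """Return (pi, lam): stationary distribution of Q and the total MMPP rate."""
    m = len(Q)
    # Rows of the transposed matrix are the balance equations of pi Q = 0.
    A = [[Q[i][j] for i in range(m)] for j in range(m)]
    A[-1] = [1.0] * m                # replace a redundant equation by sum(pi) = 1
    b = [0.0] * (m - 1) + [1.0]
    pi = solve_linear(A, b)
    lam = sum(p * r for p, r in zip(pi, rates))   # lam = pi * Lambda * 1
    return pi, lam

# Hypothetical two-state example (not the matrices from the paper).
Q_example = [[-0.1, 0.1], [0.3, -0.3]]
rates_example = [0.5, 2.5]
pi_example, lam_example = mmpp_stationary(Q_example, rates_example)
```

For this example the modulating chain spends three quarters of the time in the first state, so the total rate is 0.75 · 0.5 + 0.25 · 2.5 = 1.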

IV. WAITING TIME
Let W_{n,i}(t) be the mean time that a job that hypothetically arrived at the system at time t, and was allowed to join the queue, would spend in the system before service, under the assumptions X(0) = n and J(0) = i. It is easy to see that W_{n,i}(t) is equal to the mean amount of unfinished work in the system at time t. W_{n,i}(t) depends on the initial system occupancy, n, and the initial modulating state, i, for every t. Thus it is a transient, time-dependent characteristic.
Define the following Laplace transform:
$$w_{n,i}(s) = \int_0^\infty e^{-st}\, W_{n,i}(t)\, dt.$$
Both W_{n,i}(t) and w_{n,i}(s) will also be used in vector forms:
$$W_n(t) = [W_{n,1}(t), \ldots, W_{n,m}(t)]^T, \qquad w_n(s) = [w_{n,1}(s), \ldots, w_{n,m}(s)]^T.$$
Denote by A_{n,k,i,j}(v) the probability that, in a system with suspended service, k jobs were allowed to join the queue in the time interval (0, v) and J(v) = j, under the assumptions X(0) = n and J(0) = i. (It will be shown later how A_{n,k,i,j}(v) can be computed.)
Let us assume now that the system is initially non-empty, X(0) = n > 0. Conditioning on the end of the first service time, v, we obtain the following system of integral equations for W_{n,i}(t): where n = 1, ..., N, i = 1, ..., m.
Indeed, if the end of the first service time happens before t, then with probability A_{n,k,i,j}(v) there are n + k − 1 jobs in the queue at time v and the modulating state at time v is j. Counting from time v, the new, conditional value of the mean waiting time is W_{n+k−1,j}(t − v). Therefore, summing over all possible values of j, k and v, we obtain the first summand of (14). If the end of the first service time happens after t, then with probability Σ_{j=1}^{m} A_{n,k,i,j}(t) there are n + k jobs in the queue at time t, and one of these jobs is in service with a residual service time of v − t. Hence, to calculate the mean waiting time, we need to sum up n + k − 1 complete service times and one incomplete service time of length v − t. Again, summing over all possible values of k and v, we arrive at the second summand of (14).
VOLUME 11, 2023
Integration by parts in (14) gives: with Applying the Laplace transform to both sides of (15) yields: where Then, (17) can be simplified to: where H_{n,k}(s) = [h_{n,k,i,j}(s)]_{i,j=1,...,m}.
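For orientation, the verbal description of the two summands of (14) can be written out as the following sketch. It is a reading of the text only, not a quotation of the published equation; in particular, the summation range of k (from 0 to N − n) is an assumption made here.

```latex
W_{n,i}(t) =
  \sum_{j=1}^{m} \sum_{k=0}^{N-n} \int_{0}^{t}
    A_{n,k,i,j}(v)\, W_{n+k-1,j}(t-v)\, dF(v)
  + \sum_{j=1}^{m} \sum_{k=0}^{N-n} A_{n,k,i,j}(t)
    \int_{t}^{\infty} \bigl[ (n+k-1)M + v - t \bigr]\, dF(v)
```

The first term covers a first service completion at v < t, after which the system restarts from state (n + k − 1, j) with the remaining time t − v. The second term covers v > t, where the arriving job waits for n + k − 1 full services of mean M plus the residual time v − t.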
Now we can analyze the situation where the queue is empty at the time origin. Conditioning on the time of the first event in the arrival process, v, which can be either an arrival of a job or a change of the modulating state, we have the equation: (25) for i = 1, ..., m, where:  v). This gives the second summand of (25). If the arriving job is rejected, then the conditional value of the mean waiting time is W_{0,j}(t − v), which gives the third summand of (25). We do not have to take into account the situation when the first job arrives after t, because the mean waiting time at t is then 0.
Applying the Laplace transform to both sides of (25) yields: Denoting: from (27) we get:
$$w_0(s) = V_0(s)\, w_1(s) + U_0(s)\, w_0(s). \tag{30}$$
As we can see, (21) and (30) constitute a system of linear equations with respect to w_n(s), n = 0, ..., N. After some easy algebra, its solution can be presented in the following explicit form.
Theorem 1: The transform of the mean waiting time at time t in a queue fed by MMPP and with rejection probabilities d(n) equals:
and 0 is an m × m matrix of zeros.
Note that the matrices U_0(s) and V_0(s), occurring in the theorem above, are easy to compute directly from the parameters of the model. On the other hand, the matrices C_{i,j}(s), R_{i,j}(s) and H_{i,j}(s) depend on the probabilities A_{n,k,i,j}(v), defined at the beginning of this section. Fortunately, these probabilities can be calculated using a result of [29] (see Theorem 1, page 107). Namely, it was proven that: where
Finally, the derivation of the stationary mean waiting time from Theorem 1 poses no problem. In particular, it is known that the limit of a function f(t) as t → ∞ is the same as the limit of sf(s) as s → 0+, where f(s) is the Laplace transform of f(t). Therefore, we get the following result.
Theorem 2: The stationary mean waiting time in a queue fed by MMPP and with rejection probabilities d(n) equals: where y(s) and B(s) are given in (34) and (35), respectively, while [·]_1 is the first entry of a column vector.
Note that the stationary W does not depend on the initial queue size or the initial state of the modulating chain. Therefore, any other n and i could have been taken instead of 0 and 1 in (42).
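The limit property used above (the final value theorem, valid when the limit of f(t) exists) can be checked numerically on a toy function. The function f(t) = 3 − 2e^{−t} below is chosen only for illustration and has nothing to do with the waiting time transform; the names are hypothetical.

```python
def toy_transform(s):
    """Laplace transform of f(t) = 3 - 2*exp(-t), i.e. 3/s - 2/(s + 1)."""
    return 3.0 / s - 2.0 / (s + 1.0)

def final_value(transform, s=1e-8):
    """Approximate lim f(t) as t -> infinity by s*F(s) at a small positive s."""
    return s * transform(s)

limit_estimate = final_value(toy_transform)
```

Here f(t) tends to 3 as t grows, and s·F(s) = 3 − 2s/(s + 1) tends to the same value as s → 0+.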
To use Theorem 1 in practice, the Laplace transform should be inverted. All the numerical examples presented herein were obtained using the inversion method of [35], which is fast and accurate. Theorem 2 can be used directly, without inversion.
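For readers who want to experiment with numerical inversion, the sketch below uses the classical Gaver-Stehfest formula. It is not claimed to be the method of [35]; it is swapped in here only because it is short and needs the transform on the real axis only. The function names are illustrative.

```python
from math import factorial, log, exp

def stehfest_coefficients(n_terms):
    """Gaver-Stehfest weights V_1..V_N for an even number of terms N."""
    half = n_terms // 2
    coeffs = []
    for k in range(1, n_terms + 1):
        total = 0.0
        for j in range((k + 1) // 2, min(k, half) + 1):
            total += (j ** half * factorial(2 * j)) / (
                factorial(half - j) * factorial(j) * factorial(j - 1)
                * factorial(k - j) * factorial(2 * j - k))
        coeffs.append((-1) ** (k + half) * total)
    return coeffs

def invert_laplace(transform, t, n_terms=12):
    """Approximate f(t) from its Laplace transform evaluated at real points."""
    a = log(2.0) / t
    coeffs = stehfest_coefficients(n_terms)
    return a * sum(v * transform(k * a) for k, v in enumerate(coeffs, start=1))

# Sanity check on a known pair: 1/(s + 1) is the transform of exp(-t).
approx = invert_laplace(lambda s: 1.0 / (s + 1.0), 1.0)
exact = exp(-1.0)
```

The Gaver-Stehfest formula works best for smooth, non-oscillating functions and loses accuracy in double precision when the number of terms is large, so it should be treated as a quick check rather than a replacement for a dedicated inversion method.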

V. NUMERICAL EXAMPLES
If not stated otherwise, the following MMPP will be used in this section: of rate λ = 1.
The justification for choosing this parameterization of the MMPP is that it produces a positive autocorrelation of traffic, which can be expected in networking. The autocorrelation provided by matrices (43) and (44) is moderate. It will be used as the default. However, it can easily be strengthened or weakened by multiplying Q by a positive number, if needed. This will be done in Section V-C, where, in addition to parameterization (43) and (44), two other parameterizations will be considered, with much stronger and much weaker autocorrelation, respectively.
If not stated otherwise, the following rejection probability function will be used: This will be altered in Section V-B, where five other, nonlinear rejection probability functions will be used in addition to (45). Finally, the distribution of the service time will be hyperexponential with parameters: (0.25, 0.75), (4.0, 0.8). It can be verified easily that this distribution has the mean of M = 1 and a moderate standard deviation of 1.17. Therefore, the system will be fully saturated: ρ = λM = 1.
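The stated mean and standard deviation of the service time can be verified with a short computation. The sketch assumes that (0.25, 0.75) are the branch probabilities and (4.0, 0.8) the rates of the two exponential branches; this reading is an assumption, although it reproduces the values quoted in the text.

```python
from math import sqrt

def hyperexp_moments(probs, rates):
    """Mean and standard deviation of a hyperexponential distribution."""
    mean = sum(p / r for p, r in zip(probs, rates))
    second_moment = sum(2.0 * p / r ** 2 for p, r in zip(probs, rates))
    return mean, sqrt(second_moment - mean ** 2)

service_mean, service_std = hyperexp_moments((0.25, 0.75), (4.0, 0.8))
```

Under this reading the mean is 0.25/4 + 0.75/0.8 = 1 and the variance is 2.375 − 1 = 1.375, whose square root is about 1.17.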

A. IMPACT OF THE INITIAL STATE OF THE QUEUE
In Fig. 1, the mean waiting time is depicted as a function of time. The initial modulating state is 1 in every case, but different initial queue sizes are used, from 0 to 32.
As we could expect, the transient evolution of the system depends strongly on the initial size of the queue. However, after about 80s, the waiting time reaches the stationary value.
For comparison, in Fig. 2 the same initial queue sizes are used, but with the initial modulating state of 3. As we can see, a different modulating state makes the transient evolution quite different. However, the time of convergence to the stationary value is more or less the same in Figs. 1 and 2.
The impact of the initial modulating state on the evolution of the system can be studied further in Figs. 3 and 4. Namely, in Fig. 3 the mean waiting time is depicted as a function of time for all three modulating states, but with an unaltered initial queue size of 16. Similarly, in Fig. 4, the initial queue size of 32 is used in combination with different modulating states.
As we can observe in Figs. 1-4, the influence of the initial modulating state on the transient evolution is stronger when the initial queue size is small or moderate, and weaker when the initial queue size is high.
Finally, it is worth mentioning that in many cases the transient evolution of the waiting time is non-monotonic. See, for instance, the curves for n = 8 and n = 16 in Fig. 1, almost all curves in Fig. 2, and all curves in Fig. 3. Moreover, both a local minimum and a local maximum can occur (Fig. 3).
FIGURE 4. The mean waiting time for different initial modulating states and n = 32.

B. IMPACT OF DROP PROBABILITIES
In the section above, the rejection probability function was not altered and always had the form of (45). In this section, we will use the following function instead: where x > 0 is a parameter and d(n) is given in (45). Parameter x has an easy interpretation: the smaller x, the sooner the system will start rejecting jobs and the more of them will be rejected. And vice versa: a large x is equivalent to small rejection probabilities. In Figs. 5, 6 and 7, the mean waiting time is depicted for six different rejection probability functions, namely d_0.25, d_0.5, d_1, d_2, d_4 and d_8. The difference between Figs. 5, 6 and 7 is that Fig. 5 was obtained for the initial system state n = 0 and i = 1, Fig. 6 for the initial state n = 16 and i = 2, and Fig. 7 for the initial state n = 32 and i = 3.
As we can see in these figures, the convergence to the stationary value is slightly quicker when a strong rejection function is used (e.g. d_0.25), and slightly slower when a weak function is applied (e.g. d_8). The differences, however, are not profound: in all cases the stationary value is reached somewhere between 60s and 100s.
Moreover, the evolution of the system in each figure is different, depending on the initial state of the queue, no matter which d x function was used.
Finally, we can see that for high values of x, the curves change less and less. This can be easily explained by the fact that if x → ∞, then the rejection scheme approaches the tail-drop scheme, i.e. the rejection probability becomes 0 for every n < 32 and 1 for n = 32. Therefore, the curves in Figs. 5, 6 and 7 converge to the tail-drop curve, when x grows.
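The limiting behaviour for large x can be illustrated with a hypothetical family of rejection functions. The forms d(n) = n/32 and d_x(n) = d(n)^x used below are assumptions made only for this sketch; they are not taken from (45) or from the definition of d_x in the paper, although they share the properties described in the text (smaller x means more rejections, and x → ∞ gives tail-drop with capacity 32).

```python
def linear_rejection(n, capacity=32):
    """Hypothetical baseline: rejection probability growing linearly with n."""
    return n / capacity

def powered_rejection(n, x, capacity=32):
    """Hypothetical family d_x(n) = d(n)**x, with x > 0."""
    return linear_rejection(n, capacity) ** x

# Rejection probability at half-full buffer for the six values of x in the text.
half_full = {x: powered_rejection(16, x) for x in (0.25, 0.5, 1, 2, 4, 8)}
```

In this family, the rejection probability at any n < 32 shrinks towards 0 as x grows, while it stays equal to 1 at n = 32, which is exactly the tail-drop rule.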

C. IMPACT OF AUTOCORRELATION
So far, only the parameterization of the MMPP given in (43) and (44) was used. We will alter this in this section, to obtain different strengths of the autocorrelation. Namely, the following two other parameterizations will be used: MMPP': Q′ = 10Q, Λ′ = Λ, and MMPP'': Q′′ = Q/10, Λ′′ = Λ, together with function d(n) from (45).
In Fig. 8, the autocorrelation functions for matrices Q′ and Q′′ are depicted, accompanied by the original autocorrelation for Q. As we can see, the autocorrelation for Q′′ is strong and long term, while for Q′ it is weak and short term. The autocorrelation for the original Q is moderate, somewhere between the two.
In Fig. 9 the mean waiting time is presented as a function of t for weak autocorrelation (Q′), several initial queue sizes and i = 2. Similarly, in Fig. 10 the mean waiting time is presented for strong autocorrelation (Q′′), several initial queue sizes and i = 2.
When comparing Fig. 9 with Fig. 10, the first striking observation is the time of convergence to the stationary value. In the case of strong autocorrelation, the time of convergence is about 5 times longer (approx. 500s versus 100s). This was to be expected -the long term autocorrelation should cause such an effect.
An interesting and rather surprising observation can be made if we compare stationary waiting times for the three MMPP parameterizations considered so far. Namely, the stationary waiting times are 10.9, 9.2 and 6.9 for matrices Q ′ , Q and Q ′′ , i.e. for weak, moderate and strong autocorrelation, respectively. Note that other parameters of the system are not altered (the load, distribution of the service time and rejection probabilities).
This observation contradicts a naive belief that the stronger positive autocorrelation of traffic is, the worse all queueing performance characteristics are. In our case, the stronger autocorrelation, the better (shorter) stationary waiting time. This effect is, naturally, of great significance in networking, where we can always expect autocorrelated traffic.
The observed phenomenon can be explained by the interaction between traffic autocorrelation and rejection function, d(n). It is true that a strong, positive autocorrelation drives the queue size to grow. In a system without job rejections, we indeed would have higher queue sizes and waiting times for stronger autocorrelations. In our system, however, there are job rejections caused by d(n). Moreover, function d(n) rejects the more jobs, the longer the queue is. Therefore, the overall number of rejected jobs grows when autocorrelation gets stronger. In effect, the carried load of a system with the rejection mechanism decreases, even if the offered load, ρ, remains unaltered.
In other words, the reduced stationary waiting time for strongly correlated traffic comes at the price of an increased number of rejected jobs.

VI. CONCLUSION
In this paper, an analysis of the waiting time was carried out for a queueing scheme in which an arriving job can be denied service with a probability that depends on the queue size. Such a queueing scheme can be found in computer networking, call centers and other customer service systems. It is also a generalization of the commonly used tail-drop scheme. Therefore, all the results presented herein can be used for the tail-drop queue as well, by applying the function d(n) = 0 for n < N and d(n) = 1 otherwise.
A general model of the queue was used, with the arrival process of arbitrary interarrival time distribution and interarrival time autocorrelation, arbitrary distribution of the service time and job rejection probabilities. For this model, two theorems on the mean waiting time were proven -in the transient and stationary regime, respectively.
These theorems were illustrated via numerical examples, in which the dependence of the transient behaviour of the system on various parameters was depicted. In particular, it was shown how the initial state of the queue and the form of the rejection probability function influence the transient evolution of the mean waiting time.
Special consideration was given to the autocorrelation of traffic. It was shown that strong autocorrelation may significantly increase the time of convergence to the stationary value. It was also demonstrated that strong autocorrelation may cause the mean stationary waiting time to be shorter than in the case of weak autocorrelation. This contradicts a naive belief that the stronger the autocorrelation of traffic, the worse all queueing performance characteristics. Such a simplification is clearly not valid in queues with the rejection mechanism.
Both of these observations are important in networking, where we do have traffic autocorrelation. Namely, when the rejection mechanism is used, we can expect long non-stationary periods of operation of the system, as well as reduced waiting times, at the price of an increased number of rejected jobs.