Adaptive Cloud-Based Extended Reality: Modeling and Optimization

Extended Reality (XR) — which includes Virtual Reality and Augmented Reality — promises to bring the virtual and telepresence experience to another level. Unfortunately, solutions leveraging these technologies require special high-performance computing platforms that degrade the cost-benefit balance. Moving processing to the cloud solves this problem but imposes strict requirements on data transmission reliability, bandwidth, and delays. The satisfaction of these requirements becomes an extremely challenging problem in the presence of other types of delay-sensitive traffic, such as remote control, industrial automation, or the control commands of the Cloud XR application itself. This article studies the joint service of the adaptive Cloud XR traffic with other high-priority delay-sensitive traffics. The paper develops an analytical model of the considered communication system. The model represents the system as a discrete state Markov chain and estimates the quality of experience for Cloud XR users in various scenarios. Using the model, the paper estimates the network capacity for the Cloud XR traffic and optimizes the bitrate adaptation function of the Cloud XR video streaming application. The goal of the optimization is to improve the visual quality of the virtual environment observed by the users, subject to the constrained probability of image impairments due to excessive delivery delays. Numerical results demonstrate the high accuracy of the developed model and the benefits provided by the optimization.


I. INTRODUCTION
Extended Reality (XR), which includes Virtual Reality (VR), Augmented Reality (AR), and Mixed Reality (MR), is one of the key technologies enabling virtual presence and telepresence. Numerous studies and emerging technological products reveal that XR technologies can be applied in various fields. For example, in education, highly immersive XR applications can improve the attention and interest of the students [1]. In medicine, XR applications can be used for clinical protocol testing and educational training [2]. In engineering, architecture, and geo-informational sciences, XR technologies simplify modeling, visualization, and analysis of large-scale complex structures [3]-[6].
The associate editor coordinating the review of this manuscript and approving it for publication was Andrea F. Abate.
However, solutions that use XR headsets untethered from a workstation require integrating special high-performance computing platforms into battery-powered XR headsets. This simultaneously constrains the achievable visual quality, reduces the battery life, and increases the cost of the headsets [7]. A recent paradigm, Cloud XR, moves most of the processing to the cloud and changes the system architecture as follows.
In Cloud XR [8], the headset does not render the virtual scene by itself, so it does not require expensive and power-consuming hardware. Instead, the headset captures the user's actions and sends the data to a remote server. The server renders the virtual XR scene according to the received data, encodes it into a video stream, and sends it back to the headset, which shows the video to the user. Moving processing to the cloud makes headsets very cheap and reduces their weight and power consumption [9], but imposes strict requirements on data transmission reliability, bandwidth, and delays between the remote cloud server and the end-users [10]. Such an architecture might have seemed fantastic even a decade ago, but today, with multi-gigabit wired and wireless links, it seems feasible [11] and is being evaluated and deployed around the world [12]-[14].
VOLUME 9, 2021. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
Typically, XR applications are interactive. To provide an immersive experience, Cloud XR applications require minimal feedback delay and high image quality [15], [16]. Since the content is generated on-the-fly according to the actions of the user, each generated video frame shall be delivered to the headset with a limited delay. However, in the network, video traffic may have lower priority than other delay-sensitive traffic: remote control, gaming, industrial automation. In addition to interference from these traffic types, video traffic transmissions can be affected by the control traffic of the XR application itself. Therefore, the amount of network resources available to the XR video stream can fluctuate with time, and some video frames may take longer to deliver than others. To prevent playback interruptions and to ensure the maximal image quality, XR applications shall adaptively select the quality of the video stream in real-time [17].
However, the tight latency requirements and sporadic interference from higher-priority flows render it challenging for the adaptation algorithm to strike the right balance between resiliency and image quality.
The optimization of the Quality of Experience (QoE) for XR and Cloud XR applications has received much attention from both researchers and industry. XR videos are panoramic, so novel approaches to efficiently represent and compress them were developed [18] and standardized [19]. Next, to increase the image refresh rate, different approaches to motion prediction and proactive video rendering and transmission were devised [20], [21].
Other important aspects of Cloud XR optimization lie in finding the right balance between the computations performed by the headset and the computations performed by the cloud or Mobile Edge Computing server [22] and optimizing the energy-efficiency of the cloud [23]-[25].
Finally, many papers propose different strategies to jointly optimize XR scene rendering, caching of the proactively generated video frames, and their transmission by reserving the network resources [26]-[28]. However, the authors typically do not consider the structure of the XR video flows and the organization of the XR presentation at the headsets. Instead, they reduce the video stream to a series of requests that shall be processed under a fixed delay constraint. Such a traffic pattern is more relevant for WebXR services, e.g., [29]-[31]. Also, the typical assumption is that the throughput provided by the network to the XR flows is constant, or that a sufficient amount of network resources can be reserved when needed. In either case, interference from other traffic in the network is not considered. That is why such papers do not focus on the XR video quality adaptation.
In contrast to these papers, we look at the Cloud XR QoE optimization from a different perspective. We model the transmission of the XR data through the network and take into account the variation in its delivery rate caused by random interference from other high-priority traffics.
In the paper, we study the joint service of adaptive Cloud Extended Reality application traffic with other high-priority delay-sensitive traffics. The contribution of the paper is as follows.
First, to the best of our knowledge, we are the first to design a mathematical model of an adaptive real-time Cloud XR application that allows evaluating QoE for XR in wired and wireless networks. For that, we take into account the following peculiarities of the traffic generated by XR applications that are often left out of consideration in the literature. 1) We consider a realistic client-side XR application design that employs a small jitter-buffer to smooth the fluctuations in video frame delivery. 2) We consider that the XR traffic consists of two traffic types: high-priority control traffic and real-time adaptive XR video. 3) We take into account that the priority of XR video flows may be lower than that of other delay-sensitive traffic types.
We consider the average video bitrate and stalling probability as objective video QoE metrics because they provide a good trade-off between modeling accuracy and estimation complexity. Also, they are often considered in the literature [32]. For control traffic, we take into account the mean command delivery delay. Second, we use the developed model to estimate the capacity of a communication system for XR video flow. We define the capacity as the maximal average XR video bitrate for which its delay requirements can be met with a pre-defined probability. Finally, we use the model to find the XR video bitrate adaptation function that maximizes the capacity.
The rest of the paper is organized as follows. In Section II, we describe the Cloud XR system and introduce the problem statement. Section III reviews the relevant literature. Section IV describes the joint service model and shows how it can be used to estimate the network capacity for the XR video stream and to optimize the XR bitrate adaptation function. In Section V, we present and discuss the obtained numerical results. Finally, Section VI concludes the paper.

II. SYSTEM DESCRIPTION AND PROBLEM STATEMENT
A. SYSTEM DESCRIPTION
Figure 1 presents a simplified architecture of the Cloud XR system considered in the paper. The XR scene is rendered at the cloud server, encoded into a video stream with a specific visual quality, and transmitted to the XR headset frame-by-frame. The headset presents the scene by playing the video to the user, captures her actions with the sensors, and generates the control commands for the server. The server receives the commands, updates the scene accordingly, and transmits the next frames to the headset. Such a workflow is typical for the existing cloud-based gaming and XR systems [10], [33], [34].
Because the scene changes according to the user actions, the sooner the commands are delivered, the smaller the scene refresh delay perceived by the user. Therefore, in the network, the commands shall have higher priority than the video stream.
The video stream is the sequence of video frames generated by the server with the inter-frame interval $T$. The size of each frame is $S = b \cdot T$, where $b$ is the bitrate of the video stream. In the paper, we assume that the higher the bitrate, the higher the video quality, which is the typical case for a well-engineered streaming system [35].
The XR scene presentation at the headset is organized as the following video playback. The client builds up the initial playback buffer of $K_0$ video frames, and only after that does it start the playback. The larger $K_0$ is, the more resilient the playback is to the variations in the frame delivery rate, but, at the same time, the higher the scene refresh delay perceived by the user. The client pulls one whole frame from the playback buffer every interval $T$. If there is no whole video frame in the buffer, the client does not play anything, and a video stall occurs. At this point, the virtual buffer of the server, i.e., the number of video frames not yet delivered to the client, contains $K_0$ frames. To reduce the load on the network and to keep the delay between frame generation and playback limited, the server discards the next generated frame. This way, the server and client buffer levels have a one-to-one correspondence. The pseudocode of the video playback algorithm is presented in Algorithm 1.
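The client-side behavior described in this paragraph can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `simulate_playback`, the `arrivals` trace (the number of whole frames that reach the client in each inter-frame interval), and the rule that playback begins in the interval after the buffer first reaches $K_0$ frames are all hypothetical choices made for the example. The server-side frame discard is not modeled here.

```python
from collections import deque


def simulate_playback(arrivals, k0):
    """Replay a trace of frame arrivals through the client playback buffer.

    arrivals[i] -- assumed number of whole frames delivered in interval i
    k0          -- initial buffering threshold K_0 (in frames)
    Returns (frames_played, stalls).
    """
    buffer = deque()
    playback_started = False
    played = stalls = 0
    next_frame_id = 0

    for n in arrivals:
        # Store every whole frame received during this interval.
        for _ in range(n):
            buffer.append(next_frame_id)
            next_frame_id += 1

        if not playback_started:
            # Initial buffering: wait until K_0 whole frames are available.
            playback_started = len(buffer) >= k0
            continue

        if buffer:
            buffer.popleft()      # play one whole frame per interval T
            played += 1
        else:
            stalls += 1           # nothing to play: a video stall occurs

    return played, stalls


played, stalls = simulate_playback([1, 1, 1, 0, 0, 0, 1], k0=2)
```

With this trace, the buffer built up during the first intervals absorbs two empty intervals before the third one causes a stall, which illustrates the trade-off between a larger $K_0$ and resilience to delivery fluctuations.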
In the paper, we assume that the server knows the exact amount of data not yet delivered to the client. Since the server needs this information once in the inter-frame interval, it can be easily achieved in practice. For example, the client can send its current buffer level to the server with the control commands that shall be delivered with a small delay anyway. Alternatively, to obtain the most relevant information on the undelivered data, the server can use the dedicated control connections with the routers along the data transmission path [36].

Algorithm 1 XR Video Playback Algorithm
K is the XR session duration (in inter-frame intervals)
rxFrames is the list of frames in playback buffer
ReceiveFrame is the function to receive data from the network
playbackStarted = False
while NOT playbackStarted do
    frame = ReceiveFrame()
    rxFrames.push(frame)
    if length(rxFrames) >= K_0 then
    ...
    if IsFullFrame(frame) then
        rxFrames.push(frame)
    end if
    if length(rxFrames) > 0 then
        playbackFrame = rxFrames.pull()
    ...
    Play(playbackFrame)
end for

In the network, in addition to the command traffic, other delay-sensitive traffic can have a higher priority than the video flows. This traffic can be generated by factory automation, telemedicine, and autonomous and remote driving applications. Multiple sources with different patterns generate random high-priority traffic, and the user's actions causing the command traffic are unpredictable too. Therefore, we model the high-priority traffic as a Poisson flow of packets with the rate $\lambda$ and a general packet size distribution.
In the paper, we assume that the high-priority traffic receives no interference from the XR video flow. Hence, we can estimate its QoS independently using well-known analytical results for M/G/1-type queueing systems (e.g., see [37], [38]).
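As an example of such a well-known result, the mean waiting time in an M/G/1 queue is given by the Pollaczek-Khinchine formula, $W_q = \lambda E[X^2] / (2(1 - \rho))$ with $\rho = \lambda E[X]$, where $X$ is the service time. The sketch below evaluates it in Python. The function name and the example numbers (exponential service with a 1 ms mean and an arrival rate of 0.3 per ms) are illustrative assumptions, and the source does not state which specific M/G/1 results it uses.

```python
def mg1_mean_wait(lam, mean_service, second_moment_service):
    """Mean waiting time in queue for an M/G/1 system (Pollaczek-Khinchine).

    lam                   -- Poisson arrival rate
    mean_service          -- E[X], mean service time
    second_moment_service -- E[X^2], second moment of the service time
    """
    rho = lam * mean_service
    if not 0.0 <= rho < 1.0:
        raise ValueError("the queue is stable only for 0 <= rho < 1")
    return lam * second_moment_service / (2.0 * (1.0 - rho))


# Illustrative numbers: exponential service with mean 1 ms has E[X^2] = 2 ms^2.
lam = 0.3            # arrivals per ms (assumed for the example)
mean_service = 1.0   # ms
wait = mg1_mean_wait(lam, mean_service, 2.0 * mean_service ** 2)
delay = wait + mean_service   # mean delivery delay = waiting + service
```

For exponential service times the result reduces to the familiar M/M/1 expression $\rho / (\mu - \lambda)$, which gives a quick sanity check of the implementation.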
The channel resources consumed by the high-priority traffic during an inter-frame interval change with time. Therefore, to reduce the probability of stalling and maximize the video quality, the bitrate of the video stream shall be adjusted adaptively according to the amount of channel resources remaining after servicing the high-priority traffic. Further, in the paper, we consider the following bitrate adaptation scheme, which is similar to the bitrate adaptation performed by the cloud gaming service Google Stadia [33]. Once per inter-frame interval, the server analyzes the current virtual buffer state and chooses the bitrate of the next generated frame from the discrete set $\mathcal{B}$ of available bitrates. Therefore, the bitrate adaptation function can be an arbitrary function of the buffer state. The pseudocode of the considered scheme is presented in Algorithm 2.

Algorithm 2 XR Bitrate Adaptation Scheme
K is the XR session duration (in inter-frame intervals)
framesInTx is the list of non-delivered frames
GetBitrate is the bitrate adaptation function
GenerateFrame is the frame rendering function
for all k ∈ {1, ..., K} do
    for all frame ∈ framesInTx do
        if IsFrameDelivered(frame) then
            framesInTx.remove(frame)
        end if
    end for
    bitrate = GetBitrate(framesInTx)
    frame = GenerateFrame(bitrate)
    SendFrame(frame)
    framesInTx.push(frame)
end for

To sum up, the considered system can be represented as a queuing system with two queues: 1) the M/G/1 queue with high-priority traffic (absolute-priority queue), and 2) the D/G/1 queue with video frames (low-priority queue). We assume that the service provided to the XR video flow in one inter-frame interval of duration $T$ is independent of the service in the previous intervals. Therefore, the evolution of the virtual server buffer state can be modeled with a discrete-time Markov chain with time unit $T$. The transition probabilities depend on the consumption of channel resources by the high-priority traffic during a single time unit and on the bitrate adaptation function.

B. PROBLEM STATEMENT
In the paper, we address the problem of QoE modeling and optimization for adaptive Cloud XR application traffic in the presence of interfering high-priority traffic in the network. In the considered scenario, the QoE of the Cloud XR applications can be reduced to the QoE of the real-time adaptive video streaming, which is a complex subjective metric. Its models usually take into account the factors from various parts of the video streaming system. They include the video encoder, its parameters (e.g., frame rate, frame structure, bitrate, etc.), the parameters of the network connection (e.g., capacity, delay, packet loss ratio), and the properties of the playback device (e.g., screen size and resolution and frame rate) [32].
In the paper, we focus on such QoS-derived QoE metrics as the average video bitrate and the stalling probability [39]. We choose these metrics because they provide a good tradeoff between the QoE modeling accuracy and the estimation complexity. We develop the model of the considered system and use it to estimate the QoE of the XR video flows for various buffer-state-based bitrate adaptation functions. We select a family of bitrate adaptation functions and use the model to find the optimal function. Namely, we find the function that maximizes the average video bitrate for a particular high-priority traffic rate and subject to a constrained stalling probability. We show that while the state of the network is not stationary and the rate of the high-priority traffic can change with time, we can obtain the optimal adaptation functions for a range of rates. Consequently, the server applies the bitrate adaptation functions according to the perceived high-priority traffic rate. We assume that this rate can be measured by the network and communicated to the server via a cross-layer protocol (e.g., [36], [40]).

III. LITERATURE REVIEW
In the paper, we model the QoE of the XR traffic as the QoE of real-time (i.e., delay-sensitive) adaptive video streaming in the presence of interfering high-priority traffic. We aim to develop an adaptation algorithm for Cloud XR video streaming that will provide an optimal QoE. Therefore, in Section III-A, we review the prior art in video quality adaptation and show why we needed to develop our model. Because the model is based on queueing theory, in Section III-B, we survey the relevant results from this area. We demonstrate that, although it is a well-researched area, to the best of our knowledge, some of the M/G/1 queue characteristics we obtain in this article are novel.

A. VIDEO QUALITY ADAPTATION
Video quality adaptation is a well-researched area, and many papers develop different adaptation schemes in an attempt to address certain problems and improve users' QoE. However, most of the papers study the video quality adaptation problem in the framework of HTTP Adaptive Streaming (HAS) [39]. For example, a well-known algorithm is proposed in [41]. This algorithm is implemented in Dash.js, a reference open-source video player for the MPEG-DASH technology [42]. The algorithm optimizes a utility function of stall frequency and the average bitrate of the video stream. The authors show that this algorithm is optimal for infinite video streams. The paper [43] further improves the performance of the algorithm in the case of live HAS. Another well-known bitrate adaptation algorithm is proposed in [44]. This algorithm serves as a basis for the bitrate adaptation algorithm implemented by Netflix. Similar to the previous one, it uses the video buffer occupancy as the main factor for the bitrate adaptation. A control-theoretic approach to the bitrate adaptation algorithm design is employed in [45]. The authors formulate the bitrate adaptation as a control problem and develop an algorithm aimed at reducing the video buffering while maximizing the video bitrate. With simulations, the authors show that the proposed algorithm outperforms the state-of-the-art ones.
These algorithms were mostly designed for video-on-demand streaming, where a client can pre-buffer a large amount of video to efficiently smooth out the network capacity fluctuations. However, the bitrate adaptation for real-time video streaming is an even more challenging task because the algorithms have a much smaller policy space. So, many recent papers build artificial neural networks to find the optimal bitrate adaptation scheme. In [46], the authors develop an algorithm for joint bitrate and buffer control for low-latency video streaming. In the proposed scheme, the algorithm dynamically adjusts both the bitrate of the downloaded video and the playback rate to reduce the probability of video stalling. With simulations, the authors show that the proposed algorithm provides higher QoE than the state-of-the-art in low-latency streaming scenarios. Another neural-network-based algorithm is proposed in [47]. The algorithm is designed for video streaming for the remote control of unmanned aerial vehicles. The algorithm takes into account the fluctuations of the air-to-ground channel capacity and predicts the channel capacity to stream video appropriately. To further reduce the presentation latencies, the paper [48] proposes splitting the frames into subframes and encoding and sending them independently. This way, the client can receive and start decoding the parts of frames earlier. However, state-of-the-art video codecs do not support such a technique, so it is difficult to implement in practice.
Unfortunately, most of the papers analyze the performance of the algorithms with simulations or by considering the asymptotic cases. So, to develop an understanding of the QoE limits of the adaptive XR video streaming in the presence of high-priority interfering traffic, we address the problem from the queueing theory perspective. Specifically, as stated in Section II, we need to find the probability distribution of the amount of resources consumed by the M/G/1 queue in a fixed time interval.

B. PRIORITY QUEUES AND M/G/1-TYPE SYSTEMS
The queueing systems with multiple priority queues have been well-studied in the literature for a rather long time. In [49], the authors investigate various ways to organize queues with priorities and model the transients for these queues. They consider a combination of two or more queues of type M/G/1. The authors calculate the distribution of the duration of the continuous busy period of the M/G/1 queue and the probability of the queue being busy at an arbitrary time moment. Also, as a straightforward derivation, they obtain the average resource consumption in the M/G/1 queue at the finite time interval. However, the occupation time distribution at the finite interval is not calculated, so the results of [49] are not applicable to solve the problem considered in our paper.
The paper [50] describes a mathematical model for joint service of web and MPEG-DASH video traffic. The system in consideration is a system with two queues where one is a high-priority M/D/1 queue of web-pages. The distribution of service-free time over a fixed length interval is calculated. However, in our case, the packet service time is not constant. Therefore, the distribution obtained in [50] cannot be applied to solve the problem.
In our model, the service provided to the low-priority D/G/1 XR video queue depends on the occupation time probability distribution of the high-priority M/G/1 queue during a short interval. Note that for the infinite observation interval, the server occupation time in an M/G/1 system has been studied in detail in the literature. Existing works provide well-known methods to obtain such important long-term characteristics as the average waiting time [37], average queue length [51], and average busy period duration [52]. However, these characteristics have not been well studied for an interval of finite duration.
Researchers have also considered various characteristics of the system at finite time intervals. In the classic work [53], the M/G/1 system was considered with an additional condition for customers entering the queue. If a customer arrives in the queue when the server is busy, it leaves the queue with a certain probability. In this case, the transient processes of the system were investigated, and service characteristics such as the distribution of virtual waiting time and the average duration of the busy period were obtained. The virtual waiting time is the time required to release the system from servicing requests that have arrived before a particular moment. Transient processes for the classic M/G/1 queue were investigated in the paper [54]. The authors also focused on the virtual waiting times. In this work, the time-dependent server-occupation probability and a virtual waiting period were obtained. However, the results for the virtual waiting period do not apply to the problem addressed in our paper.
The distribution of the busy period of the M/G/1 queue was obtained in paper [55]. The authors consider not only the states of the system when it is busy or free from servicing requests, but also the general case: when there are no more (or vice versa, more) than a certain number of requests in the system. For the time interval tending to infinity, asymptotic distributions of times spent by the system in these states have been obtained. Moreover, at the beginning of the time interval under consideration, only one boundary case is taken into account: the absence of requests. The resulting asymptotic distributions of the modified system [55] cannot be applied to our study because, in the considered system, the time intervals are short, and asymptotic results cannot be used.
Although the probabilistic properties of the M/G/1 queues are well-studied, to the best of our knowledge, no method exists to calculate the required finite interval occupation time distribution for the M/G/1 queue. So, we develop such a method in Section IV-B.

IV. ANALYTICAL MODEL
This section develops an analytical model of the heterogeneous traffic service: real-time adaptive video traffic, namely, XR scene streaming, and control traffic. In particular, in Section IV-A, we discuss the design of the Markov chain modeling the video buffer state evolution. In Section IV-B, we estimate the probability distribution function of the resource consumption by the high-priority traffic. Finally, in Section IV-C, we describe the proposed bitrate adaptation algorithm optimization framework. Table 1 summarizes the main notations used in the model.

A. VIDEO BUFFER STATE EVOLUTION
We define $\mathcal{S}$ as the set of all possible states of the server virtual buffer (hereinafter, the buffer) and describe each state $S \in \mathcal{S}$ with a $(K_0 + 1)$-dimensional vector $S = (j, i_1, \ldots, i_{K_0})$. Here, the first index, $j$, indicates $q_j \in Q = \{q_0, q_1, \ldots, q_{N_q}\}$, where $q_j$ is the discretized fraction of the last frame partially received by the client and $0 = q_0 \le q_1 \le \ldots \le q_{N_q} = 1$. The indices $i_r$ are the indices of the bitrates $b_{i_r} \in \mathcal{B}$ of the video frames in the buffer.
Assuming that the service provided to the video flow in one inter-frame interval of duration T is independent of the service in the previous intervals, we can describe the buffer state evolution with a discrete-time Markov chain with the time unit T . The states of the chain are the states of the buffer at the time moments of the next video frame bitrate choice. The chain is aperiodic and irreducible, which leads to its ergodicity and the existence of stationary distribution π(S).
where R is the rate of data transmission. We derive e(t) in Section IV-B. We fix a certain state S of the considered Markov chain and estimate the buffer level V (S) in the state S as: This amount of transmitted data depends on the time free from serving a high-priority queue during a period T .
The discretization of the share q ∈ Q allows us to estimate the probability of transition from the state S to the state M i as the probability of transmission of an arbitrary number of bytes from the interval including V M i S . Such intervals should not intersect and should cover the whole set [0, V (S)]. Therefore, the boundaries of the intervals [m(i − 1), m(i)] are Thus, we obtain the transition probability from the state S to the state M i : Finally, we obtain the stationary probability distribution π by solving the following system of linear equations: A stall during playback occurs if the buffer contains K 0 video frames. Therefore, to estimate the stall probability P stall , we need to assess the probability that the chain is in the states with i K 0 = 0: The average bitrate B av of the video is estimated as:

B. OCCUPATION TIME DISTRIBUTION OF THE M/G/1 SYSTEM IN A FIXED TIME INTERVAL
System occupancy can be described with an ON/OFF process generated by the alternating busy and empty periods of the M/G/1 queue. Without loss of generality, we may consider that the interval $T$ starts at time 0. We consider the following stages of the system evolution during the interval:
• The Starting Stage: The starting stage is a part of the busy period of the system that starts before time 0 and spans over the beginning of the time interval.
• The Main Stage: The main stage is an ON/OFF process with the boundary condition: the stage starts with an OFF-period.
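The ON/OFF view of the queue also lends itself to a simple Monte Carlo cross-check of the occupation time in windows of length $T$. The sketch below is only an illustration under assumptions that are not taken from the source: service times are exponential (the model itself allows a general distribution), the values of `lam`, `mean_service`, and `T` are example numbers, and the function names are hypothetical. The simulation starts from an empty system, so the first windows carry a small transient bias.

```python
import random


def busy_periods(lam, mean_service, horizon, rng):
    """Return the (start, end) busy periods that begin before `horizon`."""
    periods = []
    t = rng.expovariate(lam)                      # first arrival
    while t < horizon:
        start = t
        end = start + rng.expovariate(1.0 / mean_service)
        nxt = start + rng.expovariate(lam)        # next arrival
        while nxt < end:                          # arrival extends the ON-period
            end += rng.expovariate(1.0 / mean_service)
            nxt += rng.expovariate(lam)
        periods.append((start, end))
        t = nxt                                   # next ON-period starts here
    return periods


def occupation_times(periods, T, horizon):
    """Busy time accumulated in each window [k*T, (k+1)*T) up to `horizon`."""
    n = int(horizon // T)
    occ = [0.0] * n
    for start, end in periods:
        k = int(start // T)
        while k < n and k * T < end:
            overlap = min(end, (k + 1) * T) - max(start, k * T)
            if overlap > 0.0:
                occ[k] += overlap
            k += 1
    return occ


rng = random.Random(1)                   # fixed seed for reproducibility
lam, mean_service, T = 0.2, 1.0, 15.0    # per ms, ms, ms (example values)
horizon = 150_000.0                      # ms
occ = occupation_times(busy_periods(lam, mean_service, horizon, rng), T, horizon)
busy_fraction = sum(occ) / (len(occ) * T)               # should be near rho
idle_share = sum(1 for x in occ if x == 0.0) / len(occ)  # fully idle windows
```

A histogram of `occ` gives an empirical counterpart of the occupation time distribution over an interval of length $T$, and the long-run busy fraction should approach the load $\rho = \lambda \cdot$ `mean_service`.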

1) DISTRIBUTION OF ON- AND OFF-PERIOD DURATION
where F * (s) is the Laplace-Stieltjes transform of the service time c.d.f. F(t).
Let t OFF (t) = δ(t) and δ(t) is the Dirac delta function.
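For orientation, the standard textbook results for the ON- and OFF-periods of an M/G/1 queue are collected below. These are well-known facts rather than a reproduction of the paper's own expressions, and the symbols $f_{OFF}$, $T^*_{ON}$, and $X$ are notation assumed for this note. Here $X$ is the service time with c.d.f. $F(t)$, and $\rho = \lambda E[X] < 1$.

```latex
% OFF-period: the idle time is exponential with rate \lambda.
% ON-period: the busy-period LST satisfies the Kendall-Takacs functional equation.
\begin{align}
  f_{OFF}(t) &= \lambda e^{-\lambda t}, \qquad t \ge 0, \\
  T^*_{ON}(s) &= F^*\bigl(s + \lambda - \lambda\, T^*_{ON}(s)\bigr), \\
  \langle T_{ON} \rangle &= \frac{E[X]}{1 - \rho}, \qquad
  \langle T_{OFF} \rangle = \frac{1}{\lambda}.
\end{align}
```

The functional equation generally has no closed-form solution for arbitrary $F^*$, so in practice the ON-period distribution is obtained numerically.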

2) OCCUPATION TIME AT THE MAIN STAGE
The main stage starts with an OFF-period and has a duration $T' \le T$. We calculate the distribution of the occupation time of such a stage.
The probabilities that $k-1$ OFF(ON)-periods have a total duration less than $T'$ and $k$ OFF(ON)-periods have a total duration longer than $T'$ are evaluated as: For $k = 1$, the expressions take the following forms: Then, $\forall k \in \mathbb{N}$, we can calculate $p^{idle}_k$, the probability that $k$ ON-periods have a total duration of $x$, $k$ OFF-periods have a total duration of less than $T' - x$, and $k + 1$ OFF-periods have a total duration of more than $T' - x$. In other words, exactly time $x$ of the main stage is occupied and the system is idle at the end of the main stage: Similarly, we calculate $p^{busy}_k$, the probability that $k$ OFF-periods have a total duration of $T' - x$, $(k - 1)$ ON-periods have a total duration of less than $x$, and $k$ ON-periods have a total duration of more than $x$. In other words, exactly time $x$ is occupied and the system is busy at the end of the main stage: Finally, we obtain the occupation time p.d.f. for the main stage of duration $T'$. We denote it as $p_{T'}(x)$:

3) DISTRIBUTION OF THE STARTING STAGE DURATION
The starting stage duration is the remaining duration of an ON-period of an ON/OFF process from an equiprobably chosen starting point $t_0 = 0$. In [57], we prove that this remaining duration has the following p.d.f.: $(1 - F_{ON}(t)) / \langle T_{ON} \rangle$, where $F_{ON}(t)$ is the c.d.f. of the ON-period duration and $\langle T_{ON} \rangle$ is its expected value. The starting stage can be absent, so let us denote the probability of its presence as $P_{border}$. For the considered ON/OFF process, this probability equals the probability that an arbitrary random point on the time axis belongs to the ON-period. So, it can be estimated as the share of time when the system is occupied: $P_{border} = \langle T_{ON} \rangle / (\langle T_{ON} \rangle + \langle T_{OFF} \rangle)$, where $\langle T_{OFF} \rangle = 1/\lambda$ is the average duration of the OFF-period.
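As a quick consistency check, the share of time in which the system is occupied can be evaluated in closed form. The derivation below assumes the standard M/G/1 mean busy period $\langle T_{ON} \rangle = E[X] / (1 - \rho)$, where $X$ is the service time and $\rho = \lambda E[X]$. This is a textbook result, and the symbols $X$ and $\rho$ are notation assumed for this note rather than taken from the source.

```latex
\begin{align}
  P_{border}
  = \frac{\langle T_{ON} \rangle}{\langle T_{ON} \rangle + \langle T_{OFF} \rangle}
  = \frac{E[X]/(1-\rho)}{E[X]/(1-\rho) + 1/\lambda}
  = \frac{\lambda E[X]}{\lambda E[X] + 1 - \rho}
  = \rho .
\end{align}
```

For instance, with a mean service time of 1 ms and $\lambda = 0.3$ ms$^{-1}$, the starting stage would be present with probability 0.3.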
Finally, we evaluate $P_{start>T}$, the probability that a starting stage lasts for the entire time interval $T$:

4) SYSTEM OCCUPATION TIME DISTRIBUTION
We estimate $e(t)$, the probability that the system is occupied for $t$, $0 \le t \le T$, by considering the contributions of the following system evolution cases:
1) Only the starting stage is present:
2) Both starting and main stages are present:
3) Only the main stage is present:
By summing the above, we obtain:
Using $e(t)$, the p.d.f. $h(x)$ of the event that inside an interval $T$ there is sufficient free time for video transmission of $x$ bytes can be found with (1). This function is used to obtain the transition probabilities $p(S, M_i)$ with (3) and, subsequently, the stationary distribution of the Markov chain $\pi(S)$. Finally, it allows us to estimate the stall probability (4) and the average bitrate (5).
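The last steps of this chain of computations, from transition probabilities to the stationary distribution and then to the two QoE metrics, can be illustrated on a toy example. The sketch below makes several assumptions that are not from the source: the two-state transition matrix and the per-state bitrates are invented numbers, power iteration is used in place of solving the linear system directly, and the average bitrate is taken as the stationary-weighted mean of a bitrate attached to each state.

```python
def stationary_distribution(P, tol=1e-12, max_iter=100_000):
    """Stationary distribution of a row-stochastic matrix by power iteration.

    Converges for an irreducible, aperiodic chain.
    """
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(max_iter):
        new = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(new, pi)) < tol:
            return new
        pi = new
    raise RuntimeError("power iteration did not converge")


def qoe_metrics(pi, stall_states, bitrate_of_state):
    """Stall probability and average bitrate from a stationary distribution."""
    p_stall = sum(pi[s] for s in stall_states)
    b_av = sum(p * bitrate_of_state[s] for s, p in enumerate(pi))
    return p_stall, b_av


# Toy chain (hypothetical): state 0 = normal playback, state 1 = stall.
P = [[0.9, 0.1],
     [0.5, 0.5]]
pi = stationary_distribution(P)
p_stall, b_av = qoe_metrics(pi, stall_states={1}, bitrate_of_state=[72.0, 8.0])
```

For this matrix the stationary distribution can be verified by hand: the balance equation $0.1\,\pi_0 = 0.5\,\pi_1$ together with $\pi_0 + \pi_1 = 1$ gives $\pi = (5/6, 1/6)$. In the actual model the state space is the set of buffer states, so the matrix is much larger and a sparse linear solver would likely be preferable.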

C. OPTIMIZATION OF THE BITRATE ADAPTATION FUNCTION
We use the developed model to optimize the video bitrate adaptation function. The function can take as an input the buffer level or more detailed information on the buffer state: the number of video frames in the buffer, their average bitrate, etc.
Let us consider a function space F of bitrate adaptation functions. The average bitrate and the stalling probability can be defined as functions of the scenario parameters and the bitrate adaptation function: $B_{av} = B_{av}(B, G, \lambda, \ldots)$ and $P_{stall} = P_{stall}(B, G, \lambda, \ldots)$, where $B \in F$. Thus, we can introduce the following optimization problem. For particular scenario parameters, we need to find the bitrate adaptation function that maximizes the average bitrate of the video stream while keeping the video stalling probability within the limit θ:
$$\max_{B \in F} B_{av}(B, G, \lambda, \ldots) \quad \text{subject to} \quad P_{stall}(B, G, \lambda, \ldots) \le \theta.$$
In the paper, we consider a space of piecewise constant bitrate adaptation functions. We define the current buffer level U as the total amount of bytes not yet delivered to the client. Then, we can define the bitrate adaptation function by a set of bitrate levels $\vec{B}$ and a set of buffer level thresholds $\vec{U}$: the function returns the bitrate level assigned to the interval that contains the current buffer level U. The pseudocode of the considered bitrate adaptation function is presented in Algorithm 3.
The output of the optimization is the piecewise constant bitrate adaptation function defined by the set $(\vec{B}^{(opt)}, \vec{U}^{(opt)})$ for which
$$(\vec{B}^{(opt)}, \vec{U}^{(opt)}) = \arg\max_{(\vec{B}, \vec{U})} B_{av} \quad \text{subject to} \quad P_{stall} \le \theta. \tag{10}$$
We carry out the optimization numerically and discuss its results in the next section.
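A piecewise constant bitrate adaptation function of this kind can be sketched as follows. This is an illustration under stated assumptions, not a reproduction of Algorithm 3: the names `make_adaptation_function`, `frames_in_tx` and `buffer_capacity` are hypothetical, the buffer level is normalized by an assumed capacity to obtain a relative level, and level i is assumed to apply on the half-open interval from threshold i up to threshold i+1.

```python
from bisect import bisect_right


def make_adaptation_function(bitrates, thresholds):
    """Build a piecewise constant mapping from buffer level to bitrate.

    bitrates[i] is selected when thresholds[i] <= u < thresholds[i + 1]
    (the last level applies to every u >= thresholds[-1]).
    thresholds must be sorted in increasing order and start at 0.
    """
    if len(bitrates) != len(thresholds):
        raise ValueError("one threshold per bitrate level is expected")

    def select(frames_in_tx, buffer_capacity):
        # Buffer level U: total bytes not yet delivered to the client.
        u_bytes = sum(frames_in_tx)
        u = u_bytes / buffer_capacity  # assumed normalization to a relative level
        i = bisect_right(thresholds, u) - 1
        return bitrates[max(i, 0)]

    return select


# Hypothetical example: four levels, non-increasing in the buffer level.
select_bitrate = make_adaptation_function(
    bitrates=[72, 48, 24, 8],            # Mbps
    thresholds=[0.0, 0.3, 0.6, 0.9],     # relative buffer levels
)
empty_buffer_choice = select_bitrate([], buffer_capacity=100)
half_full_choice = select_bitrate([30], buffer_capacity=100)
nearly_full_choice = select_bitrate([50, 45], buffer_capacity=100)
```

Using a sorted threshold list with binary search keeps the selection logarithmic in the number of levels, although for the four levels considered later a linear scan would serve equally well.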

V. NUMERICAL RESULTS
In this section, we use the well-known network simulation platform ns-3 [58] to validate the model and demonstrate the results of the developed method of bitrate adaptation function optimization. In Section V-A, we describe the considered scenario. Then, in Section V-B, we use the model to estimate the network capacity for the XR video. In Section V-C, we present the results of the bitrate adaptation function optimization. Finally, in Section V-D, we present and discuss the results of the model validation and compare the QoE provided by the optimized bitrate adaptation function with that of a state-of-the-art algorithm.
Algorithm 3. Bitrate adaptation function (fragment): for all frame ∈ framesInTx do U += frame.size(); end for; return $B_{i^*}$.

A. SCENARIO
We consider a basic Cloud XR scenario with an XR-user playing an XR-game at home. To provide freedom of movement and, thus, an immersive experience to the user, the headset uses Wi-Fi and connects to a remote Cloud XR server via a Wi-Fi access point. The access point has a wired connection with the Cloud XR server. To minimize the feedback delay, the XR-headset sends the commands to the server using channel access parameters corresponding to high priority access category AC_VO. To avoid interference with the commands, the AP sends video frames to the headset using the low priority access category AC_BK.
We model the high-priority command traffic as a stream of packets or bursts of packets with an exponentially distributed size in bytes with mean µ. We choose the average command size µ = 90 kB, so that, in the considered scenario, the average transmission time of one command is 1 ms. The rate of commands λ varies in the range [0.05, 0.3] ms$^{-1}$. The server generates an XR video stream with a bitrate in the range of 8 to 72 Mbps. This corresponds to the image visual quality ranging from a typical home cinema FullHD to a UHD panoramic video with a high frame rate. The period of video frame generation is T = 15 ms. Other scenario parameters are given in Table 2.
Using both the developed model and simulations, we estimate the QoE of the XR user with the following metrics: the average video bitrate and the stall probability.

B. ESTIMATION OF THE NETWORK CAPACITY FOR CLOUD XR
In this section, we use the model to estimate the network capacity for a non-adaptive XR video stream in the considered scenario. Specifically, for a given high-priority traffic intensity, we find the maximal XR video bitrate for which the stalling probability is less than θ = 0.01. To illustrate the advantages of the developed model, we compare its results with the network capacity estimate $C^{(0)}$ of (11), which represents the average channel capacity available to the video stream; here, C is the average capacity of the channel between the headset and the access point (C = 75 Mbps in the considered scenario).
Figure 2 presents the network capacity for the XR video stream (i.e., the bitrate of the video stream) for each command rate. The results show that the actual capacity of the network for the XR video is up to 50% lower than the estimate obtained with (11). This happens because (11) considers the average values instead of the probability distributions. The difference between the actual capacity and $C^{(0)}$ increases with λ because the variance of the resource consumption by the commands grows with the command rate. This growth also significantly increases the probability that a video frame is not delivered during $K_0$ inter-frame intervals. Consequently, Figure 3 shows that XR video streams with bitrate $C^{(0)}$ have up to eight times higher stalling probability than the considered QoE requirement θ.
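The gap between a mean-based estimate and a distribution-aware one can be illustrated with a small Monte Carlo sketch. It is a strong simplification and not the paper's model: each inter-frame interval is treated independently, commands arrive as a Poisson process and each occupies the channel for an exponentially distributed time with mean 1 ms, a frame must fit into the free time of a single interval, and the mean-based estimate is assumed to have the form C(1 − λ · mean transmission time), which may differ from (11). All function names are hypothetical.

```python
import random


def free_time_samples(lam, T=15.0, mean_tx=1.0, n=20000, seed=1):
    """Sample the channel time (ms) left free for video in one interval T.

    lam is the command rate in 1/ms; each command occupies the channel
    for an Exp(mean_tx) time. Intervals are assumed independent.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        busy = 0.0
        t = rng.expovariate(lam)          # first Poisson arrival
        while t < T:
            busy += rng.expovariate(1.0 / mean_tx)
            t += rng.expovariate(lam)     # next arrival
        samples.append(max(0.0, T - busy))
    return samples


def capacity_estimates(lam, C=75.0, T=15.0, mean_tx=1.0, theta=0.01):
    """Return (mean-based, quantile-based) capacity estimates in Mbps."""
    free = sorted(free_time_samples(lam, T, mean_tx))
    mean_based = C * (1.0 - lam * mean_tx)       # assumed averaged form
    q = free[int(theta * len(free))]             # theta-quantile of free time
    quantile_based = C * q / T                   # fits with prob. about 1 - theta
    return mean_based, quantile_based


mean_based, quantile_based = capacity_estimates(lam=0.2)
```

Under these assumptions, the quantile-based value falls well below the mean-based one, which mirrors the qualitative conclusion drawn from Figure 2; the numbers themselves are not comparable with the paper's results.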

C. BITRATE ADAPTATION FUNCTION OPTIMIZATION
In this section, we use the model to find the bitrate adaptation functions that are optimal for the considered scenario, searching the space of piecewise constant functions with N = 4 bitrate levels. The admissible bitrates and relative buffer levels are chosen from the sets $B_{pool} = [8, 16, 24, \ldots, 72]$ Mbps and $U_{pool} = [0, 0.1, 0.2, \ldots, 1]$, respectively. We set the constraint on the stalling probability θ = 0.01, i.e., the optimal bitrate adaptation algorithm shall maximize the average video bitrate while losing less than 1% of frames.
Taking into account that the bitrate adaptation functions of the relative buffer level should be non-increasing, we search through the function space. For each of the bitrate adaptation functions, we estimate the average video bitrate and stall probability and determine the optimal combination $(\vec{B}^{(opt)}, \vec{U}^{(opt)})$ for each command rate according to (10). The considered optimization scheme requires calculating the model with $N_{eval}$ sets of parameters, where $N_{eval}$ is a combinatorial function of the number of levels N and of the cardinalities | · | of the candidate sets. In turn, a single model calculation requires solving a linear system of $N_{eq} = \left(\sum_{i=1}^{K_0} N^i\right) \cdot |Q| + 1$ equations. Although the resulting optimization problem appears to be rather complex, in practice, we do not have to solve it online, because we can pre-calculate the optimal functions for a range of scenarios.
The function $B^{(opt)}_{\lambda=0.2}$ chooses a rather low bitrate in advance (when U = 0.6) and does not decrease it further. However, if the number of function steps were larger, the optimal function would choose a higher bitrate at U = 0.6 but would decrease it further at a larger U.
To find an optimal bitrate adaptation function for the range of command rates, we aggregate the optimal bitrate adaptation functions for each λ and obtain $B^{(opt)}(U) = B^{(opt)}_{\lambda}(U)$.
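The exhaustive search over piecewise constant functions described in this section can be sketched as follows. The parametrization is an assumption made for illustration: N strictly decreasing bitrate levels are drawn from the bitrate pool, the first threshold is fixed at 0, and the remaining N − 1 thresholds are drawn from the nonzero buffer levels, so the number of candidates here need not match the paper's count. The `toy_evaluate` function is a hypothetical stand-in for the analytical model and has no physical meaning.

```python
from itertools import combinations

B_POOL = [8, 16, 24, 32, 40, 48, 56, 64, 72]       # Mbps
U_POOL = [round(0.1 * k, 1) for k in range(11)]     # relative buffer levels


def candidates(n_levels=4):
    """Yield (bitrates, thresholds) with decreasing bitrates, increasing thresholds."""
    for rates in combinations(sorted(B_POOL, reverse=True), n_levels):
        for inner in combinations(U_POOL[1:], n_levels - 1):
            yield rates, (0.0,) + inner


def optimize(evaluate, theta=0.01, n_levels=4):
    """Return (avg_bitrate, bitrates, thresholds) of the best feasible candidate."""
    best = None
    for rates, thresholds in candidates(n_levels):
        avg_bitrate, p_stall = evaluate(rates, thresholds)
        if p_stall <= theta and (best is None or avg_bitrate > best[0]):
            best = (avg_bitrate, rates, thresholds)
    return best


def toy_evaluate(rates, thresholds):
    """Hypothetical placeholder for the model: NOT the paper's Markov chain."""
    widths = [b - a for a, b in zip(thresholds, thresholds[1:] + (1.0,))]
    avg = sum(r * w for r, w in zip(rates, widths))
    return avg, 0.0002 * avg    # arbitrary monotone stall "probability"


n_candidates = sum(1 for _ in candidates(4))
best = optimize(toy_evaluate)

# Aggregation over command rates: one pre-computed function per lambda.
# The same toy evaluator is reused for each rate purely as a placeholder.
optimal_by_rate = {lam: optimize(toy_evaluate) for lam in (0.1, 0.2, 0.3)}
```

Because the search runs offline, the resulting table indexed by λ can simply be looked up at run time, which is the pre-calculation argument made above.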

D. COMPARISON OF VARIOUS BITRATE ADAPTATION ALGORITHMS
We consider the following adaptation algorithms:
1) A_OPT: the adaptation algorithm described in Section IV-C with the bitrate adaptation function $B^{(opt)}(U)$.
2) A3: the adaptation algorithm described in Section IV-C with the bitrate adaptation function $B^{(opt)}_{\lambda=0.3}(U)$. We consider this bitrate adaptation function because it shall provide satisfactory stall probability in the considered range of command rates, but it requires finding a much smaller number of optimal parameters in comparison to the A_OPT algorithm.
3) BOLA: the algorithm developed in [43] and adapted for the case of real-time adaptive streaming. The parameters of the algorithm are set according to its reference implementation in [42]. Unlike the other considered algorithms, we do not constrain the set of bitrate levels for BOLA, so it can choose from the whole set $B_{pool}$.
We compare the analytical and simulation results for algorithms A3 and A_OPT. The BOLA algorithm in live scenarios requires measuring the network throughput, so we estimate its performance only with simulations. Figures 5 and 6 present the target XR QoE metrics: the average bitrate and stall probability.
Let us start with the accuracy of the developed model. The figures show that at low rates of high-priority traffic (up to λ = 0.2), the developed model accurately describes QoE for the XR video stream. However, as the rate increases, the model starts underestimating the probability of stalling and, accordingly, overestimating the average bitrate of the video stream. This happens because the Markov property of the system no longer holds, i.e., the evolution of the video buffer is no longer independent of the consumption of resources by high-priority traffic in the previous periods of video frame generation.
Let us consider the probability $P_{frame}(T)$ that the delivery of a command takes longer than the video inter-frame interval T, so that the command is not reflected in the next generated frame. For the considered M/M/1 high-priority traffic model, this probability is calculated in [60]. It appears that at rates λ > 0.23, for the considered system $P_{frame}(T) \ge 10^{-5}$, which is the typical Packet Loss Ratio (PLR) requirement for the URLLC traffic [61]. So, the system shall not be used in such a regime.
Now let us analyze the QoE provided by different bitrate adaptation algorithms. We can see that the algorithm A_OPT significantly increases the video bitrate at low rates of the high-priority traffic and, at the same time, fulfills the required stalling probability constraint. The provided results demonstrate the importance of optimizing the selection of bitrate for each high-priority traffic rate. While A3 fulfills the constraints on the probability of stalling at all considered rates, at low rates, it provides a 30% lower average bitrate than the optimal algorithm. As for BOLA, at low rates of high-priority traffic, the algorithm acts too conservatively and provides a lower average bitrate than the optimal algorithm. At the same time, at high rates, it cannot adapt to the fast changes of the network state and provides too high a stall probability and up to 2 times lower average bitrate. So, we can conclude that BOLA cannot provide satisfactory QoE for Cloud XR streaming.
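The threshold λ ≈ 0.23 quoted for $P_{frame}(T)$ can be checked with a short calculation, under the assumption that $P_{frame}(T)$ is approximated by the tail of the sojourn time in a first-come-first-served M/M/1 queue, $P(W > T) = e^{-(\mu_s - \lambda)T}$. The service rate $\mu_s = 1$ ms$^{-1}$ is taken from the 1 ms mean command transmission time of the scenario; the exact expression used in [60] is not reproduced here and may differ. The function names are hypothetical.

```python
import math


def mm1_sojourn_tail(lam, T=15.0, service_rate=1.0):
    """P(sojourn time > T) for a stable FCFS M/M/1 queue (rates in 1/ms, T in ms)."""
    if not 0.0 < lam < service_rate:
        raise ValueError("the queue must be stable: 0 < lam < service_rate")
    return math.exp(-(service_rate - lam) * T)


def critical_rate(target=1e-5, T=15.0, service_rate=1.0):
    """Arrival rate at which the sojourn-time tail equals the target probability."""
    return service_rate + math.log(target) / T


lam_critical = critical_rate()
tail_at_low_rate = mm1_sojourn_tail(0.2)
tail_at_high_rate = mm1_sojourn_tail(0.3)
```

Under this approximation, the critical rate comes out slightly above 0.23 ms$^{-1}$, which agrees with the threshold stated above.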

VI. CONCLUSION
In Cloud XR technology, rendering of the virtual scenes is performed at the remote server instead of the headsets. The server encodes the scenes into a real-time video stream and sends it to the headset. To provide higher QoE to the end-users, Cloud XR applications need to adapt to the changes in the network conditions and dynamically adjust the bitrate of the generated video stream. However, tight delay and high-reliability requirements significantly squeeze the room for bitrate adaptation optimization.
In the paper, we first studied analytically the problem of adaptive Cloud XR streaming in the presence of other types of high-priority real-time traffic, including the control traffic generated by the Cloud XR application itself. We designed a novel mathematical model of a real-time adaptive Cloud XR application. The model enabled us to estimate such QoE metrics for the Cloud XR video stream as the average bitrate and stall probability for a wide class of high-priority traffic: a Poisson flow of packets with a general size distribution. With the model, we estimated the capacity of a communication network for the Cloud XR video stream and found an optimal Cloud XR video bitrate adaptation function that maximizes the capacity. We considered the capacity as the maximal average XR video bitrate for which the stalling probability is below the pre-defined threshold given the load imposed on the network by the high-priority traffic. Note that because of the considered strict-priority service policy, the high-priority traffic was not affected by the XR video flow. With simulations, we demonstrated the accuracy of the model in estimating the target QoE metrics in the relevant range of scenarios. Finally, the simulations showed that, in the considered scenario, the optimal bitrate adaptation function provides up to 2 times higher average bitrate than a state-of-the-art algorithm while keeping the stalling probability below the required constraint.
We see multiple possible extensions and applications of the developed model. First, the model can be generalized to a more realistic and complicated high-priority traffic pattern. Second, the model can be extended to take into account the peculiarities of the video encoding: different types of frames in the video stream, their sizes, and their impact on the QoE. Third, the model can be extended to take into account the interference from the other video flows. Finally, a more advanced optimization technique can be applied to reduce the computational complexity of finding the optimal bitrate adaptation function for the particular network state. With such a technique, we can perform optimization with a greater granularity and provide higher QoE to the end-users.
MIKHAIL LIUBOGOSHCHEV (Member, IEEE) received the B.S. and M.S. degrees in applied mathematics and physics from the Moscow Institute of Physics and Technology, Moscow, Russia, in 2017 and 2019, respectively, where he is currently pursuing the Ph.D. degree in telecommunications. He has been a Researcher with the Network Protocols Research Laboratory since 2016 and with the Wireless Networks Laboratory since 2018, both at the Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences. His research interests include 5G and beyond wireless systems, QoS-aware cross-layer optimization, and stochastic network modeling and optimization. He participates in national and international projects and does research within the framework of joint research projects with the leading telecommunication companies.
KAMILA RAGIMOVA was born in Dubna, Russia. She received the B.Sc. degree in applied mathematics and physics from the Moscow Institute of Physics and Technology, Moscow, Russia, in 2020. She is currently pursuing the M.Sc. degree with the Higher School of Economics, Moscow. From 2018 to 2020, she was an Intern with the Wireless Networks Laboratory, Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences. Her research interests include the analysis of the video-on-demand streaming services and modeling of adaptive virtual reality applications.
ANDREY LYAKHOV (Member, IEEE) is currently a Full Professor, the Deputy Director, and the Head of the Network Protocols Research Laboratory, Institute for Information Transmission Problems, Russian Academy of Sciences. He has more than 20 years of experience in Wi-Fi networks design and performance evaluation. He has authored three monographs, more than 100 articles cited in Scopus, and has ten patents. His main research interests include design and analysis of wireless network protocols, wireless network performance evaluation methods, and stochastic modeling of wireless networks based on random multiple access. He was a member of technical and program committees of large IT conferences (ICC, MACOM, MobiHoc, Networking, and MASS) and the General Chair of IEEE BlackSeaCom 2019 and WiFlex 2013. He was a recipient of many international and Russian awards. He led many joint research projects with top telecommunication companies and collaborative projects (e.g., FP7 ICT Collaborative Project "Flexible Architecture for Virtualizable wireless future Internet Access (FLAVIA)" from 2010 to 2012).
SIYU TANG received the M.Sc. and Ph.D. degrees in electrical engineering from the Delft University of Technology, The Netherlands, in 2006 and 2010, respectively. Since then, she has been with Bell Labs, Alcatel-Lucent (later merged with Nokia), Antwerp, Belgium, working on novel algorithms and network protocols for ultra-low latency networks. In 2017, she joined the Huawei Munich Research Center, Germany, as a Principal Researcher, working in the field of telecommunications networks (e.g., future Internet architecture, next-generation network protocols) and industrial communication networks (e.g., time sensitive networking and DetNet). Her expertise is to apply queuing theories, stochastic modeling methodologies and control theories in communication networks to improve their performance, stability, and connectivity.
EVGENY KHOROV (Senior Member, IEEE) is currently the Head of the Wireless Networks Laboratory, Institute for Information Transmission Problems, Russian Academy of Sciences. He has led dozens of national and international projects sponsored by academic funds and industry. Being a voting member of IEEE 802.11, he has contributed to the 802.11ax standard as well as to the real-time applications TIG with many proposals. He has authored more than 100 articles. His main research interests include 5G and beyond wireless systems, next-generation Wi-Fi, protocol design, and QoS-aware cross-layer optimization. He was a recipient of the Russian Government Award in Science, several Best Papers Awards, and the Scopus Award Russia 2018. In 2015, 2017, and 2018, Huawei RRC awarded him as the Best Cooperation Project Leader. He gives tutorials and participates in panels at large IEEE events. He chairs TPC of the IEEE GLOBECOM 2018 CA5GS Workshop and IEEE BLACKSEACOM 2019. He was awarded as the Editor of the Year 2020 of Ad Hoc Networks.