A QoS Improving Downlink Scheduling Scheme for Slicing in 5G Radio Access Network (RAN)

The 5G standard is aimed at supporting Quality of Service (QoS)-constrained traffic types, enabling new services to be reliably built into scenarios such as industrial automation and smart cities. The support comes via a strong emphasis on resource virtualization in the form of slices. Due to the strong QoS constraints of each slice, determining how to actually split the radio resources among different slices, while considering simultaneously the priority of slices, network efficiency, and each slice's target QoS, is very challenging. In this paper, we propose a radio resource scheduling scheme, designed on the basis of a strong theoretical analysis, to address the challenges. We formulate a Chance-constrained optimum resource allocation problem, which is then converted into a low complexity deterministic knapsack problem utilizing the concept of effective bandwidth. The performance analysis proves that our proposal is better in efficiency than the existing schemes, under different network conditions and QoS constraints. Results clearly show the effectiveness of our scheme in the considered 5G scenarios.

Manoj Kumar Rana, Tommaso Pecorella, Senior Member, IEEE, Bhaskar Sardar, Rama Rao Thipparaju, Senior Member, IEEE, and Debashis Saha, Senior Member, IEEE

Index Terms: 5G, network slicing, priority scheduling, quality-of-service (QoS), resource management.

I. INTRODUCTION
IN THE last decade, a novel technical evolution known as virtualization has deeply influenced modern cellular systems. Today, a new operator can hardly deploy a full, greenfield nation-wide infrastructure. Instead, it is common practice to create virtual operators (also known as tenants) using the physical infrastructure of one or more telecommunication pipe providers. Virtualization allows greater flexibility in the core network, enabling sharing of the core network resources among different tenants. This step in the virtualization process, fully embraced in the 5G architecture, is called network slicing [1]. In this approach, the network resources are no longer owned by the tenants. Instead, they are seen as a resource pool, administered by a manager super-party, for a set of tenants. The allocation of slices to the tenants is dynamic, and each tenant is characterized by its own target QoS. Depending on the tenant's goal and service model, different network slicing models are possible in the 5G architecture [2]. Because the traffic volume of each tenant is not constant, the amount of resources allocated to each slice must also be dynamically adjusted, taking into consideration simultaneously the network profit and each tenant's target QoS. As a consequence, slicing at the inter-cell interference coordination level or at the packet scheduling level [3] is more suitable in this case. One of the main problems in packet scheduling is its inability to fulfil the required end-to-end QoS constraints, mainly because the delays introduced in the Internet are random and not predictable. This is in contrast with the classical approaches, which consider mainly the user movements [4] or the wireless link variability [5].
Manoj Kumar Rana is with the Department of Computing Technologies, Faculty of Engineering and Technology, SRM Institute of Science and Technology, Kattankulathur 603203, India (e-mail: manoj24.rana@gmail.com).
Bhaskar Sardar is with the Department of Information Technology, Jadavpur University, Kolkata 700032, India (e-mail: bhaskargit@yahoo.co.in).
Rama Rao Thipparaju is with the Department of Electronics and Communication Engineering, Faculty of Engineering and Technology, SRM Institute of Science and Technology, Kattankulathur 603203, India (e-mail: ramaraotr@gmail.com).
Debashis Saha is with the Management Information Systems Group, Indian Institute of Management Calcutta, Kolkata 700104, India (e-mail: ds@iimcal.ac.in).
Digital Object Identifier 10.1109/TVT.2023.3327874
The current state of the art mostly uses a common packet scheduling function for all the slices in a cell, with little slice-specific treatment. A few existing schemes use fixed prioritization to deal with heterogeneous slice types (see Section II). However, the random delay and load variation induced by the Internet result in poor service for the slices [1], [3]. Fixed prioritization will always decrease the performance of an application with higher QoS demands when it runs under a low priority slice (e.g., live video streaming running in a simple video slice). On the other hand, a conventional dedicated resource reservation policy ensures minimum resources for every slice and minimizes the effect of random delay and load variation. However, the 3rd Generation Partnership Project (3GPP) is of the opinion that 5G spectral efficiency can hardly be achieved this way. Flexible resource block configuration may improve spectral efficiency, but it is a very costly technique in terms of power consumption, delay, and traffic overhead in the core.
The scheme proposed in this paper considers multiple slices with heterogeneous target QoS demands. The goal of our proposed scheme is to provide optimum performance for each slice while strictly maintaining QoS constraints in the presence of traffic dynamics. We have realized this goal by formulating a chance-constrained optimum resource allocation problem [6]. Then we have extended the concept of effective bandwidth [7] to convert the problem into a deterministic knapsack problem. Our proposed scheme ensures strict fulfilment of the QoS demand of every slice even in high-load situations, allowing each slice to accommodate a large number of users. With the proposed method, Radio Access Network (RAN) slicing becomes possible without employing any costly static allocation process. In our solution, the amount of resources to be reserved for each slice is dynamic and is calculated to satisfy the objective of optimum resource allocation with slice priority in consideration.
We have studied the performance of our proposed scheme in 5G-LENA [8], a New Radio (NR) network simulator designed as a pluggable module for the network simulator 3 (ns-3) project. We have compared our proposed scheme with the most relevant existing QoS-aware schemes that can handle slices with heterogeneous QoS demands: the Frame level scheduler (FLS) [9], the Intelligent resource scheduling strategy (iRSS) [10], and the Configuration-based assignment and packing (CBAP) algorithm [11]. Results demonstrate that our scheme significantly improves the performance of every slice in terms of goodput and packet loss ratio (PLR) with respect to the existing schemes under different network conditions. Here, PLR represents the ratio of the number of lost packets to the total number of sent packets. A packet may be either damaged due to a bad network condition or expired due to crossing its delay or inter-packet delay threshold; in both cases it is considered lost. We have shown that the priority of a slice can be controlled according to the current load and the QoS demand of the different applications running under the slice. We have also demonstrated that the performance of a high priority slice is not affected even when the load of a low priority slice is increased, ensuring that slice prioritization is always fulfilled.
The main contributions of this paper are summarized as follows.
• Unlike existing works, we have considered both end-to-end and inter-packet delay as QoS parameters. Our scheduler guarantees to schedule a packet before any of these QoS parameters expire, resulting in improved PLR and throughput compared to the existing schemes.
• We have proposed a novel dynamic prioritization scheme to control the service level of slices at each scheduling interval. This will allow the tenant operators to implement fine-grained policies for their designated slices. Also, to avoid starvation, traffic with varying priorities is scheduled in a non-sequential manner. Very few schemes of this kind exist, and they fail to maintain the service requirement for individual slices.
• We have used dissimilar packet sizes, Resource Block (RB) structures, and Next Generation NodeB (gNB) configurations in our problem formulation. Accordingly, we have simulated a 5G heterogeneous network with overlapped macro and micro cells (operating at mm-wave bands). The proposed scheme based on this typical 5G network scenario is very useful for 4G network operators who want to deploy 5G networks in an incremental fashion.
• We have converted a complex NP-hard scheduling problem to a simple, deterministic knapsack problem that can be solved using a linear time complexity-based approach. On the other hand, the machine learning and reinforcement learning-based approaches used in existing literature are avoided due to their slow convergence and lack of support for specific QoS constraints.
The rest of the paper is organized as follows. In Section II, the current state of the art on resource scheduling is presented. Section III illustrates the system model of our proposed scheme. In Section IV, our proposed scheme is designed. Performance analysis by simulation results is reported in Section V, and finally, conclusions are drawn in Section VI.

II. RELATED WORKS
In the RAN slicing model, the virtual RAN infrastructure provider arranges all spectrum resources into a pool of carriers (ranging from 1.4 MHz to 400 MHz). The carriers are distributed among all the RANs on a tenant basis [25], where slicing is adopted at the spectrum planning level. The tenant-basis distribution assigns a distinct set of carriers to every tenant across cells. In this way, tenant-specific functions (e.g., scheduling algorithms) and policies can be applied. A complex resource scheduling scheme is needed to achieve tenant-specific QoS goals, especially as far as traffic prioritization is concerned.
Conventional scheduling schemes [12], working for a single slice, are irrelevant here. These schemes include Blind Equal Throughput (BET), Proportional fair (PF), Exp rule, Log rule, Fair allocation high throughput (FAHT), etc. These approaches are exclusively designed for flows with similar QoS demands. As a consequence, they cannot be directly used as multi-slice schedulers in the 5G network.
In a network slicing scenario, a tenant will request one or more slices, and each slice will carry one or more traffic flows having dissimilar QoS requirements, e.g., VoIP, enhanced mobile broadband (eMBB), massive machine type communication (mMTC), etc. Depending upon the execution process of different priority slices, the existing radio resource scheduling schemes, as shown in Table I, can be categorized as: sequential, semi-sequential and non-sequential.
In the sequential approach, a lower priority slice is scheduled after the higher priority one, using a combination of the above-mentioned basic schemes. The Exponential PF (Ex-PF) and Modified largest weighted delay first (M-LWDF) schemes deal with Real-time (RT) and Non-Real-Time (NRT) flows differently [12]. These schemes schedule the first one by PF and the second one by a modified PF scheme, considering Head-of-line (HOL) delay and PLR as input parameters.
Similarly, the QoS-oriented time and frequency domain packet scheduler (QTFDPS), proposed in [13], uses BET and PF for the NRT and RT flows, respectively. Another approach, Pricing-aware resource scheduling (PARS), proposed in [14], uses two advanced schemes, FLS [9] and M-LWDF [12], to increase the operator's revenue in case of heterogeneous subscription levels (in terms of price) of users within the same slice.
In the above-mentioned schemes, scheduling is only considered between RT and NRT, but in 5G, NRT flows are expected to be low in number. The Two-level Virtual Scheduler (TVS), proposed in [15], and RAN slicing with EDF slice scheduling (RSESS), proposed in [16], consider multiple RT slices, but resource scheduling is still sequential.
Another two novel approaches, the Optimal coverage and rate demand fulfillment (OCRDF) scheme and its extension, proposed in [17] and [18], respectively, target optimal allocation of resources among slices whose data rate demands are heterogeneous. Although the authors take into account the UE's location and channel condition to determine the data rate demand of a slice, the variation of demands at different time intervals due to uneven arrivals of packets, even for the same set of UEs, is not considered. Moreover, the absence of an appropriate strategy to handle QoS parameters, such as delay and inter-packet delay, sometimes results in scheduling precious resources to expired packets; in the worst situations, packets that are going to expire soon may be delayed while those that have sufficient time before expiry are scheduled first. Above all, these two schemes are also sequential in nature, as the lower priority slice is scheduled after the higher priority one.
The sequential approaches are efficient in under-loaded situations, whereas in moderate or overloaded conditions, the flows of low priority slices compete with each other, thereby getting very poor service; so these approaches are not suitable for 5G networks.
The semi-sequential approaches usually work sequentially, except under some specific network conditions, where they work non-sequentially. In [19], a joint RT and NRT sliced packet scheduling and RB allocation (JRNSPSA) scheme uses the Exp rule [12] as the base scheduling strategy. When RT and NRT both enjoy medium or good channel conditions, the scheme works sequentially. However, an NRT slice with a good channel condition is scheduled before an RT slice with a very bad channel condition. In [20], the Packet prediction mechanism (PPM) scheme is proposed, in which, when a user is about to cross its PLR threshold, a packet of another user whose PLR is below its threshold can be delayed. Although it is a sequential approach, the delaying process always gives some room to the low priority slices over the higher ones, and so it acts as a non-sequential approach. Another approach, the delay-aware resource management algorithm (DARMA), proposed in [21], preserves some RBs for RTs before scheduling. During scheduling of the RT slice, if some of the packets are very close to expiration, it takes a fraction of RBs from the NRT slice. The conservation of RBs for NRTs makes this scheme non-sequential. However, a high density of RTs does not leave any RBs for NRTs. PPM may be the best of these, as it executes non-sequentially most of the time and improves the PLR, but it cannot satisfy high throughput requirements and inter-packet delay constraints.
One of the basic non-sequential approaches is FLS, the two-level downlink scheduler for RT multimedia services proposed in [9]. It is based on discrete time linear control theory. It significantly improves the throughput, PLR, fairness, and Quality of Experience (QoE). Though it considers end-to-end delay, it ignores inter-packet delay, and gives no guarantee on the packet loss of higher priority slices while serving lower priority slices. A slice-based non-sequential approach, iRSS, is proposed in [10], where optimum resource utilization is achieved by intelligent prediction of the current resource block allocation from past statistics through collaborative learning. Although the deviation of the prediction from the actual allocation is minimized, heterogeneous QoS is not considered.
A delay-sensitive cell-level approach [11] proposed the CBAP algorithm, which aims to minimize the number of scheduled but not-served QoS flows by configuring the frame with dynamic-sized RBs for multiple QoS flows. A similar non-sequential approach is also proposed in [26], where mini-slot-based resource allocation is used to accommodate upcoming URLLC packets inside a large eMBB frame. The main goal is to maximize the data rate of eMBB users while satisfying URLLC's delay constraint. However, the heterogeneous QoS demands of different eMBB flows are not taken into account. As these schemes are designed for a single cell only and consider similar kinds of QoS flows, they can hardly cope with a heterogeneous 5G network slicing scenario.
A Joint scheduling and beam-forming optimization (JSBO) scheme for Software Defined Network (SDN)-based virtual wireless environments, containing a massive number of IoT devices, is proposed in [22] to minimize power consumption. A non-cellular Delay optimal stochastic scheduling scheme for vehicular networks (DOSSSVN) is proposed in [23], which aims to minimize the delay of vehicular network applications. The authors consider high fluctuation of the vehicle arrival rate and optimize the allocation with respect to delay, but this strategy does not suit a heterogeneous 5G network slicing scenario.
The dynamic slicing-based scheme (DSS), proposed in [24], only keeps the minimum resource reservation constraint, allocating resources non-sequentially among different slices. It uses a modified PF scheme that considers only fairness among slices and channel condition, but not QoS constraints. Due to its non-sequential nature, it can perform better than the other two approaches, but its insensitivity to QoS makes it unsuitable for managing heterogeneous 5G network slices effectively.
From the above discussion, it becomes clear that none of the existing schemes is perfectly suitable for handling heterogeneous 5G network slicing scenarios. To address this gap, we propose a non-sequential, QoS-aware scheduling scheme that ensures effective prioritization of slices. Our scheme comes close to the FLS [9], iRSS [10], and CBAP [11] schemes. But it differs from each of them in terms of QoS parameter (e.g., inter-packet delay), dynamic prioritization of slices, low complexity,

III. SYSTEM MODEL
The system model illustrates the network topology, the traffic model, and the statistical distributions used, followed by the objective function. They are presented in the following, and the notations used there are listed in Table II.

A. Network Topology
In this paper, we have considered multiple network slices in a Network Function Virtualization (NFV)/SDN-enabled integrated mobile network. The mobile network may contain multiple Base Stations (BSs) with different configurations of resource blocks (RBs) in terms of their bandwidth and time span [11]. The spectrum resources of multiple BSs, in terms of RBs, are aggregated into a single resource pool. The data efficiency of a particular RB with respect to a User Equipment (UE) is a function of the received power, intra- and inter-cell interference, Additive White Gaussian Noise (AWGN), and the size of the RB [11]. Let us assume that there exist some Distributed Radio Resource Management (DRRM) modules, integrated into the nearby Data Center (DC), to collect the scheduling and traffic information, such as delay, inter-packet delay, packet arrival rate, and received power, from the RRM module of individual BSs.

B. Traffic Model
We have mainly focused on the downlink traffic. A slice may contain a group of traffic flows requiring different target QoS demands. Let us denote the set of slices as $\mathcal{C} = \{1, \ldots, i, \ldots, C\}$ and by $U^{(i)}$, $i \in \mathcal{C}$, the subset of packets belonging to slice $i$. If $U$ is the set of all packets under the RAN, we can write:

$$U = \bigcup_{i \in \mathcal{C}} U^{(i)} \tag{1}$$

We have characterized the QoS of each traffic flow by end-to-end delay and inter-packet delay parameters. Threshold values for these QoS parameters for every QoS class are defined in [12].
In order not to violate the end-to-end delay and the inter-packet delay constraints, the scheduler must serve its queued packets within an expiration time threshold, denoted as $\Delta$. So, a packet will expire at $(T + \Delta)$, where $T$ is the current time. Our system model is dynamic in the sense that the scheduler is run at every time slot, i.e., it implicitly takes care of the status of active slices, including queued-up packets, packet delays, and so on. Each packet within the subset $U^{(i)}$ may have a different value of $\Delta$ due to previous delays in the network. The value of $\Delta$ is given in (2), where $t^{(q)}$ and $t^{(q)}_{\mathrm{thr}}$ are respectively the computed value and the threshold value of the QoS parameter $q$.
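As a rough illustration of the expiration threshold, the sketch below assumes that $\Delta$ is the smallest remaining slack over a packet's QoS parameters, i.e., the threshold value minus the computed value for each parameter. That reading, the function names `expiration_slack` and `is_expired`, and the millisecond values are assumptions made for illustration only.

```python
def expiration_slack(computed, thresholds):
    """Return the remaining slack (Delta) of a packet.

    `computed` and `thresholds` map each QoS parameter name to its
    computed value t^(q) and threshold value t_thr^(q), in the same unit.
    Assumption: Delta is the smallest slack over all QoS parameters.
    """
    return min(thresholds[q] - computed[q] for q in thresholds)


def is_expired(computed, thresholds):
    # A packet whose slack is zero or negative is treated as expired.
    return expiration_slack(computed, thresholds) <= 0


# Hypothetical values in milliseconds.
thresholds = {"e2e_delay": 100, "inter_packet_delay": 40}
computed = {"e2e_delay": 70, "inter_packet_delay": 25}
slack = expiration_slack(computed, thresholds)
```

With these hypothetical numbers the inter-packet delay is the binding constraint, so it determines the slack.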
The end-to-end delay contains two components: 1) the delay between the remote server and the 5G gateway (User Plane Function (UPF)), denoted as $d_{\mathrm{Server,UPF}}$, and 2) the delay between the UPF and the UE, denoted as $d_{\mathrm{UPF,UE}}$. Consequently, the end-to-end delay is the sum of these two components, as stated in (3).
The current delay-based resource scheduling schemes [9] consider only $d_{\mathrm{UPF,UE}}$ because 3GPP has specified the delay threshold between the UPF and the UE. However, this cannot be considered as the end-to-end delay threshold. One-way delay measurement is very challenging due to the lack of cooperation between the remote server/end node and the 5G network [27]. We have assumed that the UPF can monitor the delay between itself and the remote server by resorting to ping, traceroute, or any other means, e.g., Real Time Protocol features, IPv6 Performance and Diagnostic Metrics (PDM) [28], etc. The delay between the UPF and the UE can be measured by using a probing technique [27] where every fragment of a test packet is time stamped at both the UPF and the UE. Then $d_{\mathrm{UPF,UE}}$ can be computed as:

$$d_{\mathrm{UPF,UE}} = \max_{k} TS_{\mathrm{UE},k} - \max_{k} TS_{\mathrm{UPF},k} \tag{4}$$

where $TS_{\mathrm{UPF/UE},k}$ is the time stamp of the $k$th fragment at the UPF/UE. As the fragments generated at the UPF must be reassembled at the UE to recover the original datagram, independent of the routing paths, the maximum values of the time stamps are used in (4). The computed value of the inter-packet delay in the downlink direction is the difference between the current time and the last time a packet was received by the UE.
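The delay measurements described above can be sketched as follows. The sketch assumes that the UPF-to-UE delay is the latest UE time stamp minus the latest UPF time stamp over all fragments, and that both clocks share a common time base. The function names and the sample time stamps are hypothetical.

```python
def upf_to_ue_delay(ts_upf, ts_ue):
    """Estimate d_UPF,UE from per-fragment time stamps.

    ts_upf[k] and ts_ue[k] are the time stamps of fragment k at the UPF
    and at the UE. The datagram is complete only when its last fragment
    arrives, hence the maxima (assumed reading of the text).
    """
    if not ts_upf or len(ts_upf) != len(ts_ue):
        raise ValueError("need one time stamp per fragment on both sides")
    return max(ts_ue) - max(ts_upf)


def end_to_end_delay(d_server_upf, ts_upf, ts_ue):
    # End-to-end delay = server-to-UPF delay + UPF-to-UE delay.
    return d_server_upf + upf_to_ue_delay(ts_upf, ts_ue)


def inter_packet_delay(now, last_rx_time):
    # Time elapsed since the UE last received a packet.
    return now - last_rx_time


# Hypothetical time stamps in milliseconds for three fragments.
d_radio = upf_to_ue_delay([0, 1, 2], [7, 5, 9])
d_total = end_to_end_delay(20, [0, 1, 2], [7, 5, 9])
```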
To group packets having the same expiration time, we have defined a new subset, denoted as $U^{(i)}_{T+\Delta}$, which contains the packets having the same $\Delta$. Thus, the $i$th packet group can be expressed as in (5), where $\Delta$ is a random variable which follows a certain distribution with values ranging from 0 to the maximum, $M$. The value of $M$ may be very high if the running application can sustain very high delay or inter-packet delay threshold values. However, we assume that the threshold value is restricted within a certain limit for the selected flows of the slices. If $\Delta$ is less than or equal to 0 for a packet, the packet has already expired and it is discarded.
The valid subsets in different time intervals are illustrated in Fig. 1. In the $(T + 1)$th time interval, the packets contained in $U^{(i)}_{T+1}$ have expired, as $\Delta = 0$, and hence the subset is deleted. A new subset, $U^{(i)}_{T+M+1}$, is included, as a total of $M$ subsets are valid in every time interval.
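The subset bookkeeping described above can be sketched with a dictionary keyed by expiration time. This is a minimal illustration, not the authors' implementation; the function names and the packet identifiers are hypothetical.

```python
from collections import defaultdict


def group_by_expiry(packets, now):
    """Group (packet_id, delta) pairs of one slice by expiration time.

    Packets with delta <= 0 have already expired and are discarded.
    The key now + delta plays the role of the subset index T + Delta.
    """
    groups = defaultdict(list)
    for packet_id, delta in packets:
        if delta <= 0:
            continue
        groups[now + delta].append(packet_id)
    return dict(groups)


def advance(groups, new_now):
    """Move to a later time interval and drop the subsets that expired."""
    return {t: ids for t, ids in groups.items() if t > new_now}


packets = [("a", 1), ("b", 3), ("c", 0), ("d", 1)]
groups = group_by_expiry(packets, now=10)
later = advance(groups, new_now=11)
```

Packets that are scheduled would be removed from their subset before `advance` is called, so only unserved packets are lost at expiry.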

C. Statistical Distributions Used
The traffic arrival process is often modeled as a simple Poisson distribution. However, the arrival of traffic such as VoIP (e.g., Skype voice call, Google Talk, QQ chat) and HTTP (e.g., Google searching, Facebook) can be modeled as a modified Poisson distribution [29] or a non-homogeneous Poisson process [30], while the inter-arrival time of video streaming (e.g., YouTube, Netflix, Hotstar live) can be modeled by a Pareto distribution [31]. The mMTC traffic can be modeled using a Semi-Markov Model (SMM) [32].
A modified Poisson distribution is a linear transformation of the Poisson distribution, as defined in [29]. It can be specified by three parameters $(a, b, \lambda)$. If $X$ is a Poisson random variable with expected rate of occurrences $\lambda$, the modified Poisson random variable $Y$ is defined in (6). The Probability Density Function (PDF) of $Y$ is given in (7), and the mean ($\mu$) of $Y$, derived using (7), is given in (8).
If $Z$ is a random variable of inter-arrival time following a Pareto distribution, the probability distribution can be given by:

$$f_Z(z) = \frac{\alpha z_m^{\alpha}}{z^{\alpha + 1}}, \quad z \ge z_m \tag{9}$$

where $z_m$ is the minimum possible value of $Z$, and $\alpha$ is a positive shape parameter. The mean of $Z$ can be given as:

$$E[Z] = \frac{\alpha z_m}{\alpha - 1}, \quad \alpha > 1 \tag{10}$$

The mMTC has different traffic patterns. These patterns can be integrated into a Markov structure with $S$ states [32].
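The two arrival models can be checked numerically with the sketch below. It assumes the linear transformation has the form $Y = aX + b$ with integer $a > 0$ and integer $b$, which is one natural reading of the three parameters $(a, b, \lambda)$; the exact form used in [29] may differ. The Pareto formulas are the standard ones. All function names are hypothetical.

```python
import math


def modified_poisson_pmf(y, a, b, lam):
    """P(Y = y) for Y = a*X + b with X ~ Poisson(lam) (assumed form)."""
    k, rem = divmod(y - b, a)
    if rem != 0 or k < 0:
        return 0.0  # y is not on the lattice {b, a+b, 2a+b, ...}
    return math.exp(-lam) * lam ** k / math.factorial(k)


def modified_poisson_mean(a, b, lam):
    # E[aX + b] = a * E[X] + b, and E[X] = lam for a Poisson variable.
    return a * lam + b


def pareto_pdf(z, z_m, alpha):
    """Pareto density with minimum value z_m and shape alpha."""
    return alpha * z_m ** alpha / z ** (alpha + 1) if z >= z_m else 0.0


def pareto_mean(z_m, alpha):
    # The mean is finite only for alpha > 1.
    return alpha * z_m / (alpha - 1) if alpha > 1 else math.inf


# Numerical check of the mean against the probability mass function.
numeric_mean = sum(y * modified_poisson_pmf(y, 2, 3, 4) for y in range(3, 203))
```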
Let $P$ be the state transition matrix, and $p_{i,j}$ be the transition probability from state $i$ to state $j$. The stationary state probabilities of the embedded Markov chain, $\Pi^{(e)}$, can be obtained by the following eigenvalue problem:

$$\Pi^{(e)} = \Pi^{(e)} P \tag{11}$$

If the mean sojourn time in state $i$ is $L_i$, the actual state probabilities can be computed by using the following formula:

$$\pi_i = \frac{\pi^{(e)}_i L_i}{\sum_{j=1}^{S} \pi^{(e)}_j L_j} \tag{12}$$

We assume that the arrival rate of packets in each state is fixed. If $W_i$ is the arrival rate at state $i$, the mean arrival rate $W$ can be calculated as follows:

$$W = \sum_{i=1}^{S} \pi_i W_i \tag{13}$$
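The semi-Markov computation can be sketched in plain Python. The stationary vector of the embedded chain is found by a damped fixed-point iteration (an implementation choice made here so that periodic chains also converge), then weighted by the mean sojourn times. The two-state example and all names are hypothetical.

```python
def embedded_stationary(P, iters=10000, tol=1e-12):
    """Stationary distribution of the embedded chain, i.e. pi = pi * P."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        step = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        # Averaging with the previous iterate keeps periodic chains stable.
        nxt = [0.5 * (pi[j] + step[j]) for j in range(n)]
        if max(abs(nxt[j] - pi[j]) for j in range(n)) < tol:
            return nxt
        pi = nxt
    return pi


def state_probabilities(pi_e, sojourn):
    """Weight embedded-chain probabilities by mean sojourn times L_i."""
    weights = [p * length for p, length in zip(pi_e, sojourn)]
    total = sum(weights)
    return [w / total for w in weights]


def mean_arrival_rate(pi, rates):
    # Average of the per-state arrival rates W_i under the state probabilities.
    return sum(p * w for p, w in zip(pi, rates))


# Hypothetical two-state device: long idle periods, short active bursts.
pi_e = embedded_stationary([[0.0, 1.0], [1.0, 0.0]])
pi = state_probabilities(pi_e, sojourn=[9.0, 1.0])
w_mean = mean_arrival_rate(pi, rates=[0.0, 50.0])
```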

D. Objective Function
The objective of our proposed scheme is to schedule the packets of multiple slices in a non-sequential manner such that every slice gets optimum performance while strictly maintaining the QoS constraints. We have introduced the following parameters: $H_{ij}$, denoting the data efficiency of the $i$th packet in the $j$th RB; $\delta_i$, denoting the priority of the $i$th packet; $x_{ij}$, a binary decision variable indicating whether the $j$th RB is selected for the $i$th packet or not; $y_i$, a binary decision variable indicating whether the $i$th packet is selected for scheduling or not; and $b_i$, denoting the required data capacity of the $i$th packet. As given in [33], $H_{ij}$ is defined in (14), where $\theta_j$ denotes the size of the $j$th RB, $\eta$ ($0 \le \eta \le 1$) is the attenuation factor accounting for implementation, and the power terms, the last of which is $P^{(\mathrm{awgn})}_{ij}$, represent the received power, the intra-cell interference power, and the AWGN power at the UE having the $i$th packet and located at the BS containing the $j$th RB. The objective function and its constraints are given in (15) to (18). This problem is NP-hard; the proof is given in Appendix A.
In (16), to include a packet in our selection grid, $b_i$ must be satisfied by one or more RBs from the resource pool. Let us assume that the amount of radio resources allocated to satisfy the $i$th packet's capacity constraint, minus its actual requirement, is a random variable, $\psi_i$, and that the total capacity of the resource pool is $R$ units ($R > 1$), taken as a shape parameter. Then the amount of resources allocated to the $i$th packet becomes a random variable, denoted as $Z_i$ and defined in (19). The above optimization problem can now be converted into a chance-constrained problem [6]. According to the definition of the chance-constrained problem [6], it can be approximated as in (20) to (22), where $p$ is the probability that the total resource requirement of the selected packets exceeds the capacity of the resource pool, and $S^*_p$ is the optimal value of the objective function with parameter $p$. This problem gives the solution vector of the selected packets, i.e., $Y = (y_1, \ldots, y_{|U|})$, with a risk factor $p$ that is kept as small as possible.
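To make the chance constraint concrete, the sketch below estimates by Monte Carlo simulation the probability that the total demand of a selected set of packets exceeds the pool capacity $R$. The demand model (a fixed requirement plus a uniform surplus), the numbers, and the function names are hypothetical and serve only to illustrate the idea of a risk factor.

```python
import random


def overflow_probability(selection, sample_demands, capacity,
                         n_samples=5000, seed=0):
    """Monte Carlo estimate of Pr[sum_i Z_i * y_i > R]."""
    rng = random.Random(seed)
    overflows = 0
    for _ in range(n_samples):
        demands = sample_demands(rng)
        total = sum(z for z, y in zip(demands, selection) if y)
        if total > capacity:
            overflows += 1
    return overflows / n_samples


# Hypothetical demand model: requirement b_i plus a uniform surplus psi_i.
REQUIREMENTS = [2.0, 3.0, 1.5]


def sample_demands(rng):
    return [b + rng.uniform(0.0, 1.0) for b in REQUIREMENTS]


risk = overflow_probability([1, 1, 0], sample_demands, capacity=6.0)
```

A selection is acceptable under the chance constraint only if its estimated risk stays below the chosen value of $p$.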
The priority vector $\delta = \{\delta_1, \ldots, \delta_{|U|}\}$ can be modeled as an input to the objective function. Both static and dynamic prioritizations are possible. In the case of static prioritization, a packet $k \in U$ can be scheduled before any packet of the subset $\hat{U} \subset U$ if the inequality in (23) holds. We have used this inequality for all packets $k \in U^{(i)}_{T+1}$, $\forall i \in \mathcal{C}$, i.e., the set of packets which will expire if not scheduled at the current DTI. Dynamic prioritization is used to determine the priority of packets among slices. Instead of satisfying the above inequality, $\delta_k$ is set at a lower value such that a slice will also get some room for its packets to be scheduled before the packets of a higher priority slice. Note that, in this case, $\delta_k$ depends on the objective function itself. Therefore, in the case of dynamic prioritization, $\delta_k$ for the packets of different slices can be obtained through simulation only.

IV. PROPOSED SCHEME
We have developed an efficient algorithm to compute the optimum allocation of available RBs to all the packets in Section IV-A. In Section IV-B, we have discussed the minimum transmission need at the current DTI. The complexity of the proposed scheme is discussed in Section IV-C.

A. The Allocation Algorithm
The main idea is to convert the objective function, defined in (20), (21), and (22), into a deterministic knapsack problem by calculating the effective bandwidth [7] of the random variable $Z_i$, given in (19). Let the effective bandwidth of the random variable $Z$ be denoted as $\Gamma_p(Z)$, with overflow probability $p$. The standard effective bandwidth of $Z$ is defined in (24), following [7].
Proposition 1 considers independent random variables $Z_1, \ldots, Z_n$. Its proof starts from the premise $\sum_i \Gamma_p(Z_i) \le h$ and applies the standard definition of effective bandwidth given in (24). In the inequality proved in Proposition 1, if we set $g = R$ and $h = g - 1$ and use the corresponding effective bandwidth condition, we obtain a bound on the overflow probability $\Pr[\cdot]$. To satisfy the condition $\Gamma_p(Z_i y_i) \le (R - 1)$, we have considered the sample average approximation approach [34]. Let $Z^{(1)}, \ldots, Z^{(N)}$ be $N$ independent Monte Carlo samples of the random variable $Z$. Let us denote the optimal value of the objective function with parameters $\gamma$ and $N$ as $\hat{S}^{(N)}_{\gamma}$, where $\gamma$ is the overflow probability used in the sample average approximation approach [34]. For $\gamma \in [0, 1]$, the problem can be redefined, with its capacity constraint given in (27), where $I[\cdot]$ is the indicator function, such that $I[\cdot] = 1$ if its argument is greater than 0 and $I[\cdot] = 0$ otherwise. Here, we have taken slightly more risk, i.e., $\gamma > p$. However, we have shown that the solution of this problem is no worse than the optimal. If the previous problem is denoted as $Prob^*_p$ and the current problem is denoted as $Prob^{(N)}_{\gamma}$, the following theorem proves that, by taking a risk parameter $\gamma > p$ in our problem, the optimal value, $\hat{S}^{(N)}_{\gamma}$, will be an upper bound to the true optimal, $S^*_p$, with probability approaching 1 exponentially fast as $N$ increases.
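The effective bandwidth step used earlier in this subsection can be illustrated numerically. The sketch assumes one common definition, $\Gamma_p(Z) = \log E[p^{-Z}] / \log(1/p)$, which may or may not match the definition in (24). Under that assumption, Markov's inequality gives $\Pr[\sum_i Z_i \ge g] \le p^{\,g - h}$ whenever $\sum_i \Gamma_p(Z_i) \le h$, so that $g = R$ and $h = R - 1$ yield an overflow probability of at most $p$. The toy random variables and function names are hypothetical.

```python
import itertools
import math


def effective_bandwidth(outcomes, p):
    """Gamma_p(Z) = log E[p^(-Z)] / log(1/p) for a discrete Z (assumed form).

    `outcomes` is a list of (value, probability) pairs.
    """
    moment = sum(prob * p ** (-value) for value, prob in outcomes)
    return math.log(moment) / math.log(1.0 / p)


def overflow_bound(bandwidths, g, p):
    """Markov-type bound p^(g - h), with h the sum of effective bandwidths."""
    return p ** (g - sum(bandwidths))


def exact_overflow(variables, g):
    """Exact Pr[sum Z_i >= g] by enumerating independent discrete variables."""
    total = 0.0
    for combo in itertools.product(*variables):
        if sum(value for value, _ in combo) >= g:
            total += math.prod(prob for _, prob in combo)
    return total


p = 0.01
z = [(0.0, 0.5), (1.0, 0.5)]  # toy demand: 0 or 1 unit, equally likely
variables = [z, z, z]
bandwidths = [effective_bandwidth(v, p) for v in variables]
bound = overflow_bound(bandwidths, g=3.0, p=p)
exact = exact_overflow(variables, g=3.0)
```

By Jensen's inequality, the effective bandwidth under this definition is never smaller than the mean of the variable, which the toy example also reflects.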
Theorem 1: Let $\gamma > p$ and let $Prob^*_p$ have an optimal solution; then the inequality (29) holds.
Proof: The proof is given in [34].
If we take the value of $N$ sufficiently large, such that the right-hand side of the inequality (29) becomes close to 1, the solution of $Prob^{(N)}_{\gamma}$ can be said to be equivalent to the solution of $Prob^*_p$. Given a specified confidence probability of $(1 - \xi)$, we can determine the value of $N$ as in the following proposition.
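The sample-based capacity check can be sketched as follows. The sketch assumes that a selection is accepted when the fraction of Monte Carlo samples whose total demand exceeds the capacity $R$ is at most $\gamma$, using the indicator function described above. The exact constraint in (27) is not reproduced here, and the function name and sample values are hypothetical.

```python
def saa_feasible(selection, samples, capacity, gamma):
    """Check a selection vector y against N demand samples.

    samples[k][i] is the k-th sampled demand of packet i. The indicator
    equals 1 when sum_i Z_i^(k) * y_i - R > 0; the selection is accepted
    when the average of the indicators does not exceed gamma.
    """
    violations = sum(
        1 for z in samples
        if sum(zi for zi, yi in zip(z, selection) if yi) - capacity > 0
    )
    return violations / len(samples) <= gamma


# Four hypothetical samples of the demands of two packets.
samples = [[1, 1], [2, 1], [3, 1], [1, 4]]
ok_loose = saa_feasible([1, 1], samples, capacity=4, gamma=0.25)
ok_tight = saa_feasible([1, 1], samples, capacity=4, gamma=0.2)
```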
Proposition 2: A sufficiently large choice of $N$ ensures $\hat{S}^{(N)}_{\gamma} \ge S^*_p$ with a confidence probability of $(1 - \xi)$, where $\gamma > p$.
Proof: If the confidence probability is $(1 - \xi)$, then, from Theorem 1, the right-hand side of (29) must be at least $(1 - \xi)$. Hence, we can determine the lower bound of $N$, and by increasing it until the sample values satisfy the condition given in (27), we can find an optimal solution for the decision vector. Assuming a considerable value of the confidence probability, $(1 - \xi)$, the above procedure ensures that the condition in (25) will be satisfied, as proved in Theorem 2.
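As a hedged illustration of how a lower bound on $N$ could be computed, the sketch below assumes that the bound in Theorem 1 has the Hoeffding-type form $1 - \exp(-2N(\gamma - p)^2)$, which is commonly stated for sample average approximation of chance constraints. The exact expression of (29) is not reproduced in the text, so both the formula and the function name `min_samples` are assumptions.

```python
import math


def min_samples(gamma, p, xi):
    """Smallest N with 1 - exp(-2 * N * (gamma - p)^2) >= 1 - xi."""
    if not (0.0 <= p < gamma <= 1.0):
        raise ValueError("need 0 <= p < gamma <= 1")
    if not (0.0 < xi < 1.0):
        raise ValueError("need 0 < xi < 1")
    return math.ceil(math.log(1.0 / xi) / (2.0 * (gamma - p) ** 2))


# Hypothetical risk levels: p = 5%, gamma = 10%, confidence 99%.
n_required = min_samples(gamma=0.10, p=0.05, xi=0.01)
```

Under this assumed bound, the required number of samples grows quickly as $\gamma$ approaches $p$, which is the price of taking only slightly more risk than the target.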
Theorem 2: Using the capacity constraint given in (27), we can prove that the condition in (25) is satisfied.
Proof: Applying Proposition 1 in Theorem 1, we have proved that the optimal solution of $Prob^{(N)}_{\gamma}$ is equivalent to the optimal solution obtained from $Prob^*_p$ with a confidence probability of $(1 - \xi)$. So, the effective bandwidths of these two problems are equal with a confidence of $(1 - \xi)$. The remaining steps consider each sample $k \in N$ for $\xi \ll 1$, apply Jensen's inequality, and use the fact that the expected value of $Z^{(k)}_i y_i$ over all $N$ samples is at most $(R - 1)$, as $\gamma \le 1$. Proposition 1 then yields the required bound on the overflow probability.
Our problem can now be converted to a 0/1 knapsack problem. To solve it, we may use a dynamic programming approach. However, due to the pseudo-polynomial time complexity of this kind of approach, we use a modified Linear Programming relaxation-based approach, as given in [35], which has time complexity $O(|U|)$.
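A deterministic 0/1 knapsack of this kind can be approximated with a simple greedy rule derived from the Linear Programming relaxation. The sketch below is not the linear-time algorithm of [35]; it sorts the items, so it runs in $O(|U| \log |U|)$, and it returns an approximate rather than an optimal selection. The interpretation of values as packet priorities, of weights as deterministic resource demands, and of the capacity as the available pool is an assumption.

```python
def knapsack_greedy(values, weights, capacity):
    """Return a 0/1 vector y for max sum(v*y) subject to sum(w*y) <= capacity.

    Items are taken in decreasing value-to-weight order, as in the LP
    relaxation, but never fractionally. Weights must be positive.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i] / weights[i], reverse=True)
    y = [0] * n
    used = 0.0
    for i in order:
        if used + weights[i] <= capacity:
            y[i] = 1
            used += weights[i]
    greedy_value = sum(values[i] for i in range(n) if y[i])
    # Safeguard: a single valuable item may beat the greedy packing.
    fitting = [i for i in range(n) if weights[i] <= capacity]
    if fitting:
        best = max(fitting, key=lambda i: values[i])
        if values[best] > greedy_value:
            y = [0] * n
            y[best] = 1
    return y


selection = knapsack_greedy([6, 5, 4], [3, 3, 4], capacity=6)
fallback = knapsack_greedy([3, 10], [1, 10], capacity=10)
```

The safeguard step compares the greedy packing with the best single item, which is the usual way to keep the greedy rule within a constant factor of the optimum.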

B. Determination of Transmission Need
The transmission need, denoted as $d_T$, is the minimum number of packets to be selected at the current DTI from the set U, such that no packet of this set will be dropped in future DTIs due to QoS constraints. So, we can write, where $d^{(i)}_{T}$ is the transmission need of the ith slice. Let $\phi^{(i)}_{j,k}$, $j < k \le j + M$, be the partial average of packets, i.e., the average number of packets per subset calculated up to the kth subset at the jth DTI for the ith slice. $\phi^{(i)}_{j,k}$ can be expressed as: As the arrival process of packets in different subsets is dynamic with respect to time (DTI), the derivation of $d^{(i)}_{T}$ is complex. So, we divide the derivation into two consecutive steps.

1) Step 1: Here, we assume no arrival of packets in future DTIs. If all the packets of the subset $U^{(i)}$ are scheduled, i.e., $d^{(i)}_{T} = |U^{(i)}|$, our scheme becomes sequential, i.e., a higher priority slice is scheduled fully before a lower priority slice is considered. The purpose of non-sequential scheduling cannot be achieved in this way. As $|U^{(i)}_{T+1}|$ is the number of packets that will expire at the (T + 1)th DTI, they must be scheduled in the Tth DTI. So, the following inequality must hold: $d^{(i)}_{T} \ge |U^{(i)}_{T+1}|$. To achieve the goal of minimum packet selection, we could select the average over all the subsets of the set $U^{(i)}$, i.e., $\phi^{(i)}_{T,T+M}$. However, this may be less than $|U^{(i)}_{T+1}|$, which would lead to packet loss at the current DTI. To avoid such losses, we can select the maximum between $\phi^{(i)}_{T,T+M}$ and $|U^{(i)}_{T+1}|$. Now, let us assume that the partial average up to the (T + 2)th subset is greater than what we have selected in the current DTI. Then the selection at the (T + 1)th DTI will be greater than the selection at the Tth DTI. Thus, if the value of $\phi^{(i)}_{T,T+2}$ is very high, we may not be able to accommodate all the packets due to a shortage of resources, and the selection may keep increasing in future DTIs, e.g., at (T + 2), (T + 3), and so on. In this case, to make our scheme smoother, a bigger allocation can be started from the current DTI.
The above selection becomes invalid because the partial averages, $\phi^{(i)}_{T,k}$, take different values for different k, so a partial average may be greater than the overall average. As a consequence, choosing the maximum partial average is a better option than the previous choice. We need to prove that there is no packet loss in future DTIs under this selection process. The proof considers the dynamic arrival of packets with respect to time.
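The Step 1 selection rule can be sketched in a few lines of Python. The sketch assumes that the partial average is $\phi^{(i)}_{T,k} = \frac{1}{k-T}\sum_{l=T+1}^{k}|U^{(i)}_{l}|$ (a form consistent with (63) in Appendix C), that no new packets arrive, and that the need is rounded up to whole packets; the rounding and the function names are illustrative assumptions. `subset_sizes[0]` plays the role of $|U^{(i)}_{T+1}|$.

```python
import math
from typing import List, Sequence


def transmission_need(subset_sizes: Sequence[int]) -> int:
    """Maximum partial average over the pending subsets, in whole packets."""
    best = 0.0
    total = 0
    for count, size in enumerate(subset_sizes, start=1):
        total += size
        # Partial average up to the current subset (assumed definition).
        best = max(best, total / count)
    return math.ceil(best)


def drops_without_arrivals(subset_sizes: Sequence[int]) -> int:
    """Count expired packets when each DTI serves exactly the transmission need."""
    queue: List[int] = list(subset_sizes)
    dropped = 0
    while queue:
        need = transmission_need(queue)
        # Serve the packets with the earliest expiration first.
        for idx in range(len(queue)):
            served = min(need, queue[idx])
            queue[idx] -= served
            need -= served
        # Whatever is left in the earliest subset expires at the next DTI.
        dropped += queue.pop(0)
    return dropped


if __name__ == "__main__":
    print(transmission_need([2, 8, 2]))       # partial averages: 2, 5, 4
    print(drops_without_arrivals([2, 8, 2]))  # no packet should expire
```

Because the first partial average equals the size of the earliest subset, the maximum partial average can never be smaller than the number of packets that are about to expire, which is the intuition behind the no-loss argument for this step.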
Lemma 1: If the transmission need is the maximum partial average, then for $k \ge T$, where $\beta^{(i)}_{m,n}$ is the number of new packets arrived up to the nth subset at the mth DTI for the ith slice.
denoting the maximum partial average, is always greater than or equal to any partial average evaluated at that DTI. Hence, the following inequality must hold: For l = T, l = T + 1 and l = T + 2, $\phi_{l,k+1}$ can be written as follows.
Deriving up to the kth DTI, we can write, By substituting l with k in (42) and using (46), it can be proved that

2) Step 2: In this step, we consider the dynamic arrival of packets and their distribution in different subsets. It is proved in Lemma 1 that the total number of packets scheduled up to the kth DTI, where $k \ge T$, is always greater than or equal to the total number of packets that must be scheduled within the kth DTI before their expiration. So, the maximum partial average is a suitable parameter, as it fulfils the goal of not dropping any packet in future.
Lemma 2: If the transmission need is the maximum partial average and the mth indexed partial average is selected at the current DTI T (m > T), then $d^{(i)}_{j} \ge d^{(i)}_{l}$, where $T \le l < j \le m$.

Proof: Let us assume that at the lth DTI, the sth indexed partial average is selected, where s > l. Now, at the (l + 1)th DTI, the new selection metric is always greater than or equal to any partial average computed at that DTI. The sth indexed partial average at the (l + 1)th DTI exists if s > (l + 1), and it can be computed as follows: The following inequality must hold.
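As s > (l + 1) and $\beta^{(i)}_{l+1,s} \ge 0$, using (48) and (49), we can write $d^{(i)}_{l+1} \ge d^{(i)}_{l}$, where $(l + 1) \le m$. Similarly, it can also be proved that $d^{(i)}_{l+2} \ge d^{(i)}_{l+1}$ when $(l + 2) \le m$, and so on. As j > l, it is now proved that $d^{(i)}_{j} \ge d^{(i)}_{l}$, where $T \le l < j \le m$.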
If we select the above metric where the mth indexed partial average is maximum at the Tth DTI, it has been proved in Lemma 2 that the function $d^{(i)}_{j}$ becomes a non-decreasing function for $T \le j < m$. The non-decreasing nature of the function occurs because of the future arrival of packets up to the mth subset, i.e., $\beta^{(i)}_{l,m}$. Hence, serving these future packets equally along with the maximum partial average in each DTI can make the scheduling fairer than selecting only the maximum partial average. To determine the average number of packets that will arrive in future, we use a prediction mechanism based on the basic probability distributions given in Section III-C. The mean number of packets that will arrive between the Tth and (k − 1)th DTIs and will be included up to the kth subset is denoted as f(T, k).

Accordingly, $d^{(i)}_{T}$ can be written as follows: In Appendix B, we have determined the value of f(T, k) on the basis of the arrival process of the packets in the ith slice, which is tied to the kind of traffic the slice is carrying. According to (57), (58) and (59), we can rewrite $\frac{f(T,k)}{k-T}$ for voice, video, and MTC traffic, respectively, as in (52). Note that this process can be extended to other traffic types too.
In Proposition 3, provided in Appendix C, we have demonstrated that the transmission need parameter calculated using (51) is a decreasing function with respect to time, under the assumption that the same number of packets is included in each subset in each DTI. The case of dynamic traffic arrival is illustrated by simulation in Section V.

C. Complexity Analysis
In our proposed scheme, we suggest using the algorithm given in [35] to solve the basic 0/1 knapsack problem, whose time complexity is O(|U|). The transmission need computed as in Section IV-B is taken as input to this problem. For the per-slice transmission need computation given in (51), the number of logical operations is M times the number of computations needed to determine the per-slice metric, f(T, k). Any object-oriented design will follow a recursive process to compute it. First, we fetch the size of a subset. Then the number of computations needed to calculate the partial average, ϕ where r is the partial average up to the (k − 1)th subset, s is the size of the kth subset, and v is the parameter provided in (52).
The computation of r and s can be performed when the queue is updated, and can be neglected in the complexity analysis. The above equation contains 11 arithmetic operations and 2 fetching operations. So, the selection metric determination requires 13 × M operations in total, which is O(M). The physical allocation of RBs takes $O(R^{2}N)$ time [12], where R and N are the total number of RBs and the total number of users, respectively. Hence, the total time complexity becomes $O(M + |U| + R^{2}N)$, which is close to that of the basic PF algorithm [12].

V. PERFORMANCE ANALYSIS
We have analyzed the effectiveness of our proposed scheduler in 5G-LENA [8], a New Radio (NR) network simulator designed as a pluggable module to ns-3. In the following, we first describe the simulation scenario and model. Then, we determine the values of the configuration parameters N and δ. Finally, we analyze the performance of our scheme based on the 3GPP recommended 5G RAN slicing scenarios [36].

A. Simulation Scenarios
The simulation scenario and the typical cell-level parameters are shown in Fig. 2. The scenario is compatible with the 3GPP specified 5G Urban macro and Dense urban use cases [36]. Here, 80% of the users are in indoor coverage with a speed of 3 km/h, and the remaining 20% are in outdoor coverage, moving at 100 km/h. A UE can connect with more than one gNB while running multiple flows of different types, e.g., MTC, VoIP, or video streaming. Hence, multi-gNB scheduling becomes an effective way to increase the spectral efficiency of the system. This kind of cell planning is generally seen in the Urban macro and Dense urban scenarios [36] to reduce the capital expenditures of the tenant operators. However, a scenario such as integrated terrestrial and non-terrestrial 5G and beyond networks [37] may not be appropriate for the study of our proposed scheme, since satellite links remain primarily unstable, making it impossible to calculate the data efficiency parameter ($H_{ij}$) correctly at each time interval. The QoS parameter values (such as delay) of cellular and satellite networks also differ significantly, making it impossible to accurately calculate the expiration time threshold (Δ). Other typical 5G system parameters considered are shown in Table III.
We have considered three slices: Slice1 with MTC flows, Slice2 with VoIP flows, and Slice3 with video flows. For video flows, we have adopted the H.265 video codec with a frame rate of 30 fps and a resolution of 3840 × 2160 (4K UHD), requiring 15 Mbps of bandwidth. The MTC flows run the TCP echo application implemented in ns-3. The service period for both VoIP and video flows is constant (120 s), whereas the MTC service period is continuous throughout the entire simulation time. The delay threshold for all kinds of flows is set to 100 ms; the inter-packet delay threshold is set to 40 ms for VoIP and video flows and to 1 ms for MTC. The PLR thresholds for MTC, VoIP and video flows are set to $10^{-2}$, $10^{-2}$ and $10^{-6}$, respectively, whereas the throughput thresholds are set to 20 Kbps, 12.65 Kbps and 15 Mbps, respectively, following constant packet generation at the source. The value of $d_{Server,UPF}$ is considered to be exponentially distributed with a mean of 20 ms, and $d_{UPF,UE}$ is fixed at 5 ms. To simulate the impact of the delay variation of each packet due to diverse routing paths, a delay variation of ±5 ms is also considered. The traffic pattern of each MTC device can be defined by four states: OFF (no packet transmission), Periodic Update, Event-driven and Payload exchange [32]. The values of the MTC traffic parameters, used to derive the mean packet arrival rate (W), are S = 4, P = [(0, 0.5, 1, 1), (0.5, 0, 0, 0), (0.5, 0.5, 0, 0), (0, 0, 0, 0)]^T, L = [1 s, 0.5 s, 0.5 s, 0.5 s] and W = [0, 55/s, 50/s, 0].
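As a rough cross-check of the MTC parameters, the sketch below computes a long-run mean packet arrival rate from S, P, L and W. It assumes the standard time-weighted stationary formula for a semi-Markov chain, which may differ from the expression actually used in Section III-C, and it assumes that the transposed matrix is read row-wise as the state transition matrix. All function names are illustrative.

```python
from typing import List, Sequence


def stationary(P: Sequence[Sequence[float]], iters: int = 200) -> List[float]:
    """Stationary distribution of the embedded chain by power iteration."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi


def mean_arrival_rate(P: Sequence[Sequence[float]],
                      L: Sequence[float],
                      W: Sequence[float]) -> float:
    """Time-weighted mean rate: sum(pi*L*W) / sum(pi*L) (assumed formula)."""
    pi = stationary(P)
    time_share = [p * l for p, l in zip(pi, L)]
    return sum(t * w for t, w in zip(time_share, W)) / sum(time_share)


# Parameters as listed in the text; P is given there in transposed form.
P_T = [(0, 0.5, 1, 1), (0.5, 0, 0, 0), (0.5, 0.5, 0, 0), (0, 0, 0, 0)]
P = [list(row) for row in zip(*P_T)]   # rows now sum to 1
L = [1.0, 0.5, 0.5, 0.5]               # mean sojourn time per state (s)
W = [0.0, 55.0, 50.0, 0.0]             # packet rate per state (1/s)

if __name__ == "__main__":
    print(mean_arrival_rate(P, L, W))
```

Under these assumptions the embedded chain spends no time in the fourth state, and the resulting mean rate is 20 packets/s.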
To measure the QoS performance of each slice, we define a new parameter named service rate. The service rate of a flow is the ratio of the number of default averaging windows (2 s, as recommended by 3GPP) in which the QoS requirement of the flow is met to the total number of default averaging windows in the entire service duration of the flow. The service rate of a slice is the average of the service rates of all the flows under that slice. The QoS performance of a flow is measured as the average goodput of the flow in the averaging window, where goodput = throughput × (1 − PLR). In the PLR calculation, we consider both lost and expired packets in the network.
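The service rate metric can be written down directly from these definitions. The Python sketch below assumes that a window "meets the QoS requirement" when its goodput is at least a required threshold; this criterion and the function names are illustrative assumptions.

```python
from typing import Sequence


def goodput(throughput: float, plr: float) -> float:
    """goodput = throughput x (1 - PLR), with PLR counting lost and expired packets."""
    return throughput * (1.0 - plr)


def flow_service_rate(window_goodputs: Sequence[float],
                      required_goodput: float) -> float:
    """Fraction of averaging windows (2 s each) in which the requirement is met."""
    if not window_goodputs:
        return 0.0
    met = sum(1 for g in window_goodputs if g >= required_goodput)
    return met / len(window_goodputs)


def slice_service_rate(flow_rates: Sequence[float]) -> float:
    """Average of the service rates of all flows in the slice."""
    if not flow_rates:
        return 0.0
    return sum(flow_rates) / len(flow_rates)


if __name__ == "__main__":
    # Hypothetical video flow: per-window goodput in Mbps against a 15 Mbps target.
    rate = flow_service_rate([15.2, 14.0, 15.0, 16.1], 15.0)
    print(rate, slice_service_rate([rate, 1.0]))
```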
In order to eliminate the initial transient states, we have taken all simulation results after 1200 s of simulation run. The FLS The configuration parameters of the iRSS and CBAP schemes are taken from [10] and [11], respectively. The value of M is taken as the maximum of the end-to-end delay and inter-packet delay thresholds in the corresponding simulation.

B. Determination of N and δ
In the objective function given in Section III-D, we need to take samples of the random variable $Z_i$, ∀i, which has a one-to-one dependency with the random variable $\psi_i$. We take samples of $Z_i$, ∀i, by varying $\psi_i$. To simplify the sampling process, we assume the same packet size for a particular flow/application, and $\psi_i$ follows a uniform distribution from 0 to one-fourth of the packet size of that flow. Every sample has |U| elements. Samples are generated in such a way that inequality (27) is satisfied.
We have assumed the configuration parameters of (30) as $\xi = e^{-2}$ and $\gamma = 2p$, which gives us $N \ge p^{-2}$. So, the number of samples actually defines the upper bound of the overflow probability.
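The relation between N, p, γ and ξ can be checked numerically. The sketch below assumes that the bound in (30) has the Hoeffding-type form $N \ge \frac{\ln(1/\xi)}{2(\gamma - p)^{2}}$ that is common in the sample approximation literature; this form is an assumption, although it does reproduce $N \ge p^{-2}$ for $\xi = e^{-2}$ and $\gamma = 2p$. The function name and the value of p in the example are hypothetical.

```python
import math


def min_samples(p: float, gamma: float, xi: float) -> float:
    """Lower bound on the sample count N for confidence (1 - xi), assumed form."""
    if not 0.0 < p < gamma <= 1.0:
        raise ValueError("require 0 < p < gamma <= 1")
    if not 0.0 < xi < 1.0:
        raise ValueError("require 0 < xi < 1")
    return math.log(1.0 / xi) / (2.0 * (gamma - p) ** 2)


if __name__ == "__main__":
    p = 0.01  # hypothetical target overflow probability
    bound = min_samples(p, gamma=2 * p, xi=math.exp(-2))
    print(bound, p ** -2)  # the two values should agree up to rounding
```

With the hypothetical p = 0.01, the bound evaluates to about 10^4 samples; halving p quadruples the required number of samples.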
In Fig. 3, we show the radio resource utilization in terms of spectral efficiency for different numbers of samples. If we increase N, p decreases, resulting in more accuracy in determining the packet size in our objective function, so more packets can be scheduled with the available resources. As a consequence, the spectral efficiency increases with N. To increase the randomness of the Channel Quality Indicator (CQI) reporting by UEs, we vary the traffic density in the small cell area. Despite the diverse values of $Z_i$ in different cases, our algorithm accurately predicts the required amount of resources for every packet. Decreasing the traffic density decreases the number of CQI reports, as the chance of physical distribution over the larger rectangular area becomes higher, and so the overall spectral efficiency also decreases, as shown in Fig. 3. The spectral efficiency remains almost unchanged even if we increase N beyond $10^{4}$, irrespective of the traffic density. Therefore, we choose this value as a reference in the following simulations.
The lower bound of the service rate of a slice can be fixed by tuning δ for its packets. The values of δ for different slices in the case of different service rate requirements are given in Table IV. After setting δ to an appropriate value for all packets, our algorithm ensures that, if we increase the load in the system, the service rate of a slice will not fall below its required level unless the service rates of all lower priority slices become zero. In the following simulations, we set δ as given in Table IV dynamically, according to the slice priorities and their service rate requirements. If the system load increases beyond its capacity, it first affects the service rate of the lowest priority slice, i.e., Slice3. We can prevent the service rate of Slice3 from falling below its required level by setting δ of Slice1 and Slice2 as given in Table IV, such that the service rates of Slice1 and Slice2 still remain above the required level. This dynamic setting of δ during scheduling gives us more flexibility to support the heterogeneous service demands of multiple slices. In an online system, the priority vector δ can also be dynamically adjusted through online control systems to match the required service rate given in Table IV.

C. Comparative Analysis
We have considered Scenario1 of Table IV and compared our proposed scheme with the iRSS [10], CBAP [11] and FLS [9] schemes with respect to the load of the slices. In Figs. 4 and 5, we keep the load of Slice3 and Slice1, respectively, constant while varying the load of the other two slices such that the total radio resource of the system is fully utilized by all the slices. The iRSS scheme performs better than CBAP and FLS in terms of maximum load support. However, iRSS considers neither the QoS of slices nor the static size of RBs while scheduling. It allocates a chunk of radio resources to an entire slice only, without mapping the selected radio resource into physical RB units, which results in a higher resource demand than its predicted value. In contrast, we have considered both of these issues in our system model, and therefore our scheme performs much better than the iRSS scheme. CBAP and FLS are both QoS-aware schemes, but they are not designed to support optimum resource utilization in a multi-gNB scenario. They also consider only delay as the QoS parameter in their algorithms. Whereas FLS considers the end-to-end delay, CBAP considers only the framing delay of the packets; therefore, FLS performs slightly better than the CBAP scheme.
A more realistic scenario from the perspective of a 5G tenant-level operator is also considered. The required QoS of different slice flows is listed in Table V. Here, a UE may subscribe to a particular slice to achieve the mentioned QoS requirements, independent of the number and types of flows (VoIP/video/etc.). The URLLC traffic is generated in a nearby cloud server and, to maintain its end-to-end delay within the 1 ms range, we use the frame-level delay reducing technique, CBAP [11]. In terms of supported load, our scheme outperforms the other three schemes, and the performance is almost the same as that shown in Figs. 4 and 5. As the Internet is highly dynamic, the delay and inter-packet delay also become highly unstable. Hence, it is of interest to measure the performance of a slice in terms of service rate while varying the mean of the exponentially distributed delays. In Figs. 6 and 7, we show the comparative performance of three schemes, namely, the proposed scheme, CBAP and FLS. We do not consider iRSS, as it is QoS-insensitive. As can be seen from Figs. 6 and 7, our proposed scheme outperforms CBAP and FLS. This is because CBAP and FLS do not consider the inter-packet delay constraint. Besides, FLS is a probabilistic approach and provides no guarantee against further packet loss while scheduling the current packets. The CBAP scheme only considers the framing delay of packets, without taking into account the end-to-end delay.
The above analysis shows that, in a medium/high density urban scenario, our scheme outperforms the existing scheduling schemes for 5G RAN slicing. We are able to support a spectral efficiency of approximately 5.6 bits/s/Hz, which is higher than the value recommended in the 3GPP 5G specification. The service rate of every slice can be controlled using δ while the load of different slices varies dynamically with respect to time. The QoS of a higher priority slice never falls below a pre-defined service rate, even when the load of a lower priority slice is increased. Additionally, our scheme can handle packets with high delay and inter-packet delay, where these two delays are highly dynamic due to diverse Internet conditions. In a static condition with low end-to-end delay and inter-packet delay, the proposed scheme performs almost the same as the others, but in a dynamic condition it performs much better, owing to the insensitivity of the other schemes to these delays.

VI. CONCLUSION
Our proposed scheme is suited to scenarios that need to ensure a minimum QoS for every slice, even though traffic arrival is random. In such scenarios, the current literature adopts static resource reservation strategies, but these are costly approaches, as the overall spectral efficiency of the system is reduced. Our non-sequential allocation strategy gives every slice a fair chance to access a minimum amount of resources at each scheduling interval, even though the slices are unevenly loaded. The priority of a slice can also be changed during the scheduling process in order to provide more flexibility in service provisioning. For example, a video slice subscriber might need higher bandwidth while watching a live video than while downloading a video. Hence, the priority of the live video sub-slice can be upgraded, while the priority of the video sub-slice can be downgraded. Similarly, road safety messages may need to be prioritized over other vehicular applications within a Heterogeneous Vehicular Network (HetVNet) slice from time to time. Thus, the flexible prioritization technique gives tenant-level operators more options to use fine-grained policies for their designated slices.
Currently, the new orientation of the flexible resource block configuration technique is becoming inefficient due to high power consumption, delay, and control overhead. In an overlay network environment, our proposed scheme can smartly map a packet to a resource block of appropriate size and increase resource utilization. In this context, in scenarios like massive and critical cellular IoT networks, where most of the traffic flows contain packets of sizes much smaller than a radio resource unit, testing the performance of our proposed scheme can be an interesting future work.
On the other hand, when roadside units (RSUs) take charge of resource allocation in a HetVNet environment, dedicated resources are allocated for vehicle-to-vehicle (V2V) side links, which results in a poor data rate for V2V links. With the cooperation of RSUs, the cellular base station can apply our proposed scheme to allocate resources dynamically to V2V links according to their demands, thereby improving the data rate. To accommodate more radio resources for a single V2V link, reusing the same radio resources for different V2V links under the same cellular region can be a good option. Our proposed scheme does not take this possibility into account, and hence considering the reuse factor in our system model is an interesting future work. During a handoff period, resource scheduling of vehicular traffic can be done by the nearby cellular base station instead of the short-range RSUs. Our proposed scheme can instantly schedule such traffic by prioritizing it over other cellular traffic. Hence, a seamless handoff experience can be realized in a HetVNet environment. Therefore, the proposed resource scheduling scheme can be considered a significant contribution towards fulfilling 5G RAN slicing requirements.
However, some challenges may appear while deploying the proposed scheme in a real-world scenario. One of the key issues is the seamless mobility management requirement for a UE between two SDN controllers while the allocation of radio resources is going on from both of them at different time intervals. This issue may become more complex if the RANs under the SDN controllers use different radio access technologies. Another issue to overcome is that the security and privacy policies of a particular slice must not be breached while allocating radio resources to another slice.

APPENDIX A
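To prove that our problem is NP-hard, we reduce the known NP-hard subset sum problem to our problem. We use proof by contradiction. A special instance, Q, of our problem can be obtained if we suppose that each RB has the same size independent of packet identity, i.e., $H_{ij} = H$, ∀i, j. The capacity constraint of Q can be expressed as $H\sum_{j=1}^{R} x_{ij} \ge b_i$, ∀i. Now, a decision version of Q can be stated as follows: does there exist a solution of Q such that $H\sum_{j=1}^{R} x_{ij} \ge b_i$, ∀i, and $\sum_{i=1}^{|U|} \delta_i y_i \ge D$, where D is an arbitrary profit value? First, we can prove that Q is in NP. The verification process is to compute $H\sum_{j=1}^{R} x_{ij} \ge b_i$, ∀i, and $\sum_{i=1}^{|U|} \delta_i y_i \ge D$, which takes polynomial time in proportion to the size of the input. Second, we can show that there is a polynomial time reduction from the subset sum problem to Q. The subset sum problem can be given as $c_1, c_2, \ldots, c_{|U|}$ and V, where we need to find $y_i$, ∀i, such that $\sum_{i=1}^{|U|} c_i y_i = V$. If there exists an efficient algorithm to solve Q, the total capacity constraint, i.e., $\sum_{i=1}^{|U|} b_i y_i \le RH$, must be met. Now let us assume the following process of converting the subset sum problem to Q in polynomial time: $\delta_i = c_i$, ∀i, $b_i = c_i$, ∀i, D = V and H = V/R. The solution of Q must then satisfy the inequalities $\sum_{i=1}^{|U|} c_i y_i \le V$ and $\sum_{i=1}^{|U|} c_i y_i \ge V$, which implies that $\sum_{i=1}^{|U|} c_i y_i = V$. Therefore, $y_i$, ∀i, is the desired solution of the subset sum problem. This establishes the NP-completeness of our problem.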

APPENDIX B
Here, our goal is to determine the value of f(T, k). The distributions given in Section III-C are used to compute f(T, k).
Moreover, the end-to-end delay and inter-packet delay vary depending upon the routing path, hop distance, packet generation rate, queuing or processing delay, etc. In the 5G network, these delays are evenly distributed within a certain range, and therefore they can be modeled as a uniform distribution [38]. Following (2), it can be stated that Δ has a uniform distribution with a discrete probability distribution function:

Traffic arrival from a single voice application follows a modified Poisson distribution [29]. As the sum of independent modified Poisson-distributed random variables also follows a modified Poisson distribution, the mean of the random variable for multiple voice applications, $\hat{Y}\ (= \sum_{i=1}^{n} y_i)$, is nμ, where μ is the mean of the random variable $y_i$. The value of μ can be obtained from (8).
At the Tth DTI, a packet will be included between the (T + 1)th and kth subsets if Δ is less than or equal to (k − T). Using (54), the probability distribution of the number of packets included up to the kth subset in the Tth DTI can be derived as follows: Pr{T, k} =

The inter-arrival time of video packets follows a Pareto distribution [39]. As the Pareto process does not hold the summation and multiplication properties of the modified Poisson process, it would be very difficult to determine the joint probability distributions. So, first we calculate the mean number of packets that will be included up to the kth subset at the Tth DTI as $\frac{k-T}{M}\cdot\frac{1}{\mu}$. Following up to the (k − 1)th DTI, f(T, k) can be calculated as follows, where the Pareto parameters $z_m$ and α can be estimated directly from the data samples (see [40] for example).
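One common way to estimate the Pareto parameters from inter-arrival samples is maximum likelihood, sketched below in Python. This is a generic estimator and not necessarily the procedure described in [40]; the function names are illustrative.

```python
import math
from typing import Sequence, Tuple


def pareto_mle(samples: Sequence[float]) -> Tuple[float, float]:
    """Maximum likelihood estimates (z_m, alpha) of a Pareto distribution."""
    if not samples or min(samples) <= 0:
        raise ValueError("samples must be non-empty and strictly positive")
    z_m = min(samples)                        # scale: smallest observation
    log_sum = sum(math.log(x / z_m) for x in samples)
    if log_sum == 0:
        raise ValueError("all samples are equal; alpha is undefined")
    alpha = len(samples) / log_sum            # shape parameter
    return z_m, alpha


def pareto_mean(z_m: float, alpha: float) -> float:
    """Mean of a Pareto distribution; infinite when alpha <= 1."""
    if alpha <= 1:
        return math.inf
    return alpha * z_m / (alpha - 1)


if __name__ == "__main__":
    # Hypothetical inter-arrival times in milliseconds.
    z_m, alpha = pareto_mle([2.0, 2.0 * math.e, 2.0 * math.e])
    print(z_m, alpha, pareto_mean(z_m, alpha))
```

The estimated mean inter-arrival time is only finite for α > 1, which is worth checking before the estimates are used in a mean packet count.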
Similarly, f(T, k) can be calculated for the SMM as: So, the mean number of packets that will arrive up to the kth subset between the Tth and (k − 1)th DTIs can be calculated using (57), (58) and (59) for voice, video and MTC applications, respectively.

APPENDIX C
Proposition 3: If the mean number of packets arrives in each DTI and the mean number of packets is included in each subset, then the transmission need ($d^{(i)}_{j}$), calculated using (51), becomes a decreasing function with respect to j.
Proof: Let us assume that the kth and sth indexed subsets give the highest selection metrics according to (51) at the Tth and (T + 1)th DTIs, respectively, where k > T and s > T + 1. We can also denote the mean number of packets included in each subset at each DTI as $m^{(i)}$. According to (52), $m^{(voice)} = n\mu$ and $m^{(video)} = \frac{1}{\mu}$. So, we can write the following equations.
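As the kth indexed subset gives the highest metric at the Tth DTI, it will be greater than or equal to the selection metric calculated for the sth subset at that DTI. The inequality can be written as, Now, $\phi^{(i)}_{T+1,s}$ can be calculated as follows: $\phi^{(i)}_{T+1,s} = \frac{(s-T)\,\phi^{(i)}_{T,s} + (s-T-1)\frac{m^{(i)}}{M} - d^{(i)}_{T}}{s-T-1}$ (63). By putting the value of $\phi^{(i)}_{T+1,s}$ in (61), we can calculate the value of $d^{(i)}_{T+1}$. Applying the inequality (62) and s > (T + 1), the following inequality is proved.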
Similarly, we can also prove the corresponding inequalities up to $d^{(i)}_{T+3}$, and so on. It follows that the transmission need, $d^{(i)}_{j}$, is a decreasing function with respect to j under the above-mentioned assumption.

Manuscript received 16 February 2023; revised 3 July 2023 and 23 September 2023; accepted 12 October 2023. Date of publication 30 October 2023; date of current version 14 March 2024. This work was supported by the European Union through the Italian National Recovery and Resilience Plan (NRRP) of NextGenerationEU, partnership on Telecommunications of the Future (PE0000001, program "RESTART"). The review of this article was coordinated by Dr. Benedetta Picano. (Corresponding author: Manoj Kumar Rana.)

$|U^{(i)}_{T+1}|$. Hence, it would lead to packet loss at the current DTI. To avoid such losses, we can select the maximum between $\phi^{(i)}_{T,T+M}$ and $|U^{(i)}_{T+1}|$. Now, let us assume the following inequality,

$d^{(i)}_{j}$ becomes a non-decreasing function for $T \le j < m$. The non-decreasing nature of the function occurs because of the future arrival of packets up to the mth subset, i.e.,

Fig. 3. Effect of number of samples on spectral efficiency.

Fig. 6. Effect of end-to-end delay on slice performance.
$\sum_{i=1}^{|U|} \delta_i y_i \ge D$, which takes polynomial time in proportion to the size of the input. Second, we can show that there is a polynomial time reduction from the subset sum problem to Q. The subset sum problem can be given as $c_1, c_2, \ldots, c_{|U|}$ and V, where we need to find $y_i$, ∀i, such that $\sum_{i=1}^{|U|} c_i y_i = V$. If there exists an efficient algorithm to solve Q, the total capacity constraint, i.e., $\sum_{i=1}^{|U|} b_i y_i \le RH$, must be met. Now let us assume the following process of converting the subset sum problem to Q in polynomial time: $\delta_i = c_i$, ∀i, $b_i = c_i$, ∀i, D = V and H = V/R. The solution of Q must then satisfy the inequalities $\sum_{i=1}^{|U|} c_i y_i \le V$ and $\sum_{i=1}^{|U|} c_i y_i \ge V$, which implies that $\sum_{i=1}^{|U|} c_i y_i = V$. Therefore, $y_i$, ∀i, is the desired solution of the subset sum problem. This establishes the NP-completeness of our problem.
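The mapping used in the reduction can be exercised on small instances. The Python sketch below builds the instance Q from a subset sum instance (δ_i = c_i, b_i = c_i, D = V, H = V/R) and decides it by brute force, purely to illustrate that Q has a solution exactly when the subset sum instance does. The function names are illustrative, and the exhaustive search is exponential by design.

```python
from fractions import Fraction
from itertools import product
from typing import Dict, Sequence


def subset_sum_to_q(c: Sequence[int], V: int, R: int) -> Dict[str, object]:
    """Build the special instance Q: delta_i = c_i, b_i = c_i, D = V, H = V/R."""
    return {"delta": list(c), "b": list(c), "D": V,
            "H": Fraction(V, R), "R": R}


def q_has_solution(inst: Dict[str, object]) -> bool:
    """Brute-force decision version of Q (small instances only)."""
    b, delta = inst["b"], inst["delta"]
    capacity = inst["R"] * inst["H"]          # total capacity R*H, exact arithmetic
    for y in product((0, 1), repeat=len(b)):
        load = sum(bi * yi for bi, yi in zip(b, y))
        profit = sum(di * yi for di, yi in zip(delta, y))
        if load <= capacity and profit >= inst["D"]:
            return True
    return False


if __name__ == "__main__":
    c = [3, 34, 4, 12, 5, 2]
    print(q_has_solution(subset_sum_to_q(c, 9, R=4)))    # 4 + 5 = 9
    print(q_has_solution(subset_sum_to_q(c, 30, R=4)))   # no subset sums to 30
```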

$\phi^{(i)}_{T+1,s}$ can be calculated as follows: $\phi^{(i)}_{T+1,s} = \frac{(s-T)\,\phi^{(i)}_{T,s} + (s-T-1)\frac{m^{(i)}}{M} - d^{(i)}_{T}}{s-T-1}$ (63). By putting the value of $\phi^{(i)}_{T+1,s}$ in (61), we can calculate the value of $d^{(i)}_{T+1}$. Applying the inequality (62) and s > (T + 1),

TABLE I CLASSIFICATION OF EXISTING RADIO RESOURCE SCHEDULING SCHEMES

constraint of Q can be expressed as $H\sum_{j=1}^{R} x_{ij} \ge b_i$, ∀i. Now, a decision version of Q can be stated as follows: does there exist a solution of Q such that $H\sum_{j=1}^{R} x_{ij} \ge b_i$, ∀i, and $\sum_{i=1}^{|U|} \delta_i y_i \ge D$, where D is an arbitrary profit value? First, we can prove that Q is in NP. The verification process is to compute $H\sum_{j=1}^{R} x_{ij} \ge b_i$, ∀i, and