Performance Assessment of an ITU-T Compliant Machine Learning Enhancement for 5G RAN Network Slicing

Network slicing is a technique introduced by 3GPP to enable multi-tenant operation in 5G systems. However, supporting slicing at the air interface requires not only efficient optimization algorithms operating in real time but also their tight integration into the 5G control plane. In this paper, we first present a priority-based mechanism enabling defined performance isolation among slices competing for resources. Then, to speed up the resource arbitration process, we propose and compare several supervised machine learning (ML) techniques. We show how to embed the proposed approach into the ITU-T standardized ML architecture. The proposed ML enhancement is evaluated under realistic traffic conditions with respect to the performance criteria defined by GSMA, while explicitly accounting for 5G millimeter wave channel conditions. Our results show that ML techniques are able to provide suitable approximations for the resource allocation process, ensuring slice performance isolation, efficient resource use, and fairness. Among the considered algorithms, polynomial regressions show the best results, outperforming the exact solution algorithm by 5-6 orders of magnitude in terms of execution time, and outperforming both neural network and random forest algorithms in terms of accuracy (by 20-40%), sensitivity to workload variations, and training sample size. Finally, ML algorithms are generally prone to service level agreement (SLA) violations under high load and time-varying channel conditions, implying that an SLA enforcement system is needed in ITU-T's 5G ML framework.


INTRODUCTION
THE introduction of the 5G cellular architecture not only drastically enhances the amount of resources at the air interface but also enables flexibility of end-to-end resource control and management [1], [2]. One of its advanced functionalities is network slicing, which provides the tools for efficient resource management, including in the radio access network (RAN) [3].
Following 3GPP [4], a network slice is a logical network that provides specific network capabilities and characteristics. The specification also demands slice isolation, although defined in a broad sense encompassing multiple levels such as security and performance. Providing performance isolation of slices along with efficient use of system resources and fairness of their allocation is a difficult task, since these requirements are largely contradictory [5], [6]. The problem is even more challenging when slicing is extended to the RAN, where dynamic channel conditions need to be accounted for when designing isolation schemes.
To date, a number of algorithms have been proposed for network slicing in RAN with various performance isolation criteria taken into account. As most of those approaches formalize and solve an optimization problem, solution complexity becomes a critical issue for the practical implementation of the proposed algorithms. On top of this, many of the formulated slicing problems do not account for the specifics of wireless propagation, abstracting away the cell capacity. In dynamically changing wireless channel conditions, ensuring both isolation and fairness of resource allocation may lead to an inability to redistribute resources in a timely manner among slices and the flows that belong to different slices. As a result, lightweight approximations of exact solutions are of special importance for practical implementations.
Machine learning (ML) has recently become a viable alternative to conventional optimization and prediction techniques in various applied fields of science. Industry, transportation, healthcare, and many other fields utilize ML to solve a wide range of problems. There are numerous use cases of ML in communication networks [7]: network automation and optimization [8], anomaly detection, network traffic prediction [9], [10], traffic optimization, etc. In the context of 5G systems, ITU-T has recently standardized a framework for ML integration by specifying the corresponding architecture and data handling [11], [12].
This study aims at improving the practical applicability of an existing RAN slicing scheme by enhancing it with a lightweight ML-based approximator capable of providing a solution fast enough to follow the evolution of radio channel conditions. The investigated approach is applicable to other RAN slicing schemes that imply solving one or more optimization problems under the very strict time constraints typical of RAN resource allocation.
In this paper, we assess the use of ML for RAN slicing within the ITU-T architectural framework. We first formalize a model of resource slicing in RAN aimed at fair priority-based isolation of slices. Then, we apply supervised ML techniques to reduce the solution complexity while accounting for the specifics of wireless propagation and a realistic slice content composition. In particular, we analyze the performance of four candidate ML models (linear regression, polynomial regression, random forest regressor, and artificial neural network (ANN)), chosen primarily for their prediction and training speed. We consider their online and offline implementations and discuss how to integrate them into the 5G ML architecture standardized by ITU-T.
The main conclusions of our work are as follows. Regarding accuracy and implementation: (i) the polynomial regressions of degrees 2 and 4 show the best results in the online learning setting, outperforming the other models in terms of execution time, accuracy, generalization capability, and training sample size, which makes them suitable for online implementation with frequently changing traffic distributions across slices; (ii) when trained offline and tested on simulation data, the models show an accuracy level inferior to online training; however, for random forest regressors, this can be improved via larger training datasets and increased model complexity. Regarding sensitivity: (i) ML algorithms are sensitive to the composition of slices and the workload distribution across them, but not to the overall workload; (ii) prediction accuracy does not decrease with the number of slices, which makes the ML enhancement particularly suitable beyond 5-7 slices, where the exact computation becomes slow. Regarding channel and traffic impairments: ML algorithms are prone to service level agreement (SLA) violations under high load and time-varying channel conditions, implying that a real-time SLA control system is needed in the ML pipeline.

The rest of the paper is organized as follows. In Section 2 we discuss the ITU-T standardized framework for ML implementation in 5G networks and briefly review recent work related to ML enhancement of network slicing. Section 3 presents the system model and its components. In Section 4 we describe the adopted slicing scheme and provide an exact solution algorithm. Section 5 introduces the ML enhancement framework and the techniques for speeding up resource arbitration. In Section 6 we present the adopted numerical evaluation scenarios and study the stochastic process representing the cell capacity. Numerical results and their interpretation are provided in Section 7. Conclusions are drawn in the last section.

BACKGROUND AND RELATED WORK
In this section, we discuss the use of ML in 5G systems, including the ITU-T architectural framework for ML integration and applications of ML techniques for network slicing. We refer the reader to [13] for a survey of 5G network slicing enablers, architectures and deployment strategies, and to [5], [14] for recent reviews on RAN resource allocation to slices.

Machine Learning Integration in 5G Cellular
ML is defined in [11] as a process that enables computational systems to understand data and gain knowledge from it without necessarily being explicitly programmed. Fig. 1 shows the high-level architecture for enabling ML in a telecommunication network proposed in ITU-T Y.3172 [11]. Its main components are the ML pipeline, the ML function orchestrator (MLFO), and the ML sandbox.
The ML pipeline is a set of logical nodes with specific functionalities, combined to form an ML application. The source node (SRC) provides input data for the ML pipeline. The collector node (C) is responsible for collecting data from one or more source nodes. All data preprocessing, including data cleaning and aggregation, is performed by the preprocessor node (PP). One of the key nodes, the model node (M), is in charge of executing the chosen ML model. The output of the model can be revised by the policy node (P), which may apply rules to it. Thereafter, the distributor node (D) manages the output and delivers it to one or more sink nodes (SINK). The sink node represents the target of the ML output, where it takes action.
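As an illustration, the node chain above can be sketched as a simple composition of callables. The node names follow Y.3172, but the payloads and per-node functions below are invented placeholders, not a standardized API.

```python
from typing import Callable, List

# Toy sketch of a Y.3172-style pipeline: data flows SRC -> C -> PP -> M -> P
# -> D -> SINK. All payloads and per-node functions here are placeholders.
class PipelineNode:
    def __init__(self, name: str, fn: Callable):
        self.name, self.fn = name, fn

    def __call__(self, data):
        return self.fn(data)

def run_pipeline(nodes: List[PipelineNode], data=None):
    for node in nodes:
        data = node(data)
    return data

pipeline = [
    PipelineNode("SRC",  lambda _: [3.0, 1.0, 2.0]),            # raw slice demand
    PipelineNode("C",    lambda x: x),                          # single source
    PipelineNode("PP",   lambda x: [v / sum(x) for v in x]),    # normalize
    PipelineNode("M",    lambda x: [round(v, 2) for v in x]),   # stub "model"
    PipelineNode("P",    lambda x: [max(v, 0.1) for v in x]),   # policy floor
    PipelineNode("D",    lambda x: x),                          # distributor
    PipelineNode("SINK", lambda x: x),                          # apply output
]

shares = run_pipeline(pipeline)   # -> [0.5, 0.17, 0.33]
```

In a deployment, each node would be mapped by the MLFO onto a network function at an appropriate network level rather than executed in a single process.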
The ML pipeline can be overlaid on an existing network infrastructure with its nodes positioned and chained across several network levels (e.g., the UE, the access network and the core network). The placement of ML functionalities is governed by such factors as the specifications of ML applications, their latency constraints and the availability of data, but also the performance and resource constraints of network functions (NF) and levels. The placement and chaining of the ML pipeline nodes are controlled by the MLFO, a logical node that manages and orchestrates the nodes of ML pipelines. The input for the MLFO is the ML Intent, a declarative description specifying an ML application. Based on this information and network conditions, the MLFO can control, arrange and change the nodes of ML pipelines. To implement an ML application, the MLFO, in coordination with other management and orchestration functions, instantiates the nodes of an ML pipeline with specific roles (e.g., SRC, C, M) and associates them to technology-specific NFs of the underlying network based on their capabilities and the corresponding requirements of the ML application.

Fig. 1. ITU-T high-level architecture for ML integration (adapted from [11]).
The ML sandbox is an isolated domain used for training, testing and evaluating ML pipelines before deploying them in a live network. The ML pipelines hosted here are separate, but the data can come from both simulated and live underlay networks. If any changes occur in the ML Intent, or new specifications are added, the MLFO updates the ML pipeline nodes in the ML sandbox so that they correspond to the modified scenario.

Machine Learning for Network Slicing
Applications of ML for network slicing enhancement can be manifold [15]; however, resource allocation among slices receives the most attention from researchers. Indeed, in the majority of policies proposed so far for network slicing in RAN, the resource shares allocated to slices or slice users are determined as a solution to a linear or non-linear optimization problem [5]. However, considering the numerous constraints and the dimension of the problem, obtaining such a solution fast enough for real-time adaptive resource reallocation can be challenging. A possible way to tackle this issue is by using ML techniques.
The survey [15] discusses the automation of numerous network functions involved in the control and management of slices. The authors provide a list of 5G network slicing scenarios and suggest several ML techniques that can be adopted to enhance various slicing-related tasks. The authors of [16] develop a three-stage hybrid learning algorithm to classify network traffic into three slice categories: eMBB, mMTC or URLLC. The classification is based on data such as user device type, session duration, packet delay budget, etc. The study in [17] investigates how the traffic of one slice affects the traffic of another. The authors develop a data-driven ML-based slicing and allocation model which intelligently assigns and redistributes resources among network slices with respect to certain quality of service (QoS) parameters. Similar problems are addressed in [18], where the authors build an experimental prototype of the 5G network architecture and embed ML solutions to configure radio resources for network slices. Their results show that network throughput increased despite the growing computing resource utilization. The authors of [19] focus on the problem of resource allocation under changing channel characteristics and suggest implementing ML to correctly model the wireless channel.
A number of deep reinforcement learning (DRL) solutions, namely [20], [21], [22], [23], have been proposed for network slicing in RAN as an alternative to explicit optimization-problem-based and algorithmic resource sharing schemes. The authors of [20] investigate the feasibility and efficiency of applying the DRL framework to resource allocation among slices and consider two scenarios: priority-based core network slicing and radio resource slicing. In the latter, a weighted sum of spectral efficiency and QoS in the three slices under study is adopted as the reward function for the decision process. The authors of [21] propose a combined solution including a deep recurrent neural network to predict traffic volume on large timescales and a reinforcement learning algorithm for performing the resource scheduling on small timescales. Here, the reward function aims at minimizing the resource consumption while guaranteeing a certain degree of slice performance isolation, and includes a bonus for slice reconfiguration. In [22] the authors propose a system for dynamic reservation of unused resources in virtualized RAN based on DRL algorithms. They show that by tuning the objective function, which represents a weighted sum of average slice-specific QoS utilities and per-slice resource utilizations, significant improvements in resource utilization can be achieved; however, no guidance is provided on the choice of these functions.
Although DRL is considered a promising approach to resource sharing in the context of network slicing, it is characterized by high training complexity [20] and a certain arbitrariness in the choice of the reward function, resulting in impeded tractability. Moreover, most studies consider workloads with no strong competition for resources among slices and do not investigate the efficiency of the ML model under workload bursts, where it might be compromised. The rationale behind our proposed framework is to combine the tractability of an explicit slicing policy covering all workload ranges with the time efficiency of the simplest supervised ML techniques, which can approximate the solution when its computation by the exact algorithm takes too long.

SYSTEM MODEL
In this section, we introduce our system model and its components. We start with the overall system design and then proceed with the radio part, providing an abstraction of resources at the air interface. Then, we specify the traffic process for each slice and introduce the slice performance isolation policy and our approach to its ML enhancement.

Base Station With ML-Enhanced RAN Slicing
We study the downlink transmission of a 5G base station (BS) providing virtualization of radio access resources and network slicing (see Fig. 2). The uplink direction and the mixed uplink/downlink case can be addressed in a similar manner. The BS may have several radio access technologies (RAT A, RAT B, etc.), whose resources are controlled by the radio resource management (RRM) subsystems. The RRMs are collectively controlled and coordinated by a common resource manager (CoRM), which combines the data rates provided by each RAT into the aggregated time-varying BS capacity, C(t) = c_A(t) + c_B(t) + ..., t ≥ 0, and manages its allocation to user sessions. The slicing manager is a separate entity responsible for the dynamic distribution of the aggregated capacity C(t) among the S instantiated network slices based upon the adopted slicing policy and demand. The resulting capacity allocation to slices, C_1(t), C_2(t), ..., C_S(t), is communicated back to the CoRM and translated into resource allocation constraints at each RAT scheduler. The considered network structure corresponds to a heterogeneous network of a single operator, the infrastructure provider (InP).
The slicing manager's operation is enhanced by ML. The ML pipeline implemented therein receives from the CoRM data characterizing the aggregated capacity and the slices' demand for it. The data is fed to an ML model which computes C_i(t), i = 1, ..., S, and returns them to the CoRM in terms of shares of the total capacity for further coordinated resource allocation among RATs. The CoRM instructs the RAT-specific RRMs to provide the appropriate capacity to UEs. The slicing manager also uses parameters from the SLAs between the InP and the slices' tenants, which are accessible through the SLA management functions in the Operations/Business Support System (OSS/BSS).
An MLFO supervises the ML pipeline's performance and the accuracy of its output. It calls for retraining if the accuracy is insufficient (e.g., due to a substantial change in demand) or the slicing parameters have changed (e.g., a new slice has been instantiated or SLA parameters have been modified). Retraining is performed in an ML sandbox (see Fig. 1), which has access to the data provided by the CoRM and the OSS/BSS.

Radio Specifics
Each considered RAT has an assigned frequency band. As opposed to many previous studies of RAN network slicing, we adopt a joint methodology and combine slice-level resource allocation with an explicit account of wireless channel dynamics. To this aim, we utilize computer simulations to obtain the time-varying RAT capacity. A 3GPP-compliant radio channel modeling procedure accounting for propagation, antenna, user mobility, human-body blockage, and line-of-sight obstruction specifics is detailed in Section 6, while the radio part sub-models are introduced below.

Propagation Model
Throughout the paper we consider the most complex RAT, mmWave, as an example.We represent propagation losses using the Urban-Micro (UMi) Street-Canyon model.
Let I_LoS = 1 if the UE is in line-of-sight (LoS) conditions and I_LoS = 0 under non-line-of-sight (nLoS) conditions (Table 1).
Similarly, let I_nHB = 1 if the UE is not blocked by a human body (nHB) and I_nHB = 0 otherwise (i.e., human-blocked, HB). According to [24], the path loss for the frequency band 0.5-100 GHz can be expressed in dB as

L_dB(d) = b(I_nHB) + 10 a(I_LoS) log10(d) + 20 log10(f_c) + X_SF(I_LoS),   (1)

where d is the three-dimensional (3D) distance in meters between the NR BS and the UE, a(I_LoS) is a coefficient equal to 2.1 under LoS and 3.19 under nLoS conditions, b(I_nHB) is 32.4 dB when the UE is not blocked by a human body and 52.4 dB otherwise, f_c is the carrier frequency in GHz, and X_SF(I_LoS) is the shadow fading in dB, normally distributed with zero mean and standard deviation σ_SF(I_LoS). Note that the value of σ_SF(I_LoS) also depends on I_LoS [24]. By converting the path loss (1) to the linear scale, L(d), we can write the received signal-to-interference-plus-noise ratio (SINR) as

SINR(d) = P_BS G_BS G_UE / (L(d) N_0 M_I F_N L_C),   (2)

where P_BS is the emitted power, G_BS and G_UE are the BS and UE antenna gains, respectively, N_0 is the thermal noise, M_I is the interference margin, F_N is the noise figure, and L_C denotes the cable losses. The cable losses depend on the UE implementation; we assume L_C = 2 dB, see Table 2 [25]. The noise figure, F_N, also relates to the UE implementation and characterizes the amount of noise generated by the device itself in the absence of a signal. We set it to a typical value of 7 dB, although it may vary between devices in the range 2-10 dB [25]. Finally, interference is usually represented as a random variable depending on many factors, including the type of deployment (random or cellular-like), operational frequency, deployment density, the use of resource blocks at neighboring cells, etc. However, interference models developed for mmWave systems have shown that the use of directional antennas greatly reduces interference compared to microwave systems. For this reason, and also to simplify the propagation model, we utilize the interference margin to capture inter-cell interference. Although we set the margin to 3 dB, one can utilize the models in [26], [27], [28] to estimate the mean interference in a specific setup and use it as M_I.
As the timescale of interest in this paper is no less than a few tens or hundreds of transmission time intervals (TTIs), we exclude fast fading phenomena from consideration in (2).
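The link budget above can be sketched in the dB domain as follows. The path-loss form assumes the standard 3GPP-style UMi expression consistent with the stated a(.) and b(.) coefficients; the power and antenna-gain values below are illustrative assumptions, not the paper's parameters.

```python
import math

def path_loss_db(d_3d_m, f_c_ghz, los, human_blocked, shadow_db=0.0):
    # Coefficients as described in the text: a depends on LoS/nLoS,
    # b absorbs the 20 dB human-body blockage attenuation.
    a = 2.1 if los else 3.19
    b = 52.4 if human_blocked else 32.4
    return b + 10.0 * a * math.log10(d_3d_m) + 20.0 * math.log10(f_c_ghz) + shadow_db

def sinr_db(d_3d_m, f_c_ghz, los, human_blocked,
            p_bs=23.0, g_bs=15.0, g_ue=5.0,        # dBm / dBi (assumed values)
            n0=-84.0, m_i=3.0, f_n=7.0, l_c=2.0):  # noise and margins (dB)
    pl = path_loss_db(d_3d_m, f_c_ghz, los, human_blocked)
    # SINR = P_BS + G_BS + G_UE - PL - N_0 - M_I - F_N - L_C (all in dB)
    return p_bs + g_bs + g_ue - pl - n0 - m_i - f_n - l_c
```

Note that the blocked nLoS state costs tens of dB relative to the unblocked LoS state at the same distance, which is what drives the capacity variations studied later.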

Blockage Conditions
Following [24], for a user at a 2D distance d meters from the BS, the LoS probability is

p_LoS(d) = 1 for d ≤ 18, and p_LoS(d) = 18/d + exp(−d/36)(1 − 18/d) for d > 18.   (4)

In our model, the user's LoS/nLoS state, indicated by I_LoS, is chosen randomly according to (4) and remains unchanged for a time period exponentially distributed with mean τ_LoS. The latter is interpreted as the time to cross a building block at a pedestrian speed v.
An attenuation of 20 dB induced by human-body blockage [29] is reflected in the value of b(I_nHB) in (1). Following [30], we assume that human blockers are represented by cylinders with a base radius of r_B and a height of h_B meters. Then the human-body blockage probability is given by [30]

p_HB(d) = 1 − exp(−2 r_B ζ_HB (d (h_B − h_UE)/Δh + r_B)),   (5)

where ζ_HB is the density of blockers per square meter and Δh = h_BS − h_UE, with h_BS and h_UE being the BS and UE heights, respectively. Similarly, we sample the user's HB/nHB state according to (5), and it does not change for a random time interval exponentially distributed with mean τ_HB.
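A minimal sketch of the blockage probability under the cylinder-blocker assumptions of [30]; the numeric defaults for the blocker radius and heights are illustrative assumptions.

```python
import math

# Human-body blockage probability for a UE at 2D distance d from the BS,
# with blocker density zeta_hb per square meter (cylinder model of [30]).
def p_human_blockage(d_2d_m, zeta_hb, r_b=0.3, h_b=1.7, h_bs=10.0, h_ue=1.5):
    dh = h_bs - h_ue
    return 1.0 - math.exp(-2.0 * r_b * zeta_hb * (d_2d_m * (h_b - h_ue) / dh + r_b))
```

As expected, the probability vanishes for zero blocker density and grows with the distance to the BS, since a longer link crosses more of the blocker field.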

Antenna Model
Linear antenna arrays are assumed at both the UEs and the BS. To model the radiation patterns, similarly to [31], we utilize cone models with a constant gain over the main lobe. We denote by K_X, X ∈ {BS, UE}, the number of antenna elements and assume the distance between neighboring elements to be λ/2, where λ is the wavelength. The phase excitation difference between the elements is assumed to be zero. Then, following [32], the half-power beamwidth (HPBW) of the main lobe for a symmetrical pattern is given by 2|θ_m − θ_±3dB|, where θ_m = π/2 is the array orientation and θ_±3dB = arccos(∓2.782/(π K_X)) are the half-power points' angles. The mean gain over the main lobe can then be obtained for X ∈ {BS, UE} as in [32].

Cell Size and User Mobility
It is assumed that, upon session arrival, the user is randomly placed in the cell coverage area according to the uniform distribution. The coverage area is specified by the BS service radius, chosen such that under the worst-case blockage conditions (nLoS and HB) a cell-edge UE experiences an outage (i.e., an SINR below a threshold value SINR_thre) no more than a fraction p_out of the time. The radius is obtained by solving SINR(d) = SINR_thre with the worst-case path loss (1) increased by the slow fading margin

M_SF = √2 σ_SF(0) erfc^{-1}(2 p_out),   (8)

where erfc^{-1}(·) denotes the inverse complementary error function and p_out is the outage probability at the cell edge, coinciding with the fraction of time the UE is in outage conditions. We track only users having active sessions and stop tracking a user as soon as his or her session terminates. To represent user movement, we adopt the Random Direction Mobility (RDM) model [33]. Accordingly, each user selects a random direction uniformly from [0, 2π) and moves in this direction at a fixed constant speed v during an individual run time exponentially distributed with mean τ_RDM. As soon as this run time elapses, the procedure is repeated for the user. Whenever the user reaches the cell boundary, the movement direction is reflected.
The mobility of users is assumed homogeneous and independent of each other.The flow of users across the cell boundary is assumed stationary.
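One run of the RDM model described above can be sketched as follows; the discretization step dt is an assumption introduced for the sketch.

```python
import math
import random

# One RDM run: pick a uniform direction, move at speed v for an exponentially
# distributed run time with mean tau_rdm, reflect at the circular cell
# boundary of radius r_cell. Returns the user's final position.
def rdm_run(x, y, v, tau_rdm, r_cell, dt=0.1, rng=random.Random(1)):
    theta = rng.uniform(0.0, 2.0 * math.pi)
    steps = max(1, int(rng.expovariate(1.0 / tau_rdm) / dt))
    for _ in range(steps):
        nx = x + v * dt * math.cos(theta)
        ny = y + v * dt * math.sin(theta)
        if math.hypot(nx, ny) > r_cell:        # reflect the direction
            theta += math.pi
            nx = x + v * dt * math.cos(theta)
            ny = y + v * dt * math.sin(theta)
        x, y = nx, ny
    return x, y
```

A full simulation would repeat such runs back to back for each tracked user until their session terminates.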

Cell Capacity
We approximate the total cell capacity C(t) at time t ≥ 0 by assuming equal bandwidth sharing among the N active users:

C(t) = (B / N) Σ_{i=1..N} h_i(t),   (9)

where B denotes the bandwidth and h_i(t) is the spectral efficiency of UE i at time t. The user's spectral efficiency, h_i(t), is obtained by mapping the SINR (2) to the NR modulation and coding scheme (MCS) [34]. Note that (9) is essentially an approximation building upon two assumptions: (i) the whole set of available resources is utilized, and (ii) resources are equally partitioned among all users. We then employ C(t) computed by (9) in Section 6.1 to characterize the cell capacity's dynamics, which is further used in Section 7. In practice, the exact behavior of the cell capacity depends on the slice resource allocations determined by the employed slicing policy, as well as on resource allocations among sessions within each slice, which in turn may depend on the tenants' own policies. Thus, the adopted approach permits a decomposition of the overall problem into separate tasks.
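The equal-sharing approximation (9) is a one-liner; the bandwidth and spectral efficiencies below are illustrative numbers.

```python
# C(t) = (B / N) * sum_i h_i(t), with h_i the MCS-mapped spectral
# efficiency of user i in bit/s/Hz.
def cell_capacity(bandwidth_hz, spectral_efficiencies):
    n = len(spectral_efficiencies)
    if n == 0:
        return 0.0
    return bandwidth_hz / n * sum(spectral_efficiencies)

# Example: a 100 MHz cell with three users in different channel conditions.
c = cell_capacity(100e6, [5.5, 2.4, 0.9])   # about 293.3 Mbit/s in total
```

Because each h_i(t) follows the user's SINR, mobility and blockage transitions translate directly into fluctuations of C(t).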

Traffic and Slices
The BS serves heterogeneous traffic, and the network slicing technique is employed to efficiently accommodate sessions with substantially different QoS requirements (voice, video streaming, gaming, etc.). We denote the set of all instantiated slices by S, |S| = S, and assume that each slice is intended for one type of service, which makes it homogeneous in terms of session characteristics and QoS parameters.
Since slices are service-specific, for each of them we can define a minimum data rate per user, R_s^min > 0, needed to meet the QoS requirements of the service provided in the slice, R^min = (R_s^min)_{s∈S}. It is assumed that a user cannot receive proper service if the data rate is below this value. Furthermore, following GSMA NG.116 [35], for each slice we can specify a maximum user data rate. It is denoted by R_s^max, R^max = (R_s^max)_{s∈S}, and corresponds to such a value that allocating a data rate higher than this will not result in any gain in QoS or quality of experience (QoE) for the user.
Let N_s be the number of ongoing user sessions in slice s, and denote the row vector of the numbers of users in all slices by N = (N_s)_{s∈S}. Each user is assumed to have only one connection in only one slice. If a user has multiple connections, it is considered and served as multiple users, one per connection. We assume that users arrive to the system according to a Poisson process of rate ν, are directed to slice s ∈ S with probability q_s, and leave the system upon session completion. Session durations in slice s are exponentially distributed with mean u_s.
We assume that resources are shared among slices using the slicing scheme with equitable-priority-based performance isolation of slices proposed in [5]. Let C_s ≥ 0 represent the capacity of slice s ∈ S and let it follow the number of users in the slice in the form C_s = N_s R_s, where R_s is the user data rate in slice s ∈ S to be determined by the slicing scheme. Note that R_s is the ensemble average user data rate in the slice; the actual data rates perceived by slice users may differ depending on their channel conditions and the resource allocation policy applied by the slice's tenant. Let R = (R_s)_{s∈S} be a column vector. Considering that C is the total capacity of the BS, the capacities of slices must satisfy Σ_{s∈S} C_s ≤ C.
For our system, we demand that R_s ≥ R_s^min as long as the number of users in the slice, N_s, does not exceed a contracted number N_s^cont, N^cont = (N_s^cont)_{s∈S}. Indeed, due to capacity limitations, slice performance isolation cannot be guaranteed for unrestricted traffic in all slices, so it is assumed that slice isolation is ensured as long as the number of users in the slice does not exceed its contracted threshold, i.e., N_s ≤ N_s^cont. The InP thus guarantees performance isolation of slice s by providing it with at least a capacity of

min(N_s, N_s^cont) R_s^min.

Any remaining capacity is distributed among all slices on the basis of fairness, but so that R_s ≤ R_s^max, s ∈ S. Note that we allow for overbooking, i.e., the sum of the contracted slice capacities, Σ_{s∈S} N_s^cont R_s^min, can be larger than C. The considered slicing scheme provides a flexible and dynamic partitioning of the total BS capacity among slices based upon (i) the parameters R^min, R^max and N^cont, and (ii) the demand expressed in terms of the number of users N. It is assumed that the parameters R_s^min, R_s^max and N_s^cont are agreed upon between the InP and the tenant of slice s and stated in the corresponding SLA, with N_s^cont set either directly or in the form of the contracted share of the total capacity C. Flexibility is assured by the fact that when some slices do not use all their contracted capacity N_s^cont R_s^min, the remaining capacity (N_s^cont − N_s) R_s^min becomes available to other slices if they need it. Thus, each slice has priority over other slices with respect to its contracted capacity.
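The guarantee underlying the priority scheme can be sketched directly; the numbers below are illustrative.

```python
# Slice s is guaranteed min(N_s, N_cont_s) * R_min_s, so contracted capacity
# left unused by one slice can be offered to the others.
def guaranteed_capacities(n, n_cont, r_min):
    return [min(ns, nc) * rm for ns, nc, rm in zip(n, n_cont, r_min)]

# Slice 1 uses 4 of its 6 contracted users; slice 2 exceeds its contract
# (10 users vs. 8 contracted) and is only guaranteed the contracted part.
g = guaranteed_capacities(n=[4, 10], n_cont=[6, 8], r_min=[2.0, 1.0])   # [8.0, 8.0]
```

Any capacity beyond these guarantees is then distributed fairly, subject to the per-user maximum rates.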
Computation of R is specifically discussed in Section 4 and involves solving a convex programming problem, which can prove computationally challenging under the time constraints characterizing radio resource scheduling.
The proposed ML enhancement addresses this challenge by providing time-efficient approximations using supervised ML techniques.

ML Enhancement
The architecture proposed by ITU-T and described in Section 2.1 is generic enough to be adopted in a multitude of scenarios [36]. In this work, we propose to employ supervised ML to approximate C = (C_s)_{s∈S} based upon a labeled sample obtained by applying the exact solution algorithm. Moreover, as the parameters N^cont, R^min and R^max change on a much larger timescale than the demand N, they can be assumed constant, thus restricting the system's variability.

Online Learning
Two options for ML enhancement are investigated: online and offline. The online learning setting implies training on live data and repeatedly switching between the training and prediction phases. Each time a slice is instantiated, removed, or modified, a training phase begins. Here, the slice capacities C are computed using the exact algorithm. The observed system states given by N, along with the exact solutions, are collected into a training dataset. Once the training set is populated, the implemented ML model is trained and the process moves on to the prediction phase. Here, the ML technique is used to predict C from N. The accuracy is constantly monitored, either through periodic comparison with the exact solution or by assessing relevant performance measures. Whenever the system detects insufficient accuracy due to a change in demand yielding population vectors N substantially different from the training data, the process starts over from the training phase.
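The phase-switching logic can be sketched as follows with a degree-2 polynomial regression fitted by least squares. The `exact_solver` stand-in (proportional shares) and the 5% error threshold are illustrative placeholders, not the paper's exact algorithm or criterion.

```python
import numpy as np

def poly_features(X):
    # Degree-2 polynomial features: [1, n_i, n_i * n_j for i <= j].
    ones = np.ones((len(X), 1))
    quad = np.stack([X[:, i] * X[:, j]
                     for i in range(X.shape[1])
                     for j in range(i, X.shape[1])], axis=1)
    return np.hstack([ones, X, quad])

def exact_solver(n):                      # placeholder for the exact algorithm
    return n / n.sum()                    # proportional shares as a stand-in

# Training phase: label observed states N with the exact solution.
rng = np.random.default_rng(0)
X_train = rng.integers(1, 20, size=(300, 3)).astype(float)
y_train = np.array([exact_solver(x) for x in X_train])
W, *_ = np.linalg.lstsq(poly_features(X_train), y_train, rcond=None)

# Prediction phase with accuracy monitoring.
n_new = np.array([[5.0, 8.0, 3.0]])
pred = poly_features(n_new) @ W
err = np.abs(pred[0] - exact_solver(n_new[0])).max()
retrain_needed = err > 0.05               # threshold is an assumption
```

When `retrain_needed` fires, the system would collect a fresh labeled dataset and refit, exactly as the training/prediction cycle above describes.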

Offline Learning
Whereas in the online learning setting the model is trained for a specific, current range of workloads, in the offline scenario training and validation data are sampled from the uniform distribution on the feasible space of N. Labels for the data are computed using the exact algorithm with the average BS capacity.As a result, the trained model must be suitable for any workload regime and no accuracy monitoring is needed.The model has to be retrained only upon changes in the parameters N cont , R min and R max .
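Offline dataset generation can be sketched as follows; `exact_solver` is again a stand-in placeholder for the exact algorithm run with the average BS capacity.

```python
import numpy as np

# Sample states N uniformly over a feasible box and label each state with
# the exact solution, so the trained model covers any workload regime.
def make_offline_dataset(n_samples, n_slices, n_max, exact_solver, seed=0):
    rng = np.random.default_rng(seed)
    X = rng.integers(0, n_max + 1, size=(n_samples, n_slices)).astype(float)
    X = X[X.sum(axis=1) > 0]          # drop the empty state, if sampled
    y = np.array([exact_solver(x) for x in X])
    return X, y

X, y = make_offline_dataset(1000, 4, 30, lambda n: n / n.sum())
```

Uniform sampling is what removes the need for accuracy monitoring: unlike the online setting, no region of the state space is underrepresented by construction.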

Performance Assessment
Four supervised ML models are investigated in the paper: a linear regression, a polynomial regression, a random forest regressor and an ANN. However, other ML techniques can also be employed in the proposed settings. Our goal is to evaluate and compare these ML algorithms in terms of approximation accuracy, generalization capability and time efficiency under the online and offline learning scenarios. We also assess the impact of channel variability on the performance of these techniques. Specific performance criteria are discussed in Section 5.7.
The timescale of interest for the resource allocation considered in this paper is at least a few tens or hundreds of TTIs. Practically, it corresponds to the resource reallocation procedure being invoked either (i) at regular intervals or (ii) at time instants when the slice conditions (e.g., the number of active users) change.

EQUITABLE-PRIORITY-BASED SLICING POLICY
In this section, we first formalize a model of multi-tenant resource sharing at the air interface aimed at fair priority-based isolation of slices. Then, we proceed to present the exact solution algorithm and motivate the need for ML approaches for the overall problem solution.

Resource Arbitration Scheme
Since the demand is given in terms of the numbers of users in slices, the state of the BS is described by the vector N ∈ V = N^S. We partition the state space as V = V_max ∪ V_opt ∪ V_min, where V_max contains all states in which the available capacity suffices to allocate the corresponding maximum rates to all users, V_opt contains all states in which the maximums cannot be allocated to all users but the minimums can, and V_min contains all states in which even the corresponding minimums cannot be allocated to all users. The slicing scheme determines the slice capacities C_s(N) ∈ R_+, s ∈ S, for each N ∈ V so that Σ_{s∈S} C_s(N) ≤ C. For N ∈ V_max, we can assign the maximum data rate to all users in all slices, which yields C_s(N) = N_s R_s^max, s ∈ S. For N ∈ V_opt, we seek to allocate resources so as to (i) satisfy the minimum and maximum constraints (10), (ii) make use of the whole available capacity, and (iii) provide max-min fairness to users, taking into account whether the contracted number of users is exceeded and by how much. Such an allocation, for N ∈ V_opt, can be found as a solution to the optimization problem [5]

maximize U(R) = Σ_{s∈S} w_s(N_s) N_s log R_s,   (17)
subject to Σ_{s∈S} N_s R_s = C,   (18)
R_s^min ≤ R_s ≤ R_s^max, s ∈ S,   (19)

with the weight functions w_s(·) defined in (20) so that they equal one while N_s ≤ N_s^cont and decrease as the contracted number of users is exceeded. The target function U(R) in (17) is a utility function of the log type proposed for proportionally fair resource sharing in [37], which in our case coincides with max-min fairness [38]. The weight functions (20) ensure a max-min fair resource allocation to users as long as their number in the corresponding slices does not exceed the contracted quantity, and penalize the "violating" slices by decreasing their weights. The constraint (18) ensures not only that the total allocation does not exceed the available capacity C, but also that all available capacity is allocated. Finally, the box constraints (19) ensure that the minimum and maximum data rate requirements in slices are satisfied.
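The V_opt problem can be solved numerically by a generic constrained optimizer. The sketch below uses unit weights (all slices within contract) and invented numbers; it illustrates the structure of the problem rather than the paper's exact algorithm.

```python
import numpy as np
from scipy.optimize import minimize

# Maximize sum_s w_s N_s log R_s subject to sum_s N_s R_s = C and
# R_min <= R <= R_max (a strictly concave problem with a unique optimum).
n = np.array([10.0, 5.0, 2.0])           # users per slice (illustrative)
w = np.ones(3)                           # weights, all within contract
r_min = np.array([1.0, 1.0, 2.0])
r_max = np.array([4.0, 6.0, 10.0])
c = 50.0

res = minimize(
    lambda r: -np.sum(w * n * np.log(r)),               # maximize U(R)
    x0=np.clip(np.full(3, c / n.sum()), r_min, r_max),  # feasible start
    bounds=list(zip(r_min, r_max)),
    constraints=[{"type": "eq", "fun": lambda r: n @ r - c}],
)
r_opt = res.x
```

With unit weights and no binding box constraints, the optimum is the common rate C / Σ_s N_s, which a general-purpose solver is far too slow to deliver at the TTI timescale; this is exactly the gap the ML approximation targets.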
We now extend our policy to the congestion states. When even the minimum requirements cannot all be satisfied, the available capacity is shared proportionally to the requested minimums. If, conversely, N_min R_min < C, then we can allocate the due capacity N_min R_min and then distribute the remaining capacity C − N_min R_min among the N − N_min users, again proportionally to the requested capacity. The main features of the scheme, namely fairness and slice performance isolation, are illustrated numerically in Subsection 7.1.

Since the objective function in SLICE_{N,C}(·) is differentiable and strictly concave, and the feasible region is compact and convex, U(R) has a unique maximum in the feasible region, which can be found by Lagrangian methods. To find the exact solution of the problem we propose Algorithm 1 (input: state N, capacity C, weights W obtained by (20); output: R solving (17)-(19)). It uses a recursive function, FUNC(N, R, C, s), which populates the set of solution candidates, R, by considering all possible combinations of active constraints (19).
The algorithm operates as follows. The unique solution to the problem (17)-(18), i.e., with the box constraints (19) lifted, can easily be found in closed form. If the resulting stationary point, R_SP, satisfies (19), then it is the sought-for optimum and |R| = 1. If, conversely, R_SP is not feasible, then FUNC(·) is run recursively with one additional constraint (either R_s = R_s^max or R_s = R_s^min) activated at each call for all s ∈ S corresponding to non-zero N_s. If, for instance, the upper boundary is activated for some slice s*, then R_{s*} = R_{s*}^max is fixed in the solution candidate, while its other entries are searched for as the solution to the problem under study with C − N_{s*} R_{s*}^max in place of C and N_{s*} set to zero, hence the recursion. Once all possible combinations of active constraints have been considered and R populated, the vector maximizing the objective function (17) is chosen among the members of R.
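To make the recursion concrete, the following sketch enumerates combinations of active box constraints in the spirit of FUNC(·). It is a minimal illustration under simplifying assumptions: the parameter names are ours, and `w` stands in for the weights produced by (20).

```python
import math

def slice_solver(N, C, w, Rmin, Rmax):
    """Sketch of the recursive active-set search: maximize
    sum_s N_s * w_s * log(R_s) subject to sum_s N_s * R_s = C and
    Rmin_s <= R_s <= Rmax_s, by branching over active box constraints."""
    S = len(N)
    candidates = []

    def func(fixed):
        # 'fixed' maps a slice index to its clamped (boundary) rate
        free = [s for s in range(S) if N[s] > 0 and s not in fixed]
        c_rem = C - sum(N[s] * r for s, r in fixed.items())
        if not free:
            if abs(c_rem) < 1e-9:          # all capacity must be allocated
                candidates.append(dict(fixed))
            return
        denom = sum(N[s] * w[s] for s in free)
        if denom <= 0 or c_rem <= 0:
            return
        # Stationary point with the remaining box constraints lifted:
        # R_s = w_s * c_rem / sum_{free} N_s * w_s
        R_sp = {s: w[s] * c_rem / denom for s in free}
        if all(Rmin[s] - 1e-9 <= R_sp[s] <= Rmax[s] + 1e-9 for s in free):
            candidates.append({**fixed, **R_sp})
            return
        # Otherwise activate one more boundary per free slice and recurse
        for s in free:
            func({**fixed, s: Rmax[s]})
            func({**fixed, s: Rmin[s]})

    func({})

    def utility(R):
        return sum(N[s] * w[s] * math.log(R[s]) for s in range(S) if N[s] > 0)

    return max(candidates, key=utility)
```

For example, with two symmetric slices (N = (5, 5), C = 100, equal weights and box [1, 20]), the unconstrained stationary point R = (10, 10) is feasible and returned directly; skewing the weights forces a boundary activation and a recursive solve on the remainder.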
Note that whenever (23) does not provide a feasible solution, the time complexity of Algorithm 1 is exponential in S. The problem was tackled by iterative methods, namely the gradient projection method, in [5]; however, that approach implies matrix inversion, which brings its complexity to O(S^4) in the worst case. Under high traffic conditions, when the number of sessions in slices may change on sub-second timescales, and when the number of slices is rather high, this could be problematic for implementation. For this reason, we need the faster algorithms that can be found in the ML field. In the next section, we consider several such approximations to speed up the resource arbitration process.

SUPERVISED ML TECHNIQUES
In this section, we discuss the motivation and specifics of using supervised ML to enhance the considered resource arbitration procedure. Then, we formulate the corresponding ML regression problem and introduce the supervised ML techniques under analysis. Finally, we present the performance criteria that we use for comparing the techniques.

ML Enhancement Motivation and Specifics
Under high traffic conditions, when the number of sessions in slices may change on sub-second timescales and the number of slices is relatively large, obtaining the slice allocation C = (C_s)_{s∈S} via the procedure described in Section 4 could be problematic for N ∈ V_opt, as it implies solving the optimization problem SLICE_{N,C}(·), whose exact solution is of exponential time complexity. To quicken the procedure for this range of system states, we propose employing time-efficient supervised ML techniques that approximate the solution based upon a number of exact solution samples.
Recall that the SLICE_{N,C}(R^min, R^max, N^cont) parameters are of different nature. The SLA thresholds R^min, R^max, and N^cont change far less frequently than the demand N (e.g., when a slice is added or removed). We can hence assume them constant during prolonged periods of time and retrain the model each time they change. The BS capacity C, on the other hand, may vary much more frequently than N but, as will be shown in the next section, remains concentrated around its mean value, which we denote by C̄. We can thus assume that its fluctuations merely induce noise in the sample labels and need not consider it an explicit parameter.
Finally, although SLICE_{N,C}(·) yields a solution in the form of data rates R, we formulate the ML techniques for predicting slice capacities C and, consequently, use C = N ∘ R as sample labels, where ∘ denotes component-wise multiplication. This is because the C_s are much smoother functions of N than the R_s, which tremendously improves regression accuracy.
The training dataset D consists of K labeled samples (N^(k), C^(k)), where C^(k) = N^(k) ∘ R^(k) and R^(k) is the solution to SLICE_{N,C}(·) with N = N^(k) and C = C̄ + ε_k. The quantities ε_k, k = 1, ..., K, are realizations of a random variable representing the random deviation of the BS capacity from its mean. Now, our task is to use D to build a vector function f(N) = (f_s(N))_{s∈S} such that C̃ = f(N) represents a suitable approximation for the slice allocation obtained from solving SLICE_{N,C}(·) for any N satisfying (25).
For training the models we adopt the common approach and rely on the quadratic loss function (see, e.g., [39]), which in our case takes the form (26). Then, for each type of ML algorithm, the function f that minimizes the loss (26) is found and evaluated. In what follows, with training and execution complexity in mind, we specifically concentrate on two simple approaches (the linear and polynomial regressions), one of medium complexity (the random forest regressor), and the most complex one, an ANN.

Linear Regression
The simplest technique we use is linear regression. Here, we predict the vector of slice capacities in state N as C̃ = xB = (1, N_1, ..., N_S)B, (27) with the regression coefficients B determined from the training data D so as to minimize (26). The matrix of regression coefficients can then be computed as [39] B = (X^T X)^{-1} X^T Y, (28) where row k of X is (1, N_1^(k), ..., N_S^(k)) and Y = [y_1 ... y_S] with column vectors y_s = (C_s^(k))_{k=1,...,K}. The time cost to train and to query a linear regression model is, respectively, O(S^2 K + S^3) and O(S) [39]. In our case, we have S such models, one for each C_s.
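As an illustration, the closed-form fit (27)-(28) can be sketched on synthetic data. The demand-to-capacity mapping below is an invented toy, not the paper's model, and the noise merely stands in for the capacity fluctuations ε_k:

```python
import numpy as np

rng = np.random.default_rng(0)
K, S = 500, 2                                  # toy sizes: K samples, S slices
N = rng.integers(0, 30, size=(K, S)).astype(float)

# Invented linear ground truth (one column of coefficients per slice)
B_true = np.array([[10.0, 5.0],
                   [30.0, 4.0],
                   [6.0, 25.0]])
X = np.hstack([np.ones((K, 1)), N])            # design rows x = (1, N_1, ..., N_S)
Y = X @ B_true + rng.normal(0.0, 1.0, (K, S))  # labels C^(k), noisy

# B = (X^T X)^{-1} X^T Y; lstsq solves the same normal equations stably
B_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
C_pred = X @ B_hat                             # predicted slice capacities
```

With K = 500 samples, the recovered coefficients closely match the generating ones despite the label noise.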

Polynomial Regression
In a polynomial regression of degree 2, the prediction is computed by (27) with the vector x appended on the right with the entries N_i N_j for all i, j = 1, ..., S such that i ≤ j. The matrix of regression coefficients is obtained by (28), in which the matrix X is appended on the right with the respective columns (N_i^(k) N_j^(k))_{k=1,...,K} for all i, j = 1, ..., S such that i ≤ j. A polynomial regression of degree m > 2 is constructed from a regression of degree m − 1 by the same procedure. Namely, the prediction is obtained by (27), in which the vector x of a regression of degree m − 1 is appended on the right with products of the form N_1^{i_1} · ... · N_S^{i_S} for all i_s = 0, 1, 2, ... such that Σ_{s=1}^S i_s = m; the matrix X in (28) is appended with the respective columns for the same exponent combinations. Obviously, the linear regression is a polynomial regression of degree 1.
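The degree-2 feature expansion described above can be written out directly; the sketch below is a hand-rolled equivalent of scikit-learn's PolynomialFeatures preprocessing used later in the paper, shown for illustration only:

```python
import itertools
import numpy as np

def poly2_features(N):
    """Design matrix for degree-2 polynomial regression: the columns
    (1, N_1, ..., N_S) of (27) appended with N_i * N_j for all i <= j."""
    K, S = N.shape
    cols = [np.ones((K, 1)), N]
    for i, j in itertools.combinations_with_replacement(range(S), 2):
        cols.append((N[:, i] * N[:, j])[:, None])
    return np.hstack(cols)
```

For S slices this yields 1 + S + S(S+1)/2 columns; the coefficients are then obtained from (28) exactly as in the linear case.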

Random Forest Regressor
The random forest algorithm constructs multiple decision trees on various sub-samples of the dataset and, in the case of the regressor, returns the average prediction of the individual trees. A decision tree is built by recursively partitioning the feature space so that samples with the same or similar labels are grouped together.
More specifically, let D_m be the training data at tree node m, |D_m| = K_m. Each candidate split u = (s, ν_m), consisting of a feature s and its threshold value ν_m, partitions the data of node m into two subsets, and the quality of the split is computed using a loss function L(·) as g(D_m, u) [40]. The commonly used loss function for regression problems is, again, the mean squared error (MSE) [40]. The split minimizing (30), say u*, is then applied, and the algorithm recurses on the resulting subsets. To predict the label of a sample N, the algorithm starts at the root node of the decision tree and moves down the tree until a leaf is reached. The sample is then associated with the label of this leaf.
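For a single feature, choosing the MSE-minimizing threshold can be sketched as follows (an illustrative implementation of the split quality criterion, not scikit-learn's internal one):

```python
import numpy as np

def best_split(x, y):
    """Return the threshold on feature values x that minimizes the summed
    per-child MSE (equivalently, the size-weighted variance) of labels y."""
    best_t, best_loss = None, np.inf
    for t in np.unique(x)[:-1]:              # candidate thresholds
        left, right = y[x <= t], y[x > t]
        loss = left.size * left.var() + right.size * right.var()
        if loss < best_loss:
            best_t, best_loss = t, loss
    return best_t
```

A full tree builder would apply this over all features at each node and recurse on the two resulting subsets until a stopping criterion is met.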
According to [40], in general, the run-time cost to construct and to query a balanced binary tree is, respectively, O(K S log K) and O(log K). The time complexity of a random forest with T balanced trees is O(T K S log K) to train and O(T log K) to query, attaining, respectively, O(T K^2 S) and O(T K) in the worst case.

Artificial Neural Network
The last ML technique investigated is a fully connected ANN with M ReLU-activated hidden layers of sizes J_m, m = 1, ..., M, and a linear-activation output layer. More specifically, the approximation is computed as a nested composition of affine transformations and ReLU activations, followed by the linear output layer. The weight matrices B^(m), m = 0, ..., M, are obtained from D via backpropagation using the sum quadratic loss.
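The nested form of the prediction can be spelled out as a bare forward pass (a numpy sketch with our own naming; training the matrices via backpropagation is left to a framework such as Keras, as done later in the paper):

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def ann_predict(x, weights, biases):
    """Forward pass of a fully connected network: M ReLU hidden layers
    followed by a linear output, mirroring the B^(0), ..., B^(M) products."""
    a = np.asarray(x, dtype=float)
    for W, b in zip(weights[:-1], biases[:-1]):
        a = relu(a @ W + b)                # hidden layers
    return a @ weights[-1] + biases[-1]    # linear output layer
```

The input here is the demand vector N and the output the predicted slice capacities C̃.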

Performance Criteria
We evaluate and compare the considered ML techniques in terms of accuracy and performance within the settings described in Section 3.
The predictive accuracy is assessed via the root mean square error, the mean absolute error, and the maximum residual error. To evaluate the performance of the ML techniques, we consider their time efficiency and their capability to satisfy the data-rate constraints. The former is assessed via (i) the prediction time and (ii) the sample dataset size needed for training, which is particularly relevant in the online learning setting.
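Concretely, given the exact allocations and the model predictions, the three accuracy metrics can be computed as:

```python
import numpy as np

def accuracy_metrics(C_true, C_pred):
    """RMSE, mean absolute error, and maximum residual error used to
    compare an ML approximation against the exact allocations."""
    err = np.asarray(C_pred, dtype=float) - np.asarray(C_true, dtype=float)
    return {
        "rmse": float(np.sqrt(np.mean(err ** 2))),
        "mae": float(np.mean(np.abs(err))),
        "max_residual": float(np.max(np.abs(err))),
    }
```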
The capability of the approximations to satisfy the constraints is estimated via the probability that the approximate resource allocation to slices results in an SLA violation for capacity C.

EVALUATION SCENARIOS
In this section, we detail our simulation approach for estimating the total cell capacity and specify representative traffic characteristics. The resulting scenarios are then utilized to numerically assess the performance of the ML techniques discussed in the previous section.

Dynamic Resource Characterization
We first determine the statistical characteristics of the stochastic process representing the cell capacity for different fixed numbers of users N. With each user we associate three independent Poisson processes (PPs) of events. The first, of rate 1/τ_RDM, yields a sequence of times at which a new movement direction is selected uniformly at random from the interval [0, 2π). The second, of rate 1/τ_LoS, provides a sequence of times at which the user's LoS/nLoS state is selected according to probability (4) at the current position of the user, and a new value of the shadow-fading factor, x_SF [dB] ~ N(0, σ_SF(I_LoS)), is sampled with the standard deviation corresponding to the selected LoS/nLoS state. The selected LoS/nLoS state and the shadow-fading factor remain unchanged until the next event of this PP. Finally, the third PP, of rate 1/τ_HB, yields a sequence of times at which the user's HB/nHB state is selected according to probability (5).
Simulation time is divided into intervals of length Δt. At the beginning of each interval [t_j, t_j + Δt), j = 1, ..., T, t_1 = 0, the new coordinates of each user i = 1, ..., N are calculated according to the RDM model as described in Section 3.2.4. If a user reaches the boundary of the BS service area defined by (7), its trajectory is reflected. At each time t_j, from the coordinates of each user i we calculate its 2D distance to the BS, r_{i,j}, and compute by (2) the user's SINR(√(r_{i,j}^2 + Δh^2), I_LoS, I_nHB). Then, the user's spectral efficiency, η_i(t), is obtained from the NR MCS-to-SINR mapping [34]. The total BS capacity for t ∈ [t_j, t_j + Δt) is approximated as C(t) = C(t_j) by (9).
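A minimal sketch of one mobility update under this procedure is given below. It is a simplification under our own assumptions: the reflection rule merely pushes the user back to the boundary and turns it toward the cell center, and the parameter names are illustrative.

```python
import math

def rdm_step(x, y, speed, direction, dt, radius):
    """One RDM update of a user's 2D position within a circular BS
    service area of the given radius, with simplified boundary reflection."""
    x += speed * dt * math.cos(direction)
    y += speed * dt * math.sin(direction)
    r = math.hypot(x, y)
    if r > radius:                      # user left the service area:
        x, y = x * radius / r, y * radius / r   # clip to the boundary
        direction = math.atan2(-y, -x)          # head back toward the center
    return x, y, direction
```

In the full simulation, such position updates are combined with the LoS/nLoS, shadow-fading, and blockage processes to produce the per-user SINR trace.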
The system parameters utilized for the cell capacity characterization are provided in Table 2. In this work, we deal with a relatively long-term resource allocation spanning a few tens/hundreds of TTIs (the assumed TTI is the mmWave NR subframe corresponding to numerology μ = 3, i.e., 1 ms). Because we are concerned with the approximate total cell capacity obtained by averaging out users' spectral efficiencies, the channel sampling interval Δt is directly set to a rather large value of 1 s. If, however, individual users' data rates are of interest, then one should track the SINR at a much finer time resolution, e.g., 1 ms. Fig. 3 shows the time series of the total cell capacity, while Fig. 4 reports the cumulative distribution function (CDF), the probability density function (pdf), and the autocorrelation function (ACF) of C(t) for 30, 40, 50, and 60 UEs. The data in Fig. 4 show that, regardless of the number of users, the distribution is close to log-normal, reflecting the effect of a power-decaying propagation model. Furthermore, for all the considered numbers of UEs in the system, the average total cell capacity remains unchanged at C̄ ≈ 2000 Mbps. As one may observe in Figs. 4a and 4b, although the variance of the cell rate decreases with more UEs in the system, it still remains relatively large even for 60 UEs. This implies that the constant-cell-rate assumption, often made for simplicity in slicing algorithms, may require timely recalculation of the slice shares. The latter places strict constraints on the algorithms' execution time, thus requiring lightweight ML solutions irrespective of the implementation type (online or offline). As further shown in Fig. 4c, although the memory of the cell rate process is rather short, the ACF remains very high over 2-4 s intervals, providing a lower bound on the re-slicing frequency.

Traffic Characterization
A realistic assessment of traffic conditions is key to creating network slicing models that meet modern requirements. To offer insight into a typical traffic composition and its fluctuations, Table 3 provides the traffic shares of the most significant application categories at different periods of the day, based upon statistics collected in North America in 2021 and summarized in [41]. Furthermore, according to the same report, in the afternoon the total traffic is about 20% higher than during the morning hours. Also, video streaming traffic clearly predominates in the downlink, where its average share attains 48.9%.
Using these data as a starting point and taking into account the projected growth of cloud gaming and virtual and augmented reality (VR/AR) applications [42], for our evaluation we consider a composition of slices with the SLA and workload parameter values provided in Tables 4 and 5, respectively. Recall that the considered SLA parameters are the minimum and maximum user data rates, R_s^min and R_s^max, and the contracted slice resource share, γ_s (12), to which the slice's sessions have priority over other slices. The workload parameters include the mean slice session durations θ_s, the user arrival rate in the cell ν, and the probability q_s for an arriving user to be served by slice s ∈ S. The interarrival times and durations of sessions are assumed exponentially distributed.
The voice slice in our list is intended for classical cellular telephony services. Unlike all other slices, we assume it non-elastic in the sense that its size does not scale in and out with the demand, due to the low data-rate requirement of a voice session; its capacity share is thus fixed. The best-effort slice serves the traffic of social networking, web browsing, messaging, file sharing, and other services that usually do not require strict QoS guarantees. The video streaming slice receives the traffic of video streaming applications that provide quality guarantees to users. Gaming and VR/AR services are the most resource-demanding, with strict latency requirements [43], and therefore have dedicated slices. Finally, the corporate slices are assumed to be client-specific, and each serves a mix of traffic including resource-consuming video conferencing, but also file sharing, cloud, messaging, etc.
The workload parameters for the scenarios in Table 5 are chosen to yield the following distributions of traffic shares among slices, assuming the user data rates in slices R = (1, 20, 10, 25, 50, 25, 25, 25). In scenario SceMo6, which roughly corresponds to the morning hours, the average distribution of traffic shares among slices, in percent, is (5, 40, 20, 5, 10, 20, 0, 0), with an average resource utilization of 56%. Scenario SceMo8 differs from SceMo6 only in that the workload of the corporate slice is distributed among three corporate slices 6, 7, and 8 in the ratios 1/2, 1/4, and 1/4. Scenario SceAN6 represents the afternoon and yields the average traffic percentages (3, 30, 30, 12, 20, 5, 0, 0) with a utilization of 80%. Finally, scenario SceAN8 is similar to SceAN6, but the workload of the corporate slice is equally distributed among the three corporate slices 6, 7, and 8.
The contracted capacity shares γ_s for slices 1-6 are set to approximately correspond to the 99th percentiles of their traffic shares under ν = 3, the maximum q_s over all the scenarios under consideration, and the user data rates R_s^min. The contracted shares of all corporate slices are set equal to γ_6 for simplicity. Thus, in this work, we assume that only about half of the total capacity is contracted, and the remainder is shared among slices without prioritization. Other scenarios can be designed based upon different assumptions as to the InP's business model and pricing strategies.

NUMERICAL RESULTS
In this section, we first numerically illustrate the main features of the considered slicing scheme. Then, an accuracy and performance assessment of the ML models under study is provided, preceded by a short discussion of the utilized data. Lastly, we address the impact of the cell capacity's variability on the efficiency of the slicing scheme.

Slicing Policy Features
We start with Fig. 5, which provides insight into the considered slicing policy's features, namely slice isolation, data-rate requirement enforcement, and fairness. We consider capacity C and state N = (58, 22, 23, 2, 2, 9), corresponding to the average number of users in slices under scenario SceMo6, and vary the number of users in the video streaming slice. The voice slice receives a constant capacity share, i.e., C_1 = γ_1 C, and the slicing scheme is applied to S̃ = {2, ..., 6} and C̃ = (1 − γ_1) C. It can be observed in Fig. 5 (bottom) that as N_3 grows, capacity is fairly allocated to users, accounting for their minimum and maximum data-rate requirements. For instance, for N_3 ∈ [10, 40], the data rate of slice 3 is R_3^max, while the rates in the other S̃ slices, being greater than R_3^max, are equal to each other. As soon as the data rates in slices S̃ \ {3} drop below R_3^max, all the rates decrease together, but not below their respective minima. However, once N_3 grows beyond the contracted value, N_3^cont, the data rates in the other slices do not change and only R_3 keeps decreasing, so that the capacity of slice 3 remains equal to its value for N_3^cont (see Fig. 5, top). Thus, performance in slices S̃ \ {3} is isolated in the case of a traffic increase in slice 3 beyond the contracted capacity.

ML Assessment Data and Tools
To numerically evaluate the accuracy and performance of the considered ML models, we rely on two types of labeled sample datasets. The majority of the data is sampled via the slicing analysis simulator [44], with cell capacity values simulated via the procedure in Section 6.1. Here, separate datasets are obtained for the four scenarios using the parameter values in Tables 4 and 5. The second type of dataset, uniformly sampled, is populated with vectors N where each N_s is randomly sampled from {0, ..., ⌊C̄/R_s^min⌋}, and the average capacity is assumed. These are used to evaluate the offline learning setting.
Data were sampled and labeled with the voice slice omitted, i.e., assuming S̃ = {2, ..., S} and C̃ = (1 − γ_1)C. All datasets have been filtered so as to consist of samples satisfying (25). To implement the considered ML models we utilized the corresponding functions from the scikit-learn library [45], namely LinearRegression with PolynomialFeatures preprocessing in the case of the polynomial regressions, and RandomForestRegressor with 10 and 50 decision-tree estimators. As for the ANN, we employed Keras' Sequential model with two ReLU-activated hidden layers of sizes 64 and 128. This configuration resulted from hyperparameter optimization on a SceAN6 dataset using the KerasTuner framework.

Accuracy of the ML Models
Accuracy assessment results are provided in Tables 6 and 7 and in Figs. 6 and 7. Specifically, Table 6 shows the average accuracy for different evaluation settings. Row 1 deals with offline learning: here, the models are trained on uniformly sampled data with averaged cell capacity and then tested on SceMo simulation datasets. Rows 2 and 3 illustrate the online learning setting. In row 2, the models are trained on SceMo datasets different from the test ones, whereas in row 3 the models are trained on SceAN datasets and then tested on the SceMo test datasets. The same SceMo6 and SceMo8 datasets are used for testing in all rows.
It can be observed that the online learning setting yields substantially better results than the offline one as long as training and testing are performed on data sampled with the same workload parameter values.However, when tested on data sampled with differing workload parameters, the accuracy is generally quite poor, although some models generalize better than others.
We specifically look into the sensitivity of the models' accuracy to changes in the workloads in Figs. 6 and 7. As one may observe, the accuracy of the models is hardly impaired by variations in the overall workload level (Fig. 6), but is greatly affected by changes in the workload distribution among slices (Fig. 7). We clearly see that the polynomial regressions, closely followed by the ANN, demonstrate the best generalization capacity among the considered models.
A comparison between the S = 6 and S = 8 results provided in Table 6 shows that an increase in the number of slices improves the prediction accuracy in the online training setting. This result is particularly welcome, since it is for a larger number of slices that the exact solution algorithm becomes slow and an alternative is needed. Whereas the metrics in Table 6 are averaged over all slices, Table 7 provides the accuracy assessment by slice. It can be observed that all the models exhibit substantial variations in prediction accuracy from slice to slice. Unfortunately, these variations cannot be explained, and hence predicted, from the slices' SLA parameters: e.g., the video streaming slice yields the best accuracy on the SceMo data and the worst on SceAN. On the other hand, we notice that for each workload range, significant errors come from about two slices, while the approximations for the others are satisfactory. Thus, in the ML pipeline it may suffice to monitor and adjust the predictions for just a few slices.

Performance of the ML Algorithms
Besides accuracy, we evaluate the performance of the considered models by looking into their prediction time (Table 8), the trade-off between accuracy and prediction time (Fig. 8), the training dataset sizes (Fig. 9), and the capability to respect the data-rate constraints (Table 9).
According to Table 8, the regression models are substantially faster than the random forests and the ANN. However, when the number of slices increases from |S̃| = 5 to |S̃| = 7, the execution time of the regressions approximately doubles, while that of the RF and ANN barely changes. It should be noted that the prediction time of the regressions and the ANN is determined by the model dimensions, whereas that of the RF also depends on the data and varies from one dataset to another.
Fig. 8 reveals the trade-off between accuracy and prediction time based on the data in Tables 6 and 8 for the online learning scenario. Here, the XGBoost regression model [46] is added as a benchmark. It can be observed that PolyReg2 provides the best compromise between execution time and accuracy.
Fig. 9 shows the learning curves in terms of MAE for the models under study. We observe that the polynomial regressions perform best in the online learning setting, with PolyReg2 potentially slightly less accurate than PolyReg4 but reaching its maximum accuracy faster, with fewer than 1500 samples. The accuracy of the random forest regressors and the ANN keeps improving for much larger training dataset sizes, especially when learning offline. However, as we can see from Table 6, the good cross-validation results on a uniformly sampled dataset with a fixed capacity value, exhibited by the ANN and the RF regressors, do not guarantee satisfactory prediction accuracy on a test dataset simulated with varying cell capacity and hence noisy.
Fig. 6. Accuracy sensitivity to the overall workload level. For each run, the models are trained on a SceAN6 sample (ν = 2.7) of size 6×10^4 and then tested on samples of size 250 generated for ν ∈ [2.5, 2.9] under the SceAN scenario assumptions. Lines show the MAE averaged over 10 runs versus ν.

Fig. 7. Accuracy sensitivity to workload variations among slices. For each run, the models are trained on a SceAN6 sample of size 6×10^4 and then tested on samples of size 250 generated for perturbed slice probabilities q, with 6 and 8 slices.

Finally, Table 9 shows that the approximations via all the considered ML models are quite prone to SLA violation, especially under higher workloads. Therefore, whatever the adopted model, a specific SLA guarantee control must be implemented in the ML pipeline by means of a policy node. Such a node might first check whether the minimum-rate inequality, where C_s is the size of slice s returned by the model node, holds for all s ∈ S, and, if this is not the case, adjust the allocation accordingly, providing a suboptimal yet SLA-conform allocation. In our modeling setup, adding such operations would increase the execution time by about 1.32 × 10^-2 s for S = 6 and 1.4 × 10^-2 s for S = 8 per 1000 data points.
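One possible realization of such a policy-node check is sketched below. It is hypothetical: the exact fallback rule is not fully specified in the text, so here the minimum rates are granted first and the remaining capacity is split proportionally to the model's predictions.

```python
def enforce_sla(C_pred, N, R_min, C_total):
    """Hypothetical policy-node check: if a predicted slice capacity cannot
    cover the minimum rates of its current users, fall back to a minimal
    SLA-conform allocation plus a proportional top-up of the remainder."""
    floor = [n * r for n, r in zip(N, R_min)]      # N_s * R_s^min per slice
    if all(c >= f for c, f in zip(C_pred, floor)):
        return list(C_pred)                        # prediction is SLA-conform
    spare = C_total - sum(floor)                   # capacity left after minimums
    total_pred = sum(C_pred) or 1.0
    return [f + spare * c / total_pred for f, c in zip(floor, C_pred)]
```

Such a check runs in O(S) time per prediction, which is consistent with the small execution-time overhead reported above.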

Impact of Varying Traffic and Channel Conditions
The considered slicing scheme yields the optimal resource allocation to slices provided that it is executed with the current cell capacity value. A delay in re-slicing may result in the issues illustrated in Fig. 10. Here, the user data rates for N corresponding to the average numbers of users in SceMo (top) and SceAN (bottom) are plotted versus C. Solid lines show the data rates computed for each plotted value of C, and dashed lines indicate the data rates computed for C̄ and then scaled for each C with the factor C/C̄. It can be seen in Fig. 10 (top) that the scaled data rate in the video streaming slice for C > C̄ is above the maximum, R_3^max = 25, which results in a waste of resources. A bigger issue can be observed in Fig. 10 (bottom): here, the scaled data rate in the VR/AR slice drops below the required R_5^min = 20, which leads to an SLA violation. To further investigate the impact of capacity variations, in Fig. 11 we plot, versus the arrival rate, the probabilities that the minimum data rates, although maintained under C̄, are violated under the current capacity value, provided that re-slicing is done in real time. Note that here we do not account for the contracted numbers of users. We observe that under the SceAN user distribution this probability becomes significant when the workload initially assumed in the scenario is multiplied by 3.7, while under SceMo the workload has to be multiplied by 7. Both probabilities attain about 6-7% and then decrease as the workload grows further, because C̄ also becomes insufficient for the growing traffic.
The data provided in Figs. 10 and 11 emphasize that the variable cell capacity may lead to a significant degradation of the contracted rates when re-slicing is not performed in a timely manner. Otherwise, its impact is limited to very high workload regimes and barely affects the SLA. Complementing these illustrations, Fig. 12 shows the system degradation probability for different re-slicing triggers. By the degradation probability we understand the probability that the data-rate requirements of at least one slice are violated. Separate simulations are performed for (i) the variable cell capacity sampled at 1 s intervals, (ii) the approximation via the constant cell capacity C̄ = 2000, and (iii) the variable cell capacity for degradation control and C̄ for re-slicing. Analyzing the presented data, one observes that the least degradation is induced by invoking re-allocations upon both session arrivals and departures. Re-slicing upon arrivals only shows a performance comparable to that of the timer-based approach with 1 s intervals in the depicted SceAN scenario, characterized by higher workloads. Quite naturally, an increase in the re-slicing interval results in a higher performance degradation. Importantly, we observe that relying upon a constant cell capacity leads to overly optimistic results in high-load conditions. At the same time, for the lower-load conditions corresponding to the SceMo scenario (not shown in the figure), no significant difference can be seen between the three considered parameterizations. Thus, one can conclude that capturing the cell capacity dynamics in a timely fashion is critical in high-load conditions.

CONCLUSION
Motivated by the critical time and accuracy constraints for the slicing process in future 5G NR systems, in this paper we have compared and evaluated the performance of ML algorithms for enhancing a RAN slicing scheme as standardized by ITU-T. Accounting for a realistic channel model and slice workload distributions, we have assessed their accuracy and efficiency, as well as their sensitivity to cell rate variations, operating in online and offline learning regimes.
We have shown that the online implementation exhibits improvements over the offline one for the same slice composition, overall workload level, and workload distribution among slices. Furthermore, the polynomial regressions are potentially the best choice for online learning, since in this setting they outperform both the neural network and random forest algorithms in terms of accuracy, execution time, sensitivity to workload variations, and the size of the training data needed to achieve optimal accuracy, which makes them suitable for accommodating frequently changing traffic distributions across slices. However, the random forest regressor is a close competitor, capable of achieving better accuracy when trained offline, although with a much larger training dataset.
By assessing the effect of the overall workload level and channel variations, we have demonstrated that the latter may lead to a significant degradation of the contracted data rates, especially in the high-workload regime, if re-slicing is not performed in a timely manner, which drives the need for the ML enhancement. Moreover, when using an ML technique, a monitoring and adjustment mechanism (i.e., the policy node) is needed in the ITU-T standardized ML pipeline to enforce the SLA constraints.
We note that the proposed approach can be adapted to other RAN slicing schemes formulated as optimization problems with constraints. The presented framework, results, and discussion could hence provide guidance for such an adaptation.

The predictive accuracy of an ML model f on a labeled dataset D of size K is assessed via the root mean square error, RMSE(f, D) = √(MSE(f, D)).

Fig. 3. Time series of the cell capacity.
These states belong to the set V_opt and require the time-consuming solution of the optimization problem SLICE_{N,C}(·).

Fig. 8. Trade-off between accuracy and prediction time for the online learning scenario, based on the data in Tables 6 and 8.

Fig. 9. Learning curves via 10-fold cross-validation over data sampled from SceAN (left and center) and uniformly with the averaged capacity (right), S = 6. Solid lines show the MAE on validation data; dashed lines represent the MAE on training data.

Fig. 10. User data rates versus cell capacity in the system states corresponding to the average numbers of users in the SceMo (top) and SceAN (bottom) scenarios. Solid lines represent the data rates computed by Algorithm 1 for each plotted capacity value C; dashed lines show the data rates computed by Algorithm 1 for C̄ = 2000 and then, for each C, scaled with the factor C/C̄.

Fig. 11. Probabilities of the minimum data rates' violation due to capacity variations versus ν. The lines represent P{C̄ + ε < N R^min ≤ C̄} for the user distributions q_Mo6 (dashed) and q_AN6 (solid).

Fig. 12. Estimated probabilities of the minimum data rates' violation for different re-slicing triggers under the SceAN6 scenario. Colored bars represent (1/T) ∫_0^T 1{∃ s ∈ S : C_s(t) < N_s(t) R_s^min} dt, where N_s(t) is the number of sessions in slice s at time t and C_s(t) = C_s(t*) C(t)/C(t*), with t* being the time of the last re-slicing before t. The values are estimated by simulation for T ≈ 8 hours and averaged over 10 runs. Black bars indicate the STD over the 10 runs.

TABLE 1. Notation Utilized in This Paper.

TABLE 7. Accuracy of the ML Models by Slice via 10-Fold Cross Validation.

TABLE 9. Probability That the Approximation Leads to SLA Violation, %.