Confidence Aware Deep Learning Driven Wireless Resource Allocation in Shared Spectrum Bands

Deep learning (DL) driven proactive resource allocation (RA) is a promising approach for the efficient management of network resources. However, DL models typically do not capture the uncertainty that arises when new, unseen samples follow a distribution different from the data available at model-training time, which leads to wrong resource usage predictions. To address this, we propose a confidence aware DL solution for robust and reliable predictions of wireless channel utilization (CU) in shared spectrum bands. We utilize an encoder-decoder based Bayesian DL model to generate prediction intervals which capture the uncertainties in wireless CU. We use the CU predictions to design a novel metric score which in turn is used to build an adaptive RA algorithm. We show that a DL model capturing uncertainty in CU can achieve higher data rates for a wireless network. Both the DL driven predictions and the RA models are tested using synthetic data as well as real CU data collected at the University of Oulu. Using analytical and simulation results, we also study the stability of the proposed RA algorithm and show that it converges to a Nash equilibrium (NE). Our results reveal that the proposed algorithm converges to an NE within 2N iterations, where N is the number of network access points.


A. BACKGROUND AND MOTIVATION
For the sixth-generation (6G) of wireless networks, predicting resource utilization variations with sophisticated deep learning (DL)-driven techniques can enable the network to proactively schedule resources for the services and network elements that have higher resource demands [1]-[3]. This approach contrasts with the reactive resource allocation approaches mainly adopted by existing networks, which allocate resources based on current resource requirements. A proactive resource allocation solution that optimizes resources beforehand can cater to demands better and effectively avoid resource congestion.
Efficient and robust management of network resources is critical for the success of the next generation of wireless networks [4]. Judicious resource allocation policies in wireless communication networks can guarantee the required quality of service (QoS) and can also maximize a wireless operator's revenue through optimal and efficient operation of its network [5]. Recently, cloud management of wireless networks in both licensed and unlicensed/shared spectrum bands has attracted the attention of both industry and the research community. This interest is due to the cloud's ability to optimize and manage the resources of an entire network with enhanced computation and analytics capabilities [6], [7]. To make the most of a cloud managed network, it is important to track the right metrics using descriptive analytics and to combine them with DL models for realistic predictions. For example, the works in [8]-[10] used DL based predictions for network resource allocation.
Most of the existing studies using DL methods for proactive resource allocation in wireless networks, such as [8], [9], have implicitly assumed that the training and target datasets have the same distribution. Hence, they produce erroneous resource usage predictions when the target dataset has a different distribution than the training dataset. Changes in users' resource utilization over time therefore give rise to uncertainty in the DL models and make their predictions unreliable. Wireless datasets can be dynamic, which means that their distribution can change from time to time. In the context of DL, the uncertainty arising from a change in the data distribution between training and testing is called model misspecification [11].
To provide an example of model misspecification in wireless networks, Fig. 1(a) and 1(b) illustrate the wireless channel utilization (CU) that we collected over an unlicensed shared channel at the University of Oulu. The details of the data collection process are given in Section III-A. CU is given as a percentage value between 0% and 100%, and it represents the amount of wireless channel usage by various users and access points (APs) within the measuring time interval t. Fig. 1(a) shows the collected CU values in percentage for nine days. It can be seen from the figure that there is a daily pattern for weekdays (Feb. 11-15) and weekends (Feb. 10-11). Fig. 1(b), which presents the CU data for the next nine days, shows that while there is still a daily pattern for weekdays, the weekend pattern does not hold due to high CU on Sunday, Feb. 24. A DL model may not be able to predict this kind of behavior, as it might not have seen it before in the training data. Other uncertainties associated with DL models are model uncertainty [11], which arises from missing training data covering certain areas of the input domain, and inherent noise [11], which arises from uncertainty in the data generation function. Capturing these kinds of uncertainties associated with DL models is crucial for the correct operation of proactive resource allocation systems which rely on them.
Handling uncertainties using deep neural networks (DNNs) is problematic due to their inherent design. DNNs have a tendency to overfit, which can unfavorably impact their generalization capabilities. Moreover, they are overconfident about their predictions even for out-of-training distribution data [12]. Therefore, it is difficult to employ DNNs in estimating uncertainties. Bayesian neural networks (BNNs), on the other hand, produce predictions by aggregating the predictions of a large set of independent and average-performing predictors [12]. This can allow BNNs to make better predictions than DNNs and also enables them to estimate uncertainties in a meaningful way. This has motivated us to use a DL model based on a BNN in our design.

B. MAIN CONTRIBUTIONS
In this paper, we first focus on the design of a DL model that can perform robust and reliable predictions of wireless CU for proactive resource allocation in unlicensed shared spectrum bands. Our DL model also incorporates uncertainty estimation of the CU predictions, which is utilized for real-time change detection in the CU data distribution. Our DL model is based on the encoder-decoder framework of [11], which uses a BNN to handle uncertainty in models. We apply Algorithm 1 of [11] to improve the predictions of wireless network resource utilization. We also use it to design a more efficient resource allocation algorithm which is not only proactive, but also reactive to the resource utilization alarms generated via Algorithm 1.
The main contributions of this paper can be summarized as follows:
1) The proposed DL model addresses the problems of model uncertainty and model misspecification. The DL model uncertainty which arises due to insufficient training samples is reduced by collecting a large number of real CU samples using our FPGA based radio frequency (RF) data processing module (see [13] for details of its implementation). Moreover, the DL model misspecification is quantified by using an encoder-decoder model on real wireless CU data. Rather than making only point predictions, the DL model also estimates the degree of uncertainty in the predicted wireless CU values via the use of prediction intervals (PIs). A PI is a type of confidence interval used with predictions; it is a range of values that predicts the value of a new observation, based on an existing model.
2) We present a methodology for characterizing and measuring the robustness of the proposed prediction method and show that our DL model outperforms other models in terms of robustness. We also evaluate the prediction performance of the developed model using real CU data in terms of mean absolute percentage error (MAPE) and compare its performance with a baseline method (naive), a long short term memory (LSTM) model, and a gated recurrent unit (GRU) model.
3) We utilize the DL prediction values for a given time period to design novel resource metric scores which in turn are used to find an automated, stable and efficient channel allocation solution for multiple wireless APs. In particular, the uncertainty estimate from the prediction model is used by the channel allocation algorithm to adapt the allocation decisions when the real observed CU values fall outside the defined PI. Our results show that taking into account the uncertainty estimate from the prediction model can improve the channel allocation performance as compared to when no uncertainty estimate is utilized.
4) We evaluate the performance of the proposed algorithm in terms of the average sum metric and the average sum rate, which reflect the effectiveness of all the APs for a given channel allocation. To evaluate the stability of the proposed algorithm under various scenarios, along with simulations, we also use the analytical game theoretic concept of stability called Nash equilibrium (NE). A (pure) NE represents an individually agreeable, or stable, allocation in which no wireless AP has an incentive to unilaterally deviate from the proposed resource allocation solution. We also show that the proposed algorithm requires no more than 2N resource allocation steps to stabilize, where N is the number of APs.
The rest of the paper is organized as follows. The next section presents related work, including the application of DL in wireless networks. In Section III, we present the system model. In Section IV, we present the theoretical basis and the implementation details of the proposed DL model, including the prediction performance results. Section V presents the novel metric scores and the proposed channel allocation algorithm. In Section VI, we evaluate the proposed algorithm under several different scenarios and discuss the simulation results. Section VII concludes the paper and provides future research directions.

II. RELATED WORK
DL is a subset of machine learning methods based on artificial neural networks (NNs). These NNs learn from historical data to construct input-output mappings for the problem at hand [14]. Recently, the use of DL driven techniques to efficiently address key problems in wireless networks has generated interest in the research community. For example, the work in [10] has focused on DL based multicast traffic demand predictions to perform resource allocation for broadband networks. DL is used in [8] to predict average rates of non-realtime service users to assign radio resources in advance in a mobile network. In [15], a DNN has been used for subchannel and power allocation in non-orthogonal multiple access (NOMA) networks. The authors in [16] have used a DNN for subcarrier assignment for users in an orthogonal frequency division multiple access (OFDMA) system. The work in [17] proposes a feed forward NN based resource and power allocation scheme for 4G LTE heterogeneous networks. The work in [18] proposes a DNN to learn the optimal policy for predictive resource allocation with interference coordination for cellular networks. The authors in [19] present a DL based method to solve the problem of sub-band and power allocation in a multi-cell network. All of the preceding works assume that the training and target datasets have similar distributions, which is not always true in reality [20]. Therefore, those works can exhibit suboptimal performance when deployed in a real wireless environment [21].
Transfer learning (TL) is a concept in ML which can be used to build more effective predictors in a domain with limited training data by training the model beforehand in a domain where data is readily available [22]. TL can be used to address the problem of data distribution change over time. Nevertheless, TL has typically been performed offline, which has limited its usage in online and real-time applications [23]. It is important to note that, unlike our work, most of the DL based wireless resource allocation solutions in the literature leverage only synthetic datasets rather than using both synthetic and real world wireless network datasets. Furthermore, most of the works use single point predictions on the time series of the considered key data metric without capturing any uncertainty in the predictions. The work in [24] presents a theoretical framework based on Bayesian inference for determining model uncertainty in NNs using dropout. The work in [11] further explores other kinds of uncertainties associated with NNs and proposes a framework to capture them systematically. Capturing uncertainties via the use of a PI is better in the sense that it gives extra freedom to determine up to which level the predictions from the model can be trusted. Besides, our work shows that uncertainty aware predictions can be used to improve a wireless resource allocation algorithm, as the uncertainty estimate from the prediction model can be used by the algorithm to deploy alerts for possible reallocation of channel resources when resource utilization at an AP exhibits unusual behavior.
FIGURE 2. Proposed proactive resource allocation driven network architecture.
VOLUME 4, 2021
Recently, cloud managed networks have established themselves as efficient players in the operation and management of medium to large-scale deployments of wireless networks [25], [26]. A cloud managed network can collect descriptive analytics to track the right metrics and use DL on the metrics data to make predictions that can be used to improve resource allocation in such networks. Our proposed DL driven resource allocation algorithm targets cloud managed enterprise networks in unlicensed shared spectrum bands, such as a network deployed in a university campus, a large office building, or an airport. The proposed algorithm not only performs proactive resource allocation for multiple APs based on predictions from the developed BNN, but also keeps track of the uncertainty due to model misspecification in real time. The uncertainty estimate from the prediction model is used to improve the performance of the resource allocation algorithm. To the best of our knowledge, this is the first time uncertainty aware predictions have been applied in a wireless resource allocation scenario.

III. SYSTEM MODEL
We consider a set of $N$ APs denoted by $\mathcal{N} = \{1, 2, \dots, N\}$ and a set of $M$ unlicensed channels denoted by $\mathcal{M} = \{1, 2, \dots, M\}$. The $M$ channels are utilized by the APs of the enterprise wireless local area network (WLAN). Each AP in the network is denoted by $\alpha_i$, which represents the $i$th AP. The network's resource allocation is managed by a cloud managed resource controller. The basic system model illustrating a wireless network with predictions/resource allocation modules is presented in Fig. 2.
Enterprise WLANs often exhibit patterns in CU over certain time periods, such as a day or a week. However, although the CU often has recurring patterns, these can be affected by uncertain events, such as an abrupt increase in channel usage by wireless users within a short period of time. Our first goal is to use a confidence aware deep (CAD) predictions technique that can not only predict wireless CU values but can also reliably estimate the uncertainty in the predicted values. The uncertainty estimates allow us to quantify how much to trust the predictions produced by the DL model. Our second goal is to present an application of the CAD predictions to a proactive channel allocation algorithm.
In our work, along with synthetic data, we also use real wireless CU data. To consider a real enterprise WLAN, we collected CU data over a period of 5 weeks in the busiest parts of the University of Oulu. CU is an important wireless physical layer resource utilization metric that provides information about the health of an enterprise WLAN [6], [27]. We measured the CU using hardware-accelerated spectrum analytics devices that we implemented on Xilinx Zynq-7000 system on chip (SoC) devices [13]. Every 20 seconds, each implemented device outputs a measured CU value for that time duration. Hence, our datasets are CU time series.

B. PREDICTIONS/RESOURCE CONTROLLER
The cloud managed resource controller system shown in Fig. 2 utilizes predictions and their uncertainty estimates not only to perform proactive allocations for multiple APs periodically, but also to adjust allocations instantly to any significant changes in wireless CU.

IV. ENCODER DECODER BASED DEEP LEARNING MODEL
We use an encoder-decoder based recurrent BNN to build the DL model, as it is a type of NN well-suited to predicting not only time series values, but also uncertainty estimates for the predicted values. The encoder-decoder network processes a time series step-by-step, maintaining an internal state summarizing the information it has seen so far. Over time, it learns what and how much to keep from the past and how much information to keep from the present state, which makes it powerful compared to other NNs.

A. MODEL INPUT
Consider a univariate CU time series $s = \{s[0], s[1], s[2], \dots, s[T-1]\}$. To formulate the prediction problem as a supervised machine learning problem, we adopt a sliding window approach. A regressor vector $x_t$ is composed by sliding a fixed window of size $n_H$ across the time series, which generates sequences of time lagged data as shown in Figs. 3 and 4. The generated sequence is given as input to a predictor $f_\theta$, parameterized by $\theta$, which aims to forecast the next $n_F$ values of the time series.
The regressor vector at discrete time $t$ is defined as
$$ x_t = \big[s[t - n_H + 1], \dots, s[t]\big] \in \mathbb{R}^{n_H}. $$
The predictor $f_\theta$ needs to infer the next $n_F$ samples, represented by the vector $y_t = \big[s[t+1], \dots, s[t+n_F]\big] \in \mathbb{R}^{n_F}$. Similarly, we denote the inference from the predictor as $\hat{y}_t = f_\theta(x_t) \in \mathbb{R}^{n_F}$.
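The sliding window construction can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in this work; the function name `make_windows` and the toy CU values are assumptions made for the example.

```python
def make_windows(series, n_h, n_f):
    """Split a univariate series into (regressor, target) pairs.

    For each valid t: x_t = [s[t-n_h+1], ..., s[t]] and
    y_t = [s[t+1], ..., s[t+n_f]].
    """
    inputs, targets = [], []
    # t is the index of the last sample inside the input window.
    for t in range(n_h - 1, len(series) - n_f):
        inputs.append(series[t - n_h + 1 : t + 1])
        targets.append(series[t + 1 : t + 1 + n_f])
    return inputs, targets


cu = [10, 12, 15, 11, 9, 14, 20]  # made-up CU percentages
X, Y = make_windows(cu, n_h=3, n_f=2)
print(X[0], Y[0])  # first window and the two values that follow it
```

Each pair shifts the window by one sample, so a series of length T yields T - n_H - n_F + 1 training pairs.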
Let us assume that the exact function which maps the input vector $x_t$ to the output vector $y_t$ is $\tilde{f}$, so that $y_t = \tilde{f}(x_t)$ for all $t$. When training the model, the learning algorithm adapts the parameter $\theta$ so that $f_\theta$ approximates $\tilde{f}$ as closely as possible according to some performance metric. Using the mean absolute error (MAE) as the performance metric in training the DL model, the MAE loss function for the learning problem is given by
$$ L(\theta) = \frac{1}{|\mathcal{X}|} \sum_{x_t \in \mathcal{X}} \frac{1}{n_F} \left\| y_t - f_\theta(x_t) \right\|_1 \tag{1} $$
where $\mathcal{X}$ denotes the set of regressor vectors. Supervised learning solves an optimization problem to find the function which minimizes the loss function given by (1). Let the optimal function be $f_{\hat{\theta}}$; then, using (1), the optimization problem for the training can be formulated as
$$ \hat{\theta} = \arg\min_{\theta} L(\theta). \tag{2} $$

B. MODEL UNCERTAINTY AND INHERENT NOISE
Let $f_{\hat{\theta}}$ be the trained DL model, with $\hat{\theta}$ representing the fitted weights. For a new sample point $x^*$, the prediction from the model is given by $\hat{y}^* = f_{\hat{\theta}}(x^*)$. By computing the standard error $\sigma$ of the predictions, the uncertainties associated with the model can be captured. The resulting PI can then be constructed as $[\hat{y}^* - z_{\alpha/2}\sigma,\ \hat{y}^* + z_{\alpha/2}\sigma]$, where $z_{\alpha/2}$ corresponds to the upper $\alpha/2$ quantile of the standard normal distribution.
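As a small worked example of this interval, the following Python sketch computes the two PI boundaries from a point prediction and a standard error. The function name `prediction_interval` and the numeric inputs are illustrative assumptions; the quantile is taken from the standard library's `statistics.NormalDist`.

```python
from statistics import NormalDist


def prediction_interval(y_hat, sigma, alpha=0.05):
    """Return (lower, upper) = y_hat -/+ z_{alpha/2} * sigma."""
    # z_{alpha/2} is the upper alpha/2 quantile of the standard normal.
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return y_hat - z * sigma, y_hat + z * sigma


# Made-up prediction: 40% CU with a standard error of 5 percentage points.
lower, upper = prediction_interval(y_hat=40.0, sigma=5.0, alpha=0.05)
print(round(lower, 2), round(upper, 2))
```

With alpha = 0.05 the multiplier is approximately 1.96, which gives the familiar 95% interval.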
Bayesian probability theory provides a robust approach to address and quantify the uncertainties associated with a DL model. Let $X, Y$ denote the observations used to train the model. Using Bayesian probability theory, the predictive probability density of the model for a new data point $x^*$ can be obtained by
$$ p(y^* \mid x^*, X, Y) = \int p\big(y^* \mid f_\theta(x^*)\big)\, p(\theta \mid X, Y)\, d\theta. \tag{3} $$
Estimating the posterior density $p(\theta \mid X, Y)$ is important for accurately quantifying the prediction uncertainty. Various inference methods are available to approximate the posterior density in DL models. Due to its simplicity of implementation, we use Monte Carlo (MC) dropout to approximate the model uncertainty. Dropout is the process of randomly dropping out hidden units in a DL model with a certain probability $p$. By applying dropout stochastically $K$ times at testing, the uncertainty associated with the predictions can be quantified as
$$ \widehat{\mathrm{Var}}(\hat{y}^*) = \frac{1}{K} \sum_{k=1}^{K} \left( \hat{y}^*_{(k)} - \bar{\hat{y}}^* \right)^2 + \sigma^2 \tag{4} $$
where $\hat{y}^*_{(k)}$ is the model output at the $k$th stochastic run with dropout applied and $\bar{\hat{y}}^*$ denotes the mean of the $K$ outputs.
The variance in (4) consists of two terms which correspond to model uncertainty and inherent noise, respectively. The inherent noise term $\sigma^2$ can be estimated from an independent validation set. Let the validation set be $X' = \{x'_1, \dots, x'_V\}$, $Y' = \{y'_1, \dots, y'_V\}$ and the model trained on the training data be $f_{\hat{\theta}}(\cdot)$; then $\sigma^2$ is estimated by
$$ \hat{\sigma}^2 = \frac{1}{V} \sum_{v=1}^{V} \left( y'_v - f_{\hat{\theta}}(x'_v) \right)^2. \tag{5} $$
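The two variance terms can be illustrated with a toy example in plain Python. The sketch below is not the BNN used in this work: `make_toy_model` is a hypothetical stand-in (a fixed linear model with dropout applied to its inputs), and all numbers are made up. It only shows how the sample variance over K stochastic passes and the validation residuals are combined into one standard error.

```python
import math
import random
from statistics import fmean


def make_toy_model(weights, p, rng):
    """Hypothetical stand-in for a trained network (not the paper's model)."""

    def deterministic(x):
        return sum(w * xi for w, xi in zip(weights, x))

    def stochastic(x):
        # Inverted dropout on the inputs: drop each feature with probability p.
        kept = [xi / (1.0 - p) if rng.random() >= p else 0.0 for xi in x]
        return deterministic(kept)

    return deterministic, stochastic


def model_variance(stochastic, x_new, k_runs):
    """Mean and sample variance of K stochastic forward passes."""
    outputs = [stochastic(x_new) for _ in range(k_runs)]
    mean = fmean(outputs)
    return mean, fmean([(o - mean) ** 2 for o in outputs])


def inherent_noise(deterministic, x_val, y_val):
    """Mean squared residual on a held-out validation set."""
    return fmean([(y - deterministic(x)) ** 2 for x, y in zip(x_val, y_val)])


rng = random.Random(0)
deterministic, stochastic = make_toy_model([0.5, 0.5], p=0.2, rng=rng)
x_val = [[10.0, 20.0], [30.0, 40.0]]
y_val = [16.0, 34.0]  # made-up targets, each off by exactly 1
mean, var_model = model_variance(stochastic, [20.0, 30.0], k_runs=100)
var_noise = inherent_noise(deterministic, x_val, y_val)
total_std = math.sqrt(var_model + var_noise)  # standard error used for the PI
```

In a real network the stochastic pass would keep dropout active in every layer at inference time; only the aggregation step shown here carries over unchanged.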

C. ENCODER-DECODER FRAMEWORK FOR MODEL MISSPECIFICATION
We capture the model misspecification by using an encoder-decoder framework formed using LSTM layers. When the encoder-decoder framework is trained on the training data, the encoder creates a latent embedding space which extracts different features from the time series. If the test data have patterns different from the training data, the encoder will not be able to map them correctly to the latent embedding space. Therefore, by pre-training the encoder-decoder framework, we can quantify the uncertainty due to model misspecification. Fig. 5(a) shows the encoder-decoder network used in the pre-training phase. This uncertainty is incorporated into the variance calculation by connecting the encoder to a prediction network and treating the result as a single network, which we call the inference network. Let $f(\cdot)$ be the encoder model and $g(\cdot)$ be the prediction network; then the resulting inference network $h(\cdot)$ can be written as the composite model $h(\cdot) = g(f(\cdot))$. Fig. 5(b) shows the inference network created in this way. Let the input sequence vector to the model be $x = (x_1, \dots, x_{n_H})$; then the encoder forms the vector $e = f(x)$ in the latent embedding space, and the prediction network $g$ generates the final output taking the vector $e$ as its input. MC dropout is applied stochastically to all layers in both the encoder and the prediction network in each of $K$ forward passes. Applying dropout randomly in the encoder captures the uncertainty due to model misspecification. The dropout applied in the LSTM layer of the encoder covers both the input and the recurrent states.

D. MODEL IMPLEMENTATION FOR REAL CU DATASET
For the real CU data, the encoder-decoder network is formed using two LSTM layers, which consist of 32 LSTM cells in the first layer (which gives a dimension of 32 for the latent embedding space) and 10 LSTM cells in the second layer, with tanh activation in all layers. The prediction network consists of three fully connected layers with 32, 16 and 10 hidden units, respectively, with tanh activation in each layer. The numbers of layers, LSTM cells and hidden units are selected heuristically to obtain the best prediction performance. We use a sliding window as shown in Fig. 3 to generate the $x_t$ and $y_t$ vectors and use them to train the network. The steps involved in the model implementation are presented in Algorithm 1.
Algorithm 1 (final steps):
    // Calculate prediction and PIs
    $\hat{y}$ ← InferenceNetwork($x^*$)
    $\hat{y}_{pi\_lb} = \hat{y} - z_{\alpha/2}\sigma$   // lower boundary
    $\hat{y}_{pi\_ub} = \hat{y} + z_{\alpha/2}\sigma$   // upper boundary
    Output: $\hat{y}$, $\hat{y}_{pi\_lb}$, $\hat{y}_{pi\_ub}$
The training of the network takes place in two phases. In the first (pre-training) phase, the encoder-decoder network is fitted to the training data. Let the input sequence vector of the univariate time series at time $t$ be $x_t = \big[s[t - n_H + 1], \dots, s[t]\big]$. The encoder takes $x_t$ and maps it to a low dimensional vector $e_t$ in the embedding space. The decoder then learns to recreate the output sequence vector $y_t = \big[s[t+1], \dots, s[t+n_F]\big]$ from $e_t$. In this way, the encoder learns to extract the relevant features present in the input time series. In the second phase, we use the encoder to encode the input vector $x_t$ to $e_t$, and we train the prediction network to predict $y_t$ using $e_t$ as the input. Once the training of the prediction network is finished, we cascade the encoder with the prediction network to form the inference network, as shown in Fig. 5(b). Note that for the same inputs, the outputs from the encoder-decoder network and the inference network would be different, as shown in Fig. 5.
We use MC dropout as discussed in Section IV-B to quantify the uncertainty and generate the PIs. The values of $K$ and $p$ are selected heuristically. $K$ should be selected such that it generates a smooth PI. Nevertheless, a very large value of $K$ greatly increases the computational time. In our case, we used $K = 100$. For $p$, we restrict the value to $0 < p \le 0.5$. The value of $p$ can be selected heuristically by calculating the resulting empirical coverage. In statistics, the empirical coverage of predictions is defined as the proportion of the samples of interest which are contained in the PI. $p$ is selected such that the correct empirical coverage is obtained.
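The empirical coverage and the heuristic choice of the dropout probability can be sketched as follows. This is an illustrative sketch only: the helper names `empirical_coverage` and `pick_dropout`, the interval values and the per-p coverage numbers are all made up for the example and are not results from this work.

```python
def empirical_coverage(observed, lower, upper):
    """Fraction of observed values that fall inside their PI."""
    inside = sum(1 for y, lo, hi in zip(observed, lower, upper) if lo <= y <= hi)
    return inside / len(observed)


def pick_dropout(coverage_by_p, target):
    """Choose the candidate p whose coverage is closest to the nominal level."""
    return min(coverage_by_p, key=lambda p: abs(coverage_by_p[p] - target))


observed = [30.0, 42.0, 55.0, 61.0]   # made-up CU observations
lower = [25.0, 35.0, 40.0, 50.0]      # made-up PI lower bounds
upper = [35.0, 45.0, 50.0, 65.0]      # made-up PI upper bounds
coverage = empirical_coverage(observed, lower, upper)

# Hypothetical coverage measured for three candidate dropout probabilities.
candidates = {0.1: 0.80, 0.3: 0.90, 0.5: 0.94}
best_p = pick_dropout(candidates, target=0.95)
```

In practice each candidate coverage would come from running the K stochastic passes on a held-out set with that value of p.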
We set the sliding window size to cover 7 days of CU data and subsample the data by a factor of 32. The models are trained to predict a time horizon of 1.5 hours. We train all the models using the RMSprop optimizer with MAE as the loss function. All the models are implemented on the TensorFlow 2.0.0 machine learning platform in a Python 3.7 environment.
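To give a feel for the resulting sequence lengths, the short calculation below converts these settings into sample counts. It assumes that the factor-of-32 subsampling is applied directly to the 20-second measurements described in Section III; the variable names are illustrative, and how the horizon is rounded to whole steps is not specified here.

```python
SAMPLE_PERIOD_S = 20      # one CU value every 20 seconds (Section III)
SUBSAMPLE_FACTOR = 32     # subsampling factor stated above

# Time between two consecutive samples after subsampling (assumption:
# subsampling keeps one value per 32 raw measurements).
step_s = SAMPLE_PERIOD_S * SUBSAMPLE_FACTOR

# Number of subsampled points in a 7-day input window.
window_samples = 7 * 24 * 3600 // step_s

# Length of the 1.5-hour horizon expressed in subsampled steps.
horizon_steps = 1.5 * 3600 / step_s

print(step_s, window_samples, horizon_steps)
```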

E. PREDICTION MODEL RESULTS
In this section, we use real CU data to evaluate the performance of the CAD model with respect to prediction and uncertainty. Prediction performance is evaluated by comparing the results with several other prediction models.

1) Uncertainty estimation
We measure the performance of the model in estimating the uncertainty in the predictions by calculating the empirical coverage. We use the MC dropout probability $p$ and calculate the resulting empirical coverage of the calculated PI. For $p = 0.5$, the resulting empirical coverage for different standard score values is given in Table 2. The standard score is the number of standard deviations by which a sample value is above or below the mean value. For a sample value $x$, it is calculated as $(x - \mu)/\sigma$, where $\mu$ is the mean and $\sigma$ is the standard deviation. In the table, we can see that the empirical coverage values calculated for the PIs from our proposed model are very close to the expected values.
Next, we define single and multiple time-point predictions. Fig. 6(a) and Fig. 6(b) show the real CU values, the predicted CU values and the PIs calculated using MC dropout for the single time-point predictions (the point 1.5 hours in the future) and the multiple time-point predictions for the test data set. In single time-point predictions, the CAD model makes a prediction every sample period, whereas in multiple time-point predictions, the CAD model makes predictions every 1.5 hours. In Fig. 6(b), we can see that in multiple time-point predictions, the prediction values can fluctuate compared to single time-point predictions. In Fig. 6(b), it is also apparent that the PI widens when going from the closest to the furthest point in a multiple time-point prediction. Moreover, we can observe in the figures that most of the time, the actual value falls inside the 95% PI calculated by our model. Also, in Fig. 6(a), we can see that our proposed model gives a broad PI at the peaks of the CU time series. This makes sense because at the peaks, the prediction uncertainty is high, which can be explained by the phenomena of model uncertainty and model misspecification.
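For reference, the coverage that a PI of a given standard score is expected to achieve can be computed directly, assuming normally distributed prediction errors. The helper name `nominal_coverage` is an assumption made for this sketch.

```python
from statistics import NormalDist


def nominal_coverage(z):
    """Probability mass of a standard normal within +/- z standard deviations."""
    return 2.0 * NormalDist().cdf(z) - 1.0


for z in (1.0, 1.645, 1.96):
    # Roughly 68%, 90% and 95% under the normality assumption.
    print(z, round(nominal_coverage(z), 3))
```

An empirical coverage well below these values indicates PIs that are too narrow, while a value well above indicates overly conservative PIs.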

2) Prediction performance of the proposed model
To evaluate the prediction performance, we compare the results to three different models: i) a daily naive predictor: the forecasts for a given day are equal to the values of a full [...]; ii) an LSTM model; and iii) a GRU model. To compare the prediction accuracy of the models, we use the measure called MAPE, which is given by
$$ \mathrm{MAPE} = \frac{100\%}{n} \sum_{t=1}^{n} \left| \frac{y_t - \hat{y}_t}{y_t} \right| \tag{6} $$
where $y_t$ and $\hat{y}_t$ are the actual and predicted CU values and $n$ is the number of evaluated samples.
To compare the model performance, we use the test data shown in Fig. 8(a) and new test data, shown in Fig. 8(b), which include new unseen samples with a distribution different from the data distribution available at DL model training time. The duration of the introduced new test data is around two hours. According to Table 3, the MAPE of the proposed model for the original test data is comparable to that of the LSTM and GRU models. For the new test data with unseen samples, the proposed model outperforms the other models in terms of MAPE. Furthermore, the performance degradation of the proposed model in the presence of unseen samples is the lowest, which results in the best robustness measure across all the tested models. We conclude that the proposed model has the most stable and robust predictive performance among the benchmark models.
Fig. 9(a), 9(b), 9(c) and 9(d) show how each model behaves in the presence of new unseen samples which represent a change in the CU distribution. Although the proposed DL model shows an improvement in terms of MAPE, it is clear from the figures that no model can correctly predict the future in the presence of test data which represent a change in distribution. This makes sense because the new test samples were not present in the training data. This shows that a change in data distribution significantly affects the performance of DL models.
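The MAPE computation itself is straightforward; a minimal Python sketch with made-up values is given below. The function name `mape` is illustrative, and the sketch assumes that no actual value is zero, since the percentage error is undefined in that case.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    terms = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(terms) / len(terms)


actual = [50.0, 40.0, 80.0]      # made-up observed CU values
predicted = [45.0, 44.0, 80.0]   # made-up forecasts
error = mape(actual, predicted)
print(round(error, 2))
```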
Further, it can be observed in Fig. 9(d) that the observed values lie outside the PI generated by our proposed model. This helps us to identify the change in the new test data as a change in the CU distribution. In the next section, we use this feature to trigger alarms for the improvement in channel reallocation in a real enterprise WLAN.

V. APPLICATION OF CAD PREDICTIONS TO WIRELESS RESOURCE ALLOCATION
In this section, we present an application of the developed CAD predictions model to frequency channel resource allocation for a cloud managed WLAN operating in unlicensed spectrum. The proposed resource allocation framework jointly addresses two QoS criteria: 1) channel quality, by taking into account the signal-to-interference-plus-noise ratio (SINR); and 2) the amount of airtime required by an AP for a variety of wireless applications, by taking into account its CU demand. Based on individual and group preferences, we design two metric scores that take into account the CU predictions from the CAD model. The metric scores are utilized to make channel allocation decisions in the proposed algorithm, called ProReact. The steps involved in the algorithm are presented in Algorithm 2. The main concept behind the proactive channel allocation of the proposed algorithm can be summarized as follows. The cloud controller periodically collects the CU data from the APs in each channel $k$. The CAD predictions model calculates the CU PI upper bounds, denoted by $\hat{y}^k_{pi\_ub}$, for the next allocation period. The controller then utilizes a metric score which takes into account the maximum of the obtained $\hat{y}^k_{pi\_ub}$ and the data transfer rates at the APs, denoted by $R_i^k$, to generate a new proactive channel plan $S$ for the next allocation period, and delivers the updated configuration to the APs. This proactive channel allocation process corresponds to lines 4-6 in Algorithm 2.
In wireless networks, CU may exhibit different behavior than usual due to the dynamic usage demand of the users connected to the AP. For this reason, a proactive channel assignment algorithm based only on predicted CU values (we use the maximum of $\hat{y}^k_{pi\_ub}$, see lines 4-5 of Algorithm 2) may not be sufficient to ensure good real-world performance. To overcome this limitation, our proposed algorithm uses the maximum of the PI estimate, denoted by $\breve{A}^k$, to raise an alert for possible channel reallocation in real time. In each allocation period, the cloud controller observes whether the real-time CU in a channel, denoted by $A^k$, exceeds $\breve{A}^k$. If the real-time CU exceeds $\breve{A}^k$, the cloud controller allocates channels reactively by generating a new channel plan $S$ using the real-time CU and delivering the updated configuration to the APs. This reactive channel allocation process corresponds to lines 8 and 9 in Algorithm 2.
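The alert condition can be illustrated with a small Python sketch. It only shows the comparison between the real-time CU and the maximum PI upper bound per channel; the function name `channels_needing_reallocation`, the channel numbers and the CU values are assumptions for the example, and the reallocation step itself is not reproduced.

```python
def channels_needing_reallocation(realtime_cu, pi_upper_max):
    """Return the channels whose real-time CU exceeds the maximum PI upper bound."""
    return [k for k, cu in realtime_cu.items() if cu > pi_upper_max[k]]


# Made-up values (in percent) for three channels.
realtime_cu = {1: 35.0, 6: 80.0, 11: 20.0}
pi_upper_max = {1: 50.0, 6: 60.0, 11: 45.0}

alerts = channels_needing_reallocation(realtime_cu, pi_upper_max)
if alerts:
    # In the algorithm, this is where a new channel plan would be generated
    # from the real-time CU instead of the predicted CU.
    print("reactive reallocation triggered by channels:", alerts)
```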
There are two major advantages in utilizing the PIs provided by the BNN in channel allocation. The PIs allow the channel allocation algorithm 1) to take into account the various kinds of uncertainties associated with the predictions when performing channel allocation, and 2) to adapt its channel allocation decisions to anomalous CU levels.

A. PROPOSED METRIC SCORES
We present the following individual metric, which the cloud controller uses to evaluate the effectiveness of an AP when it is allocated to a channel $k$. The individual metric score is meaningful from the perspective of the cloud performing channel selections in a wireless system, as it captures usefulness in terms of the CU and the rate an AP obtains from the allocated channel. Higher values of the metric imply that the AP will have a better wireless experience. The metric $I_i^k$, given in (7), estimates the effectiveness of $\alpha_i$ on a channel $k$, where $B_i$ and $\mathrm{SINR}_i$ denote the bandwidth and the SINR of $\alpha_i$ on channel $k$, respectively, $\hat{A}_i^k$ denotes the CU demand of $\alpha_i$ on channel $k$ (which is generated by the applications running on the devices connected to $\alpha_i$), and $A_i^k$ is the CU obtained by $\alpha_i$ when allocated to channel $k$. $A_i^k$ is calculated as in (8) [28], where $A^k$ is the total obtainable CU on channel $k$ and $C^k$ represents the set of APs present in channel $k$. $A_i^k$ can be explained as follows. When the sum of the CU demands of the APs in channel $k$ is less than or equal to the total available CU, the obtained CU of $\alpha_i$ is equal to its CU demand. However, when the total CU demand of the APs in the channel exceeds $A^k$, $\alpha_i$ can still expect to get its fair share of the CU, which is at most $1/|C^k|$.
[Algorithm 2 (ProReact), recoverable steps: while true, monitor for alerts; line 8: if $A^k > Ȃ^k$, perform reactive channel allocation based on real-time CU; allocation function: AP_list ← random AP order; line 17: for each AP $i$ in AP_list; line 18: for each channel $k \in M$; line 19: calculate $A_i^k$, see (8), and compute the metric score; line 20: if INDIVIDUAL_METRIC_SCORE then ...; return $S$; line 30: end function.]
The individual metric score I has two important properties: i) If an AP experiences high interference on channel k, then its rate will decrease and hence the score I will decrease; ii) I can only increase in terms of obtained CU as long as it is less than the CU demand. When the CU demand is satisfied, then I cannot further increase in terms of obtained CU.
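The two properties can be illustrated with a small sketch. The exact forms of (7) and (8) are not reproduced here, so the code rests on two labeled assumptions: the obtained CU is the demand when the channel is not overloaded and otherwise the demand capped at an equal split of the channel, and the score is the obtained CU multiplied by the Shannon rate $B \log_2(1 + \mathrm{SINR})$. All function names and numbers are hypothetical.

```python
import math


def obtained_cu(demands, total_cu=1.0):
    """Obtained CU of each AP sharing one channel.

    ASSUMPTION: if the channel is overloaded, capacity is split equally and
    an AP never receives more than its own demand.
    """
    if sum(demands) <= total_cu:
        return list(demands)
    fair_share = total_cu / len(demands)
    return [min(d, fair_share) for d in demands]


def individual_metric(cu_obtained, bandwidth_mhz, sinr_linear):
    """ASSUMED score: obtained CU times the Shannon rate (in Mbit/s)."""
    return cu_obtained * bandwidth_mhz * math.log2(1.0 + sinr_linear)


# Underloaded channel: every AP obtains exactly its demand.
light = obtained_cu([0.2, 0.3])
# Overloaded channel: large demands are capped at the equal share.
heavy = obtained_cu([0.6, 0.7, 0.1])

# Property i): more interference (lower SINR) lowers the score.
score_clean = individual_metric(0.5, 20.0, 3.0)
score_noisy = individual_metric(0.5, 20.0, 1.0)
```

Under these assumptions the score cannot grow once the demand is met, because `obtained_cu` never returns more than the demand, which mirrors property ii).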
Let the current channel of $\alpha_i$ be $k$, and let $\tilde{k}$ be any other channel that the cloud controller can select for $\alpha_i$. The individual metric scores of $\alpha_i$ on these channels are then $I_i^k$ and $I_i^{\tilde{k}}$, respectively.
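The second metric score we consider is the marginal metric score, denoted by $MS_i^k$. Its important property is that it takes into account the sum of the individual metric scores of all APs present in the channel. The marginal metric score of $\alpha_i$ with respect to channel $k$ is defined as the difference between the sum of the individual metric scores of the APs in the channel with $\alpha_i$ present and the same sum without $\alpha_i$ in the channel. Writing $I_j^k(C)$ for the individual metric score of $\alpha_j$ when the set of APs in channel $k$ is $C$, the marginal metric score of $\alpha_i$ can be given as
$$MS_i^k = \sum_{\alpha_j \in C^k} I_j^k(C^k) - \sum_{\alpha_j \in C^k \setminus \{\alpha_i\}} I_j^k(C^k \setminus \{\alpha_i\}). \tag{9}$$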
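The definition of the marginal metric score can be made concrete with a short sketch. It is only an illustration under stated assumptions: the individual score is taken to be the obtained CU times a fixed per-AP rate, the obtained CU uses an equal-split rule when the channel is overloaded, and all names and values are hypothetical.

```python
def sum_metric(members, demand, rate, total_cu=1.0):
    """Sum of the (assumed) individual scores of the APs sharing one channel."""
    members = list(members)
    if not members:
        return 0.0
    total_demand = sum(demand[j] for j in members)
    fair_share = total_cu / len(members)
    total = 0.0
    for j in members:
        # ASSUMPTION: demand is met if the channel is not overloaded,
        # otherwise it is capped at an equal share of the channel.
        got = demand[j] if total_demand <= total_cu else min(demand[j], fair_share)
        total += got * rate[j]
    return total


def marginal_metric(i, members, demand, rate):
    """Sum metric of the channel with AP i minus the sum metric without it."""
    with_i = set(members) | {i}
    without_i = with_i - {i}
    return sum_metric(with_i, demand, rate) - sum_metric(without_i, demand, rate)


# Two hypothetical APs with equal rates whose joint demand overloads the channel.
demand = {0: 0.6, 1: 0.6}
rate = {0: 10.0, 1: 10.0}

ms_ap0 = marginal_metric(0, [0, 1], demand, rate)
own_score_ap0 = sum_metric([0, 1], demand, rate) - 0.5 * rate[1]
```

Here AP 0 scores about 5 on its own account, but its marginal score is only about 4, because its presence reduces the CU obtained by AP 1 from 0.6 to 0.5. This is the externality that the marginal metric score is designed to capture.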

B. ALLOCATION DECISION RULES AND BEST RESPONSE
Before presenting the allocation decision rules, we first present a definition related to the concept of best response (BR) updates.
Definition 4. Consider an allocation update in which the cloud controller changes the channel of $\alpha_i$ while the allocations of all other APs are kept unchanged. We say that such an update is a better response update for $\alpha_i$ when its metric score is strictly increased by the change, i.e., $I_i^{\tilde{k}} > I_i^k$ or $MS_i^{\tilde{k}} > MS_i^k$. Moreover, an allocation update is a BR update if it improves the metric score of $\alpha_i$ to the maximum possible value among all better responses for $\alpha_i$.
We consider and compare two different decision rules for the cloud controller to perform BR updates in the proposed Algorithm 2: 1) the individual metric score based decision rule; and 2) the marginal metric score based decision rule.

1) Individual metric score based decision
The BR update using this decision rule is calculated based only on an AP's own metric score. Under this decision rule, the BR update for $\alpha_i$, denoted by $BR_i$, can be given as
$$BR_i = \arg\max_{k' \in M} I_i^{k'}. \tag{10}$$

2) Marginal metric score based decision
Under this decision rule, a BR update is performed by taking into account not only the individual metric score of $\alpha_i$, but also the metric scores of all other APs which are affected by the BR update. Under this decision rule, the BR update for $\alpha_i$ can be given as
$$BR_i = \arg\max_{k' \in M} MS_i^{k'}. \tag{11}$$
Depending on which decision rule is used, the cloud controller updates the channel allocation plan by performing the corresponding BR updates. In Section VI, we evaluate the performance of the proposed algorithm through simulations.
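The difference between the two decision rules can be seen in a small sketch. The score model is an assumption (obtained CU times a fixed rate per AP and channel, with an equal split on overloaded channels), and the function names, the `rule` argument, and the numbers are hypothetical rather than taken from Algorithm 2.

```python
def scores_in_channel(members, k, demand, rate, total_cu=1.0):
    """Assumed individual scores of the APs sharing channel k, as a dict."""
    if not members:
        return {}
    total_demand = sum(demand[j] for j in members)
    fair_share = total_cu / len(members)
    return {
        j: (demand[j] if total_demand <= total_cu else min(demand[j], fair_share))
        * rate[j, k]
        for j in members
    }


def metric(i, k, alloc, demand, rate, rule):
    """Metric score AP i would get on channel k, all other APs unchanged."""
    others = {j for j, ch in alloc.items() if ch == k and j != i}
    with_i = scores_in_channel(others | {i}, k, demand, rate)
    if rule == "individual":
        return with_i[i]
    without_i = scores_in_channel(others, k, demand, rate)
    return sum(with_i.values()) - sum(without_i.values())


def best_response(i, alloc, channels, demand, rate, rule):
    """Channel that maximizes the chosen metric score for AP i."""
    return max(channels, key=lambda k: metric(i, k, alloc, demand, rate, rule))


# AP 0 already occupies channel 0 with a high demand; AP 2 is unallocated.
alloc = {0: 0, 2: None}
demand = {0: 0.9, 2: 0.5}
rate = {(0, 0): 10.0, (0, 1): 10.0, (2, 0): 10.0, (2, 1): 8.0}

br_individual = best_response(2, alloc, [0, 1], demand, rate, "individual")
br_marginal = best_response(2, alloc, [0, 1], demand, rate, "marginal")
```

With these numbers the individual rule sends AP 2 to channel 0, where its own score is highest, while the marginal rule sends it to the empty channel 1, because joining channel 0 would take CU away from AP 0.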

C. CONVERGENCE ANALYSIS
In this section, we analyze the convergence of the proposed cloud-based channel allocation algorithm when using the marginal metric score based decision rule. At each time instant, a network of wireless APs can be seen as an undirected graph in which the nodes represent the APs, and there is an edge between two nodes if they are within transmission range of each other. The resulting connectivity graph $G$ is undirected because interference and airtime competition between two APs in an unlicensed channel generally form a bidirectional link. We focus on the game-theoretic concept of convergence to an NE. The NE concept is suitable because it ensures that no AP has an incentive to unilaterally deviate from the given allocation. We next present some theorems and definitions from game theory to support our claims.
Definition 5. A pure strategy NE is an action profile of the players in a game in which each player's action is a BR to the actions of the other players. A formal definition of an NE outcome for channel allocation can be given as follows: a channel allocation profile $a = (a_1, a_2, \cdots, a_N)$ of an $N$-AP cloud-allocated solution is an NE if, for each $\alpha_i$, we have
$$MS_i(a_i, a_{-i}) \geq MS_i(a'_i, a_{-i}), \quad \forall a'_i \in M, \tag{12}$$
where $MS_i(a_i, a_{-i})$ denotes the marginal metric score of $\alpha_i$ on channel $a_i$ when the allocations of all other APs are $a_{-i}$. That is, the allocated channel $a_i$ of each $\alpha_i$ is a BR to the allocations $a_{-i}$ of all other APs.
Theorem V.1. When the BR updates for the APs are performed by the cloud, the channel allocation under the decision rule based on the marginal metric score is an exact potential game. Consequently, the channel allocation algorithm based on the marginal metric score results in an NE channel allocation profile.
Proof. The work in [29] has shown that the marginal contribution metric results in an exact potential game with a potential function $W$ that satisfies the following relationship for every $\alpha_i$ and every pair of channels $a_i$, $a'_i$ (with $a_{-i}$ the allocations of all other APs):
$$W(a'_i, a_{-i}) - W(a_i, a_{-i}) = MS_i(a'_i, a_{-i}) - MS_i(a_i, a_{-i}). \tag{13}$$
Moreover, every potential game has the finite best-response improvement property, which means that if the cloud continues to perform BR updates for the APs, the system ultimately reaches a pure NE. Each AP's improvement in a step is finite, and such a sequence of steps by the APs ends in an NE.
For the proposed channel allocation algorithm, the potential function $W$ is given as the sum of the individual metric scores of all APs,
$$W = \sum_{k \in M} \sum_{\alpha_j \in C^k} I_j^k. \tag{14}$$
Using the BR updates, when the cloud selects a new channel $\tilde{k}$ for $\alpha_i$ currently in channel $k$, the potential function changes only because of the allocation change in $k$ and $\tilde{k}$. Therefore, the difference in the potential function when the cloud selects a new channel $\tilde{k}$ can be given as
$$\Delta W = MS_i^{\tilde{k}} - MS_i^k. \tag{15}$$
Hence, (15) shows that the allocation decisions based on the marginal metric score lead to an exact potential game. Moreover, based on the finite BR improvement property of every potential game shown in [30], we can conclude that the cloud-based channel allocation algorithm using the marginal metric score guarantees convergence to an NE.
Note that it is not the NN itself that enables the convergence, but the combination of the NN output with the carefully designed resource allocation algorithm. Not every NN-based resource allocation solution will converge: convergence requires a carefully designed metric score and a resource allocation algorithm with carefully designed decision steps. The proposed resource allocation algorithm relies on the potential function property (refer to Theorem V.1) and the best response dynamics (refer to (11)) to ensure convergence. An NN that only provides predictions cannot be used directly in the proposed method.
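The convergence argument can be checked numerically on a toy instance. The sketch below is not the authors' simulator: it assumes a score equal to obtained CU times a fixed per-AP, per-channel rate, an equal-split rule on overloaded channels, a fixed (rather than random) AP order, and randomly drawn hypothetical demands and rates. It runs BR updates under the marginal rule and records the sum of individual scores, used here as the potential, after every update.

```python
import random


def channel_sum(members, k, demand, rate, total_cu=1.0):
    """Sum of the assumed individual scores of the APs in channel k."""
    if not members:
        return 0.0
    total_demand = sum(demand[j] for j in members)
    share = total_cu / len(members)
    return sum(
        (demand[j] if total_demand <= total_cu else min(demand[j], share)) * rate[j][k]
        for j in members
    )


def potential(alloc, n_channels, demand, rate):
    """Sum of individual scores over all channels (the potential W)."""
    return sum(
        channel_sum([j for j, ch in enumerate(alloc) if ch == k], k, demand, rate)
        for k in range(n_channels)
    )


def marginal(i, k, alloc, demand, rate):
    """Marginal metric score of AP i on channel k, other APs unchanged."""
    others = [j for j, ch in enumerate(alloc) if ch == k and j != i]
    return channel_sum(others + [i], k, demand, rate) - channel_sum(others, k, demand, rate)


def br_dynamics(alloc, n_channels, demand, rate, eps=1e-9):
    """Repeat BR updates until no AP can improve; return allocation and W trace."""
    alloc = list(alloc)
    trace = [potential(alloc, n_channels, demand, rate)]
    changed = True
    while changed:
        changed = False
        for i in range(len(alloc)):
            gains = [marginal(i, k, alloc, demand, rate) for k in range(n_channels)]
            best_k = max(range(n_channels), key=gains.__getitem__)
            if gains[best_k] > gains[alloc[i]] + eps:  # strictly better response
                alloc[i] = best_k
                trace.append(potential(alloc, n_channels, demand, rate))
                changed = True
    return alloc, trace


def is_nash(alloc, n_channels, demand, rate, eps=1e-9):
    """True if no AP can raise its marginal score by changing channel alone."""
    return all(
        marginal(i, k, alloc, demand, rate)
        <= marginal(i, alloc[i], alloc, demand, rate) + eps
        for i in range(len(alloc))
        for k in range(n_channels)
    )


rng = random.Random(0)  # fixed seed so the toy instance is reproducible
N, M = 8, 3
demand = [rng.uniform(0.05, 0.7) for _ in range(N)]
rate = [[rng.uniform(5.0, 20.0) for _ in range(M)] for _ in range(N)]
start = [rng.randrange(M) for _ in range(N)]

final, trace = br_dynamics(start, M, demand, rate)
```

Because each accepted update raises the marginal score of the moving AP, and by the exact potential property the same amount is added to the potential, `trace` is increasing and the loop stops at an allocation for which `is_nash` holds.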

VI. PERFORMANCE ANALYSIS OF CAD PREDICTION BASED WIRELESS RESOURCE ALLOCATION ALGORITHM
For the simulations, we consider $N$ APs to be allocated in $M$ channels. We evaluate the performance of the proposed cloud-based channel allocation algorithm in terms of the average sum metric and the average sum rate of the APs. We simulate the proposed algorithm under two scenarios, with low and high CU demands in the network, where low CU demands take the values $\hat{A}_i^k \in (0, 0.6]$ and high CU demands take the values $\hat{A}_i^k \in (0, 0.7]$. Our results focus on showing the convergence of the proposed method using the individual and marginal metric scores given by (7) and (9) with different initial channel allocation routines. We also show that the proposed channel allocation algorithm, which optimizes the average sum metric of the APs, also leads to optimized average sum rates of the APs.
To assess the effectiveness of the proposed method in different scenarios relative to the optimal solution, we compare the convergence of the proposed method with the optimal average sum metric obtained via the best NE. Note that it is shown in [30] that convergence to the best NE of a marginal contribution based solution leads to the optimal solution.
We also show the performance gain of the proposed CAD predictions based channel allocation method by using test data with different CU distribution in a WLAN.

A. CONVERGENCE RESULTS
We evaluate the performance of the proposed method under four scenarios: 1) initzero individual, where initially no AP is allocated to a channel and channel allocation is based on the individual metric score; 2) initzero marginal, where initially no AP is allocated to a channel and channel allocation is based on the marginal metric score; 3) initrandom individual, where initially APs are allocated to channels randomly with uniform distribution and channel allocation is based on the individual metric score; and 4) initrandom marginal, where initially APs are allocated to channels randomly with uniform distribution and channel allocation is based on the marginal metric score. Fig. 10(a)-(d) show the average sum metrics and average sum rates of the APs under these scenarios as a function of the step number. It can be seen from Fig. 10(a)-(d) that the better response updates increase both the average sum metric and the average sum rate with each step until the equilibrium is reached.
For both the low demand and the high demand cases shown in Fig. 10(a)-(d), we can see that the marginal metric score based method always converges closer to the optimal solution than the individual metric score based method. This can be explained as follows. In the individual metric score based method, although an AP selects a channel which gives it a better metric score than the current channel, this selection might degrade the performance of other APs in the selected channel and thus result in a lower sum metric. In the marginal metric score based method, in contrast, an AP only selects a channel if that results in a higher sum metric for all the APs in the selected channel. Accordingly, the channel allocation method based on the marginal metric score always outperforms the channel allocation method based on the individual metric score.
It can be seen from Fig. 10 that, irrespective of the initial channel allocation method, the algorithm converges to almost the same solution for both metric scores. Increasing the number of APs and the number of channels results in increased average sum metrics and average sum rates in the system. In Fig. 10, we can also see that the average sum metric and the average sum rate plots have the same shape. This implies that optimizing the average sum metric results in optimized average sum rates at the APs.
Finally, from the results, we can conclude that the marginal metric score based method has the best overall performance.

B. COMPLEXITY ANALYSIS
The complexity of the proposed Algorithm 2 per channel allocation step is within $O(NM)$. Additionally, from Fig. 10, we can identify that the number of channel allocation steps taken by the algorithm to converge is less than $2N$ for all the considered scenarios. Therefore, the total complexity of the channel allocation algorithm is within $O(N^2 M)$.

C. THE PRICE OF STABILITY AND PRICE OF ANARCHY
As the channel allocation solution can converge to more than one NE, we need to evaluate the efficiency of the obtained NEs. There are two measures in game theory which correspond to the best and the worst achieved NE. The best and the worst NE are evaluated by the price of stability (POS) and the price of anarchy (POA), respectively. The POS and POA can be given as

$$\mathrm{POS} = \frac{\text{value of best NE}}{\text{value of optimal solution}}, \qquad \mathrm{POA} = \frac{\text{value of worst NE}}{\text{value of optimal solution}}.$$
For the proposed channel allocation game, we have the following observation.
Observation 1. The proposed cloud-based channel allocation always achieves POS = 1 when using the marginal metric score. This is in accordance with the result shown in [30] that convergence to the best NE of a marginal contribution based solution leads to the optimal solution.
As POA compares the worst NE to the optimal solution, it effectively evaluates the largest performance gap which is incurred by the channel allocation solution. This means that POA can be considered as a lower bound for the convergence performance of the channel allocation solution: the solution always converges to an NE equal to or better than the NE corresponding to the POA. From Table 4, we can see that in all the cases the POA is greater than 0.85. This means that even when the channel allocation converges to the worst NE, the degradation in performance (i.e., the performance gap) compared to the global optimal solution is less than 15%. The other observation in Table 4 is that the POA obtained with the marginal metric score is greater than the POA obtained with the individual metric score. In Table 4, it can also be seen that the calculated POS for the individual metric score is very close to 1, which means that the best NE payoff when using the individual metric score is close to the optimal solution. From the results in Table 4, we can conclude that utilizing the marginal metric score for BR updates gives the best performance in the channel allocation method.
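For a small network, POS and POA can be computed by exhaustive enumeration, which gives a way to sanity-check Observation 1. The sketch below does not reproduce Table 4: the score model (obtained CU times a fixed per-AP, per-channel rate, with an equal split on overloaded channels), the demands, and the rates are hypothetical assumptions chosen only for illustration.

```python
from itertools import product


def channel_sum(members, k, demand, rate, total_cu=1.0):
    """Sum of the assumed individual scores of the APs in channel k."""
    if not members:
        return 0.0
    total_demand = sum(demand[j] for j in members)
    share = total_cu / len(members)
    return sum(
        (demand[j] if total_demand <= total_cu else min(demand[j], share)) * rate[j][k]
        for j in members
    )


def welfare(alloc, n_channels, demand, rate):
    """Sum metric of a complete channel allocation profile."""
    return sum(
        channel_sum([j for j, ch in enumerate(alloc) if ch == k], k, demand, rate)
        for k in range(n_channels)
    )


def marginal(i, k, alloc, demand, rate):
    """Marginal metric score of AP i on channel k, other APs unchanged."""
    others = [j for j, ch in enumerate(alloc) if ch == k and j != i]
    return channel_sum(others + [i], k, demand, rate) - channel_sum(others, k, demand, rate)


def is_nash(alloc, n_channels, demand, rate, eps=1e-9):
    """True if no single AP can improve its marginal score by moving."""
    return all(
        marginal(i, k, alloc, demand, rate)
        <= marginal(i, alloc[i], alloc, demand, rate) + eps
        for i in range(len(alloc))
        for k in range(n_channels)
    )


# Hypothetical instance: 5 APs, 2 channels.
demand = [0.6, 0.5, 0.4, 0.7, 0.3]
rate = [[12.0, 9.0], [8.0, 11.0], [10.0, 10.0], [7.0, 13.0], [9.0, 6.0]]
M = 2

profiles = list(product(range(M), repeat=len(demand)))  # all M**N allocations
optimum = max(welfare(a, M, demand, rate) for a in profiles)
ne_values = [welfare(a, M, demand, rate) for a in profiles if is_nash(a, M, demand, rate)]

pos = max(ne_values) / optimum  # best NE relative to the optimum
poa = min(ne_values) / optimum  # worst NE relative to the optimum
```

Under the marginal rule the welfare-maximizing profile is itself an NE, so `pos` evaluates to 1 in this sketch, while `poa` measures how far the worst equilibrium can fall below the optimum. Enumeration grows as $M^N$, so this check is only practical for small instances.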

D. PERFORMANCE EVALUATION USING CU DATA WITH NEW PATTERNS
We also evaluate the performance of the proposed CAD predictions based ProReact channel allocation method through simulations using real CU time series data. We consider a WLAN which consists of 8 APs and 3 unlicensed channels (i.e., $N = 8$, $M = 3$). To show the impact of the CAD predictions on the resource allocation method, the real CU data set also contains a CU pattern that differs from the regular CU. This different CU pattern is used to model a change of the CU distribution at the APs and its impact on the channel allocation. We test the algorithm by changing the duration of the different CU pattern under two cases (as shown in Table 5). We use the marginal metric score based channel allocation method, as it gives the best performance, which we established in Sections VI-A and VI-C. We evaluate the reallocation feature of the proposed Algorithm 2, which is driven by the different CU pattern, by comparing it with channel allocation where this feature is not utilized (i.e., steps 7 to 11 in Algorithm 2 are ignored). Table 5 shows the gain in the average sum metric obtained by our CAD predictions based channel allocation relative to the algorithm with no reaction to the different CU patterns. In Table 5, it can be seen that the proposed CAD predictions based channel allocation method, which takes the different CU patterns into account, achieves significant performance gains over the algorithm which does not take them into account for reallocation. The gain in the average sum metric can be explained as follows. When a certain AP in a channel has a high CU demand which cannot be satisfied in that channel, the metric scores of that AP and of the other APs in the same channel are reduced. The proposed CAD predictions based channel allocation algorithm detects this behavior and, by performing channel reallocation, moves the affected AP to a channel which can better satisfy its CU demand.
This improves its own metric score and the metric scores of other APs in the previous channel which helps to increase the overall sum metric. It is also worth noting that our algorithm either performs equally well or it improves the performance of a WLAN over channel allocation which does not take into account different CU patterns. Hence, there is no penalty in performance in existing systems by incorporating the proposed algorithm.

VII. CONCLUSION
In this paper, we have presented an uncertainty-aware DL model for robust prediction of wireless CU and real-time change detection in the CU distribution. To account for the uncertainty in the DL model, we have used an encoder-decoder framework based DL model using BNNs. Our results have shown that the prediction performance of the proposed model is as good as that of other models. In addition to predictions, however, the proposed model can consistently quantify the uncertainty in the predictions using PIs. By using the computed PIs, we have shown that we can accurately perform change detection in the CU distribution. We have also developed a channel allocation algorithm for WLAN called ProReact, which utilizes the predictions from the DL model to compute a novel metric score that is used to find an efficient channel allocation plan for the APs in the network. Moreover, the proposed algorithm also utilizes the CU change detection feature of the proposed DL model to perform channel reallocation when the CU distribution changes. Our results have shown that our channel allocation algorithm achieves fast convergence, leading to high sum rates in the network. One possible extension of our work that we envision is to address the problem of coexistence of the licensed assisted