Unsupervised Learning-Based Joint Power Control and Fronthaul Capacity Allocation in Cell-Free Massive MIMO With Hardware Impairments

A deep learning-based resource allocation algorithm that maximizes the sum rate of a limited-fronthaul cell-free massive MIMO network with transceiver hardware impairments is proposed in this letter. The sum rate maximization problem is considered under user power constraints and a total fronthaul capacity constraint shared between channel state information (CSI) and data transmission. A deep neural network (DNN), PowerNet, is proposed to learn solutions to the joint power control and capacity allocation problem in a low-complexity, flexible, and scalable way. An unsupervised learning approach is used, which eliminates the need to know the optimal resource allocation vectors during model training and hence yields a simpler and more flexible training stage. Numerical simulations show that PowerNet achieves sum rate performance close to that of the existing optimization-based approach, with a significantly lower time complexity that does not scale exponentially with the number of users and access points (APs) in the network. Furthermore, adding an online learning stage results in a better sum rate than the optimization-based method.


I. INTRODUCTION
CELL-FREE massive MIMO utilizes a large number of distributed APs, connected to a central processing unit (CPU) via fronthaul links, to serve a much smaller number of users distributed over a wide area using the same time-frequency resources, without cells or cell boundaries [1], [2], [3]. In cell-free massive MIMO, efficient utilization of the limited fronthaul links is essential to achieving improved performance [4], [5], [6]. Also, from a practical implementation perspective with lower deployment costs, it is more realistic to consider cell-free massive MIMO systems with non-ideal transceiver components, which may introduce hardware impairments into the network [6], [7], [8]. Furthermore, proper radio resource management (RRM) improves the performance of cell-free massive MIMO networks. Conventionally, RRM problems are solved using optimization methods, which are often infeasible in practice due to processing complexity, timing overhead, parameter sensitivity, and lack of flexibility [9]. On the other hand, the universal function approximation property of artificial neural networks has enabled deep learning (DL)-based techniques to be used for RRM with lower complexity than traditional optimization-based approaches [10], [11], [12], [13].
In this letter, we propose a novel low-complexity DL-based RRM method that performs uplink power control and fronthaul capacity allocation for sum rate maximization in a limited-fronthaul cell-free massive MIMO network with hardware impairments. We extend the DNN-based unsupervised learning approach of our previous work [13] and show its potential to solve joint optimization tasks, which, to the best of our knowledge, has not been investigated in prior work. Unsupervised learning allows the DNN to directly optimize the max-sum-rate objective as a custom loss function during training to produce power control and fronthaul capacity allocation outputs. We show that the DNN converges to acceptable solutions even with a more complex loss function involving the joint optimization variables, distortions due to hardware impairments in the transceivers, and quantization errors due to the limited fronthaul. The performance, time complexity, and flexibility of the proposed DL-based solution are compared with those of the optimization-based solution proposed in [6].

II. SYSTEM MODEL
We consider the uplink data transmission of a cell-free massive MIMO system where M single-antenna APs are distributed in a given area and simultaneously serve K randomly distributed single-antenna users. The mth AP, m ∈ {1, 2, . . . , M}, is connected to a CPU via a fronthaul link with limited capacity C_m [bits/s/Hz]. We assume that the wireless channels between APs and users are block-fading with coherence interval T [samples]. Thus, the channel coefficient between the kth user and the mth AP over a coherence interval is modeled as g_mk = √(β_mk) h_mk, for m = 1, 2, . . . , M and k = 1, 2, . . . , K. Here β_mk is the large-scale fading coefficient, consisting of pathloss and shadowing, and h_mk ∼ CN(0, 1) represents the small-scale fading between the kth user and the mth AP. We assume the network has non-ideal transceivers at the users and the APs. To analyze the impact of (residual) hardware impairments on communication, we use the hardware impairment model proposed in [6] and [14], where the desired signal is scaled by a deterministic factor and an uncorrelated memoryless distortion term is added, which is Gaussian distributed in the worst case. We assume all users have the same transmitter hardware quality factor ξ_t and all APs have the same receiver hardware quality factor ξ_r [6].
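As a minimal illustration of the block-fading model above, one channel realization can be drawn as follows (a hedged sketch; `generate_channels` and its arguments are our own names, with `beta` an M × K matrix of large-scale fading coefficients):

```python
import numpy as np

def generate_channels(M, K, beta, rng):
    """Draw one channel realization g_mk = sqrt(beta_mk) * h_mk with
    h_mk ~ CN(0, 1), i.e., unit-variance circularly symmetric complex
    Gaussian small-scale fading."""
    h = (rng.standard_normal((M, K)) + 1j * rng.standard_normal((M, K))) / np.sqrt(2)
    return np.sqrt(beta) * h
```

Averaged over many realizations, the per-link received power E[|g_mk|²] equals β_mk, matching the model above.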
The users first undergo a pilot transmission phase in order to estimate the uplink channel coefficients. We assume that τ-length orthogonal pilots are assigned to the users, where τ = K ≤ T. Uplink data transmission then takes place, where all users simultaneously send their signals to the APs. We assume the compress-forward-estimate (CFE) strategy for transmission from the APs to the CPU [6]. Specifically, upon receiving the pilot/data signals, each AP compresses the received vector element-wise and transmits the quantized version to the CPU. The CPU then performs maximum ratio combining using the received quantized data signals from all the APs and the estimated channel coefficients. The use-and-then-forget approach considered in [6] is used to obtain the effective received signal for the kth user. The detailed equations for each of the above steps are omitted due to space constraints and can be found in [6]. Following the derivations and notation in [6], for the CFE strategy, the achievable uplink rate R_k^CFE of the kth user can be obtained from the effective SINR_k^CFE, which follows from the derivations in [6].
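The element-wise compression at each AP can be illustrated with a generic uniform scalar quantizer (a stand-in for illustration only; the letter's actual quantization noise model and rate-distortion mapping follow [6, (13), (14)], which are not reproduced here):

```python
import numpy as np

def quantize_uniform(y, rate_bits, y_max):
    """Map each real sample to the midpoint of one of 2**rate_bits
    uniform cells on [-y_max, y_max]; complex samples would be
    quantized per real dimension."""
    levels = 2 ** rate_bits
    step = 2.0 * y_max / levels
    # Clip into the representable range, then snap to the cell midpoint.
    y_clipped = np.clip(y, -y_max, y_max - step / 2)
    idx = np.floor((y_clipped + y_max) / step)
    return -y_max + (idx + 0.5) * step
```

For in-range inputs the quantization error is bounded by half a step, and increasing the per-sample rate shrinks the step, which is the trade-off the fronthaul capacity split C_p,m/C_d,m controls.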
III. DEEP LEARNING-BASED JOINT POWER CONTROL AND FRONTHAUL CAPACITY ALLOCATION

As shown in [6], proper power control and fronthaul capacity allocation for pilots/CSI and data provide spectral efficiency gains in the uplink transmission. The optimization-based solution proposed there, however, is sub-optimal and too computationally complex to be feasible for practical implementation. In this letter, we propose a DL-based solution for the following sum rate optimization problem and compare its performance against the optimization-based solution proposed in [6]:

P1: max_{η_k, C_p,m, C_d,m} Σ_{k=1}^{K} R_k^CFE
    s.t. η_k ∈ [0, 1], ∀k,
         C_p,m + C_d,m = C_m, ∀m.    (2)
In (2), the first set of constraints, η_k ∈ [0, 1], ∀k, bounds the power control coefficient of the kth user. C_p,m and C_d,m are the fronthaul rates assigned for forwarding the quantized pilot and data signals, respectively, which add up to the total fronthaul capacity C_m. The pilot and data signal quantization noises Q_p,m and Q_d,m, which enter the channel estimation and the data detection respectively, are linked to C_p,m and C_d,m by [6, (13) and (14)].
We design and implement PowerNet, a DNN that solves the non-convex and NP-hard optimization problem P1 in (2), where the user power allocations and fronthaul capacity allocations are learned in a data-driven manner via unsupervised learning. The network consists of L + 1 sequentially connected layers producing the mapping f(x_0; θ): R^MK → R^(M+K). The large-scale fading coefficient vector x_0 = β ∈ R^MK is used as the input. 1 Since we need to learn optimal values for both sets of variables, the power control coefficients η_k, ∀k, and the capacity allocations C_p,m, ∀m, the hidden layers are constructed to enable learning their corresponding θ values independently. Thus, each hidden layer and the output layer consist of two components: one performing the lth-layer mapping for the power coefficients, and one performing the mapping for C_p,m, ∀m, the fronthaul capacity allocation for the CSI. The output vector of the lth layer is constructed by concatenating these two components, each produced by its own weight matrix and bias vector; in the power branch, the output corresponds to the power coefficient vector η = [η_1, η_2, . . . , η_K]^T. In the hidden layers, we use the eLU (exponential linear unit) activation function σ(·). 2 In the output layer, Sigmoid activation is used to guarantee that the power coefficients are in the range [0, 1]. Similarly, the capacity allocation branch has its own weight matrices and bias vectors in each layer, corresponding to the fronthaul capacity allocations C_p,m.

1 Since the objective of the DNN is sum rate optimization, giving the DNN an input that has a direct impact on the user rates is important. We have selected the large-scale fading coefficients (which consist of both pathloss and shadow fading) as the DNN input, since the user rates depend mainly on the large-scale fading channel values.
2 ReLU activation can also be used in the hidden layers, but eLU is preferable since it avoids the dying-ReLU problem associated with ReLU. Also, the both positive and negative outputs produced by eLU help the network steer its weights and biases in the right direction. Our simulations also showed that eLU activations gave better results than ReLU.
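The dying-ReLU point in the footnote can be seen directly from the two activations' definitions (a standalone numerical sketch, not taken from the letter):

```python
import numpy as np

def relu(x):
    # Zero output (and zero gradient) for all negative inputs.
    return np.maximum(x, 0.0)

def elu(x, alpha=1.0):
    # Identity for positive inputs; smooth negative saturation at -alpha,
    # so negative inputs still produce nonzero outputs and gradients.
    return np.where(x > 0, x, alpha * (np.exp(x) - 1.0))
```

For x = −1, ReLU outputs 0 while eLU outputs e⁻¹ − 1 ≈ −0.632, so gradient information is preserved on the negative side.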
The set of learnable parameters θ thus collects the weights and biases of both branches. Here also, eLU and Sigmoid activations are used for the hidden layers and the output layer, respectively. The model output x_L ∈ R^(M+K) is constructed by concatenating the two components, the power coefficients η and the CSI capacity allocations C_p,m, ∀m. Since we use an unsupervised learning approach, we use a custom loss function equal to the negative sum rate for a given channel realization,

loss(θ) = −Σ_{k=1}^{K} R_k^CFE(β; θ),

where the sum is the rate of all K users for a channel realization with large-scale fading β and trainable parameters θ. This loss function is differentiable with respect to θ, which allows training the network via stochastic gradient descent (SGD). We adopt a mini-batch gradient descent approach to reduce the complexity of SGD. In each training iteration, a set of channel realizations is generated from its distribution, so the training loss is approximated as

loss(θ) ≈ −(1/|B|) Σ_{β∈B} Σ_{k=1}^{K} R_k^CFE(β; θ),    (3)

where B denotes the set of channel realizations in the iteration and |B| is the mini-batch size. Thus, during training, the model learns parameters θ that minimize the loss in (3), which maximizes the sum rate as expected.
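A minimal numerical sketch of the two-branch structure described above (our own simplified stand-in: per-branch fully connected layers with eLU hidden activations and Sigmoid outputs; the layer sizes, and any downstream scaling of the capacity outputs to [0, C_m], are assumptions not specified in the letter):

```python
import numpy as np

def elu(x):
    return np.where(x > 0, x, np.exp(x) - 1.0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def branch(x, layers):
    """Apply a stack of (W, b) layers: eLU in hidden layers, Sigmoid at the output."""
    for i, (W, b) in enumerate(layers):
        z = W @ x + b
        x = sigmoid(z) if i == len(layers) - 1 else elu(z)
    return x

def powernet_forward(beta, eta_layers, cp_layers):
    """Two independent branches share the large-scale fading input
    x0 = vec(beta); one outputs K power coefficients, the other M CSI
    capacity shares, each squashed into (0, 1) by the Sigmoid."""
    x0 = beta.reshape(-1)
    eta = branch(x0, eta_layers)  # eta_k, one per user
    cp = branch(x0, cp_layers)    # C_p,m share, one per AP
    return np.concatenate([eta, cp])
```

With an autodiff framework such as TensorFlow (used in the letter), the same forward pass would feed the loss in (3) directly, which is what makes the unsupervised training possible.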

A. Simulation Setup and DNN Model Training
We evaluate the performance of a cell-free massive MIMO system in a 1 × 1 km² simulation area with different AP and user configurations. All simulation parameters and the dataset preparation for DNN model training are similar to [13]. The three-slope channel model in [2] is used for the pathloss; full details of the channel model can be found in [6]. The one-dimensional line search algorithm, together with the geometric programming (GP) solution proposed in [6] for the high-SINR approximation of problem P1, is used as the optimization-based baseline to find the fronthaul capacity and user power allocations, respectively. It is implemented in MATLAB using the CVX convex optimization package [15]. The DNN model implementation, training, and testing are done in TensorFlow. Both implementations run on the same platform, a 4-core Intel(R) Core(TM) i5-8250U CPU at 1.6 GHz. We use three different datasets for training, validating, and testing the DNN model, consisting of 10^5, 1000, and 1000 samples, respectively. For each sample, different AP and user distributions were considered, with randomly generated large-scale fading channel coefficients. The DNN input is normalized using the training dataset mean and variance. The network is trained for 20000 iterations using mini-batch gradient descent with the ADAM optimizer and a learning rate of 0.005. In each iteration, a random mini-batch of size 100 is selected from the training dataset. Every 50 iterations, the validation dataset is used to evaluate the model, and the model parameters corresponding to the minimum validation loss are preserved throughout training. After training, performance is evaluated on the test dataset, where the trained model produces the power and capacity allocations that are used to calculate the per-user rates.
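The training schedule above can be sketched generically as follows (a hypothetical skeleton: plain SGD stands in for ADAM, and `grad_fn`/`loss_fn` abstract the gradient and value of the loss in (3)):

```python
import numpy as np

def train_with_checkpointing(theta, grad_fn, loss_fn, train_data, val_data, rng,
                             iters=20000, batch=100, check_every=50, lr=0.005):
    """Mini-batch training loop that evaluates the validation loss every
    `check_every` iterations and preserves the best parameters seen."""
    best_theta, best_val = theta.copy(), loss_fn(theta, val_data)
    for it in range(1, iters + 1):
        # Sample a random mini-batch and take one gradient step.
        idx = rng.choice(len(train_data), size=batch, replace=False)
        theta = theta - lr * grad_fn(theta, train_data[idx])
        if it % check_every == 0:
            val = loss_fn(theta, val_data)
            if val < best_val:
                best_val, best_theta = val, theta.copy()
    return best_theta
```

Checkpointing on the validation loss, rather than the training loss, is what keeps the preserved parameters from overfitting the sampled mini-batches.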
Furthermore, we exploit the unsupervised learning capability of the DNN and perform online learning to further improve performance. Specifically, during the inference stage, for each channel realization the trained model parameters are loaded into PowerNet and the model is retrained for 100 iterations with a learning rate of 0.005, using the custom loss function in (3) with mini-batch size |B| = 1 (a single sample), to maximize the sum rate for the given channel realization before producing the power control and capacity allocation outputs. Online learning thus allows further customization and fine-tuning of the model parameters based on the large-scale channel input of each realization, to further optimize the sum rate performance.

B. Results: Performance, Flexibility, and Scalability

Fig. 1(a) compares the sum spectral efficiency (SE) performance of the baseline and PowerNet for different hardware quality factors. The results show that PowerNet achieves the baseline sum SE in both perfect and imperfect hardware scenarios. Furthermore, online learning enables PowerNet to outperform the baseline approach, which uses GP for power control together with the capacity allocation algorithm proposed in [6]. The better sum rate resulting from PowerNet with online learning does not necessarily mean it has achieved the globally optimal solution of problem P1; rather, it has outperformed the optimization-based baseline by converging to a better sub-optimal solution. This can be explained by the instantaneous optimization of the model parameters via online learning toward the sum rate maximization objective, in contrast to the sub-optimal baseline solution obtained by splitting the optimization task into two sub-problems and solving each sub-optimally. Fig. 1(b) depicts the CSI fronthaul capacity allocations C_p,m, ∀m obtained from the three different methods.

Note that the one-dimensional search algorithm proposed in [6] allocates the same C_p,m = C_p, ∀m to all APs, which is the value plotted in the figure. Since PowerNet outputs a different C_p,m for each AP, the plotted C_p value for each iteration is the average CSI fronthaul capacity allocation per AP. Thus, PowerNet determines C_p,m for each AP depending on its channel conditions, which helps improve the sum rate performance, as seen in Fig. 1(a).

Fig. 2 and Fig. 3 show the flexibility of PowerNet under both ideal and imperfect hardware conditions and under varying system parameters such as the number of users and the total fronthaul capacity. Fig. 2 shows the cumulative distribution of the sum rate for M = 50, K = 5, where the model pre-trained with M = 50, K = 10 is used to obtain the user power allocations and fronthaul capacity allocations. Furthermore, Fig. 3 shows the average sum rate performance for different total fronthaul capacity values C_m = C, ∀m; here, the model trained for the C = 1 bits/s/Hz scenario is used to generate the results for the other instances. From the plots, we see that PowerNet produces results similar to the baseline solution, and online learning slightly improves the average sum rate across the whole fronthaul capacity range considered. This flexibility of the DL-based approach overcomes a key challenge of conventional optimization techniques for resource allocation, which require reformulating and re-solving the problem whenever the network or system parameters change. Fig. 4 compares the sum rate of PowerNet and the baseline for M = 120, K = 20, showing that the proposed simple DNN structure achieves the expected results as the optimization problem becomes more complex with increased numbers of users and APs. While the offline-trained model performs slightly worse than the baseline, online learning outperforms the baseline, demonstrating the scalability and learning capacity of PowerNet when the problem complexity increases.
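The per-realization fine-tuning step can be sketched as follows (a hypothetical skeleton in the same spirit as the training loop: plain gradient steps stand in for ADAM, and `grad_fn` abstracts the gradient of the single-sample loss (3) with |B| = 1):

```python
import numpy as np

def online_finetune(theta, grad_fn, beta, steps=100, lr=0.005):
    """Start from the offline-trained parameters and take `steps`
    gradient steps on the loss of this single channel realization."""
    theta = theta.copy()  # leave the offline-trained parameters intact
    for _ in range(steps):
        theta = theta - lr * grad_fn(theta, beta)
    return theta
```

Because each realization starts again from the offline-trained parameters, the fine-tuning specializes the model to one large-scale fading input without drifting the shared offline model.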

C. Computational Complexity Comparison
The GP algorithm for uplink power control has O(K^(7/2)) algorithmic complexity, which scales with the number of users [16]. PowerNet performs only a one-shot calculation, a series of matrix multiplications, additions, and function mappings in each layer, to produce the outputs, and hence has a fixed algorithmic complexity. The total floating point operation (FLOP) count for the proposed DNN model is 4M²K + 8M² + 2M(M + K). Fig. 5 compares the time complexity of the GP-based power control algorithm implemented using the CVX solver with that of the proposed PowerNet for solving the sum rate maximization problem. A set of 100 channel realizations is used for different network configurations with varying numbers of users and APs, keeping M/K = 5. The processing time of the baseline method increases exponentially with M and K, whereas the processing time of PowerNet increases only slightly. This is advantageous for practical implementations with large numbers of users and APs. Furthermore, PowerNet takes less than 0.01 s per sample, which is very low compared to the processing time of the algorithm in [6].
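For reference, the stated FLOP count can be evaluated directly (the formula is from the letter; the function name is ours):

```python
def powernet_flops(M, K):
    # Total FLOPs for one PowerNet forward pass: 4*M^2*K + 8*M^2 + 2*M*(M + K).
    return 4 * M**2 * K + 8 * M**2 + 2 * M * (M + K)
```

For example, doubling both M and K (keeping M/K = 5) grows the count by roughly 8×, i.e., polynomially rather than exponentially, consistent with the mild growth of the PowerNet curve in Fig. 5.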

V. CONCLUSION
In this letter, we proposed PowerNet, a DNN trained in an unsupervised manner to perform joint power control and fronthaul capacity allocation for sum rate maximization in the uplink of a limited-fronthaul cell-free massive MIMO system. Simulations showed that the outputs obtained from PowerNet combined with online learning yield better sum rate performance than the optimization-based solution in [6]. PowerNet has a significantly lower computational complexity than the baseline, and its complexity does not scale exponentially with the number of users and APs as the existing algorithm's does, which is an advantage for practical implementations with large numbers of users and APs. The performance, low complexity, flexibility, and scalability of the proposed DL-based approach make it a strong candidate for resource allocation in practical cell-free massive MIMO networks.