Scalable Pilot Assignment Scheme for Cell-Free Large-Scale Distributed MIMO With Massive Access

With the explosive growth of mobile communication technology, the number of access terminals has increased dramatically. Because pilot resources are limited, the pilot contamination caused by pilot reuse becomes considerably worse. It is difficult for a traditional pilot assignment algorithm to serve numerous access terminals simultaneously in real time with low complexity, so it is necessary to design a scalable pilot assignment scheme for the case of massive access. In this paper, we propose a scalable deep learning-based pilot assignment algorithm to maximize the sum spectral efficiency (SE) of cell-free large-scale distributed multiple-input multiple-output (MIMO) systems with massive access. The mapping between user locations and pilot assignment schemes is learned by a deep neural network (DNN). The training samples of the DNN are generated by a min-max algorithm, which minimizes the maximum interference to alleviate pilot contamination. The output of the pretrained DNN is used as the initial value of the min-max algorithm to achieve better pilot assignment schemes and reduce the algorithm complexity. The simulation results show that the proposed algorithm has better convergence with massive access and achieves a higher sum SE in near real time.


I. INTRODUCTION
A cell-free large-scale distributed multiple-input multiple-output (MIMO) system comprises massive access points (APs), which simultaneously serve users over the same time/frequency resources based on directly measured channel characteristics [1]. The massive access scenario is the most common case of a cell-free large-scale distributed MIMO system, and the number of orthogonal pilots is limited within the coherence interval, which means that a single orthogonal pilot must be assigned to multiple users located in the system. Such pilot reuse leads to interference among users sharing the same pilot, which is called pilot contamination. Reference [2] studied the number of users and pilots that maximizes the spectral efficiency and demonstrated that changes in the pilot reuse factor have a major impact on the achievable performance. In practical systems, pilot contamination results in inaccurate channel state information (CSI); as a consequence, the sum rate does not keep increasing with the signal-to-noise ratio (SNR) but tends to saturate, which shows that cell-free massive MIMO systems are limited by pilot contamination [3]. Hence, many efforts have been made to alleviate pilot contamination.
The associate editor coordinating the review of this manuscript and approving it for publication was Zhenzhou Tang.
Exhaustively searching the possible pilot assignment schemes will ultimately lead to the optimal solution, but it is not practical because of its high complexity. In [1], random pilot allocation was proposed to decrease the complexity; however, it also greatly reduces the performance. Reference [4] derived closed-form expressions for the expectation of the proposed relative channel estimation error (RCEE) metric and adopted a pilot power allocation (PPA) algorithm for channel estimation and achievable uplink rate. Reference [ proposed an efficient pilot assignment (EPA) scheme to address the pilot contamination by maximizing the minimum uplink rate of the target cell's users, which takes advantage of large-scale characteristics of the fading channel to minimize the amount of outgoing inter-cell interference at the target cell. A partial pilot allocation scheme (PPA) was presented in [6], which tackles the pilot contamination problem and consequently improves the uplink throughput of users by using the large-scale characteristics of the fading channel to shield users with weak channel conditions from severe interference during the pilot allocation process. Reference [7] proposed a location-based greedy pilot assignment algorithm that uses the location information to assign the pilot sequences before applying the greedy pilot assignment algorithm. Taking high-mobility users into consideration, a location-aware pilot allocation scheme was designed in [8] by exploiting the behavior of line-of-sight (LOS) interference among the users and allocating the same pilot sequence to the users with small LOS interference. Reference [9] proposed a graph coloring-based pilot assignment algorithm, and a coalitional game theory-based pilot allocation algorithm was proposed in [10]. However, there have been few studies on the scalability in the case of massive access for large-scale distributed massive MIMO systems.
With the rapid development of deep learning in image processing [11], many researchers have applied it in the wireless communication field to solve difficult problems with low complexity in real time. In [12], coarse frequency offset estimation was implemented by a neural network. Reference [13] designed an unsupervised learning algorithm to optimize the power allocated to each pilot sequence for each user by minimizing the sum mean square error (MSE) of channel estimation for a distributed massive MIMO system. A deep neural network (DNN) was designed to learn the mapping from the input (channel large-scale fading coefficients) to the output (a pilot power-allocation vector). In [14], supervised learning was proposed to choose an optimal pilot allocation policy, with labels composed of the optimal pilot assignment schemes generated by exhaustive searching; however, its scalability is limited in the case of massive access.
In this paper, we propose a scalable deep learning-based pilot assignment algorithm to maximize the sum spectral efficiency (SE) by alleviating pilot contamination. We design a DNN to learn the mapping between user locations and pilot assignment schemes and treat them as the input and labels of the DNN, respectively. Since the user locations form a 2-D matrix, we treat the input as an image and adopt an image input layer. Although generating samples and training the network take time, this is acceptable since both are performed offline. Unlike the supervised learning scheme adopted in [14], where exhaustive searching was used to generate labels, we apply the min-max algorithm [3] to generate labels with much lower complexity. The output of the DNN is used as the initial value of the min-max algorithm to achieve better pilot assignment schemes and reduce the algorithm complexity.

II. SYSTEM MODEL
As shown in Fig. 1, we consider a cell-free large-scale distributed MIMO system with massive access that is composed of M single-antenna APs and K single-antenna users. There are N orthogonal pilot sequences, with N smaller than K, so that pilots must be reused; this setting is used to demonstrate the scalability of the proposed algorithm. The K users are randomly located in the system. The M APs are connected to a baseband processing unit (BPU) through a wireless channel, as shown by the dashed lines.

A. CHANNEL MODEL
We define $K_n$ as the number of users that use the n-th pilot. The channel from the i-th user among the $K_n$ users (for simplicity, we call it the i-th user) to the M APs is characterized by the following quantities: $\lambda_{n,m,i}$ represents the large-scale fading from the i-th user to the m-th AP, $d_{n,m,i}$ represents the distance between the i-th user and the m-th AP, $c$ is the mid-value of the mean path gain, $\alpha$ denotes the exponent of the path loss, and $h_{n,m,i}$ denotes the channel small-scale fast-fading coefficient between the i-th user and the m-th AP.
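As a minimal sketch of how these quantities could fit together, the following assumes the common model in which the large-scale fading is `c * d ** (-alpha)` and each channel coefficient is the square root of the large-scale fading times a CN(0, 1) small-scale coefficient. This functional form, the function names, and the toy dimensions are assumptions for illustration and are not taken from the source.

```python
import numpy as np

def large_scale_fading(user_xy, ap_xy, c, alpha):
    """Assumed model: lam[m, k] = c * d[m, k] ** (-alpha)."""
    # d[m, k] is the Euclidean distance between AP m and user k.
    diff = ap_xy[:, None, :] - user_xy[None, :, :]
    d = np.linalg.norm(diff, axis=-1)
    return c * d ** (-alpha)

def draw_channel(lam, rng):
    """Assumed model: g[m, k] = sqrt(lam[m, k]) * h[m, k], h ~ CN(0, 1)."""
    real = rng.standard_normal(lam.shape)
    imag = rng.standard_normal(lam.shape)
    h = (real + 1j * imag) / np.sqrt(2)
    return np.sqrt(lam) * h

rng = np.random.default_rng(0)
ap_xy = rng.uniform(-0.3, 0.3, size=(4, 2))    # 4 toy APs (km)
user_xy = rng.uniform(-0.3, 0.3, size=(3, 2))  # 3 toy users (km)
lam = large_scale_fading(user_xy, ap_xy, c=1.0, alpha=3.7)
g = draw_channel(lam, rng)                     # (M, K) channel matrix
```

Each column of `g` plays the role of one user's channel vector to all APs.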

B. UPLINK CHANNEL ESTIMATION
Specifically, we take the n-th pilot as an example. The pilot signal of the n-th pilot received at the BPU is the superposition of the channel vectors of the $K_n$ users sharing that pilot and a noise vector $n_{p,n}$, where $n_{p,n}$ denotes the additive white Gaussian noise vector consisting of independent and identically distributed (i.i.d.) zero-mean circularly symmetric complex Gaussian (ZMCSCG) $\mathcal{CN}(0, \sigma_p^2)$ elements and $\sigma_p^2$ is the noise power. The minimum MSE (MMSE) channel estimate is then obtained as in [3].
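The effect of pilot sharing on the estimate can be illustrated with a small sketch. It assumes a despread pilot observation equal to the sum of the co-pilot users' channels plus noise, unit pilot power, and independent fading across APs, so that the linear MMSE filter acts per AP. These modeling choices and the function names are assumptions and may differ from the exact expressions in [3].

```python
import numpy as np

def mmse_estimate(y_p, lam_n, sigma2_p):
    """Linear MMSE estimates for the users sharing one pilot.

    y_p:      (M,) despread pilot observation, assumed sum_i g_i + noise
    lam_n:    (M, K_n) large-scale fading of the K_n co-pilot users
    sigma2_p: pilot noise power
    """
    # With independent per-AP fading the LMMSE filter is elementwise.
    weights = lam_n / (lam_n.sum(axis=1, keepdims=True) + sigma2_p)
    return weights * y_p[:, None]          # (M, K_n) channel estimates

def estimate_error_variance(lam_n, sigma2_p):
    """Per-AP variance of the estimation error g - g_hat."""
    total = lam_n.sum(axis=1, keepdims=True) + sigma2_p
    return lam_n - lam_n ** 2 / total

# Two users with equal fading on one AP share a pilot: their estimates
# coincide, which is exactly the pilot contamination effect.
lam_n = np.array([[1.0, 1.0]])
g_hat = mmse_estimate(np.array([2.0 + 0j]), lam_n, sigma2_p=0.0)
err = estimate_error_variance(lam_n, sigma2_p=0.0)
```

Even without noise, the error variance stays at half the channel variance in this toy case, because the two users cannot be separated from the shared pilot.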

C. UPLINK TRANSMISSION
We define $\hat{G}_n = [\hat{g}_{n,1}, \cdots, \hat{g}_{n,K_n}]$ as the channel estimation matrix between the users using the n-th pilot sequence and the APs, and $\tilde{G}_n = [\tilde{g}_{n,1}, \cdots, \tilde{g}_{n,K_n}]$ represents the corresponding estimation error matrix. Therefore, we can obtain the total uplink channel estimation and error matrices $\hat{G} = [\hat{G}_1, \cdots, \hat{G}_N]$ and $\tilde{G} = [\tilde{G}_1, \cdots, \tilde{G}_N]$. Then, the signal received at the BPU can be given by $y = \hat{G}s + \tilde{G}s + n$, where $s = [s_1^{\dagger}, \cdots, s_n^{\dagger}, \cdots, s_N^{\dagger}]^{\dagger}$ and $s_n \in \mathbb{C}^{K_n \times 1}$ denotes the data signal that is transmitted by the $K_n$ users using the n-th pilot. $n$ denotes the additive white Gaussian noise whose variance is $\sigma_d^2$.

D. LOWER BOUND OF THE SUM SE
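With joint linear MMSE detection, the detected signal can be given by [15] $\hat{s} = \hat{G}^H(\hat{G}\hat{G}^H + \Sigma)^{-1} y$, where $\Sigma$ denotes the covariance matrix of the effective noise formed by the channel estimation error and the additive noise. The sum SE is the mutual information $I(s; y \mid \hat{G}) = H(s \mid \hat{G}) - H(s \mid y, \hat{G})$, which can be lower bounded because $H(s \mid \hat{G}) = H(s) = \log_2\det(\pi e I_K)$ and $H(s \mid y, \hat{G})$ is upper bounded by the entropy of a Gaussian vector whose covariance equals that of the detection error. Thus, we obtain the lower bound of the sum SE, and its asymptotic expression $C_{\mathrm{LB,inf}}$ is given in [3].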
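A rough numerical sketch of such a bound is given below. It assumes the standard form log2 det(I_K + G_hat^H Sigma^{-1} G_hat) for linear MMSE detection with imperfect CSI and a diagonal Sigma built from per-AP estimation-error variances plus the data noise power. Both the form and the construction of Sigma are assumptions for illustration, not expressions recovered from the source.

```python
import numpy as np

def sum_se_lower_bound(G_hat, err_var, sigma2_d):
    """log2 det(I_K + G_hat^H Sigma^{-1} G_hat) with an assumed diagonal Sigma.

    G_hat:    (M, K) channel estimates
    err_var:  (M, K) per-AP estimation-error variances
    sigma2_d: data noise power
    """
    K = G_hat.shape[1]
    # Assumed Sigma: error variances summed over users, plus noise power.
    sigma_diag = err_var.sum(axis=1) + sigma2_d
    A = np.eye(K) + G_hat.conj().T @ (G_hat / sigma_diag[:, None])
    _, logabsdet = np.linalg.slogdet(A)    # natural log of |det(A)|
    return logabsdet / np.log(2.0)

# Perfect estimates of two orthogonal unit channels with unit noise power:
# det(2 * I_2) = 4, so the bound is 2 bit/s/Hz.
value = sum_se_lower_bound(np.eye(2), np.zeros((2, 2)), 1.0)
```

Larger estimation-error variances inflate the assumed Sigma and lower the bound, which is how pilot contamination enters the sum SE in this sketch.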

III. SCALABLE DEEP LEARNING-BASED PILOT ASSIGNMENT SCHEME
The scalability of a pilot assignment algorithm, which is often neglected, needs to be considered in the case of massive access. In this section, we propose a scalable deep learning-based pilot assignment algorithm for cell-free large-scale distributed MIMO systems with massive access. We design a DNN to learn the mapping between UEs' locations and the pilot assignment scheme, which is generated by the min-max algorithm. The output of the pretrained network then becomes the initial value for the min-max algorithm [3], which minimizes the maximum interference. The details are described in the following parts.

A. DNN STRUCTURE
We use a fully connected DNN to map UEs' locations to the corresponding pilot sequences. The general structure is illustrated in Fig. 2. In this paper, because we treat the UEs' position matrix as an image matrix, there is one image input layer, five fully connected layers and one output layer, and each layer has $n_i$ nodes, where i represents the layer number. The output of each hidden layer is batch normalized to improve the performance and stability of the network and then activated by the leaky rectified linear unit (ReLU) function, which introduces nonlinearity to the network. The output of the DNN is the pilot assignment scheme, and thus, it has K neurons, where K is the number of users.
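The layer pattern described here (fully connected, then batch normalization, then leaky ReLU) can be sketched as a plain NumPy forward pass. The leaky slope of 0.01, the omission of learned batch-normalization scale and shift, the random weights, and the toy layer sizes are all assumptions for illustration; the source does not specify them.

```python
import numpy as np

def leaky_relu(x, slope=0.01):
    # The slope is an assumed value; the source does not state it.
    return np.where(x >= 0, x, slope * x)

def batch_norm(x, eps=1e-5):
    # Normalize each feature over the mini-batch (no learned scale/shift).
    return (x - x.mean(axis=0)) / np.sqrt(x.var(axis=0) + eps)

def forward(x, weights, biases):
    """Hidden layers: fully connected -> batch norm -> leaky ReLU."""
    a = x.reshape(x.shape[0], -1)            # flatten the location "image"
    for W, b in zip(weights[:-1], biases[:-1]):
        a = leaky_relu(batch_norm(a @ W + b))
    return a @ weights[-1] + biases[-1]      # one output value per user

rng = np.random.default_rng(0)
sizes = [8 * 8, 16, 16, 16, 8]               # toy sizes, not the paper's
weights = [rng.standard_normal((m, n)) * 0.1
           for m, n in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(n) for n in sizes[1:]]
out = forward(rng.standard_normal((4, 8, 8)), weights, biases)
```

Here `out` has one row per sample and one column per user; how the raw outputs are mapped to pilot indices is not specified in the source and is left open.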

B. SCALABLE DEEP LEARNING-BASED ALGORITHM (SDLA)
Scalability is vital to meet the demand of a system when there is an increasing number of users; however, there have been few studies on pilot assignment algorithms for cell-free large-scale distributed MIMO systems with massive access. Motivated by this, we propose a scalable deep learning-based algorithm (SDLA) for pilot allocation policies. First, we train the DNN to learn the mapping between user locations and pilot assignment schemes, and the training procedure is illustrated as follows.
• Generate samples: We take advantage of the min-max algorithm to generate 2000 samples containing UEs' positions and the corresponding pilot sequences, which will be fully utilized in the training part.
• Train the network: After generating the samples, we feed the UEs' positions and the corresponding pilot sequences into the network as input and labels, respectively. After numerous iterations, the network converges and outputs a pilot assignment scheme.
• Test the network: We utilize the pretrained network to obtain a pilot assignment scheme and evaluate it by the sum SE. The sum SE is the measure used to compare the different algorithms; the higher the sum SE is, the better the performance.
Then, because the output of the pretrained network is acceptable, it is reasonable to use it as the initial value for the min-max algorithm. Starting from the better initial value provided by the DNN, the min-max algorithm converges to a better pilot allocation scheme with lower complexity; thus, the SDLA achieves scalability and better performance in the case of massive access. The detailed procedure of the proposed algorithm is presented in Algorithm 1.
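The role of the initial value can be illustrated with a simplified greedy min-max routine. This is a sketch under assumptions, not the exact Algorithm 1: the interference weight matrix `W`, the sweep-until-stable loop, the 0-based pilot indices, and the name `min_max_assign` are hypothetical choices made for illustration.

```python
import numpy as np

def min_max_assign(W, n_pilots, psi_init=None, max_sweeps=50):
    """Greedy min-max pilot assignment (simplified illustration).

    W:        (K, K) symmetric interference weights, larger = worse
    psi_init: optional initial pilot index per user (e.g. a DNN output)
    Returns the assignment and the number of sweeps performed.
    """
    K = W.shape[0]
    if psi_init is None:
        psi = np.zeros(K, dtype=int)         # everyone starts on pilot 0
    else:
        psi = np.array(psi_init, dtype=int)  # copy of the given start
    for sweep in range(1, max_sweeps + 1):
        changed = False
        for k in range(K):
            others = np.arange(K) != k
            # Worst interference user k would see on each candidate pilot.
            worst = np.array([
                W[k, others & (psi == n)].max(initial=0.0)
                for n in range(n_pilots)
            ])
            best = int(np.argmin(worst))
            if worst[best] < worst[psi[k]]:
                psi[k] = best
                changed = True
        if not changed:
            return psi, sweep
    return psi, max_sweeps
```

When `psi_init` is already a stable assignment, the routine stops after a single sweep, whereas the all-same-pilot start needs additional sweeps; this mirrors the argument that a good DNN output shortens the min-max stage.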

C. COMPLEXITY ANALYSIS
The complexity of the training phase consists of forward propagation and backward propagation, both of which mainly depend on the complexity of matrix multiplication; however, as mentioned above, it can be neglected because the training phase is performed offline. The complexity of the min-max algorithm [3] is $O((K + \frac{K}{N} - 1)NK)$, where K denotes the number of users and N represents the number of pilot sequences.
After the DNN has been trained, we feed the new users' locations into the pretrained DNN, and the output of the pretrained DNN is used as the initial value of the min-max algorithm to obtain the final output. Hence, the complexity of the SDLA algorithm is the sum of the forward-propagation cost of the DNN, which is determined by the number of hidden layers L and the number of neurons $n_i$ in the i-th hidden layer, and the cost ξ of the min-max algorithm applied in the SDLA algorithm. The value of ξ cannot be stated explicitly and is therefore characterized indirectly. The explanation is as follows: the DNN is trained on the samples generated by the min-max algorithm, so it is reasonable to consider that the DNN learns the min-max algorithm and that the output of the pretrained DNN is similar to that of the min-max algorithm. Because this output is also used as the initial value of the min-max algorithm, the min-max algorithm converges quickly and its complexity is reduced to ξ. In Section IV, we use the elapsed time, counted from the beginning to the end of the algorithm, to further present the complexity of the SDLA algorithm and to show indirectly that $\xi < (K + \frac{K}{N} - 1)NK$.
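To give a feel for the orders involved, the sketch below evaluates the min-max order expression for the simulated K and N and counts the multiply-accumulate operations of a fully connected forward pass. It assumes that the term inside the order expression is the fraction K/N (the average number of users per pilot) and uses the layer sizes reported in Section IV; the function names are hypothetical, and the results are rough operation counts rather than measured costs.

```python
def min_max_ops(K, N):
    # Order expression (K + K/N - 1) * N * K; the fraction K/N is an
    # assumed reading of the term (average number of users per pilot).
    return (K + K / N - 1) * N * K

def dnn_forward_macs(layer_sizes):
    # Multiply-accumulates of a fully connected forward pass:
    # one product per weight, summed over consecutive layer pairs.
    return sum(a * b for a, b in zip(layer_sizes[:-1], layer_sizes[1:]))

K, N = 360, 30
full_min_max = min_max_ops(K, N)
# Image input of 360 x 360 x 1, three hidden layers of 16, output of 360.
dnn_cost = dnn_forward_macs([360 * 360, 16, 16, 16, 360])
print(full_min_max, dnn_cost)
```

Under these assumptions, the SDLA cost is `dnn_cost` plus the reduced min-max cost ξ, so a net saving requires the DNN initialization to remove more min-max work than the forward pass adds.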

IV. SIMULATION RESULTS
In this section, we utilize the average sum SE, the normalized channel estimation error and the elapsed time to compare the proposed SDLA algorithm with other existing algorithms. The normalized channel estimation error is reported as the normalized MSE (NMSE) of the channel estimate.

Algorithm 1
Require: The number of users K; the number of pilots N; the distribution of users
Ensure: The assignment of pilots ψ for users
1: Generate the interference graph with weighted edges based on the distribution of users
2: Initialize the pilot indicator vector ψ by assigning the first pilot to all users, $\psi_i = 1$ (i = 1, ..., K)
3: Initialize the interference indicator $I_{\max}$, $I_{\max} = \max_{\psi_i = \psi_j = 1,\, j \neq i} W_{1,j}$
4: for each user k from 1 to K do
5: for each pilot n from 1 to N do
6: if the maximum interference between user k and the users on pilot n is smaller than $I_{\max}$, we refresh the values of $I_{\max}$ and $n_{temp}$
8: Then, we assign the pilot $n_{temp}$ to the current user, $\psi_i \leftarrow n_{temp}$
13: end for
14: Repeat steps 1-13 2000 times to generate training samples
15: Train the DNN to learn the mapping between UEs' locations and the corresponding pilot assignment schemes
16: Feed the new input $(x_i, y_i)$ into the pretrained network to get ψ
17: Go to step 2 and implement steps 2-13 to obtain the final pilot allocation scheme $\psi_{opt}$

We consider a cell-free large-scale distributed MIMO system with M = 360 single-antenna APs, N = 30 orthogonal pilots, and K = 360 single-antenna users; both users and APs are randomly distributed in a circle with a radius of 0.375 km. The bandwidth is set to 15000 Hz, the exponent of the path loss α is 3.7, the mid-value of the mean path gain c is 140.7, and the thermal noise density is −174 dBm/Hz. Because the user's location is composed of an x coordinate and a y coordinate, the locations of 360 users comprise a 360 × 360 matrix. Here, we can treat the matrix as a grayscale image whose size is 360 × 360 × 1; accordingly, we adopt an image input layer ([360, 360, 1]) as the input layer, followed by three hidden layers with 16 neurons and one output layer, the size of which is 360. The mini-batch size is 100, and the training process achieves convergence after 50 iterations.
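The simulation geometry and noise level can be reproduced approximately as follows. The sketch assumes that "randomly distributed" means uniform over the disc area and that no receiver noise figure is added to the thermal noise; both are assumptions, as are the function names.

```python
import numpy as np

def noise_power_dbm(density_dbm_hz, bandwidth_hz):
    # Thermal noise power over the band: density + 10*log10(bandwidth).
    return density_dbm_hz + 10 * np.log10(bandwidth_hz)

def drop_uniform_in_circle(n, radius_km, rng):
    # The square root on the radius keeps the density uniform over the disc.
    r = radius_km * np.sqrt(rng.uniform(size=n))
    theta = rng.uniform(0, 2 * np.pi, size=n)
    return np.column_stack((r * np.cos(theta), r * np.sin(theta)))

rng = np.random.default_rng(0)
users = drop_uniform_in_circle(360, 0.375, rng)   # K = 360 users (km)
aps = drop_uniform_in_circle(360, 0.375, rng)     # M = 360 APs (km)
noise_dbm = noise_power_dbm(-174.0, 15000.0)      # about -132.2 dBm
```

With a 15000 Hz bandwidth, the bandwidth term contributes roughly 41.8 dB, which gives the noise power of about −132.2 dBm used in the comment above.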
For comparison, we examined 4 other methodologies:
• Average Pilot Power Allocation (APPA): All UEs share the same pilot, for example, pilot 1.
• Min-max Algorithm [3]: The min-max algorithm tries to minimize the maximum interference among UEs that share the same pilot. It is similar to exhaustive searching; however, if an acceptable pilot assignment scheme is already available, the min-max algorithm has a much lower complexity. Specifically, if the initial scheme is already optimal, the min-max algorithm ends in one iteration.
• DNN-based Algorithm: We only utilize the DNN to learn the mapping between the UEs' positions and the pilot assignment scheme offline and then feed the new data into the pretrained DNN to obtain a pilot assignment scheme online.
• Coalitional Game (CG) Theory Based Pilot Allocation Algorithm: The CG based pilot allocation algorithm was proposed in [10]. Taking the case of massive access terminals into consideration, we choose the CG based pilot allocation algorithm for comparison because its complexity is only O(KL), where K denotes the number of users and L represents the number of orthogonal pilots. More details of the CG based pilot allocation algorithm can be found in [10].
It can be seen from Fig. 3 that the proposed approach outperforms the other four methods when K = M = 360; it achieves the highest sum SE, followed by the CG, min-max, DNN, and APPA schemes. As expected, APPA has the lowest sum SE of all because all users share the same pilot. The performance of the CG method is slightly better than that of the min-max method. The DNN-only method achieves a sum SE that is approximately 54.5% of that of the min-max method and 200% of that of APPA, which indicates that the output of the DNN is acceptable but not good enough. Although such an output is not ideal, it is a good initial value for the min-max algorithm, so the proposed method further optimizes the output of the DNN and thus achieves an approximately 34.4% gain in the sum SE compared to the min-max method, which makes it the best of the compared methods. Since the proposed method utilizes the output of the pretrained DNN as the initial value of the min-max algorithm, it is reasonable to believe that the DNN helps the min-max algorithm to jump out of a local optimum. Note that the sum SE does not increase with the SNR, which is consistent with the fact that a cell-free large-scale distributed MIMO system is limited by pilot contamination when the total number of AP antennas goes to infinity [3]. We compared the channel estimation error of the SDLA algorithm with that of the 4 other existing algorithms in Fig. 4.
Although the performance of the SDLA, min-max and CG algorithms is very close in general, the SDLA algorithm achieves the minimum NMSE, which indicates the superiority of the SDLA algorithm. Moreover, the CG algorithm performs better than the min-max algorithm, which is in line with the result in Fig. 3.
Fig. 5 shows the elapsed time of the three schemes. The elapsed time is counted from the beginning to the end of the algorithm and reflects the complexity of the different algorithms in another way. The CG algorithm consumes the most time, about 1417 s, to obtain a pilot allocation scheme, which confirms our view that it is difficult for traditional algorithms to deal with pilot contamination in the case of massive access. The DNN-only scheme requires the least amount of time, approximately 0.05 s; although training the DNN takes 34.6 s, it is performed offline. Although the proposed scheme combines both the DNN and the min-max algorithm, it requires slightly less time than the min-max scheme that does not use the output of the pretrained DNN as the initial value. This comparison indirectly demonstrates that the complexity of the min-max algorithm applied in the SDLA algorithm is reduced, i.e., $\xi < (K + \frac{K}{N} - 1)NK$, and it suggests that a DNN can sometimes reduce the complexity of an algorithm. Considering the trade-off between performance and elapsed time, the proposed scheme is the best choice among the schemes. Here, APPA is omitted because of its low complexity O(1), which requires approximately 0 s.

V. CONCLUSION
In this paper, we propose a scalable deep learning-based pilot assignment scheme to maximize the sum SE of a cell-free large-scale distributed MIMO system with massive access. The proposed SDLA algorithm is scalable for massive access because it leverages a DNN to learn the mapping between UEs' locations and the pilot assignment scheme. The output of the pretrained DNN is then used as the initial value for the min-max algorithm to achieve better pilot assignment schemes and reduce the algorithm complexity. The simulation results show that the proposed SDLA approach has better convergence with massive access and achieves a higher sum SE in near real time.