Spectral Efficiency of Network-Assisted Full-Duplex for Cell-Free Massive MIMO System Under Pilot Contamination

A cell-free massive multiple-input multiple-output (MIMO) network-assisted full-duplex (NAFD) system has been proposed to meet the rapidly growing demand for higher data rates and more efficient communication. However, a growing number of users in a cell-free system inevitably leads to pilot contamination. In this paper, we analyze the ergodic spectral efficiency of the cell-free massive MIMO NAFD system in the presence of pilot contamination. We present the system model and estimate both uplink and downlink channel state information (CSI) under spatially correlated channels. Under pilot contamination, closed-form sum-rate expressions are derived, based on large-scale random matrix theory, for two configurations: a maximum ratio combining (MRC) receiver in the uplink with maximum ratio transmission (MRT) in the downlink, and a zero-forcing (ZF) receiver in the uplink with ZF precoding in the downlink. Numerical results show that, under several environmental settings, the theoretical results match the simulated results well and that the cell-free massive MIMO NAFD system outperforms the time-division duplex (TDD) system. Moreover, simulation results show that ZF/ZF can achieve a higher sum-rate than MRC/MRT because of its interference suppression capability.


I. INTRODUCTION
In the twenty-first century, information technology has become an integral part of human society. With the development of mobile data transmission, the demand for faster mobile broadband service with wider coverage has led academia and industry to explore new ways to break through the limitations of current standards [1], [2]. Massive MIMO is one of the critical 5G technologies; its large-scale antenna arrays provide very high beamforming gain and user spatial multiplexing gain [3], [4]. Centralized massive MIMO has the advantage of low backhaul cost, since all service antennas are located at a fixed position. On the contrary, in a distributed massive MIMO system, service antennas are in uplink [10]. Generally, pilot sequences are used to estimate CSI. Based on channel reciprocity, both uplink and downlink CSI can be obtained from pilot sequences sent by the user equipments (UEs). In the TDD frame structure, time-frequency resources are divided into frames of T seconds and W hertz, each of which carries S = TW symbols. According to TDD protocols, B symbols (1 ≤ B ≤ S) in each frame are used to transmit uplink pilot signals. The remaining S − B symbols are allocated to uplink and downlink payload data transmission. Channel estimation techniques based on the linear minimum-mean-square-error (LMMSE) criterion are commonly used to estimate CSI because of their near-optimal performance and low complexity. In this paper, the LMMSE channel estimation method is used.
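The frame bookkeeping described above can be checked with a few lines of arithmetic. The following minimal sketch computes S = TW and the share of symbols left for payload; the function name `frame_budget` and the numeric values (a 1 ms, 200 kHz frame with B = 8 pilot symbols) are illustrative assumptions, not values taken from this paper.

```python
def frame_budget(T_seconds, W_hertz, B_pilot):
    """Return (S, payload symbols, payload fraction) for one TDD frame."""
    S = round(T_seconds * W_hertz)          # S = T * W symbols per frame
    if not 1 <= B_pilot <= S:
        raise ValueError("B must satisfy 1 <= B <= S")
    payload = S - B_pilot                   # symbols left for uplink/downlink data
    return S, payload, payload / S


# Hypothetical example values, chosen only for illustration.
S, payload, fraction = frame_budget(T_seconds=1e-3, W_hertz=200e3, B_pilot=8)
print(S, payload, fraction)
```

A longer pilot (larger B) reduces pilot sharing but also shrinks the payload fraction, which is the trade-off examined later in the simulations.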
Recently, several studies have shown that the full-duplex (FD) mode may further improve the system capacity, up to twice that of a conventional half-duplex (HD) system [14]. In a co-time co-frequency full-duplex (CCFD) system [15], self-interference cancellation techniques allow a full-duplex wireless transceiver to transmit and receive simultaneously in the same frequency band. The authors of [16] analyzed the advantages and disadvantages of full-duplex massive MIMO and proposed a new framework to mitigate the interference caused by uplink and downlink users. However, the performance of CCFD is unsatisfactory in large-scale networks. A cooperative transmission scheme called CoMP for in-band wireless full duplex (CoMPflex) was put forward in [17]. Its basic idea is to use two spatially separated, cooperating half-duplex base stations (BSs) to emulate a full-duplex base station that serves an uplink user and a downlink user simultaneously at the same frequency. CoMPflex was extended to two-dimensional cellular networks in [18]; the results showed that, compared with CCFD, CoMPflex improves the outage performance of both uplink and downlink. In [19], a full-duplex cell-free (CF) massive MIMO system was introduced and its performance analyzed. To maximize the spectral efficiency and power efficiency of the FD CF massive MIMO system, a novel and comprehensive optimization problem was formulated in [20]. In [21], an FD secure unmanned aerial vehicle (UAV)-aided communication network based on directional modulation (DM) was investigated.
In the NAFD system discussed in this paper, the remote antenna units (RAUs) and users are distributed over a wide area [22]. "Network-assisted" means that the working mode of each RAU can be adjusted according to the network status. For example, in a region with more uplink users than downlink users, some downlink RAUs can be switched to uplink mode, which helps to improve the system capacity. The scatterers of different channels are different with high probability, which means that the small-scale fading is independent. In addition, for CF massive MIMO, the centralized processing of both uplink and downlink baseband signals at the CPU allows the CPU to know in advance the precoded downlink signals of all users. As a result, DL-to-UL interference mitigation can be realized in the digital domain. This indicates that in-band full duplex can be achieved in CF massive MIMO with existing half-duplex hardware, which may be why the scheme is called NAFD [22]. It should be noted that the interference between transmit RAUs and receive RAUs is unavoidable. Each RAU essentially still works in half-duplex mode in the NAFD system; therefore, self-interference cancellation can be ignored. Because all UEs and RAUs are randomly and evenly distributed in a circular area, the transmit powers of uplink UEs and transmit RAUs are of the same order of magnitude. In this way, not only is the DL-to-UL interference reduced, but the energy efficiency of the NAFD system is also improved. In [22], the spectral efficiency of the CF massive MIMO NAFD system was analyzed without pilot contamination. Joint downlink beamforming and power control for NAFD, maximizing the aggregated uplink and downlink spectral efficiency subject to quality-of-service (QoS) and backhaul constraints, was studied in [23]. A transceiver design that maximizes the spectral efficiency subject to QoS and backhaul constraints was studied in [24].
Previous works assume that each UE in a massive MIMO system is assigned a pilot sequence that is orthogonal to all other sequences. However, the number of orthogonal pilot sequences is limited by the duration of the coherence interval, and the available orthogonal pilot sequences in a CF system can easily be exhausted. Therefore, when UEs send pilots to RAUs, sharing pilot sequences is inevitable, which leads to pilot contamination [25]-[27]. Several schemes have been proposed to allocate non-orthogonal pilot sequences to UEs. A novel heap-based pilot assignment algorithm was proposed in [20], which not only mitigates the effects of pilot contamination but also reduces the computational complexity involved. Dongming Wang et al. presented two pilot assignment algorithms to obtain a better spectral efficiency [28]. The first algorithm tries every user for each pilot to determine which pilot yields the largest capacity. The second algorithm takes $d^{-2}$ as a weight and minimizes the maximum weight, where $d$ is the distance between two users that use the same pilot sequence. Adil Khan et al. proposed an effective scheme for multi-cell massive MIMO systems [29]. In this scheme, users are categorized into high-interference users ($U_H$) and low-interference users ($U_L$) based on their large-scale fading. After this split, orthogonal pilots are allotted to $U_H$, and non-orthogonal or identical sets of pilots are allotted to $U_L$. Kumar Appaiah et al. studied the effect of shifting the location of pilots in time frames to reduce inter-cell interference [30]. It was pointed out in [31] that pilot contamination does affect the performance of massive MIMO systems. The convergence of the signal-to-interference-plus-noise ratio (SINR) as the number of BS antennas tends to infinity, with and without pilot contamination, was analyzed in [32]. A lower bound on the sum-rate of massive MIMO systems with pilot contamination was derived in [33]. A performance analysis under a practical channel model was presented in [34]; the model accounts for channel correlation between users and also applies to scenarios without rich scattering.
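To make the distance-based idea concrete, the sketch below assigns pilots greedily so that the largest weight $d^{-2}$ among users sharing a pilot stays small. It is only a simplified heuristic written for illustration under our own assumptions; it is not the algorithm of [28], and the function name `greedy_pilot_assignment` and the toy coordinates are hypothetical.

```python
import numpy as np


def greedy_pilot_assignment(positions, num_pilots):
    """Assign pilots one user at a time, keeping the worst co-pilot weight d**-2 small."""
    positions = np.asarray(positions, dtype=float)
    assignment = np.full(len(positions), -1)         # -1 marks "not yet assigned"
    for k, pos in enumerate(positions):
        best_pilot, best_weight = 0, np.inf
        for p in range(num_pilots):
            users = np.flatnonzero(assignment == p)  # users already holding pilot p
            if users.size == 0:
                weight = 0.0                         # an unused pilot causes no contamination
            else:
                d = np.linalg.norm(positions[users] - pos, axis=1)
                weight = np.max(d ** -2.0)           # closest co-pilot user dominates
            if weight < best_weight:
                best_pilot, best_weight = p, weight
        assignment[k] = best_pilot
    return assignment


# Two tight clusters of users (toy coordinates in metres) and two pilots.
positions = [(0.0, 0.0), (1.0, 0.0), (100.0, 0.0), (101.0, 0.0)]
assignment = greedy_pilot_assignment(positions, num_pilots=2)
print(assignment)
```

In this toy layout, neighbouring users end up on different pilots, so each pilot is only reused by users that are far apart.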
From this aspect, a unified cell-free massive MIMO NAFD system model is given in this paper. Because of interference and pilot contamination, the CSI obtained by the CPU is imperfect. On this basis, we focus on spectral efficiency analysis and derive closed-form expressions for the asymptotic uplink and downlink sum-rates with the MRC/MRT and ZF/ZF schemes. The simulation results indicate that the ZF/ZF scheme outperforms the MRC/MRT scheme, and that the overall efficiency of the CF massive MIMO NAFD system is better than that of the TDD system. The main contributions are summarized as follows:
• In this paper, pilot contamination is introduced into the NAFD system for the first time. Previous performance analyses of the NAFD system assume perfect CSI or a sufficient number of orthogonal pilot sequences. However, as the number of users increases, pilot contamination becomes increasingly unavoidable. Our analysis shows that when MRC/MRT processing is adopted and the number of antennas per RAU tends to infinity, the spectral efficiency tends to a constant. When ZF processing is adopted, the spectral efficiency of the system increases logarithmically with the number of antennas.
• Based on the reciprocity of TDD uplink and downlink channels, we estimate CSI by sending pilot sequences from uplink and downlink users to RAUs simultaneously. Since the base station CPU has more powerful signal processing capability than the UEs, channel estimation is performed mainly at the base station CPU, which is more practical.
• The closed-form expressions for the asymptotic uplink and downlink sum-rates of the NAFD system are derived based on statistics and random matrix theory. Numerical simulations show that these expressions are accurate both for large-scale systems and for systems with a finite number of antennas and users.
The remainder of this paper is organized as follows. The transmission and channel models of the cell-free NAFD system are given in Section II. In Section III, we give the asymptotic expressions of the downlink sum-rates with MRT and ZF precoders. The closed-form expressions of the uplink sum-rates with MRC and ZF receivers are derived in Section IV. Section V presents the numerical simulation results and comparisons with theory. Finally, Section VI concludes the paper. Technical proofs are presented in the Appendix.
The notation adopted in this paper conforms to the following convention. Lowercase bold letters denote vectors, while uppercase bold letters denote matrices. $A^{-1}$ denotes the inverse of matrix $A$. $I_N$ denotes the $N \times N$ identity matrix. Normal letters denote scalars. $\mathbb{C}^{M \times N}$ is the set of complex matrices (or vectors) of size $M \times N$. $|\cdot|$ is the absolute value of a scalar.
$[\cdot]^T$ and $[\cdot]^H$ denote the transpose and conjugate transpose of a vector or matrix, respectively. We use $[A]_{a,b}$ to denote the $(a, b)$-th element of matrix $A$. $\mathrm{tr}(A)$ denotes the trace of an $N \times N$ matrix $A$. $\mathbb{E}\{\cdot\}$ and $\mathrm{cov}(\cdot)$ denote the statistical average and covariance, respectively. $\mathrm{diag}(a)$ denotes a diagonal matrix with $a$ along its main diagonal. $\odot$ and $\otimes$ denote the Hadamard product and the Kronecker product, respectively. The distribution of a circularly symmetric complex Gaussian (CSCG) random variable with mean $m$ and variance $\sigma^2$ is denoted as $\mathcal{CN}(m, \sigma^2)$.

II. SYSTEM MODEL
The uplink and downlink transmission models are given in this section, as well as the equivalent channel model with pilot contamination.

A. TRANSMISSION MODEL
An NAFD system based on the cell-free massive MIMO architecture is illustrated in Fig. 1.
This system consists of $K$ single-antenna UEs, $N$ RAUs and a central processing unit (CPU). We assume that all RAUs are evenly distributed in a circular area and are connected to the CPU by a backhaul network. The receive RAUs forward the received uplink signals to the CPU for processing; the CPU precodes the downlink signals and sends them to the transmit RAUs, which transmit them to the downlink UEs. This paper assumes that the backhaul network is perfect, i.e., it provides an error-free, infinite-capacity transport channel for the CPU.
The whole cycle of the signal transmission scheme is shown in Fig. 2. In the pilot phase, uplink and downlink UEs send pilot sequences simultaneously. After receiving the pilot sequences, the base station estimates the CSI; this is the estimation phase. In the subsequent data phase, uplink UEs send uplink data signals to the receive RAUs, and downlink UEs receive the precoded downlink data signals sent by the transmit RAUs. We assume that there are $N_u$ uplink receive RAUs and $N_d$ downlink transmit RAUs. The number of uplink UEs is $K_u$, and the number of downlink UEs is $K_d$. They satisfy $N_u + N_d = N$ and $K_u + K_d = K$, where $N$ is the total number of RAUs and $K$ is the total number of users. The same time-frequency resources are used in both the uplink and the downlink. In practice, given the large number of users, the number of pilot sequences that a massive MIMO NAFD system can use for CSI estimation is limited. In this paper, the number of pilot sequences is less than $N$ and $K$, and $K$ is a multiple of the number of pilot sequences.
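Because there are fewer pilot sequences than users, several UEs necessarily share each pilot. The short sketch below draws a random pilot for every uplink and downlink UE and counts how many UEs land on each pilot. The random-reuse rule follows the simulation setup described later in the paper; the variable names and the seed are our own illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
K_u, K_d, L = 32, 32, 8                 # uplink UEs, downlink UEs, pilot sequences

# Each UE picks one of the L pilots uniformly at random (random reuse).
pilot_of_ul = rng.integers(0, L, size=K_u)
pilot_of_dl = rng.integers(0, L, size=K_d)

# Number of uplink / downlink UEs that use each pilot.
K_ul = np.bincount(pilot_of_ul, minlength=L)
K_dl = np.bincount(pilot_of_dl, minlength=L)

print("uplink UEs per pilot:  ", K_ul)
print("downlink UEs per pilot:", K_dl)
```

With 64 UEs and 8 pilots, at least one pilot must be shared by 8 or more UEs, whatever the assignment.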

B. CHANNEL MODEL
In the cell-free massive MIMO NAFD system, six channels need to be analyzed. We use $G_d$ and $G_u$ to denote the downlink channel from the transmit RAUs to the downlink UEs and the uplink channel from the uplink UEs to the receive RAUs, respectively. $G_{iui}$ and $G_{iri}$ denote the inter-user interference channel and the inter-RAU interference channel, respectively. The channel from the downlink UEs to the receive RAUs is denoted as $G_{du}$, and the channel from the uplink UEs to the transmit RAUs is denoted as $G_{ud}$. For simplicity, we assume that all UEs use the same pilot signal power $\rho_p$, all uplink UEs use the same data transmission power $\rho_u$, and all transmit antennas use the same downlink precoded signal power $\rho_d$.
We assume that both uplink and downlink UEs transmit pilot sequences of length $L$ at the same time, which means that the number of orthogonal pilot sequences is $L$. Each RAU has $M$ antennas. For simplicity, we assume that the pilot sequences are unit vectors; the $l$-th pilot sequence is denoted by $\phi_l$. According to the additive noise model, the pilot signal received by the uplink receive RAUs can be expressed as

$$Y_u = \sqrt{\rho_p}\,\left(G_u T_u + G_{du} T_d\right) + Z_u, \tag{1}$$

where $T_u \in \mathbb{C}^{K_u \times L}$ is the uplink pilot sequence matrix, $T_d \in \mathbb{C}^{K_d \times L}$ is the downlink pilot sequence matrix, and $Z_u \in \mathbb{C}^{(N_u M) \times L}$ is the uplink noise matrix. Similarly, the pilot signal received by the downlink transmit RAUs can be expressed as

$$Y_d = \sqrt{\rho_p}\,\left(G_d T_d + G_{ud} T_u\right) + Z_d, \tag{2}$$

where $Z_d \in \mathbb{C}^{(N_d M) \times L}$ is the downlink noise matrix. Considering the characteristics of distributed RAUs, each of these channels is expressed in terms of a large-scale fading matrix and a small-scale fading matrix. $D_u$, $D_d$, $D_{du}$, $D_{ud}$, $D_{iri}$ and $D_{iui}$ are the large-scale fading matrices of the respective channels. $H_u$, $H_d$, $H_{du}$, $H_{ud}$, $H_{iri}$ and $H_{iui}$ are the small-scale fading matrices, whose entries are independent and identically distributed CSCG random variables with zero mean and unit variance.
In the channel estimation phase, the CPU estimates the CSI based on the received pilot signals. Let $K_{ul}$ and $K_{dl}$ denote the numbers of uplink UEs and downlink UEs that use pilot sequence $\phi_l$, respectively. Post-multiplying (1) and (2) by $\phi_l$ gives

$$y_{u,l} = \sqrt{\rho_p}\sum_{i=1}^{K_{ul}} g_{u,l,i} + \sqrt{\rho_p}\sum_{i=1}^{K_{dl}} g_{du,l,i} + z_{u,l},$$

$$y_{d,l} = \sqrt{\rho_p}\sum_{i=1}^{K_{dl}} g_{d,l,i} + \sqrt{\rho_p}\sum_{i=1}^{K_{ul}} g_{ud,l,i} + z_{d,l},$$

where $g_{u,l,i} \in \mathbb{C}^{(N_u M) \times 1}$ is the channel vector from the $i$-th uplink UE using the $l$-th pilot sequence $\phi_l$ to the receive RAUs, and $g_{du,l,i} \in \mathbb{C}^{(N_u M) \times 1}$ is the channel vector from the $i$-th downlink UE using the $l$-th pilot sequence $\phi_l$ to the receive RAUs. $g_{d,l,i} \in \mathbb{C}^{(N_d M) \times 1}$ is the channel vector from the $i$-th downlink UE using the $l$-th pilot sequence $\phi_l$ to the transmit RAUs, and $g_{ud,l,i} \in \mathbb{C}^{(N_d M) \times 1}$ is the channel vector from the $i$-th uplink UE using the $l$-th pilot sequence $\phi_l$ to the transmit RAUs. $\rho_p$ is the pilot signal power.
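$z_{u,l}$ and $z_{d,l}$ are the noise vectors of the uplink and downlink, respectively. We assume that the received noise power is the same in the uplink and the downlink and equals $\varepsilon$. According to the MMSE channel estimation method [35], the uplink channel estimate $\hat{g}_{u,l,i}$ is given by (6), where $\Lambda_{u,l,i} = \mathrm{diag}\left(\lambda_{u,l,i,1}, \cdots, \lambda_{u,l,i,N_u}\right)$ and $\lambda_{u,l,i,t}$ is the large-scale fading factor from the $i$-th uplink UE using the $l$-th pilot sequence to the $t$-th receive RAU, which characterizes the path loss and the spatial correlations. Moreover,

$$\Lambda_{u,l} = \sum_{i=1}^{K_{ul}} \Lambda_{u,l,i} + \sum_{j=1}^{K_{dl}} \Lambda_{du,l,j} + \frac{\varepsilon}{\rho_p} I_{N_u},$$

where $\Lambda_{du,l,j} = \mathrm{diag}\left(\lambda_{du,l,j,1}, \cdots, \lambda_{du,l,j,N_u}\right)$ and $\lambda_{du,l,j,t}$ is the large-scale fading factor from the $j$-th downlink UE using the $l$-th pilot sequence to the $t$-th receive RAU. We define $\hat{h}_{u,l} = \Lambda_{u,l}^{-\frac{1}{2}} y_{u,l}$, with which (6) can be rewritten in terms of $\hat{h}_{u,l}$. It is easy to recognize that $\hat{h}_{u,l} \sim \mathcal{CN}\left(0, \frac{1}{M N_u} I_{M N_u}\right)$, so $\hat{h}_{u,l}$ is equivalent to small-scale fading. This indicates that all uplink UEs sending pilot sequence $\phi_l$ share the same equivalent small-scale fading $\hat{h}_{u,l}$, which is a significant difference between the equivalent channels and the original channels. This causes the pilot contamination and degrades system performance.
Similarly, the estimated downlink channel $\hat{g}_{d,l,i}$ is obtained with $\Lambda_{d,l,i} = \mathrm{diag}\left(\lambda_{d,l,i,1}, \cdots, \lambda_{d,l,i,N_d}\right)$, where $\lambda_{d,l,i,t}$ is the large-scale fading factor from the $i$-th downlink UE using the $l$-th pilot sequence to the $t$-th transmit RAU, and

$$\Lambda_{d,l} = \sum_{i=1}^{K_{dl}} \Lambda_{d,l,i} + \sum_{j=1}^{K_{ul}} \Lambda_{ud,l,j} + \frac{\varepsilon}{\rho_p} I_{N_d},$$

where $\Lambda_{ud,l,j} = \mathrm{diag}\left(\lambda_{ud,l,j,1}, \cdots, \lambda_{ud,l,j,N_d}\right)$ and $\lambda_{ud,l,j,t}$ is the large-scale fading factor from the $j$-th uplink UE using the $l$-th pilot sequence to the $t$-th transmit RAU.
We define $\hat{h}_{d,l} = \Lambda_{d,l}^{-\frac{1}{2}} y_{d,l}$, so that the downlink estimate can be rewritten in the same way. The uplink and downlink estimation error channels are defined as $\tilde{g}_{u,l,i} = g_{u,l,i} - \hat{g}_{u,l,i}$ and $\tilde{g}_{d,l,i} = g_{d,l,i} - \hat{g}_{d,l,i}$. The covariance matrices of the channel estimation errors are given in (10). Let the channel from the uplink UEs sending the $l$-th pilot sequence $\phi_l$ to the receive RAUs be $G_{u,l} = \left[g_{u,l,1}, \cdots, g_{u,l,K_{ul}}\right]$. Then the uplink channel is $G_u = \left[G_{u,1}, \cdots, G_{u,L}\right]$. The estimated uplink channel is $\hat{G}_u = \left[\hat{G}_{u,1}, \cdots, \hat{G}_{u,L}\right]$, where $\hat{G}_{u,l} = \left[\hat{g}_{u,l,1}, \cdots, \hat{g}_{u,l,K_{ul}}\right]$, and the estimation error channel is $\tilde{G}_u = \left[\tilde{G}_{u,1}, \cdots, \tilde{G}_{u,L}\right]$, where $\tilde{G}_{u,l} = \left[\tilde{g}_{u,l,1}, \cdots, \tilde{g}_{u,l,K_{ul}}\right]$. The definitions of $G_d$ and $G_{d,l}$ are similar. Obviously, we have $G_u = \hat{G}_u + \tilde{G}_u$ and $G_d = \hat{G}_d + \tilde{G}_d$.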
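The effect described in this subsection can be reproduced in a few lines. The sketch below builds a toy pilot observation, despreads it with a unit-vector pilot, and forms the textbook per-antenna LMMSE estimate for two UEs that share a pilot. Several points are assumptions made for illustration only: all pilot-sending UEs are lumped into one set, small-scale fading entries have unit variance, pilots are standard basis vectors, and the estimator normalization is the standard LMMSE form rather than a transcription of (6).

```python
import numpy as np

rng = np.random.default_rng(1)
N_u, M, L = 4, 2, 2                      # receive RAUs, antennas per RAU, pilots (toy sizes)
rho_p, eps = 1.0, 0.1                    # pilot power and noise power (illustrative values)
pilot_of_ue = np.array([0, 0, 1])        # UEs 0 and 1 share pilot 0
K = len(pilot_of_ue)


def cn(*shape):
    """i.i.d. CN(0, 1) samples."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)


lam = rng.uniform(0.1, 1.0, size=(K, N_u))      # large-scale fading, one value per UE and RAU
lam_ant = np.repeat(lam, M, axis=1)             # the M antennas of a RAU share its value
G = (np.sqrt(lam_ant) * cn(K, N_u * M)).T       # channel matrix, shape (N_u*M, K)

T = np.eye(L)[pilot_of_ue]                      # K x L pilot matrix with one-hot rows
Z = np.sqrt(eps) * cn(N_u * M, L)               # receiver noise
Y = np.sqrt(rho_p) * G @ T + Z                  # received pilot signal

l = 0
y_l = Y @ np.eye(L)[:, l]                       # post-multiply by the l-th pilot
sharing = np.flatnonzero(pilot_of_ue == l)      # UEs that use pilot l

# Per-antenna LMMSE estimate: lambda_i / (sum of lambdas + eps/rho_p) * y_l / sqrt(rho_p).
lam_sum = lam_ant[sharing].sum(axis=0) + eps / rho_p
g_hat = {int(i): lam_ant[i] / lam_sum * y_l / np.sqrt(rho_p) for i in sharing}

# After removing each UE's own large-scale factor, the two estimates coincide.
print(np.allclose(g_hat[0] / lam_ant[0], g_hat[1] / lam_ant[1]))
```

The two estimates differ only through the large-scale factors, so the CPU cannot separate the small-scale fading of UEs that share a pilot. This is the pilot contamination effect.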

III. ASYMPTOTIC DOWNLINK SUM-RATE
Channel estimation yields only imperfect CSI of the uplink and downlink channels. The CPU treats the estimated channels as the true channels and sends precoded signals to the downlink UEs. In this section, the asymptotic downlink ergodic sum-rates are derived under MRT and ZF precoding. A reasonable assumption is that the number of RAU antennas is far greater than the number of UEs.

A. DOWNLINK SUM-RATE UNDER MRT PRECODING IN NAFD SYSTEM
When MRT precoding is used in the downlink, the transmit power must be normalized under the total power constraint, so the precoding matrix is $\kappa_{mrt} W$ with $W = \hat{G}_d$. The received data signal of the $i$-th downlink UE using the $l$-th pilot sequence $\phi_l$ can be expressed as (13), where $x_d$ is the downlink symbol vector transmitted by the RAU antennas to the downlink UEs, which satisfies $\mathbb{E}\{x_d x_d^H\} = I_{K_d}$, and $x_u$ is the uplink signal vector transmitted by the uplink UEs to the receive RAUs, which satisfies $\mathbb{E}\{x_u x_u^H\} = I_{K_u}$. $\rho_d$ and $\rho_u$ are the downlink and uplink data transmission powers, respectively. $z_{d,l,i}$ is the additive noise with variance $\varepsilon$. The statistical power normalization factor is $\kappa_{mrt} = 1/\sqrt{\mathbb{E}\{\mathrm{Tr}(W W^H)\}}$. We can rewrite (13) by separating the desired signal from the remaining terms, where $\kappa_{mrt}\,\omega_{l,i}$ is the precoding vector of the $i$-th downlink UE using the $l$-th pilot sequence and $x_{d,l,i}$ is the symbol transmitted by the RAUs to that UE. $\kappa_{mrt} W_{[l,i]}$ is the precoding matrix with $\kappa_{mrt}\,\omega_{l,i}$ removed, and $x_{d,[l,i]}$ is the transmit signal vector with the symbol $x_{d,l,i}$ removed. The remaining terms consist of the multi-user interference induced by imperfect channel estimation, the inter-user interference from uplink to downlink (UL-to-DL) UEs, and the additive noise. In our derivations, the following assumption is required.
Assumption 1: The large-scale fading matrices $D_d$, $D_{ud}$ and $D_{iui}$ have uniformly bounded spectral norm [36]. The asymptotic expressions of the downlink SINR and sum-rate with the MRT precoding scheme are given in the following theorem.
Theorem 1: The downlink SINR is given by (16)-(19), where $\lambda_{iui,l,i,n}$ is the large-scale fading factor from the $i$-th uplink UE using the $l$-th pilot sequence to the $n$-th downlink UE. From (16), we can see that even if the number of antennas per RAU tends to infinity, the downlink SINR tends to a constant because of pilot contamination. Based on the previous formulations, we derive the downlink spectral efficiency of the $i$-th UE using the $l$-th pilot sequence with the MRT precoding scheme; the downlink sum-rate with the MRT precoding scheme is given in (22).
Proof: The proof is given in Appendix A.

B. DOWNLINK SUM-RATE UNDER ZF PRECODING IN NAFD SYSTEM
When the ZF precoding scheme is adopted in downlink transmission, the transmit power must be normalized, as with MRT precoding. The precoding matrix is $\kappa_{zf} W$, where $\kappa_{zf}$ is the statistical power normalization factor. It can be expressed through the matrix $\Xi_{d,l}$ with $[\Xi_{d,l}]_{i,j} = \xi_{d,l,i,j}$.
The received data signal of the $i$-th downlink UE using the $l$-th pilot sequence is expressed in the same form as in the MRT case.
Theorem 2: Similar to the analysis of the MRT precoding scheme, let Assumption 1 hold. Then the downlink SINR of the $i$-th UE using the $l$-th pilot sequence with the ZF precoding scheme is given by (25). Based on the previous formulations, we derive the downlink spectral efficiency of the $i$-th UE using the $l$-th pilot sequence with the ZF precoding scheme. From (25), the spectral efficiency of the system with the ZF precoding scheme increases logarithmically with the number of antennas per RAU. Finally, the downlink sum-rate with the ZF precoding scheme is obtained by summing the spectral efficiencies of all downlink UEs.
Proof: See Appendix B.
In this section, the asymptotic expressions of the downlink sum-rate with the MRT and ZF precoding schemes in an NAFD system with pilot contamination have been derived. In the next section, we derive the uplink sum-rate expressions with the MRC and ZF receivers.
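The qualitative difference between the two precoders of this section can be seen numerically. The sketch below builds both precoders from one random estimated channel and measures how much power leaks onto other users through that estimated channel. It assumes the textbook forms $W = \hat{G}$ for MRT and $W = \hat{G}(\hat{G}^H\hat{G})^{-1}$ for ZF, and it normalizes with the instantaneous trace instead of the statistical expectation used in the paper; sizes and names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
num_antennas, num_ues = 16, 4            # total transmit antennas and downlink UEs (toy sizes)

# Random stand-in for the estimated downlink channel, one column per UE.
G_hat = (rng.standard_normal((num_antennas, num_ues))
         + 1j * rng.standard_normal((num_antennas, num_ues))) / np.sqrt(2)


def normalize(W):
    """Scale W so that Tr(W W^H) = 1 (instantaneous power normalization)."""
    kappa = 1.0 / np.sqrt(np.trace(W @ W.conj().T).real)
    return kappa * W


W_mrt = normalize(G_hat)
W_zf = normalize(G_hat @ np.linalg.inv(G_hat.conj().T @ G_hat))


def off_diagonal_power(W):
    """Total inter-user leakage seen through the estimated channel."""
    effective = G_hat.conj().T @ W
    return np.sum(np.abs(effective - np.diag(np.diag(effective))) ** 2)


leak_mrt = off_diagonal_power(W_mrt)
leak_zf = off_diagonal_power(W_zf)
print(leak_mrt, leak_zf)
```

ZF removes the leakage only with respect to the estimated channel. Interference caused by the estimation error, and therefore by pilot contamination, is not cancelled by either precoder.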

IV. ASYMPTOTIC UPLINK SUM-RATE
In this paper, when the MRC receiver is adopted in the uplink, MRT precoding is used in the downlink; similarly, when the ZF receiver is adopted in the uplink, ZF precoding is used in the downlink. In the CF massive MIMO NAFD system, both uplink and downlink baseband signals are processed centrally at the CPU, which allows the CPU to mitigate the downlink-to-uplink (DL-to-UL) inter-RAU interference by MMSE detection. As in the downlink, we assume that the total number of antennas is far greater than the number of UEs. In this section, the asymptotic sum-rate expressions for both the MRC and ZF receivers are derived.

A. UPLINK SUM-RATE UNDER MRC RECEIVER IN NAFD SYSTEM
After the DL-to-UL interference is cancelled at the CPU, the received uplink signal with the MRC receiver can be written in terms of the following quantities: $x_u \in \mathbb{C}^{K_u \times 1}$ is the uplink transmit signal vector, $x_d$ is the downlink transmit symbol vector and $z_u \in \mathbb{C}^{K_u \times 1}$ is the uplink noise vector. $\tilde{G}_{iri} = G_{iri} - \hat{G}_{iri}$ is the estimation error of the inter-RAU interference channel. $\kappa_{mrt} W$ is the downlink MRT precoding matrix, where $\kappa_{mrt} = 1/\sqrt{\mathbb{E}\{\mathrm{Tr}(W W^H)\}}$ is the statistical power normalization factor and $W = \hat{G}_d$. $\rho_d$ and $\rho_u$ are the downlink and uplink data transmission powers, respectively. We focus on the data signal sent by the $i$-th uplink UE using the $l$-th pilot sequence, which is given in (30). To obtain asymptotic expressions for the uplink sum-rate with the MRC and ZF receivers, the following assumption is required.
Assumption 2: The large-scale fading matrices $D_u$, $D_{du}$ and $D_{iri}$ have uniformly bounded spectral norm.
Theorem 3: Based on (30) and Assumption 2, the SINR of the $i$-th UE using the $l$-th pilot sequence is given by (31), where

$$\xi_{u,l,i,q} = \mathrm{Tr}\left(\Lambda_{u,l,i}\,\Lambda_{u,l}^{-1}\,\Lambda_{u,l,q}\right) = \sum_{n=1}^{N_u} \frac{\lambda_{u,l,i,n}\,\lambda_{u,l,q,n}}{\sum_{j=1}^{K_{ul}} \lambda_{u,l,j,n} + \sum_{t=1}^{K_{dl}} \lambda_{du,l,t,n} + \frac{\varepsilon}{\rho_p}}.$$

In (35), the matrix $\hat{\Sigma}_u$ satisfies $\Sigma_u = \hat{\Sigma}_u \otimes I_M$ and is a diagonal matrix with diagonal elements $\sigma_{u,n}$, and $\lambda_{iri,i,n}$ is the large-scale fading factor from the $i$-th downlink transmit RAU to the $n$-th uplink receive RAU. According to (31) ∼ (37), we derive the spectral efficiency of the $i$-th uplink UE using the $l$-th pilot sequence. Consequently, the closed-form expression for the uplink sum-rate with the MRC receiver is given in (39).
Proof: See Appendix C.

B. UPLINK SUM-RATE UNDER ZF RECEIVER IN NAFD SYSTEM
By zero-forcing detection, the received uplink signal with the ZF receiver can be written as (40). It should be noted that when the ZF receiver is adopted in the uplink, ZF precoding is also adopted in the downlink. The SINR and sum-rate expressions of the uplink with the ZF receiver are given in the following theorem.
Theorem 4: Similarly, based on Assumption 2 and (40), the SINR of the $i$-th uplink UE using the $l$-th pilot sequence is given by (41). The quantities appearing in (41), namely $\bar{\xi}_{u,l,i,q}$ with $[\bar{\Xi}_{u,l}]_{i,q} = \bar{\xi}_{u,l,i,q}$, the diagonal matrices $\mathrm{diag}\left(\mu_{u,l,i,q,1}, \cdots, \mu_{u,l,i,q,N_d}\right)$ and the related diagonal matrices built from the inter-RAU large-scale fading factors, are defined in (42) ∼ (50), and $e_i$ denotes the $i$-th column of $I_{K_u}$. Based on (41) ∼ (50), we derive the spectral efficiency of the $i$-th uplink UE using the $l$-th pilot sequence with the ZF receiver, which leads to the closed-form expression for the uplink sum-rate with the ZF receiver in (52).
Proof: The proof is given in Appendix D.

V. SIMULATIONS AND ANALYSIS
A. CONTRASTS OF THEORETICAL AND SIMULATED RESULTS
In this section, we validate the accuracy of the theoretical results presented in Sections III and IV by Monte-Carlo simulations.
Assuming that the pilot reuse among UEs is random, we distribute the UEs and RAUs in a circular region with radius $R = 1$ km. The distance (no greater than $R$) and the phase angle of each node relative to the coordinate origin are drawn uniformly at random, which determines the locations of the UEs and RAUs. We adopt a large-scale fading model in which $\bar{\lambda}$ is the path loss at the reference point, with $\bar{\lambda} = -34.5 - 20\log_{10}(d_0)$ dB and $d_0 = 10$ m. Considering that the RAUs are mounted higher than the UEs, we set $\alpha_1 = 3.7$ as the path loss exponent between UEs and RAUs, $\alpha_2 = 4.0$ as the path loss exponent between UEs, and $\alpha_3 = 3.5$ as the path loss exponent between RAUs. In our system, the bandwidth is 10 MHz, the noise figure is 9 dB and the noise power spectral density is −174 dBm/Hz. In addition, we assume that $\rho_p$, $\rho_u$ and $\rho_d$ have the same value.
Fig. 3 shows the sum-rate versus the number of RAUs, where the theoretical results are obtained from the derived expressions such as (52). This figure shows that ZF/ZF performs better than MRC/MRT because of its interference suppression capability. It can be seen that as the number of RAUs increases, the sum-rate increases logarithmically, which means that the growth of the sum-rate gradually slows down.
Fig. 4 shows the theoretical and simulated sum-rates versus the number of antennas per RAU, $M$. We set $K_u = K_d = 32$, $\rho_p = \rho_u = \rho_d = 15$ dBm, $L = 8$ and $N_u = N_d = 256$. The theoretical results are consistent with the simulations. As $M$ increases, the sum-rate of the ZF/ZF scheme increases logarithmically, while that of the MRC/MRT scheme tends to a constant. This suggests that it is not necessary to deploy a very large number of antennas at each RAU.
Fig. 5 shows the theoretical and simulated sum-rates versus the number of orthogonal pilot sequences, $L$, with $K_u = K_d = 32$, $\rho_p = \rho_u = \rho_d = 15$ dBm, $N_u = N_d = 256$ and $M = 8$. The sum-rates first increase and then decrease. The reason is that pilot contamination dominates when $L$ is small; as $L$ increases, pilot contamination is reduced and the sum-rates increase. However, when the number of pilot sequences becomes large, the pilot overhead degrades the sum-rates. The optimal number of orthogonal pilot sequences differs between the two schemes: it is around 5 for MRC/MRT and around 8 for ZF/ZF.
As the transmit power increases, and in particular when it becomes large, the sum-rate of the system with MRC/MRT tends to saturate. The reason is that the interference also grows with the transmit power.
Based on the above figures, we conclude that the ZF/ZF scheme performs better than the MRC/MRT scheme. However, considering system scalability and implementation complexity, the MRC/MRT scheme is often adopted in practice. Therefore, a trade-off between complexity and performance is needed when designing a cell-free massive MIMO NAFD system.

B. CONTRASTS OF TDD HALF-DUPLEX SYSTEM AND NAFD SYSTEM
We compare the cell-free massive MIMO NAFD system with the cell-free massive MIMO TDD system. In this comparison, $K_u = K_d = 32$, $\rho_p = \rho_u = \rho_d = 15$ dBm, $L = 8$ and $M = 8$; $N_u = N_d$, and $N = N_u + N_d$ is the variable in this simulation. The bandwidth is 10 MHz, the noise figure is 9 dB and the noise power spectral density is −174 dBm/Hz. Both UEs and RAUs are distributed randomly in a circular area with radius $R = 1$ km.
In Fig. 8, the sum-rates of both the NAFD and TDD systems increase logarithmically as the number of RAUs increases, and the NAFD system outperforms the TDD half-duplex system. This shows that a full-duplex system is more spectrally efficient than a half-duplex system.
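The simulation parameters above translate into a concrete noise power and node layout. The following sketch shows one way to compute them. It assumes the usual link-budget convention, in which the noise figure is added to the thermal noise density integrated over the bandwidth, and it draws distances and angles uniformly as described in the simulation setup. The helper name `drop_nodes` and the seed are hypothetical.

```python
import math
import numpy as np

bandwidth_hz = 10e6                 # 10 MHz
noise_psd_dbm_per_hz = -174.0       # noise power spectral density
noise_figure_db = 9.0

# Total noise power in dBm: density + 10*log10(bandwidth) + noise figure.
noise_power_dbm = noise_psd_dbm_per_hz + 10 * math.log10(bandwidth_hz) + noise_figure_db
noise_power_w = 10 ** ((noise_power_dbm - 30) / 10)   # convert dBm to watts


def drop_nodes(num_nodes, radius_m, rng):
    """Draw distance and angle uniformly and return Cartesian coordinates."""
    distance = rng.uniform(0.0, radius_m, size=num_nodes)
    angle = rng.uniform(0.0, 2 * math.pi, size=num_nodes)
    return np.column_stack((distance * np.cos(angle), distance * np.sin(angle)))


rng = np.random.default_rng(3)
ue_xy = drop_nodes(64, 1000.0, rng)   # K = 64 UEs inside R = 1 km
print(noise_power_dbm, noise_power_w, ue_xy.shape)
```

Note that drawing the distance uniformly places nodes more densely near the origin than a drop that is uniform over the disc area would.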

VI. CONCLUSION
In this paper, we analyzed the spectral efficiency of the cell-free massive MIMO NAFD system with MRC/MRT and ZF/ZF signal processing under pilot contamination. We first established the system model and estimated the CSI of the transmission channels using pilot sequences. We then derived asymptotic expressions for the uplink and downlink SINR and sum-rate with both the MRC/MRT and ZF/ZF schemes under pilot contamination. The numerical simulation results validate the accuracy of these expressions. A comparison of the two schemes shows that ZF/ZF performs better than MRC/MRT. We also compared the cell-free massive MIMO NAFD system with the cell-free massive MIMO TDD system, and the results indicate that the full-duplex system is more spectrally efficient.

APPENDIX A PROOF OF THEOREM 1
Before proceeding, we introduce the following lemma, which plays a basic role in deriving the asymptotic downlink and uplink sum-rates with the MRT/MRC and ZF schemes.
Lemma 1: Let $A \in \mathbb{C}^{M \times M}$ have uniformly bounded spectral norm, and let the vectors $x, y \sim \mathcal{CN}\left(0, \frac{1}{M} I_M\right)$ be mutually independent and independent of $A$ [36], [37]. Then, as $M \to \infty$,

$$x^H A x - \frac{1}{M}\mathrm{Tr}(A) \to 0, \qquad x^H A y \to 0$$

almost surely. This lemma is also used in Appendices B, C and D.
From (16), considering that the precoding matrix is $\kappa_{mrt} W$ with $W = \hat{G}_d$, we apply Lemma 1 to each term of the SINR and express the results through traces of the large-scale fading matrices. Finally, according to the definitions of $\xi_{d,l,i,q}$, $\bar{\xi}_{d,l,i,p,q}$ and $\beta^{MRT}_{d,l,i}$ in (17) ∼ (19), we derive the asymptotic SINR expression of the $i$-th downlink UE using the $l$-th pilot sequence as (16) and the downlink sum-rate with the MRT precoding scheme as (22).
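The concentration behaviour that Lemma 1 relies on can be checked numerically. The sketch below uses the commonly stated trace-lemma form, $x^H A x \approx \frac{1}{M}\mathrm{Tr}(A)$ and $x^H A y \approx 0$ for large $M$, with a diagonal test matrix whose entries lie in $[0, 1]$. The matrix choice, the dimension and the seed are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)
M = 1000


def cn_vector(size):
    """Sample from CN(0, (1/M) I): real and imaginary parts have variance 1/(2M)."""
    return (rng.standard_normal(size) + 1j * rng.standard_normal(size)) / np.sqrt(2 * M)


A = np.diag(rng.uniform(0.0, 1.0, size=M))   # bounded spectral norm (at most 1)
x, y = cn_vector(M), cn_vector(M)

quad = np.vdot(x, A @ x)      # x^H A x (np.vdot conjugates its first argument)
cross = np.vdot(x, A @ y)     # x^H A y
target = np.trace(A) / M      # (1/M) Tr(A)

print(abs(quad - target), abs(cross))
```

Both printed deviations shrink roughly like $1/\sqrt{M}$ as $M$ grows, which is why the random quantities in the SINR can be replaced by deterministic traces when the number of antennas is large.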

APPENDIX B PROOF OF THEOREM 2
We first derive the statistical power normalization factor $\kappa_{zf}$. Starting from its definition and using Lemma 1, we obtain an expression for $\kappa_{zf}$. In the received signal, we can then identify the estimation error power, the UL-to-DL interference power $\rho_u \mathbb{E}\{g_{iui,l,i}^H g_{iui,l,i}\}$ and the noise power $\varepsilon$, where $\omega_{d,l,i}$ is the column of $W$ corresponding to the $i$-th downlink UE using the $l$-th pilot sequence.

APPENDIX C PROOF OF THEOREM 3
According to (30), we can identify the multi-user interference power, while $\hat{g}_{u,l,i}^H \Sigma_u \hat{g}_{u,l,i}$ is the combination of the estimation error power, the DL-to-UL interference power and the noise power. According to the definition of the estimated channel, we obtain the required expectations, where l = p. Next we focus on the covariance matrix $\Sigma_u$. We use $\tau_k$ to represent the $k$-th diagonal element of the matrix involved. Then (76) can be expressed through $\Lambda_{u,l,i} - \Lambda_{u,l,i}\Lambda_{u,l}^{-1}\Lambda_{u,l,i}$ and its $n$-th diagonal element. Finally, combining (73), (74), (81) and the definitions of $\xi_{u,l,i,q}$, $\bar{\xi}_{u,l,i,p,q}$ and $\tilde{\xi}_{u,l,i,q}$ in (35) ∼ (37), the deterministic SINR expression of the $i$-th uplink UE using the $l$-th pilot sequence is derived as (31), and the uplink sum-rate expression with the MRC receiver is derived as (39).

APPENDIX D PROOF OF THEOREM 4
We analyze (40) and find that $\sqrt{\rho_u}\, x_u$ is the data signal, while the remaining terms are the combination of interference and noise. Consequently, the SINR of the $i$-th UE using the $l$-th pilot sequence is expressed in terms of $\bar{\xi}_{u,l,i,q}$, where l = p and $[\bar{\Xi}_{u,l}]_{i,q} = \bar{\xi}_{u,l,i,q}$. Now we focus on $\bar{\xi}_{u,l,i,q}$. Using Lemma 1, we express it through the diagonal matrix $\mathrm{diag}\left(\mu_{u,l,i,q,1}, \cdots, \mu_{u,l,i,q,N_d}\right)$, where $\mu_{u,l,i,q,p}$ is a trace involving a diagonal matrix whose $n$-th diagonal element is

$$\lambda_{iri,p,n} - \frac{\lambda_{iri,p,n}^2}{\sum_{j=1}^{N_d} \lambda_{iri,j,n} + \frac{\varepsilon}{\rho_p}},$$

and through a matrix whose $(a, b)$-th element is a trace of the form $\mathrm{Tr}\left(\Lambda_{d,j,a}\Lambda_{d,j}^{-1}\Lambda_{d,j,b}\,\cdot\right)$. Substituting (92) back into (90), we finally obtain $\bar{\xi}_{u,l,i,q}$. We can then derive the SINR of the $i$-th uplink UE using the $l$-th pilot sequence as (41) and the uplink sum-rate with the ZF receiver as (52).