Multipair Relaying With Space-Constrained Large-Scale MIMO Arrays: Spectral and Energy Efficiency Analysis With Incomplete CSI

In this paper, we study a multi-pair two-way large-scale multiple-input multiple-output (MIMO) decode-and-forward relay system. Multiple single-antenna user pairs exchange information via a shared relay working at half-duplex. The proposed scenario considers a practical case where an increasing number of antennas is deployed in a fixed physical space, giving rise to a trade-off between antenna gain and spatial correlation. The channel is assumed imperfectly known, and the relay employs linear processing methods. We study the large-scale approximations of the sum spectral efficiency (SE) and investigate the energy efficiency (EE) with a practical power consumption model when the number of relay antennas becomes large. We demonstrate the impact of the relay antenna number and spatial correlation with reducing inter-antenna distance on the EE performance. We exploit the increasing spatial correlation to allow an incomplete channel state information (CSI) acquisition where explicit CSI is acquired only for a subset of antennas. Our analytical derivations and numerical results show that applying the incomplete CSI strategy in the proposed system can improve the EE against complete CSI systems while maintaining the average SE performance.


I. INTRODUCTION
RELAY-BASED communication systems have attracted great attention since relay technologies have the potential to enhance cellular coverage and improve network capacity and throughput [1], [2]. The one-way relay may cause spectral efficiency (SE) loss subject to energy efficiency (EE) improvements [3]. Therefore, two-way relay systems, in which users exchange information via a shared relay in a shorter required time, have been introduced to improve the SE [4], [5], [6].
Several relay schemes have been widely studied, such as decode-and-forward (DF) and amplify-and-forward (AF) [7], [8]. The AF protocol is commonly used in most previous studies on multi-pair two-way relaying. However, noise amplification may degrade the performance of AF relaying, whereas DF relaying avoids this problem and achieves better performance in the low signal-to-noise ratio (SNR) regime [9]. Moreover, DF two-way relaying can perform separate linear processing on each relaying communication direction [10], [11]. Ideally, the SE of full-duplex two-way massive multiple-input multiple-output (MIMO) relay systems is expected to double compared to half-duplex [6], [12]. However, in practice, due to hardware deficiencies and the immense power difference between the self-loop interference and the desired signal, perfect self-loop interference cancellation is hard to achieve. We therefore focus on half-duplex operation, where the relay transmits and receives in orthogonal frequency or time resources, which offers practical relevance and simplicity of implementation [2], [11].
Massive MIMO has been popular for next-generation wireless communications because of the ability to achieve higher data rates and improve link reliability by serving numerous users simultaneously and providing large array gain [13], [14]. In the massive MIMO regime, linear processing schemes can help achieve low-complexity transmission and improve system performance [15], [16]. Therefore, it is of great interest to incorporate massive MIMO into multi-pair two-way relaying. With massive arrays at the relay, the main factor limiting system performance, inter-pair co-channel interference, could be mitigated to improve the system performance and network capacity [1], [13], [17], [18].
Note that deploying a large number of antennas in a physically constrained space would increase the spatial correlation because of insufficient inter-antenna distance [13], [19], [20]. Although both spatial correlation and mutual coupling are widely studied in the MIMO literature [21], [22], we neglect the effects of mutual coupling with the assumption that impedance matching techniques compensate them for tractable analysis [23]. The impacts of spatial correlation on ergodic capacity and symbol error rate in one-way relay systems with single-antenna nodes are studied in [24], [25], [26]. References [27] and [6] derived asymptotic power scaling laws with the Zero-forcing reception/Zero-forcing transmission (ZFR/ZFT) and Maximum ratio combining/Maximum ratio transmission (MRC/MRT), respectively. Both works considered correlated channels while [27] illustrated a single-pair massive MIMO full-duplex relay with only two MIMO users, and [6] studied a two-way full-duplex multi-pair massive MIMO relay. Reference [28] derived the SE lower-bound for a spatially correlated massive MIMO two-way full-duplex AF relay, valid for an arbitrary number of relay antennas. Reference [29] studies a multi-pair two-way full-duplex AF massive MIMO relaying system with non-negligible direct links between all user pairs. Furthermore, the spatial correlation between adjacent antennas could lead to the similarity among channels. Exploiting spatial correlation may unexpectedly alleviate the requirement of channel state information (CSI) acquisition [30], [31]. Consequently, we propose a method that involves deactivating a subset of relay antennas and activating the remaining antennas to acquire CSI at the channel estimation stage. The CSI of deactivated relay antennas is obtained by averaging the instantaneous CSI of adjacent activated antennas by exploiting the similarity between spatially correlated channels. 
Moreover, the proposed analysis could trade off the computational complexity and power consumption of Radio frequency (RF) chains in the channel estimation stage against the resulting estimation accuracy.
To the best of our knowledge, no previous work has jointly studied incomplete CSI and spatial correlation in DF multi-pair half-duplex two-way relay systems. We therefore focus on a space-constrained multi-pair two-way half-duplex DF relaying system with linear processing and incomplete CSI acquisition [9], [32]. Although our study can be applied in both time-division-duplex (TDD) and frequency-division-duplex (FDD) scenarios, we concentrate on TDD systems in this paper for their simplicity and practical importance in massive MIMO [13], [33]. To evaluate the proposed relaying system, we present a detailed analysis of the sum SE approximations with MRT and zero-forcing (ZF) processing, together with a practical power consumption model for analyzing the EE performance. Specifically, the main contributions of this paper can be summarized as follows:
• We study a multi-pair two-way DF relaying massive MIMO system deployed in a physically constrained space. This gives rise to an interesting trade-off between antenna gain and spatial correlation when the number of antennas grows large. The large-scale approximations of the sum SE with MRT processing and ZF processing are presented for a large but finite number of antennas and imperfect CSI.
• We investigate a practical and common power consumption model derived from [7], [31]. It is employed to analyze the EE performance of the proposed system with different linear processing schemes and to reveal the gains of the incomplete CSI acquisition.
• We employ and analyze a low-complexity incomplete CSI acquisition scheme, inspired by [31], that exploits the spatial correlation in space-constrained massive MIMO. We analyze the system performance and derive the computational complexity to reveal the benefits of the incomplete CSI acquisition.
Based on the above contributions, our study has revealed a number of insights, which we summarize below:
• The space-constrained deployment of an increasing number of antennas at the relay introduces a trade-off in spatial diversity: increasing the number of antennas as signal sources enhances diversity, while the resulting diminishing inter-antenna spacing introduces spatial correlation and limits the diversity gains;
• The increased spatial correlation due to the space-constrained deployment can be readily exploited through incomplete CSI acquisition, where significant complexity gains in CSI acquisition are traded off against the impact of inaccurate CSI on the system SE;
• The space-constrained antenna deployment results in saturating gains in SE as the number of antennas increases. As the consumed power persistently increases with the number of antennas, this results in a concave EE performance with respect to the number of antennas and in scenario-dependent optimal numbers of antennas;
• Incomplete CSI provides higher EE performance than full CSI in such scenarios while reducing the CSI acquisition complexity and even the total power consumption.
To the best of our knowledge, no other previous work has studied CSI relaxation with spatial correlation and a systemic complexity analysis in the proposed scenario. Our results show, for the first time, that acquiring CSI for as few as half of the deployed antennas in a space-constrained antenna deployment incurs negligible SE performance loss while significantly enhancing the system's EE, especially with moderate spatial correlation.
The remainder of this paper is organized as follows: Section II introduces the space-constrained multi-pair two-way half-duplex DF relaying system model with linear processing methods and imperfect CSI.
Section III presents the large-scale approximations of the sum SE and characterizes the EE with a practical power consumption model, and Section IV demonstrates the incomplete CSI acquisition scheme with corresponding system performance and complexity analysis. Our numerical results are depicted in Section V. Finally, the conclusion is given in Section VI.
Notation: In this paper, $H^T$, $H^H$, $H^*$ and $H^{-1}$ represent the transpose, conjugate transpose, conjugate and inverse of the matrix $H$, respectively. Furthermore, $I_M$ is an $M \times M$ identity matrix. Then, $|\cdot|$, $\|\cdot\|$, $\|\cdot\|_2$ and $\|\cdot\|_F$ denote the absolute value, standard norm, Euclidean norm and Frobenius norm, respectively. $\mathcal{CN}(0, \cdot)$ stands for the circularly symmetric complex Gaussian distribution with zero mean and the covariance given by its second argument. Finally, $E\{\cdot\}$ is the expectation operator and $\mathrm{vec}\{\cdot\}$ is the vectorization operator.

A. SPATIALLY-CORRELATED CHANNEL MODEL
As shown in Fig. 1, a space-constrained multi-pair two-way half-duplex DF relay system is investigated. The $K$ pairs of single-antenna users are denoted $T_{A,i}$ and $T_{B,i}$, where the subscripts $A$ and $B$ identify the two communication nodes of pair $i$, $i = 1, \ldots, K$. User pairs exchange information with each other via the relay $T_R$, which is equipped with $M \gg K$ closely spaced antennas. A spatially correlated Rayleigh fading system under the TDD protocol is adopted [34], [35], [36]. The uplink and downlink channels between $T_{X,i}$ ($X = A, B$) and $T_R$ are defined as $h_{XR,i} \in \mathbb{C}^{M \times 1}$ and $h_{XR,i}^T$, $i = 1, \ldots, K$, respectively. Hence, the uplink channel matrix is given by $H_{XR} = [h_{XR,1}, \ldots, h_{XR,K}] \in \mathbb{C}^{M \times K}$, where $X = A, B$. In particular, the uplink channel from $T_{X,i}$ to the relay in the spatially correlated system is defined in Eqn. (1), following [35], [37], in terms of $g_{XR,i} \sim \mathcal{CN}(0, \beta_{XR,i} I_{L_i})$, where $L_i$ denotes the number of signal propagation paths with different angles of departure. For simplicity, we consider $L_i = L$ ($i = 1, \ldots, K$) in the following. $\beta_{AR,i}$ and $\beta_{BR,i}$ represent the large-scale fading, comprising path loss and shadowing. $A_i \in \mathbb{C}^{L \times M}$ is the transmit steering matrix at $T_{X,i}$ [38].
In this paper, the topology of uniform rectangular arrays (URAs) is used, and the respective horizontal and vertical array steering vectors are characterized in [35]; the horizontal steering vector is
$$a_h(\theta_{i,l}, \phi_{i,l}) = \left[1, e^{j2\pi [d_h \sin(\phi_{i,l}) \sin(\theta_{i,l})]}, \ldots, e^{j2\pi [(M_h - 1) d_h \sin(\phi_{i,l}) \sin(\theta_{i,l})]}\right],$$
where $\theta_{i,l}$ and $\phi_{i,l}$ ($l = 1, \ldots, L$) are the azimuth and elevation angles of departure, respectively [39]. The steering vector $a(\theta_{i,l}, \phi_{i,l})$ for the specific departure direction $(\theta_{i,l}, \phi_{i,l})$ at the $i$-th user can be decomposed using the vectorization operator $\mathrm{vec}\{\cdot\}$, which maps an $m \times n$ matrix to an $mn \times 1$ column vector. $M_h$ and $M_v$ represent the numbers of antennas deployed in the horizontal and vertical directions, respectively. As shown in Fig. 1 ...
Note that, in practice, the inter-user distance is greater than $\lambda$, so it is readily assumed that there is no receive correlation [35].
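As an illustration of the horizontal steering vector above, the following minimal Python sketch evaluates its entries for a single departure direction. It assumes that the spacing $d_h$ is expressed in wavelengths; the function name and the example values are hypothetical and are not taken from the paper.

```python
import cmath
import math


def horizontal_steering_vector(theta, phi, m_h, d_h):
    """Entries 1, e^{j*2*pi*d_h*sin(phi)*sin(theta)}, ..., for m_h elements.

    theta: azimuth angle of departure (radians)
    phi:   elevation angle of departure (radians)
    m_h:   number of antennas in the horizontal direction
    d_h:   horizontal inter-antenna spacing (assumed normalized by the wavelength)
    """
    # Phase progression between two adjacent horizontal elements.
    phase_step = 2.0 * math.pi * d_h * math.sin(phi) * math.sin(theta)
    return [cmath.exp(1j * m * phase_step) for m in range(m_h)]


# Illustrative values only: four elements at half-wavelength spacing.
a_h = horizontal_steering_vector(theta=math.pi / 2, phi=math.pi / 2, m_h=4, d_h=0.5)
```

For a fixed direction, reducing `d_h` shrinks the phase step between adjacent elements, so neighbouring antennas see increasingly similar responses; this is the similarity that the space-constrained setting exploits.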

B. MULTIPLE ACCESS BROADCAST TRANSMISSION PROCESS
In our study, the data transmission process consists of two phases of equal time-slot length; this two-phase protocol is called the Multiple Access Broadcast (MABC) protocol [11]. In the Multiple Access Channel (MAC) phase, all users transmit signals to the relay at the same time, and the received signal at the relay is given by Eqn. (6) [7], [9], [40], where $x_{XR,i}$ is the Gaussian signal with zero mean and unit power transmitted by the $i$-th user $T_{X,i}$, $p_{X,i}$ is the transmit power of $T_{X,i}$ ($X = A, B$), and $n_R$ represents the additive white Gaussian noise (AWGN) vector at the relay, with independent and identically distributed (i.i.d.) $\mathcal{CN}(0, 1)$ elements. We assume low-complexity linear processing at the relay, with which the transformed signal is expressed as in Eqn. (7), where $F_{MAC} \in \mathbb{C}^{2K \times M}$ is the linear receiver matrix. Then, in the Broadcast Channel (BC) phase, the relay decodes the received signal and then encodes and broadcasts the information to the users [9]. The linear precoding matrix $F_{BC} \in \mathbb{C}^{M \times 2K}$ is deployed to generate the transmit signal of the relay as in Eqn. (8), in terms of the decoded signal and the normalization coefficient $\rho_{DF}$, which is set by the relay power constraint $E\{\|y_t\|^2\} = p_r$. To this end, the received signals at $T_{X,i}$ ($X = A, B$) are given by Eqn. (9), considering the standard AWGN at $T_{X,i}$.

C. LINEAR PROCESSING AND CHANNEL ESTIMATION
We consider two linear processing methods, namely, MRT and ZF processing. For brevity, we assume F MAC and F BC in Eqns. (7)-(8) as the general linear processing matrices in the MAC phase and the BC phase respectively. Further respective calculations of F MAC ∈ C 2K×M and F BC ∈ C M×2K for MRT and ZF processing are studied in Section III-B. Firstly, the MRT and ZF processing matrices for the proposed system are supplied by

1) MRT PROCESSING
The linear receiver matrix in the MAC phase is given by Eqn. (10), while the linear precoding matrix in the BC phase is defined by Eqn. (11).

2) ZF PROCESSING
The linear receiver matrix in the MAC phase is given by Eqn. (12), and the linear precoding matrix in the BC phase by Eqn. (13). Here, $\hat{H}_{XR}$, $X = A, B$, represents the estimated channel matrices, which are defined later. Furthermore, with the expression of the linear precoding matrix $F_{BC}$, the normalization coefficient in the BC phase with Eqn. (8) is given by Eqn. (14).
In addition to linear processing methods, imperfect CSI is also considered, as perfect CSI is not attainable in practical applications. To obtain channel estimates at the relay, pilot symbols are transmitted in TDD systems [9], [31]. In this case, $\tau_p$ symbols are used as pilots for channel estimation in each coherence interval of $\tau_c$ symbols. Accordingly, we apply a minimum mean square error (MMSE) channel estimator, in which case the channel estimates are given by Eqn. (15) [17], [41], where $\hat{h}_{XR,i}$ and $e_{XR,i}$ are the $i$-th columns of the estimated matrices $\hat{H}_{XR}$ and the estimation error matrices $E_{XR}$, $X = A, B$, respectively. Note that $\hat{g}_{XR,i}$, derived from $g_{XR,i}$ defined in Eqn. (1), is uncorrelated with $q_{XR,i}$. The elements of $g_{XR,i}$ and $q_{XR,i}$ are Gaussian random variables with zero mean, while $p_p$ is the transmit power of each pilot symbol [9].

A. EXACT RESULTS OF SPECTRAL EFFICIENCY
In this section, the sum SE performance of the proposed system is investigated. In the MAC phase, based on Eqns. (6)-(7), the transformed signal at the relay associated with the i-th user pair is given by where z X r,i is defined as and as shown in Eqn. (7). Meanwhile, the transformed signal vector at the relay can be expressed as z R = (z A r +z B r ) ∈ C K×1 . With the assistance of Eqns. (16)-(17), we can specify the power of the estimation error, inter-user interference and compound noise in z R,i as respectively. With the expressions of desired signals in Eqn. (17), the signal-to-noise-plus-interference ratio (SINR) from T X,i (X = A, B) to the relay can be obtained as In addition, this paper applies the standard lower capacity bound related to the worst-case uncorrelated additive noise [9], [42]. The lower bounding method was suggested in [43] and to avoid repetition, we refer the reader to [43] for a more systematic study. To this end, the achievable SE of the i-th user pair in the MAC phase can be given on the bottom of this page.
Meanwhile, the SE from the user $T_{X,i}$ ($X = A, B$) to the relay is denoted $R_{XR,i}$. In the BC phase, $F_{BC}$ is applied to generate the relay's transmit signal, and the received signal at $T_{X,i}$ is then obtained by Eqn. (9). For a more detailed description, $z_{A,i}$ is given in Eqn. (24); to obtain $z_{B,i}$, the subscripts "AR", "BR" in the channel vectors and estimation error vectors, the subscripts "RA", "RB" in the linear precoding vectors, and "A", "B" in the signal and noise terms of $z_{A,i}$ are replaced with the subscripts "BR", "AR", the subscripts "RB", "RA", and "B", "A", respectively. With the assistance of Eqn. (24), the SINR from the relay to $T_{X,i}$ ($X = A, B$), denoted $\mathrm{SINR}_{RX,i}$, can be obtained straightforwardly.
Consequently, the SE from the relay to the $i$-th user $T_{X,i}$ ($X = A, B$) is denoted $R_{RX,i}$. Moreover, the SE of the $i$-th user pair in the BC phase is obtained by the minimum sum of the end-to-end SE from $T_{A,i}$ to $T_{B,i}$ and the end-to-end SE from $T_{B,i}$ to $T_{A,i}$ [9]. Based on the above, the sum SE of the proposed system is the total SE of all user pairs, where $R_i$ for the $i$-th user pair is determined by the minimum SE in the two phases [44], [45].
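The way the per-phase quantities combine into the sum SE can be sketched as follows. This is a minimal illustration, not the paper's exact expression: it assumes that the end-to-end SE of each direction is limited by the weaker of its two hops (the usual DF convention), and it takes the per-hop SE values as given inputs, since their closed forms are not reproduced here. All function and argument names are hypothetical.

```python
def pair_se(r_mac, r_ar, r_rb, r_br, r_ra):
    """SE of one user pair as the minimum over the MAC and BC phases.

    r_mac: achievable SE of the pair in the MAC phase
    r_ar, r_br: SE from users A and B to the relay
    r_ra, r_rb: SE from the relay to users A and B
    """
    # Assumption: each end-to-end direction is limited by its weaker hop.
    a_to_b = min(r_ar, r_rb)
    b_to_a = min(r_br, r_ra)
    r_bc = a_to_b + b_to_a
    # The pair's SE is determined by the minimum SE of the two phases.
    return min(r_mac, r_bc)


def sum_se(pairs):
    """Total SE over all user pairs; each item is a 5-tuple for pair_se."""
    return sum(pair_se(*p) for p in pairs)


# Illustrative values only (arbitrary units).
example_pairs = [(3.0, 2.0, 1.5, 1.0, 2.5), (2.0, 2.0, 2.0, 2.0, 2.0)]
total = sum_se(example_pairs)
```

In the first example pair the BC phase is the bottleneck (1.5 + 1.0 = 2.5 < 3.0), while in the second the MAC phase limits the pair.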

B. LARGE-SCALE APPROXIMATIONS OF SPECTRAL EFFICIENCY
This section investigates the respective large-scale approximations of the sum SE with two linear processing methods in the case of M increasing to large but finite numbers.

1) MAXIMUM RATIO METHOD
In this subsection, we study the SE approximations of the system with MRT processing. The linear processing matrices defined in Eqns. (10)-(11) can be rewritten as respectively. The approximation of the normalization coefficient $\rho_{DF}$ in Eqn. (14) is obtained by When $M$ increases to large but finite numbers, the approximations of the achievable SE in the MAC phase and of the SE between the relay and user pairs defined in Eqns. (18)-(23) and (25)-(26) can be given, respectively. For simplicity, we assume = $AA^H$ and $A_i = A_j$ ($i, j = 1, \ldots, K$), which are also applied in the following simulations. Based on Eqns. (33)-(34), $\hat{R}_{BR,i}$ and $\hat{R}_{RB,i}$ can be obtained by using $p_{B,i}$, $p_{A,i}$ and the subscripts "BR", "AR" to replace $p_{A,i}$, $p_{B,i}$ and the subscripts "AR", "BR" in $\hat{R}_{AR,i}$ and $\hat{R}_{RA,i}$, respectively. Therefore, the SE approximation in the BC phase can be obtained by Based on the above, the approximation of the sum SE of the proposed system, $\hat{R}$, is associated with $\hat{R}_i$ and the respective SE approximations in the MAC and BC phases.
Proof: Please see the Appendix.

2) ZERO FORCING METHOD
In this subsection, the SE approximations of the system with ZF processing are studied.
When $M$ increases to a large number, the inner product of any two columns in $\hat{G}_{XR}$ can be simplified as [19], [46], [47] Therefore, when the number of relay antennas becomes large, the inner product of any two columns in the estimated channel matrices $\hat{H}_{XR}$ ($X = A, B$) can be approximated as in Eqn. (38). With Eqn. (38), the ZF linear processing matrices $F_{MAC}$ and $F_{BC}$ defined in Eqns. (12)-(13) can be approximated as respectively. The approximation of the normalization coefficient $\rho_{DF}$ can be given by According to the above analysis, and highlighting the property of ZF processing of eliminating inter-channel interference [48], the approximations of the SE in the MAC phase and the corresponding approximations of the SE between the relay and user pairs can be given, respectively. Based on Eqns. (43)-(44), $\hat{R}_{BR,i}$ and $\hat{R}_{RB,i}$ can be obtained by replacing the transmit powers $p_{A,i}$, $p_{B,i}$ and the subscripts "AR", "BR" in $\hat{R}_{AR,i}$ and $\hat{R}_{RA,i}$ with the transmit powers $p_{B,i}$, $p_{A,i}$ and the subscripts "BR", "AR", respectively. Then, the SE approximation in the BC phase can be expressed as Accordingly, the approximation of the sum SE, $\hat{R}$, related to $\hat{R}_i$ and the SE approximations in the MAC and BC phases with $R_{1,i} - \hat{R}_{1,i} \to 0$ and $R_{2,i} - \hat{R}_{2,i} \to 0$, respectively, while $R_i - \hat{R}_i \to 0$, is given by
Proof: Please see the Appendix.

C. ENERGY EFFICIENCY
Accordingly, the EE can be defined as the ratio of the sum SE to the total power consumption of the system, given by [14], where R represents the sum SE defined in Eqn. (28) and P tot represents the total power consumption.
In practice, the total power consumption consists of the transmit powers, the power consumed by static circuits, and the power consumed by the RF components in all RF chains. Here, $P_{sta}$ is the power consumption of all static circuits in the system [14]. Typically, each antenna is connected to one RF chain in a digital beamforming architecture [14], [49]. In this paper, we consider a common power consumption model in which the respective power consumption of the user and the relay in the proposed system is defined as in [7], where $P_{X,i}$ is the power required at the $i$-th user $T_{X,i}$ ($X = A, B$) and $P_r$ is the transmission power consumed by the relay. $\zeta_i$ and $\zeta_r$ are the respective power amplifier efficiencies for the $i$-th user and the relay. The power consumed by the RF components is defined for the $i$-th single-antenna user and for the $M$-antenna relay; for the relay it is given by
$$P_{RF,r} = M \left(P_{DAC,r} + P_{mix,r} + P_{filt,r}\right) + P_{syn,r}, \tag{52}$$
where $P_{syn}$ is the power consumed by the frequency synthesizer, and $P_{DAC}$, $P_{mix}$ and $P_{filt}$ are the power consumption of the digital-to-analog converters (DACs), signal mixers and filters, respectively [31]. For the sake of simplicity, $\zeta_i = \zeta_r = \zeta$, $P_{DAC,i} = P_{DAC,r} = P_{DAC}$, $P_{mix,i} = P_{mix,r} = P_{mix}$, $P_{filt,i} = P_{filt,r} = P_{filt}$ and $P_{syn,i} = P_{syn,r} = P_{syn}$ for $i = 1, 2, \ldots, K$ are assumed in the following simulations.
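A minimal numerical sketch of the RF power term and of the EE ratio is given below. It assumes that Eqn. (52) groups the per-chain components (DAC, mixer, filter) under the antenna count, with a single frequency synthesizer per node, and that the single-antenna user is the special case of one chain. The component values are those listed in Section V; the function names are hypothetical.

```python
# Component powers in watts, as listed for the simulations in Section V.
P_DAC = 7.8e-3
P_MIX = 15.2e-3
P_FILT = 10e-3
P_SYN = 25e-3


def rf_power(num_antennas, p_dac=P_DAC, p_mix=P_MIX, p_filt=P_FILT, p_syn=P_SYN):
    """RF-chain power of a node: per-chain components scale with the antennas.

    Assumed reading of Eqn. (52): M * (P_DAC + P_mix + P_filt) + P_syn.
    """
    return num_antennas * (p_dac + p_mix + p_filt) + p_syn


def energy_efficiency(sum_se, total_power):
    """EE as the ratio of the sum SE to the total power consumption."""
    if total_power <= 0:
        raise ValueError("total power must be positive")
    return sum_se / total_power


relay_rf = rf_power(100)  # relay with M = 100 antennas (illustrative)
user_rf = rf_power(1)     # single-antenna user
```

Because the RF term grows linearly in the number of antennas while the SE gains flatten in a fixed physical space, the ratio computed by `energy_efficiency` eventually stops improving as antennas are added.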

IV. INCOMPLETE CSI ACQUISITION
In practical massive MIMO relay systems, the dense deployment of relay antennas leads to an increasing spatial correlation between adjacent antennas. Higher spatial correlation yields greater similarity between the channels of these closely spaced antennas [13], [31]. Therefore, apart from imperfect CSI, we implement incomplete CSI acquisition by collecting the CSI for a subset of relay antennas during the channel estimation stage and generating the CSI of the remaining antennas via an averaging method. This takes advantage of the similarity of the channels and can dramatically reduce the CSI overhead in transmission. In Fig. 2, an example of the active antenna distribution is displayed. Let $B$ and $C$ represent the subsets of indices of the active and inactive relay antennas during the channel estimation stage, respectively, with $|B| = N_c$ and $|C| = M - N_c$ [31], [50]. The CSI distribution pattern is designed as follows:
1) First, to guarantee that each inactive relay antenna has at least one adjacent active antenna, $N_c / M > 0.3$ is considered.
2) The basic number of antennas with CSI in each row should be $\lfloor N_c / M_v \rfloor$, evenly distributed.
3) Then, we shift the pattern circularly to determine the CSI distribution in the following rows.
4) When $N_c > M_v \times \lfloor N_c / M_v \rfloor$, the additional $N_c - M_v \times \lfloor N_c / M_v \rfloor$ antennas with CSI should be added from the last row with the same even distribution pattern.
This CSI distribution ensures that the antennas with CSI are evenly distributed and that each inactive antenna has at least one adjacent active antenna. For the sake of simplicity, the channel estimates for the active antennas in $B$ are obtained by Eqn. (15), and the channel estimates of the inactive antennas in $C$ are obtained by averaging the estimated vectors of the adjacent active antennas,
$$\hat{h}_{C_j} = \frac{1}{M_{C_j}} \sum_{a=1}^{M_{C_j}} \hat{h}_{B^{C_j}_a},$$
where $M_{C_j}$ represents the number of antennas with instantaneous CSI that are used to approximate the CSI of the inactive antenna $C_j$.
Intuitively, the CSI of the C j -th antenna is computed through averaging the CSI of its closest active antennas, B C j a , a = 1, . . . , M C j . With the averaging method, the actual channel estimate of inactive antenna C j can also be derived via averaging the actual channel estimate of active antennas,
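The averaging step can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions: "adjacent" is taken to mean the horizontally and vertically neighbouring elements of the rectangular array, each antenna carries a single complex coefficient (one user), and the function name and the example grid are hypothetical.

```python
def fill_inactive_csi(h_est, active):
    """Fill the estimates of inactive antennas by averaging active neighbours.

    h_est:  2-D list (rows x cols) of complex estimates, valid where active is True
    active: 2-D list of booleans marking antennas with acquired CSI
    """
    rows, cols = len(active), len(active[0])
    filled = [row[:] for row in h_est]
    for r in range(rows):
        for c in range(cols):
            if active[r][c]:
                continue
            # Assumption: adjacency means the 4-neighbourhood on the array grid.
            neighbours = [
                h_est[rr][cc]
                for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                if 0 <= rr < rows and 0 <= cc < cols and active[rr][cc]
            ]
            if not neighbours:
                raise ValueError("inactive antenna without an adjacent active antenna")
            filled[r][c] = sum(neighbours) / len(neighbours)
    return filled


# Illustrative 2 x 2 array with a checkerboard pattern of active antennas.
active_mask = [[True, False], [False, True]]
estimates = [[1 + 1j, 0j], [0j, 3 - 1j]]
completed = fill_inactive_csi(estimates, active_mask)
```

The `ValueError` branch mirrors the design requirement that every inactive antenna has at least one adjacent active antenna.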

A. POWER CONSUMPTION MODEL
Considering both the total power consumption model defined in Section III-C and the proposed incomplete CSI acquisition model, we note that only N c RF chains are required to generate the transmit signals. Therefore, the power consumption of RF components in RF chains for the relay with N c active antennas can be given by With the assistance of Eqns. (48)- (51) and (55), the proposed incomplete CSI acquisition model can achieve power reductions in the total power consumption as · P DAC + P mix + P filt + K + 1 2 P syn + P sta . (56)
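To give a feel for the RF-chain saving, the sketch below compares the relay RF power with all $M$ chains against the case with only $N_c$ chains. It assumes the same grouping as the full-array RF model, with $M$ replaced by $N_c$, and uses the component values listed in Section V; the function names and the example ratio are hypothetical, and the remaining terms of the total power are not modelled.

```python
def relay_rf_power(active_chains, p_dac=7.8e-3, p_mix=15.2e-3, p_filt=10e-3, p_syn=25e-3):
    """Relay RF power (watts) when `active_chains` RF chains are in use.

    Assumed form: active_chains * (P_DAC + P_mix + P_filt) + P_syn.
    """
    return active_chains * (p_dac + p_mix + p_filt) + p_syn


def rf_power_saving(m, n_c):
    """RF power saved at the relay by using N_c of the M chains."""
    if not 0 < n_c <= m:
        raise ValueError("n_c must satisfy 0 < n_c <= m")
    return relay_rf_power(m) - relay_rf_power(n_c)


# Illustrative case: half of M = 100 antennas are active.
saving = rf_power_saving(m=100, n_c=50)
```

Under these assumptions the saving grows linearly with the number of deactivated chains, $M - N_c$.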

B. COMPLEXITY ANALYSIS
In this section, we investigate the total computational complexity of the proposed system and analyze the complexity reduction achieved by the incomplete CSI strategy. Based on the system model in Section II, the signal processing operations performed in the proposed multi-pair two-way TDD relay system can be divided into three phases, namely, the channel estimation phase, the MAC phase, and the BC phase.
To this end, the total complexity of the proposed system can be expressed as follows.

1) CHANNEL ESTIMATION PHASE
The main computational complexity of the MMSE estimator lies in computing the matrix inversion [41]. For simplicity, long-term statistics are assumed to remain constant, and the effect of the channel coherence time is taken into account. Therefore, the complexity can be simplified as [41], [51] where $\tau_p$ is the length of the pilot sequence. Meanwhile, since the CSI of the inactive antennas is obtained by averaging the CSI of adjacent active antennas, the additional flops are related to the number of adjacent active antennas $M_{C_j}$ of the $C_j$-th inactive antenna. For generality, we assume that all inactive antennas have an equal number of adjacent active antennas. Accordingly, the additional flops, with two complex-scalar additions considered, are [52]
$$C^{est}_{avr} = 2K \cdot 4(M - N_c).$$
Here we assume that each complex-scalar addition requires two real flops. Remark that the division by M C j shown in Eqn. (54) does not introduce extra complexity. In this case, the complexity of the channel estimation phase can be stated by
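The averaging overhead can be evaluated directly from the expression $2K \cdot 4(M - N_c)$. The sketch below makes the factor of four explicit as two complex-scalar additions of two real flops each; the function name, parameter names and example values are hypothetical.

```python
def averaging_flops(num_pairs, m, n_c, additions=2, flops_per_addition=2):
    """Real flops spent on averaging: 2K * (additions * flops) * (M - N_c).

    num_pairs: K, the number of user pairs (2K single-antenna users in total)
    m:         total number of relay antennas
    n_c:       number of active antennas with acquired CSI
    """
    if not 0 <= n_c <= m:
        raise ValueError("n_c must satisfy 0 <= n_c <= m")
    return 2 * num_pairs * additions * flops_per_addition * (m - n_c)


# Illustrative case: K = 10 pairs, M = 100 antennas, half of them active.
extra = averaging_flops(num_pairs=10, m=100, n_c=50)
```

With complete CSI ($N_c = M$) the term vanishes, so this overhead is only paid when antennas are deactivated during channel estimation.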

2) MAC PHASE
The overall complexity of the MAC phase depends on the operations involved in generating the precoding matrix and calculating the transformed signals. First, the complexity of implementing the precoding matrix can be given by [53] where τ MAC is the number of symbols in the MAC phase. Moreover, compared with MRT processing, ZF processing introduces extra computational complexity related to the conventional singular value decomposition (SVD) approach used to compute the pseudo-inverse precoding matrix [51], [52]. Thus, the complexity of this process is given by [50] Subsequently, the generated precoding matrix is multiplied with the symbols to generate the transformed signal. The complexity of computing the transformed signal in Eqn. (7) is given by Hence, with MRT processing, the complexity of MAC phase is given by C MAC MRT = C MAC imp +C MAC comp . The complexity of MAC phase for ZF processing is given by C MAC

3) BC PHASE
The BC phase also generates a precoding matrix and computes the precoded signal. Moreover, decoding the received information at the relay introduces additional complexity. The complexity of decoding is given by [53] where τ BC is the number of symbols in the BC phase. Besides, the complexity of implementing precoding matrix is given by As with the MAC phase, when ZF processing is applied, the SVD approach is applied to calculate the pseudo-inverse precoding matrix. Therefore, the complexity of this process is Furthermore, the complexity of computing the transmit signal in Eqn. (8) can be expressed as Thus, the complexity of the BC phase can be given by inv for MRT processing and ZF processing respectively.

4) TOTAL COMPLEXITY
The total complexity of the proposed system with different processing methods (X = MRT, ZF) can be given by

V. NUMERICAL RESULTS
In this section, numerical results are presented to assess the performance of the proposed space-constrained two-way relaying system. The following parameters are used unless otherwise specified. We consider a standard Long-Term Evolution (LTE) frame with a coherence interval length $\tau_c = 196$ (symbols) and a pilot sequence length $\tau_p = 2K$ [9]. An equal number of angular directions $L = 200$ is applied, and the angle spreads of the azimuth and elevation angles of departure are fixed to $\pi/4$ and $\pi/3$ radians, respectively. For the power consumption model, we assume that $p_{A,i} = p_{B,i} = p_u$, $i = 1, \ldots, K$, $P_{DAC} = 7.8$ mW, $P_{mix} = 15.2$ mW, $P_{filt} = 10$ mW, $P_{syn} = 25$ mW and $P_{sta} = 2$ W [31]. With the considered path loss model, the users' large-scale fading parameters are different and can be obtained as $\beta_k = (\varphi / D_k^{\nu})^{1/2}$, where $\varphi$ represents the shadowing effect with typical values ranging from 2 to 6, $D_k$ is the distance between the $k$-th user and the relay, and $\nu$ is the path loss exponent [54], [55]. It is assumed that the relay is located in the center of a cell and all users are randomly distributed within the cell. To make the results more practical, we consider β AR =  (28). We can observe that the large-scale approximations closely match the numerical results, and that the sum SE increases with increasing inter-antenna distance as the spatial correlation becomes smaller. However, the growth trend decelerates when $d > 0.5\lambda$. A larger transmit power of the pilot symbols achieves a better sum SE performance due to the enhanced estimation accuracy, and the performance gradually approaches the ideal one with perfect CSI. In addition, a growing number of relay antennas also has a positive impact. Our future work will study more precise power allocation.
In Fig. 4 (a), simulation results are presented to show the effects of an increasing number of relay antennas on the sum SE. With the increase of $M$, the sum SE grows without limit as there is no pilot contamination. Similar to Fig. 3, a larger inter-antenna distance helps to obtain a better SE performance. However, it is critical to note that in all cases, as $M$ increases, the benefits of adding further antennas decline. Fig. 4 (b) highlights the relationship between the EE and the number of relay antennas $M$. The power consumption increases linearly with the number of antennas, while the growth of the SE slows at large $M$. Accordingly, it is shown that the EE performance saturates after reaching a particular EE value. To this end, with a fixed inter-antenna distance, the EE performance can be optimized with a specific $M$, reducing power consumption at the cost of some SE performance.
Fig. 5 shows the sum SE versus the inter-antenna distance $d$ with ZF processing. Note that the "Approximations" are obtained by applying Eqns. (42)-(46), and the "Numerical Results" are generated by Eqns. (18)-(23) and (25)-(28). We can see that the sum SE increases as the inter-antenna distance increases and begins to saturate when $d > 0.5\lambda$. Furthermore, compared with Fig. 3, ZF processing can outperform MRT processing by eliminating inter-user and inter-pair interference. Moreover, increasing the channel estimation accuracy with a higher transmit power of the pilot symbols moves the SE performance toward that of perfect CSI. A greater $M$ is also beneficial to the sum SE.
Fig. 6 (a) displays the relation of the sum SE to the number of relay antennas. The sum SE grows unboundedly with increasing $M$. Similar to Fig. 5, a greater inter-antenna distance is beneficial to the SE performance. The power consumption increases linearly, while the growth of the SE slows with increasing $M$.
Thus, in Fig. 6 (b), we can observe that the EE performance would slowly decline after achieving an optimal EE with a specific M. Compared with Fig. 4, system performances of ZF processing can outperform those of MRT processing, which is also clarified by Fig. 5. Furthermore, the ideal approximations of matrix inversion in ZF processing result in less accuracy when M is small. In the non-asymptotic regime, the numerical results could closely match approximations when M becomes larger.
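The EE optimum at a finite M can be illustrated with a toy calculation. The sketch below assumes a power model in which the static and synthesizer terms are fixed and each antenna adds one DAC, mixer and filter, and it replaces the sum SE by a hypothetical logarithmic function of M. Both the power model structure and the function `toy_sum_se` are assumptions made for illustration; they are not the paper's expressions.

```python
import math

# Per-component powers quoted in the text, in watts.
P_DAC, P_MIX, P_FILT, P_SYN, P_STA = 7.8e-3, 15.2e-3, 10e-3, 25e-3, 2.0


def total_power(m, p_tx=1.0):
    """Assumed model: fixed terms plus a per-antenna RF chain cost."""
    return P_STA + P_SYN + p_tx + m * (P_DAC + P_MIX + P_FILT)


def toy_sum_se(m, num_pairs=5, gain=0.1):
    """Hypothetical stand-in for the sum SE: logarithmic growth in M."""
    return 2 * num_pairs * math.log2(1.0 + gain * m)


def energy_efficiency(m):
    """Ratio of the toy sum SE to the assumed total power."""
    return toy_sum_se(m) / total_power(m)


candidates = range(10, 401)
best_m = max(candidates, key=energy_efficiency)
print(f"EE peaks at M = {best_m} in this toy model")
```

Because the numerator grows logarithmically while the denominator grows linearly, the ratio rises, peaks, and then declines, which matches the qualitative behavior reported for Fig. 4 (b) and Fig. 6 (b).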

B. INCOMPLETE CSI
1) COMPLEXITY ANALYSIS
The total complexity is studied in Fig. 7. Fig. 7 (a) shows the complexity of the incomplete CSI strategy for an increasing ratio of active antennas and compares it with that of complete CSI acquisition. First, the matrix inversion in ZF processing incurs additional computational complexity; therefore, the proposed system has a higher complexity with ZF processing than with MRT processing. We can also observe that the incomplete CSI strategy reduces the total complexity. Fig. 7 (b) shows that the total complexity increases dramatically with the number of relay antennas. Similar to Fig. 7 (a), a smaller ratio of active antennas reduces the computational complexity.
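To make the trend concrete, the following sketch uses rough, order-of-magnitude multiplication counts for forming the MRT and ZF filters when only N_c of the M antennas are active. The counting rules (one multiplication per channel coefficient for MRT; a Gram matrix, a cubic-order inversion and a filter product for ZF) are simplifying assumptions and are not the complexity expressions behind Fig. 7.

```python
def mrt_multiplications(n_active, num_users):
    """Matched filtering: one complex multiply per channel coefficient."""
    return n_active * num_users


def zf_multiplications(n_active, num_users):
    """ZF: Gram matrix, U x U inversion (cubic order), then filter formation."""
    gram = n_active * num_users ** 2
    inversion = num_users ** 3
    filter_formation = n_active * num_users ** 2
    return gram + inversion + filter_formation


# Hypothetical system size for illustration only.
M, K = 200, 5
U = 2 * K  # total number of single-antenna users
for ratio in (0.25, 0.5, 0.75, 1.0):
    n_c = int(ratio * M)
    print(f"N_c/M = {ratio:.2f}: MRT {mrt_multiplications(n_c, U)}, "
          f"ZF {zf_multiplications(n_c, U)}")
```

Under these assumptions, ZF is always more expensive than MRT, and both counts shrink as the ratio of active antennas decreases.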

2) MRT PROCESSING
The effect of the inter-antenna distance on the sum SE and EE performance with incomplete CSI acquisition and MRT processing is shown in Fig. 8. Fig. 8 (a) shows that an increasing inter-antenna distance improves the sum SE. Moreover, a larger number of active antennas helps achieve a higher sum SE, and the sum SE saturates when d > 0.5λ, especially when N_c ≤ 0.5 × M. The EE performance is shown in Fig. 8 (b). It can be observed that when d is small, i.e., when the spatial correlation is high, a smaller number of active antennas with CSI can closely approach the EE achieved with full CSI. When d > 0.5λ, the EE also approaches saturation with incomplete CSI acquisition. To this end, spatial correlation and incomplete CSI acquisition can jointly provide EE benefits and complexity reduction.
Fig. 9 shows the effect of the ratio of active antennas on the sum SE and EE performance. Fig. 9 (a) shows that the sum SE increases with the ratio of active antennas. It can also be observed that, as the inter-antenna distance increases, more antennas with CSI are required to attain the performance achieved by full CSI acquisition. The EE performance is studied in Fig. 9 (b). We can observe that the EE growth rate starts to slow down when N_c ≈ 0.5 × M, especially when d is small. In this case, when the relay antennas are strongly correlated, the proposed scheme can select a specific ratio of active antennas to achieve the desired EE performance. Therefore, the proposed incomplete CSI acquisition can help maintain the required system performance while reducing complexity.
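The link between a small d and high spatial correlation can be checked with a short numerical sketch. It assumes a uniform linear array, azimuth-only propagation, and L = 200 equally weighted directions spread uniformly over an angle spread of π/4 centered at broadside; these modeling choices and the function name `adjacent_correlation` are assumptions for illustration and are simpler than the array model used in the paper.

```python
import cmath
import math


def adjacent_correlation(d_wavelengths, num_dirs=200, spread=math.pi / 4):
    """Magnitude of the normalized correlation between two adjacent antennas.

    Averages the phase difference exp(j*2*pi*d*sin(theta)) over num_dirs
    equally weighted directions spanning [-spread/2, spread/2].
    """
    angles = [-spread / 2 + spread * l / (num_dirs - 1) for l in range(num_dirs)]
    total = sum(cmath.exp(2j * math.pi * d_wavelengths * math.sin(t))
                for t in angles)
    return abs(total) / num_dirs


for d in (0.1, 0.25, 0.5):
    print(f"d = {d:.2f} lambda -> |rho| = {adjacent_correlation(d):.3f}")
```

In this simplified model the correlation magnitude decreases as d grows toward 0.5λ, which is consistent with the observation that fewer antennas need explicit CSI when the array is densely packed.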

3) ZF PROCESSING
In Fig. 10, the relationship between the inter-antenna distance and the system performance with ZF processing is examined. First, compared with Fig. 8, it can be seen that ZF processing significantly outperforms MRT processing in the proposed system. Moreover, the results in Fig. 10 (a) show that increasing the inter-antenna distance benefits the sum SE. The sum SE begins to saturate when d > 0.4λ, especially with fewer active antennas. However, a larger N_c helps achieve a higher sum SE with increasing d. Fig. 10 (b) illustrates the EE, which first increases and then saturates with d. Additionally, we can observe that reducing the number of active antennas with CSI can benefit the EE, especially when d ≤ 0.5λ; this is further clarified by Fig. 11. Therefore, when d is small and the spatial correlation is high, a moderate incomplete CSI acquisition can help enhance the EE while maintaining the SE performance.
Fig. 11 shows the effect of the ratio of active antennas on the sum SE and EE performance. Specifically, Fig. 11 (a) indicates that the sum SE increases with the ratio of active antennas. Similar to Fig. 10, a greater inter-antenna distance achieves a better sum SE. The EE performance is studied in Fig. 11 (b). It can be observed that, after an optimal point, a further increase in the number of active antennas reduces the EE. When the relay antennas are highly correlated, the optimal EE is obtained with N_c = 0.5 × M. Moreover, when the inter-antenna distance is large, e.g., d = 0.5λ with lower spatial correlation, the EE approaches its maximum when more than half of the antennas are active, which is also shown in Fig. 10 (b). Intuitively, this occurs because the small loss in the SE is compensated by the substantial reduction in the power consumption. Therefore, N_c = 0.5 × M can serve as the benchmark number of active antennas to maximize the EE while maintaining a required sum SE in moderate and high spatial correlation scenarios.
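The advantage of ZF over MRT reported in this subsection stems from interference nulling, which can be illustrated with a small two-user example. The sketch below draws a random 2 × M channel (an i.i.d. Gaussian channel with M = 16, chosen only for illustration and without spatial correlation or estimation error) and compares the off-diagonal entry of the effective channel H W for the MRT filter W = H^H and the ZF filter W = H^H (H H^H)^{-1}. The helper names are hypothetical.

```python
import random


def gram(h):
    """Return H H^H for a complex matrix H given as a list of rows."""
    return [[sum(a * b.conjugate() for a, b in zip(ri, rj)) for rj in h]
            for ri in h]


def inv2(a):
    """Inverse of a 2 x 2 complex matrix."""
    (p, q), (r, s) = a
    det = p * s - q * r
    return [[s / det, -q / det], [-r / det, p / det]]


def matmul(a, b):
    """Plain matrix product of two list-of-rows matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


rng = random.Random(1)
M = 16  # hypothetical number of relay antennas
H = [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(M)]
     for _ in range(2)]

G_mrt = gram(H)                    # effective channel H W with W = H^H
G_zf = matmul(G_mrt, inv2(G_mrt))  # effective channel with W = H^H (H H^H)^{-1}

mrt_leak = abs(G_mrt[0][1])  # inter-user leakage under MRT
zf_leak = abs(G_zf[0][1])    # inter-user leakage under ZF
print(f"MRT leakage: {mrt_leak:.3f}, ZF leakage: {zf_leak:.2e}")
```

With perfect CSI the ZF effective channel is the identity up to rounding error, whereas MRT leaves a residual cross term; with imperfect or incomplete CSI the nulling is only approximate.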

VI. CONCLUSION
This paper has studied the sum SE and EE performance of a space-constrained multi-pair two-way half-duplex DF relaying system with linear processing methods and incomplete CSI. In particular, the large-scale approximations of the achievable SE with MRT and ZF processing were derived, respectively. Meanwhile, a practical power consumption model was characterized to study the EE performance. To exploit spatial correlation, an incomplete CSI approach was introduced and analyzed. Our analysis and results demonstrated the essential design guidelines for the studied scenario. They verified that the proposed system with spatial correlation and linear processing methods can enhance the EE while preserving the sum SE performance with specific system configurations.

APPENDIX
DERIVATION OF THE SUM SE APPROXIMATIONS
A. MRT PROCESSING
With the assistance of Eqns. (29)-(30), the relevant approximations are derived as follows. First, in the MAC phase, the four terms in R_{1,i} and R_{XR,i} (X = A, B) are given by 1) Desired signal power of T_{X,i}, X = A, B, 3) Inter-user interference B_i, and 4) Noise C_i, 2) Desired signal, 3) Estimation error. Thus, (42) and (43) are proved. Then, we focus on R_{RX,i} in the BC phase. R_{RX,i}, X = A, B, can be obtained with the following calculation when M grows large, 1) Normalization coefficient,