Zeroing Unknown Terms: A Novel Clustering Architecture for Low Network Overhead in Distributed Massive MIMO

We propose a novel user-centric clustering architecture for distributed massive multiple-input-multiple-output networks. We examine the uplink of a general multi-cell scenario in which a cluster of base stations (BSs) with large antenna arrays detect the signals of multiple users simultaneously. As opposed to traditional clustering schemes, where the channel coefficients of all the users within the cooperating cluster are learned at the BSs, in the proposed approach, BSs in the network learn the channels of only the strongest users based on the received signal-to-noise ratio. We consider two linear receivers, namely, zero forcing and linear minimum mean squared error receivers at the central processing unit. The two receivers operate with only partial knowledge of the user channels and the unknown terms are set to zero. We conduct a thorough analytical investigation and derive novel approximations for the instantaneous received signal-to-interference-and-noise ratio of an arbitrary user processed by a cluster of BSs. Furthermore, we develop simple closed-form expressions for the achievable rate and symbol error probability of an arbitrary user processed by a cluster of BSs. Numerical examples are used to illustrate the accuracy of the proposed approach and also to compare it to existing approaches. We show that our proposed architecture outperforms traditional clustering methods, particularly for cluster edge users.


I. INTRODUCTION
THE EVER-INCREASING demand for wireless communication has necessitated the exploration of advanced technologies to accommodate the growing capacity requirements. Among these technologies, massive multiple-input multiple-output (MIMO) has emerged as a promising solution due to its potential to improve spectral efficiency and enhance overall system performance [1], [2]. By utilizing the large number of antennas at its disposal, this technology can increase network capacity substantially, often by tenfold or more [3].
However, traditional massive MIMO systems are unable to ensure uniform data rates within the network. This limitation has prompted researchers to explore alternative approaches, leading to the emergence of distributed antenna systems. In such a distributed system, the antennas are geographically distributed over a wide area and cooperate via a connection to a central processing unit (CPU) by fibre or coaxial cable [4], [5], [6]. Compared to its co-located counterpart, the distributed system brings forth numerous advantages. These include increased channel de-correlation for users even in close proximity, enhanced coverage area [7], and higher capacity [5]. Extensive studies on the cooperative massive MIMO approach have consistently shown its superior performance over non-cooperative small-cell systems, particularly in terms of throughput [8], [9].
Within the realm of distributed massive MIMO systems, two distinct paradigms exist, namely the base station (BS) centric approach and the user-centric approach. However, the significant variation in signal-to-noise ratio (SNR) within a cell or cluster of cells remains unresolved in BS-centric approaches due to two primary factors: the rapid decay of received signal power with propagation distance and interference from neighboring non-cooperating BSs. These factors pose significant challenges in meeting the needs of users who require consistent and high-speed connectivity throughout their coverage area [10]. In light of these limitations, this paper focuses on the user-centric paradigm, aiming to further enhance the performance of distributed massive MIMO systems.
A cell-free system emerges when the associations between users and cells are released, and the rigid boundaries of individual cells are dissolved [11], [12]. In a cell-free system, users are no longer constrained to specific cells, resulting in enhanced flexibility and collaborative potential within the network [10], [13]. User-centric cell-free massive MIMO, alternatively referred to as user-centric distributed massive MIMO, advocates for a clustering approach based on the users' channels [10], [13]. In such a user-centric approach, each user is served by a cluster of BSs. Hence, any particular BS may cooperate with different sets of BSs to serve different users [14], [15]. Several existing works have explored user-centric distributed massive MIMO systems, contributing valuable insights to this field. For instance, a user-centric virtual cell massive MIMO approach was introduced in [13], leading to reduced backhaul overhead when employing maximal ratio combining. However, this approach uses a lower level of cooperation, where local estimates at the access points (APs) are combined at the CPU. The study undertaken in [16] demonstrates that a centralized implementation maximizes spectral efficiency and reduces fronthaul signaling in cell-free massive MIMO networks. The user-centric framework proposed in [15] reformulates the dynamic cooperation clustering approach [14] for cell-free networks and is able to utilize more advanced interference-aware receiver combining techniques. In [17], two user-centric clustering schemes were investigated: one fixes the number of BSs serving a user, and the other fixes the radius of the user's serving cluster. The evaluation was based on the data rates of the cell-edge users. An approach where users are served by multiple BS-centric clusters, which in turn cooperate to create user-centric clusters, was studied in [18]. Nevertheless, a system where user data is detected centrally, as discussed above, imposes a significant strain on the fronthaul, because the CPU requires all the channels and the received signals at the BSs. Consequently, in systems where multiple BSs cooperate, the number of user channels that need to be learned at a BS increases rapidly. To acquire these channels, BSs would be tapping into resources which would otherwise have been spent on data transmission. It should also be noted that combining these signals at the CPU requires additional processing power. Efforts have been made to introduce low-complexity approaches within the user-centric distributed massive MIMO context. Some examples are [11], [15], [19], which emphasize the joint processing of observations from selected users and BSs to enhance the system efficiency.
Despite these enhancements, certain limitations persist, which motivates the present paper. These limitations are discussed herein. The user-centric approach outlined in [15], [19] describes a scalable combining scheme based on the minimum mean squared error (MMSE) receiver, where a BS learns the channels of the users who have chosen that BS as a serving BS. However, the scheme requires the BSs to learn the channels from a user even if that BS is not a BS serving the said user. For example, assuming the $n$-th user, $U_n$, is served by BS$_1$ and BS$_2$, and $U_{n+1}$ is served by only BS$_1$, the approach proposed in [15] needs the channel from $U_{n+1}$ to both BS$_1$ and BS$_2$. With this setup, if all the BSs serving $U_n$ share $U_n$ as the common user while serving entirely distinct user sets otherwise, then all the BSs might have to learn the channels of most of the users in the network. In [11], the authors explore scalability in the cell-free architecture by introducing both BS-centric and user-centric MMSE receivers. The system under consideration consists of single antenna BSs serving a subset of users in an ultra-dense deployment scenario. The proposed model is said to approach network-wide MMSE, in which all the BSs cooperate to serve every user in the system. However, we note that to achieve the network-wide MMSE performance, the user is served by a large number of BSs. According to the outlined user subset selection policy, this results in a BS serving (and in turn learning the channels of) nearly every user in the system. As such, both these approaches suffer from having to learn a large number of user channels in the limit. As a consequence, the BSs are required to exchange more information with the CPU(s), increasing the network overhead significantly. Moreover, the large number of user channels used to aid in detection contributes to a notable increase in receiver complexity, yet fails to yield a substantial enhancement in performance.
Building on these works, this paper aims to fill those gaps and presents a novel approach that mitigates network overhead while maintaining a low-complexity design. We propose a user-centric approach based on the notion that a BS learns the channels of the strongest interfering users in addition to the users for whom it serves as the strongest BS. The detection of user signals takes place at the CPU, where the received signals from cooperating BSs are jointly processed. Unknown terms in the channel matrix at the CPU are set to zero.
We focus on the uplink of a distributed massive MIMO network where several BSs equipped with a large number of antennas serve single antenna users dispersed at random across the network. These receiving BSs cooperate via a backhaul network that incorporates multiple CPUs, providing computational support for their joint operations. From a practical point of view, this cooperation is restricted to a set of geographically close BSs that form a cooperating cluster. In the traditional cooperation scenarios, this leads to two classes of network users: in-cluster users, that are jointly detected by the cooperating BSs, and out-of-cluster users, that cause interference. However, due to the nature of our approach, we segregate the users in the system using slightly different terms. User segregation is based on additional constraints proposed in Section III, which become instrumental in reducing the overhead introduced to the network through cooperative communications. The terms corresponding to traditional in-cluster and out-of-cluster users are associated and non-associated users, respectively. Also, the set of users who are served by a cluster of BSs are defined as serving users. We explain these terms with a simple example in Section III. As such, this paper extends the results of our earlier conference version [20] and makes the following contributions:
• We propose a modified user-centric clustering architecture with reduced computational complexity and reduced network overhead, which also improves the rate uniformity across the network. In contrast to the conventional clustering schemes, where the channel coefficients between every user and BS in the cooperative cluster are required at the CPU for detection, our proposed approach only requires the channels of the strongest users to each BS in the cooperating cluster. The strongest users are identified using a pre-determined threshold.
• We derive simple closed-form expressions to accurately approximate the instantaneous received SINR of an arbitrary serving user in a distributed massive MIMO network. Our approximations are derived for two linear receivers, namely the zero forcing (ZF) and linear MMSE (LMMSE) receivers. While ZF and LMMSE receivers are commonly applied in cooperative systems [21], [22], [23], in this work, they are utilized differently. This is due to their operation in a modified scenario where the CPU lacks the full channel matrix, working instead with partial user channel information provided by the BSs. Our analysis considers imperfect CSI in which the channels of the strongest users are estimated through uplink training symbols. Non-associated user interference is taken into account in both approaches.
• Based on the derived approximations for SINR, we conduct a performance analysis and obtain remarkably accurate approximations for the achievable rate and symbol error probability (SEP) of an arbitrary serving user under our proposed architecture.
• We present a comprehensive complexity analysis, demonstrating a reduction in both complexity and network overhead in the proposed architecture compared to the reference architectures. We highlight the advantages of the proposed architecture by comparing the achievable rate of a user processed by our method against two traditional clustering architectures, namely non-cooperation and static clustering.
It is important to note that in the special case when perfect CSI is accessible at the CPU, the results derived for ZF in this study reduce to those of [20], in which only the ZF receiver was considered.
The remainder of the paper is organized as follows. Section II describes the system model used throughout this analysis. In Section III, we introduce the modified user-centric approach. Then, Section IV derives approximations for the received SINR for both receiver types. Section V formulates the approximations for the achievable rate and SEP. A comprehensive complexity analysis is provided in Section VI. Numerical examples are provided in Section VII, and Section VIII concludes the paper.

II. SYSTEM MODEL
In this section, we introduce the system model used throughout the paper, followed by a simple example that will be used to describe the different clustering methods. We then present three baseline clustering methods, namely non-cooperation, static cooperation and dynamic cooperation, prior to introducing the new modified user-centric clustering method in the subsequent section.
We consider the uplink of a communication network where BSs are deployed randomly based on stochastic geometry. More specifically, we model the BSs in $\mathbb{R}^2$ with a homogeneous Poisson point process (PPP), $\Phi_b$, of density $\lambda_b$. The $i$th BS (BS$_i$) is equipped with a large antenna array with $M_i$ antennas. We further consider a population of single antenna users distributed across the network. Locations of these users are generated according to another homogeneous PPP, $\Phi_u$, which has a density $\lambda_u$ and is independent of $\Phi_b$. By tessellating the plane into Voronoi regions around each BS, we can define cells in a traditional manner. Let $L$ be the number of BSs in the system, where the $i$th cell encompasses BS$_i$ along with the collection of users in the set $\mathcal{N}_i$. The total number of users in the system is $N_{\mathrm{sys}} = \sum_{i=1}^{L} |\mathcal{N}_i|$. An example cellular network is illustrated in Fig. 1(a), where $\lambda_b = 20$ and $\lambda_u = 120$. The channels between the users and the receive antennas are modelled as independent Rayleigh channels with path loss. Large-scale fading coefficients of the user channels are assumed to be known a priori at each BS. This assumption is reasonable because large-scale fading coefficients vary slowly, usually over the range of several coherence intervals [24]. In [25], it has been shown that in the presence of log-normal shadowing, a network in which the BSs are arranged at fixed positions in a hexagonal grid appears as though it were realized from a Poisson-distributed process. Leveraging this observation, the channels in the network also approximate the effects of shadow fading.
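The deployment described above can be sketched numerically. The following is a minimal illustration, not the authors' simulation code: the unit-square observation window, the path-loss exponent `alpha`, the distance clamp and all variable names are assumptions introduced here. It draws the two independent PPPs, assigns every user to its Voronoi cell (its nearest BS), and forms illustrative large-scale coefficients $P_{i,n}$.

```python
import numpy as np

def sample_ppp(density, side, rng):
    """Draw one realization of a homogeneous PPP on a side x side square."""
    count = rng.poisson(density * side ** 2)
    return rng.uniform(0.0, side, size=(count, 2))

rng = np.random.default_rng(0)
side = 1.0     # assumed observation window (unit square)
alpha = 3.7    # assumed path-loss exponent (hypothetical value)
bs_xy = sample_ppp(20, side, rng)      # lambda_b = 20, as in Fig. 1(a)
user_xy = sample_ppp(120, side, rng)   # lambda_u = 120, as in Fig. 1(a)

# Distance from every user to every BS, shape (N_sys, L)
dist = np.linalg.norm(user_xy[:, None, :] - bs_xy[None, :, :], axis=2)

# Voronoi association: each user belongs to the cell of its nearest BS
cell_of_user = dist.argmin(axis=1)
cell_sizes = np.bincount(cell_of_user, minlength=len(bs_xy))   # |N_i|
n_sys = int(cell_sizes.sum())                                  # N_sys

# Illustrative large-scale coefficients, stored as P[n, i] for user n and BS i
P = np.maximum(dist, 1e-3) ** (-alpha)
```

Summing the cell sizes recovers $N_{\mathrm{sys}}$, and the nearest BS is by construction the one with the largest path gain for each user.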

A. NON-COOPERATION
In the traditional non-cooperative approach, users assigned to a specific cell, $i$, would be detected at BS$_i$. In the example network in Fig. 1(a), BS$_i$ serves the users in the set $\mathcal{N}_i$ and all other users in the system are interferers. For example, pink, blue, purple and brown users are served by BS$_1$, BS$_2$, BS$_3$ and BS$_4$, respectively, with each user's channels exclusively learned at their respective serving BS. For such a network, the $\mathbb{C}^{M_i \times 1}$ received signal vector at BS$_i$ can be written as
$$\mathbf{y}_i^{\mathrm{NC}} = \mathbf{H}_i^{\mathrm{NC}} \mathbf{s}^{\mathrm{NC}} + \mathbf{z}_i^{\mathrm{NC}}, \tag{1}$$
where the superscript NC is used to denote the non-cooperative approach and the $\mathbb{C}^{|\mathcal{N}_i| \times 1}$ vector $\mathbf{s}^{\mathrm{NC}}$ contains the data symbols from users in the set $\mathcal{N}_i$. The data symbols are independent and normalized such that $E[\mathbf{s}^{\mathrm{NC}} (\mathbf{s}^{\mathrm{NC}})^H] = \mathbf{I}$.
The notation $(\mathbf{X})^H$ denotes the Hermitian transpose of matrix $\mathbf{X}$. The vector $\mathbf{z}_i^{\mathrm{NC}}$ denotes the inter-cell interference plus noise at BS$_i$. We assume that the noise is additive white Gaussian, where each noise component is characterized by a complex Gaussian distribution with a mean of zero and a variance of $\sigma^2$. The channels connecting the users with the receive antennas are modelled as independent Rayleigh channels with path loss. Therefore, the entries of the $\mathbb{C}^{M_i \times |\mathcal{N}_i|}$ channel matrix, $\mathbf{H}_i^{\mathrm{NC}}$, are independent yet non-identically distributed across the columns. The element in the $(m, n)$-th position of the channel matrix $\mathbf{H}_i^{\mathrm{NC}}$, denoted as $h_{imn}$, follows a complex Gaussian distribution with a mean of zero and a variance of $P_{i,n}$. Here, $P_{i,n}$ represents the large-scale fading coefficient between user $n$ and BS$_i$, which is due to the distance-dependent path gain. Note that, even though we do not explicitly model shadowing, shadowing is already incorporated into the location information (through the use of the Poisson-distributed BS setup [25]) and, consequently, in $P_{i,n}$.
This approach is severely hindered by the interference from neighboring cells and a significant drop in data rates can be experienced by users located near the edge of a cell [10].

B. STATIC COOPERATION
The data rate of cell edge users can be increased through multi-cell cooperation, which exploits inter-cell interference [26]. In static clustering, which is one such cooperation technique, the BSs in a pre-defined cluster cooperate to detect users in the vicinity. For example, Fig. 1(a) shows a cluster of three BSs, namely BS$_1$, BS$_2$ and BS$_3$, which cooperate to detect all users located in the combined three-cell zone using $M^{\mathrm{SC}} = M_1 + M_2 + M_3$ cluster antennas. The three BSs in the cluster learn the channels of the users in blue, pink and purple, whereas all the other users are interfering users. These $N^{\mathrm{SC}}$ users are defined as the in-cluster users of the said cluster, and the cluster BSs send the received signals and the channel coefficients from the $N^{\mathrm{SC}}$ users to a CPU, which detects the signals. These in-cluster users are subject to interference from $K^{\mathrm{SC}} = \sum_{i=4}^{L} |\mathcal{N}_i|$ out-of-cluster users, and the instantaneous channel gains of these users are unknown to the CPU. For the static clustering approach, the vertically stacked $\mathbb{C}^{M^{\mathrm{SC}} \times 1}$ received vector at the CPU is
$$\mathbf{y}^{\mathrm{SC}} = \mathbf{H}^{\mathrm{SC}} \mathbf{s}^{\mathrm{SC}} + \mathbf{z}^{\mathrm{SC}}, \tag{2}$$
where the superscript SC is used to denote the static clustering approach and the aggregated channel matrix, $\mathbf{H}^{\mathrm{SC}}$, contains the channels from all the in-cluster users to the $M^{\mathrm{SC}}$ cluster antennas. Also, $\mathbf{z}^{\mathrm{SC}} = ((\mathbf{z}_1^{\mathrm{SC}})^T, (\mathbf{z}_2^{\mathrm{SC}})^T, (\mathbf{z}_3^{\mathrm{SC}})^T)^T$ is the aggregated out-of-cluster interference and noise. Note that $\mathbf{z}_i^{\mathrm{SC}}$ contains the out-of-cluster interference from the $K^{\mathrm{SC}}$ users and the noise perceived at BS$_i$. The data symbols from the users in the set $\mathcal{N}^{\mathrm{SC}}$ are carried in $\mathbf{s}^{\mathrm{SC}}$. From Fig. 1(a), we see that user $U_1$, who is located in the cluster center, is well served by this clustering. However, user $U_2$, who is at the cluster edge, suffers from strong interference from the out-of-cluster users. A better choice for $U_2$ would have been a cluster arrangement of BS$_1$, BS$_4$ and BS$_5$. Clearly, this cooperative cell concept with disjointly clustered BSs cannot solve the problem of cluster edge users having low data rates due to out-of-cluster interference.

C. DYNAMIC COOPERATION
To encourage uniform data rates throughout the network, research is shifting from static clustering to user-centric clustering, which is dynamic because a user's cluster is chosen based on the user location. User-centric clustering increases fairness and provides a better service to all users in the system [10]. In a system where user-centric clustering is enforced, each BS needs to learn the channels of the users in neighboring cells in addition to learning the channels of the users in its own cell. For example, BS$_1$, which learned the channels of the users in three cells in a three-cell static clustering architecture, now needs to learn the channels of users in seven cells to operate in a dynamic architecture. As such, user-centric clustering suffers from a significantly higher network overhead due to the overlapping BS clusters, as the channel information required at the CPU is increased [10]. Furthermore, limiting the fronthaul overhead is imperative to achieve scalability [27].
In the next section, we propose a modified user-centric approach which reduces the computational complexity and overhead added to the network while providing good performance throughout the system. In contrast to the traditional clustering schemes discussed above, our proposed approach is grounded on the concept that a BS only learns the channels of the strongest interfering users in addition to the users for whom it acts as the strongest BS.

III. THE MODIFIED USER-CENTRIC APPROACH
The first contribution of our paper, which is the design of a new user-centric clustering approach, is presented in this section. Our modified user-centric approach is based on a partial joint processing (PJP) technique to cluster BSs [20]. Let us explain the proposed method using a toy example. We define a PJP cluster as a group of three neighboring BSs cooperating together. The rationale behind choosing a three-cell cluster is twofold. Firstly, it naturally arises in a Voronoi tessellation. Secondly, for a typical user, the optimal cluster cardinality is small, as this minimizes overhead by limiting the number of cooperating cells [28], [29]. The small cluster size recommended in this approach and the localization of the BS clusters eliminate the necessity for complex large-scale BS scheduling. As such, the overall network size will not hinder the scalability and efficiency of the method. The three BSs in the PJP cluster are named BS$_i$, BS$_j$ and BS$_k$, and they are equipped with $M_i$, $M_j$ and $M_k$ antennas, respectively.
It is assumed that the BSs are connected to one or more CPUs via high-capacity, delay-free links, through which the received signal vector and the available channel coefficients at each BS are sent to the CPU(s). The CPU processes the received signals of each user by consolidating received signals from the BSs in the BS cluster serving that specific user. It should be noted that the BSs themselves can function as CPUs by collecting received signals and performing combining. In this configuration, a given PJP cluster of BSs designates one BS to execute the CPU functionality, while another BS in the same cluster may provide the CPU functionality for a separate PJP cluster to which it belongs. As such, a BS might have to transmit the obtained CSI to multiple CPUs (BSs). However, as we demonstrate in Section VII, the number of user channels estimated in PJP is lower than that of 3-cell static clustering. Consequently, the network overhead associated with PJP will also be comparatively lower. It should be noted that, while there are some deployment challenges compared to non-cooperative systems, the proposed approach introduces fewer challenges than both fully cooperative systems and systems with fixed or dynamic clustering involving a similar number of BSs as PJP. Some deployment challenges linked to the proposed method include ensuring that every BS is connected to each neighboring BS and configuring the master node or CPU for each cluster of BSs.
In contrast to traditional architectures, a BS operating in the PJP architecture learns the channels of all the users for whom it functions as the dominant BS, alongside some additional users with strong interference. That is, BS$_i$ learns the channels of two kinds of users. The first group consists of the users located in the $i$th cell, and the second group consists of the users whose channels are learned by BS$_i$ but who are not located in cell $i$. This second group of users of BS$_i$ is picked out of the set $\{1, \ldots, N_{\mathrm{sys}}\} \setminus \mathcal{N}_i$ if they maintain a strong average received SNR to BS$_i$. More specifically, the channel of $U_n \in \{1, \ldots, N_{\mathrm{sys}}\} \setminus \mathcal{N}_i$ is learned at BS$_i$ if the following constraints are satisfied.

1) The received SNR of $U_n$ at BS$_i$ satisfies
$$\frac{P_{i,n}}{\sigma^2} \geq \lambda, \tag{3}$$
where $\lambda$ is a pre-defined performance threshold.
2) BS$_i$ does not already serve another user on the same pilot as $U_n$. If BS$_i$ already serves a user on the same pilot as $U_n$, the user surpassing the threshold in (3) by the largest margin will be selected.
3) The set of users whose channels are known at BS$_i$, denoted as $\mathcal{N}_i^{\mathrm{PJP}} \subset \{1, \ldots, N_{\mathrm{sys}}\}$, is such that
$$|\mathcal{N}_i^{\mathrm{PJP}}| \leq N_{\max}, \tag{4}$$
where the superscript PJP indicates the partial joint processing approach and $N_{\max}$ represents the maximum number of users whose channels can be learned at a BS.
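The three constraints above can be sketched as a simple greedy selection at a single BS. This is a minimal illustration under stated assumptions, not the authors' procedure: the helper name `select_learned_users` is hypothetical, own-cell users are assumed to be always kept and to occupy distinct pilots, and candidates from other cells are assumed to be admitted in decreasing order of average SNR.

```python
import numpy as np

def select_learned_users(snr, in_own_cell, pilot, lam, n_max):
    """Indices of the users whose channels one BS learns (hypothetical helper).

    snr         : average received SNR of every user at this BS
    in_own_cell : boolean mask, True for users located in this BS's cell
    pilot       : pilot index assigned to every user
    lam, n_max  : SNR threshold and maximum number of learned users
    """
    # Assumption: own-cell users are always learned and use distinct pilots.
    learned = [int(n) for n in np.flatnonzero(in_own_cell)]
    used_pilots = {int(pilot[n]) for n in learned}

    # Constraint 1: only out-of-cell users above the SNR threshold qualify.
    candidates = np.flatnonzero(~in_own_cell & (snr >= lam))

    # Visiting candidates by decreasing SNR keeps, for each pilot, the user
    # that exceeds the threshold by the largest margin (constraint 2).
    for n in candidates[np.argsort(-snr[candidates])]:
        if len(learned) >= n_max:          # constraint 3: cap on learned users
            break
        if int(pilot[n]) in used_pilots:   # constraint 2: pilot already in use
            continue
        learned.append(int(n))
        used_pilots.add(int(pilot[n]))
    return learned

snr = np.array([30.0, 12.0, 9.0, 15.0, 2.0, 11.0])
in_own_cell = np.array([True, False, False, False, False, False])
pilot = np.array([0, 1, 1, 2, 3, 0])
print(select_learned_users(snr, in_own_cell, pilot, lam=8.0, n_max=3))
```

In this toy input, user 5 is rejected because it shares a pilot with the own-cell user 0, and user 2 loses its pilot to the stronger user 1.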
Note that the second constraint is introduced in order to avoid pilot contamination within the PJP cluster. Overall, constraints 1 to 3 ensure the selection of a limited number of users with strong channels, effectively preventing pilot contamination. The ability to select the $\lambda$ and $N_{\max}$ that best enhance the system performance is a valuable design attribute. This feature enables the system to be adaptable to varying BS and user densities. For example, in a network characterized by high user and BS density, decreasing $\lambda$ and increasing $N_{\max}$ would contribute to enhancing system performance. Fig. 1(b) depicts an example scenario where a PJP cluster is formed by BS$_1$, BS$_2$ and BS$_3$. Considering only the BSs within this PJP cluster, the channels of the users represented by the colors pink, blue, and purple are only learned at BS$_1$, BS$_2$ and BS$_3$, respectively, whereas the channels of the users in green are learned at more than one BS. Note that the channels of some of these users will be learned at other BSs outside the considered cluster of BSs. However, the BS cluster formed by BS$_1$, BS$_2$ and BS$_3$ is not a user-centric cluster for all the users in the sets $\mathcal{N}_1^{\mathrm{PJP}}$, $\mathcal{N}_2^{\mathrm{PJP}}$ and $\mathcal{N}_3^{\mathrm{PJP}}$. For example, $U_2$, located at the edge of this cluster, is positioned close to the center of the BS$_1$-BS$_4$-BS$_5$ cluster. The channels of the users which strongly interfere with $U_2$ are known by BSs 1, 4 and 5, contributing to better performance for $U_2$ if served by this BS cluster (BS$_1$-BS$_4$-BS$_5$). As such, in a PJP cluster, we only serve users who are inside the triangle which is constructed by connecting the three cooperating BSs. More specifically, if a user is located inside a triangle of the Poisson-Delaunay triangulation, the three BSs at the vertices of the triangle are chosen as its serving BSs. Note that the Poisson-Delaunay triangulation is dual to the Poisson-Voronoi tessellation and is uniquely determined once the BS positions are known [30]. As BSs in PJP learn channels of additional users who are not served by the cluster, it might appear that they learn more channels than in traditional static clustering. However, in Section VII we show that PJP BSs learn fewer channels while delivering higher data rates. Thus, the three BSs (BS$_1$-BS$_2$-BS$_3$) cooperate to serve the collection of users in the set $\mathcal{N}^{\mathrm{PJP}}_{[1,2,3]}$, who are located in the shaded region in Fig. 1(b). These users are defined as the serving users of the said cluster. The serving users of a particular PJP cluster are the collection of users whose data symbols are cooperatively detected by the BSs in that PJP cluster.
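The serving-user rule above reduces to a point-in-triangle test once the triangle of cooperating BSs is known. The following minimal sketch uses the standard sign-of-cross-product test; the function name and the coordinates are illustrative assumptions, and the construction of the Delaunay triangulation itself is not shown.

```python
def inside_triangle(user, a, b, c):
    """True if `user` lies inside (or on the boundary of) triangle a-b-c."""
    def cross(p, q, r):
        # z-component of (q - p) x (r - p); its sign gives the side of r
        return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])

    d1, d2, d3 = cross(a, b, user), cross(b, c, user), cross(c, a, user)
    has_negative = min(d1, d2, d3) < 0
    has_positive = max(d1, d2, d3) > 0
    # The user is inside when it is on the same side of all three edges.
    return not (has_negative and has_positive)

# Hypothetical positions of three cooperating BSs and two users
bs_i, bs_j, bs_k = (0.0, 0.0), (1.0, 0.0), (0.0, 1.0)
print(inside_triangle((0.2, 0.2), bs_i, bs_j, bs_k))  # a serving user
print(inside_triangle((1.0, 1.0), bs_i, bs_j, bs_k))  # served by another cluster
```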
For a generic PJP cluster, where BSs $i$, $j$ and $k$ are cooperating, the three BSs learn the channels of a total of $N = |\mathcal{N}_i^{\mathrm{PJP}} \cup \mathcal{N}_j^{\mathrm{PJP}} \cup \mathcal{N}_k^{\mathrm{PJP}}|$ users. These $N$ users are the associated users for the specific cluster formed by BS$_i$-BS$_j$-BS$_k$; that is, all the users learned by the set of BSs in a specific PJP cluster are termed associated users of the said cluster. These users are subject to interference from the users in the set $\mathcal{K} = \{1, \ldots, N_{\mathrm{sys}}\} \setminus (\mathcal{N}_i^{\mathrm{PJP}} \cup \mathcal{N}_j^{\mathrm{PJP}} \cup \mathcal{N}_k^{\mathrm{PJP}})$, who are defined as the non-associated users of the cluster. For the example PJP cluster in Fig. 1(b), non-associated users are in yellow and associated users are shown in other colors.
The user channels learned at each BS are then transmitted to the CPU, where the channel matrix is assembled by setting the unknown channels of the associated users to zero; hence the term Zeroing Unknown Terms (ZUT). It is important to note that at the BSs, complete information about the channels of the users in $\mathcal{N}_i^{\mathrm{PJP}}$ (for BS$_i$) is available. In contrast, at the CPU, some user channels to certain BSs are missing due to the nature of this approach. For such a network, with $M = M_i + M_j + M_k$ cluster antennas, the $\mathbb{C}^{M \times 1}$ received signal vector at the CPU is given by
$$\mathbf{y} = \mathbf{H}\mathbf{s} + \bar{\mathbf{H}}\bar{\mathbf{s}} + \mathbf{n}, \tag{5}$$
where the $\mathbb{C}^{M \times N}$ channel matrix, $\mathbf{H}$, contains the channels of the $N$ associated users and $\bar{\mathbf{H}}$ denotes the non-associated user channel matrix. Also, $\mathbf{s} = (s_1, \ldots, s_N)^T$ and $\bar{\mathbf{s}} = (\bar{s}_1, \ldots, \bar{s}_K)^T$ are vectors which contain the data symbols from the $N$ associated users and the $K = |\mathcal{K}|$ non-associated users, respectively. The noise, distributed as $\mathcal{CN}(\mathbf{0}, \sigma^2 \mathbf{I}_M)$, is represented by $\mathbf{n}$. It should be noted that $\mathbf{H}$ is not completely available at the CPU, as not all BSs in the PJP cluster learn the same set of user channels. For example, in Fig. 1(b) only the channels of the green colored users are learned at more than one BS. By separating out the unknown terms, we can re-express (5) as
$$\mathbf{y} = \mathbf{H}_0\mathbf{s} + \mathbf{E}\mathbf{s} + \bar{\mathbf{H}}\bar{\mathbf{s}} + \mathbf{n}, \tag{6}$$
where $\mathbf{H}_0$ contains only the channel coefficients known at the CPU. The unknown channels, represented by zeros in $\mathbf{H}_0$ according to the ZUT procedure, are contained in $\mathbf{E} = \mathbf{H} - \mathbf{H}_0$. The $r$th columns of $\mathbf{H}_0$ and $\mathbf{E}$ correspond to the channels of the $r$th associated user and are denoted by $(\mathbf{h}_0)_r$ and $\mathbf{e}_r$, respectively.
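The ZUT decomposition can be sketched in a few lines. The following is an illustrative example only: the helper name `zero_unknown_terms`, the antenna counts and the knowledge pattern `known` are assumptions made here, with `known[b, r]` marking whether BS $b$ has learned the channel of associated user $r$.

```python
import numpy as np

def zero_unknown_terms(H, antennas_per_bs, known):
    """Split H into H0 (known entries, zeros elsewhere) and E = H - H0.

    known[b, r] is True when BS b has learned the channel of user r.
    """
    # Expand the per-BS knowledge pattern to one row per antenna: (M, N)
    mask = np.repeat(known, antennas_per_bs, axis=0)
    H0 = np.where(mask, H, 0.0)
    return H0, H - H0

rng = np.random.default_rng(0)
antennas_per_bs = np.array([4, 4, 4])          # M_i, M_j, M_k (toy values)
known = np.array([[True,  True,  False, False],
                  [False, True,  True,  False],
                  [False, False, True,  True]])
M, N = antennas_per_bs.sum(), known.shape[1]
H = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2)

H0, E = zero_unknown_terms(H, antennas_per_bs, known)
```

By construction `H0 + E` recovers `H`, and each block of `H0` is either a copy of the corresponding block of `H` or identically zero.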

IV. RECEIVED SINR ANALYSIS
In our previous work [20], we assumed that perfect CSI is available a priori wherever required. In this paper, the system is analyzed under imperfect CSI conditions where the channels are estimated utilizing the LMMSE estimator. We derive approximations for the received SINR for two predominant linear receivers, namely ZF and LMMSE. For the massive MIMO scenario, the approximations for both ZF and LMMSE receivers are highly precise.

A. CHANNEL ESTIMATION
We assume that the small-scale fading varies in time according to a block-fading model [31], where the channel is constant over each frame of length $\tau_c$ channel uses, and changes independently from frame to frame. As mentioned in Section II, we assume that the large-scale fading coefficients of the users are known a priori at each BS. Since the small-scale fading gains are not known to the receiver, each BS estimates the channel vectors based on a sequence of known training symbols. We adopt the channel estimation approach outlined in [10]. However, for completeness, we include the important steps of the derivation and introduce some additional terminology to account for channel estimation. Let $\tau_p$ be the number of symbols reserved for pilot transmissions in each coherence interval, with the assumption that each user transmits a pilot sequence that spans these $\tau_p$ symbols. The channel of each user can be estimated at each BS using the received signals. Because the pilots are $\tau_p$-dimensional vectors, there are at most $\tau_p$ mutually orthogonal sequences. In this context, we consider that the network uses $\tau_p$ mutually orthogonal pilot sequences $\boldsymbol{\phi}_1, \ldots, \boldsymbol{\phi}_{\tau_p} \in \mathbb{C}^{\tau_p \times 1}$ which satisfy the following constraint:
$$\boldsymbol{\phi}_{t_1}^H \boldsymbol{\phi}_{t_2} = \begin{cases} \tau_p, & t_1 = t_2, \\ 0, & t_1 \neq t_2. \end{cases} \tag{7}$$
Elements of $\boldsymbol{\phi}_t$ will be transmitted over $\tau_p$ consecutive channel uses, and the index of the pilot assigned to user $n$ is defined as $p_n \in \{1, \ldots, \tau_p\}$. Given the limited number of $\tau_p$ orthogonal pilot sequences, in cases where $\tau_p < N_{\mathrm{sys}}$, multiple users will share the same pilot. We define the set of users that use the same pilot as user $n$ as $\chi_n = \{r : p_r = p_n, r \in \{1, \ldots, N_{\mathrm{sys}}\}\} \subset \{1, \ldots, N_{\mathrm{sys}}\}$. Considering BS$_i$, the $\mathbb{C}^{1 \times \tau_p}$ received sequence at antenna $m$ is
$$\mathbf{y}_m = \sum_{r=1}^{N_{\mathrm{sys}}} \sqrt{P_{i,r}}\, g_{m,r}\, \boldsymbol{\phi}_{p_r}^T + \mathbf{n}_m, \tag{8}$$
where $g_{m,r}$ is the Rayleigh faded channel from user $r$ to antenna $m$, and $\mathbf{n}_m \in \mathbb{C}^{1 \times \tau_p}$ is the receiver noise with i.i.d. entries of variance $\sigma^2$. Note that the pilot sequences satisfy $\boldsymbol{\phi}_{p_r} = \boldsymbol{\phi}_{p_q}$ when $\chi_r = \chi_q$.
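To estimate the Rayleigh faded channel of user $n$, we first remove the interference from users using orthogonal pilots by multiplying the received signal $\mathbf{y}_m$ with the normalized conjugate of the pilot $\boldsymbol{\phi}_{p_n}$, that is $\boldsymbol{\phi}_{p_n}^* / \sqrt{\tau_p}$, resulting in
$$y_m^{\mathrm{pilot}} = \sqrt{P_{i,n}\tau_p}\, g_{m,n} + \sum_{r \in \chi_n \setminus \{n\}} \sqrt{P_{i,r}\tau_p}\, g_{m,r} + \mathbf{n}_m \frac{\boldsymbol{\phi}_{p_n}^*}{\sqrt{\tau_p}}. \tag{9}$$
The estimate of the true channel, $g_{m,n} \sim \mathcal{CN}(0, 1)$, will be based on $y_m^{\mathrm{pilot}}$. In (9), the first term is the desired channel $g_{m,n}$ scaled by $\sqrt{P_{i,n}\tau_p}$ and the second term contains the interference generated by the pilot sharing users. The third term is the noise, with variance unchanged because $\boldsymbol{\phi}_{p_n}^* / \sqrt{\tau_p}$ is a vector with unit norm. By concatenating the unwanted terms, we re-express (9) as
$$y_m^{\mathrm{pilot}} = \sqrt{P_{i,n}\tau_p}\, g_{m,n} + w_m, \tag{10}$$
where $w_m$ denotes the sum of the interference of the pilot sharing users and noise, with variance $\sigma^2_{w_m} = \sum_{r \in \chi_n \setminus \{n\}} P_{i,r}\tau_p + \sigma^2$. The variance of $y_m^{\mathrm{pilot}}$ can be calculated as
$$E\left[|y_m^{\mathrm{pilot}}|^2\right] = P_{i,n}\tau_p + \sigma^2_{w_m}. \tag{11}$$
Therefore, the LMMSE estimate of $g_{m,n}$ at antenna $m$ is [10]
$$\hat{g}_{m,n} = \frac{\sqrt{P_{i,n}\tau_p}}{P_{i,n}\tau_p + \sigma^2_{w_m}}\, y_m^{\mathrm{pilot}}. \tag{12}$$
The channel estimate $\hat{g}_{m,n}$ is complex Gaussian with zero mean and variance
$$\sigma^2_{\hat{g}_{m,n}} = \frac{P_{i,n}\tau_p}{P_{i,n}\tau_p + \sigma^2_{w_m}}. \tag{13}$$
The exact channel, $g_{m,n}$, can be written as
$$g_{m,n} = \hat{g}_{m,n} + \tilde{g}_{m,n}, \tag{14}$$
where $\tilde{g}_{m,n}$ is the channel estimation error, which is independent of $\hat{g}_{m,n}$ and is complex Gaussian with mean zero and variance $\sigma^2_{\tilde{g}_{m,n}} = \frac{\sigma^2_{w_m}}{P_{i,n}\tau_p + \sigma^2_{w_m}}$ [10]. Based on (14), (6) can be expressed as
$$\mathbf{y} = \hat{\mathbf{H}}_0\mathbf{s} + \tilde{\mathbf{H}}_0\mathbf{s} + \mathbf{E}\mathbf{s} + \bar{\mathbf{H}}\bar{\mathbf{s}} + \mathbf{n} = \hat{\mathbf{H}}_0\mathbf{s} + \mathbf{z}. \tag{15}$$
The estimated $\mathbb{C}^{M \times N}$ matrix, $\hat{\mathbf{H}}_0$, and the $\mathbb{C}^{M \times N}$ estimation error matrix, $\tilde{\mathbf{H}}_0$, contain independent horizontally stacked columns, $(\hat{\mathbf{h}}_0)_r = \mathbf{P}_r^{1/2}\hat{\mathbf{g}}_r$ and $(\tilde{\mathbf{h}}_0)_r = \mathbf{P}_r^{1/2}\tilde{\mathbf{g}}_r$, respectively. Here, $\mathbf{P}_r$, defined in block-diagonal form as $\mathbf{P}_r = \mathrm{diag}(P_{i,r}\mathbf{I}_{M_i}, P_{j,r}\mathbf{I}_{M_j}, P_{k,r}\mathbf{I}_{M_k})$, contains the long term powers of user $r$. Note that $\hat{\mathbf{g}}_r = [\hat{g}_{1,r} \ldots \hat{g}_{M,r}]^T \sim \mathcal{CN}(\mathbf{0}, \mathbf{I}_M)$ and $\tilde{\mathbf{g}}_r = [\tilde{g}_{1,r} \ldots \tilde{g}_{M,r}]^T \sim \mathcal{CN}(\mathbf{0}, \mathbf{I}_M)$. The covariance of the estimated channel, $\hat{\mathbf{g}}_r$, can be expressed as $\boldsymbol{\Sigma}_{\hat{\mathbf{g}}_r} = \tau_p\mathbf{P}_r(\boldsymbol{\Sigma}_w^2 + \tau_p\mathbf{P}_r)^{-1}$, where $\boldsymbol{\Sigma}_w^2 = \mathrm{diag}(\sigma^2_{w_1}, \sigma^2_{w_2}, \ldots, \sigma^2_{w_M})$. The full expression for $\boldsymbol{\Sigma}_{\hat{\mathbf{g}}_r}$ is shown in (16) at the bottom of the next page.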
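The estimator in (12) and the variances in (13) and (14) can be checked with a short Monte Carlo experiment. This is a minimal sketch under assumptions made here: a single receive antenna, three users with hypothetical large-scale coefficients, user 1 sharing the pilot of the target user 0, and DFT columns used as orthogonal pilots with squared norm $\tau_p$.

```python
import numpy as np

rng = np.random.default_rng(1)

def cn(shape):
    """i.i.d. CN(0, 1) samples."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

tau_p, sigma2, trials = 8, 0.1, 20000
P = np.array([1.0, 0.3, 0.05])   # hypothetical P_{i,r}; user 0 is the target
pilot = np.array([0, 0, 1])      # user 1 shares the pilot of user 0

# Columns of the DFT matrix are mutually orthogonal with squared norm tau_p
Phi = np.fft.fft(np.eye(tau_p))

g = cn((trials, 3))                                   # small-scale fading
noise = np.sqrt(sigma2) * cn((trials, tau_p))
y = (np.sqrt(P) * g) @ Phi[:, pilot].T + noise        # received pilot sequences

# Despread with the normalized conjugate pilot of the target user
y_pilot = y @ Phi[:, pilot[0]].conj() / np.sqrt(tau_p)

# Interference-plus-noise variance from the pilot-sharing users
sharing = pilot == pilot[0]
sharing[0] = False
sigma2_w = tau_p * P[sharing].sum() + sigma2

# LMMSE estimate and the theoretical variances of estimate and error
g_hat = np.sqrt(P[0] * tau_p) / (P[0] * tau_p + sigma2_w) * y_pilot
var_hat_theory = P[0] * tau_p / (P[0] * tau_p + sigma2_w)
var_err_theory = sigma2_w / (P[0] * tau_p + sigma2_w)

var_hat_mc = np.mean(np.abs(g_hat) ** 2)
var_err_mc = np.mean(np.abs(g[:, 0] - g_hat) ** 2)
print(var_hat_theory, var_hat_mc, var_err_theory, var_err_mc)
```

The two theoretical variances sum to one, which is the variance of the true channel, and the empirical values should agree with them up to Monte Carlo error.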
It is important to note that in (15), we have defined $z$ to contain the impact of estimation error via the term $\tilde{H}_0 s$. Due to the ZUT process, some blocks of $\hat{H}_0$ will be zero, depending on whether the BS learns the corresponding channels. In the remainder of the paper, we evaluate the performance of the proposed architecture by considering an arbitrary user $n$ in the set $\mathcal{N}^{PJP}_{[i,j,k]}$.
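The per-antenna estimation step can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes a single antenna, a unit-variance Rayleigh channel, and lumps the pilot-sharing interference and noise into one Gaussian term of variance $\sigma_{w_m}^2$. The function names and the numerical values are hypothetical and chosen only for illustration.

```python
import math
import random


def lmmse_variances(p_user, p_sharing, tau_p, noise_var):
    """Return (estimate variance, error variance) for one unit-variance channel."""
    sigma_w2 = tau_p * sum(p_sharing) + noise_var  # pilot-sharing interference plus noise
    signal = p_user * tau_p                        # desired power after pilot correlation
    return signal / (signal + sigma_w2), sigma_w2 / (signal + sigma_w2)


def complex_gaussian(var, rng):
    """Draw one circularly symmetric complex Gaussian sample with variance `var`."""
    s = math.sqrt(var / 2.0)
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))


def simulated_error_variance(p_user, p_sharing, tau_p, noise_var, trials=20000, seed=1):
    """Monte Carlo estimate of E|g - g_hat|^2 for the LMMSE estimator."""
    rng = random.Random(seed)
    sigma_w2 = tau_p * sum(p_sharing) + noise_var
    scale = math.sqrt(p_user * tau_p)
    gain = scale / (p_user * tau_p + sigma_w2)     # LMMSE weight applied to y_pilot
    total = 0.0
    for _ in range(trials):
        g = complex_gaussian(1.0, rng)             # true channel
        w = complex_gaussian(sigma_w2, rng)        # lumped interference and noise
        g_hat = gain * (scale * g + w)
        total += abs(g - g_hat) ** 2
    return total / trials


# Hypothetical values: one pilot-sharing user, tau_p = 40, unit noise variance.
est_var, err_var = lmmse_variances(2.0, [0.5], 40, 1.0)
sim_err = simulated_error_variance(2.0, [0.5], 40, 1.0)
print(f"estimate variance {est_var:.4f}, error variance {err_var:.4f}, simulated {sim_err:.4f}")
```

The estimate and error variances sum to one, which reflects the independence of the estimate and the estimation error for a unit-variance channel.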

B. ZF RECEIVER
We first examine the received SINR when the ZF receiver is used at the CPU for joint detection of the transmitted signals. The channel matrix containing the known channels at the CPU, $\hat{H}_0$, is used to perform ZF. However, as $z$ is not white, we first apply a whitening filter, as outlined in [20], built from the covariance matrix of $z$ given in (17). In (17), the value of $\theta_m$ is associated with antenna $m$ and is expressed in (18). Note that the expectation in (17) is taken over the small-scale fading of the channel estimation error, the unknown channel coefficients, the out-of-cluster interference and the noise. The ZF receiver determines the output of the combiner as in (19), where the weight matrix is obtained from $\hat{H}_0$ and the whitening filter, both of which are known to the central processor. An expression for the instantaneous received SINR of an arbitrary serving user $n$ can be formulated using (19), as given in (20), where the interference-plus-noise matrix of the ZF receiver is $\tilde{H}_0 \tilde{H}_0^H + E E^H + H H^H + \sigma^2 I$. The denominator in (20) contains the instantaneous channel gains of both associated and non-associated users of a given BS. In the subsequent analysis, we adopt a methodology similar to that of [20] to obtain an approximation of the received SINR for an arbitrary user in a PJP system. For the sake of completeness, we include the critical steps of the process.
We note that the instantaneous channel gains of the associated users tend to be larger than those of the interfering non-associated users. Hence, we approximate the instantaneous channel gains of the non-associated users and the unknown channels of the associated users by their expected values [32]. Furthermore, by choosing an appropriate number of pilots, we can ensure that the estimation error remains negligible in comparison to the channel estimates [32]. As such, we replace the interference-plus-noise matrix in (20) with its expected value to obtain the approximation in (22). According to [33], the approximation given in (22) can be expanded as in (23), where $(\hat{h}_0)_n$ is the $n$-th column of $\hat{H}_0$ and $(\hat{H}_0)_{\bar{n}}$ represents $\hat{H}_0$ with the column vector $(\hat{h}_0)_n$ removed. Given that the influence of small-scale fading reduces as the number of antennas grows [34], we proceed to analyze $E[X]$ using Lemma 1 below.

Lemma 1: The expected value of a matrix of the form $R(R^H R + \beta I_{N-1})^{-1} R^H$, where $R$ is a $\mathbb{C}^{M \times (N-1)}$ matrix and $\beta$ is a real scalar, is a diagonal matrix $D$ specified by real scalars $d_1, d_2, \ldots, d_{N-1}$. The proof of Lemma 1 is given in Appendix A.
Therefore, when $R$ is the whitened matrix obtained from $(\hat{H}_0)_{\bar{n}}$ and $\beta = 0$, $E[X]$ can be expressed as a diagonal matrix. We also note that $X$ is an idempotent matrix (Appendix B) and therefore the trace of $X$ is equal to its rank [32]. Accordingly, following an approach similar to [21], we approximate $X$ by a simple diagonal matrix as in (26), where $\bar{X}$ is defined by spreading the trace of the idempotent matrix $X$ equally along the diagonal, $\bar{X} = \frac{N-1}{M} I_M$, and the remainder is an error matrix. Substituting (26) into (20), we can rephrase (20) as (27), with the numerator and denominator terms given in (28). It is worth noting that both the numerator and the denominator in (28) consist of quadratic forms that are sums of $M$ terms. We analyze them after dividing both by $M$. According to the law of large numbers for non-identical variables [35], as $M$ approaches infinity, the denominator converges to its mean. This convergence is guaranteed as all channel gains are finite, which gives (29). Some terms of $P_n$ in (29) could be zero, depending on whether the channel of user $n$ is learned at BS $i$, BS $j$ or BS $k$. However, at least $\min(M_i, M_j, M_k)$ of the diagonal terms in $P_n$ are guaranteed to be non-zero, as user $n$ is known by at least one of the BSs in the PJP cluster. Hence, for large $M$, the denominator of (29) divided by $M$ converges to its mean value, $\mu_{ZF}$.

For reference, the full expression (16) for the covariance of the estimated channel $\hat{g}_r$ is

$$\mathrm{diag}\left( \frac{P_{i,r}\tau_p}{\sum_{q \in \chi_r} P_{i,q}\tau_p + \sigma^2} I_{M_i},\ \frac{P_{j,r}\tau_p}{\sum_{q \in \chi_r} P_{j,q}\tau_p + \sigma^2} I_{M_j},\ \frac{P_{k,r}\tau_p}{\sum_{q \in \chi_r} P_{k,q}\tau_p + \sigma^2} I_{M_k} \right). \qquad (16)$$
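The trace-spreading step can be verified with a short worked derivation. This is a sketch under the assumption that $R$, the $M \times (N-1)$ matrix of Lemma 1 with $\beta = 0$, has full column rank; it is not taken from Appendix B.

```latex
\begin{align}
X &= R\,(R^{H}R)^{-1}R^{H}, \\
X^{2} &= R\,(R^{H}R)^{-1}\underbrace{R^{H}R\,(R^{H}R)^{-1}}_{I_{N-1}}R^{H} = X, \\
\operatorname{tr}(X) &= \operatorname{tr}\!\big((R^{H}R)^{-1}R^{H}R\big)
  = \operatorname{tr}(I_{N-1}) = N-1, \\
\bar{X} &= \frac{\operatorname{tr}(X)}{M}\, I_{M} = \frac{N-1}{M}\, I_{M}.
\end{align}
```

The second line shows idempotency, the third uses the cyclic property of the trace, and the last line distributes the trace evenly over the $M$ diagonal entries.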
Analyzing the numerator of (28) is more complex than analyzing the denominator. Although the error matrix is Hermitian, it is random, with interdependencies among its elements. Consequently, to facilitate our analysis, we shift our focus to mean-square convergence and examine the convergence of the numerator in the mean-square sense, using Lemma 2 below.
Lemma 2: Let $\xi$ denote a scalar given by a quadratic form in $u$ normalized by $M$, where $u$ is an $M \times 1$ zero-mean complex Gaussian vector with a diagonal covariance matrix and the matrix of the quadratic form is $M \times M$. The mean square of $\xi$, $E[\xi^2]$, converges to zero as $M$ approaches infinity.
We provide the proof of Lemma 2 in Appendix C. By Lemma 2, the error term in the numerator vanishes in the mean-square sense when $M \gg N$, which leads to the large-$M$ result in (31). Using (31), we can approximate $\gamma_n^{ZF}$ in (27) for $M \gg N$ by (32), together with the constant $C_{ZF}$ defined alongside it.

C. LMMSE RECEIVER
As discussed in our previous work [20], the possible noise amplification of a ZF receiver can be mitigated by adopting a linear alternative such as the LMMSE receiver. The LMMSE receiver is also of interest because it is the optimal linear receiver, maximizing the SINR [11]. Accordingly, we analyze the LMMSE receiver in this subsection. To do so, we first separate the desired user, $n$, from the rest of the signals and collect the signals from the unknown channels of associated users, the non-associated user signals and the noise into a single term. As such, (15) can be rephrased as (33). In (33), $s_{\bar{n}}$ contains the signals from the $N - 1$ associated users other than user $n$, the user under consideration. With LMMSE receiver combining, the estimated symbol for user $n$ is $\hat{s}_n = V_n^H y$, where $V_n$, given in (34), is the weight vector for user $n$ that yields the minimum mean squared error [36]. The correlation matrix appearing in (34) accounts for the noise plus non-associated user interference, the unknown channel coefficients and the channel estimation error of the estimated channels. Due to the independence of the terms in $z$, it simplifies to a diagonal matrix, where the value of $\theta_m$ is associated with antenna $m$ and is the same as in (18). The instantaneous received SINR for user $n$ when the LMMSE receiver is used is given in (36). The LMMSE interference term in (36) is defined in (37), where $(\tilde{H}_0)_{\bar{n}}$ and $E_{\bar{n}}$ represent $\tilde{H}_0$ and $E$ with the column vectors $(\tilde{h}_0)_n$ and $e_n$ removed, respectively. Note that the $r$-th column of $E$ corresponds to the unknown channel of the $r$-th associated user and is denoted $e_r$. As in Section IV-B, we proceed to obtain an approximation for the instantaneous SINR in (36). As before, we approximate the instantaneous channel gains of the non-associated users and the unknown channels of the associated users by their expected values. Because $(\hat{h}_0)_r$, $(\tilde{h}_0)_r$ and $e_r$ are all independent and zero mean, the expected value of the term defined in (37) reduces to zero. Therefore, we can derive an approximation for (36) in terms of the received interference-plus-noise correlation matrix. We now shift our focus to the numerator in (47). As per Lemma 2, (53) holds when $M \gg N$. From (53) and (54), we can conclude the large-$M$ result in (55). Using (44), (52) and (55), we finally approximate $\gamma_n^{LMMSE}$ in (46) for $M \gg N$ by (56), with the constant $C_{LMMSE}$ defined in (57).

This section analyzes the performance of the proposed architecture using the SINR approximations derived in Section IV. We derive novel approximations for the achievable rate and SEP for the two receivers considered. In this section, $\gamma_n$ and $C$ denote the received SINR for the receiver under consideration (i.e., $\gamma_n^{ZF}$ or $\gamma_n^{LMMSE}$) and the constant defined for that receiver (i.e., $C_{ZF}$ or $C_{LMMSE}$), respectively.

A. ACHIEVABLE RATE ANALYSIS
The achievable rate of user $n$ is given by (58), where $\alpha = \tau_p/\tau_c$ and $\tau_c$ is the coherence interval in symbols.
In scenarios with perfect CSI, the entire coherence block is used for data transmission, and (58) reduces to the case $\alpha = 0$. We note that Jensen's inequality provides a simple upper bound for $R_n$ [21], given by (59). The expected value of $\gamma_n$ in the form of (32) and (56) can be computed as in (60). When deriving (60), we have used (11). Substituting (60) into (59) results in the closed-form expression (61). Increasing $\tau_p$ facilitates the learning of more user channels but reduces the time allocated for data transmission, which affects the achievable rate in (61). In practice, $\tau_p$ should be selected to balance channel estimation quality against the achievable rate [37].
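The effect of the pilot overhead and of the Jensen bound can be illustrated with a short numerical sketch. It assumes that the rate has the common form $(1-\alpha)\,E[\log_2(1+\gamma_n)]$, which is an assumption since (58) is not reproduced here, and it uses exponentially distributed SINR samples purely as hypothetical test data. The function names are illustrative.

```python
import math
import random


def rate_upper_bound(mean_sinr, tau_p, tau_c):
    """Jensen upper bound: (1 - alpha) * log2(1 + E[gamma])."""
    alpha = tau_p / tau_c
    return (1.0 - alpha) * math.log2(1.0 + mean_sinr)


def rate_monte_carlo(sinr_samples, tau_p, tau_c):
    """Sample average of (1 - alpha) * log2(1 + gamma)."""
    alpha = tau_p / tau_c
    mean_log = sum(math.log2(1.0 + g) for g in sinr_samples) / len(sinr_samples)
    return (1.0 - alpha) * mean_log


rng = random.Random(0)
# Hypothetical SINR samples with mean 10 (linear scale); not the paper's distribution.
samples = [rng.expovariate(1.0 / 10.0) for _ in range(50000)]
mean_sinr = sum(samples) / len(samples)

ergodic = rate_monte_carlo(samples, tau_p=40, tau_c=200)
bound = rate_upper_bound(mean_sinr, tau_p=40, tau_c=200)
print(f"ergodic rate {ergodic:.3f} <= Jensen bound {bound:.3f} bits per channel use")
```

With $\tau_p = 40$ and $\tau_c = 200$, one fifth of each coherence block is spent on pilots, so both quantities are scaled by 0.8.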

B. SYMBOL ERROR PROBABILITY ANALYSIS
We use a characteristic function (CF) based approach to derive the SEP of user $n$. Noting that the channel estimates from user $n$ to each receiving antenna are independent and Gaussian, we first write the joint PDF of the channel gains in $(\hat{h}_0)_n$ as in (62), where $(\hat{h}_0)_n$ is zero-mean complex Gaussian with a covariance determined by $P_n$ and the covariance of $\hat{g}_n$. By substituting (62) into the definition of the CF of $\gamma_n$ in [38], we obtain the characteristic function of $\gamma_n$, which is given in (63). In (63), $(\hat{h}_0)_{m,n}$ is the estimated channel from user $n$ to the $m$-th antenna in the cluster. The multifold integral in (63) can be solved by applying [39, Lemma 2], which yields (64). Based on (64), we use the moment generating function (MGF) approach in [38] to derive an expression for the MGF of $\gamma_n$, $M_{\gamma_n}(s)$, given in (65). In (65), $(\vartheta_n)_m$ is associated with antenna $m$ and is the $(m, m)$-th element of the product of the inverse of the covariance matrix of $z$, $P_n$ and the covariance of $\hat{g}_n$. Its value therefore depends on the BS to which antenna $m$ belongs. Accordingly, the SEP with coherent $M_{PSK}$-ary phase shift keying modulation for user $n$ can be written as in (67) [38], where $Z = \sin^2(\pi/M_{PSK})$ and $T = (M_{PSK} - 1)\pi/M_{PSK}$. The expression in (67) can be evaluated using the tight approximation in [40], which produces (68). In Section VII, we present simulations showing that (61) and (68) give accurate approximations for the achievable rate and SEP, respectively, for an arbitrary serving user $n$.
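The MGF-based SEP integral can be evaluated numerically as a cross-check. The sketch below uses the standard M-PSK form $\frac{1}{\pi}\int_0^{T} M_{\gamma}(-Z/\sin^2\theta)\,d\theta$ and assumes, for illustration only, that the SINR is a sum of independent exponential terms, so that the MGF is a product of factors $1/(1 - s\,g_m)$. The per-antenna gains `branch_gains` are hypothetical stand-ins for the scaled $(\vartheta_n)_m$ values, and the integral is computed by a plain midpoint rule rather than the approximation in [40].

```python
import math


def mgf_product(s, branch_gains):
    """MGF of a sum of independent exponential terms with the given means."""
    value = 1.0
    for g in branch_gains:
        value /= (1.0 - s * g)
    return value


def mpsk_sep(branch_gains, m_psk, steps=20000):
    """Numerically integrate the MGF-based SEP expression for coherent M-PSK."""
    z = math.sin(math.pi / m_psk) ** 2
    t = (m_psk - 1) * math.pi / m_psk
    h = t / steps
    total = 0.0
    for k in range(steps):
        theta = (k + 0.5) * h  # midpoint rule, which avoids theta = 0
        total += mgf_product(-z / math.sin(theta) ** 2, branch_gains)
    return total * h / math.pi


# Single Rayleigh branch with mean SNR 10: BPSK has a known closed form.
sep_single = mpsk_sep([10.0], m_psk=2)
closed_form = 0.5 * (1.0 - math.sqrt(10.0 / 11.0))
sep_two = mpsk_sep([10.0, 10.0], m_psk=2)
print(f"numerical {sep_single:.6f}, closed form {closed_form:.6f}, two branches {sep_two:.6f}")
```

The single-branch BPSK case reproduces the classical Rayleigh-fading result, and adding a second branch lowers the SEP, which is consistent with the improvement observed as the number of antennas grows.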

VI. COMPLEXITY ANALYSIS AND NETWORK OVERHEAD
In this section, we analyze the computational complexity and network overhead per coherence block for the proposed approach and compare them with those of the two clustering schemes outlined in Section II. The complexity is analyzed in two respects, combining vector computation and channel estimation, whereas the network overhead is measured by the number of user channels transmitted from the BSs to the CPU. The computational complexity of combining vector computation is generally similar for identical receiver designs and depends on the size of the channel matrix, which in turn depends on the number of users processed and the number of BSs in the cooperating cluster. Quantifying this is challenging because the parameters involved are situation-specific. However, our proposed approach introduces a block structure to the channel matrix, resulting from the zeroing of the channels of the weak users. This additional structure lowers the complexity of computing combining vectors in the proposed approach compared with traditional static clustering with the same number of cooperating BSs but a full channel matrix. The saving arises because the inverse of a block matrix can be computed more efficiently using the Schur complement [41].
The channel estimation complexity and the network overhead depend on the total number of user channels estimated in a coherence block, $T_C$. Note that the number of user channels estimated at a BS in the proposed architecture depends on the predefined threshold $\lambda$ and on $N_{max}$, and can be adjusted to best suit the system. For the three approaches considered, the metric $T_C$ is given below.
While a specific channel estimation scheme has been employed in Section III, the proposed system is independent of the chosen channel estimation approach. Thus, assuming that the estimation of a user channel requires $C$ complex operations, the channel estimation complexity is $C\,T_C$. Additionally, the network overhead amounts to $T_C$ complex symbols per coherence block.
In Section VII, we use the metric $T_C$ in (69) to gain insight into the complexity and network overhead by providing numerical results for the example scenario shown in Fig. 1. Although PJP outperforms the traditional static clustering architecture (Fig. 5), the number of user channels learned in PJP is lower (Table 1), supporting our claims of low network overhead and complexity. Furthermore, we discuss how adjusting the predefined threshold $\lambda$ increases or decreases the number of user channels learned, which in turn directly influences the overall system performance.

VII. SIMULATIONS AND NUMERICAL RESULTS
In this section, we verify the analysis presented in the previous sections through Monte Carlo simulations. While our results apply to any general setup as detailed in Section II, for verification purposes we use the example scenario in Fig. 1, where the total number of users in the system is $N_{sys} = 119$. We illustrate the accuracy of our approximations for the SINR, achievable rate and SEP for both receivers considered. We use a distance-dependent path loss model as detailed in [11] and set the path loss exponent to 4. We assume that all the BSs in the PJP cluster have an equal number of antennas (i.e., $M_i = M_j = M_k$).
In this setting, within each coherence interval, a share of resources is set aside for the pilot transmissions from the users. As such, we set $\tau_c = 200$ and $\tau_p = 40$. In the subsequent simulations, we divide the $\tau_p$ pilots into disjoint groups and assign one group to each cell. The classical four-color theorem [42] is used to ensure that no two neighboring cells share the same pilot group; each cell is therefore assigned 10 pilots. We use the same pilot assignment scheme for all three clustering methods.⁵ As the four-color assignment restricts adjacent cells to different sets of pilots, BSs in both the 3-cell static clustering approach and PJP can learn additional users outside the assigned cell without having two users sharing the same pilot. However, when it comes to learning the channels of users located outside the designated cell area, the BSs in these two approaches choose different users. In 3-cell static clustering, a BS learns the channels of users located within the cell areas of the other two cooperating BSs. In PJP, on the other hand, a BS chooses to learn the channel of a user surpassing the threshold in (3), and these users are not required to be located in a specific cell, as long as the BS is not currently serving a user on that particular pilot. In the following plots, we consider $U_1$ as the user of interest unless otherwise stated.

⁵ Although we consider the aforementioned pilot assignment scheme in the numerical results, our approach is not constrained to this specific scheme. This pilot assignment was deliberately chosen to enable a fair comparison between the proposed scheme and the 3-cell static clustering approach. With this pilot assignment strategy, both approaches can operate without being adversely affected by pilot contamination. This ensures that any performance differences observed between the two methods are attributable solely to their inherent characteristics rather than to the coherent interference caused by pilot contamination.
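The pilot grouping described above can be sketched as follows. This is an illustrative sketch only: the cell adjacency is a hypothetical five-cell layout rather than the layout of Fig. 1, and the greedy coloring used here is a simple stand-in that is not guaranteed to succeed with four colors on every planar layout, even though a valid four-coloring always exists.

```python
def split_pilots(num_pilots, num_groups):
    """Split pilot indices 0..num_pilots-1 into equally sized disjoint groups."""
    size = num_pilots // num_groups
    return [list(range(g * size, (g + 1) * size)) for g in range(num_groups)]


def colour_cells(adjacency, num_colours):
    """Greedy colouring: give each cell the lowest colour unused by its neighbours."""
    colours = {}
    for cell in sorted(adjacency):
        used = {colours[n] for n in adjacency[cell] if n in colours}
        free = [c for c in range(num_colours) if c not in used]
        if not free:
            raise ValueError("greedy colouring ran out of colours for this ordering")
        colours[cell] = free[0]
    return colours


# Hypothetical adjacency between five cells (symmetric neighbour sets).
adjacency = {
    1: {2, 3, 4, 5},
    2: {1, 3},
    3: {1, 2},
    4: {1, 5},
    5: {1, 4},
}

groups = split_pilots(num_pilots=40, num_groups=4)   # tau_p = 40, four groups of 10
colours = colour_cells(adjacency, num_colours=4)
pilots_per_cell = {cell: groups[colour] for cell, colour in colours.items()}

for cell in sorted(pilots_per_cell):
    print(cell, pilots_per_cell[cell][0], "...", pilots_per_cell[cell][-1])
```

Because neighboring cells never receive the same group, a BS can learn users in adjacent cells without two of its learned users sharing a pilot.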
First, we demonstrate the validity of our approximations for the instantaneous received SINR by plotting its cumulative distribution function (CDF) for the user under consideration. In Fig. 2, we vary the number of antennas at each BS as $M_i = 16, 32, 64$ and plot the CDFs of the approximate received SINR in (32) and (56) alongside those of the exact SINR in (20) and (36), respectively. The plots show that our analytical expressions in (32) and (56) closely approximate the exact received SINR and that the accuracy of the approximation increases as the number of antennas at each BS grows. Interestingly, even for a highly loaded system in which the ratio between the number of antennas and the number of users is 48:37 (i.e., $M_i = 16$), the approximation still holds well. In such a heavily loaded system, we would not typically expect large-array approximations of this kind to remain so accurate. These results are therefore very promising.
Next, Fig. 3 and Fig. 4 consider PJP and plot the achievable rate and SEP, respectively, of the user under consideration versus the received SNR. When obtaining the SEP results, we assume that the transmitted symbols are modulated using binary phase shift keying (BPSK). We consider three scenarios by setting the number of antennas at each BS to $M_i = 16, 32, 64$. Fig. 3 shows that when the number of antennas is large, our approximate achievable rate results, generated using (61), very accurately predict the simulation points, produced via Monte Carlo simulation, throughout the full range of SNRs. It should be noted that reducing the number of cooperative antennas below $M = 39$ is not viable, as the number of users served in the PJP system would then exceed the number of antennas available to serve them. In this overloaded scenario, the ZF receiver cannot be used, as the matrix requiring inversion is singular. Interestingly, the approximation still holds remarkably well even as the number of antennas approaches this limit (i.e., $M = 48$). We also observe that the achievable rate increases with the number of antennas. Furthermore, Fig. 4 shows that the approximate error probability results, generated using (68), closely align with the values obtained through Monte Carlo simulation. Moreover, the error probability performance improves as the number of antennas increases. The gaps between the analytical approximations and the simulation points in both plots are very small even when $M_i$ is 16.

To emphasize the advantages of the proposed method, we contrast its performance with that of the two clustering techniques described in Section II.⁶ This comparison, depicted in Fig. 5, is based on the achievable rate when the LMMSE receiver is used. We evaluate the achievable rates of $U_1$, who is located at the cluster center, and $U_2$, who is located at the cluster edge, in the static clustering architecture. Results for the PJP approach are shown for threshold $\lambda_2$, for which the selected users are depicted in Fig. 1(b). Both $U_1$ and $U_2$ are processed by the static cluster formed by the cooperation of BS$_1$, BS$_2$ and BS$_3$. Note that $U_1$ would be served by the BS$_1$-BS$_2$-BS$_3$ cluster in the PJP approach and by BS$_3$ in the non-cooperating approach. Similarly, the BS$_1$-BS$_4$-BS$_5$ cluster processes $U_2$ in the PJP approach, and BS$_1$ processes $U_2$ in the non-cooperating approach. Results for the static clustering and non-cooperating approaches are based on [23] and [36], respectively. The plot shows a higher performance for both users when they are processed by the proposed architecture. In particular, $U_2$, who is a cluster-edge user, has a significantly better achievable rate when processed by the proposed approach. This performance gain is expected, as our modified user-centric architecture treats the traditionally defined cluster-edge users as cluster-center users of a different PJP cluster. Note that the slight performance gap between the two users in the PJP architecture arises because, in our simulation setup, $U_2$ experiences interference from a higher number of users than $U_1$. Moreover, the effect of non-associated user interference leads to a saturation regime at high SINRs, where a further increase in SINR does not notably improve the achievable rate. It should be noted that this phenomenon is not unique to the proposed approach but appears in traditional clustering approaches as well [32]. Consequently, traditional out-of-cluster interference mitigation techniques could be applied to the proposed approach to alleviate such interference, particularly in high-density networks.

⁶ Due to the large number of orthogonal pilots required to operate without pilot contamination, the associated excessive network overhead, and the computational complexity linked with dynamic cooperation, we refrain from comparing the proposed approach with dynamic clustering. However, it is worth noting that, as the BSs possess CSI about more users, dynamic clustering is expected to offer superior performance.
In Fig. 6, we examine the PJP approach under three different threshold values, where $\lambda_1 > \lambda_2 > \lambda_3$. We have also tabulated the number of user channels learned by each architecture and at different $\lambda$ values in Table 1. From Fig. 6 and Table 1, it is evident that there is a trade-off between the chosen threshold and the number of user channels learned by a BS, which influences the overall performance of the system. Opting for a lower threshold leads the BSs to learn a higher number of user channels, thereby enhancing the overall performance of the system. Conversely, setting $\lambda$ to a large value prevents BSs from learning the channels of users beyond the designated cell area. In such a scenario, cooperation becomes redundant, as each BS only learns the channels of the users within its assigned cell area. Consequently, the system reverts to a non-cooperative setup and the performance of the PJP system collapses to that of non-cooperation. However, using (56) with the PJP channel matrix ($H_0$) to derive approximations would result in errors. This discrepancy arises because, even though the channel of each user is learned at only a single BS, (57) contains terms for $N = |\mathcal{N}^{PJP}_i| + |\mathcal{N}^{PJP}_j| + |\mathcal{N}^{PJP}_k|$ users. Hence, when the system reverts to a non-cooperative setup, $H^{NC}_i$ should be used in the analysis. With this adjustment, the achievable rate of the PJP approach corresponds to the yellow curve in Fig. 6.
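The relationship between the threshold and the number of learned channels can be sketched with a simple selection rule. The exact rule in (3) is not reproduced here, so this sketch assumes that a BS learns the users whose received SNR is at least the threshold, strongest first, up to a cap of $N_{max}$ users. The pilot-availability condition is ignored, and all SNR values, threshold values and names are hypothetical.

```python
def select_users(snr_db_by_user, threshold_db, n_max):
    """Return up to n_max users whose received SNR meets the threshold, strongest first."""
    strong = [(snr, user) for user, snr in snr_db_by_user.items() if snr >= threshold_db]
    strong.sort(reverse=True)
    return [user for _, user in strong[:n_max]]


# Hypothetical received SNRs (dB) of six users at one BS.
snr_db = {"u1": 18.0, "u2": 12.5, "u3": 7.0, "u4": 3.5, "u5": -1.0, "u6": -6.0}

# Three hypothetical thresholds with lambda_1 > lambda_2 > lambda_3.
thresholds_db = [15.0, 5.0, -3.0]
learned = [select_users(snr_db, lam, n_max=4) for lam in thresholds_db]
counts = [len(users) for users in learned]

for lam, users in zip(thresholds_db, learned):
    print(f"threshold {lam:5.1f} dB -> {users}")
```

Lowering the threshold never reduces the number of learned channels in this sketch, and the cap limits the growth, which mirrors the trade-off between performance and overhead discussed above.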
As is evident from Table 1, to provide similar performance for $U_1$, a BS operating in the proposed approach needs to learn around 16 user channels on average, whereas the same BS operating in the static clustering approach learns 27 user channels. The total number of user channels estimated in a coherence block, $T_C$, for each architecture is presented in Table 1. These values are obtained by substituting the corresponding values into (69a), (69b) and (69c) when $M_i = M_j = M_k = 32$. For non-cooperation, $T_C$ is given for BS$_2$, which learns the highest number of user channels. As such, a BS in the proposed approach sends a lower number of user channels to the CPU. Consequently, the proposed approach exhibits a 40% reduction in the network overhead and channel estimation complexity for this example. Accordingly, the proposed architecture improves the achievable rate of traditional cluster-edge users while also reducing the overhead on the network.
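The quoted reduction can be reproduced with simple arithmetic. Since (69) is not reproduced here, the sketch assumes that the overhead scales with the number of learned user channels per BS multiplied by the number of antennas per BS, and that each of the three BSs learns the quoted average number of users. Both are assumptions made only for illustration, and the function names are hypothetical.

```python
def total_channels(users_learned_per_bs, antennas_per_bs):
    """Total channel coefficients estimated across the cluster (assumed form of T_C)."""
    return sum(users * antennas_per_bs for users in users_learned_per_bs)


def relative_reduction(baseline, proposed):
    """Fractional saving of the proposed scheme relative to the baseline."""
    return (baseline - proposed) / baseline


# Values quoted in the text: about 16 (PJP) versus 27 (static) channels per BS, M_i = 32.
pjp_total = total_channels([16, 16, 16], antennas_per_bs=32)
static_total = total_channels([27, 27, 27], antennas_per_bs=32)
reduction = relative_reduction(static_total, pjp_total)

print(f"PJP {pjp_total}, static {static_total}, reduction {100 * reduction:.1f}%")
```

The ratio depends only on the per-BS counts, so the saving of roughly 40% holds for any common number of antennas under these assumptions.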

VIII. CONCLUSION
In this paper, we proposed a modified user-centric approach to BS cooperation. We considered the uplink of a multi-cell scenario where a cluster of large antenna arrays jointly detects the signals from multiple serving users. We obtained novel closed-form expressions to approximate the instantaneous received SINR, the achievable rate and the SEP of an arbitrary serving user for both the ZF and LMMSE receivers. Our approximations were derived for Rayleigh fading channels and took into consideration the impact of channel estimation errors as well as interference stemming from non-associated users. The accuracy of the derived approximations was illustrated through simulations. The performance of the proposed scheme was compared with that of more traditional clustering schemes using numerical examples, which showed the advantages of the proposed scheme. Exploring the impact of pilot contamination on the proposed scheme is an interesting avenue for future extensions of this research.

APPENDIX A PROOF OF LEMMA 1
In the following, we show that $D$ is a diagonal matrix by proving that its off-diagonal elements are zero. First, we write an off-diagonal element of $D$ as in (A.1), where $i \neq j$, $r_i$ is the $i$-th row of $R$ and $Q = R_{\bar{i}}^H R_{\bar{i}} + \beta I_{N-1}$, with $R_{\bar{i}}$ representing $R$ with the row vector $r_i$ removed. Conditioned on $Q$ and $r_j$, the expectation in (A.1) can be re-expressed in terms of an inner expectation, $T_{i,j}$, taken over the variables in the row vector $r_i$. We rearrange the inverse in $T_{i,j}$ as in (A.3) by applying the formula for the inverse of a small-rank adjustment [43].
As $E[r_i] = 0$, the first expectation vanishes and (A.3) can be further simplified. Denoting the remaining term inside the expectation by $\Upsilon$, we have $T_{i,j} = E[\Upsilon \mid Q, r_j]$. Next, we substitute $r_i = v_i G_i$, where the row vector $v_i \sim \mathcal{CN}(0, I)$ and $G_i$ is a diagonal matrix, to re-express $\Upsilon$. As $Q$ is a Hermitian matrix, it can be written in terms of a unitary matrix and a diagonal matrix. The resulting matrix $W$ is also Hermitian and can therefore be decomposed in the same way, in terms of a unitary matrix and a diagonal matrix $B$, which leads to (A.7). Now (A.7) can be expressed in simpler notation as (A.8), where the row vector $x$ is obtained from $v_i$ and each element of $x$ has a complex normal distribution. In (A.8), the denominator is an even function of $x$ whereas the numerator is odd. As such, the distribution of $\Upsilon$ is symmetric around zero. Therefore, the expectation of $\Upsilon$ is zero, i.e., $T_{i,j} = 0$. Finally, it follows that the matrix $D$ is diagonal, as $D_{i,j} = 0$ for all $i \neq j$.

APPENDIX B PROOF FOR THE IDEMPOTENT NATURE OF X
The idempotent nature [43] of matrix X can be shown from the below simplifications.r,r denotes the r th diagonal element of the corresponding matrix.We proceed to upper bound the expected value of (C.4) by replacing the diagonal matrix P n by (P n ) max I with (P n ) max representing the maximum element in P n , and ( −1 ) r,r ≤

FIGURE 1. Example of BS positions from a homogeneous PPP and users uniformly distributed in the corresponding Poisson-Voronoi cells.Small dots represent BSs and crosses represent users.

FIGURE 2. The CDF of the received SINR of U1 when processed by the proposed approach.

FIGURE 3. Achievable rate of U1 when processed by the proposed approach.

FIGURE 4. SEP of U1 when processed by the proposed approach.

FIGURE 5. Achievable rate of U1 and U2 when processed by different architectures.

FIGURE 6. Achievable rate of U1 and U2 at different λ values.

2 E ξ 2 = |u H u| 2 M 2 .|u k | 2 + 2 = 1 r 1 2r
Consider scalar ξ defined as ξ = (u H u) M where u ∼ CN (0, u ) is a 1×M vector with diagonal covariance matrix u and is an M × M diagonal matrix.The expectation of the mean square of ξ is,j r =j u * r r,j u j u r * r,j u * j , (C.2b)where (.) * denotes the complex conjugate, r,j denotes the (r, j)-th element of and u r denotes the r th element of u.The expression in (C.2b) is obtained by noticing that when s = k, E[u s u k ] = 0 and rearranging the terms in (C.2a).Since |u j | 2 has an exponential distribution, E[|u j | 4 ] = 2σ j , where σ 2 j is the entry at (j, j)-th position of u .Now we can further simplify (C.2b) as (C.3) shown at the bottom of the page.To evaluate the first term in (C.3), consider a diagonal element of H given by,[P n ] r,r −,r ( κ ) r −1 P n ( κ ) H r M 2 .(C.4)In (C.4), ( κ ) r is the r th row of the matrix κ and ( ) − ,r and (P n )1 2 .1) is the sum of (C.7) and (C.9) we can writeE ξ 2 → 0, when M → ∞, (C.10)which concludes the proof of mean-square convergence in (55).