MIMO and Massive MIMO Transmitter Crosstalk

The effects of hardware-induced crosstalk in MIMO transmitters, subject to nonlinear power amplifier distortion, are considered in this paper. A methodology that provides tractable analytical results and a clear understanding of the effects of crosstalk on transmitter performance is introduced and applied to different transmitter models. In particular, a physically motivated $2 \times 2$ MIMO transmitter model, which is subjected to input and output crosstalk, is studied in detail, as well as a behaviorally motivated transmitter model, which is subjected to linear crosstalk. For the latter structure, asymptotic results, when the number of transmitters tends to infinity, are derived. These results provide insight into different 1D and 2D transmitter structures in the massive MIMO scenario. It is shown that the transmitter crosstalk degrades the performance in terms of normalized mean squared error (NMSE) by 3 dB going from a $2\times 2$ set-up to a 1D array with a large number of transmitters, and by an additional 3 dB going from a 1D to a 2D transmitter structure. Transmitter input power back-off optimization is further studied, with a back-off determination that takes the effects of MIMO crosstalk into account in order to increase the energy efficiency of the transmitter.

present in a practical SISO transmitter, the MIMO transmitter adds additional artifacts, for instance, leakage between the transmitter branches or antennas, so-called crosstalk, that negatively influence its performance [1]. The crosstalk may originate from a plurality of sources, including circuit board couplings [1]. Currently, MIMO transmitters are typically 2 × 2 structures, e.g. for IEEE 802.11 [2], [3], LTE [4], and 79 GHz radar [5], with tailored 2 × 2 MIMO approaches for digital error correction [6]. In particular, the crosstalk was identified as a key issue for the 5-GHz 108-Mb/s 2 × 2 MIMO WLAN transceiver in [2]. Work on M × M MIMO transmitters for M > 2 includes the effects of housing the transmitters in a fixed physical space [7] and system identification and measurements on 3 × 3 MIMO set-ups [8], [9].
With the massive MIMO trend [10], [11], large-M transmitter structures can be expected, and they will have imperfections included. Understanding these new transmitter structures as illustrated in Fig. 1 is one of the main motivations for this paper. This paper studies the behavior and performance of M × M MIMO transmitters based on a methodology that dates back to the classical work of Bussgang [12]. Early results on the performance of SISO systems under different kinds of nonlinearities include the work of [13], [14]. The methodology has recently gained renewed popularity and has been used to analyze the performance of orthogonal frequency-division multiplexing (OFDM) excited transmitters, spanning from the SISO case in [15]-[18], 2 × 2 MIMO transmitters in [19], to multi-user massive MIMO scenarios including quantized precoding [20], 1-bit ADCs in uplink [21], [22], and precoding and quantization in the downlink [23]. In this paper, this traditional methodology is applied to the analysis of MIMO transmitters with crosstalk and nonlinear distortion. It provides tractable analytical results that give a clear understanding of the transmitter behavior, without relying on laboratory experiments or numerical simulations, such as the behavior modeling work [6], [24]. Behavior models can be used in digital predistortion design [6], [25] and in numerical system simulations [26], whereas analytical expressions for the nonlinear distortion noise can be used in decoding algorithms [27].

This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see http://creativecommons.org/licenses/by/4.0/
The focus of the present work is on details of the transmitter's hardware impairments, namely crosstalk before and after the amplifiers and amplifier nonlinear distortion; previous work on hardware impairments in MIMO [28] and massive MIMO transmitters and systems [29]-[31] focuses on system performance but is less detailed on the specific impairments (crosstalk and nonlinear distortion combined) studied in this paper. In [32] the effect of crosstalk before the amplifier in 2 × 2 transmitters on bit error rate was investigated.
A methodology to analyze the performance of nonlinear MIMO systems is presented, where the contributions are focused on a flexible, yet compact model-based description of the imperfections of the transmitter, using common quality measures for hardware impairments like nonlinear distortion and crosstalk. Although the presented methodology is generic, here it is applied to M × M MIMO baseband behavior models of radio frequency transmitters. Results are included both for 2 × 2 MIMO transmitters and for transmitters with an arbitrary number M of branches. As a special case, the asymptotic transmitter performance when M → ∞ is included, which provides insight into the effects of nonlinear distortion for the massive MIMO set-up. The approach provides the researcher and practitioner with a methodology to understand the nonlinear behavior of a MIMO transmitter, including the nonlinear distortion, the effects of crosstalk, and how the performance depends on the correlation of the input streams that can be a result of digital beamforming or precoding. It is, thus, useful for transmitters in applications such as point-to-point and massive MIMO. The performance of entire communication systems is, however, not analyzed. Such analyses would also include channel and receiver models.
The paper is motivated by its contributions to the understanding of dirty radio MIMO transmitters [33]:
• Generic methodology for nonlinear systems: The paper provides a generic approach using a matrix-based methodology to analyze static nonlinear MIMO systems which are subject to Gaussian input excitation. The methodology is well suited to analyze the behavior of radio frequency transmitters which are subject to OFDM input streams. The strength of the methodology is first illustrated by considering a model of a 2 × 2 MIMO radio frequency transmitter which is subject to input and output crosstalk [19]. This transmitter model relies on the physical properties of co-located transmitters. This is illustrated in Fig. 2. It is shown that the methodology is able to derive results that conform with the recent theoretical and experimental work on the provided structure [19], [34], which both used a direct and accordingly cumbersome analytical approach to analyze the performance of   Fig. 1. In particular, 2D grid placements have been identified as a key technology for 5G New Radio [35]. The performance of dirty MIMO transmitters subject to their 1D or 2D physical space is investigated, with respect to the inherent transmitter crosstalk. The derived results on transmitter NMSE are used to study the transmitter performance under different constraints on the transmitter layout.

The paper is organized as follows. In Sec. II, the applied methodology is presented, and closed-form results are derived for the Bussgang gain matrix, the properties of the nonlinear distortion error, and the total error of the considered transmitter system. In Sec. III, the derived methodology is applied to an established 2 × 2 MIMO transmitter model, verifying the work in [19]. Based on the obtained results, an alternative behavior model structure is suggested that grasps the dominant behavior of the original model. The alternative model is suitable for the analysis of M × M MIMO transmitter models for M > 2 in different 1D and 2D structures. Sec. IV provides an analysis of the M × M MIMO transmitter set-up, including the asymptotic massive MIMO case when M → ∞. Sec. V provides the conclusions of the work.

II. ANALYZING NONLINEAR MIMO SYSTEMS
We present the theory in some detail. Parts of Sections II.A and II.B have been presented earlier [20]. We develop a formalism for treating nonlinearities with a linear and a nonlinear part, suitable for RF transmitters.

A. Signal Model
Consider a static and linear M × M MIMO system excited by the vector input u, that is, the ideal output is $r_o = H_o u$ (1), and let the input covariance be $U = E[u u^H]$ (2), where superscript H denotes Hermitian transpose, and E[·] denotes statistical expectation. The covariance matrix U describes both the input crosstalk and the possible correlation of the input streams due to linear precoding or beamforming. In this work, only crosstalk is considered because it is known that on average linear precoding or beamforming do not influence the NMSE of the transmitter [19]. With reference to Fig. 2, the signal u (exciting the nonlinearities) is, in general, an internal transmitter signal; however, in this work, it is typically used to refer to the input signal (exciting the nonlinearity). The practical MIMO system is subject to nonlinear distortion and inter-channel crosstalk. To capture these effects, the practical M-vector output r is modeled by $r = H u + G f(u)$ (3), where the matrix $H \in \mathbb{C}^{M \times M}$ contains the weights for the linear relations, that is, the desired small signal gain and the input linear crosstalk. Furthermore, $G \in \mathbb{C}^{M \times N}$ is a weighting matrix and $f(u) \in \mathbb{C}^{N}$ denotes the N-column vector with the nonlinear terms, up to a given polynomial order, that are relevant for the application.
Let y denote the baseband model of the observable transmitter output, that is $y = P r + n$ (4), where the matrix $P \in \mathbb{C}^{M \times M}$ in (4) handles possible output crosstalk, and n is the zero-mean complex-valued Gaussian thermal noise, independent of the input u. The covariance of the thermal noise is denoted by N, that is $N = E[n n^H]$ (5).

B. Bussgang Gain of Nonlinear System
To perform the analysis, the MIMO system signal r after the nonlinearity (3) is expressed as an attenuated version of the input u plus a distortion noise v. Here, the distortion noise v is uncorrelated with both the input u and the thermal noise n. Accordingly, by construction, the output r in (3) should fulfill $r = s + v$ (6), where s in (6) is a linearly transformed version of the input u, that is $s = H A u$ (7). In (7), the matrix $A \in \mathbb{C}^{M \times M}$ is called the Bussgang gain to honor the initial work [12]. An expression for A is derived next. It is clear from (7) that the covariance of s reads $S = E[s s^H] = H A U A^H H^H$ (8). For the distortion noise v in (6) to be uncorrelated with the input u it is required that $E[v u^H] = 0$ (9), where 0 is the zero matrix of appropriate size. With the requirement (9), define $W = E[r u^H]$ (10) as the output-input cross correlation matrix. It follows directly from the original signal model (3) that W defined in (10) can be expressed as $W = H U + G \bar{U}$ (11). In (11), the N × M matrix $\bar{U}$ contains the higher order moments of u, and is defined by $\bar{U} = E[f(u) u^H]$ (12). Now, starting with the model (6) that introduces the distortion noise, the correlation matrix W in (10) can alternatively be expressed as $W = E[(s + v) u^H] = H A U$ (13), where (7) and (9) were used in the second equality. Assuming that the involved matrix inverses exist, the Bussgang gain A follows from (13) as $A = H^{-1} W U^{-1} = I + H^{-1} G \bar{U} U^{-1}$ (14), where the second equality follows from (11). In (14), the Bussgang matrix $A_\Delta = H^{-1} G \bar{U} U^{-1}$ is defined as a part of the Bussgang gain A, that is $A = I + A_\Delta$. Note that the Bussgang gain A is a function of the system matrices H and G, and the moment properties of the input signal u via U and $\bar{U}$. For a linear system, G = 0 in (3) and accordingly A = I, as expected.
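The construction of the Bussgang gain can be illustrated numerically for a single branch. The sketch below assumes a scalar system with H = 1, G = ρ and the cubic term f(u) = u|u|², and estimates A as the ratio of the output-input cross-correlation to the input variance. The parameter values and the function name `bussgang_scalar` are hypothetical, chosen only for illustration; the closed form 1 + 2ρσ² is the scalar gain quoted later in the paper with δ = 0.

```python
import random

def bussgang_scalar(rho, sigma2, n_samples=200_000, seed=0):
    """Monte Carlo estimate of the scalar Bussgang gain for r = u + rho*u*|u|^2."""
    rng = random.Random(seed)
    std = (sigma2 / 2.0) ** 0.5  # std of the real and imaginary parts of u
    w_sum = 0j    # accumulates r * conj(u), i.e. the cross-correlation W
    u_sum = 0.0   # accumulates |u|^2, i.e. the input variance U
    samples = []
    for _ in range(n_samples):
        u = complex(rng.gauss(0.0, std), rng.gauss(0.0, std))
        r = u + rho * u * abs(u) ** 2
        w_sum += r * u.conjugate()
        u_sum += abs(u) ** 2
        samples.append((u, r))
    a_hat = w_sum / u_sum  # A = W / U for the scalar case with H = 1
    # Sample correlation between the residual v = r - a_hat*u and the input u.
    resid = sum((r - a_hat * u) * u.conjugate() for u, r in samples) / n_samples
    return a_hat, resid

rho, sigma2 = -0.05, 1.0             # hypothetical compression and input power
a_hat, resid = bussgang_scalar(rho, sigma2)
a_closed = 1.0 + 2.0 * rho * sigma2  # scalar Bussgang gain for delta = 0
print(a_hat.real, a_closed, abs(resid))
```

The residual correlation is zero up to rounding because the estimate of A is defined from the same samples, which mirrors the "uncorrelated by construction" argument in the text.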

C. Distortion Noise Properties
The model (6)-(7) describes the output from the nonlinear function in terms of a linearly attenuated replica of the input u, and a distortion noise v. The properties of the latter term are as follows. From (6), the distortion noise reads $v = r - s$ (15).
The distortion noise covariance V is defined by $V = E[v v^H] = R - S$ (16), where $R = E[r r^H]$ and S is defined in (8). The second equality in (16) follows since s and v are uncorrelated.
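In order to calculate R in (16), note from the original signal model in (3) that $R = H U H^H + 2\Re[G \bar{U} H^H] + G \bar{\bar{U}} G^H$ (17), where $\Re[Z]$ is shorthand for $(Z + Z^H)/2$, which reduces to the real part for a scalar $z \in \mathbb{C}$, and the higher order moment matrix $\bar{\bar{U}}$ of size N × N is introduced as $\bar{\bar{U}} = E[f(u) f(u)^H]$ (18). In a similar vein, first note that S in (16) is given in (8). Now, inserting the Bussgang gain derived in (14) into (8) yields $S = H U H^H + 2\Re[G \bar{U} H^H] + G \bar{U} U^{-1} \bar{U}^H G^H$ (19). Now, inserting the results (17) for R and (19) for S into the expression (16) for the error covariance V, a straightforward calculation yields $V = G (\bar{\bar{U}} - \bar{U} U^{-1} \bar{U}^H) G^H$ (20). The covariance V of the distortion error v is a function of the system matrix G, and the moment properties of the input signal u via $U^{-1}$, $\bar{U}$ and $\bar{\bar{U}}$.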

D. Properties of the Transmitter Output Error
Using (6)-(7), the observable transmitter output y in (4) reads $y = P H A u + P v + n$ (21). The model (21) contains three uncorrelated stochastic terms, and is equivalent to the model (4), subject to the condition (9). It is worth noting that u and v are uncorrelated by construction, as given in (9). However, they are clearly not statistically independent, as v is a result of a nonlinearity affecting u. Accordingly, v is typically not Gaussian, even though u is Gaussian by assumption.
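The distinction between uncorrelated and independent can be made concrete with exact moments. The sketch below assumes a single branch with H = 1 and f(u) = u|u|², for which the distortion noise is v = ρu(|u|² − 2σ²), and uses the moment rule E[|u|^{2n}] = n! σ^{2n} for a circular complex Gaussian input that is quoted in the appendix. The numerical values are hypothetical.

```python
from math import factorial

def abs_moment(n, sigma2):
    """E[|u|^(2n)] = n! * sigma2^n for a circular complex Gaussian u."""
    return factorial(n) * sigma2 ** n

rho, sigma2 = -0.05, 1.0  # hypothetical compression parameter and input power

def m(n):
    return abs_moment(n, sigma2)

# With v = rho * u * (|u|^2 - 2*sigma2):
cross = rho * (m(2) - 2 * sigma2 * m(1))                             # E[v u*]
var_v = rho ** 2 * (m(3) - 4 * sigma2 * m(2) + 4 * sigma2 ** 2 * m(1))  # E[|v|^2]
joint = rho ** 2 * (m(4) - 4 * sigma2 * m(3) + 4 * sigma2 ** 2 * m(2))  # E[|v|^2 |u|^2]

print(cross, var_v, joint, var_v * m(1))
```

The cross-correlation vanishes, while E[|v|²|u|²] differs from E[|v|²]E[|u|²], so v and u cannot be independent.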
With the observable baseband output y in (21), and the ideal output $r_o$ according to (1), the error signal e is given by $e = y - r_o = \tilde{u} + P v + n$ (22), with $\tilde{u} = (P H A - H_o) u$. The introduced signal $\tilde{u}$ captures the (linear part of the) error due to crosstalk and Bussgang attenuation, v is the nonlinear distortion noise, and n the thermal noise. From (3), (7) and (15) we get $v = v_L + v_{NL}$ with $v_L = -H A_\Delta u$ and $v_{NL} = G f(u)$ (23), where $v_L$ is linear and $v_{NL}$ nonlinear in u. The three terms in (22) are jointly uncorrelated. Accordingly, the error covariance reads $E = \tilde{U} + P V P^H + N$ (24), where $\tilde{U} = E[\tilde{u} \tilde{u}^H]$ is to be decided, V is given by (20), and N is the thermal noise covariance (5). From (22), it follows that the covariance matrix $\tilde{U}$ that appears in (24) is given by $\tilde{U} = (P H A - H_o) U (P H A - H_o)^H$ (25). The result $\tilde{U}$ in (25), V in (20), and (14) for the Bussgang gain A are the main results of the paper. The methodology and results of the analysis of the NMSE of different MIMO transmitters are presented later in the paper. The NMSE is defined in the next section of the paper.

E. Transmitter Figures of Merit
Since the focus of the paper is on the behavior and performance of the transmitter only, figures of merit reflecting the quality and performance of the communication link are not considered. Such figures require channel and receiver models. We refer to e.g. [37], [38] for such studies. Accordingly, the NMSE is adopted as the main figure of merit, where it is defined for the ℓ:th branch (that is, ℓ = 1, . . . , M) by $\mathrm{NMSE}_\ell = [E]_{\ell,\ell} / \gamma^2$ (26), where $[\cdot]_{\ell,\ell}$ denotes the (ℓ, ℓ):th element of the matrix within the brackets, and where the normalization $\gamma^2$ is with respect to the average input power and the ideal gain. With an ideal gain $h_o$ and input variance $\sigma^2$ in all the M branches, the normalization reads $\gamma^2 = |h_o|^2 \sigma^2$ (27). If it is assumed that the channel and receiver impairments are perfectly known and compensated for, the terms linear in u in (22) and (23) can be neglected and the NMSE becomes identical to the error vector magnitude (EVM) [29]. To analyze the cross-branch properties of the transmitter, additional figures of merit, such as the normalized error covariance (NEC), can be defined in a similar vein as the NMSE [19], but this is beyond the scope of this paper.
As a complement to the NMSE we use a measure of the out-of-band error, which only is affected by the transmitter's nonlinear properties. The out-of-band error of channel is conventionally measured by the adjacent channel power ratio (ACPR) (also called adjacent channel error power ratio) [ where E (ω) is the power spectral density (PSD) of the error signal and Y (ω) is the signal's PSD. Notice that in (28) the numerator is integrated over the adjacent channel and the denominator of the channel itself. The channel limits are given by the spectrum masks of the telecommunication standard in question. The terms in (22) and (23) that will contribute to the error signal in the adjacent channel are v NL and n, and, hence, in (28) where V NL, (ω) and N (ω) are the PSD of the :th branch of v NL and n, respectively. With the approximation that the linear gain is the same as the ideal gain and that the power of the nonlinear distortion is small compared to the power of the signal y , we get for the denominator in (28) ch.
The presented methodology is applied to the frequently used 2×2 transmitter model in Fig. 2 and verifies some of the results from the literature. The main result is an analytical expression for the NMSE as a function of parameters for the crosstalk, nonlinearity, and thermal noise. For experimental validations of the model, we refer to [34]. Based on the obtained results, an alternative behavior model is proposed. The alternative model accurately grasps the dominating behavior of the transmitter and neglects the less dominant terms.

A. Signal Model
With reference to Fig. 2, let the power amplifier output be described by $r_1 = u_1 + \rho u_1 |u_1|^2$ (31) and $r_2 = u_2 + \rho u_2 |u_2|^2$ (32). In (31)-(32), the compression parameter ρ is assumed real-valued, and ρ ≤ 0. A quasi-static performance analysis is possible by letting ρ and other transmitter parameters (as defined below) be complex-valued [19]. However, such analysis will not, except for the specific case, provide additional information worth the extra effort required. For the considered 2 × 2 behavior model, the reader is referred to [19] for such an analysis. We restrict the analysis to third order nonlinearities. Polynomial models with higher order terms, $\rho_n u |u|^{n-1}$, where n = 3, 5, . . ., could, however, be used. The third order terms are typically dominating and can be estimated from quality measures for amplifiers such as the 1 dB compression point or third order intercept point [39].
Nonlinear functions other than polynomials could be analyzed, in which case the calculation of $\bar{U}$ and $\bar{\bar{U}}$ would be different. For example, experimentally determined amplitude (AM/AM) and phase (AM/PM) distortion data could be used and $\bar{U}$ and $\bar{\bar{U}}$ calculated numerically. Using the matrix formulation, (31)-(32) is rephrased as $r = u + \rho f(u)$ with $f(u) = (u_1 |u_1|^2, u_2 |u_2|^2)^T$ (33). The input u is subject to crosstalk, that is the covariance matrix U reads $U = \sigma^2 \Delta \Delta^H$ (34), where $\Delta = \begin{pmatrix} 1 & \delta \\ \delta & 1 \end{pmatrix}$ (35), where δ ∈ R models the input crosstalk, and $\sigma^2$ is the joint power of the uncorrelated inputs before the crosstalk, that is, $x_1$ and $x_2$ in Fig. 2. The crosstalk is typically caused by circuit board couplings and is therefore modeled as a static linear reciprocal network [39]. Explicitly, combining (34) and (35), the covariance matrix U reads $U = \sigma^2 \begin{pmatrix} 1 + \delta^2 & 2\delta \\ 2\delta & 1 + \delta^2 \end{pmatrix}$ (36). To calculate the NMSE, the crosstalk induced correlation has to be compensated for. Accordingly, the ideal system matrix $H_o$ in (1) is given by where the second equality directly follows from (35). The components of the transmitter output y read where $\tilde{n}_1$ and $\tilde{n}_2$ are jointly uncorrelated thermal noises with common variance $\sigma_n^2$, and where μ ∈ R denotes the output crosstalk. Using the matrix formulation, (38)-(39) is expressed as where the covariance N of the equivalent thermal noise $n = (n_1\; n_2)^T$ due to the crosstalk reads

B. Bussgang Matrix and NMSE
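For the considered 2 × 2 transmitter model, it is shown in Appendix V-A that the Bussgang gain in (14) reads $A = (1 + a_\Delta) I$ with $a_\Delta = 2\rho (1 + \delta^2) \sigma^2$ (42), that is, $A_\Delta = a_\Delta I$ with reference to (14). As further shown in Appendix V-A, the covariance V of the distortion noise v in (15) reads $V = 2\rho^2 \sigma^6 \begin{pmatrix} (1 + \delta^2)^3 & 8\delta^3 \\ 8\delta^3 & (1 + \delta^2)^3 \end{pmatrix}$ (43). Due to symmetry, the NMSE is equal in both branches, that is $\mathrm{NMSE}_\ell = \mathrm{NMSE}$ for ℓ = 1, 2. The terms in the NMSE are derived in Appendix V-A, and the result is given in (44), shown at the bottom of this page.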
In (44), a = 1+a Δ is the scalar Bussgang gain given by the diagonal elements of A, with a Δ = 2ρ (1+δ 2 ) σ 2 according to (42). The NMSE shown in (44) is known from [19], where it was based on a cumbersome direct calculation of the NMSE. Accordingly, the proposed methodology provides a flexible tool for analytical and numerical evaluations of this class of models, which was not provided in [19].
In a first order approximation, the NMSE in (44) is given by where denotes an approximate expression where only the dominant terms are retained. The derivation is outlined in Appendix V-A. In (45), the linear crosstalk parameter ξ = δ+μ was introduced, which is a parameter that will be frequently used in the sequel.
To get an approximate expression for the ACPR we use that the output crosstalk μ ≪ 1 in (23), (31), and (32), and get that $v_{NL,\ell} = \rho u_\ell |u_\ell|^2$. The PSD of the nonlinear distortion, $V_{NL,\ell}(\omega)$, is obtained from frequency domain convolutions of the Fourier transform of $v_{NL,\ell}$. The bandwidth of $v_{NL,\ell}$ is three times that of $u_\ell$ (the channel bandwidth). We use that the adjacent channel's bandwidth is the same as the channel's bandwidth and use no guard bands. From (28), (29) and (30) we then get

C. An Alternative MIMO Transmitter Model
As is evident from the full expression of the NMSE in (44) and its approximation in (45), the performance is determined by the dominant terms. Consider the alternative behavior model in Fig. 3, which is a 2 × 2 MIMO system with uncorrelated inputs, that is $U = \sigma^2 I$, and the system description given by $r = H u + \rho f(u)$ with $H = \begin{pmatrix} 1 & \xi \\ \xi & 1 \end{pmatrix}$ (47)-(48), where ξ = δ + μ. Furthermore, there is no direct output crosstalk, that is P = I and $N = \sigma_n^2 I$, or y = r + n.
With $U = \sigma^2 I$, it directly follows that $\bar{U}$ in (12) reads $\bar{U} = 2\sigma^4 I$, and $\bar{\bar{U}}$ in (18) reads $\bar{\bar{U}} = 6\sigma^6 I$. Using (14), a direct calculation provides us with the Bussgang matrix, that is $A_\Delta = 2\rho\sigma^2 H^{-1}$. To calculate the error covariance E in (24), both V in (20) and $\tilde{U}$ in (25) are required. An expression for the covariance V directly follows as $V = 2\rho^2 \sigma^6 I$ (49).
The covariance matrix of the linear distortion $\tilde{U}$ in (25) is derived in Appendix V-B, that is $\tilde{U} = \sigma^2 \begin{pmatrix} \xi^2 + 4\rho^2\sigma^4 & 4\rho\sigma^2\xi \\ 4\rho\sigma^2\xi & \xi^2 + 4\rho^2\sigma^4 \end{pmatrix}$ (50). The error covariance E in (24) now follows by summing up (49), (50) with the thermal noise covariance $N = \sigma_n^2 I$, or restricting the summation to the diagonal terms, $[E]_{\ell,\ell} = \xi^2\sigma^2 + 6\rho^2\sigma^6 + \sigma_n^2$ (51). Accordingly, the NMSE for a 0 dB gain transmitter, which is equal in the two branches due to the symmetry of the problem, reads $\mathrm{NMSE} = \xi^2 + 6\rho^2\sigma^4 + \sigma_n^2/\sigma^2$ (52). In Fig. 4, the NMSE (44) and (52) (and, accordingly, also the approximate expression (45) that coincides with (52)) are shown. Using polynomials of higher order than three gives terms ∝ $\rho_3 \rho_5 \sigma^6$ (and higher orders in $\sigma^2$) in the expressions for NMSE vs. $\sigma^2$. Such terms affect the NMSE at high input powers. In Fig. 4 a line ∝ $\sigma^6$ is shown to illustrate the behavior and for comparison with the term due to third order nonlinearities (∝ $\sigma^4$).
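The steps for the alternative 2 × 2 model can be traced numerically. The sketch below assumes a unit-gain reference (H_o = I, P = I), a symmetric H with unit diagonal and off-diagonal ξ, and real-valued parameters; it builds the linear error matrix X = HA − I, adds the distortion and thermal noise terms, and compares the resulting branch NMSE with the expression ξ² + 6ρ²σ⁴ + σ_n²/σ² that these assumptions imply. All numerical values and helper names are hypothetical.

```python
def inv2(m):
    """Inverse of a 2 x 2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matmul2(a, b):
    """Product of two 2 x 2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def alt_model_nmse(rho, sigma2, xi, sigma_n2):
    """Branch NMSE of the 2 x 2 linear-crosstalk model, assuming H_o = I and P = I."""
    eye = [[1.0, 0.0], [0.0, 1.0]]
    H = [[1.0, xi], [xi, 1.0]]
    H_inv = inv2(H)
    a_delta = 2.0 * rho * sigma2  # scalar factor of A_delta = 2*rho*sigma2*H^{-1}
    A = [[eye[i][j] + a_delta * H_inv[i][j] for j in range(2)] for i in range(2)]
    HA = matmul2(H, A)
    X = [[HA[i][j] - eye[i][j] for j in range(2)] for i in range(2)]
    u_tilde = sigma2 * (X[0][0] ** 2 + X[0][1] ** 2)  # [sigma2 * X X^T]_{1,1}
    v = 2.0 * rho ** 2 * sigma2 ** 3                  # [V]_{1,1} = 2 rho^2 sigma^6
    return (u_tilde + v + sigma_n2) / sigma2

rho, sigma2, xi, sigma_n2 = -0.05, 1.0, 0.1, 1e-4  # hypothetical values
nmse_matrix = alt_model_nmse(rho, sigma2, xi, sigma_n2)
nmse_closed = xi ** 2 + 6.0 * rho ** 2 * sigma2 ** 2 + sigma_n2 / sigma2
print(nmse_matrix, nmse_closed)
```

The two numbers agree to rounding, which illustrates how the crosstalk, compression and thermal noise contributions separate into three additive terms.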
The derivation and expression for the ACPR is the same for the alternative model, Fig. 3, as for the model in Fig. 2, i.e. as in (46). In Fig. 5 the ACPR is shown as obtained from (46) with the same parameters as for the NMSE shown in Fig. 4. Notice that the scale of the x-axis in Fig. 5 is different from that in Fig. 4. In Fig. 5 three regions I, II, and III are indicated. In region I, the ACPR describes the nonlinear distortion. In region II, the out-of-band error is dominated by the thermal noise; in the channel the thermal noise is small. In region III, which is of little practical interest, the thermal noise dominates in the channel and out-of-band.
Using other numerical values for the noise variance, σ 2 n , compression parameter, ρ, and crosstalk, ξ = δ + μ, gives qualitatively the same behavior of the NMSE and ACPR as in Figs. 4 and 5 with the main behavior given by the straight lines.
In summary, the input crosstalk determined by δ and the output crosstalk determined by μ can approximately be replaced with a linear crosstalk from the adjacent branch to the output of the branch, with gain ξ = δ + μ. This observation is used to study general M × M MIMO transmitters below.

IV. M × M MIMO WITH LINEAR CROSSTALK
In this section, an M × M transmitter with linear crosstalk is considered, where M can be large (that is, M → ∞).

Fig. 5. ACPR (46) (blue dashed line) versus average input power $\sigma^2$. In region I, the ACPR depends on the $\sigma^6$ term in the numerator and the $\sigma^2$ term in the denominator in (46). In region II the numerator is dominated by the $\sigma_n^2$ term and the denominator by the $\sigma^2$ term, and in region III the numerator and denominator are both dominated by the respective $\sigma_n^2$ term. The black dotted lines show the linear approximations for regions I, II, and III, respectively.

A. Signal Model and its Properties
Consider an M × M MIMO transmitter with linear crosstalk from the neighboring branches. That is, consider the signal model (3) with $f(u) = (u_1 |u_1|^2, \ldots, u_M |u_M|^2)^T$ (53). The M transmitter branches obey the individual compression factors $\rho_\ell$ (with $\rho_\ell$ real-valued and negative), or the diagonal matrix G in (3) is given by $G = \mathrm{diag}(\rho_1, \ldots, \rho_M)$ (54). A generic description of the system matrix H in (3), taking into account the symmetry between the branches, is $H = \begin{pmatrix} 1 & h_{12} & \cdots & h_{1M} \\ h_{12} & 1 & \cdots & h_{2M} \\ \vdots & \vdots & \ddots & \vdots \\ h_{1M} & h_{2M} & \cdots & 1 \end{pmatrix}$ (55), with entries $h_{\ell m} \in \mathbb{C}$ (with $|h_{\ell m}| \le 1$). H is assumed to be invertible. The covariance matrix of the transmitter input u reads $U = \sigma^2 I$. Furthermore, the matrix P in (4) reads P = I, so that the observable transmitter output reads $y = r + n$ (56), where the covariance of the thermal noise n is given by $N = \sigma_n^2 I$. For the given covariance matrix U and the nonlinearity f(·) in (53), it is shown in Appendix V-C that $\bar{U} = 2\sigma^4 I$ and $\bar{\bar{U}} = 6\sigma^6 I$. Then, the Bussgang matrix $A_\Delta$ according to (14) directly follows as $A_\Delta = 2\sigma^2 H^{-1} G$ (57). Further, the covariance matrix V in (20) of the distortion noise v reduces to $V = 2\sigma^6 G G^H$ (58). Finally, $\tilde{U}$ is required, which is derived in Appendix V-C. All the required information is now available to calculate the error covariance E in (24). In (58), V is diagonal, i.e., the distortion noise of the different branches is uncorrelated, which is in agreement with [29], [37]. Notice that for the model in Fig. 2, V in (43) has off-diagonal elements, indicating partly correlated distortion noise. However, the off-diagonal elements are ∝ $\delta^3$ and the diagonal elements are ≈ 1, and typically δ ≪ 1. For the model in Fig. 3, V in (49) becomes identical to (43) if δ is neglected. Below, the NMSE corresponding to (58) in (24) is studied.

B. NMSE and Input Back-off for NMSE Minimization
The NMSE for the ℓ:th branch reads $\mathrm{NMSE}_\ell = \sum_{m \neq \ell} |h_{\ell m}|^2 + 6\rho_\ell^2 \sigma^4 + \sigma_n^2/\sigma^2$ (59). In (59) the elements of H, $h_{\ell m}$, that contain the crosstalk, are arbitrary, and hence, different physical models for the crosstalk and transmitters of arbitrary geometry can be studied if the corresponding $h_{\ell m}$ are known. Before analyzing the NMSE in (59) in detail for specific H, corresponding to the 1D and 2D cases in Fig. 1, note that although the NMSE depends on H, its minimization with respect to the average input power $\sigma^2$ does not. The minimum NMSE, $\min[\mathrm{NMSE}_\ell]$, is obtained in the ℓ:th branch with the average input power $\sigma^2$ given by $\sigma^2 = (\sigma_n^2 / (12 \rho_\ell^2))^{1/3}$ (60), where min[·] denotes the minimum of the function within the brackets. It should be noted that the optimal input power back-off (60) only depends on the properties of the considered branch via the compression parameter $\rho_\ell$ and the power of the thermal noise $\sigma_n^2$. In particular, the average input power minimizing the NMSE is independent of the actual crosstalk, the nonlinear properties of the other branches, and the actual number M of branches. Inserting the numerical values that were used to generate the plots in Fig. 4 into (60) results in an optimal (that is, NMSE-minimizing) average input power $\sigma^2$ of 21.7 [dBm], which is also displayed in Fig. 4. The small shift to the right in the minimum of the NMSE from (44) to (45) seen in Fig. 4 is negligible in practical cases. The larger the crosstalk, the larger the shift will be. The minimum of the NMSE of (45) will always be shifted to the right in the figure, for physically realistic values of the crosstalk.
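The independence of the optimal input power from the crosstalk can be checked with a short calculation. The sketch below assumes that the branch NMSE has the form c + 6ρ²σ⁴ + σ_n²/σ², where c collects the crosstalk contribution; this is what the linear-crosstalk model implies for a unit-gain reference. The closed form in `optimal_sigma2` is the stationary point of that assumed expression, and all numerical values are hypothetical.

```python
def nmse(sigma2, rho, sigma_n2, crosstalk=0.0):
    """Assumed branch NMSE: crosstalk floor + compression term + thermal noise term."""
    return crosstalk + 6.0 * rho ** 2 * sigma2 ** 2 + sigma_n2 / sigma2

def optimal_sigma2(rho, sigma_n2):
    """Stationary point of nmse() with respect to sigma2 (independent of crosstalk)."""
    return (sigma_n2 / (12.0 * rho ** 2)) ** (1.0 / 3.0)

rho, sigma_n2 = -0.05, 1e-6  # hypothetical compression parameter and noise power
s_opt = optimal_sigma2(rho, sigma_n2)

# Brute-force check on a logarithmic grid, with and without crosstalk.
grid = [1e-4 * 1.01 ** k for k in range(1500)]
best_no_xt = min(grid, key=lambda s: nmse(s, rho, sigma_n2, 0.0))
best_xt = min(grid, key=lambda s: nmse(s, rho, sigma_n2, 0.02))

print(s_opt, best_no_xt, best_xt)
```

The grid minima coincide with the closed form to within the grid resolution for both crosstalk levels, since a constant crosstalk floor shifts the curve vertically without moving its minimum.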
The assumptions made when deriving the approximate expression for the ACPR in (46) are valid also in the M × M case. Thus, (46) can be used and the behavior will be the same as in Fig. 5, also for M × M transmitters.

C. NMSE for 1D Transmitter Layouts
A natural extension of the 2 × 2 MIMO case to its M × M counterpart considers a linear placement of the individual transmitters according to Fig. 1, that is the system matrix H is Toeplitz and given by $H = \mathrm{Toeplitz}(1, \xi, \xi^2, \ldots, \xi^{M-1})$ (61).
In (61), ξ ∈ R is the parameter of the linear crosstalk, that is ξ = δ + μ, and given in dB as $20 \log_{10}(\xi/2)$ [dB] to be consistent with the set-up in Sec. III. The direct coupling of non-adjacent channels soon becomes small and we therefore neglect it. However, direct coupling of non-adjacent channels could be modelled by $\xi^n$, where n is smaller than two. More complicated models or experimental values for the crosstalk could be modelled by $h_{\ell m}$ in (59). Now, the asymptotic NMSE for a given branch ℓ, when the number M of transmitters tends to infinity can be calculated, that is From (59), it follows that for the 1D structure in Fig. 1, the asymptotic NMSE reads where In (64), M/2 should be read as the centrally located transmitter in the 1D array, and $C_\ell$ values for ℓ > M/2 follow by symmetry, that is $C_{M-\ell} = C_\ell$, etc. The $C_\ell$-values are explicitly given in Appendix V-C, but are observed to monotonically increase from ℓ = 1 to ℓ = M/2. Since |ξ| ≪ 1, in practice $C_\ell$ in (64) equals the number of nearest neighbours, that is $C_\ell \approx 1$ for the two border branches and $C_\ell \approx 2$ for the non-border branches (65). Accordingly, there is approximately a 3 [dB] drop in NMSE performance in the non-border branches compared with the two border branches when the transmitter is operating under optimal input power back-off. The result (63) indicates that the reduction in performance over the 2 × 2 set-up due to a large number of 1D-distributed transmitters is 3 [dB], excluding the two branches at the borders. In fact, the minimum NMSE for a system with non-negligible crosstalk is well approximated at optimal input back-off (60) by $\min[\mathrm{NMSE}_\ell] \approx \xi^2$ for the border branches and $\min[\mathrm{NMSE}_\ell] \approx 2\xi^2$ for the non-border branches (66).
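The nearest-neighbour interpretation can be illustrated by summing the squared off-diagonal entries of a row of the Toeplitz matrix in (61). The sketch below assumes that the crosstalk contribution to the branch NMSE is this row sum, and reports it normalised by ξ² as a per-branch count; the function name `crosstalk_counts` and the values of M and ξ are hypothetical.

```python
from math import log10

def crosstalk_counts(num_branches, xi):
    """Row sums of squared off-diagonal entries of Toeplitz(1, xi, xi^2, ...), over xi^2."""
    counts = []
    for l in range(num_branches):
        power = sum(xi ** (2 * abs(l - m)) for m in range(num_branches) if m != l)
        counts.append(power / xi ** 2)
    return counts

M, xi = 101, 0.1  # hypothetical array size and crosstalk parameter
C = crosstalk_counts(M, xi)
border, centre = C[0], C[M // 2]
loss_db = 10 * log10(centre / border)  # NMSE penalty of a central branch in dB

print(round(border, 4), round(centre, 4), round(loss_db, 2))
```

For small ξ the border count is close to one and the central count close to two, which corresponds to the roughly 3 dB difference discussed above.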
The underlying exact result leading to (66) is rather unaffected by the actual number M of branches, e.g. for the numerical values used to generate Fig. 4, the loss in performance in a non-border branch going from M = 3 to M → ∞ is on the order of a tenth of a dB. As crosstalk has already been identified as a key issue for 2 × 2 transmitters, additional efforts are required to handle it for M > 2. For all non-border branches, there will be an additional 3 dB loss of NMSE performance, or alternatively they will require a means to reduce the crosstalk to acceptable levels. On the one hand, with a fixed distance between the transmitters, the physical length of the MIMO transmitter is infinite when M → ∞. On the other hand, it can be assumed that a given 1D array length, determined by the radio frequency properties, and a crosstalk parameter ξ increases with an increasing number of transmitters due to the reduced distance between them. In the limit M → ∞ there is an infinite number of transmitters distributed over the given length, and accordingly the NMSE increases without bound.

D. NMSE for 2D Transmitter Grid
Now consider the 2D structure in Fig. 1. With reference to Appendix V-C, the asymptotic NMSE reads where $C_\ell \approx 4$ for non-border branches, $C_\ell \approx 3$ for edge branches, and $C_\ell \approx 2$ for the four corner branches, given the assumed properties of the crosstalk. For transmitter grids with non-negligible crosstalk, the minimum NMSE for a 2D structure is described by $\min[\mathrm{NMSE}_\ell] \approx 4\xi^2$ for the non-border branches, $3\xi^2$ for the edge branches, and $2\xi^2$ for the corner branches (68), which provides a typical (for the non-border branches) deterioration of the NMSE of around 3 [dB] compared with the 1D structure, and 6 [dB] compared with the NMSE of a 2 × 2 transmitter. The results are illustrated in Fig. 6. In the figure, the broad minima should be noted, especially for the 2D structure. This observation motivates sub-optimal schemes for input back-off optimization, which are studied next.

E. Transmitter Input Power Optimization
To increase the energy efficiency of the transmitter, a sub-optimal scheme, where an increased average input power is determined, should be considered. This is subject to a slight degradation in NMSE. Such an approach was introduced for a dirty 2 × 2 transmitter in [19], and is here refined for the problem at hand. Let Δ be a user-selected parameter which determines the allowable loss in performance, that is the determined average input power, for which where the minimum possible NMSE is determined by the set-up, see (66) and (68) for the 1D and 2D scenario, respectively. For example, Δ = 1 (or, Δ ≈ 0.35) allows for a 3 [dB] (or, 0.5 [dB]) degradation of the NMSE. Then, the average input power $\sigma^2$ that fulfills (69) follows as where C denotes the number of neighbouring transmitter branches. For the example displayed in Fig. 6, an increased average input power of around 6 [dB] for the 2D scenario, given a 0.5 [dB] loss in NMSE, should be noted.
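The trade-off between input power and NMSE can be explored numerically without the closed form. The sketch below assumes the branch NMSE Cξ² + 6ρ²σ⁴ + σ_n²/σ², specifies the allowed degradation directly in dB rather than through Δ, and finds by bisection the largest average input power above the optimum that stays within that budget. The function names and all numerical values are hypothetical and are not the values behind Fig. 6.

```python
from math import log10

def nmse(sigma2, rho, sigma_n2, c, xi):
    """Assumed branch NMSE with c neighbouring branches of crosstalk xi."""
    return c * xi ** 2 + 6.0 * rho ** 2 * sigma2 ** 2 + sigma_n2 / sigma2

def relaxed_power(rho, sigma_n2, c, xi, loss_db):
    """Largest sigma2 above the optimum whose NMSE is within loss_db of the minimum."""
    s_opt = (sigma_n2 / (12.0 * rho ** 2)) ** (1.0 / 3.0)
    target = nmse(s_opt, rho, sigma_n2, c, xi) * 10 ** (loss_db / 10.0)
    lo, hi = s_opt, s_opt
    while nmse(hi, rho, sigma_n2, c, xi) < target:  # bracket the crossing point
        hi *= 2.0
    for _ in range(100):  # bisection; the NMSE is increasing above s_opt
        mid = 0.5 * (lo + hi)
        if nmse(mid, rho, sigma_n2, c, xi) <= target:
            lo = mid
        else:
            hi = mid
    return lo, 10 * log10(lo / s_opt), target

rho, sigma_n2, xi = -0.05, 1e-6, 0.05  # hypothetical transmitter parameters
p_1d, gain_1d, target_1d = relaxed_power(rho, sigma_n2, 2, xi, 0.5)
p_2d, gain_2d, target_2d = relaxed_power(rho, sigma_n2, 4, xi, 0.5)
print(round(gain_1d, 2), round(gain_2d, 2))
```

Under these assumptions a higher crosstalk floor leaves more room for the compression term, so the permitted increase in input power is larger for the 2D count than for the 1D count, in line with the broad minima noted for the 2D structure.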

V. CONCLUSION
The contributions of the paper include a tool for analyzing the behavior of MIMO transmitters in terms of NMSE. The methodology has been applied to two different transmitter models: one physically motivated with crosstalk both before and after the nonlinearity introduced by the power amplifier, and one simplified model with linear crosstalk approximating the main behavior of the former model.
Simple analytical expressions make it straightforward to analyze the NMSE and ACPR for different values of the parameters describing the hardware impairments. It is concluded that crosstalk degrades the performance of a MIMO transmitter. Under tuned average input power conditions, the MIMO transmitter NMSE is determined by the crosstalk, with an additional 3 dB loss in NMSE going from 2 × 2 MIMO to a large 1D structure of transmitters, and an additional 3 dB loss in NMSE for large 2D structures.
It is further shown that SISO-based methodologies for power amplifier back-off determination are valid for an M × M MIMO transmitter subject to crosstalk. However, schemes taking the crosstalk into account may be favorable because they allow for the operation of the transmitter with increased energy efficiency, subject to a predefined small loss in performance.

APPENDIX

A. 2 × 2 MIMO Transmitter With Input and Output Crosstalk
1) Noise Properties: The noise covariance U is given by (36). Furthermore, the matrix U in (12) reads where [40]. Note from (36) and (71) that U can be expressed in terms of U, that is Finally, the matrix U in (18) is here given by In order to resolve (73), note that for integer it holds for the diagonal elements that E[(u u * ) 3 ] = 6(1 + δ 2 ) 3 σ 6 , which follows from (36) and the general result E[(u u * ) n ] = n! E[u u * ] n [40]. The off-diagonal elements of U are of the form using the result in Appendix V-D. Accordingly, the offdiagonal elements of U read In summary, U explicitly reads 2) Bussgang Gain A: With H = I, G = ρ I, and the relation (72) between U and U, it directly follows that A in (14) is given by (42).
3) Distortion Noise Properties V: With G = ρ I and the relationship (72) between U and U, the expression for the covariance V of the distortion noise (20) reduces to where U given by (36) and U by (76). Now, a straightforward calculation yields V in (43).
5) Nonlinear Distortion P V P H : The term P V P H corresponding to the nonlinear distortion in (24) follows from a straightforward calculation. The matrix P is given by (40), and V is diagonal and given by (43). Let the diagonal elements of P V P H read Inserting the elements of V given by (43) into (84) yields

B. 2 × 2 MIMO Transmitter With Linear Crosstalk
To calculate U in (25) In a similar vein, U reads From (87) and (89) it is observed that 2) Bussgang Gain and Distortion Properties: By the observation (88), it directly follows that the Bussgang gain A according to (14) reads as in (57). Through this observation (90), the error covariance V defined in (20) reads as in (58). To calculate U in (25), note that the equality reduces to Then, $U = \sigma^2 X X^H$ with X given by Now, the diagonal elements of U follow from a straightforward calculation, with the ℓ:th element given by 4) 2D Transmitter Layout: For simplicity, the analysis is given for the centrally located transmitter. With reference to Fig. 1 it follows $S_c = 4\xi^2 + 12\xi^4 + 28\xi^6 + \ldots \approx 4\xi^2$, where c indicates that the transmitter is centrally located. With similar arguments, the results for the border and corner transmitters follow.