Optimizing Antenna Arrays for Spatial Multiplexing: Towards 6G Systems

In this paper we discuss the design of antenna arrays to be used for multiplexing applications. In particular, we introduce a suitable performance index to analyze the effect of the antenna geometry and the distribution of users for the overall performance of Multi-User Multiple Input Multiple Output systems. By means of such performance index, antenna arrays can be designed so as to increase the number of multiplexed parallel sub-channels. Numerical results show that a proper design could allow to double the contemporary served users and the overall system throughput.


I. INTRODUCTION
If we look back at the advancements of the communication systems in the last twenty years, there is one key point that has been the object of continuous advancement: the capability to exploit the spatial variation of the electromagnetic field. In particular, this capability has been triggered by the ever-increasing use of antenna arrays, that allow better focusing of the electromagnetic signal toward specific directions (through beamforming [1]), the possibility to multiplex parallel data streams using the same time/frequency resource (through Multiple Input Multiple Output (MIMO) systems [2]), to filter out signals coming from unwanted directions (through spatial filtering [3]) or communicating with multiple users sharing the same time-frequency resource (through Multi User MIMO (MU-MIMO) systems [4]).
In particular, MU-MIMO theory has been developed in the framework of Information Theory as an extension of the MIMO spatial multiplexing theory [5]. Loosely speaking, MU-MIMO expands and fuses the concepts of beamforming and spatial filtering, and allows the reuse of the time/frequency resources among many users assuring low (ideally null) interference among the users. MU-MIMO is currently considered for use in many modern communication The associate editor coordinating the review of this manuscript and approving it for publication was Tutku Karacolak . systems, and 5G systems in particular, wherein up to 8 users can be theoretically served using the same time/frequency in a cell.
For the current development of 5G networks, and the upcoming 6G networks, massive MIMO [6] -a particular MU-MIMO system employing a very large number of antennas/users, is undoubtedly one of the most promising communication paradigms, and some multiple beam communication systems have been demonstrated [7], [8].
As a matter of fact, the literature on massive MIMO systems is huge: a large number of papers discusses the efficiency of the architecture [9]- [22]; many array prototypes have been built [23]- [26], and several measurement campaigns confirm the theoretical results [27]- [32]; some other papers discuss the use of low-complexity architectures and other non conventional solutions [7], [33]- [39].
The importance of the ''electromagnetic'' aspects of the communication systems is also testified by the large number of papers dealing with the use of reconfigurable surfaces [48], [49], that aim to eliminate some bottlenecks of standard massive MIMO arrays. Again, the overall impression is that the arrays used in current massive MIMO systems do not exploit at best their capabilities. In fact, a different design paradigm is needed, not based on classical pattern synthesis with given gain, beam-width and side lobe level [45], [50] -as used for instance in radar applications or in systems with limited multiplexing capability [51]- [53]. In [45] MIMO array synthesis is formulated as the maximization of the quality of the resulting communications. In present work, we optimize the multiplexing capability of the array, i.e. its capability to communicate at the same time with a larger number of users.
The main aim of this paper is trying to fill the current gap in understanding the impact of antenna array layout in the performance of massive MIMO systems, and to provide some guidelines on how to optimize the antenna array geometry in order to obtain maximum performance. In particular: • we introduce a novel performance index, the factor, that can be used as an easy-to-calculate predictor of beam-forming losses.
• we show how to use the factor to analyze MU-MIMO systems and to optimize the antenna array layout.
• we show that, contrarily to the common assumption, grating lobes may not have an impact in MU-MIMO systems, and inter element spacings significantly larger than half-wavelength can be particularly advantageous.
The first part of the paper is focused on the case of Lineof-Sight (LoS) communication systems, that is particularly relevant for mm-waves communications [12], [14], but -as we show in the ray-tracing simulation Section -the approach used for LoS systems allows to synthesize antenna arrays capable of providing good performance also in multi-path scenarios.
The manuscript is divided as follows: in Section II the theoretical limits of the multiplexing capability of electromagnetic systems are briefly recalled by means of a simple mono-dimensional case study. In Section III we introduce a novel metric for evaluating MU-MIMO performance, and we discuss the effects of the random position of the users. In Section IV a planar array is analyzed by means of the approach introduced in the previous Section. In Section V we use the proposed performance index to guide the optimization of the antenna array layout in order to achieve optimal performance. In Section VI the results are validated by means of a ray-tracing approach, removing some of the simplifications that have been used in the previous sections. Conclusions follow.

II. MULTIPLEXING LIMITS OF ELECTROMAGNETIC FIELDS: A SIMPLE MODEL
In this Section a simple model of the MU-MIMO communication system is introduced and discussed, in order to gain a better understanding of the results presented in the next Sections. The reader interested in a more general discussion can deepen the topic in [47], [54], [55].
We want now to focus on the simple case depicted in Figure 1a: a continuous line source, of length A and amplitude a(y), aligned along y−axis, with all the terminals on a circular observation curve , placed in the (x, y) plane at a fixed distance in the far-field of the source (i.e. the base station), so that the effect of the variation of the distance (and hence of the received power) does not influence the results on the multiplexing capability. We also suppose that the base station can work with perfect Channel State Information (CSI) [56].
The radiated field E(φ) on the observation curve is: where an inessential multiplicative constant depending on the distance has been dropped, since it is not needed when working in far-field condition, and the Hilbert-Schmidt decomposition [57] of the radiation operator has been used. In (1) σ k are the singular values and the functions f k (φ) and a k (y) are properly normalized in order to make their integral unitary, respectively on the observation domain and on the line source. Only a finite number of {σ k } would be larger than the noise threshold, thus limiting the maximum number of independent channels that can be transmitted by the considered limited size source. The discrete counterpart of the Hilbert-Schmidt decomposition of continuous linear compact operators is the Singular Value Decomposition (SVD) of matrices. In order to perform a numerical analysis of representation (1) let us suppose A = 10λ, and an observation domain in the range φ ∈ [−π/4, π/4], and approximate the continuous functions with densely discretized vectors; in particular, we consider a dense discretization of the source, a = a(y) with y a dense equispaced ν = 201 elements vector sampling the range [−A/2, A/2], and a dense discretization of the field on the observation domain, e = E(φ) with φ a dense equispaced µ = 1001 elements vector sampling the range [−π/4, π/4], obtaining: where H is a µ × ν complex matrix, the ''continuous observation-continuous source'' channel matrix, whose (m, n) entry is The singular values of H are shown in Figure 1b: it is clear that the distribution has a turning point around k = 16: the terms with a larger index give an ever smaller relative contribution to the overall radiated field.
It turns out that the number of effective singular values contributing to the description of the field is limited, and is related to the Number of Degrees of Freedom (NDF) of the field [47], which is indeed 16 in the considered case.
It is worth noting that the singular values distribution of Fig.1b is weakly dependent from the sampling used. If we compared different samplings on the source and on the observation domain (not done here for the sake of conciseness) we would find very similar plots.

A. BEAM-FORMING ANALYSIS
It is particularly interesting to analyze the behavior of a multi-user radiation system, since it allows to evaluate the performance of any source intended to work in a spatial multiplexing application lying in the linear region of dimension A.
To this aim, let us imagine that M = 16 receiving terminals are placed in equispaced angular positions. Independent data streams can be sent to the terminals using the so called zero-forcing approach, consisting in synthesizing M different radiation patterns so that the pattern aimed to communicate with an user has M − 1 zeros in the directions of the others. Such a synthesis can be performed by the pseudo-inversion [58] of the channel matrix. The zero-forcing approach is only one of the possible strategies for multi-user systems, but it has been shown to be sub-optimal as far as the channel matrix has a good condition number, and since it requires linear processing its computational effort is relatively little.
A representation of the aforementioned radiation patterns is given in the upper subplot of Figure 2, where the directivities of the continuous sources are calculated by means of numerical integration. In accordance with the theory, the problem is stable; for each beam, the nulls are perfectly synthesized, and the directivity in the direction of the served user is always close to the theoretical maximum of 13dB for the continuous line source in broadside.
As soon as M exceeds 16, the synthesis of the zeros is paid by an increase of the radiated power in unwanted directions (specifically, for angles |φ| > π/4) and the directivity of some beams can be significantly reduced. Such an effect is clearly shown in bottom subplot of Figure 2 for M = 18 and represents beamforming losses.

B. COMMUNICATION RATE ANALYSIS
We may wonder how the considered directivity variation influences the communication rate. To answer this question we could calculate the zero-forcing communication rate, similarly to the approach in [28], by means of: where ρ m represents the receiver Signal-to-Noise Ratio (SNR) for the m−th subchannel. Supposing that the receiving terminals use antennas with identical gain G rx , and that the transmitting aperture has ideal efficiency (so that the directivity is equal to the gain), the receiver SNR can be written as: where P T is the overall transmitted power, α m is the fraction of power on the m−th sub-channel, G m is the source gain in the m−th direction, and σ 2 n is the variance of the Additive White Gaussian Noise at the user terminal receivers. Now substituting (5) in (4) gives: where ρ 0 = P T G 0 G rx λ 2 4πr 2 σ 2 n is a reference SNR, calculated with a source of gain G 0 communicating with a single user employing all the available power, and the coefficient γ m = α m G m /G 0 is a positive number in the range [0, 1], representing the reduction of SNR due to the power splitting between different users and the gain reduction in the desired direction due to the effects seen in Figure 2. It is worth mentioning that beamforming losses, are often not correctly taken into account when dealing with MU-MIMO systems, but represent the main limit when trying to maximize the multiplexing capability of a communication system.
In Figure 3a the system throughput is plotted as a function of ρ 0 and M using α m = 1/M . In the plot the red curve is the ''uphill path'', which represents the value of M maximizing the available throughput for each value of ρ 0 : it is clearly visible that the throughput is maximized when M = 15 for lower SNRs, and M = 16 for higher SNRs. It must be underlined that the choice of α m = 1/M is not optimal from the point of view of the overall throughput, since more sophisticated power allocation strategies exist (waterfilling approaches) [56]. Unfortunately, such strategies lead to a reduction of the user rate for some terminals, and as stated in the Introduction we are interested in guaranteeing as much as possible the fairness of the network resource among user terminals. Furthermore, this paper is not aimed at finding the communication strategies that guarantee the best performance, but we focus on the analysis of the impact of the antenna geometry on the overall system performance, so this choice is reasonable.
The above result considers the ''overall'' throughput. However, the per-user throughput is generally different for different users, and related to the spreading of singular values. As an example, in Figure 3b the minimum per-user rate is plotted, as a function of ρ 0 and M . This rate has been calculated as where γ min has been calculated using the minimum gains in the desired direction for each value of M . It is clearly visible that a better minimum user rate is guaranteed when less user terminals are served, but the reduction of R min is limited up to M = 16, and we see a rapid drop of the rate for values higher than this threshold.
C. SOME COMMENTS Some comments on the results seen up to this point are now in order.
First of all, it must be underlined that the considered source has a limited dimension in terms of wavelengths (current MU-MIMO are larger, usually made as planar arrays, and in case of massive-MIMO systems employ up to hundreds of antennas), but this choice has been done to help the visualization of the results: all the considerations performed in this section still hold for larger, but finite dimension, sources.
Second, in the analyzed multi-user system users are in fixed locations: it is unlikely that the users are exactly placed in those positions, and stay in the same place for the entire duration of the communication. In order to provide more realistic results some randomness on the users' positions must be added (and properly managed).
Third, from Figure 3b it seems that the best R min is obtained when the multiplexing is not used, or used in a very limited way. But let us think to a network that needs to serve f.i. K users, and all the users are divided into G groups of no more than M terminals (so that G = K /M ), where M represents the number of parallel data streams towards the users; supposing a fair division of the time resource (each group communicates for a fraction 1/G of the time), each user could get an effective minimum rate of In the next section we show that there are specific values of M maximizing R min . Finally, beamforming analysis has been carried out with a continuous aperture, in order to make it as much ''general'' as possible. The analysis can be repeated with a coarse sampling of the source, discretized with an inter-element distance of λ/2, all the results being almost indistinguishable by the ones of the continuous source.

III. AN EFFECTIVE PERFORMANCE INDEX
The analysis of the rates provides a precise evaluation of the performance of a communication system, but since the rates depend on the reference SNR ρ 0 , this would be an unpractical approach for a preliminary rough estimation. It could be useful to define a quality parameter to quickly estimate the expected performance of a system configuration.
According to the considerations done in previous subsection, in order to obtain a more uniform performance of the system it may be useful to reduce the spreading of the singular values. A first possibility to quantify the spreading could be the use of the condition number of the channel matrix [58]; despite its appreciable quality of being dimensionless and of ease interpretation, the condition number takes into account only the relative behavior of two singular values (the greatest and the smallest).
A better choice would be the use of Frobenius norm of the pseudo inverse of channel matrix, let it be H † , since in the hypothesis of Gaussian noise at the receivers, such norm is inversely proportional to the average SNR of a zero forcing communication system [59]. The main problem with the use of the aforementioned norm, is that the comparison VOLUME 9, 2021 of systems with different numbers of antennas/users involved may not be practical, since it is not a ''normalized'' parameter.
For this reason we propose the use of the index, defined as: where || * || F is the Frobenius norm and M is the minimum dimension of channel matrix. Because of the sub-multiplicative property [58] of the Frobenius norm we have ||H † || F ||H|| F ≥ M , so is a non negative dimensionless quantity representing the stability of the inversion: when 1 the pseudoinversion is very stable, whereas a value of 1 corresponds to a very un-stable pseudoinversion. As a matter of fact the definition of index is very general, holding for any kind of discretization of both the source and the observation domain. For example, if the focus is on the number of users an antenna can serve simultaneously, the source is densely sampled and M is the number of users. On the other hand if the focus is on the antenna array performance, regardless of the number of users, the source is coarsely sampled at antennas locations, and the observation domain discretization is dense, so that M is the number of antennas. The latter definition is used in Section V and is referred to as˜ to avoid any confusion.
In the next subsections we demonstrate by means of numerical examples that allows an efficient estimation of beamforming losses.

A. SCHEDULING OF RANDOM USERS
The analysis of beamforming limits in previous Section was performed with terminals at fixed angular positions in space, and the provided results were perfectly aligned with the previous achievements on the same topics [60]- [62], but the working condition was unrealistic since the position of the users is usually random. To understand how close we could get to the aforementioned limits, we have considered the following scenario: is randomly generated; • the set of K users is divided into G groups of M elements as G = K /M using the approach discussed in the following; • for each element within a group the zero-forcing algorithm is employed and the rate for the m−th user in the g−th group is calculated as R g,m = 1 G log 2 (1 + γ g,m ρ 0 ). Finally the overall throughput is calculated as: where the coefficient 1 G takes into account that the communication with each group can take place only in a fraction of the overall communication time.
To perform a significant statistical assessment of the communication system performance we have considered A suitable scheduling has now to be considered, in order to properly group users, thus avoiding ''angularly close'' users [63]- [65], whose final effect is a reduction of the antenna gain in the wanted directions, similarly to what is shown in Figure 2. To this end several strategies have been studied [66]- [69], but we propose a novel scheduling approach employing the factor previously introduced. The main idea of the scheduling algorithm, described in detail in the Appendix, is to swap users between groups in order to lower as much as possible the maximum value of the factor between groups.
The reason for lowering the factor can be understood by looking at Figure 4: the average beamforming losses, calculated as the geometric mean of the directivity loss towards the users in a group due to the zero-forcing beamforming, is strongly correlated with the factor of the channel matrix of the aforementioned group of users. So the factor is an easy-to-calculate and effective predictor of the actual multiplexing performance of a communication system: for instance, a value of = 0.1 implies average beamforming losses of the order of 0.6-0.8 dB. For the sake of completeness, in the same plot the matrix condition number (σ 1 /σ M ) is also plotted: it shows a good correlation, but its prediction capability is lower due to a larger spreading of the scattering plot.
The performance analysis employing −scheduling is shown in Figure 5 (solid line) and compared with the results obtained when random scheduling is used (dashed line): the spreading of the factor is significantly smaller with respect to the random scheduling case, and also the directivity does not show a significant decrease up to M = 10 users per group. The throughput is shown in Figure 6: minimum per user rate, quantified in all the following examples by the 5% quantile, and mean overall rate are plotted in the case of random scheduling (left subplots) and scheduling (right subplots). It is apparent that there is a strong advantage in 53280 VOLUME 9, 2021 using the scheduling, also for the per-user performance, apart from the lowest values of ρ 0 . Also the overall rate is maximized for values of M ≥ 10 apart from the lowest cases, and it is interesting to see that the optimization of the system throughput now occurs for values of M guaranteeing a good fairness between users.

B. SOME COMMENTS
The analysis with the random user positions has shown that by means of a proper scheduling of the users we are able to come very close to the theoretical limit, obtaining a good overall system performance, as well as a good per user rate. Randomness seems to introduce a penalty of 10-15% of the number of parallel streams transmitted with respect to the NDF.
It must be underlined that all the simulations shown have been obtained considering the same transmitted power; this means than when we increase the number of subchannels M a smaller power will be available for each parallel stream, thus reducing the subchannel SNR ρ g,m = γ g,m ρ 0 of (10). Fortunately, the SNR appears inside a logarithm in the rate formulas so, unless we are considering very low average SNRs or we suffer from strong beamforming losses, it is always advantageous to increase M .
This notable result has been obtained with an uniform random distribution of the user's angular position in the considered range. We need to analyze the behavior of the system when non-uniform distributions are involved. Furthermore, we need to analyze how the geometry of the source, i.e. the Base Station (BS) antenna, influences the system performance, in order to optimize the geometry of the antenna for multiplexing applications. To this aim, let us now switch to a more complex, 3-dimensional model of the communication system.
As a first step, let us now define a grid sampling the θ and φ directions with M θ × M φ points, and calculate the factor for the resulting channel matrix.
From a numerical analysis we found that any grid with M θ ≤ 3 and M φ ≤ 9 would lead to a stable inversion, with  values of ≈ 10 −1 or smaller, whereas M θ ≥ 5 or M φ ≥ 10 would lead to an unstable inversion: we can expect about M θ × M φ = 3 × 9 = 27 independent sub-channels to be sustained by a system of this kind.

A. UNIFORM RANDOM ANGULAR DISTRIBUTION OF USERS
As seen in the previous section, this analysis gives us only a maximum bound for the number of sub-channels that can be sustained. To obtain more reasonable results we need to introduce some randomness, and we suppose again that we have K overall users, to be divided in G groups of size M . According to the considerations driven before, -scheduling for the users is employed. For sake of simplicity, we consider a planar array of 8×8 λ/2 equispaced isotropic elements, and for the calculus of the channel capacity we assume a reference gain G 0 = 19.74dB (the broadside gain of an uniform planar square array of isotropic elements).
In Figure 8, the quantiles for the factor and the directivities in the direction of the user are shown as a function of M . The spreading of the factor is now significantly small as in the case depicted in Fig.5, but in this case the directivity seems to show a significant spreading even for low values of M ; this effect is actually due to the scanning losses occurring in planar arrays, and only for M > 10 we start seeing the effects of the zero forcing beamforming.
The throughput is shown in Figure 9 (left subplots). The minimum rate per user confirms the result seen in Figure 6: the advantage in using multiple layers also for the per-user performance, is significant, and best performance is achieved with up to 15 parallel streams when the value of ρ 0 is FIGURE 9. Analysis of the 8 × 8 array with random user positions (uniform distribution in φ ∈ [π/2 − φ R /2, π/2 + φ R /2] and θ ∈ [−θ R /2, θ R /2]) using −scheduling. Upper subplots: 5% quantile for the user rate as a function of M and ρ 0 ; lower subplots: mean overall rate as a function of M and ρ 0 . The red curve represents, for each value of ρ 0 , the value of M that provides the best rate. Left subplots: array with dy = dz = 0.5λ; right subplots: optimized array with dy = 0.56λ and dz = 1.93λ. particularly high. The optimality of using multiple layers is seen also when analyzing the overall rate, that is maximized with values of M = 19 when ρ 0 is particularly high. Again, this positive effect occurs without a degradation of the minimum user performance, so obtaining a good fairness between users.

B. UNIFORM PLANAR DISTRIBUTION OF USERS
Unfortunately, in real cases the effective angles of the terminals with respect to the base stations are rarely distributed according to uniform angular distributions [71], [72]. We would like now to investigate how this non-uniformity affects the system performance. To this aim, we consider a Line-of-Sight case, with users uniformly distributed on a plane within a circular sector of 120 • (deg) that goes from a minimum radius on the ground r min = 42m to a maximum radius r max = 333m as in Figure 10.
We suppose that the BS array has a height of 30m with respect to the ground, and it is tilted of θ T ≈ 20.1 • as in Figure 7b in order to have a direction of arrival of the signals in the angular range φ ∈ [−φ R /2, φ R /2] and θ ∈ [−θ R /2, θ R /2] with φ R = 120 • (deg) and θ R = 30 • (deg). This configuration would result in an uniform distribution of the angle of arrival with respect to the angle φ, but a non uniform distribution with respect to the angle θ, with an increase of the density for angles close to the azimuthal plane, similarly to what has been obtained in [70], [73].
It is worth underlining that the chosen configuration (in terms of distances, tilting of the antenna, height of the  antenna. . . ) is just an example to show the effect of a non uniform angular distribution of the users. Furthermore, in a similar scenario the different distances of the terminals to the BS would result in a significant variation of the received power. Since we are interested in analyzing the effect of the angular distribution of users, we consider all the terminals at the same fixed distance.
If we now repeat the analysis on this configuration, using K = 200 users grouped in G groups of M elements using the -scheduler, we obtain the dashed lines in Figure 11. The analysis of the quantiles for the factor and the directivities in the direction of the user as a function of M shows a limited spreading of the factor, but the plot is now shifted to the left with respect to the uniform distribution case: a non uniform distribution of users strongly influences the stability of the inversion of the resulting channel matrices. A worsening is also present in directivities that exhibit an increased spreading with respect to the uniform case: the reduction in directivity due to beam-forming now occurs from M > 6. The negative effect of the non uniform angular distribution of users is confirmed also by the analysis of the communication throughput in the left subplots of Figure 12: the minimum rate per user shows a lower minimum rate per user with respect to the uniform case, and the best performance is achieved with M = 9 parallel streams when the values of ρ 0 are particularly high. Also the overall rate shows a strong reduction, and its value is maximized for M = 11 when ρ 0 is particularly high.

C. SOME COMMENTS
The analysis performed in this section has shown that the considered array of 64 radiating elements is theoretically capable of multiplexing about 27 independent data streams in the considered angular sector from the analysis of the factor, but even with the maximum SNR considered (ρ 0 = 30dB) this value is reduced to 19 when introducing the randomness of the position of users (a 30% reduction), and we see a further decrease to 11 (a 60% reduction) when the users distribution is not angularly uniform. The optimal multiplexing rates are even smaller, approximately halved, for a lower SNR ρ 0 = 15dB.
Even if these values are aligned with the expected optimal multiplexing in massive MIMO systems, where the optimal number of terminals to achieve channel hardening is about 25% of the number of antennas [16], it is undoubted that we are using our 64 elements active antenna array at a fraction of its potential. In next section we show how to use the factor VOLUME 9, 2021 to optimize the antenna layout in order to improve the system performance.

V. ANTENNA ARRAY OPTIMIZATION
In this section we analyze the impact of the variation of the antenna geometry on system performance. We still consider antenna arrays with N = 64 elements, so that the systems will be compared in a fair way. This choice leads to larger size antenna arrays; the idea of using larger arrays for massive MIMO is not novel [43], [74], [75], but in this section we discuss this possibility from the point of view of the antenna, with the aim of minimizing the needed increase in size.
For the sake of simplicity, in this paper we investigate only equispaced planar array geometries, non regular geometries will be object of a forthcoming work.

A. UNIFORM ANGULAR DISTRIBUTION OF USERS
As a first test, we investigate the effect of the grid spacing on the system performance of the 8 × 8 array for terminals uniformly placed in the φ R × θ R angular region; it is well known that the use of a spacing larger than half wavelength may produce ''grating lobes'' [76], and their effect has been questioned for MIMO communication systems [43] but, as we show in the following, this opinion is not justified. Furthermore, because of the increased size of the array, in some cases the antenna may not be working in the far-field condition, but this effect is relevant only for the closest terminals, and we have numerically verified that the overall results do not change significantly because of this approximation.
The analysis can be done by employing the performance index associated to the channel matrix. The source is sampled at 64 points, according to the selected antenna layout, with variable spacing dy between the columns and dz between the rows. Since the focus is no more on the number of users to be served, but on antenna layout, the observation domain is densely sampled on a truncated icosphere grid with M = 3517 points over the angular region φ R × θ R . According to the generalization of definition (9) introduced in Section II, we refer to such factor as˜ , in order to avoid any confusion with the index used (so far and in the following) to analyze the performance of a specific antenna and user configuration.
The result of the parametric analysis is provided in the upper subplot of Figure 13. The imagemap of log 10˜ clearly shows that there is a strong influence of the vertical spacing dz on the performance: a value of dz ≥ 3λ/2 is required to obtaiñ < 1. The analysis of the imagemap also shows that there are some combinations leading to very small values of˜ , that share the vertical spacing of 1.93λ. To better analyze what happens for dz = 1.93λ in the lower subplot of Figure 13 it is possible to see the plot of log 10˜ and the broadside directivity of the array. The plot of˜ factor shows periodic minima with a period of approximately 0.56λ.
Accordingly, there are some spacing combinations that are worth investigating; in particular dy 1 = 0.56λ and dz 1 = 1.93λ (that shows a low value of˜ and good broadside directivity) is analyzed in Figures 8 and 9. The minimum per-user rate estimated in Figure 9b is maximized for high SNR using M = 37 and the overall throughput is maximized for M = 44 (Figure 9d), representing more than 2/3 of the number of transmitting antennas.
We compare in Figure 14 the ''uphill paths'' of Figure 9, i.e. the maximum achievable overall rates, when for each value of ρ 0 the optimal value of M is selected. For low SNR values, the two antennas provide the same maximum rates (with a non significant advantage for the λ/2 equispaced array), but when the SNR becomes higher the advantage of the optimized array is clear.
The optimization that has been performed by means of a parametric analysis could also be explained by some antenna array considerations. The horizontal spacing dy, is indeed very close to the maximum inter-element spacing dy max = 1 2 sin(φ R /2) guaranteeing that no grating lobes appear in the considered angular region when focusing on any user within the same region; similarly, the vertical spacing dz, is very close to the maximum inter-element spacing dz max = 1 2 sin(θ R /2) guaranteeing that no grating lobes appear in the considered angular region when focusing on any user within the same region.
For sake of completeness, we also tried to analyze other solutions, obtained from the other minima of the˜ function from Figure 13, but those layouts provide practically the same performance of the optimized layout with dy = 0.56λ and dz = 1.93λ.  These results require some considerations on the actual effect of grating lobes. First, in the optimized antenna the grating lobes actually exist, but they do no occur in the region where the user terminals are placed; but even when they occur in the region where the terminal is, the performance of the communication system is not affected.
This effect can be explained in this way: once the users are scheduled it does not matter if the antenna radiates most power in directions close to the direction of the users, or in other directions, as far as these directions are not taken by other users.

B. UNIFORM DISTRIBUTION OF USERS ON PLANE SECTOR
With reference again to the scenario of Figure 10 let us focus on two specific ''cuts'': the azimuthal cut (for θ = π/2) and the elevation cut (for φ = 0). Roughly speaking, a square array presents the same scanning and beamforming capability along these two cuts. As a matter of fact, it is evident that the elevation beamforming capability is used in a much more limited way with respect to the azimuthal beamforming capability: having a certain number of antennas in the z direction of the array is useful from a directivity point of view, but may not help significantly from a beamforming point of view.
Let us now evaluate the˜ factor for this situation; to this aim, we consider a dense triangular sampling on the plane sector, resulting in an angular distribution with denser samples for lower values of θ, similar to that shown in [70], [73]. The result of the calculation of˜ is provided in Figure 15. This distribution is significantly different with respect to that presented in Figure 13, since to obtain low values of˜ we need to increase the value of dz up to 20λ. The figure shows again some periodicity for variable dy, but the interesting feature is the fact that large spacings on dz result to be particularly effective.
In particular, let us analyze the configuration of the 8 × 8 array with dy = 0.56λ and dz = 17.40λ, for which we havẽ ≈ 0.031. In Figure 11 we can see that the factor is now lower than one for a value of M ≤ 48 (whereas for the square array that condition occurred for M ≤ 12), and also the directivity shows almost no reduction up to M = 30 (whereas for the square array that condition occurred for M ≤ 6).
The analysis of the throughput, shown in Figure 12 confirms the excellent performance: the minimum per-user throughput is obtained at high SNR for M = 37, and the average overall throughput happens for M = 44: the values of the rates (minimum per user and average) are approximately the same of the optimized array in the scenario with a uniform angular distribution of the terminals.
To better understand the role of˜ in the performance of an antenna array, in Figure 16 we have compared the maximum channel capacity achievable when using the optimal number of parallel subchannels M for some antenna configurations, that, apart from the λ/2 equispaced standard array, have different values of˜ . The plot confirms that the lower the value of˜ , the better the overall system performance, but using a value of˜ lower than 0.1 has a marginal impact on the performance.
It is worth noting that the vertical spacing employed in some arrays in Figure 16 is very large, and grating lobes will appear in the angular sector considered. As a matter of fact, the presence of grating lobes does not affect negatively the performance of the system, as far as a proper scheduling of the users is employed.
To provide a numerical evidence of this effect, in Figure 17 it is possible to see a comparison of the directivity patterns radiated by the 8 × 8 array with dy = 0.56λ and VOLUME 9, 2021  It must be underlined that the angular directions are different in the two subplot, since the scheduling algorithm acts in a different way in grouping the K=200 starting users. dz = 2.30λ with respect to the 8 × 8 array with dy = 0.56λ and dz = 17.40λ when M = 40, with a non-uniform angular distribution of the users, that are uniformly placed on a plane. Grating lobes are present, but the more grating lobes, the smaller the dimension of each grating beam. The larger overall dimension of the array helps in placing ''zeroes'' of the pattern in the needed directions, without affecting the directivity of the beam toward the user to be served: the directivity in the desired direction is much larger in the dz = 17.40λ spaced array with respect to the dz = 2.30λ spaced array.

C. SUMMARIZING
The proposed optimization procedure can be put in a nutshell in this way: once a proper sampling for the observation domain, matching the actual distribution of users/angle of arrival, has been selected, it is possible to perform an optimization of the antenna geometry, in order to find the parameters that allow a sufficiently low value of˜ .
For instance, we tested several different antenna configurations, with inter-element distances found with the parametric analysis of˜ ; in particular in Figure 18 we compare some configurations with˜ ≈ 0.03: very different antennas sharing the same value of˜ have the same performance, thus confirming that˜ is an effective index of an array performance.

VI. VALIDATION BY MEANS OF RAY-LAUCHING
In previous sections a basic Line-of-Sight propagation model was used. Other more general models [77], [78] are widely accepted in wireless community and could be used to further validate the proposed paradigm. On the other hand, their use in massive MIMO applications has been questioned [79], [80], so that we decided to validate the idea simulating an urban scenario by means of ray tracing to estimate antenna performance. To make the analysis even more realistic, the element pattern and polarization of the array, previously neglected, have been taken into account: the antenna array has been modelled as a grid of dual polarized elements (±45 • linearly polarized short wires), at a height of 30m over the ground, and the terminals are placed at a height of 1.5m over the ground in a test city (Figure 19), that is simulated by means of a custom ray-tracing algorithm coded in Matlab, using the ray-launching (or pincushion) method. The amplitude and polarization of the rays reflected by the surfaces (ground or buildings) are calculated using a proper combination of Fresnel coefficients according to the polarization of the incoming ray. A maximum number of 20 reflections for each ray is considered, and for the sake of simplicity the relative permittivity of the ground and concrete is chosen equal to 4 (a typical average value for this kind 53286 VOLUME 9, 2021 of simulations). The simulation is performed in continuouswave, so a single frequency is considered at time.
The city is procedurally generated: the area is divided in square blocks of flats of 50m side; the streets have a width of 15m meters. In each square block there are some buildings of variable shape, and variable height (between 4m and 20m). The users are uniformly distributed on the ground, in a hexagonal cell of 333m diameter; no users are considered inside the buildings or outside the hexagonal cell. Each user has a single antenna, capable of receiving the vertical polarization of the field (i.e. the field aligned with the z-axis).
In this section we consider three 8×8 arrays, with different spacings between the elements, considered in the previous section: in the first the spacing is dy = dz = 0.50λ (Figure 16, black curve); in the second it is dy = 0.56λ and dz = 1.93λ ( Figure 14, green curve); in the third it is dy = 0.56λ and dz = 4.90λ ( Figure 16, red curve).
It must be underlined that, because of the different distances of the users from the base station, and the presence of multiple reflections causing a significant fading, the amplitude of the signal received by the users will show a significant spreading, even when there are no beam-forming losses due to the multiplexing. Notwithstanding, this effect is present in all the simulations considered so, from the point of view of the antenna architectures, the comparison will not be impaired by this effect.
As in the LoS cases, 50 different sets of 200 users are considered for the performance calculation; the field on each user is calculated summing the eventual LoS component (present in approximately 50% of the users) with the non-LoS signals reflected by the ground and the walls in the simulated environment. The -scheduling is employed to perform an efficient grouping of the users, and the beam-forming is realized by means of Zero-Forcing, like in the simulations of previous Sections. It has to be noticed that the aforementioned spreading of the amplitude of the signals due to the different distances would have impaired the calculation of the factor, so for the calculation of we have chosen to normalize the channel response of the users, so that the vectors related to each user in the channel matrix have the same norm. In Figure 20 we can see the behavior of the quantiles for the factor as a function of M for the three antenna layouts considered. The behavior of the factor calculated with the multipath scenario is very similar to the behavior shown with the LoS scenario: using a larger array with a proper element spacing seems to be an effective method to improve the stability of the inversion also in this case. It is also worth underlining that the use of factor is much more general with respect to approaches employing angular separation between users [63]- [65], since it can be directly applied also in cases, like the one considered in this section, when the propagation is different from LoS.
To properly calculate the possible rates we have instead chosen to take into account the effect of the different distance of the users from the base station. To this aim, for each user, we calculate the SNR considering a conjugate beam-forming at the BS array, supposing that the BS antenna is using all the available power on each user; then we introduce the average SNR at the receiverρ 0 , that is the geometrical average of the SNRs at the receiver calculated in the previous step. Finally, the overall rate when using G groups of M elements, is obtained as: whereγ g,m is a factor in the range (0, 1) modeling the power reduction due to the zero-forcing beam-forming and the splitting of the overall power due to the contemporary communication with M users. It is important to underline that the sub-channel SNR ρ g,m =γ g,mρ0 , is always calculated using the SNR obtained by means of ray-tracing: the average SNRρ 0 is only introduced to obtain plots comparable to the ones obtained for LoS simulations. The advantage of the optimized arrays with respect to the standard λ/2 equispaced array is confirmed also for the calculation of the rates. In Figures 21 and 22, we consider the variation of the overall rate as a function of M andρ 0 for the three arrays considered. It is apparent that the optimized VOLUME 9, 2021  arrays allow the communication with a larger number of contemporary users, because of their higher multiplexing capability.
It is worth mentioning that the performance calculation has been repeated for other procedurally generated cities, obtaining similar results.
Finally, it is important to recall that the simulations shown in this section include the effect of element pattern and polarization of the radiating element, and eventual grating lobes radiated by the array are also included in the electromagnetic simulation of the city. In particular, one of the main results of the previous section, that the grating lobes do not affect the multiplexing capability of the BS antenna as far as a proper scheduling of the users is employed, holds true also when a multi-path environment is involved (see . Furthermore, since the overall radiated power by the BS array is always the same, the grating lobes are not expected to increase the overall electromagnetic pollution (in-cell or inter-cell): the grating lobes only change the way in which radiated power is spread in non-wanted directions from the BS antenna, but we have to remember that in real propagation environments, the presence of several multiple paths for the electromagnetic fields, will anyway result in a radiated signal in unwanted directions.

VII. CONCLUSION
In this paper we have discussed the optimization of antenna arrays for multiplexing applications; in particular we have focused on massive MIMO systems working in Line-of-Sight propagation condition, but we have shown that the considerations done for LoS can help also in the design of antenna arrays working in multipath scenarios.
In the first part of the paper, using a simple yet effective line aperture model we have recalled the fundamental limits of the number of subchannels that can be synthesized by means of a zero-forcing approach, and we have introduced , a performance index that is able to quantify the stability of the matrix inversion needed for using the zero forcing: a low value of implies a good stability of the inversion, and little beamforming losses of the patterns used to communicate with the users, whereas a large value of is associated with large beamforming losses, that make the communication unreliable.
Beside the analysis of a specific MU-MIMO system, the proposed performance index has been demonstrated to be very useful, since: • it can be used for achieving an effective user allocation allocation ( scheduling), that allows a performance, in terms of rates and number of multiplexed channels, that is very close to the maximum possible; • it can guide the design and optimization of the antenna array to be used for multiplexing applications (˜ factor).
In particular, in the paper we have focused on a specific case study, the analysis of a planar array used as base station for a 64 elements massive-MIMO system with a sectoral coverage of 120 • × 30 • (deg) angular region. We have seen that, with the standard half-wavelength spaced array, if the direction of users in the angular region is uniform, the stability of the inversion of the channel matrix is guaranteed up to a maximum of 27 subchannels, but because of the randomness of the position of the users, even when using the -scheduler it is limited to no more than 19 channels, less than 1/3 of the BS antenna elements.
Then we have analyzed the same standard half-wavelength spaced array when the direction of users is not angularly uniform; in particular we have considered users uniformly distributed on a plane sector. The non uniform angular distribution of the users leads to a further reduction of the number of subchannels that can be multiplexed with a sufficient directivity, which is reduced to 11 (about 1/6 of the BS antennas).
By means of a parametric analysis of˜ , we have identified some elements spacings that significantly improve the number of subchannels that can be multiplexed, as well as the maximum overall rate: the optimized arrays, even when the angular distribution of users is not uniform, allow the use of a number of subchannels that is even larger than 2/3 of the number of radiators of the BS.
Moreover, by means of ray-tracing simulations, we have shown that the proposed approach can be fruitfully applied also in the case of complex propagation environments.
A particularly interesting result obtained regards the effect of grating lobes, that have always been seen as negative and avoided by antenna engineers. We have shown that, for multiplexing applications, as far as a good scheduling algorithm is employed, their presence should be not feared, so we can exploit (when possible) even very large dimensions of the antenna array, improving the beamforming capability without the need of increasing the number of radiators.
It is worth underlining that the introduced factor is perfectly suited for evaluating a system implementing the zero-forcing algorithm, but since it simply quantifies the stability of the radiation operator employed it represents a good design target also for systems employing different communication approaches.
In the future we plan to extend the analysis done in this paper to systems employing multiple layers per users, to analyze the effect of the use of subarrays and non regular lattices for the base stations, and to further analyze the optimization of antenna arrays working in complex propagation environments.

APPENDIX A USER SCHEDULING
Let us consider a K × N complex matrix H, where K represents the number of users and N is the number of excitations at the base station antenna. The problem of user grouping, i.e. the extraction of G groups of M elements in order to make the elements of each subset almost orthogonal, is known to be a NP problem, and only sub optimal approaches are known.
One of these sub optimal approaches is the QR-scheduling described in [70]; one of the main advantages of that method is its low computational cost, but some groups typically show a very low orthogonality of the elements. The QR scheduling is not adequate to be directly used for communication, but it can be used to quickly generate a starting solution to be elaborated by another algorithm, as the algorithm described in the following.
In the proposed implementation, the algorithm is an iterative algorithm starting from a preliminary grouping of the users, according to the following steps: 1) For each group the factor in (9) is calculated.
2) One random element of the group with the highest value of is selected for swapping with one of the elements belonging to the other groups.
3) The factor of the two groups involved in the swapping are calculated. 4) If the factor of both groups is lower than the previous maximum the swapping is accepted, otherwise the swapping is cancelled. 5) If the stopping criterion is met (in terms of number of iterations performed or target value of reached) the algorithm is stopped, otherwise it goes back to STEP#2.
The computational effort of the algorithm is quite intense, since 2 SVDs are required at each iteration to calculate the new factors, but this algorithm is proposed in this paper as benchmark and to perform evaluation on antenna performance, it is not meant to be used in online applications.
Finally, it must be noted that the algorithm is general: it has been applied throughout this paper for LoS cases, but it can also be applied to general random matrices arising from non-LoS propagation environments.
DANIELE PINCHERA (Senior Member, IEEE) received the Dr.Eng. (summa cum laude) degree in telecommunication engineering and the Ph.D. degree in information and electronic engineering from the University of Cassino and Southern Lazio, Cassino, Italy, in 2004 and 2008, respectively. He is currently a Tenured Researcher with the Department of Electrical and Information Engineering (DIEI), University of Cassino and Southern Lazio. His current research interests include smart antennas and multiple-input-multiple-output systems, special purpose antennas, large array synthesis, radar systems, satellite communications, compressed sensing, sensor networks, and industrial and medical applications of microwaves. He is also a member of ELEDIA@UniCAS, the Italian Electromagnetic Society (SIEm), and the National Interuniversity Consortium for Telecommunication (CNIT). He serves as a Referee for many scientific journals, including IEEE TRANSACTIONS ON  His current research interests include the connections between electromagnetism and information theory, the analysis, synthesis and characterization of antennas in complex environments, antennas and propagation for 5G, ad hoc wireless networks, compressed sensing as applied to electromagnetic problems, and energetic applications of microwaves. He is also a member of the ELEDIA@UniCAS Research Laboratory, the National Interuniversity Research Center on the Interactions between Electromagnetic Fields and Biosystems (ICEmB), where he is also the Leader of the 5G Group, the Italian Electromagnetic Society (SIEM), and the National Interuniversity Consortium for Telecommunication (CNIT). He was a Speaker with the summer Research Lecture Series of the UCSD CALIT2 Advanced Network Science in 2008. He serves as a Referee for many scientific journals. He has served as an Associate Editor for IEEE TRANSACTIONS ON ANTENNAS AND PROPAGATION.
FULVIO SCHETTINO (Senior Member, IEEE) received the Laurea degree (Hons.) and the Ph.D. degree in electronic engineering from the University of Naples, Naples, Italy. He is currently an Associate Professor with the University of Cassino and Southern Lazio, Cassino, Italy. His current research interests include numerical electromagnetics, regularization methods, the connections between electromagnetism and information theory, the analysis, synthesis and characterization of antennas in complex environments, antennas and propagation for 5G, and energetic applications of microwaves. He is also a member of the ELE-DIA@UniCAS Research Laboratory, the National Interuniversity Research Center on the Interactions between Electromagnetic Fields and Biosystems (ICEmB), the Italian Electromagnetic Society (SIEM), and the National Interuniversity Consortium for Telecommunication (CNIT). VOLUME 9, 2021