SLINR-Based Downlink Optimization in MU-MIMO Networks

Optimizing the downlink of multi-cell multiuser multiple input multiple output (MU-MIMO) networks has received substantial attention; however, the schemes in the literature consider centralized solutions requiring significant overhead in information exchange (e.g., global channel state information or CSI) and computation load (the need to solve a single large problem). This paper presents a decentralized weighted sum-rate (WSR) maximization algorithm for the multiuser downlink, accounting for beamforming, scheduling, and power allocation. We show that the signal-to-leakage-plus-noise ratio (SLNR) used in previous work suffers from significant drawbacks that limit its potential use in WSR maximization. We address this by proposing a new performance measure, the signal-to-leakage-plus-interference-plus-noise ratio (SLINR), which incorporates intra-cell interference and inter-cell leakage. The SLINR exploits the benefits of the SLNR approach, but by explicitly including interference, avoids many of its flaws. We derive an iterative and decentralized resource allocation approach under imperfect CSI, and our simulation results show that, despite BSs using only local information, the proposed algorithm comes within 3.8% of the throughput achieved by centralized schemes.

With the advent of 5G wireless networks, service providers are facing increasing pressure to deliver higher data rates to more users. One of the main factors limiting the achievable rates is interference, from both intra-cell and inter-cell transmissions, which is characterized by the signal-to-interference-plus-noise ratio (SINR). Since each active link causes interference to other links, it is essential for base stations (BSs) to consider the impact their transmissions have on the performance of the network as a whole. In this regard, a key technique proposed for multi-cell multiuser multiple input multiple output (MU-MIMO) wireless networks is beamforming in the downlink [1], [2], [3], [4], which coordinates transmissions between BSs to maximize a chosen function of the SINR at scheduled users, e.g., our choice of maximizing the weighted sum-rate (WSR).
The associate editor coordinating the review of this manuscript and approving it for publication was Luyu Zhao.
A significant additional challenge faced with beamforming is that it adds complexity to the problem of resource allocation, e.g., how should the network decide which users to serve at each point in time, and with what beamforming patterns? In [4], the authors propose an SINR-based resource allocation algorithm using fractional programming (FP) and the Hungarian algorithm to maximize the network WSR on the downlink. Nevertheless, SINR-based optimization approaches are inherently difficult to solve due to their highly coupled nature; for each user in the network, the interference effects of each BS's transmissions must be accounted for. This requires that cooperating BSs have global knowledge of the channel state information (CSI) between every antenna and every user in the entire network, i.e., both intra-cell and inter-cell CSI. We note that the acquisition of intra-cell CSI is a significant problem on its own, especially with the advent of massive MIMO networks [5]; requiring inter-cell CSI as well seems highly impractical.
Given the challenges with acquiring CSI, SINR-based solutions are likely unrealistic in practice, but provide a theoretical upper bound on performance. Researchers have proposed using leakage as a proxy for interference, leading to the signal-to-leakage-plus-noise ratio (SLNR) [6]. In our quest for a decentralized solution to the multi-cell WSR maximization problem, we too begin with the SLNR; however, it has important limitations. After detailing these limitations, our main contribution is to develop a hybrid performance measure for decentralized resource allocation, accounting for inter-cell leakage and intra-cell interference. We note that [7] builds on our work by applying our performance measure to distributed resource allocation in cell-free MIMO networks, showing that our approach yields an effective objective function for optimization.

B. LITERATURE REVIEW
1) SINR-BASED OPTIMIZATION
As mentioned, direct optimization of rates necessitates SINR-based optimization. Since the SINR is a ratio, optimizing network utility functions such as the WSR is non-convex and NP-hard [8]; we can, at best, find local optima. The quadratic transform, an FP technique, is proposed in [9] to decouple the ratio terms. This transforms the problem into a form amenable to iterative optimization.
In [4], a centralized SINR-based resource allocation algorithm is proposed to optimize the network WSR on the downlink using FP based on the quadratic transform. It also makes use of the fact that, for fixed beamforming vectors throughout the network, a BS can optimally schedule users locally without changing the inter-cell interference pattern. This optimal scheduling is accomplished using the Hungarian algorithm. This algorithm iterates between designing optimal beamforming vectors given the current set of scheduled users, then refining the optimal set of users given the current beamforming vectors; this procedure converges to a stationary point.

2) SLNR-BASED OPTIMIZATION
SLNR-based approaches for resource allocation have garnered some interest [10], [11], [12], [13], [14], [15]. The fundamental benefit is that optimizing an SLNR-based function requires only local CSI. SLNR-based techniques have, therefore, been the dominant approach in the literature when developing distributed solutions using only local information. It is well known that a closed-form solution exists for the SLNR-optimal beamformer, a significant simplification compared with the SINR case [6].
The work in [10] addresses SLNR-based beamforming under per-antenna power constraints and devises an approach that is robust to imperfect CSI. SLNR is used in [11] to decouple a two-stage beamforming algorithm across BSs; similarly, [12] employs SLNR-based beamforming to decouple an SINR-based problem for connected vehicles into independent subproblems. In [13], an explicit expression for the SLNR-optimal beamformer is derived for the case of single-antenna receivers. It is shown that, under an equal power allocation, this beamformer has performance identical to the minimum mean square error (MMSE) beamformer. The authors note that at high SNR, the SLNR converges to a scaled F random variable which is independent of the SNR.
It is worth noting that power allocation is difficult to optimize under SLNR. This is because the power allocated to a beam scales both the signal and the leakage equally, thereby attenuating the effect of power control. Optimizing SLNR also complicates scheduling, which is highly coupled even within a single cell. This is because leakage captures how a beam impacts all the users being served; when this set of users changes, so must the leakage for all the beams (this is in direct contrast to the SINR-based case, where the Hungarian algorithm is applicable). Thus, leakage is a beam-centric measure that depends on user scheduling decisions, whereas interference is a user-centric measure that depends on beam design.
A one-pass scheduling algorithm is proposed in [14], which computes a lower bound on each user's SLNR, then chooses the subset of users with the greatest lower bounds. This lower bound loosens when the number of users is much greater than the number of antennas, and so they also propose a successive scheduling algorithm, which incrementally schedules the best user given previous scheduling decisions. In [15], a mixed approach is proposed with one half of the users scheduled using the max-lower bound criterion and the other half using successive scheduling. The authors also propose power allocation based on the SLNR computed for each user.

3) OPTIMIZATION WITH INTERFERENCE AND LEAKAGE
In [16], the authors propose downlink beamforming for a single-cell network based on the intra-cell leakage and interference using a performance measure they call SLINR. In [17], the author considers 3 BSs and 3 paired users at the cell edge. The author develops an SLINR-based iterative beamforming scheme requiring CSI exchange across base stations, but not joint computation. Here, the SLINR is a function of the inter-cell leakage and interference, which is different from the definition used in [16]. In [18], the authors propose a downlink precoding scheme based on a performance measure called SILNR, which only requires local CSI. The SILNR incorporates intra-cell interference and the geometric mean of the inter-cell leakage terms. None of [16], [17], [18] incorporate user scheduling.

C. CONTRIBUTIONS
In this paper, we develop a valuable performance measure for guiding decentralized resource allocation. We too call this performance measure "SLINR", but it differs significantly from the previous work in [16], [17], and [18]. Unlike in [16], our paper studies a multi-cell network, and our SLINR comprises intra-cell interference and inter-cell leakage. Unlike [17], our paper studies a general multi-cell network model with an arbitrary number of BSs and users, and our algorithm is completely decentralized and does not require CSI exchange. Unlike [18], our SLINR depends on the sum of the inter-cell leakage terms rather than their geometric mean.
VOLUME 10, 2022
Furthermore, we develop the motivation for our SLINR from first principles and derive an effective resource allocation algorithm which, unlike the previous work, includes scheduling and per-user weights. We optimize the WSR, and include a comprehensive performance evaluation for different choices of weights. Our results compare the proposed algorithm to other decentralized ones and the state-of-the-art in centralized resource allocation (a short version of this paper has been accepted at a conference [19]). In summary, our major contributions are:
• We motivate replacing the SINR (or SLNR) with a general, decentralized version of the SLINR, which uses inter-cell leakage and intra-cell interference. In our formulation, the SLINR can be computed without information exchange across BSs (unlike [17]).
• We develop an iterative and decentralized SLINR-based resource allocation algorithm for the multi-cell multiuser environment at hand, which obtains a stationary point. The development of the SLINR and its optimization under imperfect CSI is our main contribution.
• We show that the performance of our SLINR-based algorithm comes within 3.8% of that achieved by the state-of-the-art centralized SINR-based algorithm in [4]. Additionally, our algorithm provides fairness across users.

D. STRUCTURE AND NOTATION
This paper is organized as follows. Section II presents our system model (traditional MU-MIMO) and the weighted sum-rate optimization problem faced by the network.
To solve this problem in a decentralized manner, Section III develops resource allocation based on the SLINR. The resulting algorithm is analyzed numerically in Section IV, and we conclude in Section V. We use standard notation. Vectors are represented using lowercase boldface (e.g., $\mathbf{h}$) and matrices using capital boldface (e.g., $\mathbf{V}$). The Hermitian of a matrix is denoted by $(\cdot)^H$, the inverse of a matrix by $(\cdot)^{-1}$, and the $\ell_2$ norm of a vector by $\|\cdot\|$. The identity matrix of dimension $M$ is denoted $\mathbf{I}_M$. $\mathcal{CN}(\boldsymbol{\mu}, \mathbf{R})$ denotes the circularly symmetric complex normal distribution with mean $\boldsymbol{\mu}$ and covariance $\mathbf{R}$. The absolute value of a scalar is denoted by $|\cdot|$. $\mathbb{R}^{N \times 1}$ and $\mathbb{C}^{N \times 1}$ denote real and complex column vectors of length $N$, respectively.

II. SYSTEM MODEL
A. NETWORK MODEL
We model the network as a collection of cells, each containing a single BS. $\mathcal{B}$ denotes the set of all BSs in the network. Each BS is equipped with $M_t$ transmit antennas, and the $b$th BS serves $K_b$ single-antenna users in its cell on the same frequency resource. The downlink channel from the $i$th BS to the $k$th user of the $j$th BS is denoted by $\mathbf{h}_{kj,i} \in \mathbb{C}^{M_t \times 1}$ and is given by $\mathbf{h}_{kj,i} = \mathbf{g}_{kj,i}\sqrt{\beta_{kj,i}}$, where $\mathbf{g}_{kj,i}$ is a random variable from a small-scale block fading process and $\beta_{kj,i}$ is the large-scale path loss, which depends on the BS-to-user distance, $d_{kj,i}$. We assume that $d_{kj,i}$ is slowly varying such that it can be regarded as a constant over the period of interest; therefore, the large-scale fading coefficient $\beta_{kj,i}$ can also be regarded as a constant. Since $\beta_{kj,i}$ varies slowly, it can be estimated with low feedback overhead.
Let $u_{kb}$ denote a binary scheduling variable indicating whether user $k$ of BS $b$ is ($u_{kb} = 1$) or is not ($u_{kb} = 0$) scheduled, and let $\mathbf{v}_{kb} \in \mathbb{C}^{M_t \times 1}$ denote the user's beamforming weight vector. This user is to receive symbol $s_{kb}$; thus, its received signal can be written as
$$ y_{kb} = \sum_{i \in \mathcal{B}} \sum_{j=1}^{K_i} u_{ji}\, \mathbf{h}_{kb,i}^H \mathbf{v}_{ji}\, s_{ji} + \eta_{kb}, \tag{1} $$
where $\eta_{kb} \sim \mathcal{CN}(0, \sigma_{kb}^2)$ denotes additive white Gaussian noise. The achievable rate with $W_B$ Hz of channel bandwidth is defined (in appropriate units) as
$$ R_{kb} = W_B \log\left(1 + \gamma_{kb}\right), \tag{2} $$
where $\gamma_{kb}$ is the SINR, written as
$$ \gamma_{kb} = \frac{u_{kb}\, |\mathbf{h}_{kb,b}^H \mathbf{v}_{kb}|^2}{\sigma_{kb}^2 + \sum_{(j,i) \neq (k,b)} u_{ji}\, |\mathbf{h}_{kb,i}^H \mathbf{v}_{ji}|^2}. \tag{3} $$
Finally, we control the relative priority of each user via the rate weighting factor, $w_{kb}$.
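As a numerical illustration of the SINR and weighted sum-rate described above, the following is a minimal sketch under stated assumptions: unit-power symbols, a base-2 logarithm (the text only says "appropriate units"), and hypothetical dictionary layouts `H`, `V`, `U`, `W` that are not part of the paper.

```python
import math

def inner(h, v):
    """Hermitian inner product h^H v for equal-length lists of complex numbers."""
    return sum(hc.conjugate() * vc for hc, vc in zip(h, v))

def sinr(k, b, H, V, U, noise_var):
    """SINR of user k of BS b.

    H[(k, b, i)]: channel from BS i to user k of BS b (hypothetical layout)
    V[(j, i)]:    beamformer for user j of BS i
    U[(j, i)]:    1 if user j of BS i is scheduled, else 0
    """
    signal = U[(k, b)] * abs(inner(H[(k, b, b)], V[(k, b)])) ** 2
    # Interference: every other scheduled beam in the network, seen through
    # the channel from that beam's BS to this user.
    interference = sum(
        U[(j, i)] * abs(inner(H[(k, b, i)], V[(j, i)])) ** 2
        for (j, i) in V
        if (j, i) != (k, b)
    )
    return signal / (noise_var + interference)

def weighted_sum_rate(H, V, U, W, noise_var, bandwidth=1.0):
    """Sum of w_kb * W_B * log2(1 + SINR_kb); log base 2 is an assumption."""
    return sum(
        W[(k, b)] * bandwidth * math.log2(1.0 + sinr(k, b, H, V, U, noise_var))
        for (k, b) in V
    )

# Two BSs with two antennas each, one user per cell.
H = {
    (0, 0, 0): [1 + 0j, 0j], (0, 0, 1): [0j, 0.5 + 0j],
    (0, 1, 1): [0j, 1 + 0j], (0, 1, 0): [0.5 + 0j, 0j],
}
V = {(0, 0): [1 + 0j, 0j], (0, 1): [0j, 1 + 0j]}
U = {(0, 0): 1, (0, 1): 1}
W = {(0, 0): 1.0, (0, 1): 1.0}
print(sinr(0, 0, H, V, U, noise_var=0.25))
print(weighted_sum_rate(H, V, U, W, noise_var=0.25))
```

Note that evaluating `sinr` needs the beamformers and cross-cell channels of every BS, which is exactly the global knowledge a decentralized scheme cannot assume.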

B. CHANNEL ESTIMATION
We consider a time-division duplex system, where channels are estimated on the uplink using a pilot-based scheme. We assume channel reciprocity holds. Users within a given cell are assigned pilot sequences of length N p , denoted by kb ∈ C 1×N p for the k th user of the b th BS. We model the pilot sequences as having unit norm, i.e., kb H kb = 1. Note that the vectors kb are known by all BSs; we also assume each BS has knowledge of every user's large-scale fading coefficient and transmit power (denoted by p u ), similarly to [21].
The training signal received by the b th BS is denoted by Y b ∈ C M t ×N p and, assuming all pilot signals are received coherently, is written as where Z b is measurement noise, with entries distributed according to CN (0, σ 2 Z ). To estimate the channel from the b th BS to thek th user of theb th BS, BS b first preprocesses (4) to eliminate the contributions from users with pilot sequences orthogonal to kb . This is done by projecting (4) onto the user's pilot signal, . BS b then uses the linear minimum mean square error (LMMSE) estimator to compute the estimateĥkb ,b : In (5), Rỹ˜k˜b ,b hkb ,b is the cross-covariance between the measurementỹkb ,b and the channel, and Rỹ˜k˜b ,bỹkb,b is the covariance matrix ofỹkb ,b . Assuming spatially uncorrelated channels, we obtain the cross-covariance matrix as Rỹ˜k˜b ,b hkb ,b = Dkb ,b = βkb ,b I M t , and the covariance matrix as where Ukb denotes the set of users whose pilots are non-orthogonal to H kb (including userk at BSb). Since the covariance matrices are diagonals, simplifying (5) yields the estimatorĥkb In (7), ζkb ,b is an SINR-like term for the channel estimation procedure, written as From the theory of LMMSE estimation, the estimate (7) is distributed as CN (0, Akb ,b ), where and the estimation error

C. PROBLEM FORMULATION
We wish to maximize resource allocation performance throughout the network, as measured by the weighted sum-rate, in a decentralized manner (i.e., without any real-time information exchange across BSs). In this section, we state the centralized optimization problem that the network faces, and in the following sections, we propose and analyze a decentralized approach for solving it effectively.
The network-wide optimization is over the scheduling variables and beamformers, collected in matrices U and V, respectively. Accordingly, the weighted sum-rate maximization problem for a single time slot can be expressed in terms of the SINR (3) as shown in (11).
The first set of constraints, (11b), enforces binary scheduling decisions. The constraints in (11c) ensure that the number of users scheduled at a BS does not exceed the number of antennas. Finally, the constraints in (11d) limit the power available for transmission at each BS.
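For orientation, one plausible form of the problem described above is sketched below. It is assembled only from the verbal descriptions of the objective and constraints; the symbol $P_{\max}$ for the per-BS power budget and the exact form of each constraint are assumptions, not taken from the paper.

```latex
\begin{align}
\max_{\mathbf{U},\,\mathbf{V}} \quad
  & \sum_{b \in \mathcal{B}} \sum_{k=1}^{K_b} w_{kb} R_{kb} \tag{11a} \\
\text{subject to} \quad
  & u_{kb} \in \{0, 1\}, \quad \forall k, b, \tag{11b} \\
  & \sum_{k=1}^{K_b} u_{kb} \le M_t, \quad \forall b, \tag{11c} \\
  & \sum_{k=1}^{K_b} \|\mathbf{v}_{kb}\|^2 \le P_{\max}, \quad \forall b. \tag{11d}
\end{align}
```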

III. DOWNLINK OPTIMIZATION BASED ON SLINR
A. PROBLEM ANALYSIS
We begin by noting that the optimization problem (11) is mixed-integer, non-convex, and NP-hard [8]. Since finding a globally optimal solution is computationally infeasible, our goal is to design an algorithm with desirable properties that solves the problem effectively. Next, we emphasize that we wish to find a fully decentralized solution to this problem, i.e., each BS knows only its local scheduling decisions, beamforming vectors, and CSI. This requirement alters the problem structure, since it constrains our solutions to those which can be obtained by each BS performing an independent optimization using only local information. Accordingly, our first step for finding a decentralized algorithm is to state the single-cell optimization problem, over local design variables, for the $b$th BS:
$$ \max_{\mathbf{u}_b,\, \mathbf{V}_b} \; \sum_{k=1}^{K_b} w_{kb} R_{kb} \quad \text{subject to (11b), (11c), (11d)}. \tag{12} $$
In (12), $\mathbf{u}_b$ and $\mathbf{V}_b$ collect the scheduling variables and beamforming vectors designed by the $b$th BS, respectively. An insurmountable problem arises with (12): it is impossible for a BS to evaluate $\gamma_{kb}$ via (3), since the inter-cell interference experienced by each user depends on other BSs' beamforming decisions and CSI (neither can be known without information exchange). Thus, neither direct rate-based optimization nor explicit interference coordination is possible in this setting. Additionally, we wish to study the practical case of imperfect CSI outlined in Section II-B, so the optimization strategy should account for channel estimation errors.

B. PROPOSED APPROACH
As discussed in Section III-A, optimizing rates directly (i.e., using an SINR maximization methodology) is not possible in the decentralized setting, because BSs require information exchange in order to calculate inter-cell interference. As such, we need to choose among alternative performance measures, which we call pseudorates. Note that in the following, we do not claim that these pseudorates are achievable, nor do we use them to evaluate performance; rather, the pseudorates are only used in our optimization procedure, and we always evaluate system performance with respect to the real (SINR-based) achievable rates.
As a starting point, consider an SINR-based optimization approach that only uses the intra-cell interference (since each BS can evaluate this using solely local information). This is a reasonable strategy that can serve as a performance baseline, but we will be interested in trying to improve upon it since it is inherently selfish. By selfish, we mean that it encourages each BS to maximize its own performance with no consideration for how its transmissions might reduce the performance of other BSs. Since we are interested in maximizing the performance of the network as a whole, it would be convenient if we had a way to balance the selfish tendency of each BS with a tendency for neighbourly behaviour.
This discussion brings to mind the leakage-based approaches mentioned in Section I. The leakage philosophy dictates that a BS accounts for other cells by considering the impacts its transmissions have on their users. This is in contrast to the interference philosophy, which dictates that a BS accounts for other cells by considering the impacts their transmissions have on its own users. The leakage and interference philosophies are shown in Figs. 1a and 1b, respectively, visualizing our earlier statement that leakage is beam-centric while interference is user-centric. Unlike interference, leakage can be computed without exchanging CSI and beamformers. The power that beam v kb leaks to the users in neighbouring cells, i.e., its inter-cell leakage, is [6] and the power it leaks to the users in its own cell, i.e., its intracell leakage, is From (13), we can see that computing the inter-cell leakage requires knowledge of other BSs' scheduling decisions. Instead of requiring perfect knowledge of these variables (i.e., information exchange), we instead adopt a flexible approach based on probabilistic knowledge, i.e., Pr (u k b = 1). Our evaluation of the inter-cell leakage for beam v kb then becomes an estimate Importantly, Pr (u k b = 1) could be set according to various policies, e.g., it may be fixed, or time-varying (since in general, not all users are equiprobable). Note that time-varying policies would not require scheduling information to be exchanged between BSs during each allocation, as long as the probabilistic knowledge of the scheduling variables in neighbouring cells changes relatively slowly. In other words, real-time knowledge of the scheduling variables is not necessary as long as the impact this has on the inter-cell leakage computation is small. 
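The probabilistic leakage estimate described above can be sketched as follows. This is a minimal illustration, assuming the estimate is the probability-weighted sum of the powers a beam delivers to users in neighbouring cells; the function name, the dictionary layout, and the probability values are hypothetical.

```python
def inner(h, v):
    """Hermitian inner product h^H v for equal-length lists of complex numbers."""
    return sum(hc.conjugate() * vc for hc, vc in zip(h, v))

def expected_inter_cell_leakage(beam, neighbour_channels, sched_prob):
    """Probability-weighted power that `beam` leaks to users in other cells.

    neighbour_channels[(k, b)]: channel from THIS BS to user k of neighbouring BS b
    sched_prob[(k, b)]:         assumed probability that this user is scheduled
    Only quantities available at the transmitting BS are used.
    """
    return sum(
        sched_prob[user] * abs(inner(h, beam)) ** 2
        for user, h in neighbour_channels.items()
    )

beam = [1 + 0j, 0j]
neighbour_channels = {(0, 1): [0.5 + 0j, 0j], (1, 1): [0j, 1 + 0j]}
sched_prob = {(0, 1): 0.5, (1, 1): 0.5}
print(expected_inter_cell_leakage(beam, neighbour_channels, sched_prob))
```

Setting every probability to 1 recovers the leakage that would be computed if all neighbouring users were known to be scheduled.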
Regardless of how the probabilities are assigned, by relaxing our knowledge of the scheduling variables in neighbouring cells to probabilities, we obtain a decentralized way of evaluating a beam's inter-cell leakage via (15); thus, BSs are fully decoupled. A user's SLNR is defined as the ratio of its signal power to its total leakage and noise power SLNR-based optimization is attractive in certain respects: it only requires local CSI and design variables, optimal beamformers can be determined in closed form, and it is inherently ''neighbourly'' since it penalizes inter-cell transmissions. However, this approach has important drawbacks, as mentioned in the introduction. The first key drawback comes from the observation that scheduling decisions are highly coupled under the SLNR, since the decision of whether to schedule a user affects the leakage of all beams. The second key drawback is that, since the power of a beam appears in both the numerator and the leakage terms in the denominator of the SLNR, it struggles to guide power allocation effectively.
Neither of these drawbacks are present for the SINR. Given fixed beamforming patterns, local user scheduling can be optimally solved using the Hungarian algorithm when using the SINR (and naturally, local decision making is conducive to our goal of developing a decentralized resource allocator). Even without knowing the beamformers used by other BSs, we can still use the Hungarian algorithm to make local scheduling decisions via the intra-cell interference. Furthermore, since the SINR has different beams in the numerator and denominator, the power cancellation problem experienced by the SLNR does not occur. As such, an SINR-based measure which uses only the intra-cell interference does not suffer the drawbacks of the SLNR, but at the same time, it does not experience the benefits of the SLNR's neighbourly intentions.
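The power-cancellation argument can be checked numerically with a toy scalar model. This is a sketch under simplifying assumptions (a single beam with fixed direction, scalar gains, and hypothetical function names), not the paper's system model.

```python
def slnr(power, signal_gain, leak_gain, noise_var):
    """SLNR of one beam: its power scales both the signal and the leakage."""
    return power * signal_gain / (power * leak_gain + noise_var)

def intra_sinr(power, signal_gain, interference, noise_var):
    """SINR-type measure: the interference comes from OTHER beams, so it
    does not scale with this beam's own power."""
    return power * signal_gain / (interference + noise_var)

for p in (1e0, 1e3, 1e6):
    print(p, slnr(p, 1.0, 0.1, 1.0), intra_sinr(p, 1.0, 1.0, 1.0))
```

In this toy model the SLNR saturates near `signal_gain / leak_gain` as the power grows, so it gives little guidance on how much power to allocate, whereas the SINR-type measure keeps responding to the allocated power.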
In summary, this discussion has shown that both leakage and interference bring something to the table for this decentralized environment: using the intra-cell interference allows BSs to make local scheduling decisions optimally given fixed beams, and using the inter-cell leakage allows BSs to account for their impact on other cells without exchanging CSI and design variables. However, both leakage and interference leave something to be desired when considered on their own. As such, we now consider a performance measure comprising leakage and interference, in hopes of retaining the benefits of each while diminishing the drawbacks.
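We write the hybrid leakage- and interference-based measure as the SLINR, given by
$$ \gamma^{LI}_{kb} = \frac{u_{kb}\, |\mathbf{h}_{kb,b}^H \mathbf{v}_{kb}|^2}{\sigma_{kb}^2 + I^{\text{intra}}_{kb} + L^{\text{inter}}_{kb}}, \tag{17} $$
where $L^{\text{inter}}_{kb}$ is the inter-cell leakage estimate from (15) and $I^{\text{intra}}_{kb}$ is the intra-cell interference experienced by the $k$th user of BS $b$, given as
$$ I^{\text{intra}}_{kb} = \sum_{k' \neq k} u_{k'b}\, |\mathbf{h}_{kb,b}^H \mathbf{v}_{k'b}|^2. \tag{18} $$
We emphasize that (17) can be computed without exchanging information across BSs.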
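To make concrete that the SLINR needs only local quantities, the sketch below assembles it from a BS's own beams, its channels to its own users, and its channels to neighbouring users. The function name and data layout are hypothetical, and only scheduled users are assumed to appear in `beams`.

```python
def inner(h, v):
    """Hermitian inner product h^H v for equal-length lists of complex numbers."""
    return sum(hc.conjugate() * vc for hc, vc in zip(h, v))

def slinr(k, beams, local_channels, neighbour_channels, sched_prob, noise_var):
    """SLINR of local user k, computed entirely at the serving BS.

    beams[j]:               beam of scheduled local user j
    local_channels[j]:      channel from this BS to local user j
    neighbour_channels[u]:  channel from this BS to user u of another cell
    sched_prob[u]:          assumed scheduling probability of that user
    """
    v = beams[k]
    signal = abs(inner(local_channels[k], v)) ** 2
    # Intra-cell interference: other local beams as received by user k.
    intra = sum(
        abs(inner(local_channels[k], beams[j])) ** 2 for j in beams if j != k
    )
    # Inter-cell leakage: power of user k's own beam landing in other cells.
    leakage = sum(
        sched_prob[u] * abs(inner(h, v)) ** 2
        for u, h in neighbour_channels.items()
    )
    return signal / (noise_var + intra + leakage)

beams = {0: [1 + 0j, 0j], 1: [0.5 + 0j, 0.5 + 0j]}
local_channels = {0: [1 + 0j, 0j], 1: [0j, 1 + 0j]}
neighbour_channels = {(0, 1): [0.5 + 0j, 0j]}
sched_prob = {(0, 1): 0.5}
print(slinr(0, beams, local_channels, neighbour_channels, sched_prob, 0.125))
```

The denominator mixes a user-centric term (interference received by user `k`) with a beam-centric term (leakage caused by user `k`'s beam), which is the hybrid nature of the measure.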
To model imperfect CSI, we replace the channels with their estimates and treat the estimation errors as additional noise, as in [20] and [22]. This treatment persists throughout the rest of the paper. The SLINR in (17) then becomes the estimate $\hat{\gamma}^{LI}_{kb}$ given in (19). In (19), $\hat{L}^{\text{inter}}_{kb}$ and $\hat{I}^{\text{intra}}_{kb}$ are identical to (15) and (18), respectively, except that the channels are replaced with their estimates. The estimation error for the inter-cell leakage, $\tilde{L}^{\text{inter}}_{kb}$, is given by (20), and the estimation error for the intra-cell interference, $\tilde{I}^{\text{intra}}_{kb}$, is given by (21). The signal model employed for SLINR-based processing is visualized in Fig. 2. We now introduce the SLINR-based pseudorates as $\hat{R}_{kb} = W_B \log\left(1 + \hat{\gamma}^{LI}_{kb}\right)$. We optimize these as a proxy for the rate maximization problem, with the understanding that system performance must ultimately be evaluated via the real (SINR-based) rates using (2). Accordingly, the weighted sum-pseudorate maximization problem can be written like the original decentralized formulation (12), except that the rates in the objective are replaced by pseudorates; for BS $b$, this is problem (22).

C. RESOURCE ALLOCATION
We begin the optimization for the b th BS by noting that (22) is a sum-of-functions-of-ratios maximization problem. To solve it, we first introduce auxiliary variables, φ kb , to remove the fractional SLINR terms from the logarithms in the objective, yielding an equivalent optimization problem: (23c) Next, we introduce Lagrange multipliers, λ kb , for the equality constraints, and form the Lagrangian where collects the primal variables to save space. Differentiating with respect to φ kb , we find the optimal Lagrange multipliers to be where den{γ LI kb } denotes the denominator of (19). Now, substituting (25) into (24) yields the revised objective function (26) For all tuples in the constraint set, the numerators of the fractional terms are non-negative and the denominators are strictly positive. Therefore, we can use the quadratic transform [9] to decouple the numerator and denominator of each fraction, yielding a new objective function and a new optimization problem: subject to y kb ∈ C, (11b), (11c), (11d).
The problem in (28) is equivalent to (23), in the sense that the optimal values of the primal variables and objective functions are identical; this follows directly from properties C2 and C3 of the quadratic transform, detailed in Section II-B of [9].
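The decoupling behind the quadratic transform of [9] can be illustrated on a single ratio. For a complex numerator term $a$ and a positive denominator $B$, the surrogate $2\,\mathrm{Re}\{y^* a\} - |y|^2 B$ is maximized at $y = a/B$, where it equals $|a|^2/B$. The sketch below checks this numerically; it covers one scalar ratio only and is not the paper's full objective.

```python
def quadratic_surrogate(a, b, y):
    """Quadratic-transform surrogate 2*Re{conj(y)*a} - |y|^2 * b for |a|^2 / b."""
    return 2.0 * (y.conjugate() * a).real - abs(y) ** 2 * b

def optimal_auxiliary(a, b):
    """Closed-form maximizer of the surrogate over y, for fixed a and b > 0."""
    return a / b

a, b = 3 + 4j, 2.0
y_star = optimal_auxiliary(a, b)
print(abs(a) ** 2 / b)                    # original ratio
print(quadratic_surrogate(a, b, y_star))  # surrogate at the optimal y
print(quadratic_surrogate(a, b, 1 + 0j))  # any other y gives a smaller value
```

Because the surrogate is concave in `y` for fixed `a` and `b`, and no longer contains a ratio once `y` is fixed, it lends itself to the alternating closed-form updates used in the cyclic maximization.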
We now show how (28) can be solved using cyclic maximization with closed-form updates. Note that cyclic maximization falls under the broader class of minorizationmaximization algorithms, which are guaranteed to converge to a stationary point under natural assumptions [23]. Keeping Similarly, we set ∂f q (z b ,y b ) ∂y * kb = 0 to find the optimal value of y kb as Beamformer and Power Constraint: To find the optimal v kb , we first introduce a dual variable µ b ≥ 0 to capture the perbase-station power constraint, and form the Lagrangian By setting the k th element of the gradient ∇ v * b L (z b , y b , µ b ) equal to zero and solving for v kb , it can be shown (see Appendix A) that the optimal beamforming vector is where the intermediate matrix X kb (µ b ) is given as In (33), µ b can be chosen using a bisection search in order to satisfy the sum power constraint. Due to the constraint on the maximum number of users that can be scheduled as given by (11c), the beam design step yields N b ≤ M t nonzero beams, denoted for the b th BS as ν nb , n = 1, . . . , N b . Scheduling Variables: The last step in the iteration is to update the local schedule, u b . Crucially, the inter-cell leakage in (15) is independent of local scheduling decisions for a given beam, and for a fixed set of beams, the intra-cell interference pattern is independent of local scheduling decisions too. Thus, when the k th user of the b th BS is served on the n th nonzero beam, we can write the intra-cell interference it experiences as the total interference minus the new signal term The new estimation error follows similarly, written as and we denote the new inter-cell leakage asL inter kb (ν nb ). Here, L inter kb (ν nb ) andL inter kb (ν nb ) are identical to (15) and (20), respectively, except that ν nb is used in the calculation instead of v kb . 
With the previous equations in mind, we can see that the SLINR for this user can be written aŝ where the fixed quantity c kb = σ 2 kb +Î intra ν nb is dependent on the beam choice. The weighted pseudorate when this user is served on the n th nonzero beam is then given by (37).
Updating the local scheduling variables, u b , is then a matter of determining the optimal user-to-beam matchings, given the beams designed via (32). For the b th BS, this can be formulated as the linear sum assignment problem in (38). In (38), the binary variables x kb,n indicate whether the k th user of the b th BS has been scheduled on the n th beam (in such a case, we set u kb = 1 and v kb = ν nb for the next iteration). Importantly, (38) can be solved optimally in O(K b N 2 b ) steps using well-established techniques such as the Hungarian algorithm [24].
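The user-to-beam matching can be illustrated on a tiny instance. The sketch below uses exhaustive search as a stand-in for the Hungarian algorithm, purely so that the example has no dependencies; it is only practical for very small problems. It assumes each beam serves exactly one user, each user gets at most one beam, and there are at least as many users as beams. The rate values are made up.

```python
from itertools import permutations

def best_assignment(rate):
    """Maximize the total weighted pseudorate over user-to-beam matchings.

    rate[k][n]: weighted pseudorate if user k is served on beam n.
    Returns (best total, {beam index: user index}).
    """
    num_users, num_beams = len(rate), len(rate[0])
    best_value, best_users = float("-inf"), None
    # Each ordered selection of distinct users assigns users[n] to beam n.
    for users in permutations(range(num_users), num_beams):
        value = sum(rate[k][n] for n, k in enumerate(users))
        if value > best_value:
            best_value, best_users = value, users
    return best_value, {n: k for n, k in enumerate(best_users)}

# Three candidate users, two nonzero beams.
rate = [
    [3.0, 1.0],
    [2.0, 2.5],
    [3.5, 0.5],
]
total, matching = best_assignment(rate)
print(total, matching)
```

Users absent from the returned matching would have their scheduling variable set to zero. For realistic sizes, a polynomial-time assignment solver would replace the exhaustive loop.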
We have now derived updates for all variables. Accordingly, the SLINR-based downlink resource allocation algorithm for the $b$th BS is stated in Algorithm 1. We emphasize that this algorithm is executed at each BS independently, i.e., the overall resource allocation is distributed. We also emphasize that it is impossible for any of steps 4 to 7 to decrease the value of the objective function (28), since they are each locally optimal with respect to the function. Accordingly, the objective function must be monotonically non-decreasing throughout the optimization procedure. This, plus the fact that the objective is bounded above (since the weighted sum-rate must be finite), ensures that Algorithm 1 converges to a stationary point.
Algorithm 1 SLINR-based downlink resource allocation for the $b$th BS
4: Update $\boldsymbol{\phi}_b$ using (29)
5: Update $\mathbf{y}_b$ using (30)
6: Update $\mathbf{V}_b$ using (32)
7: Update $\mathbf{u}_b$ and $\mathbf{V}_b$ jointly by solving (38)
8: Increment $i$
9: until convergence or $i = N_{\text{iterations}}$
Algorithm Complexity: Because Algorithm 1 is distributed, its complexity can be analyzed on a per-BS basis. To this end, we define $K \triangleq \max_{b \in \mathcal{B}} \{K_b\}$ as a tight upper bound on the number of users per cell. We distinguish between $K$ and $K_b$ because each BS runs the algorithm independently, and some steps have a complexity that scales with the number of local users (i.e., $K_b$) while others have a complexity that scales with the number of users in neighbouring cells (bounded above by $K$). We consider the inner product of two $M \times 1$ vectors to be $O(M)$, and the multiplication of an $M \times M$ matrix and an $M \times 1$ vector to be $O(M^2)$.
We begin the analysis with Step 4, where φ o kb is updated using (29). This is equivalent to evaluating (19), which is dominated by the matrix-vector multiplication inL inter kb andĨ intra kb , yielding an overall complexity for this step of O(|B|KM 2 t + K b M 2 t ). In Step 5, the update for y o kb in (30) can reuse the den{γ LI kb } term computed as part of Step 4; with this caching, the overall complexity of Step 5 just comes from the two inner products in (30), and is therefore O(M t ).
Step 6 consists of several sub-steps, the first of which is computing the outer products and matrix additions comprising $\mathbf{X}_{kb}(\mu_b)$. Next, $\mathbf{X}_{kb}(\mu_b)$ needs to be inverted, which is $O(M_t^3)$; finally, the beamformer is obtained through an $O(M_t)$ inner product.
Step 6 also involves a bisection search on the power constraint to determine $\mu_b$, but we treat this search as having constant complexity since it scales with search parameters that are independent of the scenario under study. These sub-steps need to be performed for up to $M_t$ nonzero beams, yielding an overall complexity for Step 6 of $O(|\mathcal{B}| K M_t^3 + M_t^4)$.
Step 7 also consists of sub-steps, the first of which is setting up the $\hat{R}_{kb,n}$ matrix and the second of which is solving (38). It can be shown that setting up $\hat{R}_{kb,n}$ is dominated by $K_b M_t$ computations of $\tilde{L}^{\text{inter}}_{kb}(\boldsymbol{\nu}_{nb})$, which is $O(|\mathcal{B}| K K_b M_t^3)$. As mentioned previously, (38) can be solved in $O(K_b M_t^2)$, so the overall complexity for Step 7 is $O(|\mathcal{B}| K K_b M_t^3 + K_b M_t^2)$. We have now derived the computational complexity of each step in Algorithm 1. Putting them together and keeping the highest-order terms yields a per-iteration complexity of $O(|\mathcal{B}| K K_b M_t^3 + M_t^4)$ for BS $b$, which is dominated by setting up $\hat{R}_{kb,n}$ in Step 7 and inverting $\mathbf{X}_{kb}(\mu_b)$ in Step 6.
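The bisection search on the dual variable mentioned for Step 6 can be sketched generically. The sketch assumes that the total transmit power is a continuous, decreasing function of $\mu_b$ that eventually falls below the budget; the helper names, tolerance, and the scalar toy beamformer are hypothetical and are not the paper's expressions.

```python
def bisect_dual(power_at, p_max, tol=1e-9, max_iter=200):
    """Find the smallest mu >= 0 with power_at(mu) <= p_max.

    power_at: callable returning total transmit power for a given mu,
              assumed decreasing in mu and eventually below p_max.
    """
    if power_at(0.0) <= p_max:
        return 0.0  # power constraint inactive, no penalty needed
    lo, hi = 0.0, 1.0
    while power_at(hi) > p_max:  # grow the bracket until it is feasible
        hi *= 2.0
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if power_at(mid) > p_max:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return hi  # feasible end of the bracket

# Toy scalar stand-in for a beamformer of the form v(mu) = c / (a + mu).
def toy_power(mu, c=2.0, a=1.0):
    return (c / (a + mu)) ** 2

mu = bisect_dual(toy_power, p_max=1.0)
print(mu, toy_power(mu))
```

The number of bisection steps depends only on the tolerance and the initial bracket, which is consistent with treating the search as constant complexity in the analysis above.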
In summary, in this section we motivated and developed resource allocation based on a hybrid leakage- and interference-based performance measure, the SLINR, which serves as a proxy for the SINR when optimizing rates. The SLINR allows for a completely distributed solution like the well-known SLNR, but does not suffer from its drawbacks (discussed in Section III-B). Using the SLINR helps deal with interference while also allowing for effective power control and scheduling. Our next section illustrates the efficacy of the proposed approach.

IV. NUMERICAL RESULTS
In this section, the performance of the proposed algorithm (Algorithm 1) is compared to the state-of-the-art centralized SINR-based algorithm from [4] and to several decentralized algorithms. Evaluations are made for max sum-rate (equal weights) and proportionally fair weighting on the basis of throughput and fairness. Unless otherwise noted, we assume perfect CSI is available.
We simulated a 7-cell environment ($|B| = 7$) with wraparound. For convenience, we implemented hexagonal cells with a BS at the center of each cell. Throughout this paper, a topology refers to a particular placement of users in the cells; these user placements were randomly generated according to a spatially uniform distribution. We model Rayleigh fading channels by choosing $g_{kj,i} \sim \mathcal{CN}(0, I_{M_t})$ and set the large-scale path loss $\beta_{kj,i}$ using the parameters $d_0$ and $\alpha$, which are given, along with the other main system parameters, in Table 1. Each time a resource allocator was run, the (SINR-based) rates achieved by each user were recorded.
For simulations using proportionally fair weights, the weights input to the algorithm for the $(n+1)$th time slot were computed as $w^{(n+1)}_{kb} = 1/\bar{R}_{kb}$, where $\bar{R}_{kb}$ denotes the user's long-term average data rate over an exponentially-decaying window, calculated via a recursive update equation with $\alpha_f$ denoting the forgetting factor. Throughout this section, the proposed algorithm used $\Pr(u_{kb} = 1) = M_t/K_b$, i.e., we assumed all users in neighboring cells were equally likely to be scheduled (see (15)).
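The proportionally fair weighting described above can be sketched as follows. The exact update equation is not reproduced here, so this sketch assumes a common exponentially weighted form, in which the new average is (1 - alpha_f) times the old average plus alpha_f times the latest rate; the function names and the `floor` safeguard against division by zero are hypothetical.

```python
def update_average_rates(avg_rates, inst_rates, forgetting_factor):
    """Assumed exponentially weighted update of each user's long-term average rate."""
    return [
        (1.0 - forgetting_factor) * avg + forgetting_factor * inst
        for avg, inst in zip(avg_rates, inst_rates)
    ]


def proportional_fair_weights(avg_rates, floor=1e-9):
    """Weights for the next time slot: inverse of each user's average rate."""
    # The floor avoids dividing by zero for users that have not been served yet.
    return [1.0 / max(avg, floor) for avg in avg_rates]
```

A user whose average rate falls receives a larger weight in the next slot, which is what pushes the allocator toward serving under-served users.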
Unless otherwise specified, we initialized both the proposed algorithm and the SINR-based algorithm using the best known initialization: the maximum interference-free weighted sum-rate (MIFWSR). We computed the interference-free weighted rate for each user, and chose the $\min(M_t, K_b)$ users with the largest values. For both the proposed algorithm and the SINR-based algorithm, the initial guess for the beamformers (i.e., at the start of each iterative optimization process) was chosen to be matched filtering weights with equal power.
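A minimal sketch of this initialization is given below. The per-user term is not reproduced in the text, so the sketch assumes it takes the form w log2(1 + p ||h||^2 / sigma^2), i.e., a weighted rate with no interference; this form, along with all function and parameter names, is an assumption made for illustration.

```python
import math


def channel_gain(channel):
    """Squared Euclidean norm of a complex channel vector."""
    return sum(abs(h) ** 2 for h in channel)


def interference_free_weighted_rate(weight, channel, power, noise_power):
    """Assumed per-user term: weighted rate in the absence of interference."""
    return weight * math.log2(1.0 + power * channel_gain(channel) / noise_power)


def select_initial_users(weights, channels, num_antennas, power, noise_power):
    """Pick the min(M_t, K_b) users with the largest interference-free weighted rates."""
    num_selected = min(num_antennas, len(channels))
    scores = [
        interference_free_weighted_rate(w, h, power, noise_power)
        for w, h in zip(weights, channels)
    ]
    ranked = sorted(range(len(channels)), key=lambda k: scores[k], reverse=True)
    return sorted(ranked[:num_selected])


def matched_filter_beams(channels, scheduled, total_power):
    """Matched filtering beams, with the power budget split equally across users."""
    per_user_power = total_power / len(scheduled)
    beams = {}
    for k in scheduled:
        scale = math.sqrt(per_user_power / channel_gain(channels[k]))
        beams[k] = [scale * h for h in channels[k]]
    return beams
```

For example, with three users, two antennas, and equal weights, the two users with the strongest channels are selected and each beam carries half of the power budget.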

A. CONVERGENCE
We begin by illustrating the convergence of the SLINR algorithm. Fig. 3 plots the sum-rate convergence of the SLINR-based algorithm and the SINR-based algorithm for a single time slot with identical channels and user placements. The sum-rates shown for the SLINR-based algorithm are the total across all BSs for a fair comparison with the SINR-based algorithm. Importantly, we plot both the true sum-rate convergence and the sum-pseudorate convergence for the SLINR-based algorithm. The local pseudorates are the information that the SLINR-based algorithm has access to, whereas the true rates are what the users experience. We see that the SLINR-based algorithm converges more slowly than the SINR-based algorithm, albeit only by 1 to 2 iterations. The SLINR-based algorithm also converges in a monotonically nondecreasing fashion, both in terms of the true rates and the pseudorates. The sum-pseudorate is quite close to the true sum-rate, suggesting that the SLINR is a reasonable proxy for the SINR. Finally, we note that the true sum-rate converges to a slightly higher value than the sum-pseudorate.

B. LEAKAGE WEIGHTING FACTOR
In this section, we consider a generalization of (17), given in (39), in which the inter-cell leakage is scaled by a nonnegative leakage weighting factor; by varying this factor, we can study the role of the inter-cell leakage in detail.
When the leakage weighting factor is 0, the proposed algorithm is still decentralized but leakage no longer plays a role; as such, we refer to the algorithm as "SLINR (no leakage)" in this case. For the results in this section, we consider a distance of 500 m between adjacent BSs and $K_b = 40$. Results are averaged over 1000 channel realizations on a single topology.
First, we consider the max sum-rate case ($w_{kb} = 1$), which causes the algorithm to focus on the users with the largest instantaneous rates (usually near the cell center). From Fig. 4a, we can see that the throughput of the proposed algorithm is concave in the leakage weighting factor, with a maximum being achieved near the interval [0.4, 0.6]. Performance improves by about 100 Mbps when increasing the factor from 0 to 1; although, in percentage terms, this improvement is not large, it nevertheless demonstrates that penalizing leakage can yield gains.
Fig. 4b helps explain the trend in the sum-rate. The initial increase appears to come from the improvement in the average rates of the cell-center users; however, we can see that increasing the leakage weighting factor worsens the average rates of the mid-cell users, and causes a negligible difference for the cell-edge users (who are rarely scheduled). As such, the improvement in the cell-center users' rates is eventually insufficient to make up for the loss in the rates of the mid-cell users. With a uniform user distribution, there are fewer cell-center users than mid-cell users, hence the eventual decrease in sum-rate as the factor increases.
As for why performance shifts from the mid-cell users to those near the cell center, this can be explained by the fact that as the leakage weighting factor increases, each BS starts accounting for the leakage it causes to other cells. This decreases the estimated performance of the users who are far from the cell center, but has a negligible impact on those near it (since their SLINRs are large, and therefore their pseudorates are relatively insensitive to the introduction of the leakage penalty). For the max sum-rate case we are considering, this effectively de-prioritizes users who are further away from their BS. As we will see next, the story changes when a different weighting scheme is adopted.
Next, we consider the case of proportionally fair weights, which causes the algorithm to prioritize each user in proportion to the inverse of its average rate. With this choice of weights, throughput is not our primary concern; instead we focus on fairness-related measures, such as the WSR shown in Fig. 5a. The decrease in WSR for leakage weighting factors between 0 and 0.4 shows that penalizing small amounts of inter-cell leakage harms the overall performance of the network. One explanation is that in this regime, we are paying for penalizing the best users through our attempt to be leakage-aware; at the same time, we are not sufficiently leakage-aware to reap the rewards. As the factor increases from 0.4 to 1, we recover the initial performance of the network (with a slight gain).
Importantly, we note that Fig. 5b shows a 10% improvement in the 10th percentile rate within this interval, meaning that although the transition from a leakage weighting factor of 0 to a factor of 1 has not improved the WSR drastically, it has improved the rates of the cell-edge users to a noticeable extent. This indicates that the suppression of inter-cell leakage is paying off. Finally, we note that as the factor increases beyond 1, the WSR increases to a maximum of 8.5% higher than at 0, and the 10th percentile rate improves by up to 17%, thus demonstrating the efficacy of the SLINR.
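To make the role of the leakage weighting factor concrete, the following sketch evaluates a simplified, scalar SLINR and the corresponding pseudorate. It does not reproduce (17) or (39); it assumes the generic form in which the signal power is divided by the sum of the weighted inter-cell leakage, the intra-cell interference, and the noise power. All names are hypothetical.

```python
import math


def slinr(signal_power, intra_cell_interference, inter_cell_leakage,
          noise_power, leakage_weight=1.0):
    """Assumed scalar SLINR with a nonnegative weight on the inter-cell leakage."""
    if leakage_weight < 0:
        raise ValueError("leakage_weight must be nonnegative")
    denominator = (leakage_weight * inter_cell_leakage
                   + intra_cell_interference + noise_power)
    return signal_power / denominator


def pseudorate(signal_power, intra_cell_interference, inter_cell_leakage,
               noise_power, leakage_weight=1.0):
    """Pseudorate in bits/s/Hz obtained by using the SLINR in place of the SINR."""
    ratio = slinr(signal_power, intra_cell_interference, inter_cell_leakage,
                  noise_power, leakage_weight)
    return math.log2(1.0 + ratio)
```

Setting `leakage_weight=0` removes the leakage term entirely, which corresponds to the "SLINR (no leakage)" case, while values above 1 penalize leakage more heavily than the unweighted measure.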

C. PERFORMANCE COMPARISONS ACROSS ALGORITHMS
Throughout the rest of our results, we return to the setup described in Table 1.

1) MAX SUM-RATE PERFORMANCE
We begin with a sum-rate performance comparison between our proposed algorithm, the state-of-the-art centralized SINR-based algorithm from [4], and the SLNR-based algorithms from [14] and [15]. The SLNR-based algorithms work in a one-shot fashion: they first determine unit-power beam directions for each user, then compute an SLNR-based priority for each user, and finally determine a schedule greedily. For a fair comparison with our proposed algorithm, we make the SLNR-based algorithms perform two additional steps: (1) repeating the beam calculations after local scheduling, and (2) performing an equal power allocation. These steps are performed locally at each BS without any information exchange.
Fig. 6 shows a network sum-rate comparison of the algorithms based on data gathered over 10 topologies, each with 100 channel realizations. We emphasize that for all cases, the throughput is obtained by summing the true SINR-based rates in (2) after the algorithms are executed. While the (centralized) SINR-based algorithm with the MIFWSR initialization performs the best, the proposed decentralized SLINR-based algorithm is not far behind. Furthermore, our algorithm outperforms the SLNR-based ones, and the SINR-based one with a random initial schedule. The SINR, SLINR and SLNR Max. LB algorithms achieve similar performance; this is as expected, since setting equal weights encourages scheduling the cell-center users (with high signal power).
The proposed algorithm comes within 3.8% of the SINR-based algorithm's throughput; even though this is a performance loss, we believe this slight drawback is outweighed by the important benefit of not needing any information exchange. Furthermore, our proposed algorithm also provides more fairness across users relative to the SINR case (MIFWSR scheduling), with 9% more users in the network achieving at least 1 Mbps.
When comparing our proposed algorithm to the "SLINR (no leakage)" case (recall from Section IV-B that this refers to the SLINR approach with the leakage weighting factor set to 0 in (39)), the results demonstrate that incorporating inter-cell leakage improves both fairness and throughput. Finally, we observe that the SLNR Max. LB algorithm results in the worst fairness.

2) PROPORTIONALLY FAIR PERFORMANCE
We now investigate the performance of the SINR, SLINR and SLNR Max. LB algorithms when proportionally fair weights are used. Fig. 7 presents the cumulative distribution functions (CDFs) for the average user data rates over 1000 channel realizations on a single topology, and Table 3 provides accompanying numerical data (edge rates refer to 10th percentile rates). All performance measures decrease when moving from a centralized scheme to a decentralized one; this is unavoidable since proportionally fair weights require cell-edge users to be scheduled, and explicit interference coordination for these users is not possible in a decentralized setting (thus, their rates suffer). However, we once again notice that by incorporating inter-cell leakage, our proposed algorithm outperforms "SLINR (no leakage)" across all performance measures; in particular, Fig. 7 shows a more desirable rate CDF. As before, we notice the lack of fairness characterizing the SLNR Max. LB algorithm.
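The fairness-related measures quoted in this section (the edge rate, i.e., the 10th percentile rate, and the fraction of users achieving at least a given rate) can be computed from a list of average user rates as sketched below. The nearest-rank percentile definition is an assumption, since the percentile convention is not stated; the function names are hypothetical.

```python
import math


def percentile_rate(rates, percent):
    """Nearest-rank percentile of the average user rates (assumed convention)."""
    ordered = sorted(rates)
    rank = max(1, math.ceil(percent * len(ordered) / 100.0))
    return ordered[rank - 1]


def fraction_at_least(rates, threshold):
    """Fraction of users whose average rate is at least the threshold."""
    return sum(1 for rate in rates if rate >= threshold) / len(rates)
```

For instance, `percentile_rate(rates, 10)` gives the edge rate, and `fraction_at_least(rates, 1.0)` gives the fraction of users achieving at least 1 Mbps when the rates are expressed in Mbps.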

D. CHANNEL ESTIMATION
In this section, we study the max sum-rate performance of the proposed SLINR-based algorithm under imperfect CSI. Pilots were orthogonal within each cell, and reused across cells. Results were obtained by averaging over 30 topologies, each with 400 channel realizations. Fig. 8 shows that with imperfect CSI (and no overhead considered), we suffer a throughput decrease of 11-16%, with the relative loss increasing with $K_b$. A larger performance decrease is observed once the training overhead is factored in; this is because with overhead, the achievable rate must be multiplied by the fraction of time available to transmit the data, i.e., $(\tau_d - \tau_p)/\tau_d$, where $\tau_d = 200$ is the length of the downlink transmission phase, and $\tau_p = K_b$ is the length of the training phase [19]. The sum-rate begins decreasing in $K_b$ after $K_b = 50$, indicating diminishing returns when too many resources are dedicated to the training phase.
Fig. 9 shows the throughput achieved under three different methods for computing the inter-cell leakage, each of which requires a different level of CSI. "Standard CSI Estimation" uses the approach described throughout this paper. "Statistical CSI" uses only the large-scale path loss to compute the inter-cell leakage, and the "Traffic Distribution" method is based on the spatial distribution of users (which is assumed to change slowly). The latter two methods eliminate the need for inter-cell CSI and at the same time offer a competitive data rate according to Fig. 9.
We now explain the reasons for this in detail. To begin, note that the inter-cell channels are weaker than the intra-cell ones (on average) due to the relatively larger associated BS-user distances. As such, the "Standard" method estimates the inter-cell channels with lower accuracy than the intra-cell ones. Next, since we chose to reuse pilots across cells, far users' signals (i.e., inter-cell pilots) get contaminated by near users' signals (i.e., intra-cell pilots). For these reasons, the inter-cell channel estimates obtained by the "Standard" method are of poor quality (i.e., low estimation SINR (8)). The implications of this are twofold. First, (7) shows that as a channel's estimation quality decreases, the magnitude of the estimated channel is attenuated. When this is the case for all the inter-cell channels, it causes the inter-cell leakage value (15) to be attenuated similarly. Second, (10) shows that as a channel's estimation quality decreases, the error covariance approaches a diagonal matrix whose entries are the channel's large-scale fading coefficient (since we assume spatially uncorrelated channels). When this is the case for all the inter-cell channels, the estimation error for the inter-cell leakage (20) under the "Standard" method approaches the expression in (40). Notice how (40) is identical to the inter-cell leakage value (15) computed under the "Statistical CSI" method.
In summary, these two points show how, as the inter-cell channel estimation quality worsens under the "Standard" method, the inter-cell leakage value reduces and the estimation error for this quantity approaches a weighted sum of the large-scale fading coefficients. This is the fundamental reason why similar performance is obtained when we replace our inter-cell channel estimates with long-term information (such as the large-scale fading coefficients or traffic distribution), which considerably improves the algorithm's practicality and scalability.
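The two effects described above can be illustrated with a textbook scalar MMSE estimator. This is not the estimator in (7) and (10); the sketch assumes a single channel coefficient h distributed as CN(0, beta) observed as y = sqrt(rho) h + n with unit-variance noise, where `est_snr` (rho) is a hypothetical stand-in for the estimation quality. Under these assumptions the estimate has variance rho beta^2 / (1 + rho beta) and the error has variance beta / (1 + rho beta).

```python
def mmse_estimate_variance(beta, est_snr):
    """Variance of the MMSE channel estimate under the assumed scalar model."""
    return est_snr * beta ** 2 / (1.0 + est_snr * beta)


def mmse_error_variance(beta, est_snr):
    """Variance of the MMSE estimation error under the assumed scalar model."""
    return beta / (1.0 + est_snr * beta)
```

As `est_snr` tends to zero, the estimate variance tends to zero (the estimated channel is attenuated) while the error variance tends to `beta` (the large-scale fading coefficient), mirroring the two points made in the text.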

E. HEAT MAPS
This section presents heat maps to visualize the spatial distribution of rates and scheduling decisions. Data was collected using 80 users per cell ($K_b = 80$). White spots on the heat maps correspond to locations where no user was ever placed. The x-axis and y-axis denote position in metres. A logarithmic color scale is used to reveal features that are difficult to see on a linear scale (e.g., a color mapping to the value 2 on the color bar should be thought of as $10^2$).
Fig. 10 helps visualize the spatial distribution of rates throughout the network. Comparing the max sum-rate and proportionally fair cases, we see that the highest rates are tightly concentrated around the cell center in the max sum-rate case (as expected), whereas they are more "spread out" in the proportionally fair case. In particular, we see that the SLINR-based algorithm is the most spread out for the max sum-rate case, which visualizes the observation made in Section IV-C1 that the SLINR-based algorithm provides more fairness across users for this case. For the proportionally fair heat maps, we see that all algorithms have "hot spots" at the cell center, but only "cool off" slightly as we approach the cell edge. The SLINR-based algorithm has a slightly "cooler" edge than the SINR-based algorithm, which, as discussed in Section IV-C2, is due to its lack of interference coordination.
Fig. 11 visualizes the spatial distribution of scheduling decisions. Comparing the left side to the right side, we see that much like the rates, the scheduled users are highly concentrated around their BS for the max sum-rate case, whereas they are "spread out" in the proportionally fair case.
For the proportionally fair heat maps, we see that the SLINR-based algorithm schedules cell-edge users most frequently, and schedules users less frequently the closer they are to the center of the cell. Since the users at the cell edge tend to have smaller rates than those at the cell center, this seems like a reasonable strategy; what may be surprising is that the edge rates for the SLINR-based algorithm are much lower than those of the SINR-based algorithm, which schedules users at the cell edge less frequently. The explanation is that since the SLINR-based algorithm is decentralized, when a BS schedules edge users, it causes significant amounts of interference to edge users scheduled in neighboring cells (since the BSs do not coordinate). Even incorporating inter-cell leakage can only combat this to a certain extent.
Furthermore, we see faint rings in each cell for the case of the SINR-based algorithm. This indicates that it focuses on neither the cell-edge users nor cell-center users, but rather, the users in between these two extremes. Interestingly, the heat map of rates presented in Fig. 10 does not show such a ring. The explanation is that by centrally coordinating resource allocation, the SINR-based algorithm will only schedule cell-edge users when it knows they can achieve high rates; this way, they do not have to be scheduled frequently. Similarly, the users in the cell center are scheduled relatively infrequently since they tend to have excellent channels, and therefore experience very high rates each time they are scheduled. However, the rest of the users between these two extremes are far enough away from their BS that they do not experience good channels all the time, and are far enough away from the cell edge that interference coordination does not protect them; hence, they need to be scheduled more often to achieve good average rates.
In summary, both algorithms have a spatial region corresponding to users who are frequently scheduled, but achieve rates that do not reflect this. This region is the cell edge for the SLINR-based algorithm, and it is a ring near the middle of the cell for the SINR-based algorithm. These regions are where interference suppression is the least effective.
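Heat maps of the kind discussed in this section can be produced by binning user positions on a grid and plotting a logarithm of the per-bin value. The sketch below assumes that each bin holds the mean of the rates observed at positions falling inside it and that the color value is the base-10 logarithm of that mean; the binning rule and all names are assumptions made for illustration.

```python
import math


def log_heat_map(positions, rates, bin_size):
    """Map each visited grid bin to log10 of the mean rate observed in it."""
    sums = {}
    counts = {}
    for (x, y), rate in zip(positions, rates):
        # Positions are in metres; bins are squares with side length bin_size.
        cell = (math.floor(x / bin_size), math.floor(y / bin_size))
        sums[cell] = sums.get(cell, 0.0) + rate
        counts[cell] = counts.get(cell, 0) + 1
    heat = {}
    for cell, total in sums.items():
        mean_rate = total / counts[cell]
        # Bins with zero mean rate are skipped because the logarithm is undefined.
        if mean_rate > 0:
            heat[cell] = math.log10(mean_rate)
    return heat
```

Bins that never receive a user are simply absent from the returned dictionary, which corresponds to the white spots on the heat maps; a returned value of 2 corresponds to a mean of 100 in the original units.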

V. CONCLUSION
In this paper, we studied downlink resource allocation for a multi-cell MU-MIMO network. Although resource allocation significantly improves network performance, the schemes available in the literature are largely centralized and incur a significant overhead, especially in terms of CSI estimation and exchange. The focus here was on developing an effective decentralized resource allocation algorithm, building on the well-accepted concept of leakage. In this regard, we motivated the development of a hybrid performance measure, the SLINR, which retains attractive qualities of both the SLNR and the SINR; specifically, it accounts for the impact on other cells through the inter-cell leakage, and is amenable to optimal local scheduling through the intra-cell interference. Furthermore, it can be computed without any information exchange across BSs.
Using the SLINR to design an effective resource allocation algorithm required the derivation of a new approach. The results presented in Section IV compared the performance of the proposed SLINR-based algorithm to other decentralized algorithms, and the centralized SINR-based algorithm from [4]. We showed that the proposed algorithm outperforms the decentralized ones, and performs within 3.8% of the centralized one. Our algorithm converges slightly slower than the centralized one, but does so in a monotonically nondecreasing fashion. It also performs well under imperfect CSI, provided the fraction of time spent on training is limited, and interestingly, its performance is robust under limited CSI when computing the inter-cell leakage. Our results concluded with heat maps showing the spatial distributions of rates and scheduling decisions throughout the network, and a discussion relating these to the previous performance analyses.
There are various directions to consider for future work. For example, incorporating quality-of-service constraints into the resource allocator would be a timely topic, given the heterogeneous communication requirements of users in 5G networks. Evaluating performance under a shadowing model, to mimic outdoor scenarios with realistic power variations, would also likely be of practical interest. Similarly, alternative network scenarios could be investigated, such as heterogeneous cell shapes or BS transmit power constraints. Finally, since each user's achieved rate (and therefore SINR) is available to its BS as a measured feedback signal, it may be worth investigating how this information could be used to improve the resource allocation decisions. For example, if it could be exploited to improve interference suppression at the cell edge, then performance could be improved while maintaining our requirement of full decentralization, which would be a valuable contribution.