Blockage-Aware Reliable mmWave Access via Coordinated Multi-Point Connectivity

The fundamental challenge of the millimeter-wave (mmWave) frequency band is the sensitivity of the radio channel to blockages, which gives rise to unstable connectivity and impacts the reliability of a system. To this end, multi-point connectivity is a promising approach for ensuring the desired rate and reliability requirements. A robust beamformer design is proposed to improve the communication reliability by exploiting the spatial macro-diversity and a pessimistic estimate of rates over potential link blockage combinations. Specifically, we provide a blockage-aware algorithm for the weighted sum-rate maximization (WSRM) problem with parallel beamformer processing across distributed remote radio units (RRUs). Combinations of non-convex and coupled constraints are handled via the successive convex approximation (SCA) framework, which admits a closed-form solution for each SCA step by solving a system of Karush-Kuhn-Tucker (KKT) optimality conditions. Unlike conventional coordinated multi-point (CoMP) schemes, the proposed blockage-aware beamformer design has a per-iteration computational complexity in the order of the number of RRU antennas instead of the number of system-wide joint transmit antennas. This leads to a practical and computationally efficient implementation that is scalable to any arbitrary multi-point configuration. In the presence of random blockages, the proposed schemes are shown to significantly outperform baseline scenarios and result in reliable mmWave communication.


Millimeter-wave (mmWave) communication is under investigation for the upcoming 5th-generation (5G) New Radio (NR) and beyond cellular systems [1], [2]. The mmWave frequency band not only provides relatively large system bandwidth but also allows for packing a significant number of antenna elements for highly directional communication [1], which is important to ensure link availability as well as to control interference in dense deployments [2]. Hence, mmWave mobile communication is anticipated to substantially increase the average system throughput. However, the fundamental challenge is the sensitivity of the mmWave radio channel to blockages due to reduced diffraction and higher path and penetration loss [3], [4]. These lead to rapid degradation of signal strength and give rise to unstable and unreliable connectivity. For example, a mobile human blocker can obstruct the dominant paths for hundreds of milliseconds, which normally disconnects the communication session [3], [4]. On the other hand, finding an alternate unblocked direction causes critical latency overheads. Hence, the presence of such frequent and long-duration blockages significantly reduces the experienced quality-of-service (QoS) [4]. To overcome such challenges, the use of coordinated multi-point (CoMP) schemes, where the users are concurrently connected to multiple remote radio units (RRUs), is highly useful for providing more robust and resilient communication [5]-[15]. Therefore, it is envisioned that multi-connectivity schemes that utilize the multi-antenna spatial redundancy via geographically separated transceivers will be of high importance in future mmWave systems [16].

A. Prior Work
CoMP transmission and reception are typically used to increase the system throughput, particularly for cell-edge users, which suffer from a relatively long distance to the serving base station (BS) and adverse channel conditions (e.g., higher path loss and interference from neighboring BSs). Such scenarios have been widely studied over the past decade in the context of 4th-generation (4G) systems [5]-[9]. Techniques such as joint transmission (JT), coordinated beamforming (CB), and dynamic point selection (DPS) were standardized in the 3rd Generation Partnership Project (3GPP) and were widely studied in Long Term Evolution-Advanced (LTE-A) to enhance capacity and coverage by efficiently utilizing the spatially separated transceivers [8]. For example, it is shown in [9] that JT-CoMP increases the coverage by up to 17% for general users and 24% for cell-edge users compared to the non-cooperative scenario.
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
Recent studies have considered the deployment of CoMP in the mmWave frequencies [10]-[15]. CoMP is also considered in 3GPP for upcoming 5G NR and beyond mmWave-based cellular systems [16]. In [10], [11], the authors showed a significant coverage improvement by simultaneously serving a user with spatially distributed transmitters. Results were drawn from extensive real-time measurements at 73 GHz in the urban open square scenario. The network coverage gain for the mmWave system with multi-point connectivity, in the presence of random blockages, was also confirmed in [12], [13] using stochastic geometry tools. The work in [14] proposed a low-complexity cooperation technique for JT, wherein a subset of cooperating BSs is obtained by selecting the strongest BS in each tier. The authors also investigated the impact of blockage density in a heterogeneous multi-tier network.
Similarly to earlier works on single-cell two-stage hybrid analog-digital beamforming design, e.g., in [17], [18], the authors in [15] considered a multi-user massive multiple-input multiple-output (MU-MIMO) system with JT-CoMP processing, where a high-dimensional analog beamformer is followed by a low-dimensional centralized digital baseband precoder. However, the CoMP techniques in [5]-[15] were still devised with the sole scope of enhancing the capacity and coverage by utilizing the spatially separated transceivers. Thus, they were not originally designed for the stringent reliability requirements of, e.g., industrial-grade critical applications.
It is well known that a system can provide any level of reliability by sequential data transmission, i.e., by retransmitting the same message at various protocol levels, until a receiver acknowledges correct reception over a dedicated feedback channel [19]. However, in the presence of random link blockages, high penetration and path-loss, mmWave feedback links are inherently unreliable and, hence, they require redundant retransmissions. On the other hand, allowable latency dictates a strict upper limit on the number of retransmissions [20].
The loss of connection in the mmWave communication is mainly due to a sudden blockage of the dominant links, generally caused by abrupt mobility, self-blockage or external blockers [3], [4]. Accurate estimation of each blocker requires precise environment mapping and frequent channel state information (CSI) acquisition, which might result in significant coordination overhead. Furthermore, blockage events can create large latencies if a passive hand-off is inevitable [20]. Thus, the limitations of retransmission events and the difficulty of accurate estimation of random blocking events motivate us to develop more robust and resilient downlink transmission strategies that can retain stable connectivity under the uncertainties of mmWave channels and random blockages.

B. Contributions
Motivated by the above concerns, we propose a robust beamforming design for the JT-CoMP, which improves the sum-rate while retaining stable and resilient connectivity for mmWave mobile access in the presence of random blockers.
The key contributions of this paper include:
• A blockage-aware beamformer design with a strong emphasis on system reliability is provided by exploiting multi-antenna spatial diversity and CoMP connectivity. The weighted downlink sum-rate is maximized, where, for each user, a pessimistic estimate of the achievable rate over all possible combinations of potentially blocked links among the cooperating RRUs is considered. Managing a large set of link blockage combinations is considerably more difficult than conventional constrained optimization [7], [15], [21]-[25] due to the mutually coupled signal-to-interference-plus-noise ratio (SINR) constraints. The preemptive modeling of the serving set over the potential link blockage combinations is shown to greatly improve the system outage performance while ensuring user-specific rate and reliability requirements.
• A successive convex approximation (SCA) based beamforming algorithm is provided for the original non-convex and computationally challenging problem. More specifically, all coupled and non-convex constraints are conservatively approximated with a sequence of convex subsets and iteratively solved until convergence. The underlying subproblems, for each SCA iteration, become second-order cone programs (SOCPs), which are efficiently solvable by standard off-the-shelf solvers.
• A low-complexity robust beamformer design framework is proposed that merges the SCA with dual [26] and best response [24] methods to admit parallel beamformer processing for the distributed RRUs via iterative evaluation of the closed-form Karush-Kuhn-Tucker (KKT) optimality conditions. The schemes proposed in [21], [24] cannot be used directly, so the proposed KKT based solution goes beyond them and provides an approach for solving mutually coupled minimum SINR constraints. This leads to a practical, latency-conscious, and computationally efficient implementation for the cloud edge architecture.
• A detailed implementation of the proposed methods is provided assuming a digital beamforming architecture. Moreover, for completeness, a low-complexity two-stage hybrid analog-digital beamforming implementation is introduced in the numerical section. As a result, the proposed methods are scalable to any arbitrary multi-point configuration and dense deployments.
Finally, numerical examples are presented to quantify the complexity and the performance advantages of the proposed solutions in terms of achievable sum-rate and reliable connectivity.
The paper is an extended version of our previously published conference papers [27], [28]. In [27], we studied beamformer design for the weighted sum-rate maximization (WSRM) problem leveraging the SCA framework, while in [28] an iterative KKT based solution was provided. Compared to the previous work [27], [28], we have included the following additional contributions that provide more complete coverage and analysis. In this paper, we consider a more practical, spatially correlated and distance-dependent blockage model. Specifically, the presence and/or absence of blockage on the links of different RRUs depends on the blockage density, the RRU locations, and the resulting distances to the users. In addition, we provide a closed-form upper bound on the outage performance. We further provide a detailed complexity analysis for the centralized SCA based solution and the iterative KKT based solution. We also provide extensive simulations of the rate of convergence and of the impact of different feasible initializations on it. Moreover, the proposed solutions are extended to provide low-complexity two-stage hybrid analog-digital beamforming. These are scalable to any arbitrary multi-point configuration and dense deployments. Finally, we have extended the simulation model to take into account user-centric clustering, i.e., we have included the user-RRU association, which results in partially overlapping user-centric clusters and thus leads to inter-cluster and intra-cluster interference conditions. Using this model, we have provided a more extensive set of simulations to illustrate the effectiveness of the proposed methods in terms of achievable sum-rate and reliable mmWave connectivity.

C. Organization and Notations
The remainder of this paper is organized as follows. In Section II, we describe the system, channel, and blockage models and formulate the problem. Section III provides a theoretical analysis of blockage and evaluates the rate and reliability trade-off. In Section IV, we describe the robust beamformer designs. Numerical results validating the proposed methods are presented in Section V, and conclusions are given in Section VI.
Notations: In the following, we represent matrices and vectors with boldface uppercase and lowercase letters, respectively. The transpose, conjugate transpose, and inverse operations are represented with the superscripts $(\cdot)^{\mathrm{T}}$, $(\cdot)^{\mathrm{H}}$, and $(\cdot)^{-1}$, respectively. $|\mathcal{X}|$ indicates the cardinality of a set $\mathcal{X}$. $\Re\{\cdot\}$ and $|\cdot|$ represent the real part and norm of a complex number, respectively. $\mathbf{I}_N$ indicates the $N \times N$ identity matrix. $\mathbb{C}^{M \times N}$ denotes the set of $M \times N$ matrices with elements in the complex field. $[\mathbf{a}]_n$ is the $n$th element of $\mathbf{a}$. Finally, $\nabla_{\mathbf{x}} y(\mathbf{x})$ denotes the gradient of $y(\cdot)$ with respect to the variable $\mathbf{x}$.

A. System Model
We consider downlink transmission in a mmWave-based multi-user multiple-input single-output (MU-MISO) communication system consisting of $K$ single-antenna users served by $B$ RRUs. Each RRU is equipped with $N_t$ transmit antennas arranged in a uniform linear array (ULA). The antennas have 0 dBi gain and $x = \lambda/2$ spacing between any two adjacent elements, where $\lambda$ is the wavelength of the carrier frequency. We define $\mathcal{B} = \{1, 2, \ldots, B\}$ to be the set of all RRU indices, $\mathcal{K} = \{1, 2, \ldots, K\}$ denotes the set of active users, and the serving set of RRUs for each user $k$ is represented by $\mathcal{B}_k \subseteq \mathcal{B}$ for all $k \in \mathcal{K}$. We study JT-CoMP transmission, whereby each active user $k$ receives a coherently synchronized signal from all the RRUs in $\mathcal{B}_k$. Furthermore, the downlink transmissions are performed using the same frequency and time resources. In this paper, if not mentioned otherwise, we assume by default a case where each antenna is connected to a dedicated radio frequency (RF) chain and baseband circuit, which enables fully digital signal processing. In addition, we provide an implementation for a two-stage hybrid analog-digital beamforming architecture with coarse-level analog beamforming followed by less complex digital precoding. Finally, we assume a cloud (or centralized) radio access network (C-RAN) architecture, wherein all RRUs are connected to the edge cloud by high-bandwidth and low-latency fronthaul links, as illustrated in Fig. 1.
Note that, in the C-RAN architecture, a common baseband processing unit (BBU) performs all the digital signal processing functionalities in a centralized manner, while the RRUs implement limited radio operations [29]. Such fully centralized baseband processing provides more efficient RRU coordination and thus enables a more effective implementation of JT-CoMP scenarios [29]. However, in practice, fronthaul link capacity and signaling overhead will limit the maximum number of coordinating RRUs for each user. Furthermore, perfect estimation of the available CSI is assumed at the BBU for the downlink beamformer design and resource allocation, whereby each RRU receives information for the active users, such as control and data signals, over the fronthaul links.
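The received signal of the $k$th user, $y_k$, can be expressed as
$$y_k = \sum_{b \in \mathcal{B}_k} \mathbf{h}_{b,k}^{\mathrm{H}} \mathbf{f}_{b,k} s_k + \sum_{u \in \mathcal{K} \setminus \{k\}} \sum_{b \in \mathcal{B}_u} \mathbf{h}_{b,k}^{\mathrm{H}} \mathbf{f}_{b,u} s_u + w_k, \tag{1}$$
where $\mathbf{h}_{b,k} \in \mathbb{C}^{N_t \times 1}$ is the channel between a RRU-user pair $(b, k)$, $w_k \sim \mathcal{CN}(0, \sigma_k^2)$ is circularly symmetric additive white Gaussian noise (AWGN) with power spectral density (PSD) of $\sigma_k^2$, and $s_k$ is a normalized and independent data symbol, i.e., $\mathbb{E}\{|s_k|^2\} = 1$ and $\mathbb{E}\{s_k s_u^*\} = 0$ for all $k, u \in \mathcal{K}$ with $k \neq u$. In expression (1), $\mathbf{f}_{b,k} \in \mathbb{C}^{N_t \times 1}$ represents the portion of the joint beamformer between a RRU-user pair $(b, k)$, designed by the centralized BBU assuming perfect estimation of the available CSI. The received SINR for each user $k$ can be expressed as
$$\Gamma_k(\mathcal{B}_k) = \frac{\left| \sum_{b \in \mathcal{B}_k} \mathbf{h}_{b,k}^{\mathrm{H}} \mathbf{f}_{b,k} \right|^2}{\sigma_k^2 + \sum_{u \in \mathcal{K} \setminus \{k\}} \left| \sum_{b \in \mathcal{B}_u} \mathbf{h}_{b,k}^{\mathrm{H}} \mathbf{f}_{b,u} \right|^2}. \tag{2}$$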
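As a rough numerical illustration of the received SINR under coherent joint transmission, the following Python sketch evaluates the ratio of the coherently combined desired signal power to noise plus inter-user interference. It uses only the standard library; the dictionary layout (`H[(b, k)]`, `F[(b, u)]`, `serving[u]`) and the function names are assumptions made for this example and are not taken from the paper.

```python
def inner(h, f):
    """Return h^H f for two equal-length complex vectors (lists)."""
    return sum(hi.conjugate() * fi for hi, fi in zip(h, f))


def sinr(k, H, F, serving, noise_var):
    """SINR of user k when every RRU in serving[k] transmits coherently.

    H[(b, k)]  : channel vector from RRU b to user k (hypothetical layout)
    F[(b, u)]  : beamformer used at RRU b for user u (hypothetical layout)
    serving[u] : set of RRU indices that serve user u
    """
    # Desired signal: contributions of all serving RRUs add up in amplitude.
    signal = abs(sum(inner(H[b, k], F[b, k]) for b in serving[k])) ** 2
    # Interference: for every other user u, the RRUs serving u also add up
    # coherently at user k.
    interference = sum(
        abs(sum(inner(H[b, k], F[b, u]) for b in serving[u])) ** 2
        for u in serving if u != k
    )
    return signal / (noise_var + interference)


# Toy example: 2 RRUs, 2 users, one antenna per RRU.
serving = {0: {0, 1}, 1: {1}}
H = {(0, 0): [1 + 0j], (1, 0): [1 + 0j], (0, 1): [0j], (1, 1): [1 + 0j]}
F = {(0, 0): [1 + 0j], (1, 0): [1 + 0j], (0, 1): [0j], (1, 1): [2 + 0j]}
for user in serving:
    print(user, sinr(user, H, F, serving, noise_var=1.0))
```

In the toy example, user 0 is served by both RRUs, so its two unit-gain links combine in amplitude before squaring, which is the macro-diversity gain that the design later protects against blockage.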

B. Channel Model
Due to higher penetration and path loss, and reduced scattering and diffraction, at mmWave frequencies compared to the sub-6 GHz frequency band, the channel can be considered to be spatially sparse [1], [30], with line-of-sight (LoS) being the dominant path that mainly contributes to the communication [3], [4]. Thus, an unblocked LoS link is highly desirable in order to initiate and maintain stable mmWave communication. The channel model in this paper is based on a sparse geometric model [31], [32], which is widely adopted in studies related to mmWave signal processing. It is customarily used to model mmWave radio channels, as it accurately accounts for the high free-space path loss and low-rank nature of the mmWave radio signal [17], [18], [33]. Specifically, we consider $M_{b,k}$ paths for the channel $\mathbf{h}_{b,k}$ between RRU $b$ and user $k$, which is expressed as in (3), where $\phi^1_{b,k}$ and $\phi^m_{b,k}$ denote the angle-of-arrival (AoA) at the $k$th user for the LoS and the $m$-th non-LoS (NLoS) path, respectively. Note that the AoA for each NLoS path $m > 1$ is assumed to be uniformly distributed, i.e., $\phi^m_{b,k} \in [-\pi/2, \pi/2]$, whereas the LoS AoA $\phi^1_{b,k}$ is related to the actual position of the RRU-user pair $(b, k)$ [34]. Finally, the gain of each path $m$ is a random complex variable with zero mean and unit variance, $d_{b,k}$ is the RRU-user distance, and the LoS and NLoS links have separate path-loss exponents, with $\zeta$ denoting the NLoS exponent. It has been shown empirically that $\zeta$ is much higher than the LoS path-loss exponent [3], [10], [11]. The transmit array steering vector of the ULA for angle $\phi$ is defined in (4).
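The structure of such a sparse geometric channel can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the unit-norm steering vector convention, the amplitude scaling $d^{-\text{exponent}/2}$ per path, and all function and parameter names (`ula_steering`, `sparse_channel`, `pl_los`, `pl_nlos`) are choices made for this example, and the paper's exact normalization may differ.

```python
import cmath
import math
import random


def ula_steering(n_t, phi):
    """ULA response for half-wavelength spacing (assumed unit-norm convention)."""
    return [cmath.exp(1j * math.pi * n * math.sin(phi)) / math.sqrt(n_t)
            for n in range(n_t)]


def sparse_channel(n_t, dist, phi_los, n_paths, pl_los, pl_nlos, rng,
                   los_blocked=False):
    """One LoS path (index 0) plus n_paths - 1 NLoS paths.

    Each path has a zero-mean, unit-variance complex Gaussian gain and an
    amplitude decay dist ** (-exponent / 2); both are assumptions of this sketch.
    """
    h = [0j] * n_t
    for m in range(n_paths):
        is_los = (m == 0)
        if is_los and los_blocked:
            continue  # a blocked LoS component contributes nothing
        # LoS angle follows the geometry; NLoS angles are uniform in [-pi/2, pi/2].
        phi = phi_los if is_los else rng.uniform(-math.pi / 2, math.pi / 2)
        exponent = pl_los if is_los else pl_nlos
        gain = complex(rng.gauss(0.0, math.sqrt(0.5)),
                       rng.gauss(0.0, math.sqrt(0.5)))
        scale = gain * dist ** (-exponent / 2)
        a = ula_steering(n_t, phi)
        h = [hi + scale * ai for hi, ai in zip(h, a)]
    return h


rng = random.Random(0)
h = sparse_channel(n_t=8, dist=50.0, phi_los=0.3, n_paths=4,
                   pl_los=2.0, pl_nlos=4.0, rng=rng)
print(len(h))
```

Because the NLoS exponent is larger than the LoS exponent, the NLoS paths in this sketch are much weaker than the LoS path at the same distance, which mirrors the LoS-dominated behavior described above.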

C. Blockage Model
In mmWave communication, the quality of the wireless link between a RRU-user pair mainly depends on the characteristics of the LoS path [3], [4]. The NLoS paths in the mmWave radio channel are typically 20-30 dB weaker than the dominant LoS path [4]; thus, high data rates are difficult to achieve in NLoS-only transmission. On the other hand, a major challenge for LoS-dominated communication stems from the fact that LoS links may easily be blocked by obstacles. This may result in an intermittent connection, which severely impacts the quality of user experience. Channel measurements in typical mmWave scenarios have demonstrated that outage on a mmWave link occurs with 20%-60% probability [30], which may lead to an over 10-fold decrease in the achievable rate [35]. Therefore, unless addressed properly, blockage is the main bottleneck hindering the full exploitation of the mmWave channel.
In order to characterize the aforementioned uncertainties of the mmWave radio channel, we consider a probabilistic blockage model, where the link-specific blockage depends on the blockage density and the distance between each RRU-user pair [36], [37]. For simplicity, we assume independent blockage events per link, only for the dominant LoS path, while all the NLoS links are assumed to be unobstructed. More specifically, the channel between any typical RRU-user pair can be either fully available or in the NLoS state. The NLoS state occurs when the dominant LoS link is blocked by an obstacle. It should be noted that even a mobile human blocker may cause 20-30 dB attenuation and can obstruct the LoS path for hundreds of milliseconds [3], [4]. This can be equivalently modeled as $\{\mathbf{h}^{\mathrm{LoS}}_{b,k} = \mathbf{0}\}_{b \in \mathcal{B}, k \in \mathcal{K}}$ for the blocked LoS component. The fully available state is defined in (3).
Since the blockers are completely random, their position and orientation cannot be known a priori in a dynamic mobile environment. Similar to [36], [37], the blockage between RRU-user pair $(b, k)$ is defined using a Bernoulli random variable with the parameter $\eta$, such that the dominant LoS component is blocked with probability
$$\Pr\{\mathbf{h}^{\mathrm{LoS}}_{b,k} = \mathbf{0}\} = 1 - e^{-\eta d_{b,k}}, \tag{5}$$
where $\eta$ depends on the density and the average size of the obstacles blocking the dominant LoS path. In addition, the probability of a LoS link decreases exponentially with the increase in distance between the RRU and the user [36], [37]. From the physical standpoint, $\eta$ can be interpreted as the LoS likelihood for a given propagation environment and distance [36]. For example, the smaller the $\eta$ value, the sparser the propagation environment and, consequently, the higher the probability of a LoS link at a given distance, and vice versa. In our study, we use the parameter $\eta$ to analyze the effect of different blockage densities on the system performance. On the other hand, for a fixed blockage density $\eta$ (which is common for all RRUs in the system), the LoS blockage is only subject to the link distance $d_{b,k}$ between any typical RRU-user pair for all $b \in \mathcal{B}$ and $k \in \mathcal{K}$. Thus, the presence and/or absence of the dominant LoS path is subject to the blockage density and the distance between the RRUs and the users.
We assume standard time division duplex (TDD) based CSI acquisition from the reciprocal uplink, followed by the downlink data transmission phase. More specifically, the BBU designs the transmit beamformer based on the available CSI acquired during the (uplink) estimation phase (3). Thus, a system can be in outage if the dominant LoS link was available during the channel estimation and is no longer available during the data transmission phase due to random blockage (e.g., due to channel aging or blockage effects). Similarly, a LoS link can also be in the blockage state during the channel estimation phase. However, these links will not be included for the downlink data transmission. As a consequence, the actual achievable rate at the receiver would be larger than the rate assigned to the users. Thus, from the reliability perspective, we have to consider the case when the channel is available during the estimation phase but blocked during the data transmission phase, which is not known at the BBU for the downlink beamformer design.
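The distance-dependent blockage model can be illustrated with a short Python sketch that draws one independent blockage state per LoS link, using the blockage probability $1 - e^{-\eta d_{b,k}}$ stated in the text. The function names and the dictionary keyed by `(b, k)` are assumptions made for this example.

```python
import math
import random


def blockage_probability(eta, dist):
    """Probability that the LoS path of a link of length `dist` is blocked."""
    return 1.0 - math.exp(-eta * dist)


def sample_los_blockage(distances, eta, rng):
    """Draw an independent Bernoulli blockage state for every link.

    distances : dict mapping (b, k) to the RRU-user distance (hypothetical layout)
    returns   : dict mapping (b, k) to True if the LoS path is blocked
    """
    return {link: rng.random() < blockage_probability(eta, d)
            for link, d in distances.items()}


rng = random.Random(7)
distances = {(0, 0): 20.0, (1, 0): 60.0, (2, 0): 120.0}
for eta in (0.0, 0.01, 0.05):
    probs = [round(blockage_probability(eta, d), 3) for d in distances.values()]
    print(eta, probs, sample_los_blockage(distances, eta, rng))
```

A state sampled in this way would be applied after the beamformers have been designed, so that a link seen as available during channel estimation can turn out to be blocked during data transmission.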

D. Problem Formulation
The main goal of this work is to develop a robust and resilient downlink transmission strategy that can retain stable connectivity under the uncertainties of the mmWave radio channel and random blockages. To this end, we need to compute the optimal joint transmit beamformer $\mathbf{F} = [\mathbf{f}_{1,1}, \mathbf{f}_{1,2}, \ldots, \mathbf{f}_{B,K}]$ while exploiting the multi-antenna spatial diversity and CoMP connectivity for improved system-level reliability. For the WSRM objective considered in this paper, the beamformer design can be formulated as
$$\begin{aligned} \underset{\mathbf{F},\, \gamma_k}{\text{maximize}} \quad & \sum_{k \in \mathcal{K}} \delta_k \log(1 + \gamma_k) && \text{(6a)} \\ \text{subject to} \quad & \gamma_k \leq \Gamma_k(\mathcal{B}_k^c), \quad \forall k \in \mathcal{K},\ \forall c, && \text{(6b)} \\ & \sum_{k \in \mathcal{K}} \|\mathbf{f}_{b,k}\|^2 \leq P_b, \quad \forall b \in \mathcal{B}, && \text{(6c)} \end{aligned}$$
where $\delta_k \geq 0$ ∀k denotes the user-specific priority weights corresponding to the achievable rate, which are fixed before data transmission (e.g., by the BBU). The function $\Gamma_k(\cdot)$ is defined in expression (2). For each user $k$, the constraint (6b) is the pessimistic estimate of the SINR computed over all possible subset combinations of potentially available RRUs, each of size $|\mathcal{B}_k^c| \geq L_k$, from its serving set $\mathcal{B}_k^c\ (\subseteq \mathcal{B}_k)$, e.g., by excluding the blocked RRUs, as will become clear in the following. The total transmit power of the $b$th RRU is bounded by $P_b$, as in (6c).
In practice, adverse channel conditions and signaling overhead limit the maximum number of cooperating RRUs for each user (i.e., $\mathcal{B}_k$ ∀k) [29]. Thus, the number of subset combinations $\mathcal{B}_k^c\ (\subseteq \mathcal{B}_k)$ is fairly small for modestly sized systems. The resulting problem (6) is intractable due to the non-convex and coupled SINR constraints (6b). To this end, in Section IV, we provide practical and computationally efficient iterative algorithms by exploiting convex approximation techniques.

III. ANALYSIS OF RATE AND RELIABILITY TRADE-OFF
In this paper, we assume randomly distributed blockers. Thus, the position of each blocker and/or blockage event is completely unknown. Therefore, to improve system reliability and avoid outage under the uncertainties of the mmWave radio channel, we preemptively underestimate the achievable SINR, assuming that a portion of the available CoMP links will be blocked during the data transmission phase. This is specifically required in mmWave systems because dynamic blockages cannot be tracked during the channel estimation phase. Suppose the BBU assumes that each user $k$ has at least $L_k$ available links (i.e., unblocked RRUs). Then, the BBU proactively models the SINR over all possible subset combinations, by excluding the potentially blocked RRUs, and allocates the rate to the users such that transmission reliability is improved (i.e., the outage due to blockages that appear during the data transmission phase is minimized).
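For example, referring to expression (6b) and Fig. 1, let $\mathcal{B}_k = \{1, 2, 3, 4\}$ be the set of RRUs that serve the $k$th user. Then, with the assumption of at least $L_k = 3$ available links, the serving set of unblocked RRUs available to the $k$th user can be any one of the following combinations:
$$\hat{\mathcal{B}}_k = \big\{\{1,2,3\}, \{1,2,4\}, \{1,3,4\}, \{2,3,4\}, \{1,2,3,4\}\big\}.$$
Equivalently, the subset combinations of all blocked RRUs $\mathcal{D}_k$ for the $k$th user can be expressed as
$$\mathcal{D}_k = \big\{\{4\}, \{3\}, \{2\}, \{1\}, \emptyset\big\}.$$
Let $C(L_k)$ denote the cardinality of the set $\hat{\mathcal{B}}_k$, defined as
$$C(L_k) = \sum_{l = L_k}^{|\mathcal{B}_k|} \binom{|\mathcal{B}_k|}{l}.$$
Recall that the set of coordinating RRUs for each user (i.e., $|\mathcal{B}_k|$ ∀k) is limited. Thus, $C(L_k)$ is fairly small and solving (6), in general, does not require combinatorial optimization. Let $\mathcal{D}_k^c \in \mathcal{D}_k$ denote the $c$-th subset combination of the potentially blocked RRUs, and let $\mathcal{B}_k^c \in \hat{\mathcal{B}}_k$ represent the $c$-th subset combination of the available RRUs for the $k$th user, where $c = 1, 2, \ldots, C(L_k)$. Then, the SINR of each user $k$ for the $c$-th subset combination is obtained by excluding the blocked RRUs in (2) and expressed as
$$\Gamma_k(\mathcal{B}_k^c) = \frac{\left| \sum_{b \in \mathcal{B}_k^c} \mathbf{h}_{b,k}^{\mathrm{H}} \mathbf{f}_{b,k} \right|^2}{\sigma_k^2 + \sum_{u \in \mathcal{K} \setminus \{k\}} \left| \sum_{b \in \mathcal{B}_u \setminus \mathcal{D}_k^c} \mathbf{h}_{b,k}^{\mathrm{H}} \mathbf{f}_{b,u} \right|^2}, \tag{7}$$
where $\mathcal{B}_k^c = \mathcal{B}_k \setminus \mathcal{D}_k^c$. Consequently, after solving the problem (6), the pessimistic estimate of the achievable SINR for the $k$th user is equivalent to $\gamma_k = \min_{c = 1, \ldots, C(L_k)} \Gamma_k(\mathcal{B}_k^c)$, i.e., the smallest SINR over the candidate sets of available RRUs. Therefore, reliable connectivity for each user $k$ can be guaranteed even if $|\mathcal{B}_k| - L_k$ dominant links are not available during the transmission phase. Conversely, if more than $L_k$ links are available, the actual achievable rate will be larger than the rate assigned to the users.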
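The subset bookkeeping for this example can be reproduced with a small Python sketch. It enumerates every candidate set of available RRUs with at least $L_k$ members, derives the matching sets of blocked RRUs, and takes the minimum SINR over all candidates. The helper names and the data layout are assumptions made for illustration, and excluding a blocked RRU from both the desired and the interfering sums is how this sketch interprets the exclusion step.

```python
from itertools import combinations
from math import comb


def available_subsets(serving_set, min_links):
    """All subsets of the serving set that keep at least `min_links` RRUs."""
    rrus = sorted(serving_set)
    return [frozenset(c)
            for size in range(min_links, len(rrus) + 1)
            for c in combinations(rrus, size)]


def blocked_subsets(serving_set, min_links):
    """Complement of each available subset: the RRUs assumed to be blocked."""
    full = frozenset(serving_set)
    return [full - avail for avail in available_subsets(serving_set, min_links)]


def num_combinations(n_serving, min_links):
    """Number of candidate subsets, i.e. a sum of binomial coefficients."""
    return sum(comb(n_serving, l) for l in range(min_links, n_serving + 1))


def inner(h, f):
    return sum(hi.conjugate() * fi for hi, fi in zip(h, f))


def subset_sinr(k, H, F, serving, blocked, noise_var):
    """SINR of user k when the RRUs in `blocked` do not reach user k."""
    signal = abs(sum(inner(H[b, k], F[b, k])
                     for b in serving[k] - blocked)) ** 2
    interference = sum(
        abs(sum(inner(H[b, k], F[b, u]) for b in serving[u] - blocked)) ** 2
        for u in serving if u != k)
    return signal / (noise_var + interference)


def pessimistic_sinr(k, H, F, serving, min_links, noise_var):
    """Smallest SINR of user k over all candidate blockage combinations."""
    return min(subset_sinr(k, H, F, serving, blocked, noise_var)
               for blocked in blocked_subsets(serving[k], min_links))


# Single user served by RRUs {1, 2, 3, 4} with unit channels and beamformers.
serving = {0: {1, 2, 3, 4}}
H = {(b, 0): [1 + 0j] for b in serving[0]}
F = {(b, 0): [1 + 0j] for b in serving[0]}
print(len(available_subsets(serving[0], 3)), num_combinations(4, 3))
print(pessimistic_sinr(0, H, F, serving, min_links=3, noise_var=1.0))
```

With four identical unit-gain links, the full set combines four amplitudes while any three-link subset combines only three, so the pessimistic estimate is set by the three-link case.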
As an example, let $q_{b,k}$ represent the blockage probability between RRU-user pair $(b, k)$, defined as $q_{b,k} = 1 - e^{-\eta d_{b,k}}$ (see (5)). Next, for a given parameter $L_k$, we can approximately model the success probability $p_k$ of the $k$th user as
$$p_k = \sum_{c=1}^{C(L_k)} \prod_{b \in \mathcal{B}_k^c} (1 - q_{b,k}) \prod_{b \in \mathcal{D}_k^c} q_{b,k}. \tag{8}$$
If we assume equal blockage probability, i.e., $q_{b,k} = q_k$ ∀b ∈ $\mathcal{B}_k$, then the number of available links of the $k$th user follows a binomial distribution [27], and $p_k$ can be expressed as
$$p_k = \sum_{l = L_k}^{|\mathcal{B}_k|} \binom{|\mathcal{B}_k|}{l} (1 - q_k)^{l}\, q_k^{|\mathcal{B}_k| - l}.$$
In Appendix B, we generalize (8) by integration over random user locations. Since all users are independent, the system is considered to be in outage if any of the $K$ users is in outage. It should be noted that this is a worst-case assumption to enforce strict system-level reliability. In practice, users with unblocked links can still decode their received signal. Thus, with the worst-case assumption, the system outage probability is defined as
$$1 - \prod_{k \in \mathcal{K}} p_k. \tag{9}$$
The closed-form expression (9) models the case when the channel between a RRU-user pair is either fully available or completely blocked (e.g., both LoS and NLoS paths in (3)), which corresponds to local blockage (or self-blockage) at the user. However, we consider blocking of the dominant LoS link while keeping all NLoS components unobstructed (see Section II-C). Thus, expression (9) provides an approximation of the outage performance, as shown in Section V. Intuitively, we can observe the impact of constraint (6b) on the system reliability and achievable rate using (7) and (9). For example, by using a smaller subset size (i.e., parameter $L_k$ ∀k), we can improve the system reliability, assuming that a significant portion of all the available CoMP links (i.e., $|\mathcal{B}_k| - L_k$ ∀k) are potentially blocked. However, this leads to a lower SINR estimate and, hence, a lower rate for each user. Conversely, a less pessimistic assumption on the subset size can provide higher instantaneous SINR and user-specific rates, but it is more susceptible to outage, thus resulting in less stable connectivity for each user. Clearly, there is a trade-off between achievable rate and reliable connectivity.
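To make the rate-reliability trade-off concrete, the following Python sketch computes the probability that at least $L_k$ of a user's links are unblocked when the links are blocked independently, its binomial special case for equal blockage probabilities, and the worst-case system outage in which any single user outage counts as a system outage. It follows the independence assumption stated in the text; the function names are hypothetical.

```python
from itertools import combinations
from math import comb, prod


def success_probability(q, min_links):
    """P(at least `min_links` links unblocked) for independent links.

    q : dict mapping an RRU index to the blockage probability of its link
    """
    rrus = sorted(q)
    total = 0.0
    for size in range(min_links, len(rrus) + 1):
        for avail in combinations(rrus, size):
            avail = set(avail)
            # Links in `avail` are unblocked, all remaining links are blocked.
            total += prod((1.0 - q[b]) if b in avail else q[b] for b in rrus)
    return total


def binomial_success(q_equal, n_links, min_links):
    """Special case with the same blockage probability on every link."""
    return sum(comb(n_links, l) * (1.0 - q_equal) ** l
               * q_equal ** (n_links - l)
               for l in range(min_links, n_links + 1))


def system_outage(success_probs):
    """Worst-case outage: the system fails if any single user fails."""
    return 1.0 - prod(success_probs)


q = {1: 0.2, 2: 0.2, 3: 0.2, 4: 0.2}
for min_links in (1, 2, 3, 4):
    p = success_probability(q, min_links)
    print(min_links, round(p, 4), round(system_outage([p, p]), 4))
```

Lowering the assumed number of available links raises the per-user success probability, at the cost of the lower pessimistic SINR discussed above.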

A. Solution via Successive Convex Approximation
The non-convex SINR constraint (6b) can be handled by the SCA framework, as shown in [7], [22]-[24], where the authors provided SINR approximation methods for CB [7], [22], [23] and JT [24] assuming global CSI and no blockage. We extend these approaches to take into consideration coherent multi-point transmission and provide a novel grouping of a multitude of potentially coupled and non-convex SINR conditions that arise from the link blockage subsets. In the following, the main steps are briefly described (we refer the reader to [22] for the details). To begin with, by using the expression of $\Gamma_k(\mathcal{B}_k^c)$ (see (7)) and adding one on both sides, we rewrite (6b) as (10), where $\mathcal{G}_k^c = \mathcal{B} \setminus \mathcal{D}_k^c$ for all $c = 1, 2, \ldots, C(L_k)$ and $k \in \mathcal{K}$.
The indicator functions $\mathbb{1}_{\mathcal{G}_k^c}(b)$ and $\mathbb{1}_{\mathcal{B}_j}(b)$ equal one if $b \in \mathcal{G}_k^c$ and $b \in \mathcal{B}_j$, respectively, and zero otherwise. The user-specific subset size $L_k$ and the serving set size $|\mathcal{B}_k|$ ∀k ∈ $\mathcal{K}$ are design parameters that can be tuned based on statistical information, e.g., user locations and blockage density, to achieve a desired rate-reliability trade-off (for more details see Section V-C). Furthermore, the expression (10b) can be compactly expressed using vector notation. Let $\mathbf{f}_j \in \mathbb{C}^{|\mathcal{B}| N_t \times 1}$ be the stacked downlink beamformer of user $j$ and $\mathbf{h}_k^c \in \mathbb{C}^{|\mathcal{B}| N_t \times 1}$ be the stacked channel vector of user $k$ for the $c$-th subset combination. By using the vector notations $\mathbf{f}_j$ and $\mathbf{h}_k^c$, together with an auxiliary shorthand introduced for brevity, expression (10) can be rewritten as (11). Note that we have added one on both sides of constraint (6b) to get (11). This improves the numerical stability of the algorithm, as will become clear in the following. For a more compact representation, we now introduce the functions in (12). Then expression (11) can be written as (13) for all $c = 1, 2, \ldots, C(L_k)$ and $k \in \mathcal{K}$. Note that the function $H_k(\mathcal{B}_k^c)$ is quadratic-over-linear, which is a convex function [48, Ch. 3]. Hence, the reformulated SINR constraint (13) is still non-convex (i.e., a difference of convex functions). Therefore, we resort to the SCA framework [42], [43], wherein all non-convex constraints are approximated with a sequence of convex subsets and iteratively solved until convergence of the objective [7], [22]-[25]. Note that the SCA framework based on iterative relaxation of non-convex SINR constraints can be shown to converge to a stationary point [22, Appendix A]. Thus, the best convex approximation of the reformulated SINR constraint (13) is obtained by replacing $H_k(\mathcal{B}_k^c)$ with its first-order approximation.
The linear first-order Taylor approximation of $H_k(\mathcal{B}_k^c)$ is given in (14), with equality only at the approximation point. After replacing (12b) with its linear approximation (14) and plugging it into constraint (6b), the approximated subproblem for the $i$th SCA iteration is expressed in convex form, along with the corresponding dual variables, as (15). The convex subproblem (15) for each SCA iteration can be efficiently solved, in general, using existing convex optimization toolboxes, such as CVX [49]. The fixed operating points for the next iteration are updated from the solution of the current SCA iteration. This is repeated until convergence to a stationary point. The beamformer design with the proposed SCA relaxation is summarized in Algorithm 1, whose loop increments the iteration index ($i = i + 1$) and repeats until convergence or for a fixed number of iterations.
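The key property used by the SCA step is that a convex quadratic-over-linear function is bounded from below by its first-order Taylor expansion, with equality at the expansion point. The Python sketch below checks this numerically for the generic scalar function $g(x, y) = |x|^2 / y$ with complex $x$ and $y > 0$. This is a generic instance chosen for illustration, not the paper's function $H_k$ or its expansion (14); the function names are assumptions.

```python
import random


def quad_over_lin(x, y):
    """g(x, y) = |x|^2 / y, convex for y > 0."""
    return abs(x) ** 2 / y


def linearized(x, y, x0, y0):
    """First-order Taylor expansion of g around (x0, y0).

    g(x0, y0) + 2 Re{conj(x0) (x - x0)} / y0 - |x0|^2 (y - y0) / y0^2
    simplifies to the expression below.
    """
    return 2.0 * (x0.conjugate() * x).real / y0 - abs(x0) ** 2 * y / y0 ** 2


rng = random.Random(0)
x0, y0 = complex(1.0, -0.5), 2.0
worst_gap = float("inf")
for _ in range(1000):
    x = complex(rng.uniform(-3, 3), rng.uniform(-3, 3))
    y = rng.uniform(0.1, 5.0)
    # The gap equals y * |x / y - x0 / y0|^2, hence it is never negative.
    worst_gap = min(worst_gap, quad_over_lin(x, y) - linearized(x, y, x0, y0))
print(worst_gap >= -1e-12)
print(abs(quad_over_lin(x0, y0) - linearized(x0, y0, x0, y0)) < 1e-12)
```

Because the linearization never exceeds the original function, replacing the convex term on the larger side of a difference-of-convex constraint by its linearization yields a conservative convex constraint, so each SCA iterate remains feasible for the original problem.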

B. Solution via Low-Complexity KKT Conditions
Problem (15) can be more efficiently solved at the BBU by the parallel update of beamforming vectors corresponding to the spatially distributed RRUs. Unlike the approach presented in the previous subsection, the robust beamformer can be obtained by iteratively solving a system of KKT equations [48]. Thus, the KKT based solution provides closed-form steps for an algorithm that does not rely on generic convex solvers. Furthermore, iterative evaluation of KKT optimality conditions, for each SCA step, reveals a conveniently parallel structure for the beamformer design with significantly lower computational complexity with respect to joint beamformer optimization across all distributed RRU antennas.
The Lagrangian $L(\mathbf{F}, \gamma_k, a_{k,c}, z_b)$ of problem (15) is given in expression (16). It should be noted that the KKT optimality conditions provide necessary and sufficient conditions for the solution of a convex problem [48, Ch. 5]. Thus, the solution of problem (15) can be obtained by iteratively solving a system of KKT optimality conditions, which include stationarity, complementary slackness, and primal-dual feasibility requirements (for a more detailed derivation see Appendix A). Next, we briefly outline the key challenges in solving problem (15) by using KKT conditions, together with our proposed solution.
The user-specific SINR constraints (15b) are mutually interdependent over the link blockage combinations (see Section II-C). This makes deriving an efficient closed-form solution for the Lagrangian multipliers $a_{k,c}$ $\forall (k, c)$ considerably more difficult than in the case with only a single SINR constraint per user [21]-[25]. To overcome this, we resort to a subgradient approach, where the Lagrangian multipliers $a_{k,c}$ $\forall (k, c)$ are solved using the constrained ellipsoid method [26].
In addition, the design of the optimal beamformer $\mathbf{F}$ is inherently coupled between all distributed RRU antennas due to coherent joint transmission to each user. Therefore, the computational complexity of the optimal beamformer $\mathbf{f}_k$ $\forall k$ scales cubically with the length of the joint beamformers ($BN_t$), which quickly becomes intractable, e.g., for dense deployments. Furthermore, the RRU-specific dual variables $z_b$ $\forall b$ should be computed simultaneously (see (21b) in Appendix A). However, because of the coupling and interdependence among the Lagrangian multipliers $z_b$ $\forall b$ due to JT-CoMP, it is computationally challenging to obtain a closed-form solution. To overcome this, we also incorporate a parallel optimization framework using the best response [24] into the iterative optimization process (see (21c) in Appendix A). As a result, the RRU-specific beamformers are solved in parallel for each iteration, while assuming that the coupling from other cooperating RRUs is fixed to the solution from the previous iteration. Note that, for a convex problem, monotonic convergence can be guaranteed by a regularization step on the beamformer update [50].
In the following, we provide an iterative algorithm that combines the SCA framework with dual and best response methods and admits a closed-form solution in each step. Specifically, in each SCA iteration, the approximated convex subproblem (15) is solved via the iterative evaluation of the KKT optimality conditions. Furthermore, to improve the rate of convergence, the SCA approximation point is also updated in each iteration along with the Lagrangian multipliers (for a more detailed derivation, see Appendix A).
To summarize, the steps in the iterative algorithm are given in (17), where $c = 1, \ldots, C(L_k)$, $k \in \mathcal{K}$, and $b \in \mathcal{B}$. The best response and subgradient step sizes are denoted by $\psi > 0$ and $\beta > 0$, respectively. In (17e), we use $(x)^+ \triangleq \max(0, x)$. The expressions in (17) are solved in an iterative manner, starting by initializing the variables $\{\mathbf{f}_{k,c}\}$ with feasible values, such that the SINR and total transmit power constraints for each distributed RRU are satisfied. Note that, due to the reformulation of constraint (6b) as in (11), we get $\{1 + \gamma_j\}_{j \in \mathcal{K}}$ in the denominator of (17a) and (17d), and these terms are invertible even if a subset of users is assigned zero SINR. Thus, the proposed iterative method for problem (15) is numerically stable. The beamformer design by iteratively solving a system of KKT optimality conditions is summarized in Algorithm 2.
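The projected subgradient step for the SINR dual variables can be sketched as follows. This is a minimal illustration of the $(x)^+ = \max(0, x)$ projection with step size $\beta$, not a reproduction of (17e): the `violation` input (positive when the constraint for pair $(k, c)$ is violated) and its sign convention are assumptions, and the function names are hypothetical.

```python
def pos(x):
    """Projection onto the non-negative reals: (x)^+ = max(0, x)."""
    return max(0.0, x)


def update_duals(a, violation, beta=0.005):
    """One projected subgradient step for the dual variables a[(k, c)].

    a         : dict mapping (k, c) -> current non-negative dual value
    violation : dict mapping (k, c) -> constraint violation (assumed
                convention: > 0 means the SINR constraint is violated)
    beta      : small positive step size (0.005 is the value quoted in
                the simulation setup)
    """
    return {kc: pos(a[kc] + beta * violation[kc]) for kc in a}


# Usage: a violated constraint raises its multiplier, a slack one lowers it
# (clipped at zero so that dual feasibility is preserved).
duals = {(1, 1): 0.01, (1, 2): 0.001}
viol = {(1, 1): 2.0, (1, 2): -1.0}
new_duals = update_duals(duals, viol)
```

With a fixed small step, the multipliers need not change monotonically from one iteration to the next, which is consistent with the non-monotonic objective behavior discussed for subgradient updates.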

Algorithm 2: Low-Complexity KKT-Based Iterative Algorithm for WSRM Problem (15)
1 Set i = 1 and initialize with a feasible starting point f
Update a
Set i = i + 1
8 until convergence or for a fixed number of iterations;

1) Lagrangian Multipliers:
Dual variables $a_{k,c}$ $\forall (k, c)$ corresponding to constraint (15b) are interdependent and mutually coupled due to the common SINR constraint across the link blockage combinations. Their exact values for each SCA iteration cannot be obtained as a closed-form expression. Therefore, we resort to a widely used subgradient approach, such as the constrained ellipsoid method, which converges to the local optimal solution for a convex optimization problem [26]. It should be noted that the choice of the step size $\beta$ in expression (17e) depends on the system model, as it directly affects the convergence rate and controls the oscillation in the WSRM objective function. There have been several studies in the literature on the convergence properties of the subgradient approach with different step size rules [26], [51]. More precisely, monotonic convergence cannot be guaranteed, in general, for the constrained ellipsoid method, and thus one has to track and adjust the step size accordingly. In the proposed iterative approach in Algorithm 2, the dual variables $a_{k,c}$ $\forall (k, c)$ are updated based on the violation of the SINR constraint with a small positive step size, as in (17e).
Dual variables $z_b$ $\forall b$ are chosen to satisfy the transmit power constraint (15c), using the bisection search method. It should be noted that, in a multi-cell scenario, the transmit power constraints may not necessarily always hold with equality.
If the power constraint of RRU $b$ is inactive (i.e., holds with strict inequality), the non-negative dual variable $z_b$ is set to zero in order to satisfy the corresponding complementary slackness conditions [48, Ch. 5.5.2]. Otherwise, there exists a unique $z_b > 0$ that satisfies the power constraint (15c) with equality, which is found by the bisection search.
2) Best Response: The beamformer $\mathbf{F}$ is inherently coupled among all distributed RRU antennas (17), because of the coherent joint transmission to each user. One possible approach is to update the beamformers sequentially, i.e., using a Gauss-Seidel type update process, which provides monotonic convergence for WSRM optimization problems. However, it is shown in [52] that the convergence rate drastically reduces even with a slight increase in the number of cooperating RRUs. Here, instead, we implement a parallel optimization framework [50], which efficiently parallelizes the beamformer updates across the distributed RRU antennas, and hence significantly reduces the per-iteration computational complexity. Specifically, for a given iteration, the RRU-specific beamformers are solved in parallel while assuming that the coupling from other RRUs is fixed to the solution from the previous iteration, as in expression (17a). The objective function can be shown to converge if we allow a sufficiently large number of subgradient iterations per fixed SCA approximation (until the objective increases) for each RRU before making the best response step with a sufficiently small step size [24]. However, here we are more interested in a fast and robust rate of convergence, for which we allow only a single subgradient update per best response iteration. Thus, Algorithm 2 may not converge to the same point as Algorithm 1. It is shown by numerical examples in Section V that this still provides excellent performance with a fairly small number of iterations. More details on the convergence behavior and the choice of step size $\psi \in (0, 1)$ with the best response based parallel optimization approach are provided in [50].
It should be noted that the RRU-specific transmit power constraints are convex (see (15c)); therefore, a regularized update with $\psi < 1$ in (17b) strictly preserves the feasibility of the total transmit power constraint of each RRU.
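The feasibility-preserving property of the regularized best response step can be illustrated with a short sketch. It assumes the update takes the common convex-combination form $\mathbf{f}^{(i+1)} = \mathbf{f}^{(i)} + \psi (\mathbf{f}^{*} - \mathbf{f}^{(i)})$ and that the per-RRU power is the sum of squared beamformer norms over users; both are assumptions standing in for (17b) and (15c), and the function names are hypothetical.

```python
def rru_power(f_users):
    """Total transmit power of one RRU: sum over users of ||f_k||^2."""
    return sum(abs(x) ** 2 for f in f_users for x in f)


def best_response_step(f_prev, f_star, psi=0.05):
    """Regularized update: move a fraction psi from f_prev toward f_star.

    f_prev, f_star : lists (one entry per user) of complex beamformer lists
    psi            : best response step size in (0, 1)
    """
    return [
        [p + psi * (s - p) for p, s in zip(fp, fs)]
        for fp, fs in zip(f_prev, f_star)
    ]


# Usage: both points satisfy the power limit P = 2, and so does the update,
# because the power constraint set is convex.
f_old = [[1 + 0j, 0j], [0j, 1j]]
f_best = [[0j, 1 + 0j], [-1 + 0j, 0j]]
f_new = best_response_step(f_old, f_best, psi=0.05)
```

In this toy case the updated power is 1.81, strictly inside the limit, which mirrors the statement that a step with $\psi < 1$ keeps each RRU within its power budget.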

3) Feasible Initial Point:
In the SCA framework, all nonconvex constraints are approximated with a sequence of convex subsets and then iteratively solved until convergence of the objective [42], [43]. Thus, it is very important to initialize the iterative algorithm with a feasible starting point. To this end, one possible choice for the feasible initial $\mathbf{f}_{b,k}^{(0)}$ is any beamformer satisfying the transmit power constraint (6c), which can be obtained by scaling a randomly generated beamforming vector. Then, the lower bound on the achievable SINR, $\gamma_k$, can be calculated from expression (7) over the blockage combinations $c = 1, \ldots, C(L_k)$. However, it should be noted that a randomly generated initial solution can be very far from the optimal solution and may require a significantly larger number of iterations until convergence. As an example, for a system model with $N_t \ge K$, an efficient initial point can be obtained by simply matching the beamformers $\mathbf{f}_{b,k}^{(0)}$ to the channel, i.e., an MRT-based initialization (see Section V-D).
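The two initialization options mentioned above can be sketched as follows. The sketch assumes that the per-RRU power constraint has the form $\sum_k \|\mathbf{f}_{b,k}\|^2 \le P_b$ and that the received signal is written as $\mathbf{h}^{\mathrm{H}} \mathbf{f}$, so that the matched (MRT) beamformer is proportional to $\mathbf{h}$; both conventions are assumptions, and the function names are hypothetical.

```python
import math
import random


def scale_to_power(f_users, p_max):
    """Scale all beamformers of one RRU so the total power equals p_max."""
    total = sum(abs(x) ** 2 for f in f_users for x in f)
    gain = math.sqrt(p_max / total)
    return [[gain * x for x in f] for f in f_users]


def random_init(n_users, n_t, p_max, seed=0):
    """Feasible start: random complex Gaussian beamformers, then scaling."""
    rng = random.Random(seed)
    f_users = [
        [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n_t)]
        for _ in range(n_users)
    ]
    return scale_to_power(f_users, p_max)


def mrt_init(h_users, p_max):
    """Feasible start: beamformers matched to each user's channel (MRT).

    h_users[k] is the (assumed known) channel vector from this RRU to user k.
    """
    return scale_to_power([list(h) for h in h_users], p_max)
```

Either routine returns a point on the boundary of the power constraint; the matched start typically lies closer to a good solution because it already aligns each beam with its user's channel.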

4) Complexity Analysis:
The approximated convex subproblem (15) can be solved in a generic convex optimization solver as a sequence of second-order cone programs (SOCP) [53]. The complexity of the problem scales exponentially with the length of the joint beamformers ($BN_t$) and the number of constraints [53]. Thus, particularly for dense mmWave deployments with large $N_t$ and $B$, the complexity quickly becomes intractable in practice. The complexity of the proposed KKT-based iterative solution is dominated by (17a), which mainly consists of matrix multiplications and inverse operations that scale with the RRU-specific beamformer size ($N_t$). Therefore, the worst-case computational requirement is $\mathcal{O}(\kappa \tau |\mathcal{B}_k| N_t^3)$, where $\kappa$ is the number of iterations, $\tau$ is the number of bisection steps per iteration, $|\mathcal{B}_k|$ is the number of RRUs that serve the $k$th user, and $N_t$ is the digital beamformer size, e.g., the number of antennas at the RRU. In addition, the complexity of the matrix inversion operation in (17a) can be alleviated by solving $\mathbf{f}_{b,k}^{*}$ from a system of linear equations, providing a significant reduction in the computational complexity.
In the proposed methods, the centralized BBU computes the beamformers f b,k ∀b ∈ B, k ∈ K based on the available CSI. Hence, there is no additional signalling exchange and overhead among the cooperating RRUs. Therefore, the signalling overhead of the proposed algorithms is admissible and supported by the C-RAN architecture in the upcoming 5G systems [29].
Extensions: The iterative evaluation of KKT conditions can be extended to an alternating direction method of multipliers (ADMM) design, wherein the mutually coupled and non-convex SINR constraint (6b) can be handled by the augmented Lagrangian method. In addition, problem (6) can be further extended by tuning the parameters $\{\mathcal{B}_k, L_k\}$ for all $k \in \mathcal{K}$ based on statistical information. The ADMM design and the joint optimization of the parameters $\{\mathcal{B}_k, L_k\}_{k \in \mathcal{K}}$ are left for future work.

V. SIMULATION RESULTS
This section provides numerical results to validate the performance of the proposed methods. In particular, we analyze the impact of subset size L k for all k on outage performance and sum-rate, as well as evaluate the trade-off between these performance metrics. We further elaborate on the convergence and the performance gap between the proposed algorithms.

A. Simulation Setup
We consider a mmWave-based downlink system with $B = 8$ RRUs, each equipped with a ULA of $N_t = 16$ antennas. All RRUs are placed in a rectangular layout (resembling, e.g., a factory-type setup) and connected to a common BBU in the edge cloud. If not mentioned otherwise, we consider a JT-CoMP scenario with partially overlapping user-centric clusters, such that each active user $k$ receives a coherently synchronous signal from all RRUs in $\mathcal{B}_k$ ($\subseteq \mathcal{B}$), as shown in Fig. 3 for a given channel realization. Recall that, to improve communication reliability, we use the parameter $L_k$ ($\le |\mathcal{B}_k|$)⁹ and proactively model the SINR over the link blockage combinations (see Section III). For simplicity, let us assume $|\mathcal{B}_k| = 4$ and $L_k = L$ for all $k \in \mathcal{K}$. All RRUs are assumed to have the same maximum transmit power, i.e., $P_b = 33$ dBm for all $b \in \mathcal{B}$. The AWGN noise is set to −72 dBm/Hz, the carrier frequency is $f_c = 28$ GHz, and a 20 MHz frequency band is assumed to be fully reused across all RRUs. In the simulations, we set the LoS path-loss exponent to 2 and the NLoS path-loss exponent $\zeta \in [2, 6]$, and the user priorities are set to be equal (i.e., $\delta_k = 1$ $\forall k$). All $K = 4$ users are assumed to be randomly dropped within the coverage region, hence having a different path gain and angle with respect to each RRU. By default, each antenna is equipped with a dedicated RF chain and data converters to enable fully digital signal processing. Hybrid analog-digital beamforming structures are evaluated in Section V-E. In expression (17), we use $\beta = 0.005$ and $\psi = 0.05$ for the subgradient and best response step sizes, respectively.

⁹ The joint optimization of the design parameters $\{\mathcal{B}_k, L_k\}_{k \in \mathcal{K}}$ is an interesting topic for future studies.

B. Outage Performance
Fig. 4 shows the outage performance as a function of increasing blockage density $\eta$ for the WSRM problem. An outage event occurs if the assigned transmit rate $R_k$ exceeds the achievable link rate¹⁰ $C_k$ for any user $k = 1, \ldots, K$. It can be concluded from Fig. 4 that the outage performance is greatly improved by decreasing the subset size $L$. Clearly, a lower $L$ provides more stable and robust communication. The beamformer design with $L = 1$ can provide reliable connectivity even if all but one LoS links are blocked. For example, with the blockage density $\eta = 0.005$, the outage probability is decreased from 99% to less than 5% by changing the parameter $L$ from 4 to 1 in problem (15). Thus, the specific rate allocation with $L = 1$ can withstand blockage down to a single active LoS link, and provides greatly improved communication reliability.

¹⁰ For a given parameter $L$, let us denote by $S_k = \{\gamma_k^{*}, \mathbf{f}_k^{*}\}_{k \in \mathcal{K}}$ the solution obtained from Algorithm 2. Then, for each user $k$, the transmission rate is $R_k = \log_2(1 + \gamma_k^{*})$ (see Section III). However, the supported link rate of each user $k$ depends on the obtained beamformers $\{\mathbf{f}_k^{*}\}_{k \in \mathcal{K}}$ and the current channel state $\{\mathbf{h}_{b,k}\}_{b \in \mathcal{B}, k \in \mathcal{K}}$, which cannot be exactly known to the BBU during the data transmission phase due to random blockages (e.g., due to channel aging or blockage effects). Hence, the supported rates are calculated using the actual SINR values (2), i.e., $C_k = \log_2(1 + \Gamma_k(\mathcal{B}_k))$ $\forall k$, and these rates are unknown to the BBU.

When comparing the simulated results with the theoretical approximation (9), for smaller values of $L$, the simulated outage performance appears relatively better than the theoretical counterpart, as the WSRM problem (15) solved at the BBU may end up assigning non-zero powers to only a subset of users, while all remaining users are assigned zero rate. In such a scenario, missing a LoS link results in a different blockage than what is predicted by the theoretical formula (9). Moreover, expression (9) models the case when the channel between an RRU-user pair is either fully available or completely blocked (e.g., both LoS and NLoS paths in (3)). However, an increase in the subset size $L$ also increases the SINR estimate (see (7)). Thus, it is likely that all users are assigned some non-zero downlink rate, and, therefore, the simulated results tend to closely match the theoretical results obtained from (9). Furthermore, it can be seen from Fig. 4 that the proposed methods significantly outperform the conventional full-JT ($L = B$), CB, and MRT based beamformer designs.
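The outage event used in this subsection, where the assigned rate of any user exceeds its supported rate, can be estimated empirically as in the following sketch. The inputs (assigned SINR targets and per-realization actual SINR values) are assumed to be available from a simulator; the function names and data layout are hypothetical.

```python
import math


def rate(sinr):
    """Rate in bit/s/Hz for a given SINR: log2(1 + SINR)."""
    return math.log2(1.0 + sinr)


def outage_event(assigned_sinr, actual_sinr):
    """True if, for any user, the assigned rate R_k exceeds the rate C_k
    supported by the actual SINR of this channel realization."""
    return any(rate(g) > rate(s) for g, s in zip(assigned_sinr, actual_sinr))


def outage_probability(assigned_sinr, actual_sinr_samples):
    """Fraction of channel realizations in which an outage event occurs."""
    events = [outage_event(assigned_sinr, s) for s in actual_sinr_samples]
    return sum(events) / len(events)


# Usage: two users, four channel realizations (illustrative numbers only).
targets = [1.0, 3.0]
samples = [[2.0, 3.0], [0.5, 4.0], [1.0, 3.0], [1.0, 2.9]]
p_out = outage_probability(targets, samples)
```

A realization in which the actual SINR equals the target is not counted as an outage here, since the assigned rate does not exceed the supported rate.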

C. Effective Sum-Rate Performance
Fig. 5 illustrates the trade-off between achievable sum-rate and outage performance for the WSRM problem. The effective sum-rate $T_e$ is defined with respect to $R = \sum_{k \in \mathcal{K}} \log_2(1 + \gamma_k)$, i.e., the sum-rate attained when each active user successfully receives the transmitted data. It can be observed that, when blockage is not present (i.e., $\eta = 0$), the sum-rate with a smaller subset size $L$ is significantly reduced due to an overly pessimistic estimate of the aggregated SINR (see Section II-C). However, it remains relatively more stable even at a much higher link blockage density due to improved communication reliability (see Fig. 4). On the other hand, with conventional JT, the sum-rate quickly approaches zero with a slight increase in blockage density, due to higher outage. Clearly, there is a trade-off between sum-rate and outage performance, e.g., the outage performance can be improved at a small cost in achievable sum-rate. Specifically, for a given outage threshold, we can guarantee a minimum achievable sum-rate, and vice versa. In addition, the parameter $L$ (or $\{\mathcal{B}_k, L_k\}_{k \in \mathcal{K}}$ in general) can be considered as an optimization or selection variable which maximizes the sum-rate for a given blockage density and outage performance, as shown in Fig. 6. The proposed methods provide robust and resilient connectivity under the uncertainties of the mmWave radio channel and random blockages, whereas, with the conventional full-JT ($L = B$), CB, and MRT methods, the sum-rate rapidly decreases towards zero even if the blockages are only slightly increased.

D. Impact of Initialization and Step Sizes
First, in Fig. 7 and Fig. 8, we examine the convergence performance of Algorithm 1 and Algorithm 2. For simplicity but without loss of generality, we consider a JT-CoMP scenario with full coordination, i.e., $|\mathcal{B}| = 4$. We set the parameter $L = 3$ and the blockage density $\eta = 0$. It can be concluded that convergence is very sensitive to the choice of the step sizes $(\psi, \beta)$. For example, with larger step sizes, the algorithm can converge in fewer iterations, but this might result in more fluctuations in the objective. It should be noted that convergence may not necessarily be monotonic, which is an inherent feature of subgradient updates (see (17e)).
In addition, a judicious choice of the feasible initial points $\mathbf{f}_{k,c}$ also impacts the rate of convergence. For the considered scenario with $N_t \ge K$, a simple MRT based initialization for $\mathbf{f}_{b,k}^{(0)}$ $\forall (b, k)$ significantly improves the convergence rate and attains a near-optimal solution with fewer iterations, as shown in Fig. 8. However, irrespective of the choice of initialization point and step sizes, both algorithms tend to converge to the local optimal solution, on average, provided a sufficient number of iterations.
Finally, we compare the sum-rate performance of the low-complexity method based on iteratively solving a set of KKT conditions with the solution obtained directly by the optimization toolbox [49]. Moreover, with the assumption of full CSI, the achievable performance approaches (on average) the theoretical upper bound. It can be concluded from Fig. 9 that Algorithm 2 achieves near-optimal performance, and the resulting gap in the sum-rate performance is mainly due to insufficient convergence because of the fixed maximum number of iterations. Therefore, the proposed KKT-based iterative method provides a low-complexity solution for practical implementations without any significant degradation in the achievable system performance.

E. Hybrid Analog-Digital Beamforming Implementation
The problem formulation and proposed solutions are generic and can be easily extended to any multi-point configuration and dense deployments. Until now, we have restricted ourselves to the case where each antenna is equipped with a dedicated RF chain and data converter that enables fully digital signal processing. In this subsection, we provide an implementation for a two-stage hybrid analog-digital architecture, with coarse-level analog beamforming using a limited number of RF circuits followed by less complex digital precoding in the digital baseband domain.
Generally, due to the high power consumption and cost of mixed-signal components at mmWave frequencies, analog beamforming is performed using a network of phase shifters [17], [18]. To this end, one of the common solutions in the literature is to select the analog beams from a fixed predefined codebook [17]. We assume that the analog beamforming vector $\mathbf{w}_{b,k}$ between an RRU-user pair $(b, k)$ is obtained from a fixed beam steering codebook $\mathcal{W}$ with cardinality $|\mathcal{W}| = 32$. Furthermore, we assume that each RRU $b$ independently decides the analog beamformers which maximize its local signal power, i.e., based on the criterion in (20).
Case-1: For example, let $N_{\mathrm{RF}} = K$ be the number of RF circuits at each RRU $b$; then, the user-specific analog beams are obtained from expression (20). After fixing the user-specific analog beams, each RRU $b$ estimates the effective channel, i.e., $\mathbf{H}_b^{\mathrm{H}} \mathbf{W}_b$ for all $b \in \mathcal{B}$, and then computes the robust digital precoder, as explained in Section IV.
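The codebook-based analog beam selection of Case-1 can be sketched as follows. This is a minimal illustration that assumes a half-wavelength ULA and a codebook of 32 steering vectors uniformly spaced in angle; the paper's exact codebook construction and criterion (20) are not reproduced, and the function names are hypothetical.

```python
import cmath
import math


def steering_vector(n_t, theta):
    """Unit-norm ULA steering vector toward angle theta (radians),
    assuming half-wavelength element spacing."""
    return [
        cmath.exp(1j * math.pi * n * math.sin(theta)) / math.sqrt(n_t)
        for n in range(n_t)
    ]


def build_codebook(n_t=16, size=32):
    """Beam steering codebook with angles uniformly spaced in (-90, 90) deg."""
    angles = [-math.pi / 2 + math.pi * (m + 0.5) / size for m in range(size)]
    return [steering_vector(n_t, th) for th in angles]


def select_beam(h, codebook):
    """Index of the codeword w that maximizes the local power |h^H w|^2."""
    def gain(w):
        return abs(sum(hc.conjugate() * wc for hc, wc in zip(h, w))) ** 2
    return max(range(len(codebook)), key=lambda m: gain(codebook[m]))


# Usage: a channel aligned with codeword 7 selects that codeword.
codebook = build_codebook()
channel = [3 * x for x in codebook[7]]
best = select_beam(channel, codebook)
```

Each RRU can run this selection independently per user, since the criterion depends only on its own local channel, after which the digital precoder operates on the reduced-dimension effective channel.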
Case-2: For example, if we consider $N_{\mathrm{RF}} = 1$ and $K = 4$, then each RRU $b$ will have at most one active analog beam in a given direction. Therefore, aligning such a directional beam towards a specific user will degrade the achievable SNR for all other active users. However, to efficiently utilize the JT-CoMP gain, one needs to provide a comparable SNR to all the users. To do that, we first obtain a compromise transmit beam by applying appropriate phase shifts and amplitude scaling to each antenna element using variable gain amplifiers [54], [55], e.g., by superimposing the best beam of each user $k$. It should be noted that, in general, $\mathbf{w}_b \in \mathbb{C}^{N_t \times 1}$ may not satisfy the uni-modulus constraints on the beamforming coefficients [18], [34]. Optimization of the compromise multicast analog beam with $N_{\mathrm{RF}} < K$ and uni-modulus constraints is an interesting topic for future studies. After fixing the compromise transmit beam, each RRU estimates the effective channel, i.e., $\mathbf{H}_b^{\mathrm{H}} \mathbf{w}_b$ $\forall b \in \mathcal{B}$, and then obtains the digital precoder, as explained in Section IV.
Fig. 10 illustrates the sum-rate performance for a JT-CoMP scenario, e.g., $|\mathcal{B}| = 4$. For simplicity but without loss of generality, we set the parameter $L = 3$ and the blockage density $\eta = \{0, 0.001\}$. It can be seen that the achievable sum-rate with the two-stage hybrid beamforming architecture is, in general, lower than with full-digital beamforming. This is mainly due to the dimensionality reduction of the digital precoder brought by the fixed analog beamformers in the two-stage hybrid architecture. When $N_{\mathrm{RF}} = K$, each RRU implements user-specific analog beam selection and achieves performance comparable to full-digital beamforming. However, when $N_{\mathrm{RF}} = 1$, each RRU implements a compromise transmit beam which is aligned to all $K$ users, thus significantly reducing the achievable analog beamforming gain because of the relatively wide analog beams. In addition, the overall system is degree-of-freedom (DoF) limited in the digital domain, i.e., $L < K$, which leads to a significant decrease in the sum-rate performance, especially in high-SNR conditions.
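The idea of a compromise beam formed by superimposing the per-user best beams can be sketched as below. The plain sum followed by unit-norm scaling is an assumed form chosen for illustration, since the exact expression is not reproduced here; the function names are hypothetical.

```python
import math


def compromise_beam(user_beams):
    """Superimpose the best beam of each user and normalize to unit norm.

    user_beams : list of complex beam vectors, one per user, equal length.
    The unit-norm scaling is an assumption made for this sketch.
    """
    n_t = len(user_beams[0])
    summed = [sum(w[n] for w in user_beams) for n in range(n_t)]
    norm = math.sqrt(sum(abs(x) ** 2 for x in summed))
    return [x / norm for x in summed]


def is_unimodular(w, tol=1e-9):
    """True if all entries of w have the same magnitude (phase-only beam)."""
    mags = [abs(x) for x in w]
    return max(mags) - min(mags) < tol


# Usage: two phase-only beams whose superposition is no longer phase-only.
w1 = [0.5, 0.5, 0.5, 0.5]
w2 = [0.5, 0.5j, -0.5, -0.5j]
w_c = compromise_beam([w1, w2])
```

The example shows why amplitude scaling is needed: the superposition has unequal entry magnitudes, so it cannot be realized with phase shifters alone.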
In the hybrid architecture, the analog beamformers are obtained from a predefined beamforming codebook (20). Thus, the computational complexity of the proposed methods mostly stems from the computation of the digital precoder. It can be seen from expression (17) that the computations are dominated by the matrix multiplications and inversion operations in (17a). Hence, the computational complexity mainly depends on the dimension of the matrix, and it is $\mathcal{O}(N_t^3)$ for the inverse operation [48]. The hybrid analog-digital beamforming architecture reduces the dimension of the digital precoder from $N_t$ to $N_{\mathrm{RF}}$, and thus significantly reduces the computational complexity of the matrix inverse operation. As an example, the hybrid architecture provides a complexity reduction for the matrix inversion of $\{98.44\%, 99.98\%\}$ for $N_t = 16$ and $N_{\mathrm{RF}} = \{K, 1\}$, respectively. Thus, the hybrid architecture achieves a better balance between complexity and system performance.
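The quoted reduction figures follow directly from the cubic cost of matrix inversion, as the short calculation below confirms. It assumes $K = 4$, as in the simulation setup, and counts only the inversion cost; the function name is hypothetical.

```python
def inversion_reduction(n_t, n_rf):
    """Relative saving when a cubic-cost inversion shrinks from
    dimension n_t to dimension n_rf: 1 - (n_rf / n_t)^3."""
    return 1.0 - (n_rf / n_t) ** 3


# N_t = 16 antennas; N_RF = K = 4 and N_RF = 1 RF chains.
for n_rf in (4, 1):
    saving = 100 * inversion_reduction(16, n_rf)
    print(f"N_RF = {n_rf}: {saving:.2f}% reduction")
```

For $N_{\mathrm{RF}} = 4$ the ratio is $(4/16)^3 = 1/64$, and for $N_{\mathrm{RF}} = 1$ it is $(1/16)^3 = 1/4096$, which gives the two percentages stated in the text.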

VI. CONCLUSION
In this paper, we studied the trade-off between achievable rate and reliability in mmWave access by exploiting multi-antenna spatial diversity and CoMP connectivity. To combat unpredictable random blockages, a pessimistic estimate of the user-specific rate is obtained over the link blockage combinations, thus providing greatly improved communication reliability. We devised a low-complexity robust beamforming scheme, tractable for practical implementations, based on the best response and subgradient methods, wherein the RRU-specific beamformers are optimized in parallel. Our proposed methods provide a significant reduction in the computational complexity with respect to joint optimization over the beamforming vectors of all RRUs. Thus, the proposed methods are scalable to any arbitrary multi-point configuration and dense deployments. Specifically, we proposed a computationally efficient iterative algorithm for the WSRM problem, based on the SCA framework and parallelization of the corresponding KKT solutions, while accounting for the uncertainties of the mmWave radio channel in terms of random link blockages. Simulation results demonstrated the robustness of the proposed beamformer design in the presence of random blockages. The outage performance and achievable throughput with the proposed methods significantly outperform the baseline scenarios and result in more stable and resilient connectivity for highly reliable mmWave communication.

APPENDIX A
Considering the Lagrangian in (16), the stationarity conditions are obtained by differentiating (16) with respect to the associated primal optimization variables $\gamma_k$ and $\mathbf{f}_k$. After some algebraic manipulations, the stationarity conditions are given in (21), where $\mathbf{E}_b \triangleq \mathrm{diag}(\mathbf{0}, \ldots, \mathbf{I}_{N_t}, \ldots, \mathbf{0})$ is a block diagonal matrix whose entries are all zero except for $\mathbf{I}_{N_t}$ at the block of the $b$th RRU. From (21b), it can be noticed that the computational complexity scales cubically with the length of the joint beamformers ($BN_t$), mainly due to matrix inversion. Furthermore, all coupled and interdependent dual variables $z_b$ $\forall b$ must be found simultaneously, which hinders the use of closed-form expressions. Thus, the complexity of the algorithm may become intractable in practice, in particular for dense deployments with large $N_t$ and $B$. Here, instead, we implement a parallel optimization framework using the best response approach, which efficiently parallelizes the beamformer updates across the distributed RRUs with significantly reduced complexity, as in (21c) $\forall (b, k, c)$. In addition to (21) and the primal-dual feasibility constraints, the KKT conditions also include the complementary slackness conditions. Let us assume that the user-specific priority weights satisfy $\delta_k > 0$ $\forall k$ (and $\gamma_k \ge 0$ $\forall k$); then, from expression (21a), we can infer that at least one of the dual variables $a_{k,c}$ $\forall c$ for each user $k$ is strictly positive, and that the LHS of (24) is zero if and only if $\delta_k = 0$ $\forall k$. For simplicity, (24) can be rewritten in a simpler form. The dual variables $a_{k,c}$ $\forall (k, c)$ are coupled and interdependent due to the common SINR constraint, as also seen from (21) and (22). Therefore, it is difficult to calculate the exact values of these variables in closed form. However, all the coupled non-negative Lagrangian multipliers $a_{k,c}$ $\forall (k, c)$ can be iteratively solved using a subgradient method, such as the constrained ellipsoid method.
For iteration $i$, the update of the dual variable $a_{k,c}$ with a small positive step size $\beta > 0$ is formulated as in (17e). The dual variables $\mathbf{a}^{(0)} = [a_{1,1}^{(0)}, \ldots, a_{K,C(L_K)}^{(0)}]^{\mathrm{T}}$ are initialized with small positive values. From (21c), we obtain the transmit beamformer as in expression (17a). Finally, the dual variables $z_b$ $\forall b$ are chosen to satisfy the total power constraints (15c), using the bisection search method.
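The bisection search for the power dual variable can be sketched as follows. The sketch assumes that the beamformer has a regularized-inverse form $\mathbf{f}(z) = (\mathbf{A} + z\mathbf{I})^{-1}\mathbf{b}$ with a Hermitian positive definite $\mathbf{A}$, so that in the eigenbasis of $\mathbf{A}$ the transmit power is $\sum_n |c_n|^2 / (\lambda_n + z)^2$ and decreases monotonically in $z$. This structure is an assumption standing in for (17a), and all names are hypothetical.

```python
def tx_power(z, eigvals, coeffs):
    """Power ||(A + z I)^{-1} b||^2 expressed in the eigenbasis of A.

    eigvals : positive eigenvalues of A (assumed Hermitian positive definite)
    coeffs  : components of b along the corresponding eigenvectors
    """
    return sum(abs(c) ** 2 / (lam + z) ** 2 for lam, c in zip(eigvals, coeffs))


def bisect_dual(eigvals, coeffs, p_max, tol=1e-10, max_steps=200):
    """Find z >= 0 such that the power constraint holds.

    If the constraint is already inactive at z = 0, complementary slackness
    gives z = 0. Otherwise the power is strictly decreasing in z, so the
    value that meets the constraint with equality is unique.
    """
    if tx_power(0.0, eigvals, coeffs) <= p_max:
        return 0.0
    lo, hi = 0.0, 1.0
    while tx_power(hi, eigvals, coeffs) > p_max:  # grow the upper bracket
        hi *= 2.0
    for _ in range(max_steps):
        mid = 0.5 * (lo + hi)
        if tx_power(mid, eigvals, coeffs) > p_max:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return hi  # hi always satisfies the power constraint
```

Returning the upper end of the bracket keeps the resulting beamformer feasible even when the search stops before full convergence.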

APPENDIX B
The success probability of the $k$th user in expression (8) can be upper bounded by using the binomial theorem, as given in (27), where $\Psi_k = |\mathcal{B}_k| - L_k$ $\forall k \in \mathcal{K}$. In expression (27), $q_k$ is the mean blocking probability of each user $k$, and $(x, y)$ denotes the coordinates in the 2D plane. Thus, the success probability can be obtained by integrating with respect to the users' locations.
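A binomial expression of this kind can be evaluated as in the sketch below. It assumes that the links of a user are blocked independently with the same probability $q_k$ and that the user succeeds whenever at most $\Psi_k = |\mathcal{B}_k| - L_k$ links are blocked; this is an assumed reading for illustration rather than a reproduction of (27), and the function name is hypothetical.

```python
from math import comb


def success_probability(n_links, l_k, q):
    """Probability that at most Psi = n_links - l_k of n_links links are
    blocked, with independent per-link blocking probability q (assumed)."""
    psi = n_links - l_k
    return sum(
        comb(n_links, j) * q ** j * (1 - q) ** (n_links - j)
        for j in range(psi + 1)
    )


# Usage with |B_k| = 4 links and q = 0.5 (illustrative value only):
p_l1 = success_probability(4, 1, 0.5)  # tolerates up to 3 blocked links
p_l4 = success_probability(4, 4, 0.5)  # tolerates no blocked link
```

Under these assumptions, lowering $L_k$ enlarges the set of tolerated blockage patterns and therefore raises the success probability, in line with the outage trend observed for smaller $L$.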