Hybrid Beamforming for 5G and Beyond Millimeter-Wave Systems: A Holistic View

Millimeter-wave (mm-wave) communication is a key technology for future wireless networks. To combat significant path loss and exploit the abundant mm-wave spectrum, effective beamforming is crucial. Nevertheless, conventional fully digital beamforming techniques are inapplicable, as they demand a separate radio frequency (RF) chain for each antenna element, which is costly and consumes too much energy. Hybrid beamforming is a cost-effective alternative, which can significantly reduce the hardware cost and power consumption by employing a small number of RF chains. This paper presents a holistic view on hybrid beamforming for 5G and beyond mm-wave systems, based on a new taxonomy for different hardware structures. We take a pragmatic approach and compare different proposals from three key aspects: 1) hardware efficiency, i.e., the required hardware components; 2) computational efficiency of the associated beamforming algorithm; and 3) achievable spectral efficiency, a main performance indicator. Through systematic comparisons, the interplay and trade-off among these three design aspects are demonstrated, and promising candidates for hybrid beamforming in future wireless networks are identified.


I. INTRODUCTION
THE continued upsurge of mobile data and the eruption of diversified mobile applications are driving the demand for next-generation wireless networks, i.e., the fifth-generation (5G) networks. Compared with the current fourth-generation (4G) Long Term Evolution (LTE) networks [1], 5G needs to achieve orders-of-magnitude increases in the peak data rate, area spectral efficiency, and network energy efficiency, while supporting a roundtrip latency of about 1 ms [2]. Thus, disruptive technologies will be needed, and deploying 5G systems at millimeter wave (mm-wave) bands has been proposed due to the abundant spectrum. Thanks to the small wavelength of the mm-wave signals, large-scale antenna arrays can be deployed, and recent advances in massive multiple-input multiple-output (MIMO) [3] can be leveraged to provide beamforming gains to combat the increased path loss and synthesize highly directional beams to support mm-wave communications [4].
To deploy mm-wave systems with large-scale antenna arrays, challenges in hardware implementation and algorithm design need to be addressed. A large number of hardware components will be needed to support conventional digital beamforming, including signal mixers, analog-to-digital/digital-to-analog converters (ADCs/DACs), and power amplifiers [5]. This will put prohibitive burdens on cost and power consumption, especially for mobile terminals, and thus is not feasible. Furthermore, the significantly increased dimension of the beamformers brings stringent requirements on the computational efficiency of beamforming algorithms. These challenges have driven the recent efforts in developing hardware-efficient transceivers, supported with efficient beamforming algorithms. One initial proposal is analog beamforming, supported with a phase shifter network and low-complexity beam steering. It is currently the de facto approach for indoor mm-wave systems [6]. However, analog beamforming only supports single-stream transmission, and cannot fully exploit the available spatial resources. To further improve performance, hybrid beamforming has been proposed as a cost-effective approach to support spatial multiplexing with a limited number of radio frequency (RF) chains, whose potential has been demonstrated in many recent studies [7], [8]. In particular, compared with analog beamforming, hybrid beamforming supports multistream transmission with spatial multiplexing, as well as spatial division multiple access. It achieves spectral efficiency comparable to fully digital beamforming with much reduced hardware complexity. Therefore, it has been regarded as a promising candidate for transceiver structures in mm-wave systems.
The concept of hybrid beamforming can be traced back to the early 2000s [9], [10], where point-to-point single-stream transmission in sub-6 GHz systems was investigated as a special case. Almost a decade later, hybrid beamforming was revisited in mm-wave systems [11] and has drawn considerable attention from both academia and industry. By leveraging the sparsity of mm-wave channels, low-complexity algorithms were proposed for point-to-point hybrid beamforming [11], whose achievable spectral efficiency was further improved in [12]. Then, hybrid beamforming was extended to single-user multicarrier [13]-[15] and multiuser single-carrier systems [16]-[19]. The main challenge of hybrid beamforming design is to optimize the system performance under the hardware constraints, e.g., a reduced number of RF chains and the high-dimensional phase shifter-based analog beamformer. Various algorithms were developed to combat this difficulty, e.g., compressive sensing [11], [20], codebook-based design [14], [21], and manifold optimization [15], [22], which have offered effective design methodologies for hybrid beamforming.
Fig. 1: The BS is equipped with $N_t$ antennas to serve $K$ $N_r$-antenna users over $F$ subcarriers. In addition, $N_s$ data streams are transmitted to each user on each subcarrier. The available numbers of RF chains are $N_{\mathrm{RF}}^t$ and $N_{\mathrm{RF}}^r$ for the BS and each user, respectively.
Nevertheless, hybrid beamforming still faces several critical issues that may hinder its practical applicability. Compared with fully digital beamforming, hardware complexity has been significantly reduced, but it remains a concern, especially considering the cost and power consumption of mm-wave devices [23]. Thus, hybrid beamforming structures that are more hardware-efficient should be developed. In this respect, we can learn little from conventional digital beamforming design, which takes a performance-oriented perspective, e.g., to maximize spectral efficiency or minimize transmit power, but largely neglects hardware complexity. Furthermore, digital beamforming problems are typically convex, and powerful tools from convex optimization can be leveraged [24]. In contrast, hybrid beamforming problems are inherently non-convex and thus challenging to solve.
To address these design challenges for hybrid beamforming, a holistic approach should be taken. In particular, we need a comprehensive consideration that accounts for the following three decisive aspects: hardware efficiency, computational efficiency, and spectral efficiency. Accordingly, this paper presents key proposals of hybrid beamforming structures, emphasizing the following three desirable properties: 1) High hardware efficiency (HE), i.e., with as few hardware components as possible, which leads to low cost and low power consumption. 2) High spectral efficiency (SE), which should be close to that of the fully digital beamforming. 3) High computational efficiency (CE), i.e., the hybrid beamforming algorithm should be of low complexity.
A special emphasis is placed on the interplay between hardware structures and beamforming algorithm design. Answers to the following key questions will be revealed through the discussion:
• How many RF chains and phase shifters are needed?
• Can hybrid beamforming approach the performance of fully digital beamforming?
• How to effectively design hybrid beamforming algorithms?
Specifically, we first present the state-of-the-art hybrid beamforming structures, as well as their algorithm design. Limitations of these basic structures are identified. Then, we introduce two new analog network implementations, which greatly simplify algorithm design and reduce hardware complexity, respectively. Finally, we propose a flexible mapping strategy for hybrid beamforming, which helps to strike a good balance between the hardware complexity and spectral efficiency. The paper ends with key conclusions and some future research directions.
Notations: The following notations are used throughout this paper. $\jmath = \sqrt{-1}$ is the imaginary unit; $\mathbb{C}$ and $\mathbb{Z}$ denote the sets of complex numbers and integers; $\mathbf{a}$ and $\mathbf{A}$ stand for a column vector and a matrix, respectively; the $i$-th row, the $j$-th column, and the $(i, j)$-th entry of matrix $\mathbf{A}$ are denoted as $\mathbf{A}(i, :)$, $\mathbf{A}(:, j)$, and $\mathbf{A}(i, j)$, respectively; the conjugate, transpose, and conjugate transpose of $\mathbf{A}$ are represented by $\mathbf{A}^*$, $\mathbf{A}^T$, and $\mathbf{A}^H$; $\|\mathbf{a}\|_0$ stands for the $\ell_0$-norm of vector $\mathbf{a}$; $\mathrm{blkdiag}(\mathbf{A}_1, \cdots, \mathbf{A}_i)$ forms a block diagonal matrix using $\mathbf{A}_1, \cdots, \mathbf{A}_i$ as its diagonal blocks.

II. A PRIMER ON HYBRID BEAMFORMING
A hybrid beamforming transceiver is depicted in Fig. 1. We consider the downlink transmission of a multiuser mm-wave MIMO-OFDM (orthogonal frequency-division multiplexing) system. A base station (BS) leverages an $N_t$-size antenna array to serve $K$ users over $F$ subcarriers. The BS transmits $N_s$ data streams to each user on each subcarrier. The number of available RF chains at the BS is $N_{\mathrm{RF}}^t$, which is restricted as $KN_s \le N_{\mathrm{RF}}^t < N_t$. The hybrid beamformer consists of two components: a digital component and an analog component. The digital part is composed of RF chains, whose structure is common to the different proposals to be discussed. Similar to conventional fully digital beamforming, the digital component in hybrid beamforming can be designed for each user on each subcarrier, denoted as $\mathbf{F}_{\mathrm{BB}k,f} \in \mathbb{C}^{N_{\mathrm{RF}}^t \times N_s}$. However, this is not the case for the analog component, or the analog network, in hybrid beamforming. Since the transmitted signals for all the users are mixed together by the digital beamformers, and analog RF beamforming is a post-IFFT (inverse fast Fourier transform) operation, the analog network $\mathbf{F}_{\mathrm{RF}} \in \mathbb{C}^{N_t \times N_{\mathrm{RF}}^t}$ is a common component shared by all the users and subcarriers.
Furthermore, as will be revealed in this paper, the analog network is the key differentiating component in different hybrid beamforming structures. In particular, the structure of the analog network not only influences the hardware efficiency, but also has a significant impact on both the algorithmic design and achievable spectral efficiency. Hence, our discussion mainly focuses on the analog network. In this section, we first introduce key hardware components, and then introduce a new taxonomy for comparing different hybrid beamforming structures.

A. Key Hardware Components
Hardware efficiency is a key consideration when designing hybrid beamforming structures, and we compare different structures by the number of required key components. Note that, given the rapid advances in hardware and diversified choices, it is difficult to make a fair comparison for energy efficiency, which, nevertheless, will be largely determined by hardware efficiency. Therefore, we do not explicitly consider energy efficiency in this paper.
In the analog RF domain, key hardware components include power amplifiers, phase shifters, and switches. Power amplifiers, as basic components in conventional fully digital beamforming, are needed for each antenna element, and great attention has been paid to realizing low-power amplifiers in integrated circuit (IC) design. In contrast, phase shifters, originally utilized in military radar systems, are newly introduced hardware components in hybrid beamforming systems. Hardware suppliers are not yet ready to provide phase shifters for commercial use, and the cost of phase shifters is currently very high, e.g., it can be around a hundred US dollars even with low resolution (see http://www.analog.com/en/parametricsearch/10700#). This motivates alternative structures that replace phase shifters with other components or reduce the number of phase shifters. For example, Roi et al. [25] proposed to replace phase shifters with switches to reduce the hardware complexity. Other proposals will be discussed later in the paper.
As power amplifiers are necessary and cannot be easily replaced, the hardware efficiency of the analog network primarily depends on phase shifters and/or switches. As a matter of fact, switches entail only binary states and therefore outperform phase shifters in terms of implementation complexity, power consumption, and cost. However, restricting the analog network to on-off states will inevitably incur a loss in spectral efficiency. Later we will show how to combine phase shifters and switches to develop hardware-efficient hybrid beamforming structures with good spectral efficiency.

B. A Taxonomy of Hybrid Beamforming Structures
Hybrid beamforming structures differ mainly in the way they use the above-mentioned hardware components to compose the analog network. In particular, the analog network structure is primarily determined by two elements, i.e., the mapping strategy and hardware implementation, for which different proposals are listed in Tables I (a) and (b).
• The mapping strategy: It determines how the RF chains and antenna elements are connected. As shown in Table I (a), there are two basic mapping strategies, namely, the fully- and partially-connected mapping, which will be introduced in Section III. A more flexible mapping strategy, group-connected mapping, will be introduced in Section V. Table I (a) further shows the analog beamforming matrix associated with each mapping strategy, which bears a special structure that will affect the beamformer design.
• The hardware implementation: It specifies the adopted hardware components and the way each RF chain-antenna pair is connected. Among the three implementations shown in Table I (b), the single phase shifter (SPS) implementation is the most commonly used one, and the other two, the double phase shifter (DPS) and fixed phase shifter (FPS) implementations, were proposed recently and will be introduced in later sections. Different hardware implementations induce different constraints on $\mathbf{F}_{\mathrm{RF}}$, as shown in the table, which significantly affect the algorithm design.
As a common example, the SPS fully-connected structure refers to the fully-connected mapping strategy with a single phase shifter connecting each RF chain with a corresponding antenna.

III. BASIC HYBRID BEAMFORMING STRUCTURES
In this section, we present and compare two basic mapping strategies, namely, the fully- and partially-connected ones. As shown in Table I (a), in the fully-connected mapping strategy, every RF chain is connected to all the antenna elements, whereas in the partially-connected mapping strategy, each RF chain is connected to a subset of neighboring antenna elements, and these subsets do not overlap. For hardware implementation, we consider the classic SPS implementation, i.e., each connected RF chain-antenna pair is linked via a single phase shifter. Therefore, in terms of hardware efficiency, the SPS fully- and partially-connected structures employ $N_t N_{\mathrm{RF}}^t$ and $N_t$ phase shifters, respectively. Through the following comparison of these two basic structures, we shall illustrate their limitations and motivate other proposals.
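The phase shifter counts stated above can be tabulated with a minimal sketch. The function name and the example array sizes ($N_t = 144$, $N_{\mathrm{RF}}^t = 4$) are illustrative assumptions and are not taken from the paper.

```python
def sps_phase_shifter_count(n_t: int, n_rf: int, mapping: str) -> int:
    """Number of phase shifters needed by the SPS implementation.

    mapping: "fully"     -> every RF chain drives every antenna,
             "partially" -> each antenna is driven by exactly one RF chain.
    """
    if mapping == "fully":
        return n_t * n_rf
    if mapping == "partially":
        return n_t
    raise ValueError(f"unknown mapping strategy: {mapping!r}")


# Hypothetical array sizes, chosen only for illustration.
n_t, n_rf = 144, 4
counts = {m: sps_phase_shifter_count(n_t, n_rf, m) for m in ("fully", "partially")}
print(counts)
```

With these assumed sizes, the fully-connected mapping needs $N_{\mathrm{RF}}^t$ times as many phase shifters as the partially-connected one.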

A. Basic Principles of Hybrid Beamforming Algorithm Design
In this part, we present a basic formulation for hybrid beamforming design, accompanied with some design principles. A common design principle is to approximate the fully digital beamformer subject to the constraint for the analog beamforming matrix [11], [15]. The corresponding formulation, problem (1), minimizes the Frobenius-norm approximation error $\|\mathbf{F}_{\mathrm{opt}} - \mathbf{F}_{\mathrm{RF}}\mathbf{F}_{\mathrm{BB}}\|_F$ over $\mathbf{F}_{\mathrm{RF}}$ and $\mathbf{F}_{\mathrm{BB}}$ subject to two constraints, where the combined fully digital beamformer is denoted as $\mathbf{F}_{\mathrm{opt}} = [\mathbf{F}_{\mathrm{opt}1,1}, \cdots, \mathbf{F}_{\mathrm{opt}k,f}, \cdots, \mathbf{F}_{\mathrm{opt}K,F}] \in \mathbb{C}^{N_t \times KN_sF}$, and $\mathbf{F}_{\mathrm{BB}} = [\mathbf{F}_{\mathrm{BB}1,1}, \cdots, \mathbf{F}_{\mathrm{BB}k,f}, \cdots, \mathbf{F}_{\mathrm{BB}K,F}]$ is the concatenated digital beamformer with dimension $N_{\mathrm{RF}}^t \times KN_sF$. The first constraint in the formulation is the total transmit power constraint, and the second constraint, $\mathbf{F}_{\mathrm{RF}} \in \mathcal{A}$, depends on the adopted hardware implementation, as shown in Table I. The main merits of this formulation include its general applicability, i.e., it can be applied with any given digital beamformer, and the tractability for algorithm design, to be illustrated below.
TABLE I: Mapping strategies and hardware implementations for hybrid beamforming, with $\mathbf{F}_{\mathrm{RF}}$ as the analog beamforming matrix, and $\mathbf{f}_i$ and $\mathbf{F}_i$ denoting a column vector and a matrix, respectively. To realize a specific analog network structure, one may pick a mapping strategy from (a) to decide how to connect the RF chain-antenna pairs. Then, one should choose a hardware implementation from (b) to realize each RF chain-antenna pair.
(a) Mapping strategies for hybrid beamforming.
(b) Hardware implementations for hybrid beamforming.
Single Phase Shifter (SPS): one phase shifter for each RF chain-antenna pair.
Double Phase Shifter (DPS): two phase shifters for each RF chain-antenna pair.
Fixed Phase Shifter (FPS): $N_c$ fixed phase shifters shared by all RF chain-antenna pairs.
Critical Role of the Analog Network: In the hybrid beamforming design problem (1), A is the feasible set of the analog network, which is distinct for different hybrid beamforming structures. Before we proceed, we would like to emphasize the critical role of the analog network. As we discussed before, the analog network is shared by all the users and subcarriers, so a single analog beamforming matrix should match the channel states of different users on different subcarriers. This is an extremely difficult task, and it is not clear at all how close we can approach the performance of the fully digital beamforming with hybrid beamforming. With such a decisive role on achievable performance, the analog network calls for a delicate design. Moreover, different implementations of the analog network bring different constraints for the analog beamforming matrix, and thus determine the difficulty in beamforming algorithm design. Both of these aspects will be elaborated throughout the discussion in this paper.
As there are two components in a hybrid beamformer, i.e., an analog one and a digital one, alternating minimization (AltMin) serves as a basic design principle [15]. It alternately optimizes the analog and digital parts. It is apparent that the optimization of the digital beamforming matrix F BB is a least squares problem, which has a closed-form solution. On the other hand, with the SPS implementation, the main difficulty lies in the analog component, for which there is a non-convex unit modulus constraint. In particular, the feasible set A of the analog network F RF can be specified by a set of matrices where the amplitude of each non-zero element is forced to be 1, i.e., |F RF (i, j)| = 1 [11]. Design methodologies for the two basic structures are different, as presented in the following two subsections.
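The alternating structure described above can be sketched as follows. The digital step is the closed-form least-squares solution mentioned in the text. The analog step shown here is a simple element-wise phase projection, which is an assumption made purely for illustration and is not the manifold-optimization update of [15]. The function name `altmin_sps` and the matrix sizes are hypothetical, and power normalization is omitted.

```python
import numpy as np


def altmin_sps(F_opt, n_rf, n_iter=50, seed=0):
    """Alternating minimization sketch for ||F_opt - F_RF @ F_BB||_F.

    Digital step: exact least squares (closed form).
    Analog step:  element-wise phase projection of F_opt @ F_BB^H, a simple
                  heuristic assumed for illustration (NOT the MO-AltMin update).
    """
    rng = np.random.default_rng(seed)
    n_t = F_opt.shape[0]
    # Random unit-modulus initialization of the analog network.
    F_RF = np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, size=(n_t, n_rf)))
    for _ in range(n_iter):
        # Digital beamformer: least-squares solution for the current F_RF.
        F_BB = np.linalg.lstsq(F_RF, F_opt, rcond=None)[0]
        # Analog beamformer: keep only the phases (unit-modulus constraint).
        F_RF = np.exp(1j * np.angle(F_opt @ F_BB.conj().T))
    # Final digital update so that F_BB matches the returned F_RF.
    F_BB = np.linalg.lstsq(F_RF, F_opt, rcond=None)[0]
    return F_RF, F_BB


# Hypothetical example: 16 antennas, 2 streams, 4 RF chains.
rng = np.random.default_rng(1)
F_opt = rng.standard_normal((16, 2)) + 1j * rng.standard_normal((16, 2))
F_RF, F_BB = altmin_sps(F_opt, n_rf=4)
err = np.linalg.norm(F_opt - F_RF @ F_BB)
print(f"approximation error: {err:.4f}")
```

Because the last step is a least-squares fit, the returned error can never exceed $\|\mathbf{F}_{\mathrm{opt}}\|_F$; how close it gets to zero depends on the analog update rule, which is where the algorithms discussed next differ.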

B. SPS Fully-Connected Structure
Note that when $N_{\mathrm{RF}}^t \ge 2KN_s$, fully digital beamforming can be realized by the SPS fully-connected structure [10], [26], and this case is trivial in terms of algorithm design. For the general case when $N_{\mathrm{RF}}^t < 2KN_s$, the orthogonal matching pursuit (OMP) algorithm [11] is the most widely used algorithm, which treats the analog network design as a sparsity-constrained matrix reconstruction problem. In particular, the columns of the analog beamforming matrix $\mathbf{F}_{\mathrm{RF}}$ are selected from a candidate set, which typically consists of the array response vectors of mm-wave channels. This codebook-based design inevitably incurs some performance loss relative to fully digital beamforming. More recent work has focused on reducing the computational complexity of the OMP algorithm, e.g., by reusing the matrix inversion result in each iteration [20].
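A minimal sketch of the greedy column selection behind OMP-style hybrid beamforming is given below. The dictionary of uniform linear array (ULA) steering vectors with half-wavelength spacing, the angle grid, and the function names are assumptions made for illustration; the final power normalization is omitted.

```python
import numpy as np


def ula_dictionary(n_t, n_grid):
    """Unit-modulus ULA steering vectors (half-wavelength spacing) on an angle grid."""
    angles = np.linspace(-np.pi / 2, np.pi / 2, n_grid, endpoint=False)
    n = np.arange(n_t)[:, None]
    return np.exp(1j * np.pi * n * np.sin(angles)[None, :])


def omp_hybrid(F_opt, A, n_rf):
    """Greedily pick n_rf dictionary columns of A to approximate F_opt."""
    residual = F_opt.copy()
    chosen = []
    for _ in range(n_rf):
        # Correlate every candidate column with the current residual.
        corr = np.linalg.norm(A.conj().T @ residual, axis=1)
        if chosen:
            corr[chosen] = -np.inf  # never pick the same column twice
        chosen.append(int(np.argmax(corr)))
        F_RF = A[:, chosen]
        # Least-squares digital beamformer for the selected columns.
        F_BB = np.linalg.lstsq(F_RF, F_opt, rcond=None)[0]
        residual = F_opt - F_RF @ F_BB
    return F_RF, F_BB, chosen


# Hypothetical example: 16 antennas, 64 candidate beams, 4 RF chains.
rng = np.random.default_rng(0)
A = ula_dictionary(16, 64)
F_opt = rng.standard_normal((16, 2)) + 1j * rng.standard_normal((16, 2))
F_RF, F_BB, chosen = omp_hybrid(F_opt, A, n_rf=4)
err = np.linalg.norm(F_opt - F_RF @ F_BB)
print(chosen, round(float(err), 4))
```

Since every selected column is a steering vector, the unit-modulus constraint is satisfied by construction, which is also why the achievable accuracy is limited by the candidate set.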
In [15], by recognizing that the unit modulus constraints of the analog network define a complex circle Riemannian manifold, a manifold optimization based AltMin (MO-AltMin) algorithm was proposed, which outperforms the OMP algorithm but with increased complexity. In particular, by defining key elements, e.g., inner products and gradients, in the neighborhood of a manifold that is homeomorphic to the Euclidean space, a variety of classic optimization algorithms in the Euclidean space can be transplanted to manifold optimization. For instance, the conjugate gradient method in the Euclidean space was adopted on the complex circle manifold for hybrid beamforming in [12], [15].
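To make the manifold viewpoint concrete, the sketch below performs a single Riemannian gradient step on the complex circle manifold with a backtracking step size. It uses plain gradient descent rather than the conjugate gradient method adopted in [12], [15], so it should be read as an illustrative simplification; the function names and sizes are hypothetical.

```python
import numpy as np


def cost(F_RF, F_BB, F_opt):
    """Squared Frobenius-norm approximation error."""
    return np.linalg.norm(F_opt - F_RF @ F_BB) ** 2


def riemannian_step(F_RF, F_BB, F_opt, step=1.0, max_halvings=30):
    """One gradient step on the manifold |F_RF(i, j)| = 1 with backtracking."""
    # Euclidean gradient with respect to conj(F_RF), up to a constant factor.
    egrad = -(F_opt - F_RF @ F_BB) @ F_BB.conj().T
    # Tangent-space projection: remove the radial component entry by entry.
    rgrad = egrad - np.real(egrad * F_RF.conj()) * F_RF
    f0 = cost(F_RF, F_BB, F_opt)
    for _ in range(max_halvings):
        candidate = F_RF - step * rgrad
        candidate = candidate / np.abs(candidate)  # retraction to unit modulus
        if cost(candidate, F_BB, F_opt) <= f0:
            return candidate
        step *= 0.5
    return F_RF  # no improving step found


# Hypothetical example: 8 antennas, 2 streams, 3 RF chains.
rng = np.random.default_rng(0)
F_opt = rng.standard_normal((8, 2)) + 1j * rng.standard_normal((8, 2))
F_RF0 = np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, size=(8, 3)))
F_BB0 = np.linalg.lstsq(F_RF0, F_opt, rcond=None)[0]
F_RF1 = riemannian_step(F_RF0, F_BB0, F_opt)
cost_before = cost(F_RF0, F_BB0, F_opt)
cost_after = cost(F_RF1, F_BB0, F_opt)
print(cost_before, cost_after)
```

The retraction is safe here because the update direction is tangent to the unit circle at each entry, so the candidate never has a zero entry before normalization.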
As introduced above, the OMP algorithm updates one column of the analog beamforming matrix $\mathbf{F}_{\mathrm{RF}}$ at a time, while the MO-AltMin algorithm optimizes the whole $\mathbf{F}_{\mathrm{RF}}$ matrix in each iteration. At the other extreme, the phase shifters are optimized one by one in [27]. In particular, the contribution of each phase shifter to the spectral efficiency was analytically identified, based on which the analog network was iteratively optimized in a phase shifter-by-phase shifter fashion. This approach also suffers from high complexity, since the number of iterations of the algorithm is proportional to the number of phase shifters in use, which is typically huge ($N_t N_{\mathrm{RF}}^t$) in mm-wave MIMO systems with the SPS fully-connected structure.

C. SPS Partially-Connected Structure
While most initial efforts on hybrid beamformer design were on the SPS fully-connected structure, the SPS partially-connected one has attracted more recent attention due to its low hardware complexity. In the analog RF domain, the hardware complexity of the SPS partially-connected structure is the same as that of analog beamforming, as the number of phase shifters in both cases equals the number of antennas. In [29], [30], codebook-based design of hybrid beamformers was presented for narrowband and OFDM systems, respectively. Although the codebook-based design enjoys a low complexity, it incurs a certain performance loss, and it is not clear how much further performance gain can be obtained. Another proposal is based on the concept of successive interference cancellation (SIC) [28]. It decomposes the total achievable rate optimization problem into a series of simple sub-rate optimization problems, each of which only considers the antenna elements connected to one RF chain. However, this approach requires the digital beamforming matrix to be diagonal and the number of RF chains to be equal to the number of data streams.
More recently, a semidefinite relaxation based AltMin (SDR-AltMin) algorithm was proposed in [15]. This algorithm effectively designs the hybrid beamformer by offering globally optimal solutions for both subproblems of analog and digital beamformers in each alternating iteration, and thus achieves very good performance. In particular, the hybrid beamformer design problem is decoupled for each RF chain and its connected antenna elements. In this way, each subproblem is reformulated as a non-convex quadratically constrained quadratic programming (QCQP) problem, to which the SDR approach was applied, and the tightness of such an SDR is proved in [15].
The achievable spectral efficiency of existing representative hybrid beamforming algorithms is compared in Fig. 2. As can be observed, the MO-AltMin algorithm achieves the highest spectral efficiency for the SPS fully-connected structure while the SDR-AltMin algorithm outperforms other benchmarks for the SPS partially-connected structure. Furthermore, the numbers of hardware components and computational complexity of corresponding design algorithms for hybrid beamforming with the SPS implementation are summarized in Table II. In particular, the partially-connected mapping strategy entails a lower computational complexity thanks to its simpler hardware implementation.

D. Limitations of Basic Structures
TABLE II: Hardware components and computational complexity of hybrid beamforming algorithms with the SPS implementation. Recovered entry: MO-AltMin [15], extremely high computational complexity. † $L$ is the total number of paths in the channels. * A higher rating means higher spectral efficiency, and $N_{\mathrm{iter}}$ denotes the number of iterations involved in the algorithm. The computational complexity is evaluated for single-user single-carrier systems for fair comparison. As there are nested iterations in the MO-AltMin algorithm, its computational complexity is much higher than those of the other algorithms.
We compare the spectral efficiency of the two basic structures in Fig. 3. It shows a clear performance gap between the two structures, with the fully-connected structure providing much higher spectral efficiency than the partially-connected one. Furthermore, the comparison between the MO-AltMin and OMP algorithms demonstrates the importance of efficient algorithms to reach realistic conclusions. In particular, with the MO-AltMin algorithm, the fully-connected structure is shown to approach the performance of the fully digital one with the number of RF chains comparable to the number of data streams, while the OMP algorithm fails to achieve this. These observations demonstrate that the limited number of RF chains in hybrid beamforming is not a performance bottleneck, whereas the analog network structure has a decisive effect. The above comparison reveals several key limitations of the two basic structures.
• Algorithmic perspective: While the SPS fully-connected structure with the MO-AltMin algorithm approaches the performance of the fully digital beamforming, its computational complexity is extremely high [15]. It is not clear how close we can approach fully digital beamforming with more practical algorithms for this structure.
• Hardware perspective: The SPS fully-connected structure has the potential to perform closely to the fully digital one, but still with high hardware complexity in the analog network. The SPS partially-connected structure significantly reduces the number of phase shifters, but with much degraded performance.
Therefore, key innovations in both the hardware and algorithmic aspects are needed before we see the commercial success of hybrid beamforming. From the above discussion, we have already observed that the analog network structure greatly affects the algorithm design. So the key challenge is to design the analog network to reduce hardware complexity, as well as enabling low-complexity beamforming algorithms, which will be addressed in Section IV.
Inevitably, trade-offs need to be made among hardware efficiency, computational efficiency, and spectral efficiency. The two basic structures provide such a trade-off, but in an extreme way. The fully-connected mapping strategy has too high a hardware complexity, as well as too high an algorithmic complexity when the MO-AltMin algorithm is used, while the partially-connected mapping strategy incurs too much performance degradation. It is thus of practical importance to develop new structures that can achieve more flexible trade-offs. To address this aspect, a flexible mapping strategy will be presented in Section V.

IV. TWO NEW ANALOG NETWORK IMPLEMENTATIONS
In this section, we introduce two recent proposals for the analog network implementation, which improve upon the SPS implementation in different aspects. The first proposal, namely the double phase shifter (DPS) implementation, simplifies the algorithm design and improves spectral efficiency, at the cost of more phase shifters. One byproduct of the investigation of this implementation is a convex relaxation approach to develop highly efficient beamforming algorithms. The second proposal, called the fixed phase shifter (FPS) implementation, only requires a small number of fixed phase shifters, supplemented with switches, and thus it improves hardware efficiency. As will be shown later, it also does well in computational efficiency and spectral efficiency.

A. Double Phase Shifter (DPS) Implementation
In this part, we present a new hardware implementation to enable efficient hybrid beamforming algorithms. For the SPS implementation, the unit modulus constraint for the analog network forms the main challenge for algorithm design. The principal obstacle is that we can only adjust the phase but not the amplitude of the RF signals.
To overcome this constraint, the DPS implementation employs two sets of phase shifters, as shown in Fig. 4. Thus, there are $2N_t N_{\mathrm{RF}}^t$ and $2N_t$ phase shifters for the DPS fully- and partially-connected structures, respectively. For each connection from an RF chain to one of its connected antenna elements, one unique phase shifter from each group is selected, and the outputs of the two are summed to compose the analog beamforming gain. In this way, each non-zero element in the analog network corresponds to a sum of the outputs of two phase shifters. Correspondingly, the feasible set $\mathcal{A}$ in (1) is specified by a set of matrices whose non-zero entries have amplitudes no greater than 2, i.e., $|\mathbf{F}_{\mathrm{RF}}(i, j)| = |e^{\jmath\phi} + e^{\jmath\theta}| \le 2$, where $\phi$ and $\theta$ are the two phase shifts, one from each group. Thus, the new constraints of the analog beamforming matrix become convex, which makes beamforming algorithm design more tractable. This new implementation fundamentally changes the algorithm design, and computationally efficient beamforming algorithms have been developed for both the fully- and partially-connected mapping strategies [31].
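The claim that two unit-modulus phase shifters can realize any gain with amplitude at most 2 follows from the identity $e^{\jmath(\alpha+\beta)} + e^{\jmath(\alpha-\beta)} = 2\cos\beta \, e^{\jmath\alpha}$. The short sketch below computes the two phases for given target gains; the function name and the test values are illustrative assumptions.

```python
import numpy as np


def two_phase_shifters(a):
    """Return phases (phi, theta) with exp(1j*phi) + exp(1j*theta) == a, for |a| <= 2."""
    a = np.asarray(a, dtype=complex)
    mag = np.abs(a)
    if np.any(mag > 2.0 + 1e-12):
        raise ValueError("each entry must satisfy |a| <= 2")
    # Half-angle between the two unit phasors; clip guards against round-off.
    half = np.arccos(np.clip(mag / 2.0, 0.0, 1.0))
    base = np.angle(a)
    return base + half, base - half


# Hypothetical target gains, including the corner cases |a| = 0 and |a| = 2.
a = np.array([1.2 - 0.5j, 0.0 + 0.0j, 2.0j, -0.3 + 0.1j])
phi, theta = two_phase_shifters(a)
recon = np.exp(1j * phi) + np.exp(1j * theta)
print(np.max(np.abs(recon - a)))
```

In other words, the pair of phase shifters controls both the phase (through the common angle) and the amplitude (through the angular separation) of each analog entry.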

1) Fully-Connected Mapping
For the fully-connected mapping, the hybrid beamforming problem can be specified as problem (2), which is problem (1) with the feasible set $\mathcal{A}$ given by the DPS constraint $|\mathbf{F}_{\mathrm{RF}}(i, j)| \le 2$. It is proved in [31] that the two constraints in (2) are redundant, and the remaining problem turns out to be a low-rank matrix approximation problem, which has been well studied and has a closed-form solution.
It has been shown that fully digital beamforming can be achieved when $N_{\mathrm{RF}}^t \ge 2KN_s$ with the SPS fully-connected structure [26]. In other words, $2KN_s$ RF chains and $2KN_sN_t$ phase shifters are enough for achieving fully digital beamforming in single-carrier systems. In contrast, the formulation (2) of the DPS fully-connected structure reveals its optimality in single-carrier systems.
Lemma 1: For single-carrier systems, with the DPS implementation, a fully digital beamformer $\mathbf{F}_{\mathrm{opt}}$ can be perfectly decomposed into $\mathbf{F}_{\mathrm{RF}}$ and $\mathbf{F}_{\mathrm{BB}}$ using the minimum number of RF chains, i.e., $N_{\mathrm{RF}}^t = KN_s$ and $N_{\mathrm{RF}}^r = N_s$.
Proof: The proof follows directly from the rank sufficiency of $\mathbf{F}_{\mathrm{RF}}$ and $\mathbf{F}_{\mathrm{BB}}$ in the decomposition when $F = 1$.
This lemma means that KN s RF chains and 2KN s N t phase shifters are enough for achieving fully digital beamforming, which reduces the required number of RF chains by half compared to the state-of-the-art with the SPS implementation. This phenomenon clearly demonstrates the superiority of doubling the phase shifters in the analog network for hybrid beamforming.
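The exact decomposition stated in Lemma 1 can be checked numerically with a minimal sketch. It assumes one particular decomposition, namely the thin SVD with $\mathbf{F}_{\mathrm{RF}} = \mathbf{U}$ and $\mathbf{F}_{\mathrm{BB}} = \mathbf{S}\mathbf{V}^H$, and then realizes every analog entry with two phase shifters. The array sizes and variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_t, k, n_s = 32, 2, 2            # hypothetical sizes; single carrier (F = 1)
n_rf = k * n_s                    # minimum number of RF chains
F_opt = rng.standard_normal((n_t, k * n_s)) + 1j * rng.standard_normal((n_t, k * n_s))

# Thin SVD: F_opt = U diag(s) Vh, with U of size n_t x (K * N_s).
U, s, Vh = np.linalg.svd(F_opt, full_matrices=False)
F_RF = U                          # orthonormal columns, so |F_RF(i, j)| <= 1 <= 2
F_BB = np.diag(s) @ Vh

# Realize each analog entry as the sum of two unit-modulus phase shifters.
half = np.arccos(np.clip(np.abs(F_RF) / 2.0, 0.0, 1.0))
phi, theta = np.angle(F_RF) + half, np.angle(F_RF) - half
F_RF_dps = np.exp(1j * phi) + np.exp(1j * theta)

error = np.linalg.norm(F_opt - F_RF_dps @ F_BB)
print(f"decomposition error with {n_rf} RF chains: {error:.2e}")
```

The error is at the level of floating-point round-off, since the rank of $\mathbf{F}_{\mathrm{opt}}$ does not exceed $KN_s$ in the single-carrier case.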
When it comes to multiuser multicarrier systems, typically $KN_sF \ge N_t$, so the rank of $\mathbf{F}_{\mathrm{opt}}$ is $N_t$ (instead of $KN_s$ as in single-carrier systems), and thus perfect decomposition can only be achieved when $N_{\mathrm{RF}}^t \ge N_t$, which, however, severely deviates from the setting of hybrid beamforming. Hence, the matrix decomposition cannot be perfect for hybrid beamformer design due to the rank deficiency, i.e., $N_{\mathrm{RF}}^t = \mathrm{rank}(\mathbf{F}_{\mathrm{RF}}\mathbf{F}_{\mathrm{BB}}) < \mathrm{rank}(\mathbf{F}_{\mathrm{opt}}) = N_t$. Therefore, problem (2) is typically a low-rank matrix approximation problem, with the closed-form solution $\mathbf{F}_{\mathrm{RF}}\mathbf{F}_{\mathrm{BB}} = \mathbf{U}_1\mathbf{S}_1\mathbf{V}_1^H$. Here, the singular value decomposition (SVD) of $\mathbf{F}_{\mathrm{opt}}$ is denoted as $\mathbf{F}_{\mathrm{opt}} = \mathbf{U}\mathbf{S}\mathbf{V}^H$, matrices $\mathbf{U}_1$ and $\mathbf{V}_1$ are the first $N_{\mathrm{RF}}^t$ columns of $\mathbf{U}$ and $\mathbf{V}$, respectively, and $\mathbf{S}_1$ is the diagonal matrix whose diagonal elements are the $N_{\mathrm{RF}}^t$ largest singular values of $\mathbf{F}_{\mathrm{opt}}$. This means that the optimal solution of $\mathbf{F}_{\mathrm{RF}}\mathbf{F}_{\mathrm{BB}}$ is simply obtained by extracting the $N_{\mathrm{RF}}^t$ principal components of $\mathbf{F}_{\mathrm{opt}}$.
Convex relaxation for efficient hybrid beamforming: In addition, inspired by the beamformer design of the DPS fully-connected structure, a convex relaxation approach for the hybrid beamformer design with the SPS fully-connected structure has been developed [31]. Assume that the optimal solution to the low-rank approximation problem (2) is $\tilde{\mathbf{F}}_{\mathrm{opt}}$. We propose to extract the phases of the optimal analog network for the DPS implementation to construct the SPS solution, i.e., the SPS analog beamformer is obtained by applying $e^{\jmath\angle(\cdot)}$ to the DPS analog beamformer, where $\angle$ extracts the angles of a complex matrix element-wise. Note that the unitary matrix $\mathbf{U}_1$ fully extracts the information of the column space of $\tilde{\mathbf{F}}_{\mathrm{opt}}$, whose basis is formed by the orthonormal columns in $\mathbf{F}_{\mathrm{RF}}$. This approach only requires an SVD operation, which leads to a low-complexity beamforming algorithm by extracting phases from the DPS solution.
Fig. 5 shows the spectral efficiency achieved by the DPS fully-connected structure, and that of the SPS fully-connected structure with different algorithms.
It shows that the DPS implementation outperforms the SPS implementation, and can achieve a near-optimal performance in terms of spectral efficiency, thanks to the doubling of the phase shifters. In addition, the SPS implementation with the convex relaxation algorithm outperforms the state-of-the-art algorithm in [27], while enjoying much lower computational complexity, which demonstrates the effectiveness of the proposed approach.
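The two steps described above can be sketched numerically. The following is a minimal illustration, assuming NumPy and a random stand-in for $F_{\mathrm{opt}}$; the function names and dimensions are hypothetical and not taken from [31]. Taking the element-wise phases of $U_1$ as the SPS analog beamformer is our reading of the phase-extraction step, and the least-squares baseband update is our own assumption, since the text does not state how $F_{\mathrm{BB}}$ is recomputed afterwards.

```python
import numpy as np

def dps_low_rank(F_opt, n_rf):
    """Truncated SVD: factors of the best rank-n_rf approximation of F_opt."""
    U, s, Vh = np.linalg.svd(F_opt, full_matrices=False)
    return U[:, :n_rf], np.diag(s[:n_rf]), Vh[:n_rf, :]

def sps_from_dps(F_opt, n_rf):
    """Phase extraction: keep only the phases of U_1 (unit-modulus entries)."""
    U1, _, _ = dps_low_rank(F_opt, n_rf)
    F_rf = np.exp(1j * np.angle(U1))
    # Assumption: the baseband beamformer is refit by least squares.
    F_bb, *_ = np.linalg.lstsq(F_rf, F_opt, rcond=None)
    return F_rf, F_bb

# Illustrative sizes (assumptions): 16 antennas, 4 RF chains, 32 stacked columns.
rng = np.random.default_rng(0)
n_t, n_rf, n_cols = 16, 4, 32
F_opt = (rng.standard_normal((n_t, n_cols))
         + 1j * rng.standard_normal((n_t, n_cols))) / np.sqrt(2)

U1, S1, V1h = dps_low_rank(F_opt, n_rf)
F_dps = U1 @ S1 @ V1h                      # DPS solution F_RF F_BB
F_rf, F_bb = sps_from_dps(F_opt, n_rf)     # SPS solution

err_dps = np.linalg.norm(F_opt - F_dps)
err_sps = np.linalg.norm(F_opt - F_rf @ F_bb)
print(f"DPS approximation error: {err_dps:.3f}")
print(f"SPS approximation error: {err_sps:.3f}")
```

Because $F_{\mathrm{RF}}F_{\mathrm{BB}}$ has rank at most $N_{\mathrm{RF}}^t$ in both cases, the truncated-SVD error is a lower bound on the SPS error in this sketch.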

2) Partially-Connected Mapping
On the other hand, similar to the SPS partially-connected structure, the hybrid beamforming design with the DPS partially-connected mapping can be decoupled in an RF chain-by-RF chain manner. The optimization of the hybrid beamformer for the $j$-th RF chain is given by
$$\mathcal{P}_j:\ \underset{\{a_i\},\,x_j}{\text{minimize}}\ \sum_{i} \left\| F_{\mathrm{opt}}(i,:) - a_i x_j^T \right\|_2^2,$$
where the sum runs over the antennas $i$ connected to the $j$-th RF chain, $a_i$ is the non-zero element in $F_{\mathrm{RF}}(i,:)$, and $x_j = F_{\mathrm{BB}}^T(j,:)$. It is shown in [31] that $\mathcal{P}_j$ is an eigenvalue problem. Thus, the DPS implementation brings great advantages in computational efficiency, since closed-form solutions are available.
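To illustrate why the per-RF-chain problem reduces to an eigenvalue problem, the sketch below fits each group of rows of $F_{\mathrm{opt}}$ with a rank-one term, using the dominant eigenvector of the group's Gram matrix. It assumes NumPy, a random stand-in for $F_{\mathrm{opt}}$, and a fixed mapping of neighboring antennas; the helper name `rank1_per_chain` is hypothetical, and the code is an illustration of the idea rather than the algorithm of [31].

```python
import numpy as np

def rank1_per_chain(F_opt, groups):
    """For each RF chain j, fit F_opt[groups[j], :] by a rank-one term a x^T."""
    n_t, n_cols = F_opt.shape
    n_rf = len(groups)
    F_rf = np.zeros((n_t, n_rf), dtype=complex)
    F_bb = np.zeros((n_rf, n_cols), dtype=complex)
    for j, idx in enumerate(groups):
        F_j = F_opt[idx, :]
        # eigh returns eigenvalues in ascending order; take the dominant eigenvector.
        _, eigvecs = np.linalg.eigh(F_j @ F_j.conj().T)
        u = eigvecs[:, -1]
        # Entries of the unit-norm u have modulus at most 1, within the range
        # [0, 2] that the sum of two unit-modulus phase shifters can realize.
        F_rf[idx, j] = u
        F_bb[j, :] = u.conj() @ F_j
    return F_rf, F_bb

# Illustrative sizes (assumptions): 12 antennas, 3 RF chains, 4 antennas per chain.
rng = np.random.default_rng(1)
n_t, n_rf, n_cols = 12, 3, 6
F_opt = rng.standard_normal((n_t, n_cols)) + 1j * rng.standard_normal((n_t, n_cols))
groups = [list(range(4 * j, 4 * (j + 1))) for j in range(n_rf)]

F_rf, F_bb = rank1_per_chain(F_opt, groups)
residual_sq = np.linalg.norm(F_opt - F_rf @ F_bb) ** 2
print(f"squared approximation error: {residual_sq:.3f}")
```

For each group, the squared residual equals the energy of the group's rows minus the largest eigenvalue of its Gram matrix, which is why larger dominant eigenvalues mean a better fit.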
The DPS partially-connected structure employs $2N_t$ phase shifters, which falls between the numbers of phase shifters used by the SPS partially-connected structure ($N_t$) and the DPS fully-connected one ($2N_{\mathrm{RF}}^tN_t$). To further boost the spectral efficiency with $2N_t$ phase shifters, a dynamic mapping for the DPS partially-connected structure was proposed in [31]. In particular, each RF chain is still connected to a subset of antenna elements, but not necessarily neighboring ones. In other words, each RF chain can select which antenna elements to connect to in order to increase the spectral efficiency. For dynamic mapping, the feasible set $\mathcal{A}$ in (1) can be specified as the set of matrices for which every row has exactly one nonzero entry, i.e., $\mathcal{A} = \{A : \|A(i,:)\|_0 = 1\}$, and the dynamic mapping design problem is formulated as [31]
$$\underset{\{\mathcal{D}_j\}}{\text{maximize}}\ \sum_{j=1}^{N_{\mathrm{RF}}^t} \lambda_1\!\left(F_{\mathrm{opt}}(\mathcal{D}_j,:)\,F_{\mathrm{opt}}^H(\mathcal{D}_j,:)\right) \quad \text{subject to}\ \bigcup_{j}\mathcal{D}_j = \{1,\ldots,N_t\},\ \mathcal{D}_j \cap \mathcal{D}_k = \emptyset\ (j \ne k),$$
where $\mathcal{D}_j$ is the mapping set containing the antenna indices that are mapped to the $j$-th RF chain, and $\lambda_1(\cdot)$ denotes the largest eigenvalue of a matrix. The design problem is combinatorial, so finding the optimal solution requires an exhaustive search over an extremely large number of possible mapping strategies, which prevents its practical implementation. Therefore, a greedy algorithm and a modified K-means algorithm were proposed in [31].
Fig. 6 shows the performance of different design approaches in the DPS partially-connected structure with the minimum number of RF chains, i.e., $N_{\mathrm{RF}}^t = KN_s$. We see that, due to the sharply reduced number of phase shifters, the partially-connected structure does entail non-negligible performance loss compared to the fully digital one. Furthermore, simply doubling the number of phase shifters with the fixed mapping yields only a small performance gain over the conventional SPS implementation [15]. Fig. 6 also demonstrates that dynamic mapping is able to halve the gap between the fixed mapping and fully digital beamforming.
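To make the combinatorial nature of the dynamic mapping problem concrete, the sketch below evaluates the mapping objective and solves a toy instance by exhaustive search. It assumes NumPy, a random stand-in for $F_{\mathrm{opt}}$, and that the objective is the sum over RF chains of the largest eigenvalue of $F_{\mathrm{opt}}(\mathcal{D}_j,:)F_{\mathrm{opt}}^H(\mathcal{D}_j,:)$; it also assumes every RF chain must serve at least one antenna. It is not the greedy or modified K-means algorithm of [31], and the function names are hypothetical.

```python
import itertools
import numpy as np

def mapping_objective(F_opt, groups):
    """Sum over RF chains of the largest eigenvalue of the group's Gram matrix."""
    total = 0.0
    for idx in groups:
        F_j = F_opt[idx, :]
        total += np.linalg.eigvalsh(F_j @ F_j.conj().T)[-1]
    return total

def exhaustive_mapping(F_opt, n_rf):
    """Try every antenna-to-chain assignment; only feasible for toy sizes."""
    n_t = F_opt.shape[0]
    best_val, best_groups = -np.inf, None
    for labels in itertools.product(range(n_rf), repeat=n_t):
        groups = [[i for i, lab in enumerate(labels) if lab == j]
                  for j in range(n_rf)]
        if any(len(g) == 0 for g in groups):
            continue  # assumption: each RF chain drives at least one antenna
        val = mapping_objective(F_opt, groups)
        if val > best_val:
            best_val, best_groups = val, groups
    return best_groups, best_val

# Toy sizes (assumptions): 6 antennas and 2 RF chains give 2**6 = 64 assignments.
rng = np.random.default_rng(3)
n_t, n_rf, n_cols = 6, 2, 4
F_opt = rng.standard_normal((n_t, n_cols)) + 1j * rng.standard_normal((n_t, n_cols))

fixed_groups = [[0, 1, 2], [3, 4, 5]]      # neighboring antennas per chain
fixed_val = mapping_objective(F_opt, fixed_groups)
best_groups, best_val = exhaustive_mapping(F_opt, n_rf)
print("fixed mapping objective:  ", round(float(fixed_val), 3))
print("dynamic mapping objective:", round(float(best_val), 3), best_groups)
```

The number of candidate assignments grows as $(N_{\mathrm{RF}}^t)^{N_t}$, which is why exhaustive search is impractical at realistic array sizes and low-complexity heuristics are needed.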
Considering the increased number of phase shifters, the DPS implementation may not be practical for deployment before low-cost low-power phase shifters are available, but it does provide valuable guidelines to design other hybrid beamforming structures.
1) With computationally efficient and optimal beamforming algorithms, the DPS fully-connected structure can serve as a performance upper bound for structures with higher hardware efficiency. It is a tighter upper bound than fully digital beamforming, especially when the number of RF chains is small.
2) The computationally efficient algorithm for the DPS fully-connected structure has inspired a highly effective algorithm for the SPS fully-connected structure, which enjoys low computational complexity and outperforms existing algorithms.
3) The algorithmic and performance advantages of the DPS implementation are achieved by passing the same signal through more than one phase shifter, which can inspire similar proposals for improvement, as will be discussed in the next subsection.
4) As the beamforming problem becomes a low-rank matrix approximation (eigenvalue) problem for the DPS fully-connected (partially-connected) structure, theoretical analysis, which is intractable for other structures, becomes possible. This will help to better understand hybrid beamforming systems.

B. Fixed Phase Shifter (FPS) Implementation
The key weakness of the DPS implementation is the low hardware efficiency. Nevertheless, as discussed above, we can draw valuable lessons for further improvement. The key idea of DPS is to pass the signal out of each RF chain through more than one phase shifter. Specifically, this will help to overcome the non-convex unit modulus constraint for the analog network, and thus significantly simplifies algorithm design. At the same time, it will provide capability to change the amplitudes of elements of the analog beamforming matrix, which helps to improve the spectral efficiency.
Inspired by these insights, a novel analog network implementation, namely the FPS implementation, has been proposed in [32], which allows each signal to pass through multiple phase shifters. A key difference compared with previous proposals is that only a small number of phase shifters, with quantized and fixed phases, are employed. While existing works on hybrid beamforming commonly assume a large number of phase shifters with unquantized phases, in practice the phase shifters should be coarsely quantized, and their number should be kept to a minimum due to cost and power considerations. Thus, the FPS implementation is very promising for practical systems.

With a small number of fixed phase shifters, the beamformer has limited capability to adapt to the channel states, which will inevitably entail performance loss. To overcome this drawback, a dynamic switch network is cascaded after the fixed phase shifters, as shown in Fig. 7. In particular, a total of $N_c$ multichannel ($N_{\mathrm{RF}}^t$-channel) fixed phase shifters are employed, each of which simultaneously processes the output signals from the $N_{\mathrm{RF}}^t$ RF chains in parallel. In this way, these $N_c$ phase shifters generate $N_c$ signals with different phases for the signal of each RF chain. Inspired by the DPS implementation, a subset of these $N_c$ signals is selected and combined to compose the analog beamforming gain from the RF chain to the antenna. As $N_c$ adaptive switches are needed for each RF chain-antenna pair, in total $N_tN_{\mathrm{RF}}^tN_c$ switches are needed for the FPS implementation. The switch network provides a dynamic connection from phase shifters to antennas, which adapts to the channel states. Equipped with a small number of fixed phase shifters and assisted by low-complexity switches, the FPS implementation enjoys hardware complexity comparable to or even lower than that of analog beamforming, which needs $N_t$ phase shifters with adaptive phases.
For beamforming algorithm design, unlike other implementations, designing the analog network of the FPS implementation essentially amounts to determining the states of the switches, which are represented by binary variables. The problem is formulated as
$$\underset{S,\,F_{\mathrm{BB}}}{\text{minimize}}\ \left\| F_{\mathrm{opt}} - S C F_{\mathrm{BB}} \right\|_F^2 \quad \text{subject to}\ S \in \{0,1\}^{N_t \times N_cN_{\mathrm{RF}}^t},$$
where the switch matrix $S$ is a binary matrix. The matrix $C$ stands for the phase shift operation carried out by the available fixed phase shifters, given by the block diagonal matrix
$$C = \mathrm{blkdiag}(\underbrace{c, c, \ldots, c}_{N_{\mathrm{RF}}^t}),$$
where $c = \frac{1}{\sqrt{N_c}}\left[e^{j\theta_1}, e^{j\theta_2}, \cdots, e^{j\theta_{N_c}}\right]^T$ is the normalized phase shifter vector containing all $N_c$ fixed phases $\{\theta_i\}_{i=1}^{N_c}$. Note that although there are $N_cN_{\mathrm{RF}}^t$ non-zero parameters in $C$, only $N_c$ phase shifters are required, since each phase shifter has $N_{\mathrm{RF}}^t$ parallel channels and is shared by all RF chain-antenna pairs. To solve this problem, an efficient AltMin algorithm was proposed in [32]. A tight upper bound of the objective function was first derived, based on which closed-form solutions for both the dynamic switch network and the digital baseband beamformer were obtained. Note that we may also develop an FPS partially-connected structure to reduce the number of switches, but it has been found to incur significant performance loss. We will explore a more effective approach to achieving hardware-performance trade-offs in Section V.
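The structure of the FPS analog network can be illustrated with a short sketch that builds the phase matrix $C$ from the vector $c$ and forms the analog beamformer as the product of a binary switch matrix and $C$. It assumes NumPy; the uniformly spaced phases, the random switch states, and the name `fps_phase_matrix` are illustrative assumptions, and the switch states here are not optimized as in the AltMin algorithm of [32].

```python
import numpy as np

def fps_phase_matrix(thetas, n_rf):
    """Block diagonal C with n_rf copies of the normalized phase vector c."""
    n_c = len(thetas)
    c = np.exp(1j * np.asarray(thetas)) / np.sqrt(n_c)
    # kron(I, c) places one copy of the column vector c in each diagonal block.
    return np.kron(np.eye(n_rf), c.reshape(-1, 1))

# Illustrative sizes (assumptions): 8 antennas, 2 RF chains, 4 fixed phase shifters.
rng = np.random.default_rng(2)
n_t, n_rf, n_c = 8, 2, 4
thetas = 2 * np.pi * np.arange(n_c) / n_c      # assumption: uniformly spaced phases
c = np.exp(1j * thetas) / np.sqrt(n_c)

C = fps_phase_matrix(thetas, n_rf)             # shape (n_c * n_rf, n_rf)
S = rng.integers(0, 2, size=(n_t, n_c * n_rf)) # random on/off switch states
F_rf = S @ C                                   # effective analog beamformer

n_switches = S.size                            # n_t * n_rf * n_c switches in total
print("C shape:", C.shape, "| switches:", n_switches)
print("distinct analog gain magnitudes:", np.unique(np.round(np.abs(F_rf), 6)))
```

Each entry of the resulting analog beamformer is the sum of the fixed phases selected by the switches for one RF chain-antenna pair, so its amplitude is no longer restricted to a single value, in contrast to the unit modulus constraint of the SPS implementation.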

C. Performance Comparison
In Fig. 8, the spectral efficiency of the two presented analog network implementations is evaluated and compared with that of fully digital beamforming and of the SPS fully-connected structure with the OMP algorithm. As a general multicarrier multiuser system is considered, the MO-AltMin algorithm is inapplicable due to its high complexity. The results show that both the DPS and FPS fully-connected structures achieve performance close to the fully digital one. This is quite an astonishing result, given that a single analog network is shared by all the users and subcarriers, and the number of RF chains is only equal to the number of data streams. The poor performance of the SPS implementation is partly due to the sub-optimality of the beamforming algorithm, as the unit modulus constraint on the analog beamforming matrix makes it difficult to develop high-performance, low-complexity algorithms.
Remarkably, the FPS fully-connected structure performs close to the DPS one, though with far fewer phase shifters. As shown in the figure on the right, around 10 fixed phase shifters are sufficient for the FPS implementation, while the SPS and DPS implementations require 1152 and 2304 phase shifters, respectively. This makes the FPS implementation very attractive for practical deployment. Meanwhile, once low-cost high-resolution commercial phase shifters are available, or for cost-insensitive applications, the DPS implementation would be an ideal choice in terms of both spectral efficiency and computational efficiency.

V. A FLEXIBLE MAPPING STRATEGY FOR HARDWARE-PERFORMANCE TRADE-OFFS
Among the presented hybrid beamforming structures, the DPS fully-connected structure performs the best in both computational efficiency and spectral efficiency, but with low hardware efficiency. The FPS fully-connected structure achieves a good balance among the three design aspects, but requires a large number of switches. Considering the cost and power consumption of hardware components, especially for mm-wave systems, it is important to further reduce the hardware complexity. Meanwhile, the partially-connected mapping strategy is not a good candidate for high hardware efficiency, as it reduces the hardware complexity too aggressively and incurs significant performance loss. Thus, it is highly desirable to have fine granularity when reducing the hardware complexity. In this section, we present a flexible hybrid beamforming mapping strategy, called the group-connected mapping, to achieve a better balance between hardware efficiency and spectral efficiency.
As shown in Table I (a), with this new mapping strategy, antennas and RF chains are divided into $\eta$ groups, and signals coming out of each RF chain group are transmitted via the corresponding antenna group. The grouping is flexible, and the numbers of RF chains and antennas can differ across groups. The mapping strategy within each group is the same as the fully-connected mapping. Thus, the analog beamforming matrix $F_{\mathrm{RF}}$ has a block diagonal structure, with each block corresponding to one RF chain-antenna group. It is easy to observe that the conventional fully- and partially-connected mapping strategies are special cases of this flexible one:
• When $\eta = 1$, there is only one RF chain group and one antenna group, and thus we get the fully-connected mapping strategy;
• When $\eta = N_{\mathrm{RF}}^t$, each RF chain group contains a single RF chain, which is connected to a group of antennas, and thus we get the partially-connected mapping strategy.
By varying the value of $\eta$, we can easily obtain hybrid beamforming mapping strategies with different hardware complexities. Moreover, we can apply any of the hardware implementations presented in Table I (b) with this group-connected mapping. For the SPS and DPS implementations, the number of phase shifters is $1/\eta$ of that of the fully-connected one; for the FPS implementation, the number of switches is $1/\eta$ of the one shown in Table I (b), while the number of fixed phase shifters remains the same.
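The hardware scaling with $\eta$ can be checked with a small sketch that builds the block diagonal connection pattern and counts components. It assumes NumPy and, for simplicity, equal-sized groups (the text allows unequal groups); the function names and the example sizes are illustrative assumptions.

```python
import numpy as np

def group_connected_mask(n_t, n_rf, eta):
    """Boolean (n_t x n_rf) mask: True where an antenna is wired to an RF chain."""
    if n_t % eta or n_rf % eta:
        raise ValueError("this sketch assumes eta divides both n_t and n_rf")
    ants, chains = n_t // eta, n_rf // eta
    mask = np.zeros((n_t, n_rf), dtype=bool)
    for g in range(eta):
        mask[g * ants:(g + 1) * ants, g * chains:(g + 1) * chains] = True
    return mask

def hardware_counts(n_t, n_rf, eta, n_c):
    """Component counts implied by the number of RF chain-antenna links."""
    links = int(group_connected_mask(n_t, n_rf, eta).sum())
    return {
        "sps_phase_shifters": links,       # one phase shifter per link
        "dps_phase_shifters": 2 * links,   # two phase shifters per link
        "fps_switches": n_c * links,       # n_c switches per link
        "fps_phase_shifters": n_c,         # shared, independent of eta
    }

# Illustrative sizes (assumptions): 64 antennas, 8 RF chains, 10 fixed phase shifters.
for eta in (1, 2, 4, 8):
    print(eta, hardware_counts(64, 8, eta, 10))
```

With these sizes, $\eta = 1$ reproduces the fully-connected count $N_{\mathrm{RF}}^tN_t = 512$ for SPS, and $\eta = N_{\mathrm{RF}}^t = 8$ reproduces the partially-connected count $N_t = 64$, with intermediate values of $\eta$ filling in the range between them.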
In terms of beamforming algorithm design, due to the block diagonal structure in $F_{\mathrm{RF}}$, we can decouple the design of each block, for which the problem is similar to the conventional fully-connected mapping. Therefore, we can leverage the rich set of algorithms presented in the previous two sections for different analog network implementations. In other words, this flexible structure does not introduce any additional difficulty in beamforming algorithm design.
In Fig. 9, we compare the spectral efficiency of the FPS group-connected structure with different values of $\eta$. Other implementations follow the same trend. The figure shows that varying the value of $\eta$ helps to effectively balance the hardware complexity and spectral efficiency. To summarize, this new mapping strategy enjoys the following three desirable properties:
1) It provides a flexible way to trade off performance against hardware complexity;
2) It is compatible with different analog network implementations;
3) The hybrid beamformer can be effectively designed by leveraging existing algorithms.
Therefore, this mapping strategy, especially with the FPS implementation, stands out as a promising candidate to support hybrid beamforming in 5G and beyond mm-wave systems. The hardware components in the analog network and design algorithms for different hybrid beamforming structures are compared in Tables III and IV, respectively.
[Table IV, excerpt: fully-connected, MO-AltMin [15], extremely high complexity; partially-connected, SDR-AltMin [15], O(N iter N t ...); FPS [32], fully-connected, FPS-AltMin [32], O(...).]

VI. CONCLUSIONS AND FUTURE DIRECTIONS
In this paper, we presented several proposals of hybrid beamforming structures in mm-wave systems, focusing on three key aspects: hardware efficiency, spectral efficiency, and computational efficiency. Through a systematic comparison, important design insights were revealed. In particular, it was shown that hardware implementation significantly affects the algorithm design and achievable spectral efficiency. With a suitable structure, hybrid beamforming can approach the performance of the fully digital one with low hardware complexity.
For example, it is sufficient for the number of RF chains to be comparable to the number of data streams, and a small number (∼10) of fixed phase shifters suffices with the FPS implementation. Furthermore, a flexible structure was proposed to balance hardware efficiency and spectral efficiency. A qualitative comparison of different structures is shown in Fig. 10. Overall, the FPS group-connected structure stands out as a promising candidate for hybrid beamforming in 5G and beyond mm-wave systems. Once low-cost phase shifters are available, the DPS implementation will also be attractive. To achieve the full success of hybrid beamforming, more work will be needed, and the following are some potential future research directions.
• CSI acquisition for hybrid beamforming: Perfect channel state information (CSI) was assumed throughout this paper, but acquiring large-scale CSI with a reduced number of RF chains is a challenging problem, with some prior studies in [35]-[38]. Different training methods may be needed for different hybrid beamforming structures [25], [39]. The presented results also shed light on hybrid beamforming design during the training stage, which is critical to overcoming the low SNR during training. In addition, codebook design for channel estimation is of particular interest in mm-wave MIMO systems [40]-[42].
• Deep learning for efficient hybrid beamforming: It is highly desirable to further reduce the computational complexity of hybrid beamforming algorithms. Recently, deep learning has been applied to develop efficient algorithms for large-scale optimization problems in wireless networks [43]-[46], including hybrid beamforming [47]-[49]. While these initial attempts have demonstrated the effectiveness of deep learning-based methods, more investigation will be needed, from both practical and theoretical perspectives.
• Finite-precision ADCs: While the focus in this paper is on the analog network, there are still some gaps to fill in the digital domain. In particular, the quantization effect of ADCs cannot be ignored. How to extend the presented hybrid beamforming structures to systems with low-resolution ADCs deserves careful investigation, and some previous studies can be found in [50]-[53].
• Algorithm-hardware co-design: To effectively design increasingly complex wireless systems, collaboration between the hardware and algorithm domains will be needed. Hardware-algorithm co-design will play an important role in 5G and beyond systems [54]. The target is to develop hardware-efficient transceiver structures that are also algorithm friendly. The FPS implementation can be regarded as a preliminary attempt at such a design approach in mm-wave systems.
• Hybrid beamforming in networks: From the network perspective, while mm-wave networks with analog beamforming have been extensively analyzed [55]- [57], the effect of adopting hybrid beamforming has not been fully unraveled. Indeed, hybrid beamforming will result in more intricate signal and interference distributions, which should be carefully investigated.